PAOLA_3EKO_8LIGANDS very low GPU load

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26819 - Posted: 8 Sep 2012, 3:01:58 UTC - in response to Message 26801.  
Last modified: 8 Sep 2012, 3:03:08 UTC

... it looks like I may have an opportunity when I get home today but it depends on how ambitious I am

The deed is done ... 2 at a time is taking 30 hours on a 660Ti.
Currently my 670 is going to take 16 hours to do 1.
Overall this is going to kill my RAC but I'm going to try to stick with it for a while, may even do it on my 670 just to clear the queue.

Anyone from the project have an estimate on how many more we will need to finish out this run?
Thanks - Steve

voss749
Joined: 27 Mar 11
Posts: 26
Credit: 307,452,808
RAC: 0
Message 26829 - Posted: 8 Sep 2012, 12:52:55 UTC - in response to Message 26760.  

Am I the only one who aborts every "PAOLA_3EKO" task I get?

My GPU clock drops below 400MHz and the task time goes up to ~18 hours.
That is more than double the time it should take, and I could have done more than 2 "NATHAN_RPS1120801" tasks in that amount of time.


You really shouldn't do that. Remember that they rely on you and other volunteers to do the crunching for their research. If everyone aborted certain kinds of tasks, they'd never get any research done. If you're concerned about low utilisation, I suggest using a custom app_info.xml - I posted about it a few posts ago in this thread.


Well then maybe they shouldn't send out these workunits. Perhaps if everyone aborted these tasks they would get the message and fix the problems. We shouldn't have to hack our way around badly behaving workunits. We are donating resources to their project, and we have a right to expect our donated resources to be used as efficiently and effectively as possible.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26830 - Posted: 8 Sep 2012, 13:28:46 UTC

voss has a good point there, though I don't advocate open rebellion. Why hasn't the project scientist responded to any of these threads? Ya know, something like "We're working on rectifying the situation", or letting us know why they haven't pulled them from the hopper? I'm starting to wonder if this might not be deliberate because they're getting overwhelmed by the new video cards.

5pot
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Message 26831 - Posted: 8 Sep 2012, 14:41:39 UTC

ATTENTION GPUGRID STAFF:

My 3way 680 setup caught another 3 of these tasks in a row last night and crashed yet again. This is now the 4th time that this has happened.

I will be switching over to the Short Run Tasks. Which I do not want to do. But your current Long Runs give me NO CHOICE.

PLEASE LET US KNOW when these bad tasks are out of the hopper. This is unacceptable.

I enjoy crunching here, and as I've said before, I love this project. But, I expect better from you guys. And girls :).

PLEASE do not let this happen again.

Cheers

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 26832 - Posted: 8 Sep 2012, 15:33:42 UTC - in response to Message 26831.  
Last modified: 9 Sep 2012, 9:37:13 UTC

I'm running a long PAOLA_3EKO_8LIGANDS task on a now fairly old GTX470
(2003x64, i7-2600, 8GB, 2nd hdd). Normally I get 99% GPU utilization (or very close to it). For the 3EKO_8LIGANDS task I'm seeing 45% GPU utilization with 2 CPU threads free to support the GPU. When I suspend all CPU tasks the GPU utilization rises to 56%. HT is on, and I can see that the 7% CPU usage is almost half of one CPU thread.

Do you think Boinc could be forcing the task to only use the one thread?
Affinity is for all threads according to Task Manager. With Priority set to high the GPU utilization looks to be about 1% more (so not really significant).

Following a restart I configured the system to not use HT. I also ran the task with Boinc Manager closed. Even at High Priority the task still only ran at 58% GPU utilization. The memory controller load was only at 15%. GPU temp is still low (55°C). CPU usage is around 12% (half of one Core). Starting Boinc Manager didn't appear to make any difference.
As a side note, the CPU continually jumps back and forth from ~1.6GHz to 3.7GHz. Running WUProp and FreeHal did not force the CPU to remain at 3.7GHz, but running one Docking task forced the CPU to vary between 3.6 and 3.5GHz. GPU utilization and memory usage didn't change, however.
With 2 of the 4 CPU cores used for CPU tasks the GPU utilization is ~56%.
With 3 CPU cores used for CPU tasks GPU utilization remained at 56%, suggesting that CPU projects are competing with these GPU tasks in some way (as 6 threads resulted in 45% GPU utilization). This is somewhat similar to what you see at POEM and Donate - run only a few CPU tasks and there is little or no impact on the GPU project, but 5 of 8 threads (or more) reduces GPU performance.
Could this issue be related to the memory controller?

- The task finished in around the same time as previous tasks but there is time variation between these tasks, so nothing further can be concluded.
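
For anyone who wants to reproduce this kind of measurement, here is a minimal logging sketch. It assumes nvidia-smi is on the PATH and that the driver is new enough to support these query flags (2012-era Windows drivers may not expose all of them; GPU-Z gives the same readings by hand). Suspend or resume CPU tasks in BOINC while it runs and compare the logged values.

```python
# Minimal sketch: log GPU core and memory-controller utilisation once a second
# so readings can be compared while CPU tasks are suspended/resumed in BOINC.
# Assumes nvidia-smi is on PATH and supports --query-gpu (newer drivers do).
import csv
import subprocess
import time

FIELDS = "utilization.gpu,utilization.memory,temperature.gpu,power.draw"

def sample():
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True).stdout.strip()
    return [v.strip() for v in out.split(",")]

if __name__ == "__main__":
    with open("gpu_load_log.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time"] + FIELDS.split(","))
        while True:
            writer.writerow([time.strftime("%H:%M:%S")] + sample())
            f.flush()          # keep the log readable while it is still running
            time.sleep(1)
```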

The King's Own
Joined: 25 Apr 12
Posts: 32
Credit: 945,543,997
RAC: 0
Message 26867 - Posted: 11 Sep 2012, 0:22:50 UTC

I feel guilty when I abort these work units; however, I can run 2 of these per day or 6 to 8 others. If I do the latter my RAC doesn't plummet and my goodwill is not lessened.

I refer you to my "Not Happy" post. My girlfriend doesn't fully comprehend why I spend $100 a month on electricity. She doesn't know what 2 GTS450s, a GTX580 and a 660Ti cost, and I'm not telling her. Nevertheless, I bought 2 of those GPUs solely for this project. I live in the US and would at least get a tax deduction if GPUGrid were based here.

Respectfully,

The King's Own

Luke Formosa
Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26871 - Posted: 11 Sep 2012, 1:01:45 UTC - in response to Message 26867.  
Last modified: 11 Sep 2012, 1:02:29 UTC

I feel guilty when I abort these work units; however, I can run 2 of these per day or 6 to 8 others. If I do the latter my RAC doesn't plummet and my goodwill is not lessened.

I refer you to my "Not Happy" post. My girlfriend doesn't fully comprehend why I spend $100 a month on electricity. She doesn't know what 2 GTS450s, a GTX580 and a 660Ti cost, and I'm not telling her. Nevertheless, I bought 2 of those GPUs solely for this project. I live in the US and would at least get a tax deduction if GPUGrid were based here.

Respectfully,

The King's Own


Something's not right here. You shouldn't be spending $100 a month for a RAC of just 300,000. I live in Malta, where the electricity is at least twice as expensive, and I still manage a global RAC of 800,000 on €30 a month (running cost of the computer alone) with a single GTX 670 and an i7-3770K (Ivy Bridge) if I leave it on 24/7. I think the problem is that you're using older-generation cards. The Nvidia 6-series cards give about twice the performance per watt compared to the 5- or 4-series ones, so you should switch over completely to 6-series cards. Consider it an investment - within 6 months they'd have paid for themselves in electricity costs. It's the same argument as for switching incandescent light bulbs for energy-saving ones :)

Ken Florian
Joined: 4 May 12
Posts: 56
Credit: 1,832,989,878
RAC: 0
Message 26873 - Posted: 11 Sep 2012, 1:52:57 UTC - in response to Message 26867.  

This IS painful.

I spent about $3,600 to build a machine exclusively for gpugrid crunching. It does nothing else. Ever. The thing runs so hot that I can't run it in my home; it is in my son's basement, a very long way from here. This means I never get to run, say, Flight Simulator or Civ5 on two video cards that a few years ago I could not have dreamed of being able to afford.

I hope they fix it soon.

Ken Florian

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Message 26874 - Posted: 11 Sep 2012, 6:05:02 UTC - in response to Message 26867.  
Last modified: 11 Sep 2012, 6:08:58 UTC

I think you should not feel guilty. It is YOUR hardware and you should choose how it is used. Aborting some work units is really no worse than if somebody decided to crunch on a different project for a while, or there was a power outage, or the server ran out of disk space, or the DNS servers got hacked and the internet didn't work, or.... (hopefully you see my point)

We are VOLUNTEERS. We are giving up our own time, money, and resources to the scientists. If we decide that a certain project or task or workunit is unfit for our individual tastes, time commitment, or energy cost, we have the right to abort and try something different. That could mean choosing a different project altogether, or just another workunit.

Paden Cast
Joined: 1 Jun 10
Posts: 1
Credit: 83,369,250
RAC: 0
Message 26914 - Posted: 16 Sep 2012, 4:47:55 UTC
Last modified: 16 Sep 2012, 5:27:22 UTC

I felt I should chime in since I am seeing nowhere near the times for this project that others are seeing. Granted, the WUs are all jacked up. I'm seeing a GPU load of 55%, a memory load of 15%, and 45% power consumption.

My rig:

i5-3750 OC'd to 4.2 GHz
16 GB 1600 MHz RAM OC'd to 2000 MHz
2x GTX 670 FTW LE (both on x16 rails)
SSD on USB 3.0

I'm running my first unit of this now. I'm going to guess around 12 hours.

I'm going to guess that we are being limited by the CPU core utilization. My GPU load matches my CPU load almost to a T.

Unless we get a change that lets us choose the CPU core utilization (as in 1:1), I think we are stuck with high run times.

Let me know if I can do anything on my end to test. It would be helpful to include how to do it. I just got win7 and am having a hell of a time finding things.
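
On the "how to test" question: one easy check is whether the GPUGrid science app really is limited to roughly one core's worth of CPU time. A minimal sketch using the third-party psutil package; the "acemd" process-name match is an assumption about what the app is called on a given machine, so check Task Manager for the real name first.

```python
# Minimal sketch: report CPU usage and thread count of the GPUGrid science app,
# to see whether it is pinned to about one core's worth of CPU time.
# Requires psutil (pip install psutil); the "acemd" name match is an
# assumption -- adjust it to the actual process name shown in Task Manager.
import psutil

def find_science_apps(name_fragment="acemd"):
    """Yield running processes whose name contains the given fragment."""
    for proc in psutil.process_iter(["name"]):
        if name_fragment.lower() in (proc.info["name"] or "").lower():
            yield proc

if __name__ == "__main__":
    ncores = psutil.cpu_count(logical=True)
    for proc in find_science_apps():
        pct = proc.cpu_percent(interval=2.0)   # % of one logical CPU over 2 s
        print(f"{proc.info['name']} (pid {proc.pid}): "
              f"{pct:.0f}% of one thread, {proc.num_threads()} threads, "
              f"{pct / ncores:.1f}% of the whole {ncores}-thread CPU")
```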

M_M
Joined: 11 Nov 10
Posts: 9
Credit: 53,476,066
RAC: 0
Message 26956 - Posted: 22 Sep 2012, 5:40:24 UTC

Still getting these 3-PAOLA_3EKO_8LIGANDS loooooong runs... :(

GPU usage is only between 35 and 39%, and they take over 13h to complete. Other long runs have GPU usage of 95-99% and take 6-9 hrs. CUDA 4.2, 306.23 drivers, Win7 x64, i7-2600K @ 4.5GHz.



skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 26958 - Posted: 22 Sep 2012, 8:26:57 UTC - in response to Message 26956.  
Last modified: 22 Sep 2012, 8:48:55 UTC

All you can do is increase GPU utilization by about 10% by running fewer CPU tasks (say 4 of 8 threads - see below). That would improve your task performance by around 28%. In terms of overall Boinc credit it's worth it, but it depends on where your priorities lie. Other GPU tasks don't require this, so I would only do it if I was getting lots of these tasks, or if I spotted one and could change settings back later.
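
As a rough sanity check on that figure (an approximation only: it treats run time as inversely proportional to GPU utilization and ignores the fixed CPU-side work), going from ~45% to ~56% utilization gives:

```latex
\frac{t_{\text{before}}}{t_{\text{after}}} \;\approx\; \frac{U_{\text{after}}}{U_{\text{before}}} \;=\; \frac{56\%}{45\%} \;\approx\; 1.24
```

i.e. roughly 20% less run time, or about 24% more throughput per task - in the same ballpark as the ~28% above.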
I guess you could write a script to poll for and identify the task being run and change Boinc settings accordingly, but that's a pile of work, and we might not see many of these.
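
If someone did want to try it, a minimal sketch of such a script is below. It assumes boinccmd is on the PATH and that BOINC_DATA_DIR points at the client's data directory; the path, thresholds and polling interval are illustrative, not tested values.

```python
# Minimal sketch: poll BOINC for a running PAOLA_3EKO task and, if one is
# found, lower the CPU-task budget so the GPU task gets more CPU time.
# Assumes boinccmd is on PATH; BOINC_DATA_DIR is an illustrative placeholder
# (e.g. /var/lib/boinc-client on Linux, C:\ProgramData\BOINC on Windows).
import subprocess
import time

BOINC_DATA_DIR = "/var/lib/boinc-client"   # adjust for your installation
NORMAL_CPU_PCT = 100.0                     # use all threads normally
REDUCED_CPU_PCT = 50.0                     # e.g. 4 of 8 threads while PAOLA runs

def paola_task_active():
    """Return True if any task reported by the client mentions PAOLA_3EKO."""
    out = subprocess.run(["boinccmd", "--get_tasks"],
                         capture_output=True, text=True).stdout
    return "PAOLA_3EKO" in out

def set_cpu_pct(pct):
    """Write a global_prefs_override.xml limiting CPU usage, then reload it."""
    override = ("<global_preferences>\n"
                f"  <max_ncpus_pct>{pct}</max_ncpus_pct>\n"
                "</global_preferences>\n")
    with open(f"{BOINC_DATA_DIR}/global_prefs_override.xml", "w") as f:
        f.write(override)
    subprocess.run(["boinccmd", "--read_global_prefs_override"], check=True)

if __name__ == "__main__":
    current = None
    while True:
        wanted = REDUCED_CPU_PCT if paola_task_active() else NORMAL_CPU_PCT
        if wanted != current:              # only rewrite prefs on a change
            set_cpu_pct(wanted)
            current = wanted
        time.sleep(300)                    # re-check every five minutes
```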

What's the performance like on Linux?

M_M
Joined: 11 Nov 10
Posts: 9
Credit: 53,476,066
RAC: 0
Message 26961 - Posted: 22 Sep 2012, 12:45:10 UTC - in response to Message 26958.  

All you can do is increase GPU utilization by about 10% by running fewer CPU tasks (say 4 of 8 threads - see below). That would improve your task performance by around 28%.


That seems to be right. By running fewer other CPU tasks, GPU utilization for these long-run WUs increases to around 48-50%. Does this mean that there is a CPU bottleneck in supplying actual work to the GPU for these particular WUs? I see those CPU tasks are single-threaded. Just wondering: if my CPU were, for example, twice as fast on a single thread, would GPU utilization improve?

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Message 26965 - Posted: 22 Sep 2012, 23:22:59 UTC - in response to Message 26961.  
Last modified: 22 Sep 2012, 23:23:11 UTC

Yes, that seems to be true. I tried underclocking one of my rigs from 2.67 GHz to 1.6 GHz and the GPU usage dropped from ~38% to ~25%, IIRC.
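
Those two numbers scale almost in proportion to the clock change (a rough check, assuming nothing else on the rig changed):

```latex
\frac{1.6\ \text{GHz}}{2.67\ \text{GHz}} \approx 0.60
\qquad\text{vs.}\qquad
\frac{25\%\ \text{GPU load}}{38\%\ \text{GPU load}} \approx 0.66
```

which is what you would expect if a serial CPU portion is what feeds the GPU on these tasks.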

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 26973 - Posted: 23 Sep 2012, 11:41:06 UTC - in response to Message 26965.  

Clearly it's partially CPU dependent, but another bottleneck factor is at play too; otherwise, if we stopped running CPU tasks altogether, GPU utilization would rise to 99% on XP systems.
The candidates are CPU cache, system bus, RAM frequency/timings, HDD I/O, the app and Boinc.
If it's CPU cache then the high-end 2nd-generation Intel CPUs would allow you to go higher than ~50% GPU utilization.
3rd-generation Intel systems should allow higher GPU utilization if it's bus related; DDR2 vs DDR3 would make a big impact if RAM is a factor (as would higher-frequency RAM).
HDD I/O would improve with a good SSD (disk write caching might also make some difference).
The app and Boinc might behave differently on Linux.

Anyway, it's down to the researchers to improve, if they think it's worthwhile for the project. All we can do is optimize our systems for the app/tasks that are there to run, if we want to.

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Message 26979 - Posted: 23 Sep 2012, 19:50:28 UTC - in response to Message 26973.  
Last modified: 23 Sep 2012, 20:12:47 UTC

Anyway, it's down to the researchers to improve, if they think it's worthwhile for the project. All we can do is optimize our systems for the app/tasks that are there to run, if we want to.


Agree to disagree - I think it is up to the researchers to optimize the tasks for the hardware that is available to them (the volunteers' systems), while there are (and should be) small things we can do to squeeze out that last 5-10%.

Since it seems that the majority of current users are having the same issue with only these tasks, there must be some major difference either in the actual work being done (which could explain why it is much more CPU dependent) or in the coding, which was either overlooked (accidental) or could not be worked around (if the work being done does not benefit from parallelization, for instance).

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 27032 - Posted: 26 Sep 2012, 22:12:45 UTC - in response to Message 26979.  

I'm not sure we even disagree!

While most of us would prefer tasks to all run at 99%, the research doesn't always fall into this apparently perfect model. Unfortunately that concept might even be false; just because the GPU is being used at 99% doesn't mean that the WU is best optimized. It might be the case that the code could be changed so that the task utilizes 99% of the GPU, but is slower overall than alternative code that only uses 66% (some things are just done faster on the CPU). Then there is the power consumption consideration, and the inevitable argument of value and chance: which piece of research is the most important? We won't know until a cure for Parkinson's, Alzheimer's or cancer is actually derived. Even then, most research is built on other research, including research that went nowhere...

Different research necessitates different code.

GPUGrid is involved in many research lines, which is fantastic for GPUGrid, GPU research and Science as a whole - the GPU is an established and developing tool for crunching and facilitates many techniques.
GPU crunching is key to the future of Scientific research, especially in such financially austere times.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 27040 - Posted: 30 Sep 2012, 23:07:55 UTC - in response to Message 26973.  
Last modified: 1 Oct 2012, 17:37:10 UTC

Clearly it's partially CPU dependent, but another bottleneck factor is at play too; otherwise, if we stopped running CPU tasks altogether, GPU utilization would rise to 99% on XP systems.
The candidates are CPU cache, system bus, RAM frequency/timings, HDD I/O, the app and Boinc.
If it's CPU cache then the high-end 2nd-generation Intel CPUs would allow you to go higher than ~50% GPU utilization.
3rd-generation Intel systems should allow higher GPU utilization if it's bus related; DDR2 vs DDR3 would make a big impact if RAM is a factor (as would higher-frequency RAM).
HDD I/O would improve with a good SSD (disk write caching might also make some difference).
The app and Boinc might behave differently on Linux.

Anyway, it's down to the researchers to improve, if they think it's worthwhile for the project. All we can do is optimize our systems for the app/tasks that are there to run, if we want to.


Can't add much, but when I put my GPU into a system with a lesser CPU (IC2D 2.13GHz rather than i7-2600), the GPU Utilization dropped to 37% (when not crunching with the CPU). Both systems were DDR3 dual channel, and I used an SSD with the IC2D, to eliminate any possible I/O bottlenecks. I noted that the task was >600MB in size.

The task returned in 22h for full bonus credit, but took twice as long as some tasks for the same credit.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 27095 - Posted: 19 Oct 2012, 16:21:56 UTC - in response to Message 27040.  

Name 3EKO_15_2-PAOLA_3EKO_8LIGANDS-23-100-RND9894_2
Workunit 3734669 (the workunit page says: errors, WU cancelled)
Created 17 Oct 2012 | 0:23:45 UTC
Sent 17 Oct 2012 | 3:56:40 UTC
Received 17 Oct 2012 | 12:07:48 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 98 (0x62)
Computer ID 135026
Report deadline 22 Oct 2012 | 3:56:40 UTC
Run time 28,330.88 s :( (an 8-hour hairball)
CPU time 18,017.27
Validate state Invalid
Credit 0.00
Application version Long runs (8-12 hours on fastest card) v6.16 (cuda42)

Stderr output

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
MDIO: cannot open file "restart.coor"
ERROR: file tclutil.cpp line 31: get_Dvec() element 0 (b)
called boinc_finish

</stderr_txt>
]]>


Just saying,

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1958
Credit: 629,356
RAC: 0
Message 27100 - Posted: 19 Oct 2012, 21:20:48 UTC - in response to Message 27095.  

These are now stopped.
We have found the source of the problem in some scripting called inside the input files. It was quite unexpected.

These functions will now be embedded into the applications for speed.

gdf