PAOLA_3EKO_8LIGANDS very low GPU load

Author	Message
Snow Crash Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0	Message 26761 - Posted: 2 Sep 2012, 9:24:14 UTC Clearly there is not an easy fix or it would have been done by now. Apparently even changing the points award is not something to be taken lightly once released, so let's see what we can do to get the complete stream finished. Who knows, maybe this is some truly awesome data that will help Paola advance her research by leaps and bounds. Can anyone who is running these and getting good utilization please post up the rig specs so we can see what DOES work well? If it would be helpful overall, I would like to volunteer as an alpha tester for any new WU streams (660Ti Win7x64 + 480 same rig, 670 Win7x64). Thanks - Steve

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 26764 - Posted: 2 Sep 2012, 9:40:28 UTC - in response to Message 26761. Can anyone who is running these and getting good utilization please post up the rig specs so we can see what DOES work well? I'm curious about it as well. However, I don't expect any answer to this question, because these tasks' GPU utilization is low even on PCIe3.0 systems.

Snow Crash Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0	Message 26765 - Posted: 2 Sep 2012, 9:50:45 UTC - in response to Message 26764. Last modified: 2 Sep 2012, 10:39:25 UTC Can anyone who is running these and getting good utilization please post up the rig specs so we can see what DOES work well? I'm curious about it as well. However, I don't expect any answer to this question, because these tasks' GPU utilization is low even on PCIe3.0 systems. http://www.gpugrid.net/forum_thread.php?id=3116&nowrap=true#26657 Nate's earlier post shows a couple of good runtimes, maybe he can dig out the rig specs for us ... Nate? Thanks - Steve

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 26768 - Posted: 2 Sep 2012, 11:07:20 UTC - in response to Message 26765. Last modified: 2 Sep 2012, 11:09:49 UTC Nate's earlier post shows a couple of good runtimes, maybe he can dig out the rig specs for us ... Nate? As Nate said, these workunits have a very high variation in their runtimes. My first one completed in 22h 19m. 2nd: 22h 2m. 3rd: 18h 31m. 4th: 14h 14m. 5th: 11h 39m. 6th: 12h 2m. 7th: 12h 25m. 8th: 12h 4m. I've found a very good rig in the toplist: The shortest runtime for a PAOLA_3EKO_8LIGANDS on this host is 7h 34m. This is a Linux system with a Core i7-2600K overclocked to 4.6GHz (according to my estimation) with two GTX 680s (I bet these are overclocked too). But even on this system the PAOLA_3EKO_8LIGANDS tasks use less CPU time than GPU time (unlike all other workunits), so I guess even this system could have shorter runtimes if the "SWAN_SYNC=0" setting had been applied to these workunits.
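
To put a number on the "very high variation" described in the post above, here is a minimal Python sketch that summarises the eight runtimes exactly as quoted. The helper names (`to_minutes`, `fmt`) are illustrative, and the choice of population standard deviation as the spread measure is an assumption, not something the poster used.

```python
import re
from statistics import mean, pstdev

# Runtimes exactly as quoted in the post, first to eighth task.
RUNTIMES = ["22h 19m", "22h 2m", "18h 31m", "14h 14m",
            "11h 39m", "12h 2m", "12h 25m", "12h 4m"]


def to_minutes(text):
    """Convert a string such as '22h 19m' to whole minutes."""
    match = re.fullmatch(r"(\d+)h\s*(\d+)m", text.strip())
    if match is None:
        raise ValueError(f"unrecognised runtime: {text!r}")
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


def fmt(minutes):
    """Format a number of minutes back into 'Hh MMm'."""
    whole = round(minutes)
    return f"{whole // 60}h {whole % 60:02d}m"


minutes = [to_minutes(r) for r in RUNTIMES]
summary = {
    "shortest": min(minutes),
    "longest": max(minutes),
    "mean": mean(minutes),
    "spread": pstdev(minutes),  # population standard deviation (assumed measure)
}

for name, value in summary.items():
    print(f"{name:>8}: {fmt(value)}")
print(f"longest / shortest = {max(minutes) / min(minutes):.2f}")
```

The longest run on this one host is nearly twice the shortest, which is why a single runtime says little about how a rig handles these workunits.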

The King's Own Joined: 25 Apr 12 Posts: 32 Credit: 945,543,997 RAC: 0	Message 26769 - Posted: 2 Sep 2012, 12:03:00 UTC Runtime on new 660Ti, power target set to 105% (One sample): 1. 20 hrs 17 min Same rig (Core i5-750, 8 gByte RAM) when run on 580 GTX, no overclock (Three samples): i. 22 hrs 30 min ii. 27 hrs 39 min iii. 28 hrs 03 min

Luke Formosa Joined: 11 Jul 12 Posts: 32 Credit: 33,298,777 RAC: 0	Message 26770 - Posted: 2 Sep 2012, 13:04:12 UTC The problem with looking solely at runtimes is that the number you're seeing is only the time taken for that task to complete - it doesn't say how many tasks were running simultaneously. So anyone using a custom app_info.xml and running 2 or more tasks at once might be doubling his points per second and the runtimes would look the same. Has anyone tried running multiple tasks at once? Can you please post your runtimes for running two of them and your runtimes when running just one at a time?
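
The point about runtimes hiding concurrency can be sketched with a little arithmetic. The Python below is only an illustration: the credit value and the runtimes are hypothetical numbers, not measurements from this thread, and `credit_per_hour` is a made-up helper.

```python
def credit_per_hour(credit_per_task, runtime_hours, concurrent_tasks=1):
    """Credit earned per wall-clock hour by one GPU.

    runtime_hours is the per-task runtime as shown in the task list; it is
    the same figure whether the GPU ran one task or several side by side.
    """
    if runtime_hours <= 0 or concurrent_tasks < 1:
        raise ValueError("runtime must be positive and at least one task must run")
    return concurrent_tasks * credit_per_task / runtime_hours


CREDIT = 100.0  # hypothetical credit per task

single = credit_per_hour(CREDIT, 13.0, 1)       # one task at a time
double_same = credit_per_hour(CREDIT, 13.0, 2)  # two at once, same runtime each
double_slow = credit_per_hour(CREDIT, 20.0, 2)  # two at once, each one slower
break_even = credit_per_hour(CREDIT, 26.0, 2)   # two at once, each twice as slow

print(f"one task:            {single:.2f} credit/h")
print(f"two, same runtime:   {double_same:.2f} credit/h")
print(f"two, 20 h each:      {double_slow:.2f} credit/h")
print(f"two, 26 h each:      {break_even:.2f} credit/h")
```

Under these assumptions, doubling up pays off as long as the per-task runtime stays below twice the single-task runtime, which is why paired runtimes are needed to compare rigs fairly.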

Luke Formosa Joined: 11 Jul 12 Posts: 32 Credit: 33,298,777 RAC: 0	Message 26771 - Posted: 2 Sep 2012, 13:07:01 UTC - in response to Message 26770. Oh, and my runtime for Paola tasks is on average 13.33 hours on a stock Gigabyte GTX 670. The variation isn't that great in mine - about half an hour either way. But I haven't run GPUgrid for a few days so I wouldn't know if the newer tasks have different runtimes.

flashawk Joined: 18 Jun 12 Posts: 297 Credit: 3,572,627,986 RAC: 0	Message 26773 - Posted: 2 Sep 2012, 21:34:32 UTC I noticed almost 2 weeks ago, when this all started, that others were aborting these tasks or getting a lot of computational errors. That means the rest of us have to pick up the slack for those who refuse to do the work. I'm looking at this problem much more simply than everyone else. I use a program called gpushark to monitor my cards (it's free), and it shows the VRAM memory controller is getting very low utilization (9% - 11% as compared to 31% - 39%). I think that's the choke point; it will slow down the CPU.

dskagcommunity Joined: 28 Apr 11 Posts: 463 Credit: 979,266,958 RAC: 76,910	Message 26775 - Posted: 3 Sep 2012, 5:40:47 UTC Do you get 9x% GPU load at that memory controller load? If not, it is probably normal that the memory controller has less to do when the GPU load is lower too. Just a suggestion ^^ DSKAG Austria: http://www.dskag.at

Luke Formosa Joined: 11 Jul 12 Posts: 32 Credit: 33,298,777 RAC: 0	Message 26776 - Posted: 3 Sep 2012, 12:17:04 UTC - in response to Message 26773. I noticed almost 2 weeks ago, when this all started, that others were aborting these tasks or getting a lot of computational errors. That means the rest of us have to pick up the slack for those who refuse to do the work. I'm looking at this problem much more simply than everyone else. I use a program called gpushark to monitor my cards (it's free), and it shows the VRAM memory controller is getting very low utilization (9% - 11% as compared to 31% - 39%). I think that's the choke point; it will slow down the CPU. GPU-Z is better in my opinion (it's also free), because it displays those readings and plots graphs of them in real time as well. And you can save the data to a log file.

flashawk Joined: 18 Jun 12 Posts: 297 Credit: 3,572,627,986 RAC: 0	Message 26777 - Posted: 3 Sep 2012, 19:57:20 UTC - in response to Message 26776. GPU-Z is better in my opinion (it's also free), because it displays those readings and plots graphs of them in real time as well. And you can save the data to a log file. It is a very good program and I've been familiar with it for years, but gpushark has a much smaller footprint and uses fewer resources. I leave it running 24/7 on all 4 of my computers, and it gives real-time info on up to 4 video cards at the same time when in advanced mode. I wasn't implying that all should use it (sorry for the misunderstanding), I was just letting folks know how I was monitoring my video cards.

robertmiles Joined: 16 Apr 09 Posts: 503 Credit: 769,991,668 RAC: 0	Message 26783 - Posted: 4 Sep 2012, 12:51:17 UTC 3EKO_19_4-PAOLA_3EKO_8LIGANDS-3-100-RND3778 has run 123 hours so far, with 57 to go at 67.623% progress, and is already past its deadline. http://www.gpugrid.net/result.php?resultid=5800806 Is this a reasonable run time on a GTX 560? Or is it so far past what is expected that I should abort the workunit?
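
As a rough cross-check of the figures in the post above, the remaining time can be extrapolated from elapsed time and progress. This Python sketch assumes progress is linear in time, which may not hold for these workunits, and `estimate_remaining_hours` is an illustrative helper, not how BOINC computes its own estimate.

```python
def estimate_remaining_hours(elapsed_hours, fraction_done):
    """Linear extrapolation: remaining = elapsed * (1 - done) / done."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1 - fraction_done) / fraction_done


elapsed = 123.0     # hours run so far, from the post
fraction = 0.67623  # 67.623% progress, from the post

remaining = estimate_remaining_hours(elapsed, fraction)
total = elapsed + remaining

print(f"estimated remaining: {remaining:.1f} h")
print(f"estimated total:     {total:.1f} h")
```

The extrapolation gives roughly 59 hours remaining, close to the 57 hours reported, for a total of about 180 hours on this card.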

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 26784 - Posted: 4 Sep 2012, 15:25:01 UTC - in response to Message 26783. Last modified: 4 Sep 2012, 15:28:21 UTC 3EKO_19_4-PAOLA_3EKO_8LIGANDS-3-100-RND3778 has run 123 hours so far, with 57 to go at 67.623% progress, and is already past its deadline. http://www.gpugrid.net/result.php?resultid=5800806 Is this a reasonable run time on a GTX 560? Or is it so far past what is expected that I should abort the workunit? You should abort it immediately. It has already been resent to another host, which has a GTX 680 and will probably return the result much sooner than 57 hours. This is not a reasonable run time at all.

flashawk Joined: 18 Jun 12 Posts: 297 Credit: 3,572,627,986 RAC: 0	Message 26788 - Posted: 5 Sep 2012, 7:42:51 UTC I just got a new task I've never seen before. It's PAOLA_2UY5 and it's doing the exact same thing as the other PAOLA task: 30% to 50% GPU usage. Is this the way of all new WUs to come? There are going to be lots of grumpy folks that have older cards. It's looking like close to 30 hours on my GTX560Ti. Who the heck is writing these things?

dskagcommunity Joined: 28 Apr 11 Posts: 463 Credit: 979,266,958 RAC: 76,910	Message 26794 - Posted: 6 Sep 2012, 11:15:16 UTC Last modified: 6 Sep 2012, 11:17:54 UTC There are more liangs in the queue, it seems; I get almost nothing but these WUs :/ I don't want to spend 36h computing one WU... Thanks to cuda31 I have never got an error on these (crosses fingers) and still get 50% more credit than in the short queue!!! Wtf. DSKAG Austria: http://www.dskag.at

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 26797 - Posted: 6 Sep 2012, 13:58:30 UTC Last modified: 6 Sep 2012, 13:58:49 UTC Hello everyone, sorry I haven't posted in a while. But I'm beginning to get pissed off about these tasks. Whenever I have 1 running it will run fine, although incredibly slowly (36%). BUT, when I have 3 running on my 3x 680s, acemd crashes or my computer locks up and must be restarted. It should not be my responsibility to constantly abort tasks and keep an eye on this rig at all times. These tasks just caused me to lose another 20hrs of combined crunching. Further, when running 2 one time, the computer crashed, causing a Nathan task which had 20min left to fail. This is unacceptable. I have never, ever aborted WUs before. Not even when the points disparity was so great that many others complained. I let them run. But I've reached the tipping point, and don't know what to do anymore. Please, after this batch is finished, should this ever happen again, pull the tasks from the hopper and figure out what is wrong with them.

Luke Formosa Joined: 11 Jul 12 Posts: 32 Credit: 33,298,777 RAC: 0	Message 26799 - Posted: 6 Sep 2012, 16:00:43 UTC For the last couple of posters - have you tried using my modified app_info.xml ? It won't make the tasks any faster (in fact it might slow things down slightly), but at least you'll be doing two at once so you'll be getting almost twice the points per unit time and so the performance hit isn't as bad.

Snow Crash Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0	Message 26801 - Posted: 6 Sep 2012, 16:47:14 UTC - in response to Message 26799. have you tried using my modified app_info.xml That's a tough prospect as we can't count on getting only PAOLA tasks and I believe it will be counterproductive if a NATE WU gets doubled up. That being said, it looks like I may have an opportunity when I get home today, but it depends on how ambitious I am, because the 2 PAOLAs I have are on a 2 card rig, so I'm going to pull a card to make this work as cleanly as possible. If I'm going that far I am also going to swap which slot the remaining card is in. If I can get this done and working correctly I will think about aborting the NATEs and running PAOLA exclusively. Side note: the card I'm pulling is a GTX480 and I'm thinking about decommissioning it; anyone interested can send me a PM. Thanks - Steve

Luke Formosa Joined: 11 Jul 12 Posts: 32 Credit: 33,298,777 RAC: 0	Message 26806 - Posted: 6 Sep 2012, 18:27:25 UTC - in response to Message 26801. That's a tough prospect as we can't count on getting only PAOLA tasks and I believe it will be counter productive if a NATE WU gets doubled up. It would be counterproductive to run two nathans, but you can work around that if you're willing to babysit your computer a bit (I know some people aren't). If you have two Paola tasks, leave coproc count = 0.5. If you get a nathan task, exit boinc, modify the coproc count to 1, then start up boinc again. If you have a Paola and a Nathan... er, you're out of luck on one GPU. But as you have two GPUs and hence a four task limit on that host, you might be able to work something out.
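
The manual switch of the coproc count described above could be scripted along these lines. This is a sketch under assumptions: the modified app_info.xml is not shown in this thread, so the sample below is a stripped-down stand-in that only follows the general layout of BOINC's anonymous-platform file (`<app_version>` containing `<coproc>` with `<type>` and `<count>`), and `example_app` and `set_coproc_count` are hypothetical names. As the post says, BOINC should be shut down before the real file is changed.

```python
import xml.etree.ElementTree as ET

# Stand-in for the relevant part of an app_info.xml. A real file carries
# many more elements (file names, version numbers, etc.); names here are
# placeholders, not GPUGRID's actual values.
SAMPLE_APP_INFO = """<app_info>
  <app>
    <name>example_app</name>
  </app>
  <app_version>
    <app_name>example_app</app_name>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
  </app_version>
</app_info>"""


def set_coproc_count(xml_text, count):
    """Return xml_text with every <coproc><count> set to the given value.

    A count of 0.5 lets two tasks share one GPU; a count of 1 gives each
    task a whole GPU.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    root = ET.fromstring(xml_text)
    changed = 0
    for coproc in root.iter("coproc"):
        node = coproc.find("count")
        if node is not None:
            node.text = f"{count:g}"
            changed += 1
    if changed == 0:
        raise ValueError("no <coproc><count> element found")
    return ET.tostring(root, encoding="unicode")


two_paolas = set_coproc_count(SAMPLE_APP_INFO, 0.5)  # two Paola tasks per GPU
one_nathan = set_coproc_count(two_paolas, 1)         # back to one task per GPU
print(two_paolas)
```

Note that ElementTree discards XML comments when parsing, so a hand-annotated file would lose its comments if rewritten this way; keeping a backup copy of the original is a sensible precaution.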

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 26808 - Posted: 6 Sep 2012, 19:59:20 UTC That's the issue though. I tend to get more Nathans than Paolas. Further complicating things, with 3 cards in one rig I have 6 tasks in total to watch. This is not something we should be doing. I understood that when they switched to the new CUDA app, there were some server issues with correctly sending tasks to rigs that were compatible, and a workaround was used. This is a different case. Please, GPUgrid, I love this project, and I am not leaving. Ever. But do not let this happen again. I understand you tested them on your own software before they were sent. So, in the future, could you please run some under BOINC in house first to see if any problems arise? I hope this batch is nearing completion. Cheers.