ADRIA extremely slow - not checkpointing
| Author | Message |
|---|---|
|
Joined: 21 Jan 10 Posts: 46 Credit: 1,388,234,528 RAC: 0
|
My first WU completed and validated. Finally. https://www.gpugrid.net/workunit.php?wuid=27023218 Sent: 10 Feb 2021, 17:42:37 UTC. Received: 12 Feb 2021, 2:30:35 UTC. Credit: 435,937.50 (wow!). GPU: GTX 980 Ti. My next WU is a retry of a failed attempt by another system. It's running A LOT faster than my first WU. I'll report back when it completes. https://www.gpugrid.net/workunit.php?wuid=27025153 |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
"You can't undervolt in Linux directly, but you can lower the power limit if all you are trying to do is use less power." Yeah, I know about power limiting; it's all you can do in Linux. But if you apply an overclock on top of the power limit, you can claw back some of the lost performance. You can probably do better on power for that 2060. I have a system with 2070s that I power limit to 150 W, and it completes tasks in about 60,000-61,000 s.
|
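The power-limit approach described in the post above could be sketched roughly as follows. This is a minimal sketch, assuming an NVIDIA driver that ships `nvidia-smi` and a card that accepts the requested limit. The helper names and GPU index 0 are hypothetical; the 150 W and 60,000-61,000 s figures come from the post. Changing the limit normally needs root privileges, so the code only builds the command instead of running it, and the energy figure is an upper bound that assumes the card sits at its limit for the whole task.

```python
def power_limit_command(gpu_index: int, watts: int) -> list[str]:
    """Build the nvidia-smi invocation that caps one GPU's board power."""
    # -i selects the GPU by index, -pl sets the power limit in watts.
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def energy_per_task_kwh(watts: float, seconds: float) -> float:
    """Upper-bound energy for one task, assuming the GPU stays at its limit."""
    return watts * seconds / 3_600_000  # watt-seconds (joules) to kWh


# GPU index 0 is a hypothetical choice; 60,500 s is the midpoint of the quoted range.
print(" ".join(power_limit_command(0, 150)))
print(round(energy_per_task_kwh(150, 60_500), 2), "kWh per task (upper bound)")
```

Comparing this energy figure across a few power limits is one way to judge whether a lower limit plus an overclock actually pays off on a given card.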
|
Joined: 21 Jan 10 Posts: 46 Credit: 1,388,234,528 RAC: 0
|
10% complete after 03:22:00. Oof. At least it's running. The prior attempt by another user errored out within a few seconds. |
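A progress report like the one above can be turned into a rough finish-time estimate. This is a minimal sketch, assuming that 03:22:00 is elapsed time and that progress advances linearly, which the thread does not confirm; the function name is hypothetical.

```python
def projected_runtime_seconds(elapsed: str, fraction_done: float) -> float:
    """Linearly extrapolate total run time from elapsed HH:MM:SS and progress."""
    hours, minutes, seconds = (int(part) for part in elapsed.split(":"))
    elapsed_s = hours * 3600 + minutes * 60 + seconds
    return elapsed_s / fraction_done


total = projected_runtime_seconds("03:22:00", 0.10)
print(f"projected total: {total:,.0f} s (~{total / 3600:.1f} h)")
```

A linear projection is only a first guess; early progress on a task is not always representative of the rest of the run.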
|
Joined: 21 Jan 10 Posts: 46 Credit: 1,388,234,528 RAC: 0
|
I was able to exit/restart the client at 54% and it continued as expected. (Win64) |
|
Joined: 21 Jan 10 Posts: 46 Credit: 1,388,234,528 RAC: 0
|
And... scene. Run time: 106,374.11 s. CPU time: 106,292.60 s. Validate state: Valid. Credit: 348,750.00. https://www.gpugrid.net/workunit.php?wuid=27025153 |
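For comparison, the turnaround and credit figures quoted for the two work units in this thread can be checked with a little arithmetic. This is a minimal sketch using only numbers from the posts. Reading the credit difference as a return-time bonus on the first result is an assumption, since the posts do not say how credit was computed. The function name is hypothetical, and parsing month names with `%b` assumes an English locale.

```python
from datetime import datetime, timezone


def turnaround_seconds(sent: str, received: str) -> float:
    """Seconds between the 'Sent' and 'Received' timestamps (both UTC)."""
    fmt = "%d %b %Y %H:%M:%S"  # %b needs an English locale for "Feb"
    start = datetime.strptime(sent, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(received, fmt).replace(tzinfo=timezone.utc)
    return (end - start).total_seconds()


# First WU: timestamps and credit as posted earlier in the thread.
elapsed = turnaround_seconds("10 Feb 2021 17:42:37", "12 Feb 2021 2:30:35")
# Ratio of the first WU's credit to the second WU's credit.
credit_ratio = 435_937.50 / 348_750.00

print(f"first WU turnaround: {elapsed:,.0f} s (~{elapsed / 3600:.1f} h)")
print(f"credit ratio, first WU vs second WU: {credit_ratio}")
```

The ratio comes out as a round number, which would be consistent with the first result earning a percentage bonus that the second did not, but that remains a guess.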
|
Joined: 22 May 20 Posts: 110 Credit: 115,525,136 RAC: 0
|
Be happy if they finish at all. Mine (3 at this time) ran between 5,000 and 90,000 sec on a GTX 1660 Super and all errored out. I've just aborted the last WU that was still in progress... |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
"Be happy if they finish at all. Mine (3 at this time) ran between 5,000 and 90,000 sec on a GTX 1660 Super and all errored out." My 1660 Super takes about 96,000 seconds on Linux, maybe 100,000+ on Windows since the Windows app is slower. Of your 4 tasks on your 2-GPU Windows 10 host: 1 was aborted by the user. 1 failed due to a BOINC restart (or system restart) in which the task attempted to resume on a different device: it started on device 1, then tried to restart on device 0. This is a common, known cause of failures. 2 failed with “particle coordinate is nan” (nan = not a number), which commonly results from too aggressive an overclock or from overheating. GPUGRID tasks are quite intense, and these tasks are no exception. An overclock that’s stable on another project or application can be unstable on GPUGRID. Try removing any overclock, make sure the GPU has sufficient airflow, and try to avoid restarting the system.
|
|
Joined: 22 May 20 Posts: 110 Credit: 115,525,136 RAC: 0
|
Thanks, Ian, for the diagnostics! I just reverted to stock settings. I hope I'll catch a resend and can try again. It's still weird, as all other apps work just fine with the mild OC setting. I've never had a single error so far, except on MLC, but that's mostly due to the inherent nature of those tasks, where a WU occasionally ends in a NaN error. About the WU that restarted on the other device: I noticed that too, and realized that because of the dry spell here I forgot to set up the <exclude_gpu> policy in the cc_config.xml file on this new host, so the slow 750 Ti just happened to pick it up. That's solved as well for now. I'll see how the 1660 Super handles tasks in the future! |
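The GPU exclusion mentioned in the post above lives in the BOINC client's `cc_config.xml`. This is a minimal sketch that generates such a file, assuming the project URL matches the one the client shows for GPUGRID and that the slow card is device 1; both values are illustrative, not taken from the poster's host, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_cc_config(project_url: str, device_num: int) -> str:
    """Return cc_config.xml text that excludes one GPU for one project."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    # device_num is the BOINC device index of the GPU to exclude
    # (a hypothetical value here; check the client's startup messages).
    ET.SubElement(exclude, "device_num").text = str(device_num)
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


print(build_cc_config("https://www.gpugrid.net/", 1))
```

The client has to re-read its config files (or be restarted) before the exclusion takes effect, so it is best applied while no GPUGRID task is running on the excluded card.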
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
A small number of a new kind of ADRIA task seems to be in circulation. My host #557889 received one of them, named "e1s45_homeodomain_folded_100ns_44-ADRIA_HomeoFolded100ns-0-1-RND7761_0". Progress for this task is about 66% after 5.5 hours on a GTX 1650 GPU, so an initial estimate suggests that these tasks are much shorter than the previous ones. For the moment, I'm not testing whether they checkpoint correctly... |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
I also received a couple of these tasks, and concur that they complete in less time. My 2080 Ti completed them in about 9,000 s vs ~36,000 s for the longer-running tasks. Payout is 76,500 credits.
|
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
For my GTX 1650 and the task mentioned above: 30,471 seconds versus ~170,000 seconds for the previous ADRIA tasks on this device. The same 76,500 credits were awarded, since the result was returned in time for the full bonus. Task: e1s45_homeodomain_folded_100ns_44-ADRIA_HomeoFolded100ns-0-1-RND7761_0 |
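The run times quoted in this thread for the new, shorter tasks can be put side by side. This is a minimal sketch using only the figures posted here (the "~" values are the posters' own approximations); the function names are hypothetical, and the credit rate assumes the 76,500-credit payout applies to each run shown.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times shorter the new tasks are than the old ones."""
    return old_seconds / new_seconds


def credits_per_hour(credits: float, seconds: float) -> float:
    """Credit earned per hour of run time."""
    return credits / seconds * 3600


# (GPU, approx. old run time in s, new run time in s), as quoted in the posts.
reports = [("GTX 1650", 170_000, 30_471), ("RTX 2080 Ti", 36_000, 9_000)]
for gpu, old_s, new_s in reports:
    print(
        f"{gpu}: {speedup(old_s, new_s):.1f}x shorter, "
        f"{credits_per_hour(76_500, new_s):,.0f} credits/h"
    )
```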
©2025 Universitat Pompeu Fabra