New NATHAN_KID WUs on long
Author | Message |
---|---|
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 |
Ok, there are two groups from me on the grid right now: Thanks for the info. The SOD WUs do run faster and they just barely allow a GTX 460 to slip under the 24-hour mark (if micromanaged). So for me it's a large improvement, but still on the long side. Micromanagement drill: set a backup project and set BOINC to report immediately; then download a GPUGrid WU and turn off GPUGrid work fetch; wait until you notice the GPUGrid WU is done and the backup project is running; turn on work fetch to download a new WU; pause the WU from the backup project so the GPUGrid WU starts immediately; then un-pause the backup project WU so it will run again when the GPUGrid WU finishes. Repeat ad infinitum... If you're lucky you've squeaked under 24 hours. |
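The drill in the post above can be scripted rather than clicked through. The following is a minimal sketch, assuming the `boinccmd` command-line tool is installed and that its `--project ... nomorework/allowmorework` and `--task ... suspend/resume` operations are available on the host. The project URL is taken from the links in this thread and may differ from the URL your client is attached with; `swap_in_new_wu`, the backup URL and the backup task name are hypothetical names used only for illustration.

```python
GPUGRID_URL = "http://www.gpugrid.net/"  # assumption: URL the client is attached with


def work_fetch_cmd(project_url, allow):
    """Build the boinccmd call that allows or blocks new work for a project."""
    op = "allowmorework" if allow else "nomorework"
    return ["boinccmd", "--project", project_url, op]


def task_cmd(project_url, task_name, op):
    """Build the boinccmd call that suspends or resumes one task."""
    if op not in ("suspend", "resume"):
        raise ValueError(f"unsupported task operation: {op}")
    return ["boinccmd", "--task", project_url, task_name, op]


def swap_in_new_wu(backup_url, backup_task):
    """Commands for one round of the drill, in the order given in the post.

    The sketch only builds the commands. Waiting between the steps
    (download finished, GPUGrid WU actually started) is left to the operator.
    """
    return [
        work_fetch_cmd(GPUGRID_URL, allow=True),       # fetch a new GPUGrid WU
        task_cmd(backup_url, backup_task, "suspend"),  # let the GPUGrid WU start now
        task_cmd(backup_url, backup_task, "resume"),   # backup runs again afterwards
        work_fetch_cmd(GPUGRID_URL, allow=False),      # do not queue a second WU
    ]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where `boinccmd` can reach the running client. Building the commands separately from running them keeps the order easy to inspect before anything touches the client.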
Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 |
The SOD WUs do run faster and they just barely allow a GTX 460 to slip under the 24 hour mark (if micromanaged). Do what I just did: replaced my GTX 460 with a GTX 660. You'll sleep easy! |
Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 |
I acknowledge the new NATHAN WUs, I'm about to finish I84R7-NATHAN_KIDc22_SODcharge-0-10-RND9833. Runtime, just above 18h on a GTX 650TI on Linux x86_64. Nice-behaving, these NATHANs! Keep 'em coming, Nate! |
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 |
The SOD WUs do run faster and they just barely allow a GTX 460 to slip under the 24 hour mark (if micromanaged). If I had unlimited money I'd buy the fastest GPUs available. Some of us are retired and on fixed incomes. Buying less expensive cards and paying the electric bill is enough of a strain. Of course if you send $$$ I'll certainly purchase some GTX 660 GPUs ;-) JK, BTW: congrats on your GTX 660, it's a nice card. |
Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 |
Some of us are retired and on fixed incomes. Me too, but I had a strategy vs. She who holds the purse strings and must be obeyed. We replace our rigs every four years. My rig is almost four years old. She was persuaded that changing the PSU from 425W to 620W, and replacing the video card, was a better alternative to a new PC! BTW: congrats on your GTX 660, it's a nice card. Thank you! Love it!! |
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I'm getting system restarts while running NATHAN_KIDc22 WU's on XP-x86. The only other app I have running is WUProp (NCI). The tasks are recovering however. I also had one of these WU's fail on Linux: Exit status 255 (0xff) Unknown error number Stderr output <core_client_version>7.0.27</core_client_version> <![CDATA[ <message> process exited with code 255 (0xff, -1) </message> <stderr_txt> MDIO: cannot open file "output.restart.coor" </stderr_txt> ]]>
FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 |
I'm getting system restarts while running NATHAN_KIDc22 WU's on XP-x86. I had several acemd crashes (more than are listed in the file below, think there were about 5) on one of the NATHAN_KID WUs yesterday on a non-OCed GTX 460: <core_client_version>7.0.64</core_client_version> <![CDATA[ <stderr_txt> MDIO: cannot open file "output.restart.coor" SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574. Assertion failed: a, file swanlibnv2.cpp, line 59 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574. Assertion failed: a, file swanlibnv2.cpp, line 59 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574. Assertion failed: a, file swanlibnv2.cpp, line 59 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. # Time per step (avg over 4705000 steps): 6.844 ms # Approximate elapsed time for entire WU: 82131.953 s called boinc_finish </stderr_txt> ]]> Did the usual drill: shut down BOINC, THEN hit the X on the acemd error message, then reboot (as the GPU can sometimes become unstable when this happens). It eventually finished successfully (and I beat the 24hr deadline by 12 minutes!). This WU failed on 2 previous machines: http://www.gpugrid.net/workunit.php?wuid=4482496 |
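The two timing lines at the end of that stderr output are enough to check how close a card runs to the 24-hour mark. Here is a small sketch that parses them and computes the margin, assuming the lines always have the wording shown in the post. The function names are hypothetical. The estimate appears to cover compute time only, so the wall-clock margin can be much smaller once crashes, reboots and transfers are added, as the 12 minutes reported above suggest.

```python
import re

# Timing lines copied from the stderr output quoted in the post above.
STDERR_TAIL = (
    "# Time per step (avg over 4705000 steps): 6.844 ms "
    "# Approximate elapsed time for entire WU: 82131.953 s"
)


def parse_timing(text):
    """Return (steps averaged over, ms per step, estimated total seconds)."""
    step = re.search(r"# Time per step \(avg over (\d+) steps\):\s*([\d.]+) ms", text)
    total = re.search(r"# Approximate elapsed time for entire WU:\s*([\d.]+) s", text)
    if step is None or total is None:
        raise ValueError("timing lines not found in stderr text")
    return int(step.group(1)), float(step.group(2)), float(total.group(1))


def deadline_margin_hours(total_seconds, deadline_hours=24.0):
    """Hours left before the deadline if the WU only needed its compute time."""
    return deadline_hours - total_seconds / 3600.0


steps, ms_per_step, total_s = parse_timing(STDERR_TAIL)
print(f"estimated compute time: {total_s / 3600.0:.2f} h")
print(f"margin to 24 h: {deadline_margin_hours(total_s):.2f} h")
```

For this WU the estimate comes to a little under 23 hours of pure compute on the GTX 460, which leaves roughly an hour of slack for everything else.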
Joined: 29 May 12 Posts: 8 Credit: 21,605,500 RAC: 0 |
12mins? I should feel especially lucky then... to get this (I10R4-NATHAN_KIDc22_2-6-8-RND2039) in with 51secs to spare. :) GPU-load was 90%. terencewee* Sic itur ad astra. |
Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 |
Ahh, the thrill of sending your WU in within the 24h window!! |
Joined: 12 Dec 11 Posts: 91 Credit: 2,730,095,033 RAC: 0 |
Ahh, the thrill of sending your WU in within the 24h window!! Hint: sell all the old hardware around and grab the best GPU you can. Save energy and produce more :) Gamers would love our 1-year-"old" series 5 and older cards :D |
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I'm getting system restarts while running NATHAN_KIDc22 WU's on XP-x86. A different type of problem this time: The WU ran for 29h but had only reached 30%. There was an acemd.2865P.exe pop-up error sitting on the screen when I checked. I exited Boinc, restarted and got the same error, however the Elapsed time now told me that the WU had only run for ~5h30min. When I suspended the task the error message disappeared. I started to run a different WU (from GPUGrid, then suspended it and ran an Einstein WU, which had been suspended), then suspended it and tried to run the Nathan WU. After about 10sec I got the same error message. I closed the message and the task went to 100% and Error after ~10sec. I81R1-NATHAN_KIDc22_3-0-8-RND4827_0 4488285 31 May 2013 | 3:33:05 UTC 5 Jun 2013 | 3:33:05 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.18 (cuda42) Stderr output <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> The system cannot find the path specified. (0x3) - exit code 3 (0x3) </message> <stderr_txt> MDIO: cannot open file "output.restart.coor" Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 
</stderr_txt> ]]> I also had one of these WU's fail on Linux: ...and 2 more, same system, same error: I69R6-NATHAN_KIDc22_3-0-8-RND5284_1 4488166 31 May 2013 | 3:14:50 UTC 31 May 2013 | 6:08:41 UTC Error while computing 9,273.96 9,196.11 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42) I97R8-NATHAN_KIDc22_3-0-8-RND1236_0 4488457 31 May 2013 | 6:08:41 UTC 31 May 2013 | 12:10:10 UTC Error while computing 21,353.54 21,177.70 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42) |
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 |
A different type of problem this time: Looks like pretty much exactly what happened here: http://www.gpugrid.net/forum_thread.php?id=3378&nowrap=true#30556 |
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I31R1-NATHAN_KIDc22_SODcharge-3-10-RND8395_0 4492253 1 Jun 2013 | 19:43:28 UTC 1 Jun 2013 | 23:27:30 UTC Error while computing 13,091.41 12,971.71 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42) I76R5-NATHAN_KIDc22_SODcharge-3-10-RND5030_0 4491243 1 Jun 2013 | 10:43:07 UTC 1 Jun 2013 | 19:43:28 UTC Error while computing 12,157.94 12,044.00 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42) I14R2-NATHAN_KIDc22_3-2-8-RND5964_0 4490995 1 Jun 2013 | 6:06:40 UTC 1 Jun 2013 | 10:43:07 UTC Error while computing 15,460.73 15,325.41 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42) Top two: Stderr output <core_client_version>7.0.27</core_client_version> <![CDATA[ <message> process exited with code 255 (0xff, -1) </message> <stderr_txt> MDIO: cannot open file "output.restart.coor" </stderr_txt> ]]> Stderr output <core_client_version>7.0.27</core_client_version> <![CDATA[ <message> process exited with code 255 (0xff, -1) </message> <stderr_txt> MDIO: cannot open file "output.restart.coor" SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1841. </stderr_txt> ]]> |
Joined: 20 Jan 13 Posts: 9 Credit: 206,731,892 RAC: 0 |
I have had two WUs with the same error: I36R2-NATHAN_KIDc22_2-3-8-RND5161_8 Run time 11.7 seconds. CPU time 0.55 seconds. There are seven other error reports on this WU, all with very small CPU times. And an older one: I55R4-NATHAN_KIDc22_SODcharge-1-10-RND5713_0 Run time 140.58 seconds. CPU time 99.79 seconds. There are no other error reports and one completion report for this WU. |
Joined: 23 Dec 09 Posts: 189 Credit: 4,798,881,008 RAC: 311 |
STRANGE: http://www.gpugrid.net/result.php?resultid=6930954, I19R7-NATHAN_KIDc22_3-1-8-RND5865_1: Runtime: 14:51.51 and counting, advanced: 16.457 %, Remaining: 32:13:07. EVGA OC Scanner X says: GPU load 66-82%, MEM load: 32%, 412 MB and MCU load 20%. On a GTX570, AMD 6200 FX. I have to go, so up for comments. I will not be able to do anything until tomorrow morning. |
Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 |
Something must be wrong with this WU, it has failed on another host: http://www.gpugrid.net/result.php?resultid=6930512 |
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
WU's can fail many times. I had one which failed 8 times before it finally succeeded. So I think it's not really any indication. |
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
The computer it failed on was a Titan (which cannot run these WU's): |
Joined: 23 Dec 09 Posts: 189 Credit: 4,798,881,008 RAC: 311 |
STRANGE: http://www.gpugrid.net/result.php?resultid=6930954, I19R7-NATHAN_KIDc22_3-1-8-RND5865_1: UPDATE on this one: Finished after 28:11:19 hours. Will miss the 24-hour deadline... It is uploading at the moment. In the case of the GTX570, it's most likely that the GPU clocks dropped, but these were not reported, and neither was whether the system is set to prefer maximum performance, or how much of the CPU was being used... This system normally does these RNDXXXX tasks in around 58000 seconds, and has done so quite a few times, so this is really a strange WU. I never had issues with the system, although these RND tasks take quite long for one of the faster video cards of the last generation. The video card is an EVGA GTX 570 SC (not pushed further; I prefer that it runs more or less cool, at 68 to 73 °C with fan speed at 85%), and on the AMD 6200 FX there is a core reserved for the video card, as recommended. The card is a recent replacement by EVGA for a faulty card, so it is new. |
Joined: 23 Dec 09 Posts: 189 Credit: 4,798,881,008 RAC: 311 |
Oh, and I just noticed the upload file has a size of 107.88 MB (so more or less double what it was before), therefore it will take quite a while until it is uploaded, as my internet connection will break several times and I will end up with the famous 5-hour breaks between upload attempts. |
©2025 Universitat Pompeu Fabra