Compute error 1(0x1) on all units since last night
| Author | Message |
|---|---|
|
Joined: 17 Nov 12 Posts: 10 Credit: 185,958,753 RAC: 0
|
My GTX 660 Ti and GTX 650 suddenly started erroring out on every task, all with the same error as far as I can tell. Name: 2x11_8-NOELIA_hfXA_long-0-2-RND7200_1; Workunit: 3977330; Created: 1 Jan 2013, 5:44:59 UTC; Sent: 1 Jan 2013, 10:14:59 UTC; Received: 1 Jan 2013, 10:23:43 UTC; Server state: Over; Outcome: Computation error; Client state: Compute error; Exit status: 1 (0x1); Computer ID: 138949. <core_client_version>7.0.33</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> ]]> Everything was working fine until last night. The nVidia driver is 306.97. Any ideas what's wrong? |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
"MY GTX 660 TI and GTX 650 just suddenly started erroring out every task all with the same error as far as I can tell." Sometimes the card (or the driver, or the OS) gets stuck, and only a restart can resolve it. Have you tried a system restart? |
|
Joined: 17 Nov 12 Posts: 10 Credit: 185,958,753 RAC: 0
|
I just reset GPUGrid and am about to restart the system. I was wondering whether this is a known error, as I've already trashed 32 units and didn't want to keep trashing more. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I appear to have had a similar problem. It started last night, just after midnight CET. http://www.gpugrid.net/results.php?hostid=139265&offset=0&show_names=0&state=5&appid= Long tasks just started failing, one after the other. Most failed after ~200 sec. They might have been failing only on my GTX660Ti and not on my GTX470s; a task was running on it. After I restarted, the same task started to run on my GTX660Ti, and it now seems to be progressing normally... GPUGrid stopped sending me work, so I will have to run some jobs from other projects and wait for my rating to improve before getting new tasks (only ~4h if the one task I have completes and reports successfully). As well as the possibility that this was caused by bad tasks, it could have been caused by a CPU BOINC project, by BOINC itself, or by the driver (306.97 in my case). W7x64. Of the failed WUs, two tasks also failed on other systems: http://www.gpugrid.net/workunit.php?wuid=3977079 http://www.gpugrid.net/workunit.php?wuid=3977023 However, some resends ran successfully, suggesting it's not an issue with GPUGrid. FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
|
Joined: 17 Nov 12 Posts: 10 Credit: 185,958,753 RAC: 0
|
I reset the project, did a clean nVidia driver update to 310.70, and rebooted. So far I've got 1 task, and it seems to be running to completion. 15% more to go, and we will see... |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
The task I had failed! http://www.gpugrid.net/result.php?resultid=6235285 Name: 2x12_4-NOELIA_hfXA_long-0-2-RND6878_0; Workunit: 3977346; Created: 19 Dec 2012, 20:37:02 UTC; Sent: 1 Jan 2013, 5:58:06 UTC; Received: 1 Jan 2013, 17:01:28 UTC; Server state: Over; Outcome: Computation error; Client state: Compute error; Exit status: 98 (0x62); Computer ID: 139265; Report deadline: 6 Jan 2013, 5:58:06 UTC; Run time: 35,731.98; CPU time: 30,914.86; Validate state: Invalid; Credit: 0.00; Application version: Long runs (8-12 hours on fastest card) v6.16 (cuda42). The error reported was "ERROR: file deven.cpp line 1106: # Energies have become nan". Perhaps it was one of the earlier tasks that failed on completion? It wasn't resent. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
"The task I had failed!" Since then, it has been resent to another host, so we will see. We have 17880 unsent workunits (and only 2174 in progress) at the moment, so a resend takes more time than usual. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I have identified the root of the problem I was encountering: the GTX660Ti's fan was stuck at 40%. I had it on a profile so that fan speed would increase with temperature, but after I updated MSI Afterburner a couple of days back, the profile was no longer applied to the GTX660Ti; it was applied only to the GTX470. That's what I get for 'upgrading' software without any real need. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I've had another error on that system (GTX660Ti now at 62°C): task 6286250; workunit 4012377; sent 2 Jan 2013, 13:14:43 UTC; reported 2 Jan 2013, 20:26:00 UTC; Error while computing; run time 18,859.67; CPU time 1,537.28; credit ---; Long runs (8-12 hours on fastest card) v6.16 (cuda42). Stderr output: <core_client_version>7.0.42</core_client_version> <![CDATA[ <message> - exit code 98 (0x62) </message> <stderr_txt> MDIO: cannot open file "restart.coor" ERROR: file deven.cpp line 1106: # Energies have become nan called boinc_finish </stderr_txt> ]]> It also failed on another system using the CUDA 3.1 app. Tasks for this workunit (task; computer; sent; reported or deadline; status; run time; CPU time; credit; application): 6285719; 79738; 2 Jan 2013, 8:55:50 UTC; 2 Jan 2013, 10:36:53 UTC; Error while computing; 9.51; 0.05; ---; Long runs (8-12 hours on fastest card) v6.16 (cuda31). 6286250; 139265; 2 Jan 2013, 13:14:43 UTC; 2 Jan 2013, 20:26:00 UTC; Error while computing; 18,859.67; 1,537.28; ---; Long runs (8-12 hours on fastest card) v6.16 (cuda42). 6287663; 142106; 2 Jan 2013, 23:35:09 UTC; 7 Jan 2013, 23:35:09 UTC; In progress; ---; ---; ---; Long runs (8-12 hours on fastest card) v6.16 (cuda42). I went through earlier WU failures, and while most WUs eventually succeeded, most of the resends failed on at least one other system, some failing numerous times. The issue seems to be the same for Long and Short WUs: http://www.gpugrid.net/results.php?hostid=139265&offset=0&show_names=0&state=5&appid= While the errors are mostly early in the runs, some occur late into the run. It's also an issue for both apps (CUDA 3.1 and 4.2), and there seem to be quite a few 'error while downloading' failures. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
These probably belong in the "Energies have become nan" thread, but: task 6294105; workunit 4017217; computer 139265; sent 4 Jan 2013, 23:14:14 UTC; reported 5 Jan 2013, 12:31:59 UTC; Error while computing; run time 42,955.90; CPU time 3,395.41; credit ---; Long runs (8-12 hours on fastest card) v6.17 (cuda42). Tasks for this workunit (task; computer; sent; reported or deadline; status; run time; CPU time; credit; application): 6293240; 112581; 4 Jan 2013, 18:40:13 UTC; 4 Jan 2013, 18:49:20 UTC; Error while computing; 2.16; 2.09; ---; Long runs (8-12 hours on fastest card) v6.17 (cuda42). 6294105; 139265; 4 Jan 2013, 23:14:14 UTC; 5 Jan 2013, 12:31:59 UTC; Error while computing; 42,955.90; 3,395.41; ---; Long runs (8-12 hours on fastest card) v6.17 (cuda42). 6296657; ---; ---; ---; Unsent; ---; ---; ---. |
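
Two different failure signatures show up in this thread: exit code 1 (0x1) with "Incorrect function.", and exit code 98 (0x62) with "Energies have become nan". For anyone sorting through a pile of failed tasks, the following is a minimal Python sketch that pulls the exit code and any `ERROR: file ... line ...` messages out of a task's stderr text. The function name `summarize_stderr` and both sample strings are made up for illustration. The samples copy the text quoted above but assume each stderr message sits on its own line, as it presumably does on the result page. This is not part of BOINC or the GPUGrid app.

```python
import re

# Matches e.g. "exit code 98 (0x62)" inside the <message> element.
EXIT_CODE_RE = re.compile(r"exit code (\d+) \(0x[0-9a-fA-F]+\)")
# Matches application error lines such as
# "ERROR: file deven.cpp line 1106: # Energies have become nan".
# Assumes one message per line; "." does not cross newlines.
ERROR_LINE_RE = re.compile(r"ERROR: file (\S+) line (\d+): #?\s*(.+)")


def summarize_stderr(stderr_text):
    """Return the exit code and (file, line, message) errors found in stderr_text."""
    match = EXIT_CODE_RE.search(stderr_text)
    exit_code = int(match.group(1)) if match else None
    errors = [
        (source_file, int(line_no), message.strip())
        for source_file, line_no, message in ERROR_LINE_RE.findall(stderr_text)
    ]
    return {"exit_code": exit_code, "errors": errors}


# Hypothetical sample modelled on the exit code 98 output quoted in the thread.
SAMPLE_NAN = """<core_client_version>7.0.42</core_client_version>
<![CDATA[
<message>
 - exit code 98 (0x62)
</message>
<stderr_txt>
MDIO: cannot open file "restart.coor"
ERROR: file deven.cpp line 1106: # Energies have become nan
called boinc_finish
</stderr_txt>
]]>"""

# Hypothetical sample modelled on the exit code 1 output from the first post.
SAMPLE_EXIT1 = "<message> Incorrect function. (0x1) - exit code 1 (0x1) </message>"

if __name__ == "__main__":
    for label, text in (("nan failure", SAMPLE_NAN), ("exit 1 failure", SAMPLE_EXIT1)):
        print(label, summarize_stderr(text))
```

Tasks that report exit code 1 with no `ERROR:` line can then be separated from those that ran for a while and ended with the nan error. That is roughly the split between the first post and the later ones. If the stderr text has been flattened onto one line, as in the pasted output above, the message pattern would also capture whatever follows the error, so it would need tightening.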