Problem - Tasks error when exiting/resuming using 334.67 drivers
| Author | Message |
|---|---|
|
Joined: 20 Nov 13 Posts: 21 Credit: 480,846,415 RAC: 0
|
I downclocked my card slightly (~50MHz), or more precisely reduced the overclock, and haven't gotten any more errors since. Not sure if that's causal or coincidental since I haven't bumped it back up yet to test. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Variable: Your issue(s) are different than the one posted in this thread (see post 1). If you continue to have problems, please create a new thread. Thanks, Jacob |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
And... another 8.15 task crashed just now, losing tons of work. Why are we still using 8.15?!?
|
|
Joined: 6 Feb 10 Posts: 38 Credit: 274,204,838 RAC: 0
|
Power went out yesterday, I lost work units. Power went out today, I lost work units. This needs to get fixed!!!!! |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
MJH: Although the 8.41 app appears to have improved the situation, I am still occasionally getting what appears to be the same error. I think the scenario is suspending activity, then restarting BOINC. Can you see if there's some scenario/condition that still causes the task to fail?
Error summary: Exit status 80 (0x50) Unknown error number. The file exists. (0x50) - exit code 80 (0x50)
Last message logged in stderr.txt: # BOINC suspending at user request (exit)
Task results and stderr.txt: http://www.gpugrid.net/result.php?resultid=9339200
Name: I188-NATHAN_RPS1_adapt4-1-5-RND2310_0
Workunit: 6566597
Created: 25 Apr 2014 | 22:04:16 UTC
Sent: 26 Apr 2014 | 11:18:41 UTC
Received: 27 Apr 2014 | 4:06:56 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 80 (0x50) Unknown error number
Computer ID: 153764
Report deadline: 1 May 2014 | 11:18:41 UTC
Run time: 38,039.02
CPU time: 6,213.84
Validate state: Invalid
Credit: 0.00
Application version: Long runs (8-12 hours on fastest card) v8.41 (cuda60)
Stderr output: <core_client_version>7.3.15</core_client_version> <![CDATA[ <message> The file exists.
(0x50) - exit code 80 (0x50) </message> <stderr_txt> # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 68C # GPU 1 : 69C # GPU 2 : 77C # GPU 0 : 69C # GPU 1 : 70C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 70C # GPU 1 : 69C # GPU 2 : 78C # GPU 1 : 70C # GPU 1 : 71C # GPU 2 : 79C # GPU 0 : 71C # GPU 1 : 72C # GPU 0 : 72C # GPU 0 : 73C # GPU 2 : 80C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 70C # GPU 1 : 65C # GPU 2 : 74C # GPU 0 : 71C # GPU 1 : 68C # GPU 2 : 75C # GPU 0 : 72C # GPU 1 : 70C # GPU 1 : 72C # GPU 2 : 76C # GPU 2 : 77C # GPU 1 : 73C # GPU 2 : 79C # GPU 1 : 74C # GPU 1 : 75C # GPU 1 : 76C # GPU 1 : 77C # GPU 1 : 78C # GPU 1 : 79C # GPU 1 : 80C # GPU 0 : 74C # GPU 0 : 75C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 60C # GPU 1 : 58C # GPU 2 : 55C # GPU 0 : 65C # GPU 1 : 63C 
# GPU 2 : 70C # GPU 0 : 68C # GPU 1 : 67C # GPU 2 : 74C # GPU 0 : 70C # GPU 1 : 69C # GPU 2 : 75C # GPU 0 : 71C # GPU 1 : 70C # GPU 2 : 76C # GPU 0 : 72C # GPU 1 : 73C # GPU 0 : 73C # GPU 2 : 77C # GPU 2 : 78C # GPU 2 : 79C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 59C # GPU 1 : 58C # GPU 2 : 55C # GPU 0 : 61C # GPU 1 : 63C # GPU 0 : 63C # GPU 1 : 67C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 64C # GPU 2 : 55C # GPU 0 : 63C # GPU 1 : 68C # GPU 0 : 64C # GPU 1 : 70C # GPU 0 : 65C # GPU 1 : 71C # GPU 2 : 56C # GPU 0 : 66C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 63C # GPU 1 : 66C # GPU 2 : 62C # GPU 0 : 65C # GPU 1 : 70C # GPU 0 : 66C # GPU 1 : 74C # GPU 0 : 67C # GPU 1 : 78C # GPU 0 : 69C # GPU 0 : 71C # GPU 0 : 73C # GPU 2 : 71C # GPU 2 : 73C # GPU 2 : 74C # GPU 2 : 75C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # 
Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 63C # GPU 2 : 57C # GPU 0 : 63C # GPU 1 : 67C # GPU 0 : 64C # GPU 1 : 70C # GPU 0 : 66C # GPU 1 : 71C # GPU 1 : 73C # GPU 0 : 67C # GPU 1 : 74C # GPU 1 : 75C # GPU 0 : 71C # GPU 0 : 72C # GPU 2 : 69C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 59C # GPU 1 : 59C # GPU 2 : 56C # GPU 0 : 65C # GPU 1 : 64C # GPU 2 : 64C # GPU 0 : 68C # GPU 1 : 67C # GPU 2 : 69C # GPU 0 : 69C # GPU 1 : 69C # GPU 2 : 71C # GPU 0 : 71C # GPU 2 : 72C # GPU 0 : 73C # GPU 1 : 72C # GPU 2 : 73C # GPU 1 : 73C # GPU 2 : 74C # GPU 2 : 75C # GPU 2 : 76C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 62C # GPU 1 : 65C # GPU 2 : 57C # GPU 0 : 63C # GPU 1 : 68C # BOINC suspending at user request (exit) </stderr_txt> ]]> |
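A long stderr dump like the one above is easier to compare between failed tasks once it is reduced to a few numbers. The following is only a rough sketch, not anything from the GPUGRID application: `summarize_stderr` and its field names are hypothetical, and the sample text is a shortened excerpt in the same style as the log above.

```python
import re

SUSPEND_MARKER = "# BOINC suspending at user request (exit)"


def summarize_stderr(stderr_text: str) -> dict:
    """Summarize a flattened GPUGRID-style stderr dump (hypothetical helper)."""
    # Each (re)start of the task prints a header such as "# SWAN Device 1 :".
    devices = re.findall(r"# SWAN Device (\d+) :", stderr_text)
    # Temperature lines look like "# GPU 2 : 80C".
    temps = [int(t) for t in re.findall(r"# GPU \d+ : (\d+)C", stderr_text)]
    return {
        "starts": len(devices),
        "suspends": stderr_text.count(SUSPEND_MARKER),
        "devices_used": sorted({int(d) for d in devices}),
        "max_temp_c": max(temps) if temps else None,
        "ends_with_suspend": stderr_text.rstrip().endswith(SUSPEND_MARKER),
    }


# Shortened sample in the style of the stderr posted above.
sample = (
    "# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] "
    "# SWAN Device 0 : # Name : GeForce GTX 660 Ti # GPU 0 : 68C # GPU 2 : 77C "
    "# BOINC suspending at user request (exit) "
    "# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] "
    "# SWAN Device 1 : # Name : GeForce GTX 460 # GPU 1 : 68C "
    "# BOINC suspending at user request (exit)"
)
summary = summarize_stderr(sample)
print(summary)
```

A dump whose last entry is the suspend message, as in the report above, shows up here as `ends_with_suspend` being true.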
|
Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 0
|
I have been having a problem crunching on GTX 680 cards for a month now. Every day the tasks collapse in a weird way: the Windows system slows down and, according to GPU-Z, the card stops computing. The entire system runs as if in slow motion. The only thing that helps is suspending computation on the graphics card, aborting every task, and downloading new ones. After about 6-12 aborted tasks, another 3 start working normally. It's a weird error and concerns only NVIDIA 600-series cards; on 700-series cards crunching goes perfectly. I have been struggling with the problem for months, and computing for other projects works without problems. It's not overheating cards or a weak PSU. I'm not able to crunch GPUGRID normally on these 680s, so I am considering selling them or switching to another project. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
MJH: The v8.41 version of the application still has the occasional "The file exists. (0x50) - exit code 80 (0x50)" error, trashing loads of work :( Can you please invest some time to fix it?
http://www.gpugrid.net/result.php?resultid=10318262
Name: A2ART4Ex05x95-GERARD_A2ART4E-13-14-RND0991_0
Workunit: 7496762
Created: 14 May 2014 | 5:52:04 UTC
Sent: 16 May 2014 | 13:57:32 UTC
Received: 17 May 2014 | 3:24:11 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 80 (0x50) Unknown error number
Computer ID: 153764
Report deadline: 21 May 2014 | 13:57:32 UTC
Run time: 24,161.19
CPU time: 6,302.88
Validate state: Invalid
Credit: 0.00
Application version: Long runs (8-12 hours on fastest card) v8.41 (cuda60)
Stderr output: <core_client_version>7.3.19</core_client_version> <![CDATA[ <message> The file exists. (0x50) - exit code 80 (0x50) </message> <stderr_txt> # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 67C # GPU 1 : 75C # GPU 2 : 74C # GPU 0 : 68C # GPU 1 : 76C # GPU 0 : 69C # GPU 0 : 70C # GPU 1 : 77C # GPU 0 : 71C # GPU 0 : 72C # GPU 2 : 75C # BOINC suspending at user request (exit) # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 66C # GPU 1 : 71C # GPU 2 : 58C # GPU 0 : 67C # GPU 2 : 62C # GPU 2 : 66C # GPU 2 : 67C # GPU 0 : 68C # GPU 1 : 72C # GPU 2 : 68C # GPU 2 : 69C # GPU 2 : 70C # GPU 0 : 69C # GPU 1 : 73C # GPU 2 : 71C # BOINC suspending at user request (exit) # GPU [GeForce GTX 660 Ti] Platform [Windows]
Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 66C # GPU 1 : 71C # GPU 2 : 65C # GPU 0 : 67C # GPU 1 : 72C # GPU 2 : 67C # GPU 2 : 68C # GPU 0 : 68C # GPU 2 : 69C # GPU 1 : 73C # GPU 0 : 69C # GPU 2 : 70C # GPU 2 : 71C # GPU 1 : 74C # BOINC suspending at user request (exit) # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 68C # GPU 1 : 73C # GPU 2 : 68C # GPU 2 : 69C # GPU 2 : 70C # GPU 0 : 69C # GPU 1 : 74C # GPU 2 : 71C # GPU 0 : 70C # GPU 1 : 75C # GPU 2 : 72C # GPU 2 : 73C # GPU 1 : 76C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 57C # GPU 1 : 68C # GPU 2 : 61C # GPU 0 : 61C # GPU 1 : 69C # GPU 0 : 64C # GPU 1 : 70C # GPU 0 : 65C # GPU 1 : 71C # GPU 0 : 66C # GPU 1 : 72C # GPU 0 : 67C # GPU 1 : 73C # GPU 0 : 69C # GPU 0 : 70C # GPU 2 : 67C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev 
[3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 53C # GPU 2 : 67C # GPU 0 : 64C # GPU 1 : 58C # GPU 2 : 69C # GPU 0 : 66C # GPU 1 : 61C # GPU 0 : 67C # GPU 1 : 64C # GPU 2 : 70C # GPU 0 : 68C # GPU 1 : 65C # GPU 1 : 67C # GPU 2 : 71C # GPU 0 : 69C # GPU 1 : 69C # GPU 0 : 70C # GPU 1 : 70C # GPU 1 : 71C # GPU 2 : 72C # GPU 0 : 71C # GPU 1 : 72C # GPU 2 : 73C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 53C # GPU 2 : 67C # GPU 0 : 64C # GPU 1 : 57C # GPU 2 : 68C # GPU 0 : 66C # GPU 1 : 60C # GPU 2 : 69C # GPU 0 : 67C # GPU 1 : 63C # GPU 2 : 70C # GPU 0 : 68C # GPU 1 : 64C # GPU 0 : 69C # GPU 1 : 67C # GPU 1 : 68C # GPU 2 : 71C # GPU 1 : 69C # GPU 0 : 70C # GPU 1 : 70C # GPU 1 : 72C # GPU 2 : 72C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 54C # GPU 1 : 58C # GPU 2 : 59C # GPU 1 : 62C # GPU 1 : 64C # GPU 0 : 60C # GPU 1 : 66C # GPU 0 : 62C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 
1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 58C # GPU 1 : 53C # GPU 2 : 58C # GPU 0 : 60C # GPU 1 : 58C # GPU 0 : 63C # GPU 1 : 62C </stderr_txt> ]]> |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
MJH: +2 http://www.gpugrid.net/result.php?resultid=10328606 http://www.gpugrid.net/result.php?resultid=10328572 These failed after a simple system restart. |
|
Joined: 6 Feb 10 Posts: 38 Credit: 274,204,838 RAC: 0
|
Every time the lights go out I lose all the units that are being worked on. If I restart the system using the proper procedures, no problem. This has been going on for months and I am really getting sick of it. Bought a UPS, now let's see. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Is your error:
If not, then create a new thread please. This thread is about that error. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
This is *STILL* an issue. When can we finally get it fully fixed? :(
http://www.gpugrid.net/result.php?resultid=12800989
Outcome: Computation error
Client state: Compute error
Exit status: 80 (0x50) Unknown error number
Run time: 32,087.62
Stderr output: <core_client_version>7.4.8</core_client_version> <![CDATA[ <message> The file exists. (0x50) - exit code 80 (0x50) </message> <stderr_txt> ... ... ... # BOINC suspending at user request (exit) </stderr_txt> ]]>
http://www.gpugrid.net/result.php?resultid=12796113
Outcome: Computation error
Client state: Compute error
Exit status: 80 (0x50) Unknown error number
Run time: 2,221.71
Stderr output: <core_client_version>7.4.8</core_client_version> <![CDATA[ <message> The file exists. (0x50) - exit code 80 (0x50) </message> <stderr_txt> ... ... ... # BOINC suspending at user request (exit) </stderr_txt> ]]> |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
Jacob, You are in luck. It's time for another round of GPUGRID development. Remind me, please, of the circumstances under which this is occurring. Matt |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
I'm on the road, but will be home tonight. I'll try to re-review, probably tomorrow. Thanks! |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
Jacob, Hi Matt, I don't know if we need to make a new post for this, but I have a request. Is it possible, in the Stderr output file, to show only the temperature of the GPU that did the job? Right now the temperature changes of every card are shown. Thank you. Greetings from TJ |
|
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0
|
Tricky - the GPU ordering from the temperature query interface doesn't correspond to the CUDA ordering. |
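One possible way around the ordering mismatch, sketched here purely as an illustration, is to match devices by PCI ID instead of by index, since the stderr output already prints a PCI ID for the CUDA device. This assumes the temperature interface can also report a PCI ID per GPU, which the thread does not confirm. The function names are hypothetical, and the PCI ID given for the third card is made up for the example.

```python
def normalize_pci_id(pci_id: str) -> str:
    """Reduce 'domain:bus:device.function' to lowercase 'bus:device.function'."""
    parts = pci_id.strip().lower().split(":")
    # Keep only the last two fields so differing domain widths still match.
    return ":".join(parts[-2:])


def temps_for_cuda_device(cuda_pci_id, sensor_readings):
    """Return the temperatures that belong to the CUDA device.

    sensor_readings is a list of (pci_id, temp_c) tuples in whatever
    order the temperature interface reports them.
    """
    wanted = normalize_pci_id(cuda_pci_id)
    return [temp for pci, temp in sensor_readings
            if normalize_pci_id(pci) == wanted]


# Hypothetical readings: the sensor order need not match the CUDA order.
readings = [
    ("0000:08:00.0", 69),      # GTX 460 in the logs above
    ("0000:09:00.0", 68),      # GTX 660 Ti in the logs above
    ("0000:04:00.0", 77),      # third card, PCI ID invented for this example
    ("00000000:09:00.0", 69),  # same card, wider domain field
]
own_temps = temps_for_cuda_device("0000:09:00.0", readings)
print(own_temps)
```

With a mapping like this, only the readings for the card that ran the task would need to be written to stderr.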
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
MJH: I've reviewed the notes in the thread. The main posts that detail the problem are: http://www.gpugrid.net/forum_thread.php?id=3621&nowrap=true#35348 http://www.gpugrid.net/forum_thread.php?id=3621&nowrap=true#37242 It is not easy to reproduce on demand. I suspect that your best bet is to investigate/walk the code, to find an area that could result in: <message> The file exists. (0x50) - exit code 80 (0x50) </message> It seems to happen more frequently when the task is suspended before BOINC is shut down, but suspending the task might not be a requirement of the bug. Testing should involve suspending BOINC, then shutting BOINC down, and then starting BOINC back up. Also, to test the "power outage" scenario, I think testing could involve right-clicking boincmgr.exe in Task Manager and clicking "End process tree". I hope this helps. The focus should be on code areas that could result in that error message. Regards, Jacob |
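The test sequence described above could be scripted roughly as follows. This is a sketch under assumptions, not a verified procedure: it only builds and prints the command lines, `restart_test_commands` is a hypothetical helper, and the `boinccmd` and `taskkill` options shown should be checked against the installed versions before use. Starting BOINC again afterwards is left as a manual step because it differs between installations.

```python
from typing import List


def restart_test_commands(simulate_power_loss: bool = False) -> List[List[str]]:
    """Build the commands for one test pass (hypothetical helper)."""
    if simulate_power_loss:
        # Rough stand-in for "End process tree" on boincmgr.exe in Task Manager.
        return [["taskkill", "/F", "/T", "/IM", "boincmgr.exe"]]
    return [
        ["boinccmd", "--set_run_mode", "never"],  # suspend all computation
        ["boinccmd", "--quit"],                   # ask the client to exit
    ]


for command in restart_test_commands():
    print(" ".join(command))
for command in restart_test_commands(simulate_power_loss=True):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` once the commands have been confirmed for the machine under test.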
|
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0
|
That exit circumstance is the failsafe exit that stops a WU getting stuck in an endless cycle of abort - resume, without making any progress. It should only trigger if the machine has been up for a few minutes (from which we infer that the WU crashed the machine). Matt |
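To make the failsafe described above concrete, here is a minimal sketch of how such a check might look. It is not the GPUGRID code: the marker file, the function names, and the 300-second threshold are all assumptions made for illustration. The one grounded detail is the exit status 80 (0x50), which Windows reports as "The file exists."

```python
import os
import tempfile

EXIT_FILE_EXISTS = 80      # 0x50, the exit status reported in this thread
UPTIME_THRESHOLD_S = 300   # "a few minutes"; the real value is not stated


def should_failsafe_exit(marker_exists, uptime_s, threshold_s=UPTIME_THRESHOLD_S):
    """Give up if the last run ended uncleanly and the machine just booted."""
    return marker_exists and uptime_s < threshold_s


def start_task(marker_path, uptime_s):
    """Return 0 to keep running, or EXIT_FILE_EXISTS to abandon the task."""
    if should_failsafe_exit(os.path.exists(marker_path), uptime_s):
        return EXIT_FILE_EXISTS
    open(marker_path, "w").close()  # mark the task as running
    return 0


def clean_exit(marker_path):
    """Remove the marker on an orderly suspend or exit."""
    if os.path.exists(marker_path):
        os.remove(marker_path)


with tempfile.TemporaryDirectory() as workdir:
    marker = os.path.join(workdir, "running.marker")
    results = [
        start_task(marker, uptime_s=100),   # first start, no marker yet
        start_task(marker, uptime_s=100),   # marker left behind, fresh boot
        start_task(marker, uptime_s=7200),  # marker left behind, long uptime
    ]
    clean_exit(marker)
    results.append(start_task(marker, uptime_s=100))  # clean exit, fresh boot
print(results)
```

Under these assumptions a power cut followed by a reboot looks the same as a crash caused by the work unit, which would be consistent with the power-outage reports earlier in the thread, though that connection is only a guess.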
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Perhaps you could give me even more clues on how to reproduce the error on demand? It seems that it is currently too stringent, causing otherwise-healthy tasks to fail when starting BOINC. |
|
Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0
|
He said: "It should only trigger if the machine has been up for a few minutes." So, you could try suspending / closing BOINC and then resuming it, once without shutting down the machine in between and once with shutting it down.
|
©2025 Universitat Pompeu Fabra