Problem - Tasks error when exiting/resuming using 334.67 drivers

Variable

Message 36279 - Posted: 14 Apr 2014, 13:48:03 UTC

I downclocked my card slightly (~50MHz), or more precisely reduced the overclock, and haven't gotten any more errors since. Not sure if that's causal or coincidental since I haven't bumped it back up yet to test.

Jacob Klein
Message 36280 - Posted: 14 Apr 2014, 14:05:15 UTC - in response to Message 36279.  
Last modified: 14 Apr 2014, 14:05:49 UTC

Variable: Your issue(s) are different than the one posted in this thread (see post 1). If you continue to have problems, please create a new thread.

Thanks,
Jacob

Jacob Klein
Message 36427 - Posted: 19 Apr 2014, 11:16:21 UTC - in response to Message 36252.  
Last modified: 19 Apr 2014, 11:16:40 UTC

And... another 8.15 task crashed just now, losing tons of work. Why are we still using 8.15?!?


Thank you!


Thank you too, for your help in diagnosing it.
On to the next problem!

Matt


I thought this problem was fixed -- why are we still receiving 8.15 tasks? I just had 2 more fail, losing several hours of work, presumably because they were 8.15 instead of 8.20. Upsetting.

Wdethomas
Message 36439 - Posted: 19 Apr 2014, 16:23:50 UTC

Power went out yesterday, I lost work units. Power went out today, I lost work units. This needs to get fixed!!!!!

Jacob Klein
Message 36707 - Posted: 28 Apr 2014, 12:34:28 UTC
Last modified: 28 Apr 2014, 12:35:36 UTC

MJH:

Although the 8.41 app appears to have improved the situation, I am still occasionally getting what appears to be the same error. I think the scenario is suspending activity, then restarting BOINC. Can you see if there's some scenario/condition that still causes the task to fail?

Error summary:
Exit status 80 (0x50) Unknown error number
The file exists.
(0x50) - exit code 80 (0x50)
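
A note on the wording of this error: 80 decimal is 0x50, and in the Windows system error table that value is ERROR_FILE_EXISTS, whose message text is "The file exists." It seems plausible, though nothing in this thread confirms it, that the text is simply the application's exit code rendered as a Windows error string rather than a sign of a real file conflict. A minimal sketch of that lookup follows; the table excerpt and the function name are illustrative and are not BOINC code.

```python
# Illustrative only: a one-entry excerpt of the Windows system error table.
# ERROR_FILE_EXISTS (80) is a real Windows constant; the function is hypothetical.
WINDOWS_ERRORS = {
    80: ("ERROR_FILE_EXISTS", "The file exists."),
}

def describe_exit_status(code: int) -> str:
    """Render an exit status roughly the way the task page shows it."""
    _name, text = WINDOWS_ERRORS.get(code, ("UNKNOWN", "Unknown error number"))
    return f"{text} (0x{code:x}) - exit code {code} (0x{code:x})"

print(describe_exit_status(80))
```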

Last message logged in stderr.txt:
# BOINC suspending at user request (exit)

Task results and stderr.txt:

http://www.gpugrid.net/result.php?resultid=9339200

Name I188-NATHAN_RPS1_adapt4-1-5-RND2310_0
Workunit 6566597
Created 25 Apr 2014 | 22:04:16 UTC
Sent 26 Apr 2014 | 11:18:41 UTC
Received 27 Apr 2014 | 4:06:56 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 80 (0x50) Unknown error number
Computer ID 153764
Report deadline 1 May 2014 | 11:18:41 UTC
Run time 38,039.02
CPU time 6,213.84
Validate state Invalid
Credit 0.00
Application version Long runs (8-12 hours on fastest card) v8.41 (cuda60)
Stderr output

<core_client_version>7.3.15</core_client_version>
<![CDATA[
<message>
The file exists.
(0x50) - exit code 80 (0x50)
</message>
<stderr_txt>
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.0
# PCI ID : 0000:09:00.0
# Device clock : 1124MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : DM337_50 : 33761
# GPU 0 : 68C
# GPU 1 : 69C
# GPU 2 : 77C
# GPU 0 : 69C
# GPU 1 : 70C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 70C
# GPU 1 : 69C
# GPU 2 : 78C
# GPU 1 : 70C
# GPU 1 : 71C
# GPU 2 : 79C
# GPU 0 : 71C
# GPU 1 : 72C
# GPU 0 : 72C
# GPU 0 : 73C
# GPU 2 : 80C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 70C
# GPU 1 : 65C
# GPU 2 : 74C
# GPU 0 : 71C
# GPU 1 : 68C
# GPU 2 : 75C
# GPU 0 : 72C
# GPU 1 : 70C
# GPU 1 : 72C
# GPU 2 : 76C
# GPU 2 : 77C
# GPU 1 : 73C
# GPU 2 : 79C
# GPU 1 : 74C
# GPU 1 : 75C
# GPU 1 : 76C
# GPU 1 : 77C
# GPU 1 : 78C
# GPU 1 : 79C
# GPU 1 : 80C
# GPU 0 : 74C
# GPU 0 : 75C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 60C
# GPU 1 : 58C
# GPU 2 : 55C
# GPU 0 : 65C
# GPU 1 : 63C
# GPU 2 : 70C
# GPU 0 : 68C
# GPU 1 : 67C
# GPU 2 : 74C
# GPU 0 : 70C
# GPU 1 : 69C
# GPU 2 : 75C
# GPU 0 : 71C
# GPU 1 : 70C
# GPU 2 : 76C
# GPU 0 : 72C
# GPU 1 : 73C
# GPU 0 : 73C
# GPU 2 : 77C
# GPU 2 : 78C
# GPU 2 : 79C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 59C
# GPU 1 : 58C
# GPU 2 : 55C
# GPU 0 : 61C
# GPU 1 : 63C
# GPU 0 : 63C
# GPU 1 : 67C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 61C
# GPU 1 : 64C
# GPU 2 : 55C
# GPU 0 : 63C
# GPU 1 : 68C
# GPU 0 : 64C
# GPU 1 : 70C
# GPU 0 : 65C
# GPU 1 : 71C
# GPU 2 : 56C
# GPU 0 : 66C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 63C
# GPU 1 : 66C
# GPU 2 : 62C
# GPU 0 : 65C
# GPU 1 : 70C
# GPU 0 : 66C
# GPU 1 : 74C
# GPU 0 : 67C
# GPU 1 : 78C
# GPU 0 : 69C
# GPU 0 : 71C
# GPU 0 : 73C
# GPU 2 : 71C
# GPU 2 : 73C
# GPU 2 : 74C
# GPU 2 : 75C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 61C
# GPU 1 : 63C
# GPU 2 : 57C
# GPU 0 : 63C
# GPU 1 : 67C
# GPU 0 : 64C
# GPU 1 : 70C
# GPU 0 : 66C
# GPU 1 : 71C
# GPU 1 : 73C
# GPU 0 : 67C
# GPU 1 : 74C
# GPU 1 : 75C
# GPU 0 : 71C
# GPU 0 : 72C
# GPU 2 : 69C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 59C
# GPU 1 : 59C
# GPU 2 : 56C
# GPU 0 : 65C
# GPU 1 : 64C
# GPU 2 : 64C
# GPU 0 : 68C
# GPU 1 : 67C
# GPU 2 : 69C
# GPU 0 : 69C
# GPU 1 : 69C
# GPU 2 : 71C
# GPU 0 : 71C
# GPU 2 : 72C
# GPU 0 : 73C
# GPU 1 : 72C
# GPU 2 : 73C
# GPU 1 : 73C
# GPU 2 : 74C
# GPU 2 : 75C
# GPU 2 : 76C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:08:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : DM337_50 : 33761
# GPU 0 : 62C
# GPU 1 : 65C
# GPU 2 : 57C
# GPU 0 : 63C
# GPU 1 : 68C
# BOINC suspending at user request (exit)

</stderr_txt>
]]>
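
Long stderr dumps like the one above are easier to compare once they are condensed. The following is a small sketch, assuming the line formats seen in this thread ("# GPU n : ttC" and "# BOINC suspending at user request (exit)"), that counts suspend/exit events and records the peak temperature per GPU index. The function name and the sample text are illustrative. Note that the GPU index on these lines comes from the temperature query, which, according to a later reply in this thread, does not necessarily match the CUDA device ordering.

```python
import re
from collections import defaultdict

# Matches temperature lines such as "# GPU 1 : 70C" but not the
# "# GPU [GeForce ...] Platform ..." header lines.
TEMP_LINE = re.compile(r"^#\s*GPU\s+(\d+)\s*:\s*(\d+)C\s*$")
SUSPEND_LINE = "# BOINC suspending at user request (exit)"

def summarize_stderr(text: str) -> dict:
    """Count suspend/exit events and record the peak temperature per GPU index."""
    suspends = 0
    peak = defaultdict(int)
    for raw in text.splitlines():
        line = raw.strip()
        if line == SUSPEND_LINE:
            suspends += 1
            continue
        match = TEMP_LINE.match(line)
        if match:
            gpu, temp = int(match.group(1)), int(match.group(2))
            peak[gpu] = max(peak[gpu], temp)
    return {"suspends": suspends, "peak_temp_c": dict(peak)}

# Sample copied from the first lines of the log above.
sample = """\
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# GPU 0 : 68C
# GPU 1 : 69C
# GPU 2 : 77C
# GPU 0 : 69C
# GPU 1 : 70C
# BOINC suspending at user request (exit)
"""
print(summarize_stderr(sample))
```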

Jozef J
Message 36835 - Posted: 14 May 2014, 16:20:05 UTC

I have been having a problem crunching on my GTX 680 cards for about a month now. Every day, tasks collapse in a weird way: the whole Windows system slows down and, according to GPU-Z, the card stops computing. The entire system runs as if in slow motion. The only thing that helps is suspending computation on the graphics card, aborting every task, and downloading new ones. After roughly 6-12 aborted tasks, another 3 start and run normally. These are weird errors, and they affect only the NVIDIA 600-series cards; on 700-series cards crunching goes perfectly.
I have been fighting this problem for months, while other projects compute without problems.
It's not overheating cards or a weak PSU. I can't crunch GPUGRID normally on these 680s, so I am considering selling them or moving them to another project.

Jacob Klein
Message 36856 - Posted: 17 May 2014, 8:01:05 UTC
Last modified: 17 May 2014, 8:02:41 UTC

MJH:

The v8.41 version of the application still has the occasional "The file exists. (0x50) - exit code 80 (0x50)" error, trashing loads of work :( Can you please invest some time to fix it?

http://www.gpugrid.net/result.php?resultid=10318262

Name	A2ART4Ex05x95-GERARD_A2ART4E-13-14-RND0991_0
Workunit	7496762
Created	14 May 2014 | 5:52:04 UTC
Sent	16 May 2014 | 13:57:32 UTC
Received	17 May 2014 | 3:24:11 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	80 (0x50) Unknown error number
Computer ID	153764
Report deadline	21 May 2014 | 13:57:32 UTC
Run time	24,161.19
CPU time	6,302.88
Validate state	Invalid
Credit	0.00
Application version	Long runs (8-12 hours on fastest card) v8.41 (cuda60)
Stderr output

<core_client_version>7.3.19</core_client_version>
<![CDATA[
<message>
The file exists.
 (0x50) - exit code 80 (0x50)
</message>
<stderr_txt>
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0	:
#	Name		: GeForce GTX 660 Ti
#	ECC		: Disabled
#	Global mem	: 3072MB
#	Capability	: 3.0
#	PCI ID		: 0000:09:00.0
#	Device clock	: 1124MHz
#	Memory clock	: 3004MHz
#	Memory width	: 192bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 67C
# GPU 1 : 75C
# GPU 2 : 74C
# GPU 0 : 68C
# GPU 1 : 76C
# GPU 0 : 69C
# GPU 0 : 70C
# GPU 1 : 77C
# GPU 0 : 71C
# GPU 0 : 72C
# GPU 2 : 75C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0	:
#	Name		: GeForce GTX 660 Ti
#	ECC		: Disabled
#	Global mem	: 3072MB
#	Capability	: 3.0
#	PCI ID		: 0000:09:00.0
#	Device clock	: 1124MHz
#	Memory clock	: 3004MHz
#	Memory width	: 192bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 66C
# GPU 1 : 71C
# GPU 2 : 58C
# GPU 0 : 67C
# GPU 2 : 62C
# GPU 2 : 66C
# GPU 2 : 67C
# GPU 0 : 68C
# GPU 1 : 72C
# GPU 2 : 68C
# GPU 2 : 69C
# GPU 2 : 70C
# GPU 0 : 69C
# GPU 1 : 73C
# GPU 2 : 71C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0	:
#	Name		: GeForce GTX 660 Ti
#	ECC		: Disabled
#	Global mem	: 3072MB
#	Capability	: 3.0
#	PCI ID		: 0000:09:00.0
#	Device clock	: 1124MHz
#	Memory clock	: 3004MHz
#	Memory width	: 192bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 66C
# GPU 1 : 71C
# GPU 2 : 65C
# GPU 0 : 67C
# GPU 1 : 72C
# GPU 2 : 67C
# GPU 2 : 68C
# GPU 0 : 68C
# GPU 2 : 69C
# GPU 1 : 73C
# GPU 0 : 69C
# GPU 2 : 70C
# GPU 2 : 71C
# GPU 1 : 74C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0	:
#	Name		: GeForce GTX 660 Ti
#	ECC		: Disabled
#	Global mem	: 3072MB
#	Capability	: 3.0
#	PCI ID		: 0000:09:00.0
#	Device clock	: 1124MHz
#	Memory clock	: 3004MHz
#	Memory width	: 192bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 68C
# GPU 1 : 73C
# GPU 2 : 68C
# GPU 2 : 69C
# GPU 2 : 70C
# GPU 0 : 69C
# GPU 1 : 74C
# GPU 2 : 71C
# GPU 0 : 70C
# GPU 1 : 75C
# GPU 2 : 72C
# GPU 2 : 73C
# GPU 1 : 76C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1	:
#	Name		: GeForce GTX 460
#	ECC		: Disabled
#	Global mem	: 1024MB
#	Capability	: 2.1
#	PCI ID		: 0000:08:00.0
#	Device clock	: 1526MHz
#	Memory clock	: 1900MHz
#	Memory width	: 256bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 57C
# GPU 1 : 68C
# GPU 2 : 61C
# GPU 0 : 61C
# GPU 1 : 69C
# GPU 0 : 64C
# GPU 1 : 70C
# GPU 0 : 65C
# GPU 1 : 71C
# GPU 0 : 66C
# GPU 1 : 72C
# GPU 0 : 67C
# GPU 1 : 73C
# GPU 0 : 69C
# GPU 0 : 70C
# GPU 2 : 67C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1	:
#	Name		: GeForce GTX 460
#	ECC		: Disabled
#	Global mem	: 1024MB
#	Capability	: 2.1
#	PCI ID		: 0000:08:00.0
#	Device clock	: 1526MHz
#	Memory clock	: 1900MHz
#	Memory width	: 256bit
#	Driver version	: DM337_50 : 33761
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1	:
#	Name		: GeForce GTX 460
#	ECC		: Disabled
#	Global mem	: 1024MB
#	Capability	: 2.1
#	PCI ID		: 0000:08:00.0
#	Device clock	: 1526MHz
#	Memory clock	: 1900MHz
#	Memory width	: 256bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 61C
# GPU 1 : 53C
# GPU 2 : 67C
# GPU 0 : 64C
# GPU 1 : 58C
# GPU 2 : 69C
# GPU 0 : 66C
# GPU 1 : 61C
# GPU 0 : 67C
# GPU 1 : 64C
# GPU 2 : 70C
# GPU 0 : 68C
# GPU 1 : 65C
# GPU 1 : 67C
# GPU 2 : 71C
# GPU 0 : 69C
# GPU 1 : 69C
# GPU 0 : 70C
# GPU 1 : 70C
# GPU 1 : 71C
# GPU 2 : 72C
# GPU 0 : 71C
# GPU 1 : 72C
# GPU 2 : 73C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1	:
#	Name		: GeForce GTX 460
#	ECC		: Disabled
#	Global mem	: 1024MB
#	Capability	: 2.1
#	PCI ID		: 0000:08:00.0
#	Device clock	: 1526MHz
#	Memory clock	: 1900MHz
#	Memory width	: 256bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 61C
# GPU 1 : 53C
# GPU 2 : 67C
# GPU 0 : 64C
# GPU 1 : 57C
# GPU 2 : 68C
# GPU 0 : 66C
# GPU 1 : 60C
# GPU 2 : 69C
# GPU 0 : 67C
# GPU 1 : 63C
# GPU 2 : 70C
# GPU 0 : 68C
# GPU 1 : 64C
# GPU 0 : 69C
# GPU 1 : 67C
# GPU 1 : 68C
# GPU 2 : 71C
# GPU 1 : 69C
# GPU 0 : 70C
# GPU 1 : 70C
# GPU 1 : 72C
# GPU 2 : 72C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1	:
#	Name		: GeForce GTX 460
#	ECC		: Disabled
#	Global mem	: 1024MB
#	Capability	: 2.1
#	PCI ID		: 0000:08:00.0
#	Device clock	: 1526MHz
#	Memory clock	: 1900MHz
#	Memory width	: 256bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 54C
# GPU 1 : 58C
# GPU 2 : 59C
# GPU 1 : 62C
# GPU 1 : 64C
# GPU 0 : 60C
# GPU 1 : 66C
# GPU 0 : 62C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1	:
#	Name		: GeForce GTX 460
#	ECC		: Disabled
#	Global mem	: 1024MB
#	Capability	: 2.1
#	PCI ID		: 0000:08:00.0
#	Device clock	: 1526MHz
#	Memory clock	: 1900MHz
#	Memory width	: 256bit
#	Driver version	: DM337_50 : 33761
# GPU 0 : 58C
# GPU 1 : 53C
# GPU 2 : 58C
# GPU 0 : 60C
# GPU 1 : 58C
# GPU 0 : 63C
# GPU 1 : 62C

</stderr_txt>
]]>


Retvari Zoltan
Message 36857 - Posted: 17 May 2014, 8:46:09 UTC - in response to Message 36856.  

MJH:

The v8.41 version of the application still has the occasional "The file exists. (0x50) - exit code 80 (0x50)" error, trashing loads of work :( Can you please invest some time to fix it?

http://www.gpugrid.net/result.php?resultid=10318262

+2

http://www.gpugrid.net/result.php?resultid=10328606

http://www.gpugrid.net/result.php?resultid=10328572

These failed after a simple system restart.

Wdethomas
Message 36984 - Posted: 1 Jun 2014, 20:03:13 UTC

Every time the lights go out I lose all the units that are being worked on. If I restart the system using the proper procedures, no problem. This has been going on for months and I am really getting sick of it. Bought a UPS, now let's see.

Jacob Klein
Message 36985 - Posted: 2 Jun 2014, 3:51:56 UTC - in response to Message 36984.  

Is your error:

The file exists.
(0x50) - exit code 80 (0x50)


If not, then create a new thread please.

This thread is about that error.

Jacob Klein
Message 37242 - Posted: 7 Jul 2014, 20:07:12 UTC
Last modified: 7 Jul 2014, 20:09:57 UTC

This is *STILL* an issue. When can we finally get it fully fixed? :(

http://www.gpugrid.net/result.php?resultid=12800989
Outcome Computation error
Client state Compute error
Exit status 80 (0x50) Unknown error number
Run time 32,087.62
Stderr output
<core_client_version>7.4.8</core_client_version>
<![CDATA[
<message>
The file exists.
 (0x50) - exit code 80 (0x50)
</message>
<stderr_txt>
...
...
...
# BOINC suspending at user request (exit)
</stderr_txt>
]]>


http://www.gpugrid.net/result.php?resultid=12796113
Outcome Computation error
Client state Compute error
Exit status 80 (0x50) Unknown error number
Run time 2,221.71
Stderr output
<core_client_version>7.4.8</core_client_version>
<![CDATA[
<message>
The file exists.
 (0x50) - exit code 80 (0x50)
</message>
<stderr_txt>
...
...
...
# BOINC suspending at user request (exit)
</stderr_txt>
]]>

MJH
Message 37325 - Posted: 20 Jul 2014, 21:08:19 UTC - in response to Message 37242.  

Jacob,

You are in luck. It's time for another round of GPUGRID development. Remind me, please, the circumstance under which this is occurring.

Matt

Jacob Klein
Message 37327 - Posted: 20 Jul 2014, 22:34:15 UTC

I'm on the road, but will be home tonight. I'll try to re-review, probably tomorrow. Thanks!

TJ
Message 37338 - Posted: 21 Jul 2014, 14:39:34 UTC - in response to Message 37325.  

Jacob,

You are in luck. It's time for another round of GPUGRID development. Remind me, please, the circumstance under which this is occurring.

Matt

Hi Matt, I don't know if we need to make a new post for this, but I have a request.
Is it possible, in the stderr output file, to show only the temperature of the GPU that did the job? Right now the temperature changes of every card are shown.
Thank you.
Greetings from TJ

GPUGRID Role account
Message 37339 - Posted: 21 Jul 2014, 15:22:47 UTC - in response to Message 37338.  

Tricky - the GPU ordering from the temperature query interface doesn't correspond to the CUDA ordering.
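
One conceivable way around the ordering mismatch is to match devices on their PCI bus ID, which the application already prints for the CUDA device in stderr. Whether the temperature query interface exposes a PCI ID in a comparable form is an assumption here, not something stated in this thread. In the sketch below, the two CUDA-side PCI IDs are taken from the logs posted earlier; the third device, all sensor-side entries, and the function name are made up for illustration.

```python
# Hypothetical data: the two CUDA-side PCI IDs appear in the stderr logs in
# this thread; the third sensor and all temperature readings are made up.
cuda_devices = [
    {"cuda_index": 0, "name": "GeForce GTX 660 Ti", "pci_id": "0000:09:00.0"},
    {"cuda_index": 1, "name": "GeForce GTX 460", "pci_id": "0000:08:00.0"},
]
temp_sensors = [
    {"sensor_index": 0, "pci_id": "0000:08:00.0", "temp_c": 70},
    {"sensor_index": 1, "pci_id": "0000:09:00.0", "temp_c": 68},
    {"sensor_index": 2, "pci_id": "0000:0a:00.0", "temp_c": 77},
]

def temperature_for_cuda_device(cuda_index, cuda_devices, temp_sensors):
    """Return the temperature of one CUDA device by matching PCI bus IDs."""
    by_pci = {s["pci_id"].lower(): s["temp_c"] for s in temp_sensors}
    for dev in cuda_devices:
        if dev["cuda_index"] == cuda_index:
            return by_pci.get(dev["pci_id"].lower())  # None if no sensor matches
    return None  # unknown CUDA index

print(temperature_for_cuda_device(0, cuda_devices, temp_sensors))
```

With a mapping like this, the application could log only the sensor that belongs to the device running the task, regardless of how the two interfaces number the cards.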

Jacob Klein
Message 37353 - Posted: 22 Jul 2014, 13:21:25 UTC

MJH:

I've reviewed the notes in the thread. The main posts that detail the problem are:
http://www.gpugrid.net/forum_thread.php?id=3621&nowrap=true#35348
http://www.gpugrid.net/forum_thread.php?id=3621&nowrap=true#37242

It is not easy to reproduce on demand. I suspect that your best bet is to investigate/walk the code, to find an area that could result in:
<message>
The file exists.
(0x50) - exit code 80 (0x50)
</message>

It seems to happen more frequently when the task is suspended before BOINC is shut down, but suspending the task might not be a requirement of the bug.

Testing should involve suspending BOINC, and then shutting BOINC down, and then starting BOINC back up. Also, to test the "power outage" scenario, I think testing could involve right clicking boincmgr.exe in Task Manager, and clicking "End process tree".
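
The two test scenarios above could be scripted roughly as follows. This is only a sketch under several assumptions: that boinccmd is on the PATH and can reach the local client, that the machine runs Windows (taskkill), and that force-killing boincmgr.exe with its child processes is a fair stand-in for "End process tree" in Task Manager. The function name and the dry_run switch are hypothetical. Restarting BOINC and re-enabling computation afterwards is left as a manual step.

```python
import subprocess

# Scenario 1: suspend computation, then ask the client to exit cleanly.
GRACEFUL_STEPS = [
    ["boinccmd", "--set_run_mode", "never"],
    ["boinccmd", "--quit"],
]
# Scenario 2 (Windows only): force-kill boincmgr.exe and its child processes
# to approximate an abrupt stop such as a power outage.
ABRUPT_STEPS = [
    ["taskkill", "/F", "/T", "/IM", "boincmgr.exe"],
]

def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    handled = []
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=False)
        handled.append(cmd)
    return handled

run_steps(GRACEFUL_STEPS)  # dry run: prints the commands, changes nothing
```

Keeping the default as a dry run makes it harder to kill a client that is in the middle of real work by accident.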

I hope this helps. The focus should be on code areas that could result in that error message.

Regards,
Jacob

GPUGRID Role account
Message 37357 - Posted: 22 Jul 2014, 15:00:47 UTC - in response to Message 37353.  

That exit circumstance is the failsafe exit that stops a WU getting stuck in an endless cycle of abort - resume, without making any progress. It should only trigger if the machine has been up for a few minutes (from which we infer that the WU crashed the machine).

Matt
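
To make the described behaviour concrete, here is a rough sketch of how such a guard might look. It is not GPUGRID's code: the threshold, the function name, and the reading of "up for a few minutes" as "booted only a few minutes ago" are all assumptions. Only the exit status 80 is taken from the reports in this thread.

```python
# Hypothetical sketch, not the actual application logic.
FAILSAFE_EXIT_CODE = 80          # exit status seen in the reports in this thread
RECENT_BOOT_SECONDS = 5 * 60     # assumed meaning of "a few minutes"

def failsafe_should_trigger(resuming_from_checkpoint: bool,
                            uptime_seconds: float) -> bool:
    """Guess whether the previous run of this work unit crashed the machine.

    The only evidence used is that the task is being resumed while the
    machine has booted very recently.
    """
    return resuming_from_checkpoint and uptime_seconds < RECENT_BOOT_SECONDS

# A task resumed two minutes after boot is treated as a suspected crash.
print(failsafe_should_trigger(True, 120))
# The same task resumed after an hour of uptime is left alone.
print(failsafe_should_trigger(True, 3600))
```

Under this reading, a task resumed shortly after any reboot, including a deliberate one, would look the same as a crash, which would be consistent with the reports above of tasks failing after a simple system restart.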

Jacob Klein
Message 37359 - Posted: 22 Jul 2014, 15:28:51 UTC

Perhaps you could give me even more clues on how to reproduce the error on demand? It seems that the failsafe is currently too stringent, causing otherwise-healthy tasks to fail when starting BOINC.

Vagelis Giannadakis
Message 37361 - Posted: 22 Jul 2014, 15:46:46 UTC - in response to Message 37359.  

He said:
It should only trigger if the machine has been up for a few minutes

So, you could try suspending / closing BOINC and then resuming it twice: once without shutting down the machine in between, and once with a shutdown in between.

©2025 Universitat Pompeu Fabra