Message boards : Number crunching : PABLO_bound2KIX2CMY workunits

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1814
Credit: 9,970,837,494
RAC: 6,545,179
Message 47069 - Posted: 22 Apr 2017 | 17:13:59 UTC
Last modified: 22 Apr 2017 | 17:15:08 UTC

Something is wrong with some of these workunits.
Even though some in this batch were successful, all 3 that I've received failed immediately (and they failed on every host they were sent to) with the following error:

# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 5000)

One of them also had this error:
SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1965.
The workunit that ran after it progressed very slowly, so I restarted the host (the previous error probably downclocked the GPU, but I didn't check it).
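
For anyone who wants to check their own failed tasks for the same signatures, here is a minimal sketch. It assumes the stderr text has been copied from the task page into a string; `scan_stderr` is a hypothetical helper written for this post, not part of BOINC or the GPUGRID application, and the patterns are taken only from the messages quoted above.

```python
import re

# Hypothetical helper, not part of BOINC or the GPUGRID app.
# The patterns below are copied from the error messages quoted in this thread.
UNSTABLE = "The simulation has become unstable"
RESTART = re.compile(r"Attempting restart \(step (\d+)\)")
CUDA_ERR = re.compile(r"Cuda driver error (\d+) in file '([^']+)' in line (\d+)")


def scan_stderr(text):
    """Summarize the failure signatures found in a stderr dump."""
    summary = {"unstable": 0, "restarts": [], "cuda_errors": []}
    for line in text.splitlines():
        # Count every "unstable simulation" message.
        if UNSTABLE in line:
            summary["unstable"] += 1
        # Record the step number of each attempted restart.
        m = RESTART.search(line)
        if m:
            summary["restarts"].append(int(m.group(1)))
        # Record (error code, source file, line) for each CUDA driver error.
        m = CUDA_ERR.search(line)
        if m:
            summary["cuda_errors"].append(
                (int(m.group(1)), m.group(2), int(m.group(3)))
            )
    return summary


sample = """\
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 5000)
SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1965.
"""

print(scan_stderr(sample))
```

This only counts and extracts what is already in the log; it does not say why the simulation became unstable.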

Bedrich Hajek
Joined: 28 Mar 09
Posts: 332
Credit: 3,760,508,309
RAC: 374,795
Message 47094 - Posted: 25 Apr 2017 | 2:02:57 UTC

I had one of these fail as well, and it also failed on the other hosts.

Stderr output

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -97 (0xffffff9f)
</message>
<stderr_txt>
# GPU [GeForce GTX 980 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 980 Ti
# ECC : Disabled
# Global mem : 4095MB
# Capability : 5.2
# PCI ID : 0000:02:00.0
# Device clock : 1190MHz
# Memory clock : 3505MHz
# Memory width : 384bit
# Driver version : r355_00 : 35582
# GPU 0 : 43C
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 5000)
# GPU [GeForce GTX 980 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 980 Ti
# ECC : Disabled
# Global mem : 4095MB
# Capability : 5.2
# PCI ID : 0000:02:00.0
# Device clock : 1190MHz
# Memory clock : 3505MHz
# Memory width : 384bit
# Driver version : r355_00 : 35582
# The simulation has become unstable. Terminating to avoid lock-up (1)

</stderr_txt>


http://www.gpugrid.net/workunit.php?wuid=12513254



Alessio Arbetti [VENETO]
Joined: 23 Sep 08
Posts: 1
Credit: 46,166,232
RAC: 277,470
Message 47108 - Posted: 26 Apr 2017 | 6:31:55 UTC

I also got one error.
Stderr output
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741515 (0xc0000135)
</message>
<stderr_txt>
# GPU [GeForce GTX 1050 Ti] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX 1050 Ti
# ECC : Disabled
# Global mem : 4096MB
# Capability : 6.1
# PCI ID : 0000:01:00.0
# Device clock : 1506MHz
# Memory clock : 3504MHz
# Memory width : 128bit
# Driver version : r381_64 : 38165
# GPU 0 : 50C
# GPU 0 : 51C
# GPU 0 : 53C
SWAN : FATAL : Cuda driver error 719 in file 'swanlibnv2.cpp' in line 1965.
# SWAN swan_assert 0
# GPU [GeForce GTX 1050 Ti] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX 1050 Ti
# ECC : Disabled
# Global mem : 4096MB
# Capability : 6.1
# PCI ID : 0000:01:00.0
# Device clock : 1506MHz
# Memory clock : 3504MHz
# Memory width : 128bit
# Driver version : r381_64 : 38165
# GPU 0 : 52C
# GPU 0 : 53C
# GPU 0 : 54C
# GPU 0 : 55C
# GPU 0 : 56C

</stderr_txt>
]]>
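
The two exit codes reported in this thread are each shown as a signed decimal and as a hexadecimal value. A short sketch of how the two forms relate, assuming the code is a 32-bit value (`to_unsigned32` is a hypothetical name used only for this illustration):

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as its unsigned value."""
    return code & 0xFFFFFFFF


# The two exit codes quoted in the stderr outputs in this thread.
for code in (-97, -1073741515):
    print(code, hex(to_unsigned32(code)))
```

On Windows, 0xc0000135 is the NTSTATUS value STATUS_DLL_NOT_FOUND. Whether that is the actual cause of this particular failure is not something the log alone confirms.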
