New Nvidia Driver error
Dingo
Joined: 1 Nov 07 Posts: 20 Credit: 128,376,317 RAC: 0

I updated the Nvidia driver to 431.60, and there is an error in the driver code that stops me from running GPUGRID: all the work since then has failed with this error. It is on my Windows machine with my 1080 Ti. I can still run PrimeGrid on the machine after the update, so it looks like a project code issue?

This is the machine: https://www.gpugrid.net/results.php?hostid=453402

The error appears at the very end of processing:

    Name: e9s120_e3s89p1f137-PABLO_V4_UCB_p27_sj403_no_salt_IDP-0-2-RND6771_0
    Workunit: 16678301
    Created: 28 Jul 2019 | 19:10:51 UTC
    Sent: 28 Jul 2019 | 20:27:33 UTC
    Received: 29 Jul 2019 | 2:37:26 UTC
    Server state: Over
    Outcome: Computation error
    Client state: Compute error
    Exit status: -55 (0xffffffffffffffc9) Unknown error number
    Computer ID: 453402
    Report deadline: 2 Aug 2019 | 20:27:33 UTC
    Run time: 21,941.82
    CPU time: 1,907.74
    Validate state: Invalid
    Credit: 0.00
    Application version: Long runs (8-12 hours on fastest card) v9.22 (cuda80)

Stderr output:

    <core_client_version>7.14.2</core_client_version>
    <![CDATA[
    <message>
    (unknown error) - exit code -55 (0xffffffc9)</message>
    <stderr_txt>
    # GPU [GeForce GTX 1080 Ti] Platform [Windows] Rev [3212] VERSION [80]
    # SWAN Device 0 :
    # Name : GeForce GTX 1080 Ti
    # ECC : Disabled
    # Global mem : 11264MB
    # Capability : 6.1
    # PCI ID : 0000:0A:00.0
    # Device clock : 1645MHz
    # Memory clock : 5505MHz
    # Memory width : 352bit
    # Driver version : r431_31 : 43136
    # GPU 0 : 71C
    # GPU [GeForce GTX 1080 Ti] Platform [Windows] Rev [3212] VERSION [80]
    # SWAN Device 0 :
    # Name : GeForce GTX 1080 Ti
    # ECC : Disabled
    # Global mem : 11264MB
    # Capability : 6.1
    # PCI ID : 0000:0A:00.0
    # Device clock : 1645MHz
    # Memory clock : 5505MHz
    # Memory width : 352bit
    # Driver version : r431_31 : 43136
    # GPU 0 : 68C
    # GPU 0 : 69C
    # GPU 0 : 70C
    SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1965.
    # SWAN swan_assert 0

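For reference, the two hexadecimal values shown with the exit status (`0xffffffffffffffc9` in the task summary and `0xffffffc9` in the stderr message) are the same number, -55, written as 64-bit and 32-bit two's-complement values. A minimal Python sketch of that conversion follows; the helper name `to_signed` is made up for illustration and is not part of BOINC or GPUGRID.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret an unsigned integer as a two's-complement signed value."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    # If the sign bit is set, subtract 2**bits to get the negative value.
    return value - (1 << bits) if value & sign_bit else value


# 32-bit form, as printed in the stderr <message> line
print(to_signed(0xFFFFFFC9, 32))
# 64-bit form, as printed in the task summary
print(to_signed(0xFFFFFFFFFFFFFFC9, 64))
```

Both calls print -55, so the two hex strings do not indicate two different errors.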
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1

What seems strange to me is: Run time 21,941.82, CPU time 1,907.74

Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0

There are 4 recent errors reported for this host: 3 errors with v431.36 and 1 error with v431.60.

v431.36 errors:

One task that failed was from a batch with a 68% failure rate, so this failure can be attributed to the bad batch. Two tasks failed at exactly the same time with this error:

Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1965

so I suspect the linked post explains this issue: https://www.gpugrid.net/forum_thread.php?id=4652#48209

v431.60 error:

An error appears to have been reported (Access violation : progress made, try to restart), and the task was then aborted by the user. What was the issue that led you to abort the task?

There are other hosts successfully using v431.60, so I don't think the version is the issue. Perhaps try another task to see if further issues are experienced.

EDIT: This error could also be attributed to a bad batch; see this thread: https://www.gpugrid.net/forum_thread.php?id=4634#48021

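The driver version a task actually ran under can be read from the `# Driver version` line in its stderr output, which is how the failed task in the first post can be matched to v431.36 and not v431.60. Below is a rough Python sketch of extracting it. The function name `parse_driver_version` is hypothetical, and the rule that the last two digits of the numeric field are the minor version (43136 becomes 431.36) is an assumption inferred from this thread, not a documented format.

```python
import re
from typing import Optional

# Line copied from the stderr output quoted in the first post.
STDERR_LINE = "# Driver version : r431_31 : 43136"


def parse_driver_version(line: str) -> Optional[str]:
    """Return the driver version as 'major.minor', or None if the line does not match."""
    match = re.search(r"Driver version\s*:\s*\S+\s*:\s*(\d{3,})", line)
    if match is None:
        return None
    digits = match.group(1)
    # Assumption: the last two digits are the minor version (43136 -> 431.36).
    return f"{digits[:-2]}.{digits[-2:]}"


print(parse_driver_version(STDERR_LINE))
```

For the line above this prints 431.36, which is consistent with the version attributed to the three earlier errors.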
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0

"what seems strange to me is:"

If SWAN_SYNC is not enabled, this can be normal, especially on a fast processor (Ryzen 7 1800X).

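To put the two figures in proportion, the share of the run during which the CPU was busy can be computed directly from the reported values. This is only a small arithmetic sketch; the variable names are illustrative, and it assumes both figures are in the same unit.

```python
# Values from the task summary in the first post.
run_time = 21_941.82
cpu_time = 1_907.74

# Fraction of the elapsed run time during which the CPU was busy.
cpu_share = cpu_time / run_time

print(f"CPU busy for {cpu_share:.1%} of the run")
```

This prints a share of about 8.7%, meaning the task spent most of its elapsed time waiting on the GPU.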
Dingo
Joined: 1 Nov 07 Posts: 20 Credit: 128,376,317 RAC: 0

OK, I will try another task and see what happens. This is the task that is running now: https://www.gpugrid.net/workunit.php?wuid=16682627

Dingo
Joined: 1 Nov 07 Posts: 20 Credit: 128,376,317 RAC: 0

OK, all is fine now. It must have been a problem with the update happening while GPUGRID was running?

Retvari Zoltan
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0

"OK all is fine now. Must have been a problem of the update happening while GPUGRID was running ??"

Exactly.

©2025 Universitat Pompeu Fabra