Error after 4 Hours
| Author | Message |
|---|---|
|
Joined: 5 Mar 14 Posts: 16 Credit: 16,903,909 RAC: 0
|
Hi, here is the task in question: https://www.gpugrid.net/result.php?resultid=15241975 It seems that the WU became unstable and terminated, but I have no idea why. No OC'ing, good PSU (Corsair HX750i), no shutdowns or graphics glitches, and good cooling (GPU temps are a constant 72C at 23C ambient). Does anybody know why? Thanks. |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
It is probably overclocked too much by the factory. Try reducing the GPU clock in 100 MHz steps; one step will probably be enough. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
> Here is the task in question: https://www.gpugrid.net/result.php?resultid=15241975

Your card is factory overclocked: according to the GTX 980 Ti specifications on the NVIDIA homepage, its default base clock is 1000MHz, while your card's is 1190MHz according to the task's log:

    <stderr_txt>
    # GPU [GeForce GTX 980 Ti] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 980 Ti
    # ECC : Disabled
    # Global mem : 4095MB
    # Capability : 5.2
    # PCI ID : 0000:01:00.0
    # Device clock : 1190MHz
    # Memory clock : 3505MHz
    # Memory width : 384bit
    # Driver version : r372_53 : 37254

While that is not too much (my GTX 980 Ti cards are running at 1380-1400MHz), it could cause these errors if the GPU voltage is not appropriate for this frequency. Raising the GPU voltage, however, will raise the GPU temperature, so it's better to reduce the frequency and/or the temperature.

> good PSU (Corsair HX750i), no shut downs or graphic glitches, and good cooling (GPU temps are 72C constant on 23C ambient).

Check the GPU voltage with a GPU monitoring tool like GPU-Z, Nvidia Inspector or MSI Afterburner. Setting a more aggressive fan profile to further decrease the GPU temperature could fix this error. |
|
Joined: 5 Mar 14 Posts: 16 Credit: 16,903,909 RAC: 0
|
I'm really hesitant to reduce the clocks, as I also game and I'm frankly too lazy to switch between OC profiles. However, the fans usually run at 35%, so I can always make the fan curve more aggressive and raise the voltage. Also, my current WU is almost complete and so far there have been no terminations. I think the GPU was randomly overloaded, but if it becomes a frequent problem I will raise the voltage and fan speed, and lower the frequency as a last resort. Thanks for the responses. |
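
For anyone who wants to repeat the clock check from the reply above on their own tasks, here is a minimal Python sketch. It assumes the stderr text of the result page has already been copied into a string; the function names and the `SAMPLE_LOG` excerpt are made up for illustration and are not part of any GPUGRID or BOINC tool. The 1000MHz reference is the GTX 980 Ti base clock quoted in the thread, so it would need changing for other cards.

```python
import re

# Reference base clock for the GTX 980 Ti as quoted in the thread (MHz).
# Assumption: replace this value for any other card model.
REFERENCE_BASE_CLOCK_MHZ = 1000


def parse_device_clock_mhz(stderr_text):
    """Return the 'Device clock' value in MHz from a task's stderr text, or None."""
    match = re.search(r"Device clock\s*:\s*(\d+)\s*MHz", stderr_text)
    return int(match.group(1)) if match else None


def overclock_percent(device_clock_mhz, reference_mhz=REFERENCE_BASE_CLOCK_MHZ):
    """Percentage by which the reported clock exceeds the reference base clock."""
    return 100.0 * (device_clock_mhz - reference_mhz) / reference_mhz


# Hypothetical excerpt in the same layout as the log quoted in the thread.
SAMPLE_LOG = """\
# Name : GeForce GTX 980 Ti
# Device clock : 1190MHz
# Memory clock : 3505MHz
"""

clock = parse_device_clock_mhz(SAMPLE_LOG)
if clock is not None:
    print(f"Device clock: {clock}MHz, "
          f"{overclock_percent(clock):.1f}% above the reference base clock")
```

A positive percentage only shows that the card runs above the reference clock; as noted in the thread, whether that is a problem depends on the GPU voltage and temperature.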
©2025 Universitat Pompeu Fabra