Message boards :
Graphics cards (GPUs) :
Temperatures
| Author | Message |
|---|---|
|
Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 0
|
What temperatures do you run your cards at? I keep mine in the low 70s and 60s if I can. Have people noticed a higher death rate at certain temps? I mean from experience, not in theory. |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
86°C for my 980 Ti, 24/7, with no errors due to that temp. Never had a card fail. |
|
Joined: 20 Apr 15 Posts: 285 Credit: 1,102,216,607 RAC: 0
|
60-65°C for my previous GTX 980 (MSI) and my current GTX 1070 and 1080, also 24/7. The latter, Palit GameRock/Jetstream cards, are 2.5 slots high and therefore have extremely good cooling just by sheer radiator mass. Fast, cool and silent, but a little bulky and therefore maybe not everyone's cup of tea. Mild downclocking by ~20MHz surprisingly reduces the temperature of the Palit (and also of the identical Gainward) by another 5-10°C right away. However, that may be different with ASUS, Zotac, Gigabyte, EVGA or other products. Normally the electronic components can withstand much higher temps (>250°C) in order to survive flow soldering, but only for a limited time. So my assumption is that at least the capacitors last longer at 24/7 temperatures below 70°C. But you know, anything is possible, and even a well-cooled GPU can fail after only one year. There are few studies on the web with statistical significance, so we can only guess. "Have people been noticing a higher death rate with certain temps, from experience, not theoretical." Well... exactly that is difficult to say; you never know why a card failed. You would have to run 1000 identical cards in parallel at different temps and try to get some statistics out of it. IMHO only a GPU manufacturer can possibly answer that. They surely do some load testing with their cards prior to release. I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday. |
|
Joined: 22 Nov 12 Posts: 72 Credit: 14,040,706,346 RAC: 0
|
I try to keep mine at 75°C or lower; in my experience this helps reduce errors, system crashes and bad data being created. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
My Maxwell GTX 980 Ti GPUs, which use GPU Boost v2.0, ship with the "Temp Target" set at 83°C. I've noticed that the drivers start dropping GPU clocks when temps get near that. So I created a custom fan curve, using MSI Afterburner, where maximum fan speed is reached before that point, at 80°C. As a result, when this PC is crunching, the GPU fans are quite loud, but the temps stay around 70-80°C, at full clocks, at all times. Fun! |
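The fan-curve idea in this post can be sketched as a small shell function. This is only an illustration of the logic, not how MSI Afterburner works internally: the function name `fan_speed_for_temp`, the 40°C starting point and the 30% minimum speed are assumptions; only the 80°C full-speed point (just under the 83°C Temp Target) comes from the post.

```bash
#!/bin/bash
# Hypothetical linear fan curve: 30% at or below 40°C, 100% at or above 80°C.
# Only the 80°C full-speed point is taken from the post; the rest is assumed.
fan_speed_for_temp() {
    local temp=$1
    local t_min=40 t_max=80      # assumed curve endpoints (°C)
    local f_min=30 f_max=100     # assumed fan speeds (%)
    if [ "$temp" -le "$t_min" ]; then
        echo "$f_min"
    elif [ "$temp" -ge "$t_max" ]; then
        echo "$f_max"
    else
        # Integer linear interpolation between the two endpoints.
        echo $(( f_min + (temp - t_min) * (f_max - f_min) / (t_max - t_min) ))
    fi
}

# Print the curve at a few sample temperatures.
for t in 35 60 75 80 83; do
    echo "${t}C -> $(fan_speed_for_temp "$t")%"
done
```

Reaching full fan speed a few degrees below the Temp Target is the design choice that matters here: the fans run out of headroom before the driver starts lowering clocks.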
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"86c for my 980ti 24/7 no errors due to that temp. Never had a card fail." That's interesting to read. For how long have these cards been crunching? I also have two 980 Ti cards (Palit Jetstream) in one of my PCs, crunching 24/7, and with NVIDIA Inspector I set the temp limit to 63°C. Am I too cautious? |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
No. Regarding GPU temperatures, the lower the better. "86c for my 980ti 24/7 no errors due to that temp. Never had a card fail." Take a look at the pre-programmed behavior of the GPUs: as temperature rises, the GPU clock and voltage are reduced to preserve the chip. Not just the overclocking capability but also the lifespan and temperature tolerance are a matter of silicon lottery. One chip could work for years at 86°C; another will fail after 6 months at that temperature. It's not just the temperature itself that is dangerous, but the expansion (of the chip and the PCB) caused by the change in temperature. A larger change in temperature causes larger expansion, which means more wear. The chip and the soldering can withstand a limited number of thermal cycles; the smaller the expansion, the more thermal cycles they can withstand (so the lifespan will be longer). The PCB has 6-8 layers; two of them are for the supply voltage, but the others carry tens of thousands of wires (and interconnects between the layers). If only one of them gets cut by thermal expansion, the card will malfunction. The lifespan of capacitors is degraded by higher temperatures (that's why most manufacturers use "military standard" capacitors). Thermal cycling is one of the reasons for not putting GPUs into sockets (the other is the limited width of the card). |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
Thanks, Zoltan, for the thorough and informative technical explanations. So I was not wrong by not overdoing it with the temperatures, and I'll keep them at this level. |
|
Joined: 24 May 11 Posts: 7 Credit: 93,272,937 RAC: 0
|
Hello, I've been crunching occasionally with a Gigabyte GeForce GTX 780. It would be nice to do it more, but the GPU fans are noisy. How can I make the card quieter (lower frequency or something)? I have Ubuntu 16.04, a Core 2 Duo, and driver 340.101 (a newer driver seems to cause computer start-up problems somehow). 'Nvidia x server settings' doesn't allow adjusting frequencies, but is there a way to adjust them somehow? |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
"86c for my 980ti 24/7 no errors due to that temp. Never had a card fail." The 980ti has been running 24/7 for about a year, but it doesn't always hit my max of 86°C; that depends on the WU. |
|
Joined: 20 Apr 15 Posts: 285 Credit: 1,102,216,607 RAC: 0
|
"Hello, I've been crunching occasionally with gigabyte geforce gtx-780. Would be nice to do it more but gpu-fans keep noise. How can I make the card silent (lower frequency or something)? I have ubuntu 16.04, core2duo, driver is 340.101. (newer driver seems to cause computer start-up problems somehow). 'Nvidia x server settings' don't allow to adjust frequencies but is there a way to adjust that somehow?" As you already mentioned, there is http://www.phoronix.com/scan.php?px=MTY1OTM&page=news_item https://wiki.ubuntuusers.de/Overclocking/ From those instructions I would have assumed that you can lower the frequency for both the chip and the memory..? Another alternative would be installing a third-party cooler such as the Arctic Accelero. That one will keep the card cool and silent even at full crunching speed. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
From a terminal, open the xorg.conf file and add the coolbits option to the "Screen" section:

    Section "Screen"
        Identifier "Screen0"
        Device "Device0"
        Monitor "Monitor0"
        DefaultDepth 24
        Option "coolbits" "12"
        SubSection "Display"
            Depth 24
        EndSubSection
    EndSection

Save the config file and restart. On restarting, open NVIDIA X Server Settings. Beneath your GPU (GPU0), select PowerMizer. To reduce the GPU clock by 96MHz, under Editable Performance Levels, Graphics Clock Offset, enter -96. Similarly, to reduce the memory transfer clock by 100MHz, enter -100.

To set an audibly acceptable GPU fan speed, click on Thermal Settings, Enable GPU Fan Settings, and set the fan at something sensible (probably 60% or more) to test it. Keep an eye on the GPU temperature and adjust accordingly, so that it does not go too high. Up to 70°C is usually fine (if not, there's likely a problem with the GPU). If it's above that but below 80°C, adjust your settings or at least keep an eye on the temp and performance (look out for failures/system issues). If it's over 80°C, you could increase the fan speed further and/or reduce the GPU clock and memory clock further, testing as you go.

Note that you need to reapply the settings after restarting, or create an .sh file, enter the settings, and set it to run at startup. For multiple GPUs you need to add coolbits for each GPU (under a screen), and you might need specific drivers (375.20). The above is what I'm using for one GPU with 370.28 drivers. Didn't get anywhere with nvclock; it might be defunct with 16.04.

Your NV settings can be added to a .sh file, which can be set as an executable and added to the startup list. Right click on your desktop and Create a New Document, Empty Document, and call it nv.sh (must end in .sh). Paste in the following values (note these are for underclocking the GPU & memory and setting the fan), then save and close the file:

    #!/bin/bash
    nvidia-settings -a '[gpu:0]/GPUGraphicsClockOffset[3]=-96'
    nvidia-settings -a '[gpu:0]/GPUMemoryTransferRateOffset[3]=-100'
    nvidia-settings -a '[gpu:0]/GPUFanControlState=1'
    nvidia-settings -a '[fan:0]/GPUTargetFanSpeed=60'

Right click on the nv.sh file and select Properties. Under the Permissions tab, select Allow executing file as program and close it. Search your PC for Startup Applications and then Add the nv.sh file to the list (located on the desktop):

Command: /home/'username'/Desktop/nv.sh
Comment: SetGPUandFanSpeeds |
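The temperature guidance in this post can be written down as a small check. The sketch below is only an illustration: the function name `temp_advice` and the labels it prints are made up here, and treating exactly 80°C as "act" is an assumption, since the post only speaks of "below 80C" and "over 80C". On a real system the reading could come from `nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits`, assuming `nvidia-smi` is installed with the driver and reports temperature for the card.

```bash
#!/bin/bash
# Hypothetical helper: map a GPU temperature (°C) to the advice given in the post.
temp_advice() {
    local temp=$1
    if [ "$temp" -le 70 ]; then
        echo "ok"        # up to 70°C is usually fine
    elif [ "$temp" -lt 80 ]; then
        echo "watch"     # adjust settings or keep an eye on temp and performance
    else
        echo "act"       # raise the fan speed and/or lower the clocks further
                         # (assumption: exactly 80°C is treated like "over 80")
    fi
}

# Fixed sample readings, so the sketch runs without a GPU.
for t in 65 75 80 86; do
    echo "${t}C: $(temp_advice "$t")"
done
```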
|
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 140
|
Another way to set the coolbits option is to open a terminal session and enter the following command: sudo nvidia-xconfig --cool-bits=12 This will set the coolbits option in the xorg.conf file for you. Then restart and follow the instructions listed by skgiven. |
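Before restarting, it may be worth confirming that the option actually landed in the config file. A minimal sketch, assuming the file lives at `/etc/X11/xorg.conf` (the usual location, but not stated in the thread); the function name `has_coolbits` is made up for illustration.

```bash
#!/bin/bash
# Hypothetical check: does the given X config file contain a coolbits option?
has_coolbits() {
    local conf=${1:-/etc/X11/xorg.conf}   # assumed default path
    # -i: match "Coolbits" or "coolbits"; -q: no output; -s: stay quiet if the file is missing.
    grep -qis 'Option[[:space:]]*"coolbits"' "$conf"
}

# Self-contained demonstration on a temporary file instead of the real config.
demo=$(mktemp)
printf 'Section "Screen"\n    Option "Coolbits" "12"\nEndSection\n' > "$demo"
if has_coolbits "$demo"; then
    echo "coolbits option found"
else
    echo "coolbits option missing"
fi
rm -f "$demo"
```

Calling `has_coolbits` with no argument checks the assumed default path.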
|
Joined: 3 Aug 12 Posts: 1 Credit: 5,018,634,225 RAC: 0
|
I have an EVGA 1080 FTW Hybrid. The waterblock keeps things COOL. Ambient room temp of up to 28-29C, I've never seen the GPU temp go above 51C. I have a 760 in the same box, it runs around 60-65C at the same time. Not worried about either, well within a good temp range for the cards. |
|
Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 0
|
The GP104 chips have a low TDP, 140-180 watts, and since they throw just as big of a cooler on them as the 250 watt cards, they definitely stay cool. |
|
Joined: 11 Jan 13 Posts: 216 Credit: 846,538,252 RAC: 0
|
Currently running an EVGA GTX 1080 SC ACX 3.0. Previous card was an EVGA 980Ti SC ACX 3.0. On GPUGrid, with both cards I rarely got much above 60°C. The 1080 almost never goes above the mid-50s. I do use a custom fan curve, but it never needs to go above 80% and usually runs at about 60-70%. Part of that is due to having a very well ventilated case (Phanteks Enthoo Primo) and part of it is just due to the superior cooling ability of the ACX fans over the reference blower style. |
|
Joined: 11 Jan 13 Posts: 216 Credit: 846,538,252 RAC: 0
|
"I have an EVGA 1080 FTW Hybrid. The waterblock keeps things COOL. Ambient room temp of up to 28-29C, I've never seen the GPU temp go above 51C. I have a 760 in the same box, it runs around 60-65C at the same time. Not worried about either, well within a good temp range for the cards." Does your card run at full boost on GPUGrid? My EVGA 1080 SC never wants to boost above base clock when running most BOINC projects. PrimeGrid has a few that always make it boost. I'm just trying to figure out if it's just my card or if BOINC projects are just not working the card hard enough to make it boost. Your temps are similar to mine, but on a waterblock, so I'm curious. |
|
Joined: 24 May 11 Posts: 7 Credit: 93,272,937 RAC: 0
|
Hello, and thank you to all the coolbits advisors. I somehow managed to get editable options in "Nvidia x-server settings", obviously with the help of a little luck. "Standing by" for further experiments. Thanks! |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
96°C (205°F) is pretty much. It's a laptop (i7-4700MQ CPU @ 2.40GHz, 8GB RAM, GeForce GTX 765M (2048MB)). A good example to demonstrate why *not* to crunch on a laptop.

    Run time: 66,440.80 (18h 27m 20s)
    CPU time: 21,693.78
    <core_client_version>7.6.33</core_client_version>
    <![CDATA[
    <message>
    (unknown error) - exit code -97 (0xffffff9f)
    </message>
    <stderr_txt>
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 90C
    # GPU 0 : 93C
    # GPU 0 : 94C
    # GPU 0 : 95C
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 74C
    # GPU 0 : 81C
    # GPU 0 : 83C
    # GPU 0 : 86C
    # BOINC suspending at user request (exit)
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 78C
    # GPU 0 : 84C
    # GPU 0 : 90C
    # GPU 0 : 93C
    # GPU 0 : 94C
    # GPU 0 : 95C
    # GPU 0 : 96C
    # BOINC suspending at user request (exit)
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 78C
    # GPU 0 : 84C
    # GPU 0 : 89C
    # GPU 0 : 90C
    # GPU 0 : 91C
    # GPU 0 : 92C
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # BOINC suspending at user request (exit)
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 78C
    # GPU 0 : 86C
    # GPU 0 : 89C
    # GPU 0 : 90C
    # GPU 0 : 91C
    # GPU 0 : 92C
    # BOINC suspending at user request (exit)
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 77C
    # GPU 0 : 84C
    # GPU 0 : 88C
    # GPU 0 : 91C
    # GPU 0 : 92C
    # GPU 0 : 93C
    # GPU 0 : 94C
    # GPU 0 : 95C
    # GPU 0 : 96C
    # BOINC suspending at user request (exit)
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # GPU 0 : 86C
    # GPU 0 : 93C
    # GPU 0 : 94C
    # GPU 0 : 96C
    # The simulation has become unstable. Terminating to avoid lock-up (1)
    # Attempting restart (step 15255000)
    # GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
    # SWAN Device 0 :
    # Name : GeForce GTX 765M
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:01:00.0
    # Device clock : 862MHz
    # Memory clock : 2004MHz
    # Memory width : 128bit
    # Driver version : r375_00 : 37633
    # The simulation has become unstable. Terminating to avoid lock-up (1)
    </stderr_txt>
    ]]>

No, I was wrong. 96°C is too much. |
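Temperature lines such as `# GPU 0 : 96C` can be pulled out of a saved stderr log to find the peak reading. A minimal sketch, assuming the saved log has one entry per line; the function name `max_gpu_temp` is made up for illustration, and the sample text is a short excerpt of the output in this post.

```bash
#!/bin/bash
# Hypothetical helper: print the highest "# GPU <n> : <temp>C" reading found on stdin.
max_gpu_temp() {
    # Keep only temperature lines, reduce each to its number, then take the largest.
    grep -E '^# GPU [0-9]+ : [0-9]+C$' \
        | sed -E 's/^# GPU [0-9]+ : ([0-9]+)C$/\1/' \
        | sort -n \
        | tail -n 1
}

# Short excerpt of the task output, one entry per line.
sample='# GPU [GeForce GTX 765M] Platform [Windows] Rev [3212] VERSION [65]
# GPU 0 : 86C
# GPU 0 : 93C
# GPU 0 : 94C
# GPU 0 : 96C
# The simulation has become unstable. Terminating to avoid lock-up (1)'

printf '%s\n' "$sample" | max_gpu_temp
```

The header line beginning with `# GPU [` is skipped because the pattern requires a device number directly after `GPU`.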
|
Joined: 20 Apr 15 Posts: 285 Credit: 1,102,216,607 RAC: 0
|
True ... so if one really, really wants to crunch on a notebook, I would strongly recommend running TThrottle, at least in order to keep temperatures below 80°C. http://efmer.com/b/?q=tthrottle Having said this, even that protection mechanism is relatively slow and cannot fully avoid load and temp fluctuation on a mobile device at a somewhat critical level. As for me, I have stopped crunching on my laptop for that reason, aside from the vacuum-cleaner-like background noise. Edit: whoever owns that 96°C hot laptop ... didn't you observe burn marks on the table? ;-) |
©2025 Universitat Pompeu Fabra