Author |
Message |
Mad MattSend message
Joined: 29 Aug 09 Posts: 28 Credit: 101,584,171 RAC: 0 Level
![Cysteine - More than 100M credits Cys](img/badges/aa/badge_cys.png) Scientific publications
![Top 75% (1975th/3118) contribution to Selent et al, PLoS Comput Biol 2010 wat](img/badges/papers/badge_pub_silver.png) ![Top 10% (363rd/4410) contribution to Buch et al, PNAS 2011 wat](img/badges/papers/badge_pub_emerald.png) ![Top 50% (648th/2450) contribution to Giorgino et al, J. Chem. Theory Comput. 2011 wat](img/badges/papers/badge_pub_gold.png) ![Top 10% (130th/9662) contribution to Buch et al, J. Chem. Theory Comput. 2011 wat](img/badges/papers/badge_pub_emerald.png) ![Top 1% (26th/3113) contribution to Giorgino et al, J. Chem. Theory Comput, 2012 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 10% (187th/5798) contribution to Sadiq et al, PNAS 2012 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (130th/3349) contribution to Buch et al, JCIM 2013 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (62nd/3864) contribution to Dainese et al, Biochem. J. 2013 wat](img/badges/papers/badge_pub_emerald.png) ![Top 75% (2865th/4477) contribution to Pérez-Hernández et al, JCP 2013 wat](img/badges/papers/badge_pub_silver.png) ![Top 25% (700th/3183) contribution to Lauro et al., JCIM 2014 wat](img/badges/papers/badge_pub_ruby.png) ![Top 25% (1073rd/4815) contribution to Stanley et al., Sci Rep 2016 wat](img/badges/papers/badge_pub_ruby.png) |
I am having constant problems with this host:
http://www.gpugrid.net/show_host_detail.php?hostid=107176
Device 0 errors out on every WU; device 1 does not seem to have any problems.
Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
Z68 board, internal graphics deactivated.
Microsoft Windows XP Professional x64 Edition, Service Pack 2, (05.02.3790.00)
[2] NVIDIA GeForce GTX 570 (1279MB) driver: 26658. Stock clocks. No SLI bridge.
Brand: 2x Gigabyte GeForce GTX 570 OC (GV-N570OC-13I). Interestingly, the crashing device is a newer revision without voltage control, and according to GPU-Z it runs at a higher default voltage.
Any ideas?
____________
|
|
|
|
What PSU are you using in this host? (a good 800W (preferably 80+ Gold) is the minimum for your configuration)
A more recent driver (275.33 or 280.26) for the GPUs would be nice to have.
What are the running temps of the cards? Over 90°C is dangerous, and the lower the better. You may have to increase the fan speed of both GPUs with MSI Afterburner (it works with every manufacturer's cards).
Maybe you should:
- disable any screensaver.
- swap the cards, to see if the card or the slot is the root of the problem.
- downclock the failing GPU.
- install the latest chipset driver from Intel. |
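The temperature check suggested above could be scripted roughly as follows. This is only a sketch: it assumes the readings come as comma-separated `index, temperature, fan speed` rows, such as those printed by `nvidia-smi --query-gpu=index,temperature.gpu,fan.speed --format=csv,noheader,nounits` on drivers that ship that tool. Whether that works with the driver and Windows XP x64 setup on this host is not confirmed here, and the helper names and the sample text are hypothetical.

```python
import csv
import io

# Threshold taken from the "over 90°C is dangerous" advice above.
DANGER_TEMP_C = 90


def to_int(field):
    """Return the field as an int, or None if it is not numeric (e.g. "[N/A]")."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_readings(csv_text):
    """Parse 'index, temperature, fan speed' rows into a list of dicts."""
    readings = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        index, temp_c, fan_pct = (to_int(f) for f in row)
        if index is None:
            continue  # a row without a numeric GPU index is not usable
        readings.append({"index": index, "temp_c": temp_c, "fan_pct": fan_pct})
    return readings


def overheating(readings, limit_c=DANGER_TEMP_C):
    """Return the indices of GPUs whose temperature exceeds limit_c."""
    return [r["index"] for r in readings
            if r["temp_c"] is not None and r["temp_c"] > limit_c]


# Hypothetical sample text, not measured on the host in this thread.
sample = "0, 92, 40\n1, 68, 55\n"
print(overheating(parse_gpu_readings(sample)))  # [0]
```

A GPU-Z or MSI Afterburner log would serve the same purpose, as long as it is reduced to the same three columns first.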
|
|
Mad MattSend message
|
The PSU might be a bit short, but it does handle more power hungry apps like PG without problems. It is a Seasonic X-650W.
Temps with PG peaked at 70-80°C on the hottest days with 900 MHz GPU clocks, so with GPUGRID at stock clocks I did not even bother to check; it should be somewhat below 70°C. Currently running Einstein at 51-62°C (900 MHz).
I tried one 28x WHQL driver, with no improvement. The Intel chipset driver is 9.2.0.1026.
____________
|
|
|
skgivenVolunteer moderator Volunteer tester
![Avatar](https://www.gravatar.com/avatar/77be8b04dc35f6033048abca3f3803c4?s=100&d=identicon) Send message
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
![Histidine - More than 1.5B credits His](img/badges/aa/badge_his.png) Scientific publications
![Top 100% (2761st/2932) contribution to Buch et al, J. Chem. Inf. Model. 2010 wat](img/badges/papers/badge_pub_white.png) ![Top 75% (1680th/2466) contribution to Sadiq et al, Proteins 2010 wat](img/badges/papers/badge_pub_silver.png) ![Top 10% (266th/3118) contribution to Selent et al, PLoS Comput Biol 2010 wat](img/badges/papers/badge_pub_emerald.png) ![Top 1% (15th/4410) contribution to Buch et al, PNAS 2011 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (22nd/2450) contribution to Giorgino et al, J. Chem. Theory Comput. 2011 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (15th/9662) contribution to Buch et al, J. Chem. Theory Comput. 2011 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (27th/3113) contribution to Giorgino et al, J. Chem. Theory Comput, 2012 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (14th/5798) contribution to Sadiq et al, PNAS 2012 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 25% (352nd/1995) contribution to Venken et al, JCTC 2013 wat](img/badges/papers/badge_pub_ruby.png) ![Top 1% (15th/3349) contribution to Buch et al, JCIM 2013 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 10% (49th/3864) contribution to Dainese et al, Biochem. J. 2013 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (62nd/4477) contribution to Pérez-Hernández et al, JCP 2013 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (70th/2163) contribution to Bisignano et al. JCIM 2014 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (14th/1283) contribution to Doerr et al. 
JCTC 2014 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (45th/2838) contribution to Stanley et al, Nat Commun 2014 wat](img/badges/papers/badge_pub_emerald.png) ![Top 1% (18th/3183) contribution to Lauro et al., JCIM 2014 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (27th/3611) contribution to Ferruz et al., JCIM 2015 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (34th/4128) contribution to Ferruz et al., Sci Rep 2016 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (49th/4815) contribution to Stanley et al., Sci Rep 2016 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 10% (105th/4730) contribution to Noe et al., Nat Chem 2017 wat](img/badges/papers/badge_pub_emerald.png) ![Top 100% (1222nd/1348) contribution to Doerr et al, JCTC 2017 wat](img/badges/papers/badge_pub_white.png) ![Top 1% (35th/4634) contribution to Martinez-Rosell et al, JCIM 2018 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 50% (485th/1656) contribution to Kapoor et al., Sci Rep 2017 wat](img/badges/papers/badge_pub_gold.png) ![Top 10% (50th/1885) contribution to Ferruz et al., Sci Rep 2018 wat](img/badges/papers/badge_pub_emerald.png) ![Top 75% (551st/1022) contribution to Wang et al., ACS Cent. Sci. 2019 wat](img/badges/papers/badge_pub_silver.png) ![Top 25% (307th/1541) contribution to Rodriguez-Espigares et al., Nat Meth 2020 wat](img/badges/papers/badge_pub_ruby.png) ![Top 10% (29th/1450) contribution to Herrera-Nieto et al, Sci Rep 2020 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (334th/6232) contribution to Herrera-Nieto et al, JCIM 2020 wat](img/badges/papers/badge_pub_emerald.png) |
Hi Matt,
Good suggestions below.
I suggest you use a tool to control fan speed; 60-65% would be about right for the reference design on those GPUs, but a closed case without much air throughput could be a problem. PG taxes the GPU in a different way, so running those tasks only tells you whether you have a dud GPU. I would try downclocking the GDDR5 a notch to see if that helps; it has worked in the past. I would also try the failing card in isolation, just in case.
GL |
|
|
Mad MattSend message
|
Cheers for the hints, guys. I will try setting fan speed and fiddling with RAM clocks around 1500 MHz first. It's a Silverstone Fortress case, so airflow hopefully should not be the problem.
____________
|
|
|
|
The PSU might be a bit short, but it does handle more power hungry apps like PG without problems. It is a Seasonic X-650W.
While it is a top-quality PSU, you are right that it is a bit short for two GTX 570s. It doesn't matter that it's fine with PG. Even different GPUGrid tasks load the GPU to different degrees, which results in different power consumption. The difference can be as much as 50-80W in a dual-GPU system.
You should check your system's overall power consumption while crunching GPUGrid (or with FurMark while the GPUs are in SLI). If it's over 722W at the wall, your system is overloading your PSU. Even if it's under 722W, it is recommended to load a PSU to at most 75-80% of its maximum wattage in the long term. You should also consider that a PSU is most efficient at around 50% load.
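The thresholds in this post can be worked out as in the sketch below. The post does not say where 722W comes from; the assumption here is that it is the 650W rating divided by an efficiency of about 90%, which is a guess rather than a figure from the thread. The function names are made up for illustration.

```python
def wall_power_limit(rated_w, efficiency):
    """Wall draw at which the PSU delivers its full rated DC output."""
    return rated_w / efficiency


def sustained_output_range(rated_w, low=0.75, high=0.80):
    """DC output band matching the 75-80% long-term guideline."""
    return rated_w * low, rated_w * high


rated_w = 650      # Seasonic X-650, as named in the thread
efficiency = 0.90  # assumption: roughly 90% efficient near full load

print(round(wall_power_limit(rated_w, efficiency)))  # 722
low_w, high_w = sustained_output_range(rated_w)
print(f"{low_w:.1f} W to {high_w:.1f} W")            # 487.5 W to 520.0 W
print(f"{rated_w * 0.5:.1f} W")                      # 325.0 W, the ~50% load point
```

With these numbers, the long-term guideline puts the ceiling for this unit at roughly 490-520W of DC output, well below the 722W overload point.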
|
|
|
Mad MattSend message
|
I am currently waiting for the Platinum series before I start upgrading most of my PSUs one after another, so this host should get an X-750W. The last time I measured the load, it was around 75% or so while running PG.
It seems the fan/RAM settings did the trick. :) For some reason I had overlooked that device 0 runs about 10°C hotter, most likely because the Gigabyte cooler exhausts some hot air into the case, while this case favours all-exhaust devices. The fan is now set manually to 62%, which keeps the card in a better range; I am not sure whether the memory downclock is still needed.
If that runs stable, I will try to find some better clocks though. :P Thanks for the tips!
____________
|
|
|
Mad MattSend message
|
PS, latest check of power draw:
Running 2x GPUGRID ~447W average. 450W peak. ~68% PSU load.
No trashed WU since adjusting fan and RAM. :) Cheers.
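For reference, the ~68% figure can be reproduced from the readings above. This sketch assumes the 447W and 450W values were taken at the wall socket, which the post does not state, and the 90% efficiency used for the second estimate is likewise an assumption.

```python
RATED_W = 650              # Seasonic X-650
WALL_AVG_W = 447           # average reading from the post
WALL_PEAK_W = 450          # peak reading from the post
ASSUMED_EFFICIENCY = 0.90  # assumption, not a measured value


def load_fraction(power_w, rated_w=RATED_W):
    """Fraction of the PSU's rated output that power_w represents."""
    return power_w / rated_w


# Reading divided directly by the rating, as the ~68% figure seems to be.
print(f"{load_fraction(WALL_AVG_W):.1%}")   # 68.8%
print(f"{load_fraction(WALL_PEAK_W):.1%}")  # 69.2%

# If the reading is wall power, the DC side carries less than that.
dc_avg_w = WALL_AVG_W * ASSUMED_EFFICIENCY
print(f"{load_fraction(dc_avg_w):.1%}")     # 61.9%
```

Either way, the draw stays under the 75-80% long-term guideline mentioned earlier in the thread.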
____________
|
|
|
|
Wow, that poor little PSU. It's a good thing it's a Seasonic, or it probably would have already burst into flames and exploded. |
|
|