Failures since upgrading to 190.38

Message boards : Graphics cards (GPUs) : Failures since upgrading to 190.38

STE\/E

Joined: 18 Sep 08
Posts: 368
Credit: 4,174,624,885
RAC: 0
Message 11487 - Posted: 29 Jul 2009, 22:24:35 UTC - in response to Message 11484.  
Last modified: 29 Jul 2009, 22:25:36 UTC

Question, possibly for PoorBoy. The GTX 260 cards - are they the Core 216 version? I'm having the same problems as everyone else getting GPUGRID wu to run on this card (XP Home 32-bit, Q6600, stock everything). I've tried to roll back to previous versions of the driver (currently running 185.XX) with no positive results. I'm not showing a downclocking problem using GPU-Z.


Yes, the 260s I have are all 216 Shader Versions. Once they started Down-clocking I couldn't stop them no matter what I did. I tried the Down-clocking Fix, going back to the 185.18 drivers, re-installing the 190.38 drivers, & different BOINC Clients. I even set them all back to their Default running Speeds, but nothing worked.

I'd reset them to their Default running speeds and within as little as a few minutes some of them would drop to half speed and start giving errors on the WUs after that. I've been running the Collatz Project for almost 2 days now with the same BOINC Client & NVIDIA Drivers (6.6.36 & 190.38) without a single error, and not one NVIDIA Card has dropped its speed, even after re-Overclocking them to run the Collatz WUs.

So all I can assume is that the way the Grid WUs run somehow has something to do with the cards Down-clocking as fast as I could reset them to their original speeds.

STE\/E
Message 11488 - Posted: 29 Jul 2009, 22:28:34 UTC - in response to Message 11482.  

Am I right that there's not a single reported failure with G9x cards, only G200 are affected? But some of them still run fine with 190.xx?

MrS


Only my GTX 260 216 Shader Versions were affected by the Down-clocking bug, but all my cards (260s, 275s, 280s & 295s) gave errors. The 260s seemed to give more than the rest, though.

Steve Dodd
Joined: 26 Dec 08
Posts: 19
Credit: 4,622,334,506
RAC: 140,836
Message 11490 - Posted: 30 Jul 2009, 5:14:35 UTC - in response to Message 11487.  

PoorBoy,
My GTX 260 Core 216 shows clock rates of 576 MHz GPU clock, 999 MHz Memory clock, and 1242 MHz Shader clock (GPU-Z values). These are also shown as the default clock values. Am I right in my interpretation of your problem that one or all of your clock values drop to 1/2 of mine, or am I running at 1/2 speed and don't know it? :)

STE\/E
Message 11501 - Posted: 30 Jul 2009, 12:04:08 UTC - in response to Message 11490.  
Last modified: 30 Jul 2009, 12:07:03 UTC

PoorBoy,
My GTX 260 Core 216 shows clock rates of 576MHz GPU clock, 999MHz Memory clock, and 1242 MHz Shader clock (GPU-Z values). These are also shown as the default clock values. Am I right in my interpretation of your problem that one or all of these clock values are 1/2 of my clock values, or am I running at 1/2 speed and don't know it :)


Your Clock values look like the Default Values for most 260s, unless they're the factory-OC type, in which case they would be a little higher.

What my cards would do (not all, but some of them) is this: after I'd made sure with GPU-Z that they were indeed running at the Default Values, they would drop to 300 Core & 400 Memory after some running time. They weren't at idle either, because the Grid WUs would be running & showing Progression. Once they dropped to 1/2 speed the errors would follow soon after. The only way I found to get the Speed back to Defaults was to Re-Boot the Computer affected by the 1/2 Speed GPU.
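
For anyone who would rather catch the drop automatically than keep eyeballing GPU-Z, here is a minimal Python sketch. It assumes the clock readings are already being logged somewhere; the `ClockSample` record, the sample values, and the 60% threshold are all hypothetical. Only the 576/999 MHz stock clocks and the 300/400 drop come from this thread.

```python
from dataclasses import dataclass

# Stock GTX 260 Core 216 clocks quoted in this thread (MHz, GPU-Z values).
DEFAULT_CORE_MHZ = 576
DEFAULT_MEMORY_MHZ = 999


@dataclass
class ClockSample:
    """One logged reading (hypothetical log format, not a GPU-Z export)."""
    minute: int       # minutes since the WU started
    core_mhz: int
    memory_mhz: int


def is_downclocked(sample, threshold=0.6):
    """True if core and memory are both below `threshold` of the stock clocks."""
    return (sample.core_mhz < DEFAULT_CORE_MHZ * threshold
            and sample.memory_mhz < DEFAULT_MEMORY_MHZ * threshold)


def first_downclock(samples):
    """Return the first downclocked sample, or None if the card never dropped."""
    for sample in samples:
        if is_downclocked(sample):
            return sample
    return None


# Made-up readings: stock clocks for a while, then the 300/400 drop.
samples = [
    ClockSample(0, 576, 999),
    ClockSample(30, 576, 999),
    ClockSample(45, 300, 400),
]
print(first_downclock(samples))
```

The threshold is only a guess at what separates a real drop from normal jitter, so it would need tuning per card.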

Sometimes it would only take a few minutes before they dropped their speed, and other times it would take an hour or two. Like I said, I've been running Collatz with no problems, and I run all my cards at 650 Core, 1475 Shaders, 1100 Memory. Some guys run them higher than that, but those are the speeds I've found to be the most stable for me with no hang-ups or errors, so that's what I run them at.

Collatz is down now & I have no work from them, but I'm very reluctant to start the Grid Project back up and go through all the headaches I went through for 3 days earlier, so I haven't. I've just been letting the NVIDIA cards sit for now & hoping the Collatz Project comes back up soon.

[FVG] bax
Joined: 18 Jun 08
Posts: 29
Credit: 17,772,874
RAC: 0
Message 11508 - Posted: 30 Jul 2009, 16:49:06 UTC

GTX 260 - 216 SP - Xp SP3

185.xx driver: 10 WUs, 10 error while computing

190.38 driver: 10 WUs, 9 error while computing
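
To put those two tallies side by side, here is a tiny Python sketch that turns them into failure rates. The dictionary layout and the `failure_rate` helper are just illustrative; the counts are the ones reported above.

```python
# driver version -> (WUs run, WUs that errored), as reported in this post
results = {
    "185.xx": (10, 10),
    "190.38": (10, 9),
}


def failure_rate(total, errors):
    """Fraction of work units that ended in a computation error."""
    return errors / total


for driver, (total, errors) in results.items():
    print(f"{driver}: {failure_rate(total, errors):.0%} failed")
```

With only 10 WUs per driver, the difference between the two rates is well within noise.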


sorry, bye bye

Bymark
Joined: 23 Feb 09
Posts: 30
Credit: 5,897,921
RAC: 0
Message 11510 - Posted: 30 Jul 2009, 17:50:49 UTC - in response to Message 11508.  
Last modified: 30 Jul 2009, 18:07:00 UTC

Same here, my GPUs went to Collatz, with no errors yet. Waiting for a new NVIDIA driver, and if it works then maybe coming back!
"Silakka"
Hello from Turku > Åbo.

Rabinovitch
Joined: 25 Aug 08
Posts: 143
Credit: 64,937,578
RAC: 0
Message 11623 - Posted: 3 Aug 2009, 6:22:45 UTC

100% of WUs exit with an error after 1 or 2 hours of processing. Win7 Ultimate x64, BM 6.6.36-6.6.38 (haven't tried 6.6.37 yet, and I doubt it will help), NVIDIA driver 190.38, of course. GTX260 with 192 shader blocks.

p.s. SETI GPU WUs are being processed well (with only a few errors).
From Siberia with love!


MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 11719 - Posted: 8 Aug 2009, 14:21:52 UTC

My GTX260s still seem to have a 50% to 100% failure rate for GPUgrid. Both cards are 216sp versions and are now running the GPUgrid 6.67 app with 190.38 drivers under XP. My other machines, which have GTS250s, seem quite happy running the 6.67 app with the 190.38 driver.

It seems to be G200 chip cards with the problem when used in conjunction with the 190.38 drivers. It also seems to be specific to GPUgrid, as other projects with CUDA apps appear to work.

I recall GDF mentioned they had optimised their FFT code, so perhaps that is an area for investigation. Maybe they could look at using the CUDA-supplied FFT libraries (unless there isn't an equivalent function) instead of the optimised code? I'm willing to run a few tests if that will help. In the meantime, like other G200 card owners, I will have to keep them occupied by running other CUDA work.
BOINC blog

Rabinovitch
Message 11728 - Posted: 8 Aug 2009, 18:16:26 UTC

I hope they will fix all the problems at last. I really like this project and want to participate in it, but for now I can only crunch SETI and Collatz WUs, because there are almost no errors on those projects...
From Siberia with love!


GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1958
Credit: 629,356
RAC: 0
Message 11731 - Posted: 8 Aug 2009, 19:21:47 UTC - in response to Message 11728.  

We don't have any error on 190.xx. However, we cannot test on 260 because all our cards are 280, 275 or 8800GT.

gdf

MarkJ
Message 11733 - Posted: 9 Aug 2009, 1:10:17 UTC - in response to Message 11731.  

We don't have any error on 190.xx. However, we cannot test on 260 because all our cards are 280, 275 or 8800GT.

gdf


It's starting to sound like it's limited to the GTX260 cards, then, if other G200-based cards seem to work. Have there been any reports of other cards having similar failure rates?
BOINC blog

rebirther
Joined: 7 Jul 07
Posts: 53
Credit: 3,048,781
RAC: 0
Message 11737 - Posted: 9 Aug 2009, 7:07:44 UTC
Last modified: 9 Aug 2009, 7:14:44 UTC

It looks like the failure rate on >=GTX260 is increasing. I have no problems with Collatz WUs or with the older 182.50 driver. All WUs are crashing with exit code 1, not at the start but after 8h :(. I will test some SETI WUs. I cannot run this project anymore until this issue is solved. Too much wasted time...

poppageek
Joined: 4 Jul 09
Posts: 76
Credit: 114,610,402
RAC: 0
Message 11742 - Posted: 9 Aug 2009, 8:28:24 UTC

I have a GTX 260 (192) that worked fine under 182.50 but errors 100% on any driver higher. This one now runs F@H.
I have a GTX 260 (216) that works perfectly with 190.38 on GPUGrid.

Bymark
Message 11754 - Posted: 9 Aug 2009, 12:14:22 UTC - in response to Message 11742.  

I have a GTX 260 (192) that worked fine under 182.50 but errors 100% on any driver higher. This one now runs F@H.
I have a GTX 260 (216) that works perfectly with 190.38 on GPUGrid.


One of my 3 GTX 260 (216) cards won't work with 190.38 on GPUGrid; it worked fine under 182.50....
"Silakka"
Hello from Turku > Åbo.

STE\/E
Message 11767 - Posted: 9 Aug 2009, 20:37:02 UTC - in response to Message 11733.  

We don't have any error on 190.xx. However, we cannot test on 260 because all our cards are 280, 275 or 8800GT.

gdf


Its starting to sound like its limited to the GTX260 cards then if other G200 based cards seem to work. Have there been any reports of other cards having similar failure rates?


I have 1 or possibly 2 GTX 295s that do the same thing (re-set themselves to 300 Core & 400 Memory), plus 4 GTX 260s that I know of for sure & possibly 1 or 2 more that do it too.

Some of them don't just do it at this Project either; I've had a few of them do it at the Collatz Project running their CUDA WUs. So it leads me to believe it's the Drivers, because all the Cards I have that are acting up now ran without error until I upgraded them to the 190.38 Drivers.

As stated by me and others, going back to the 186.xx or even 185.xx Drivers doesn't fix the Problem either once the Cards are infected with the Re-Setting bug ...

I've pulled all the NVIDIA Cards that I know re-set themselves & I'm re-testing them in a Box that had a GTX 280 & GTX 295 running without problems. I'm going to run each Card in that Box to eliminate the PSU as the cause of the Re-Setting. I figure if the PSU can run a GTX 280 & 295 without problems, it should have the power to run a lone GTX 260.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 11769 - Posted: 9 Aug 2009, 21:08:40 UTC - in response to Message 11767.  

You're assuming that the errors you are seeing on your systems upon clock changes are caused by the clock changes, which may very well be the case. But you're also assuming that all 190.38 problems are related to this same cause, which I'm not so sure about. GDF said the errors (most?) with 190.38 happen in the FFT part, which doesn't mix well with assuming downclocking as the reason.

MrS
Scanning for our furry friends since Jan 2002

STE\/E
Message 11770 - Posted: 9 Aug 2009, 21:18:03 UTC - in response to Message 11769.  

You're assuming that the errors you are seeing on your systems upon clock changes are caused by the clock changes.. which may very well be the case. but you're also assuming that all 190.38 problems are related to this similar cause - which I'm not so sure about. GDF said the errors (most?) with 190.38 happen in the FFT part.. which doesn't mix well with assuming downclocking as the reason.

MrS


It's a case of which came first, the Chicken or the Egg; in this case it's the Error or the Re-set. In other words, did the Card re-set itself & then the error occurred, or did the error occur & then the card re-set itself?

I know several times I've seen a Grid WU hung or not progressing, and I'd check the clock settings with GPU-Z & they would be where they're supposed to be. But upon stopping & restarting BOINC to kick-start the WU again, within minutes if not seconds the Computation Error would occur, & I'd check the clock settings again and they would be at 300 Core & 400 Memory ...

frankhagen
Joined: 18 Sep 08
Posts: 65
Credit: 3,037,414
RAC: 0
Message 11776 - Posted: 10 Aug 2009, 15:06:22 UTC - in response to Message 11770.  

It's a case of which came first, the Chicken or the egg, in this case it's the Error or the Re-set. In other words did the Card Re-set itself & then the error occurs or did the error occur & then the card re-set it's self ???


now my old GTX260 got the flu too - over a year of crunching with very few errors, never seen it throttle down before. :((

ExtraTerrestrial Apes
Message 11784 - Posted: 10 Aug 2009, 19:58:24 UTC - in response to Message 11770.  

It's a case of which came first, the Chicken or the egg, in this case it's the Error or the Re-set. In other words did the Card Re-set itself & then the error occurs or did the error occur & then the card re-set it's self ???


That's part of what I was thinking.

Could we say that, since after an error the card stays clocked down and subsequent WUs fail (if I remember correctly), the downclocking causes the error? I don't think so: it could also be that the driver detects no GPU activity (since the WUs fail) and therefore keeps it clocked down. For some reason this downclocking could be forced in newer drivers.

What if you change clocks manually during computation? Does that work? I know it did on my 9800GTX+ when I tried last time.

Can you set 2D clocks manually, maybe in the power profile? And see if you get an error. If not I'd say the downclocking is really "just" a symptom and not the cause.
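
If someone logs timestamps for both the clock drop and the WU error, the ordering question can be settled mechanically. Here is a rough Python sketch of that bookkeeping. The event format (minute, kind) and the `classify` helper are assumptions for illustration, not anything BOINC or GPU-Z produces as-is.

```python
def classify(events):
    """Say which came first in a list of (minute, kind) events.

    `kind` is either "downclock" or "error" (hypothetical labels).
    Only the earliest occurrence of each kind is considered.
    """
    first_seen = {}
    for minute, kind in sorted(events):
        first_seen.setdefault(kind, minute)

    down = first_seen.get("downclock")
    err = first_seen.get("error")

    if down is None and err is None:
        return "neither observed"
    if err is None:
        return "downclock without error"
    if down is None:
        return "error without downclock"
    if down < err:
        return "downclock first"
    if err < down:
        return "error first"
    return "simultaneous"


# Made-up example: the card drops at minute 45, the WU errors at minute 50.
print(classify([(50, "error"), (45, "downclock")]))
```

A "downclock without error" result on a manually downclocked card would support the idea that the downclocking is a symptom and not the cause.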

MrS
Scanning for our furry friends since Jan 2002

STE\/E
Message 11789 - Posted: 10 Aug 2009, 20:45:38 UTC - in response to Message 11784.  
Last modified: 10 Aug 2009, 20:57:04 UTC

Could we say that, since after an error the card stays clocked down and subsequent WUs fail (if I remember correctly), the downclocking causes the error? I don't think so: it could also be that the driver detects no GPU activity (since the WUs fail) and therefore keeps it clocked down. Due to some reason this downclocking could be forced in newer drivers.


Yes, once the Cards clock down, all subsequent WUs will fail if not corrected. I have caught them clocked down, though, & rebooted & restarted BOINC & have had the WUs finish successfully ...

Hmmmmmmmm, I think I just answered my own question of which came first. Now that I think of it, I have found Cards clocked down while still running the same WUs they had been running for up to 20 Hours when I found them. So in those cases, anyway, the clock-down came first, and it didn't make the WU error; it just ran slower than Krap ...

But if I remember correctly, as soon as I rebooted those computers where I found the card clocked down but the WU still running, the WU did error within a few minutes of restarting BOINC after the reboot, even though the Card was running at normal speed again.

What if you change clocks manually during computation? Does that work? I know it did on my 9800GTX+ when I tried last time.

Can you set 2D clocks manually, maybe in the power profile? And see if you get an error. If not I'd say the downclocking is really "just" a symptom and not the cause.

MrS


Yes, I can reset the Clock to a higher or lower speed while running the WUs, but so far that hasn't produced an error on a running WU.


PS: LOL, Tech Support, you've got to love 'em. Below is Sapphire's response to my RMA Request after 24 hours. Of course my reply will be addressed in the order it was received, so I assume there will be another 24-hour wait before I get another e-mail asking if I had the Big Black Cord from the Wall plugged into the back of the Computer in order for the Card to work ... ;)

Did you connect the card's 6 and 8pin pwoer connectors? the card requires those connectors to be connected in order for the card to function properly.


Now how would I have been able to use the card for the last month or so, which I explained to them in my RMA Request, if I hadn't hooked the 6 & 8 Pin Connectors to the Card ...

I had an EVGA RMA# in less than 30 Minutes this morning for 1 of the GTX 260s that clocks down, no questions asked either. I just told them the problem and 5 minutes later I had the RMA# ...

©2026 Universitat Pompeu Fabra