Recent problems for WUs on older GPUs

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Message 9991 - Posted: 20 May 2009, 8:16:08 UTC - in response to Message 9981.  

So, it seems that there is a bug in the compiler/hardware which appears only on pre G200 cards.
We found a way to avoid it for now, but it limits what we can do, so it is not a solution.

gdf
jrobbio

Message 9992 - Posted: 20 May 2009, 8:46:27 UTC - in response to Message 9991.  

So, it seems that there is a bug in the compiler/hardware which appears only on pre G200 cards.
We found a way to avoid it for now, but it limits what we can do, so it is not a solution.

gdf


Well at least it isn't a mystery any more. When you are on the bleeding edge, one should expect some cuts.

Hope it gets resolved in the not too distant future.

Rob
MarkJ
Volunteer moderator
Volunteer tester

Message 9995 - Posted: 20 May 2009, 11:27:05 UTC - in response to Message 9991.  

So, it seems that there is a bug in the compiler/hardware which appears only on pre G200 cards.
We found a way to avoid it for now, but it limits what we can do, so it is not a solution.

gdf


How come the G200 based cards also get failures?

Will there be an updated app for the non-G200 machines, or perhaps all machines? Will this be a cuda 2.2 app or stick with the old version for the time being?

Can we use the 185.85 drivers now or with the new app (assuming there will be one)?
BOINC blog
KWSN-Sir Papa Smurph

Message 9998 - Posted: 20 May 2009, 12:00:52 UTC

I am running an 8800GT and 3 9800GTX+ cards. I have had zero completed Kashif WUs.
Could you make a way for me to "opt out" of that type of unit?

I still get occasional errors with other units, but the Kashif ones have a 100% failure rate for me (3 different machines).

I am really unable to babysit my machines, as I am away from home for days at a time.....
Scott Brown

Message 10000 - Posted: 20 May 2009, 12:48:35 UTC - in response to Message 9958.  

we are running this set of workunits called

x-GIANNI_newFB-...

If they go on ok, then we have isolated the problem with G90 chips. It is not solved yet but still at least we would know where to look.

gdf



These hang on my Pentium D 830 with a 9600GSO. See here for a hung result that was aborted after more than 24 hours of no progress (hung at about 21%).

rbpeake

Message 10002 - Posted: 20 May 2009, 14:40:55 UTC - in response to Message 9998.  

Just as a data point, I have had 100% success on all work units using a GTX 260 Core 216 card, running CUDA 2.2 and the 185.85 driver, even on work units that have had failures previously.
Zydor

Message 10004 - Posted: 20 May 2009, 15:16:35 UTC - in response to Message 9991.  

There is another way .....

Tap your well-heeled benefactor you have tucked away for a mere $400,000 worth of vouchers to upgrade crunchers' pre-G200s to 300GTXs - a snip at the price....

And ..... added value!! ..... You'd also solve cruncher recruitment for a while; they'd queue round to the next street, let alone the next block, for that one .....

*sigh* ........ always nice to dream once in a while :)

Regards
Zy
rbpeake

Message 10005 - Posted: 20 May 2009, 15:24:44 UTC - in response to Message 10004.  

There is another way .....

Tap your well heeled benefactor....

Regards
Zy

Believe me, I am tapping my heels that my card continues to function as well as it has! I just bought it, so it would be a big disappointment if there were issues so soon....but the issue fix would appear to be possible without a card upgrade, hopefully....(although NVIDIA I would guess is tapping its heels that many will upgrade...ouch! ;)
ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester

Message 10020 - Posted: 21 May 2009, 10:09:42 UTC - in response to Message 9995.  

How come the G200 based cards also get failures?


Let's see what the fix can do and who still gets failures afterwards. Mind you, there's also the "regular failure rate", some kind of "noise floor" which affects all cards.

Will there be an updated app for the non-G200 machines, or perhaps all machines? Will this be a cuda 2.2 app or stick with the old version for the time being?
Can we use the 185.85 drivers now or with the new app (assuming there will be one)?


Not speaking officially, but I wouldn't rush to introduce another variable in the current situation. Wait until the dust settles and we're confident that the problems have been solved. 185.66 has been running fine for me with non-troublemaker WUs, so I'll keep using it until I see problems.

I do have a WU issued today and it appears to use client 6.64, so it looks like there is no new app for now. But this could be tied to an old type of WU as well.

MrS
Scanning for our furry friends since Jan 2002
Zydor

Message 10071 - Posted: 22 May 2009, 21:31:39 UTC - in response to Message 10020.  

Just had a Kashif go bang

ERROR: c:\cygwin\home\speechserver\gpumd2\src\pme\CPME_cufft.cu, line 104: cufftExecC2R (gridcalc3)

http://www.gpugrid.net/result.php?resultid=699822

I have had a problem with Office on that PC, and got it back online 5 hours ago. However, I don't think that was the cause; it looks like the old WU problem surfacing, maybe in one of the older WUs still in the system?

Regards
Zy
ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester

Message 10088 - Posted: 23 May 2009, 14:38:09 UTC - in response to Message 10071.  

Looks like the old problem, and the WU was created after 20 May 16:44 CEST, when the fix was applied. I think it would be better to post such observations in the new thread, so they don't get lost.

MrS
Scanning for our furry friends since Jan 2002