Three Computation Errors in 40 mins
| Author | Message |
|---|---|
Zydor (Joined: 8 Feb 09, Posts: 252, Credit: 1,309,451, RAC: 0)
|
Just had three WUs go bang in 40 minutes, which makes four in the last 24 hours. Something strange is going on. I am aware that it could be hardware; however, it's a new(ish) card, only 2 months old, and it has produced flawless WUs for three weeks or more since the last failure. I was not at the PC this time; I was having dinner, came back, and found three WUs totalled. The results of tonight's 3 failures show a CUDA error: the same one for two of them, and a different CUDA error for the third. Interestingly, one of the wingmen who also had trouble with these also had a 9800GTX. These are not isolated failures; others have totalled the same WUs as well, which lends weight to the possibility that something within the WU is causing it. All a bit strange... I'd be grateful if someone could take a look and see whether there is anything obvious in the results file before I download another and try again. I am wary of downloading more and crashing them until I can nail this. My task page: http://www.gpugrid.net/results.php?userid=15789 I'm going to nip off for ten minutes to reboot, have a look, etc. Back soon. [Edit] The link was wrong, sorry; it's the correct link now. I have tested the card and all seems OK: hardware tests elsewhere seem OK, temps are normal, etc. I'll try to download another one and see what happens. [/edit] Regards Zy |
Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0 |
EDIT: forget what I said; I saw the separate thread on this afterwards. Overall you have quite a few errors. A strange coincidence: recently my 9800GTX+ also errored out on 3 WUs, which it otherwise hadn't done. I think you and I should lower our clocks a bit or increase the fan speed and see what we get. I know that temperatures here have increased quite a bit compared to a few weeks ago. Regarding the possible G92 issue: - almost all of the tasks have been finished by G200-class cards - all of them have errored out on G92 - BUT: there were also 2 errors on G200 cards. It's also remarkable that not many G92 or older chips were involved at all, so I suppose the numbers are too small to allow statistically relevant statements (such as "all of these tasks fail on G92"). MrS Scanning for our furry friends since Jan 2002 |
©2025 Universitat Pompeu Fabra