Strange experiences...
| Author | Message |
|---|---|
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
...in the last few days: at first, most of my rigs suddenly started to error out on almost all WUs (code 1 (0x1, -255)). I wouldn't have been surprised by this if I had changed anything on them...but I didn't. They had been running without any modifications (e.g. oc'ing) for some days/WUs. The only thing I did was switch network activity from 'suspended' to 'always available'...but I can't believe that this could cause such errors. Now I have swapped the video cards between two of them, and so far they seem to be back to normal (though I have suspended network activity)... Then I had a look at my 'oldest' rig: its last WU was successfully submitted at 02:03...and 6 hours later, when I wanted to send another two finished WUs, I got the message: 'Client detached'... What? I haven't done anything to that rig in the meantime... ??? |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
That's really strange. You're running Linux, so the usual "well, guess it was time for a reboot" doesn't fit. And you're running the well-tested 6.3.10, which should eliminate another possible cause of errors. However, changing GPUs does require you to reboot, even under Linux, doesn't it? MrS Scanning for our furry friends since Jan 2002 |
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
Yep, for sure... ;) But I changed cards after the rigs started acting stupid... And my 'old' rig kept its cards... |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
"But I changed cards after the rigs started acting stupid..." I wanted to imply that maybe the reboot necessary for swapping the cards solved the problem. But this wouldn't explain why all of your machines are / were affected. MrS Scanning for our furry friends since Jan 2002 |
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
Oh, I see... Had a reboot before changing cards...and the rigs still acted crazy... Anyway, I hope that's history now... |
koschi Joined: 14 Aug 08 Posts: 127 Credit: 913,858,161 RAC: 18
|
I got that type of error on earlier versions of the BOINC client: when I set it to "always allow network access" but the network failed, almost all units crashed. Some projects were more resistant to this than others, so they kept running while the others errored out. Now it is somehow better: even when my wireless fails, everything keeps running fine, with no errors. |
©2025 Universitat Pompeu Fabra