Message boards :
Graphics cards (GPUs) :
*_pYEEI_* information and issues
| Author | Message |
|---|---|
Michael Goetz Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 0
|
I *just* managed to squeak by. I had six of these error out, dropping my daily quota to 9. The next WU was the 9th of the day; fortunately, it's a different series and is crunching normally. If it had been another error I think this GPU would have been done for the day. (Unless it's still counting this as WUs per CPU core, in which case I had a lot of headway.) Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.
|
Stoneageman Joined: 25 May 09 Posts: 224 Credit: 34,057,374,498 RAC: 0
|
UPDATE: Four more have trashed another GPU. Aborted a boatload of these critters, yet still they come. It's like they are breeding! |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
Can you PLEASE PLEASE PLEASE make sure WU batches are OK before sending them out. |
|
Joined: 23 Feb 09 Posts: 39 Credit: 144,654,294 RAC: 0
|
GTX 295 - Nine *_pYEEI_* WUs crashed in a row. http://www.gpugrid.net/results.php?hostid=53295 "MDIO ERROR: syntax error in file "structure.psf", line number 1: failed to find PSF keyword ERROR: mdioload.cu, line 172: Unable to read topology file" No new work was sent for 7.5 hours (I only recently got new work). Should I abort the *_pYEEI_* WUs cached on my other GPUs? |
|
Joined: 4 Sep 08 Posts: 7 Credit: 52,864,406 RAC: 0
|
The last WUs are all going down the drain: <stderr_txt> # Using CUDA device 0 # There are 2 devices supporting CUDA # Device 0: "GeForce GTX 280" # Clock rate: 1.55 GHz # Total amount of global memory: 1073741824 bytes # Number of multiprocessors: 30 # Number of cores: 240 # Device 1: "GeForce GTX 260" # Clock rate: 1.51 GHz # Total amount of global memory: 939524096 bytes # Number of multiprocessors: 27 # Number of cores: 216 MDIO ERROR: syntax error in file "structure.psf", line number 1: failed to find PSF keyword ERROR: mdioload.cu, line 172: Unable to read topology file called boinc_finish </stderr_txt> I'm off to Collatz for a few days. |
Michael Goetz Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 0
|
|
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Had 2 fail in a few seconds on one system, 3 on another. 184-IBUCH_reverse_pYEEI_2912-0-40-RND6748 http://www.gpugrid.net/workunit.php?wuid=1056751 128-IBUCH_reverse_pYEEI_2912-0-40-RND3643 http://www.gpugrid.net/workunit.php?wuid=1056695 Also, could not get any tasks this morning between about 1am and noon, on the same system, but running a task now. http://www.gpugrid.net/workunit.php?wuid=1056826 http://www.gpugrid.net/workunit.php?wuid=1056758 http://www.gpugrid.net/workunit.php?wuid=1056826 |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
[quote]Please use this thread to post any problem regarding all workunits tagged as *_pYEEI_*.[/quote] As you can see (I hope), massive problems have been reported, and many systems have been locked out of receiving new WUs (and are sitting idle) because of these faulty units. Don't you think it's about time to pull the rest? It looks like they're just being allowed to run until they fail so many times that the server cancels them. That's not showing any concern at all for the people who are doing your work. I know they're not being canceled because I've received 22 of them so far today. Every one of those 22 had failed on several machines before being sent to me. That's just wrong. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
In a way the _pYEEI_ tasks are SPAM! I had to take extreme action yesterday - shut down my system for a couple of hours ;) |
|
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0
|
My most sincere apologies to everybody for all this. I wanted to fill up the queue before going offline for some days, but obviously it didn't work as expected. The balance between keeping crunchers' support, not having an empty queue, and having a private life is always very sensitive to human error. Sincerely, ignasi |
Michael Goetz Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 0
|
|
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
[quote]My most sincere apologies to everybody for all this.[/quote] Thanks for letting us know what happened. Communication is appreciated. Happy new year everyone! |
Stoneageman Joined: 25 May 09 Posts: 224 Credit: 34,057,374,498 RAC: 0
|
"A PRIVATE life".......... well ok. However, we expect you to sleep with the server :) |
|
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0
|
[quote]"A PRIVATE life".......... well ok. However, we expect you to sleep with the server :)[/quote] I am afraid girlfriends are too jealous... |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
[quote]"A PRIVATE life".......... well ok. However, we expect you to sleep with the server :)[/quote] You have more than ONE!!! No wonder he can't get the WUs straight, he is sleep deprived :-) Keep up the good work, we'll crunch the best we can! Thanks - Steve |
[AF>Libristes>Jip] Elgrande71 Joined: 16 Jul 08 Posts: 45 Credit: 78,618,001 RAC: 0
|
|
|
Joined: 27 Jan 09 Posts: 4 Credit: 582,988,184 RAC: 0
|
Sorry, but I'm going to say goodbye. The last 3 days have been non-stop computation errors, made even worse by the fact that the cards just sat there doing nothing. Switching all my GPUs to F@H. I do not accept having my money wasted on units that process for 17 hours and then show a computing error. |
|
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0
|
Hi, It's because you have been accepting beta work from us. If reliability of work is of paramount importance to you, don't track the beta application. Matt |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
[quote]Switching all my GPUs to F@H.[/quote] Your cards do a lot more work here than they can at F@H. If the problem is Beta related, you just need to turn the Betas off, as MJH said. It might also be that you need to restart the system: sometimes one failure can cause continuous failures (a runaway) until the system is restarted. I say this because the problem was limited to your GTX 295 and did not affect your GTX 275. Many of your tasks seem to have been aborted by user, some immediately and one after running for a long time: 286-IBUCH_esrever_pYEEI_0301-10-40-RND7408 - Aborted by user after 43,189.28 seconds. Turn off Betas, restart and see how you get on. |
|
Joined: 27 Jan 09 Posts: 4 Credit: 582,988,184 RAC: 0
|
Thanks for the comments. I looked in my GPUGrid preferences and did not notice anything saying Beta. I did see "Run test applications? This helps us develop applications, but may cause jobs to fail on your computer", which was already set to no. Please advise: how do I turn off receiving Beta work units? Thanks, Andy |
©2026 Universitat Pompeu Fabra