WU failures discussion
| Author | Message |
|---|---|
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Thanks Stefan, sounds good so far! About that new project: new, as in "GPU-Grid beta" or something? Or a new subproject / WU type like the short and long queues, maybe "risk production"? Credit-wise it might be nicer to have them all combined under the same banner. But we and you might not be able to set things up as specifically as we want if they're not separate projects. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Ok so now the Noelia WU's got cancelled totally. Have a happy crash-free crunching month :D |
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 42
|
They were fun while they lasted. Though, I managed to finish the last few successfully, including two overnight, no errors since last Friday. It's nice to end on a high note. I hope the results are useful, and I hope these simulations resume once the bugs are fixed. |
|
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Yes, they will probably come back in two or three weeks with some different parameters, which should decrease the error rate. |
|
Joined: 17 Mar 10 Posts: 23 Credit: 1,173,824,416 RAC: 0
|
"Ok so now the Noelia WU's got cancelled totally." Excellent news! I will start up GPUGRID again when I get home... |
|
Joined: 19 Nov 12 Posts: 31 Credit: 1,549,545,867 RAC: 0
|
Thanks very much for the update and follow-thru, Stefan! |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
Thanks for the update, Stefan. John |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
Hi, Jim: I will not be processing GPUGrid WUs for a while as I am concentrating on other areas of interest. I will keep an eye on the Forum and decide at a future date if I should contribute more. Happy crunching! John |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
"Ok so now the Noelia WU's got cancelled totally." I am not happy with this, as on my rigs the Noelia's did better than the Santi's, and that is still the case: I again had Santi errors, both LR and SR, even with the latest beta drivers. Greetings from TJ |
|
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Well, we cannot please everyone unfortunately :D It's great to hear that they worked fine on your machine. But on the last days the Noelia WU's had an incredibly high failure rate, so even if they worked for you they were crashing for nearly 30-40% of the users. So the general good had to prevail here :) |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
"Well, we cannot please everyone unfortunately :D It's great to hear that they worked fine on your machine. But on the last days the Noelia WU's had an incredibly high failure rate, so even if they worked for you they were crashing for nearly 30-40% of the users. So the general good had to prevail here :)" Aha, about a week ago we could read that the failure rate was acceptable according to the project. This, however, is another conclusion ;-) Well, never mind; the Santi's keep failing on my rigs, and also for a lot of wingmen who got them afterwards. So I guess the complaints about them will now increase. No longer from me: I set to LR and will no longer complain about them. And I do not have to hurry to build new rigs, or update old ones with 690s, 780s and Titans. Perhaps you could crunch a few of those Santi's, Stefan, then you can see it yourself. Greetings from TJ |
|
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Well a week ago it was acceptable. For some reason it started increasing and became quite unacceptable (hence I said "on the last days"). Santi's are at 2-7% error rate which might be an all-time historical low or something like that :P As for crunching them, I think my (single) GTX 280 might cry. |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
Thanks for the clarification, Stefan. Yes indeed, the 280 will get very warm :) Greetings from TJ |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
"Ok so now the Noelia WU's got cancelled totally." Thanks for this. Since the Noelia WUs disappeared I've had no crashes or failures at all on my 8 GPUGrid machines. "2. Now for every batch we send out we decided we will make a thread in the News section with the exact batch name. If someone forgets to do that send a message quick and I will remind them :D These threads will also contain information about the specific batch." Great news on both counts, although I don't see why a new project would be necessary. |
|
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Supposedly, from what I understood, some options (like WU deadlines? or hardware requirements?) are defined project-wise. So we could not test them publicly on our main project as it could ruin everything. That's at least what I understood. |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
"Ok so now the Noelia WU's got cancelled totally." Hi, Stefan: Thank you for this most welcome news! I will now run a couple of short run WUs and see what happens. I cannot turn my back on this important research: that's why I invested in two GTX 650Ti GPUs a few months ago. Until that time I had processed other WUs with ATI GPUs only. Thanks, again. John |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
"Santi's are at 2-7% error rate which might be an all-time historical low or something like that :P" Haven't had a single WU error since the Noelia WUs left over a week ago. Looking back, the very few errors I had with the Nathan and Santi WUs seem to have occurred after defective Noelia WUs put the GPUs into a bad state. This is with 8 machines running a range of GPUs from the lowly GTX 460/768, the 560 1GB, the 650 Ti 1GB and the 670 2GB. If my hypothesis is correct, without the Noelia WUs around to mess up the GPUs, the error rate for other WU types should be falling. |
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 42
|
"Santi's are at 2-7% error rate which might be an all-time historical low or something like that :P" This reminds me: on my Windows XP computer, I observed on 2 occasions that when a Noelia unit crashed, the subsequent non-Noelia unit would not load into the GPU (the clock would run, but progress would stay at 0.00%). To fix this, I had to suspend the unit, reboot the computer, and then resume the unit. The non-Noelia unit would then run normally. I almost forgot about this. Thanks for jogging my memory with your post. |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
On my 660 the Santi's keep erroring, so Noelia's WUs have nothing to do with this! So I have withdrawn the 660 and given it to Einstein and Albert. Greetings from TJ |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
"On my 660 the Santi's keep erroring so Noelia's WU have nothing to do with this!" Did you ever try to RMA it as suggested? |
©2025 Universitat Pompeu Fabra