BAD PABLO_p53 WUs

Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 46721 - Posted: 21 Mar 2017, 14:48:12 UTC

People who are having task requests rejected because their quota is exhausted may wish to set 'No New Tasks' until they read that the faulty tasks have been flushed, and these new tasks are running successfully.

BOINC rebuilds the quota quickly when tasks are returned successfully, but if you're restricted to one task per day, and that one turns out to be a faulty one, you're stuck for another 24 hours.
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 46722 - Posted: 21 Mar 2017, 15:10:19 UTC - in response to Message 46721.  

People who are having task requests rejected because their quota is exhausted may wish to set 'No New Tasks' until they read that the faulty tasks have been flushed, and these new tasks are running successfully.

BOINC rebuilds the quota quickly when tasks are returned successfully, but if you're restricted to one task per day, and that one turns out to be a faulty one, you're stuck for another 24 hours.


Nobody on this project is restricted to one task a day, but they are restricted to 2 a day because of the way computers count: 0 = 1, 1 = 2, etc.
Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 46723 - Posted: 21 Mar 2017, 15:19:15 UTC - in response to Message 46701.  
Last modified: 21 Mar 2017, 15:19:43 UTC

My main host got its daily quota of long workunits reduced to 1 because it had too many failures (caused by this bad batch).
Luckily there are short runs (and one other long run), so my main host is not completely shut off of this project.
This is really annoying.

It's beyond annoying. I now have 6 hosts that won't get tasks because of these bad WUs. Two of the hosts not getting tasks are the fastest ones with 1060 GPUs. Irritating.
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46725 - Posted: 21 Mar 2017, 17:38:16 UTC - in response to Message 46721.  

People who are having task requests rejected because their quota is exhausted may wish to set 'No New Tasks' until they read that the faulty tasks have been flushed, and these new tasks are running successfully.

BOINC rebuilds the quota quickly when tasks are returned successfully, but if you're restricted to one task per day, and that one turns out to be a faulty one, you're stuck for another 24 hours.

I have tried this, but without success.
The only difference from before is that the BOINC notice no longer refers to the daily task limit, but simply says:

21/03/2017 18:36:42 | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)

Why so?
Greger

Joined: 6 Jan 15
Posts: 76
Credit: 25,499,534,331
RAC: 0
Message 46726 - Posted: 21 Mar 2017, 17:43:52 UTC - in response to Message 46723.  

Even worse, all my Linux hosts got a coproc error, which means the bad batch crashed the drivers. So tasks from other projects failed too.

I have now restarted them, but they might crash again if there are still bad tasks out.
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46727 - Posted: 21 Mar 2017, 17:48:55 UTC
Last modified: 21 Mar 2017, 17:57:19 UTC

For some reason, 2 of my PCs have now received new tasks. One of them was a
PABLO_contact_goal_KIX_CMYB

and even this one failed after a few seconds.
Until now I thought that only PABLO_p53 tasks were affected.

Edit: Only now do I realize that others of my PCs had the same problem during the day with all kinds of different WUs, not only PABLO_p53.

Can it be that all recent WUs, regardless of the type, were faulty?
Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 46729 - Posted: 21 Mar 2017, 21:24:25 UTC

It's ridiculous that these bad tasks weren't canceled. How many machines have been denied work because of this laziness on the admins' part? I've personally received 137 of these bad WUs so far and now have 7 machines not accepting long WUs. Multiply this by how many users? This kind of thing can also happen at other projects, but they cancel the bad WUs when informed of the problem. Why not here?
Tom Miller

Joined: 21 Nov 14
Posts: 5
Credit: 1,081,640,766
RAC: 0
Message 46730 - Posted: 21 Mar 2017, 23:00:49 UTC - in response to Message 46720.  

And still, for hours, the junk keeps rolling out.

We the volunteers donate our GPUs and electrons to help in what we're to believe is real science. I would hope the people using our resources would have a somewhat better way of administering them.
Bedrich Hajek

Joined: 28 Mar 09
Posts: 490
Credit: 11,731,645,728
RAC: 42
Message 46731 - Posted: 22 Mar 2017, 0:51:49 UTC

They should eliminate the "daily quota" for this particular situation and let us crunch!



Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 46734 - Posted: 22 Mar 2017, 4:00:31 UTC - in response to Message 46720.  
Last modified: 22 Mar 2017, 4:10:57 UTC

Well these broken tasks will have to run their course.
That will be a long and frustrating process, as every host can have only one workunit per day, but right now 9 out of 10 workunits are broken (so the daily quota of the hosts won't rise for a while), and every workunit has to fail 7 times before it's cleared from the queue.
To speed this up, I've created dummy hosts with my inactive host, and I've "killed" about 100 of these broken workunits. I had to abort some working units, but these are the minority right now.
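
To put rough numbers on the process described above, here is a back-of-the-envelope sketch. It uses only the figures from this post (7 failures per workunit, one task per host per day, about 9 in 10 tasks broken). The function name and the workunit and host counts in the example are hypothetical, and the real scheduler is certainly more complicated than this.

```python
import math

def days_to_flush(broken_workunits, hosts, failures_needed=7,
                  tasks_per_host_per_day=1, broken_fraction=0.9):
    """Rough estimate of the days needed for every broken workunit
    to fail often enough to be cleared from the queue."""
    # Total failed results that have to be returned overall.
    failures_required = broken_workunits * failures_needed
    # Expected failed results returned per day by quota-limited hosts.
    failures_per_day = hosts * tasks_per_host_per_day * broken_fraction
    return math.ceil(failures_required / failures_per_day)

# Hypothetical example: 1000 broken workunits, 2000 quota-limited hosts.
print(days_to_flush(1000, 2000))
```

The point of the sketch is only that the flush time scales with the number of broken workunits times 7, divided by however many hosts are still allowed to fetch one task a day.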
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46735 - Posted: 22 Mar 2017, 4:18:50 UTC

The situation here is still unchanged.
One of my 4 hosts luckily got a "good" WU some time last night and is crunching it.
On all other hosts BOINC still tells me

22/03/2017 05:14:41 | GPUGRID | This computer has finished a daily quota of 1 tasks

What I don't understand is why all these broken WUs cannot be removed from the queue, and why GPUGRID cannot somehow reset this daily quota junk.

By now, my frustration has reached quite a level :-(
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 46736 - Posted: 22 Mar 2017, 10:16:56 UTC

Relax everyone, we are where we are. I'm sure the admins are as frustrated as we are and are working to correct the situation.

On the bright side, short WUs just got a boost in computation.
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46737 - Posted: 22 Mar 2017, 10:27:01 UTC - in response to Message 46736.  

short WUs just got a boost in computation.

what does it help if they cannot be downloaded?
Loohi

Joined: 27 Aug 16
Posts: 16
Credit: 43,745,875
RAC: 0
Message 46738 - Posted: 22 Mar 2017, 10:50:47 UTC - in response to Message 46737.  


what does it help if they cannot be downloaded?


They can be downloaded, actually.
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46739 - Posted: 22 Mar 2017, 11:21:11 UTC - in response to Message 46738.  
Last modified: 22 Mar 2017, 11:22:26 UTC

They can be downloaded, actually.

NOT on my machines. There comes the same notice re "daily quota of 1 task" ... :-(
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 46740 - Posted: 22 Mar 2017, 11:28:33 UTC - in response to Message 46739.  
Last modified: 22 Mar 2017, 11:30:15 UTC

They can be downloaded, actually.

NOT on my machine. There comes the same notice re "daily quota of 1 task" ... :-(

The quota is applied per task type. You are likely to be suffering from a quota of one long task per day: if you allow short tasks in your preferences, it is possible (but rare) to get short tasks allocated - I have two machines running them at the moment, because of that.

Here are the log entries from one of the affected machines:

22/03/2017 09:51:04 | GPUGRID | This computer has finished a daily quota of 1 tasks
22/03/2017 10:13:27 | GPUGRID | Scheduler request completed: got 2 new tasks
22/03/2017 10:13:27 | GPUGRID | No tasks are available for the applications you have selected
22/03/2017 10:13:27 | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)
22/03/2017 10:13:27 | GPUGRID | Your preferences allow tasks from applications other than those selected
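
For anyone who wants to check their own event log for the same pattern, here is a minimal sketch that picks out the quota messages. It assumes the log lines look exactly like the ones quoted above (`DD/MM/YYYY HH:MM:SS | project | message`); the date format can differ with locale, and the function names are made up for this example.

```python
from datetime import datetime

def parse_log_line(line):
    """Split 'DD/MM/YYYY HH:MM:SS | project | message' into its three fields."""
    # Split on the first two '|' only, so a '|' inside the message survives.
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message

def quota_blocked(lines):
    """Return the timestamps of lines that report an exhausted daily quota."""
    return [ts for ts, _, msg in map(parse_log_line, lines) if "daily quota" in msg]

log = [
    "22/03/2017 09:51:04 | GPUGRID | This computer has finished a daily quota of 1 tasks",
    "22/03/2017 10:13:27 | GPUGRID | Scheduler request completed: got 2 new tasks",
]
print(quota_blocked(log))
```

Note that this only tells you when the quota message appeared, not which task type it applied to, since the message itself does not say.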
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 46741 - Posted: 22 Mar 2017, 11:35:05 UTC - in response to Message 46739.  
Last modified: 22 Mar 2017, 11:37:36 UTC

They can be downloaded, actually.

NOT on my machines. There comes the same notice re "daily quota of 1 task" ... :-(


In addition to Richard's response, you have long WUs running on three out of four of your machines. What more exactly do you want?
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46742 - Posted: 22 Mar 2017, 11:44:07 UTC - in response to Message 46741.  

What is shown in my log is unfortunately wrong.

There is a total of 2 tasks running now. One is on the slow GTX750Ti, which was obviously not affected the same way as the faster machines.
And one, to my surprise, is on the GTX970.

The log erroneously shows 2 tasks on the PC with the two GTX980Ti, however, no tasks are being crunched there.

Then there is another PC with a GTX750Ti, which still shows the "quota of 1 task per day" notice.

It would of course be great if I could finally run tasks on the two GTX980Ti cards.
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46743 - Posted: 22 Mar 2017, 11:47:29 UTC

Further, the log shows that a

PABLO_contact_goal_KIX_CMYB-0-4-RND2705_5

was downloaded at 10:11 hrs UTC this morning, and also errored out after a few seconds.
Can these faulty WUs really not be eliminated from the queue?
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 46744 - Posted: 22 Mar 2017, 12:15:53 UTC - in response to Message 46740.  

You are likely to be suffering from a quota of one long task per day: if you allow short tasks in your preferences, it is possible (but rare) to get short tasks allocated

This is what BOINC is showing me:

22/03/2017 13:12:42 | GPUGRID | No tasks are available for Short runs (2-3 hours on fastest card)
22/03/2017 13:12:42 | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)
22/03/2017 13:12:42 | GPUGRID | This computer has finished a daily quota of 1 tasks

So I doubt that I could get short runs.
(Your assumption is correct: I should be suffering from a long-run quota only, since no short runs were selected when the "accident" happened.)

©2025 Universitat Pompeu Fabra