New D3RBanditTest workunits

Ian&Steve C.

Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 5,269
Message 56736 - Posted: 2 Mar 2021, 13:55:06 UTC - in response to Message 56734.  
Last modified: 2 Mar 2021, 13:56:02 UTC

Because it was already over 5 days since that host downloaded the unit, so beyond the deadline for sending a new instance of the wu.

oh, okay, I was not aware of that.

The interesting thing is: I received it 2 days ago. So if the original host finished it this morning, the task must have been 7 days "old" then (and obviously got credit).

Recently, one of my slower hosts finished a task after 5 days plus a few hours, and it was not accepted any more. No credits: "too late".

How does this fit together?



I told you before what happened in that case. There’s some grace period: if you return a result after another host has already returned one, you’ll still get credit. I’m guessing it’s about 1 day, maybe less. In that case you returned it 4 days after the first result, so you missed the validation window.

If the person returned it in 7 days, but they were the first to return it, they get credit. Doesn’t matter if it’s late, if you’re first you will get credit.

It’s a good thing that the project cancelled that WU from your host to prevent unnecessary and wasted computation. You would have spent another 5 days crunching something that they already have a result for, then you would have not received credit, and been upset about that.
ID: 56736 · Rating: 0
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 56741 - Posted: 2 Mar 2021, 20:18:07 UTC - in response to Message 56735.  

I can't find that wu in your hosts, if you can point to it I will have a look.

here it is:
https://www.gpugrid.net/result.php?resultid=32550373
ID: 56741 · Rating: 0
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 56742 - Posted: 2 Mar 2021, 20:28:06 UTC - in response to Message 56736.  

... There’s some grace period where if you return a result that has already been received by another host, you’ll still get credit. I’m guessing it’s about 1 day. Maybe less.
...
If the person returned it in 7 days, but they were the first to return it, they get credit. Doesn’t matter if it’s late, if you’re first you will get credit.

so how long is the grace period?
about 1 day? or less? Or any time longer?
Or are there different types of grace periods?
This system is somewhat obscure, anyway.
ID: 56742 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 351
Message 56743 - Posted: 2 Mar 2021, 20:43:42 UTC - in response to Message 56742.  

A true, formal grace period would result in the deadline shown on the website being a day or a few later than the deadline shown on your computer at home. The BOINC client will try to finish the job by the deadline shown locally, but provided it's returned by the website deadline, nothing is lost. But we don't use that here.

More colloquially, an informal grace period occurs because you've got "until your replacement wingmate, after you've failed to return it in time, returns their copy". So,

However long it takes them to download the data, plus
However long the task hangs about before their computer starts working on it, plus
However long it takes them to compute it.

Don't rely on the first or second lasting longer than a few seconds. I think the shortest time reported so far for the third stage is about 10 hours with the current work.
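
For illustration only, the three stages above simply add up to the informal window. A minimal Python sketch, assuming a hypothetical helper `informal_grace` and made-up durations for the first two stages (only the roughly 10-hour compute time comes from this thread):

```python
from datetime import timedelta


def informal_grace(download: timedelta, queue_wait: timedelta, compute: timedelta) -> timedelta:
    """Hypothetical helper: time a late host still has after the resend goes out."""
    return download + queue_wait + compute


# Assumed values: a few seconds each for download and queueing, ~10 hours of compute.
window = informal_grace(
    download=timedelta(seconds=5),
    queue_wait=timedelta(seconds=5),
    compute=timedelta(hours=10),
)
print(window)  # 10:00:10
```

With a fast replacement host, the window is dominated by the compute time, which is why it can't be relied on as a fixed number.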
ID: 56743 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 5,269
Message 56745 - Posted: 2 Mar 2021, 21:37:05 UTC - in response to Message 56743.  
Last modified: 2 Mar 2021, 21:48:01 UTC

it's certainly informal. I don't know how long the grace period is, I'm just using my own experience to make an educated guess about the ~1 day length. but it's certainly shorter than the 4 days from Erich's previous situation since he got a validate error when he returned it.

i know i've returned a result that was 12+hrs past the return of the previous person (who blew their 5-day deadline, but returned it a few hours after it was sent to me).

I still got credit for it, but only the base credit based on the original host's 5+ day crunch, no bonus for me even though i was well within the 1-day crunch time from when it hit my system.

the instance that we are referencing has already been purged though, so I can't link it unfortunately

edit: i found one in my list.

https://www.gpugrid.net/workunit.php?wuid=27035408

Task	Computer	Sent	Time reported	Status	Run time (sec)	CPU time (sec)	Credit	Application
32544714	483418	21 Feb 2021 | 21:59:03 UTC	22 Feb 2021 | 23:19:24 UTC	Error while computing	64,567.96	64,163.00	---	New version of ACEMD v2.11 (cuda101)
32547573	564623	23 Feb 2021 | 1:58:35 UTC	28 Feb 2021 | 3:37:59 UTC	Completed and validated	170,762.44	108,992.30	348,750.00	New version of ACEMD v2.11 (cuda101)
32550287	543446	28 Feb 2021 | 1:58:40 UTC	28 Feb 2021 | 18:52:42 UTC	Completed and validated	60,467.52	60,460.20	348,750.00	New version of ACEMD v2.11 (cuda100)


host before me blew their deadline
it was sent to me for crunching (i started it nearly right away due to small cache on this host)
host before me returned their result 2hrs after deadline, got base credit
i crunched it for 17hrs, returned it 15hrs after previous host, also got base credit.
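
To check the rounded figures in that timeline, here is a small Python sketch that redoes the arithmetic from the sent and reported times of the two validated tasks listed above. The 5-day deadline is the one mentioned in this thread; the variable names and the `ts` helper are just for illustration, and the timestamps are copied without the "|" separator and "UTC" suffix.

```python
from datetime import datetime, timedelta

FMT = "%d %b %Y %H:%M:%S"
DEADLINE = timedelta(days=5)  # 5-day deadline as described in this thread


def ts(text: str) -> datetime:
    """Parse a timestamp as shown on the workunit page (UTC, suffix removed)."""
    return datetime.strptime(text, FMT)


# Sent / reported times of the two validated tasks.
prev_sent, prev_reported = ts("23 Feb 2021 1:58:35"), ts("28 Feb 2021 3:37:59")
my_sent, my_reported = ts("28 Feb 2021 1:58:40"), ts("28 Feb 2021 18:52:42")

late_by = prev_reported - (prev_sent + DEADLINE)  # how far past the deadline
my_turnaround = my_reported - my_sent             # download-to-report time
after_prev = my_reported - prev_reported          # gap between the two returns

print(late_by)        # 1:39:24
print(my_turnaround)  # 16:54:02
print(after_prev)     # 15:14:43
```

So the "2hrs", "17hrs" and "15hrs" above are roundings of about 1h39m, 16h54m and 15h15m.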
ID: 56745 · Rating: 0
Eos Yu

Joined: 27 Jan 21
Posts: 1
Credit: 0
RAC: 0
Message 56747 - Posted: 2 Mar 2021, 21:47:48 UTC

I can't get any WU! Why?
ID: 56747 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 5,269
Message 56748 - Posted: 2 Mar 2021, 21:54:12 UTC - in response to Message 56747.  

I can't get any WU! Why?

none available right now.
ID: 56748 · Rating: 0
ServicEnginIC

Joined: 24 Sep 10
Posts: 592
Credit: 11,972,186,510
RAC: 1,187
Message 56751 - Posted: 3 Mar 2021, 6:29:24 UTC - in response to Message 56745.  

i know i've returned a result that was 12+hrs past the return of the previous person (who blew their 5-day deadline, but returned it a few hours after it was sent to me).

I still got credit for it, but only the base credit based on the original host's 5+ day crunch, no bonus for me even though i was well within the 1-day crunch time from when it hit my system.

the instance that we are referencing has already been purged though, so I can't link it unfortunately

edit: i found one in my list.

https://www.gpugrid.net/workunit.php?wuid=27035408

Task	Computer	Sent	Time reported	Status	Run time (sec)	CPU time (sec)	Credit	Application
32544714	483418	21 Feb 2021 | 21:59:03 UTC	22 Feb 2021 | 23:19:24 UTC	Error while computing	64,567.96	64,163.00	---	New version of ACEMD v2.11 (cuda101)
32547573	564623	23 Feb 2021 | 1:58:35 UTC	28 Feb 2021 | 3:37:59 UTC	Completed and validated	170,762.44	108,992.30	348,750.00	New version of ACEMD v2.11 (cuda101)
32550287	543446	28 Feb 2021 | 1:58:40 UTC	28 Feb 2021 | 18:52:42 UTC	Completed and validated	60,467.52	60,460.20	348,750.00	New version of ACEMD v2.11 (cuda100)


host before me blew their deadline
it was sent to me for crunching (i started it nearly right away due to small cache on this host)
host before me returned their result 2hrs after deadline, got base credit
i crunched it for 17hrs, returned it 15hrs after previous host, also got base credit.

This agrees with my own experience.
Your case matches scenario number 2 in this previous post.

it's certainly informal. I don't know how long the grace period is, I'm just using my own experience to make an educated guess about the ~1 day length. but it's certainly shorter than the 4 days from Erich's previous situation since he got a validate error when he returned it.

I think that there isn't a fixed grace period. The only criterion is getting a valid result for each workunit.
And the chance of these credit inconsistencies increases when (like now) the only work units available are resends.
ID: 56751 · Rating: 0
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 56752 - Posted: 3 Mar 2021, 8:09:17 UTC - in response to Message 56751.  

And the chance of these credit inconsistencies increases when (like now) the only work units available are resends.

this statement seems perfectly correct :-)
ID: 56752 · Rating: 0
Trotador

Joined: 25 Mar 12
Posts: 103
Credit: 14,948,929,771
RAC: 14
Message 56753 - Posted: 3 Mar 2021, 13:09:23 UTC

New Gerard tasks?

https://www.gpugrid.net/workunit.php?wuid=27038831
ID: 56753 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 5,269
Message 56754 - Posted: 3 Mar 2021, 14:08:19 UTC - in response to Message 56753.  

New Gerard tasks?

https://www.gpugrid.net/workunit.php?wuid=27038831

these pop up from time to time. always only a handful of them. they're a rare gem. not enough to feed the masses though.
ID: 56754 · Rating: 0
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 56755 - Posted: 3 Mar 2021, 17:23:06 UTC - in response to Message 56754.  

New Gerard tasks?

https://www.gpugrid.net/workunit.php?wuid=27038831

these pop up from time to time. always only a handful of them. they're a rare gem. not enough to feed the masses though.


I got 3 of them this morning (1_3-GERARD_pocket_discovery_...), and after they had waited a few hours in the queue, the server aborted them - "202 (0xca) EXIT_ABORTED_BY_PROJECT"
:-)
ID: 56755 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 5,269
Message 56756 - Posted: 3 Mar 2021, 17:25:12 UTC - in response to Message 56755.  
Last modified: 3 Mar 2021, 17:35:14 UTC

I got two of them. They started processing right away. I have GPUGRID set to resource share of 100 and my other GPU project (Einstein) set to 0. So when I get GPUGRID tasks, they take priority over any backup project work already in the queue and begin right away.

One finished in about 2.5hrs (2080ti) and the other is in progress and will take probably 6hrs (1660Super)

Looks like it’s following the same rules outlined above.

If you haven’t even started processing yet by the time someone else completes and returns a result, then it cancels the unstarted task. This is a good idea in my opinion and reduces wasteful computation. There’s no need to have you even start the task if they already have the result. If you had started the tasks, they would have been allowed to complete. But since they were not started, they get cancelled.
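
Put as a tiny Python sketch, the rule as described in this post looks roughly like the following. This is only a model of the observed behaviour, not actual BOINC server code; the function name and return strings are made up for illustration.

```python
def server_action(result_already_returned: bool, task_started: bool) -> str:
    """Hypothetical model of the cancellation behaviour described in this thread."""
    if result_already_returned and not task_started:
        return "cancel"      # unstarted copy is aborted by the project
    return "let it run"      # started copies are allowed to complete


print(server_action(result_already_returned=True, task_started=False))  # cancel
print(server_action(result_already_returned=True, task_started=True))   # let it run
```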

The difference here is that it looks like these Gerard tasks were sent out in pairs from the beginning. Maybe they are trying to weed out the hosts that hit and run (download tasks and never return them). So it goes to two hosts at once to increase the chances that they get a valid result in the first 5-day window.
ID: 56756 · Rating: 0
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 56757 - Posted: 4 Mar 2021, 6:08:44 UTC - in response to Message 56756.  

I got two of them. They started processing right away.

I had a GPUGRID task running, so the downloaded tasks were in waiting position.
Had I known that they would disappear that soon, I would have interrupted the running task for a short time, in order to get at least one of the three newly downloaded tasks started (thus preventing it from being aborted by the server).
Well, next time I know :-)
ID: 56757 · Rating: 0
RJ The Bike Guy

Joined: 2 Apr 20
Posts: 20
Credit: 35,363,533
RAC: 0
Message 56758 - Posted: 6 Mar 2021, 15:18:55 UTC - in response to Message 56756.  
Last modified: 6 Mar 2021, 15:19:13 UTC

I got two of them. They started processing right away. I have GPUGRID set to resource share of 100 and my other GPU project (Einstein) set to 0. So when I get GPUGRID tasks, they take priority over any backup project work already in the queue and begin right away.


Thanks! Didn't know I could do that. I had suspended Einstein so it would pick up the GPUGRID work. Now I have Einstein set to 0% and GPUGRID to 100%.
ID: 56758 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 351
Message 56759 - Posted: 6 Mar 2021, 16:55:21 UTC - in response to Message 56758.  

... and begin right away.

Don't expect them to run instantly. But 'next in queue' when an Einstein task completes is usually good enough.
ID: 56759 · Rating: 0
RJ The Bike Guy

Joined: 2 Apr 20
Posts: 20
Credit: 35,363,533
RAC: 0
Message 56760 - Posted: 6 Mar 2021, 20:43:06 UTC - in response to Message 56759.  

... and begin right away.

Don't expect them to run instantly. But 'next in queue' when an Einstein task completes is usually good enough.


No problem. The Einstein jobs are taking less than 20 minutes currently.
ID: 56760 · Rating: 0
Philip C Swift [Gridcoin]

Joined: 23 Dec 18
Posts: 12
Credit: 50,868,500
RAC: 0
Message 56764 - Posted: 9 Mar 2021, 22:04:17 UTC

Correct me if I am wrong, but the [deadline] is the date and time the task has to be started by, NOT completed. It is a deadline for starting the task, not for finishing it.
ID: 56764 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 5,269
Message 56765 - Posted: 9 Mar 2021, 22:45:50 UTC - in response to Message 56764.  

Correct me if I am wrong, but the [deadline] is the date and time the task has to be started by, NOT completed. It is a deadline for starting the task, not for finishing it.


If the task isn’t completed and returned by the deadline, it gets sent to another host. You can still submit it late, but the project really wants the result before the deadline.
ID: 56765 · Rating: 0
zharkov70

Joined: 10 Mar 21
Posts: 1
Credit: 0
RAC: 0
Message 56766 - Posted: 10 Mar 2021, 19:17:35 UTC - in response to Message 56504.  

Good
ID: 56766 · Rating: 0

©2025 Universitat Pompeu Fabra