Experimental Python tasks (beta)

Message boards : News : Experimental Python tasks (beta)

zombie67 [MM]

Joined: 16 Jul 07
Posts: 209
Credit: 5,496,860,456
RAC: 8,582,660
Message 56016 - Posted: 15 Dec 2020, 18:57:24 UTC - in response to Message 56015.  

you also appear to have your hosts setup to ONLY crunch these beta tasks. is there a reason for that?

I have reached my wuprop goals for the other apps. So I am interested in only this particular app (for now).

does your system process the normal tasks fine? maybe it's something going on with your system as a whole.

Yep, all the other apps run fine, both here and on other projects.
Reno, NV
Team: SETI.USA
ID: 56016 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56017 - Posted: 15 Dec 2020, 20:40:18 UTC - in response to Message 56016.  
Last modified: 15 Dec 2020, 21:09:19 UTC

you also appear to have your hosts setup to ONLY crunch these beta tasks. is there a reason for that?

I have reached my wuprop goals for the other apps. So I am interested in only this particular app (for now).

does your system process the normal tasks fine? maybe it's something going on with your system as a whole.

Yep, all the other apps run fine, both here and on other projects.


I have a theory, but not sure if it's correct or not.

can you tell me the peak_flops value reported in your coproc_info.xml file for the 2080ti?

basically, you are using a very old version of BOINC (7.9.3), which pre-dates the fixes implemented in 7.14.2 to properly calculate the peak flops of Turing cards. So I'm willing to bet that your version of BOINC is over-estimating your peak flops by a factor of 2. a 2080ti should read somewhere between 13.5 and 15 TFlops, and I'm guessing your old version of BOINC thinks it's closer to double that (25-30 TFlops).
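
For anyone who wants to check this on their own host, here is a minimal sketch that pulls peak_flops out of coproc_info.xml. The tag layout in the sample (a coproc_cuda element with name and peak_flops children) is an assumption for illustration and may differ between client versions, and the 13.5-15 TFlops range is only the 2080ti estimate from this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real coproc_info.xml may be laid out differently.
SAMPLE = """<coprocs>
  <coproc_cuda>
    <name>GeForce RTX 2080 Ti</name>
    <peak_flops>28000000000000.000000</peak_flops>
  </coproc_cuda>
</coprocs>"""

def peak_tflops(xml_text):
    """Return {device name: peak TFlops} for every element with a peak_flops child."""
    root = ET.fromstring(xml_text)
    found = {}
    for elem in root.iter():
        flops = elem.findtext("peak_flops")
        if flops is not None:
            # Fall back to the tag name if the element has no <name> child.
            name = elem.findtext("name", default=elem.tag)
            found[name] = float(flops) / 1e12
    return found

def looks_doubled(tflops, expected_low=13.5, expected_high=15.0):
    """True if the reported value falls in roughly 2x the expected range."""
    return 2 * expected_low <= tflops <= 2 * expected_high

result = peak_tflops(SAMPLE)
for name, tflops in result.items():
    print(f"{name}: {tflops:.1f} TFlops, doubled? {looks_doubled(tflops)}")
```

To run it against a real host, read the text of coproc_info.xml from the BOINC data directory (its location depends on the install) and pass that to peak_tflops instead of SAMPLE.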

the second half of the theory is that there is some kind of hard limit (maybe an anti-cheat mechanism?) that blocks any credit reward above roughly 2,000,000. maybe the line is 1.8 million, maybe 1.9 million. I haven't observed ANYONE getting a task earning that much, and all tasks that would reach that level based on runtime seem to get this 20-credit value.

that's my theory, I could be wrong. if you try a newer version of boinc that properly measures the flops on a turing card, and you start getting real credit, then it might hold water.
ID: 56017 · Rating: 0
sph

Joined: 22 Oct 20
Posts: 4
Credit: 34,434,982
RAC: 0
Message 56018 - Posted: 15 Dec 2020, 23:13:08 UTC - in response to Message 56007.  
Last modified: 15 Dec 2020, 23:15:51 UTC

Two outstanding issues are over-crediting (I am using some default BOINC formula) and, as far as i understand, the flops estimate (?).


Toni, One more issue to add to the list.

The download from the Anaconda website does not work for hosts behind a proxy. Can you please add a check for the proxy settings in the BOINC client so that external software can be downloaded?
I have other hosts that are not behind a proxy and they download and run the Experimental tasks fine.

Issue here:
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2>
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

This error repeats itself until it eventually gives up after 5 minutes and fails the task.

Happens on 2 hosts sitting behind a Web Proxy (Squid)
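
A possible workaround sketch, not something this thread shows the project app doing: as far as I know, conda honors the standard HTTP_PROXY / HTTPS_PROXY environment variables (and a proxy_servers entry in .condarc), so a wrapper could copy the proxy settings into the environment before calling conda. The function name and the proxy address below are made up for illustration.

```python
import os
import urllib.request

def env_with_proxy(base_env=None, proxies=None):
    """Return a copy of the environment with proxy variables filled in.

    proxies is a dict like {"http": "http://host:3128"}; when omitted, the
    settings Python detects for the current system are used.
    """
    env = dict(os.environ if base_env is None else base_env)
    if proxies is None:
        proxies = urllib.request.getproxies()
    for scheme in ("http", "https"):
        url = proxies.get(scheme)
        if url:
            # Set both spellings, since tools differ in which one they read.
            # setdefault keeps any value that is already present.
            env.setdefault(f"{scheme}_proxy", url)
            env.setdefault(f"{scheme.upper()}_PROXY", url)
    return env

# Hypothetical Squid proxy address, for illustration only.
demo = env_with_proxy(
    base_env={},
    proxies={"http": "http://squid.example:3128",
             "https": "http://squid.example:3128"},
)
print(demo["HTTPS_PROXY"])
```

The returned dict could then be passed as env= to subprocess.run for whatever command performs the conda download.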
ID: 56018 · Rating: 0
zombie67 [MM]

Joined: 16 Jul 07
Posts: 209
Credit: 5,496,860,456
RAC: 8,582,660
Message 56019 - Posted: 16 Dec 2020, 1:19:31 UTC - in response to Message 56017.  

A second, identical machine, except it has dual GTX 1660 Ti cards, finally got some work. The tasks reported and were awarded the large credits. So that rules out the question WRT BOINC version. FWIW, that version of BOINC is the latest available from the repository.

So maybe it is due to interruptions after all, and I am just unaware? I am running some more tasks now, and will check again in the morning.
Reno, NV
Team: SETI.USA
ID: 56019 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56020 - Posted: 16 Dec 2020, 2:57:24 UTC - in response to Message 56019.  
Last modified: 16 Dec 2020, 3:01:29 UTC

A second, identical machine, except it has dual GTX 1660 Ti cards, finally got some work. The tasks reported and were awarded the large credits. So that rules out the question WRT BOINC version. FWIW, that version of BOINC is the latest available from the repository.

So maybe it is due to interruptions after all, and I am just unaware? I am running some more tasks now, and will check again in the morning.


it doesn't rule it out, because a 1660ti has a much lower flops value, around 5.5 TFlop. so the old boinc version is estimating ~11 TFlop, and that's not high enough to trigger the issue. you're only seeing it on the 2080ti because it's a much higher performing card: ~14 TFlop by default, and the old boinc version is scaling it all the way up to 28+ TFlop. this makes the calculated credit MUCH higher than that of the 1660ti, which triggers the 20-cred issue, according to my theory of course.

your 1660ti tasks are well below the 2,000,000 credit threshold that I'm estimating. the highest I've seen is ~1.7 million, so the line can't be much higher. I'm willing to bet that if one of your tasks on that 1660ti system runs for ~30,000-40,000 seconds, it gets hit with 20 credits. ¯\_(ツ)_/¯
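
A quick worked version of that comparison, as a sketch only. It assumes credit grows as k * reported_flops * runtime with the same unknown k for both cards, which is this post's theory rather than a confirmed server formula. Under that assumption, k and the cap cancel out and only the ratio of reported flops matters.

```python
# Flops figures quoted in this post (TFlops): the true value and the doubled
# value the old client is suspected of reporting.
CARDS = {"1660ti": (5.5, 11.0), "2080ti": (14.0, 28.0)}

def relative_runtime_limit(reported_tflops, reference_tflops):
    """Runtime needed to reach a fixed credit cap, relative to a reference card.

    Assumes credit = k * reported_flops * runtime with the same k for both
    cards, so the ratio depends only on the two reported flops values.
    """
    return reference_tflops / reported_tflops

ratio = relative_runtime_limit(CARDS["2080ti"][1], CARDS["1660ti"][1])
print(f"2080ti reaches the cap in {ratio:.2f}x the runtime of the 1660ti")
```

So if the theory holds, the 2080ti would reach the cap in well under half the runtime the 1660ti needs, which would fit with only the faster card showing the problem at first.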

you really should try to get your hands on a newer version of BOINC. I use a custom-compiled version of BOINC, and have usually built from newer versions of the source code. maybe one of the other guys here can point you to a different repository that has a newer version of BOINC that can properly manage the Turing cards.
ID: 56020 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56021 - Posted: 16 Dec 2020, 3:13:29 UTC - in response to Message 56020.  

i also verified that restarting ALONE won't necessarily trigger the 20-credit reward.

it depends WHEN you restart it. if you restart the task early enough that the combined runtime won't come close to the 2mil credit mark, you'll get the normal points.

this task here: https://www.gpugrid.net/result.php?resultid=31934720

I restarted this task about 10-15 mins in. it started over from the 10% mark, ran to completion, and still got normal crediting, staying well below the threshold.
ID: 56021 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56023 - Posted: 16 Dec 2020, 14:36:25 UTC - in response to Message 56019.  

A second, identical machine, except it has dual GTX 1660 Ti cards, finally got some work. The tasks reported and were awarded the large credits. So that rules out the question WRT BOINC version. FWIW, that version of BOINC is the latest available from the repository.

So maybe it is due to interruptions after all, and I am just unaware? I am running some more tasks now, and will check again in the morning.


i see you changed BOINC to 7.17.0.

another thing I noticed is that the version change didn't take effect until new tasks were downloaded afterwards, so tasks that were already there and tagged with the overinflated flops value will probably still get 20-cred. only tasks downloaded after the change should work better.

ID: 56023 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56027 - Posted: 16 Dec 2020, 18:10:19 UTC - in response to Message 56023.  

aaaand your 2080ti just completed a task and got credit with the new BOINC version. called it.

http://www.gpugrid.net/result.php?resultid=31951281
ID: 56027 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56028 - Posted: 16 Dec 2020, 18:13:53 UTC - in response to Message 56020.  

I'm willing to bet that if one of your tasks on that 1660ti system runs for ~30,000-40,000 seconds, it gets hit with 20 credits. ¯\_(ツ)_/¯


looks like just 25,000s was enough to trigger it.

http://www.gpugrid.net/result.php?resultid=31946707

it'll even out over time, since your other tasks are earning 2x as much credit as they should, because the old version of BOINC is doubling your peak_flops value.

ID: 56028 · Rating: 0
zombie67 [MM]

Joined: 16 Jul 07
Posts: 209
Credit: 5,496,860,456
RAC: 8,582,660
Message 56030 - Posted: 17 Dec 2020, 0:43:46 UTC

After upgrading all the BOINC clients, the tasks are erroring out. Ugh.
Reno, NV
Team: SETI.USA
ID: 56030 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56031 - Posted: 17 Dec 2020, 0:54:19 UTC - in response to Message 56030.  

they were working fine on your 2080ti system when you had 7.17.0. why change it?

but the issue you're having now looks like the same issue that richard was dealing with here: https://www.gpugrid.net/forum_thread.php?id=5204

that thread has the steps they took to fix it. it's a permissions issue.
ID: 56031 · Rating: 0
zombie67 [MM]

Joined: 16 Jul 07
Posts: 209
Credit: 5,496,860,456
RAC: 8,582,660
Message 56033 - Posted: 17 Dec 2020, 4:47:44 UTC - in response to Message 56031.  

they were working fine on your 2080ti system when you had 7.17.0. why change it?

but the issue you're having now looks like the same issue that richard was dealing with here: https://www.gpugrid.net/forum_thread.php?id=5204

that thread has the steps they took to fix it. it's a permissions issue.


That was a kludge. There is no such thing as 7.17.0. =;^) Once I verified that the newer version worked, I updated all my machines with the latest repository version, so it would be clean and updated going forward.
Reno, NV
Team: SETI.USA
ID: 56033 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56036 - Posted: 17 Dec 2020, 5:05:48 UTC - in response to Message 56033.  

There is such a thing. It’s the development branch. All of my systems use a version of BOINC based on 7.17.0 :)
ID: 56036 · Rating: 0
zombie67 [MM]

Joined: 16 Jul 07
Posts: 209
Credit: 5,496,860,456
RAC: 8,582,660
Message 56037 - Posted: 17 Dec 2020, 5:23:58 UTC

Well sure. I meant a released version.
Reno, NV
Team: SETI.USA
ID: 56037 · Rating: 0
mmonnin

Joined: 2 Jul 16
Posts: 338
Credit: 7,987,341,558
RAC: 178,897
Message 56046 - Posted: 18 Dec 2020, 11:24:17 UTC
Last modified: 18 Dec 2020, 11:24:46 UTC

So long start-to-end run times cause the 20-credit issue, not the restarts themselves. But interrupted tasks restart at 0, and so end up with a longer start-to-end run time.

1070 or 1070Ti
27,656.18 s received 1,316,998.40 credits
42,652.74 s received 20.83 credits

1080Ti
21,508.23 s received 1,694,500.25 credits
25,133.86 s, 29,742.04 s and 38,297.41 s tasks each received 20.83 credits

I doubt they were interrupted, since the tasks were High Priority and there was nothing but GPUGrid in the BOINC queue.
ID: 56046 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56049 - Posted: 18 Dec 2020, 14:57:21 UTC - in response to Message 56046.  

yup I confirmed this. I manually restarted a task that didn't run very long and it didn't have the issue.

the issue only happens if your credit reward will be greater than about 1.9 million.

take some of your completed tasks and divide the total credit by the runtime in seconds to figure how much credit you earn per second. then figure how many seconds you need to hit 1.9 million, and that's the runtime limit for your system. anything over that and you get the 20-credit bug.
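
The arithmetic above can be sketched like this, using the two normally credited tasks mmonnin posted earlier in the thread. The 1.9 million cap is only the rough estimate from this thread, not an official value, and the function name is made up for illustration.

```python
CREDIT_CAP = 1_900_000  # rough threshold estimated in this thread, not official

def runtime_limit(credit, runtime_s, cap=CREDIT_CAP):
    """Seconds a host can run before the projected credit exceeds the cap."""
    credit_per_second = credit / runtime_s
    return cap / credit_per_second

# Normally credited tasks reported earlier in the thread by mmonnin.
limit_1070 = runtime_limit(1_316_998.40, 27_656.18)
limit_1080ti = runtime_limit(1_694_500.25, 21_508.23)

print(f"1070/1070Ti limit: {limit_1070:,.0f} s")
print(f"1080Ti limit:      {limit_1080ti:,.0f} s")
```

That works out to roughly 39,900 s for the 1070/1070Ti host and roughly 24,100 s for the 1080Ti host. Every 20.83-credit task in mmonnin's list ran longer than the limit for its host (42,652.74 s, and 25,133.86 s and up), so those numbers are at least consistent with the theory.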
ID: 56049 · Rating: 0
zombie67 [MM]

Joined: 16 Jul 07
Posts: 209
Credit: 5,496,860,456
RAC: 8,582,660
Message 56148 - Posted: 24 Dec 2020, 15:33:20 UTC

Why is the number of tasks in progress dwindling? Are no new tasks being issued?
Reno, NV
Team: SETI.USA
ID: 56148 · Rating: 0
Ian&Steve C.

Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,535,595
RAC: 4,302,611
Message 56149 - Posted: 24 Dec 2020, 15:48:21 UTC - in response to Message 56148.  
Last modified: 24 Dec 2020, 15:49:07 UTC

most of the Python tasks I've received in the last 3 days have been "_0", so that indicates brand new. and a few resends here and there.

the rate at which they are creating them has likely slowed, and the demand is high since points chasers have come to try to snatch them up. it's also possible that the recent new (_0) ones are only recreations of earlier failed tasks that had some bug that needed fixing. it does seem that this run is concluding.
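
A small sketch of the "_0" check, for anyone who wants to sort their own task list. It assumes the number after the last underscore is the issue count, with 0 meaning a first issue as described in this post. The task names are made up for illustration.

```python
def issue_number(task_name):
    """Return the trailing _N of a task name as an int (0 = first issue)."""
    _, _, suffix = task_name.rpartition("_")
    return int(suffix)

# Made-up task names, for illustration only.
names = ["example_python_task_0", "example_python_task_3"]
labels = {n: ("new" if issue_number(n) == 0 else "resend") for n in names}
print(labels)
```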
ID: 56149 · Rating: 0
trigggl

Joined: 6 Mar 09
Posts: 25
Credit: 102,324,681
RAC: 0
Message 56151 - Posted: 25 Dec 2020, 16:41:49 UTC - in response to Message 55590.  

...
Also Warnings about path not found:
WARNING conda.core.envs_manager:register_env(50): Unable to register environment. Path not writable or missing.
environment location: /var/lib/boinc-client/projects/www.gpugrid.net/miniconda
  registry file: /root/.conda/environments.txt

Registry file location ( /root/ ) will not be accessible to boinc user unless conda is already installed on the host (by root user) and conda file is world readable
...

I had the same error message except that mine was trying to go to
/opt/boinc/.conda/environments.txt

ID: 56151 · Rating: 0
trigggl

Joined: 6 Mar 09
Posts: 25
Credit: 102,324,681
RAC: 0
Message 56152 - Posted: 25 Dec 2020, 16:43:36 UTC - in response to Message 55590.  
Last modified: 25 Dec 2020, 16:59:59 UTC

...
Also Warnings about path not found:
WARNING conda.core.envs_manager:register_env(50): Unable to register environment. Path not writable or missing.
environment location: /var/lib/boinc-client/projects/www.gpugrid.net/miniconda
  registry file: /root/.conda/environments.txt

Registry file location ( /root/ ) will not be accessible to boinc user unless conda is already installed on the host (by root user) and conda file is world readable
...

I had the same error message except that mine was trying to go to...
/opt/boinc/.conda/environments.txt
Looks harmless, thanks for reporting. It's because the "boinc" user doesn't have a HOME directory I think.

Gentoo put the home for boinc at /opt/boinc.
I updated the user file to change it to /var/lib/boinc.
ID: 56152 · Rating: 0

©2025 Universitat Pompeu Fabra