Experimental Python tasks (beta)
Author | Message |
---|---|
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,582,660 |
"you also appear to have your hosts set up to ONLY crunch these beta tasks. is there a reason for that?" I have reached my WUProp goals for the other apps, so I am interested in only this particular app (for now). "does your system process the normal tasks fine? maybe it's something going on with your system as a whole." Yep, all the other apps run fine, both here and on other projects. Reno, NV Team: SETI.USA |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"you also appear to have your hosts set up to ONLY crunch these beta tasks. is there a reason for that?" I have a theory, but I'm not sure if it's correct. can you tell me the peak_flops value reported in your coproc_info.xml file for the 2080ti? you are using a very old version of BOINC (7.9.3), which pre-dates the fixes implemented in 7.14.2 to properly calculate the peak flops of Turing cards. So I'm willing to bet that your version of BOINC is over-estimating your peak flops by a factor of 2. a 2080ti should read somewhere between 13.5 and 15 TFlops, and I'm guessing your old version of BOINC thinks it's closer to double that (25-30 TFlops). the second half of the theory is that there is some kind of hard limit (maybe an anti-cheat mechanism?) that caps the credit reward somewhere around 2,000,000. maybe 1.8 million, maybe 1.9 million? I haven't observed ANYONE getting a task earning that much, and all tasks that would reach that level based on runtime seem to get this 20-credit value. that's my theory; I could be wrong. if you try a newer version of BOINC that properly measures the flops on a Turing card, and you start getting real credit, then it might hold water. |
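The check asked for in this post can be sketched in a few lines of Python. This is only an illustration: it assumes coproc_info.xml is well-formed XML and that the value sits in an element named `peak_flops`, and the sample document is hypothetical rather than copied from a real client. The 13.5 to 15 TFlops range is the one quoted in the post.

```python
import xml.etree.ElementTree as ET

# Range quoted in the post above for a 2080 Ti, in FLOPS.
EXPECTED_2080TI = (13.5e12, 15e12)


def read_peak_flops(xml_text):
    """Return every <peak_flops> value found in the XML text, as floats."""
    root = ET.fromstring(xml_text)
    return [float(el.text) for el in root.iter("peak_flops") if el.text]


def looks_doubled(peak_flops, expected=EXPECTED_2080TI):
    """True if half the reported value falls inside the expected range."""
    low, high = expected
    return low <= peak_flops / 2 <= high


# Hypothetical sample, NOT copied from a real coproc_info.xml.
SAMPLE = """
<coprocs>
  <coproc_cuda>
    <name>GeForce RTX 2080 Ti</name>
    <peak_flops>28000000000000.000000</peak_flops>
  </coproc_cuda>
</coprocs>
"""

for value in read_peak_flops(SAMPLE):
    print(f"{value / 1e12:.1f} TFlops, doubled: {looks_doubled(value)}")
```

To try it on a real host, read the file's contents into a string and pass that to `read_peak_flops`. The file's location depends on how BOINC was installed.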
Joined: 22 Oct 20 Posts: 4 Credit: 34,434,982 RAC: 0 |
"Two outstanding issues are over-crediting (I am using some default BOINC formula) and, as far as i understand, the flops estimate (?)." Toni, one more issue to add to the list: the download from the Anaconda website does not work for hosts behind a proxy. Can you please add a check for the proxy settings in the BOINC client so external software can be downloaded? I have other hosts that are not behind a proxy and they download and run the Experimental tasks fine. Issue here: CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2> This error repeats until it eventually gives up after 5 minutes and fails the task. It happens on 2 hosts sitting behind a web proxy (Squid). |
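For context on the proxy report: conda can take proxy settings either from the `proxy_servers` key in `.condarc` or from the standard `HTTP_PROXY` / `HTTPS_PROXY` environment variables. A minimal sketch of the second route follows. The helper name `with_proxy` and the proxy address are hypothetical, and whether the BOINC client passes such variables on to a task is an assumption this thread does not confirm.

```python
import os


def with_proxy(proxy_url, base_env=None):
    """Return a copy of the environment with the proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in ("HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = proxy_url          # upper-case form
        env[name.lower()] = proxy_url  # lower-case form, also widely honoured
    return env


# Hypothetical Squid address; replace with the real proxy host and port.
env = with_proxy("http://proxy.example:3128", base_env={})
print(sorted(env))
```

The resulting dictionary could be handed to `subprocess.run(..., env=env)` when launching a download step by hand to see whether the CondaHTTPError goes away.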
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,582,660 |
A second, identical machine, except that it has dual GTX 1660 Ti cards, finally got some work. The tasks reported and were awarded the large credits. So that rules out the question WRT BOINC version. FWIW, that version of BOINC is the latest available from the repository. So maybe it is due to interruptions after all, and I am just unaware? I am running some more tasks now, and will check again in the morning. Reno, NV Team: SETI.USA |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"A second, identical machine, except that it has dual GTX 1660 Ti cards, finally got some work. The tasks reported and were awarded the large credits. So that rules out the question WRT BOINC version. FWIW, that version of BOINC is the latest available from the repository." it doesn't rule it out, because a 1660ti has a much lower flops value, around 5.5 TFlops. so with the old BOINC version it's estimated at ~11 TFlops, and that's not high enough to trigger the issue. you're only seeing it on the 2080ti because it's a much higher performing card: ~14 TFlops by default, and the old BOINC version is scaling it all the way up to 28+ TFlops. this makes the calculated credit MUCH higher than that of the 1660ti, hence triggering the 20-credit issue, according to my theory of course. your 1660ti tasks are well below the 2,000,000 credit threshold that I'm estimating. the highest I've seen is ~1.7 million, so the line can't be much higher. I'm willing to bet that if one of your tasks on that 1660ti system runs for ~30,000-40,000 seconds, it gets hit with 20 credits. ¯\_(ツ)_/¯ you really should try to get your hands on a newer version of BOINC. I use a custom-compiled version of BOINC, and have usually built it from newer versions of the source code. maybe one of the other guys here can point you to a different repository that has a newer version of BOINC that can properly manage the Turing cards. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
I also verified that restarting ALONE won't necessarily trigger the 20-credit reward. it depends WHEN you restart. if you restart the task early enough that the combined runtime doesn't come close to the 2mil credit mark, you'll get the normal points. take this task: https://www.gpugrid.net/result.php?resultid=31934720 I restarted it about 10-15 mins in. it started over from the 10% mark, ran to completion, and still got normal crediting, well below the threshold. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"A second, identical machine, except that it has dual GTX 1660 Ti cards, finally got some work. The tasks reported and were awarded the large credits. So that rules out the question WRT BOINC version. FWIW, that version of BOINC is the latest available from the repository." I see you changed BOINC to 7.17.0. another thing I noticed was that the change didn't take effect until new tasks were downloaded afterwards, so tasks that were already there and tagged with the overinflated flops value will probably still get 20 credits. only tasks downloaded after the change should work better. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
aaaand your 2080ti just completed a task and got credit with the new BOINC version. called it. http://www.gpugrid.net/result.php?resultid=31951281 |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"I'm willing to bet that if one of your tasks on that 1660ti system runs for ~30,000-40,000 seconds, it gets hit with 20 credits. ¯\_(ツ)_/¯" looks like just 25,000s was enough to trigger it. http://www.gpugrid.net/result.php?resultid=31946707 it'll even out over time, since your other tasks are earning 2x as much credit as they should, because the old version of BOINC is doubling your peak_flops value. |
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,582,660 |
|
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
they were working fine on your 2080ti system when you had 7.17.0. why change it? but the issue you're having now looks like the same issue that Richard was dealing with here: https://www.gpugrid.net/forum_thread.php?id=5204 that thread has the steps they took to fix it. it's a permissions issue. |
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,582,660 |
"they were working fine on your 2080ti system when you had 7.17.0. why change it?" That was a kludge. There is no such thing as 7.17.0. =;^) Once I verified that the newer version worked, I updated all my machines with the latest repository version, so it would be clean and updated going forward. Reno, NV Team: SETI.USA |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
There is such a thing. It’s the development branch. All of my systems use a version of BOINC based on 7.17.0 :) |
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,582,660 |
|
Joined: 2 Jul 16 Posts: 338 Credit: 7,987,341,558 RAC: 178,897 |
So long start-to-end run times cause the 20-credit issue, not restarts as such. But interrupted tasks restart at 0, and thus have a longer start-to-end run time. 1070 or 1070Ti: a 27,656.18 s task received 1,316,998.40; a 42,652.74 s task received 20.83. 1080Ti: a 21,508.23 s task received 1,694,500.25; the 25,133.86 s, 29,742.04 s and 38,297.41 s tasks received 20.83. I doubt they were interrupted, with the tasks being High Priority and nothing else but GPUGrid in the BOINC queue. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
yup, I confirmed this. I manually restarted a task that hadn't run very long and it didn't have the issue. the issue only happens if your credit reward would be greater than about 1.9 million. take some of your completed tasks and divide the total credit by the runtime in seconds to figure out how much credit you earn per second. then figure out how many seconds you need to hit 1.9 million. that's the runtime limit for your system; anything over that and you get the 20-credit bug. |
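The arithmetic described in this post can be sketched as follows. It assumes credit grows linearly with runtime and takes the 1.9 million cap as the poster's estimate rather than a confirmed project value. The sample figures are the normally credited tasks reported earlier in the thread.

```python
CREDIT_CAP = 1_900_000  # assumption: the approximate threshold guessed above


def runtime_limit(credit, runtime_s, cap=CREDIT_CAP):
    """Seconds at which a task would reach the cap, assuming linear credit."""
    credit_per_second = credit / runtime_s
    return cap / credit_per_second


# Normally credited tasks quoted earlier in the thread.
for card, credit, runtime_s in [
    ("1070 or 1070Ti", 1_316_998.40, 27_656.18),
    ("1080Ti", 1_694_500.25, 21_508.23),
]:
    print(f"{card}: limit is about {runtime_limit(credit, runtime_s):,.0f} s")
```

Under these assumptions both limits come out below the shortest 20-credit runtimes reported for the same cards (42,652.74 s and 25,133.86 s), which is consistent with the theory.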
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,582,660 |
Why is the number of tasks in progress dwindling? Are no new tasks being issued? Reno, NV Team: SETI.USA |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
most of the Python tasks I've received in the last 3 days have been "_0", which indicates brand new, with a few resends here and there. the rate at which they are creating them has likely slowed, and demand is high since points chasers have come to snatch them up. it's also possible that the recent new (_0) ones are only recreations of earlier failed tasks that had some bug that needed fixing. it does seem that this run is concluding. |
Joined: 6 Mar 09 Posts: 25 Credit: 102,324,681 RAC: 0 |
... I had the same error message except that mine was trying to go to /opt/boinc/.conda/environments.txt |
Joined: 6 Mar 09 Posts: 25 Credit: 102,324,681 RAC: 0 |
... I had the same error message except that mine was trying to go to... /opt/boinc/.conda/environments.txt Looks harmless, thanks for reporting. It's because the "boinc" user doesn't have a HOME directory I think. Gentoo put the home for boinc at /opt/boinc. I updated the user file to change it to /var/lib/boinc. |
©2025 Universitat Pompeu Fabra