Experimental Python tasks (beta)

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 55588 - Posted: 13 Oct 2020, 6:07:19 UTC

I'm creating some experimental tasks for the Python app (marked as Beta). They are Linux- and CUDA-specific and serve as preparation for future batches.

They may use a relatively large amount of disk space (on the order of 1-10 GB), which persists between runs and is cleared if you reset the project.

rod4x4
Message 55590 - Posted: 13 Oct 2020, 7:44:18 UTC - in response to Message 55588.  
Last modified: 13 Oct 2020, 8:24:54 UTC

> I'm creating some experimental tasks for the Python app (marked as Beta). They are Linux- and CUDA-specific and serve as preparation for future batches.
>
> They may use a relatively large amount of disk space (on the order of 1-10 GB), which persists between runs and is cleared if you reset the project.



Preference ticked, ready and waiting...

EDIT: Received some already:
https://www.gpugrid.net/result.php?resultid=29466771
https://www.gpugrid.net/result.php?resultid=29466770

Conda warnings were reported. Will you push out an update with the app, or are they safe to ignore?

There are also warnings about a path not being found:
WARNING conda.core.envs_manager:register_env(50): Unable to register environment. Path not writable or missing.
environment location: /var/lib/boinc-client/projects/www.gpugrid.net/miniconda
  registry file: /root/.conda/environments.txt

The registry file location (/root/) will not be accessible to the boinc user unless conda is already installed on the host (by the root user) and the conda file is world-readable.
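
A minimal Python sketch of the check behind that warning, assuming (as the log above suggests) that conda records environments in .conda/environments.txt under the running user's home directory. The helper name registry_status is hypothetical and not part of conda; it only reports whether that file, or the nearest existing parent directory, is writable by the current user.

```python
import os
import tempfile
from pathlib import Path


def registry_status(home):
    """Return (registry_path, writable) for a given home directory.

    The path mirrors the one in the warning: <home>/.conda/environments.txt.
    'writable' is True when the file itself, or the nearest existing
    parent directory, can be written by the current user.
    """
    registry = Path(home) / ".conda" / "environments.txt"
    probe = registry
    # Walk upwards until we reach something that exists on disk.
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent
    writable = probe.exists() and os.access(probe, os.W_OK)
    return registry, writable


if __name__ == "__main__":
    # A service account may have HOME unset or pointing somewhere unwritable.
    home = os.environ.get("HOME", "/nonexistent")
    print(registry_status(home))
    # A temporary directory stands in for a writable home directory.
    with tempfile.TemporaryDirectory() as fake_home:
        print(registry_status(fake_home))
```

Running it as the boinc user would show whether the registry path resolves to somewhere that account can actually write.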

Otherwise, the task status is Completed and Validated.

Toni
Message 55591 - Posted: 13 Oct 2020, 9:25:38 UTC - in response to Message 55590.  

Looks harmless, thanks for reporting. I think it's because the "boinc" user doesn't have a HOME directory.

rod4x4
Message 55592 - Posted: 13 Oct 2020, 11:14:14 UTC - in response to Message 55591.  
Last modified: 13 Oct 2020, 11:17:49 UTC

> Looks harmless, thanks for reporting. I think it's because the "boinc" user doesn't have a HOME directory.


Agreed.

Perhaps adding the "./envs" switch to the end of the command

/var/lib/boinc-client/projects/www.gpugrid.net/miniconda/bin/conda install

may help with setting up the environment.

This switch should add the environment file to the current directory, i.e. the one from which the command is executed.

Keith Myers
Message 55724 - Posted: 12 Nov 2020, 1:59:01 UTC

I got one of these tasks, which confused me, as I have not set "accept beta applications" in my project preferences.

It failed after 1200 seconds.

Any idea why I got this task even though I have not accepted the app through the beta settings?

https://www.gpugrid.net/result.php?resultid=30508976

Ian&Steve C.
Message 55920 - Posted: 9 Dec 2020, 19:42:43 UTC

What is the difference between these test Python apps and the standard one? Is it just that this application is coded in Python? What language are the default apps coded in?

Keith Myers
Message 55926 - Posted: 9 Dec 2020, 23:40:13 UTC - in response to Message 55920.  

Both apps run under a wrapper. One is the stock acemd3, which I assume is written in some form of C.

The new Anaconda Python task is a conda application, plus Python.

I think Toni is going to have to explain what these new tasks and application are and how they work.

Very strange behavior. I think the conda and python parts run first and communicate with the project, doing some intermediary calculation/configuration/formatting or something. Lots of upstream network activity and nothing going on in the client transfers screen.

I saw the tasks get to 100% progress with no time remaining and then stall out. No upload of the finished task.

I looked away from the machine, looked again, and now both tasks have reset their progress and have 3 hours to run.

I first saw conda show up in the process list, and now that has disappeared, replaced by an acemd3 and a python process for each task.

They must be doing something other than insta-failing like the previous tries.

sph
Message 55933 - Posted: 10 Dec 2020, 5:22:30 UTC

CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2>
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.


I am receiving this error in the stderr output for Experimental Python tasks on all my hosts.

This is probably because all my PCs are behind a proxy. Can you please set the Python tasks to use the proxy defined in the BOINC client?
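
A rough Python sketch of one way a wrapper could hand proxy settings to a child process such as conda. It rests on the assumption that conda's downloader honors the conventional HTTP_PROXY / HTTPS_PROXY environment variables; the helper name proxy_env and the host/port values are hypothetical placeholders, and nothing here reads the BOINC client's own proxy configuration.

```python
import os


def proxy_env(host, port, base_env=None):
    """Return a copy of the environment with proxy variables set.

    Both upper- and lower-case names are set, since tools differ in
    which spelling they look for.
    """
    env = dict(os.environ if base_env is None else base_env)
    url = f"http://{host}:{port}"
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        env[name] = url
    return env


if __name__ == "__main__":
    # Placeholder proxy; replace with the real host and port.
    env = proxy_env("proxy.example.lan", 3128)
    # The dict could then be passed as env= to subprocess.run([...]).
    print(env["HTTPS_PROXY"])
```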

Work Units here:
https://www.gpugrid.net/result.php?resultid=31672354
https://www.gpugrid.net/result.php?resultid=31668427
https://www.gpugrid.net/result.php?resultid=31665961

Keith Myers
Message 55936 - Posted: 10 Dec 2020, 8:30:18 UTC

Boy, mixing both regular acemd3 and the python anaconda tasks sure F*s up the APR for both tasks. The insanely low APR for the Python tasks is forcing all GPUGrid tasks into High Priority.

The regular acemd3 tasks are getting 3-6 day estimated completions.

Ian&Steve C.
Message 55945 - Posted: 10 Dec 2020, 15:25:26 UTC - in response to Message 55936.  
Last modified: 10 Dec 2020, 15:41:38 UTC

> Boy, mixing both regular acemd3 and the python anaconda tasks sure F*s up the APR for both tasks. The insanely low APR for the Python tasks is forcing all GPUGrid tasks into High Priority.
>
> The regular acemd3 tasks are getting 3-6 day estimated completions.


I'm seeing that too lol. But it doesn't seem to be causing too much trouble for me, since I don't run more than one GPU project concurrently. I only have Prime and backup.

Copying my message from another thread, with my observations about these tasks, so Toni can see it in case he doesn't check the other threads:

Looks like I have 11 successful tasks, and 2 failures.

The two failures both ended with "196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" after a few minutes, on different hosts.
https://www.gpugrid.net/result.php?resultid=31680145
https://www.gpugrid.net/result.php?resultid=31678136

Curious, since both systems have plenty of free space, and I've allowed BOINC to use 90% of it.
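
One possible explanation, offered as an assumption rather than a diagnosis: BOINC compares a task's disk usage against a per-task bound set by the project (the workunit's rsc_disk_bound), not against the free space on the host. The Python sketch below only illustrates that kind of comparison; dir_size_bytes and exceeds_bound are hypothetical helpers, and the directory and limit are made-up values.

```python
import os
import tempfile

EXIT_DISK_LIMIT_EXCEEDED = 196  # 0xc4, the exit code quoted above


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def exceeds_bound(root, bound_bytes):
    """True when the directory uses more space than the given bound."""
    return dir_size_bytes(root) > bound_bytes


if __name__ == "__main__":
    # A temporary directory stands in for a task's working directory.
    with tempfile.TemporaryDirectory() as slot:
        with open(os.path.join(slot, "pkg.bin"), "wb") as fh:
            fh.write(b"\0" * 4096)
        print(hex(EXIT_DISK_LIMIT_EXCEEDED))
        print(exceeds_bound(slot, 1024))
```

If that assumption holds, a large conda environment unpacked inside the task's directory could trip the bound even on a nearly empty disk.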

These tasks also behave very differently from the default new-version acemd tasks, and they don't seem well optimized yet:
- less reliance on PCIe bandwidth: 2-8% PCIe 3.0 bus utilization
- more reliance on GPU VRAM: 2-3 GB memory used
- less GPU utilization: 65-85% (maybe more dependent on a fast CPU/memory subsystem; my 3900X system gets better GPU% than my slower EPYC systems)

Contrast that with the default acemd3 tasks:
- 25-50% PCIe 3.0 bus utilization
- about 500 MB GPU VRAM used
- 95+% GPU utilization

Thinking about the GPU utilization being dependent on CPU speed: it could also have to do with the relative speed of the GPU and the CPU. This is just something I observed on my systems. Slower GPUs seem to tolerate slower CPUs better, which makes sense if CPU speed is the limiting factor.

Ryzen 3900X @4.20GHz w/ 2080ti = 85% GPU Utilization
EPYC 7402P @3.30GHz w/ 2080ti = 65% GPU Utilization
EPYC 7402P @3.30GHz w/ 2070 = 76% GPU Utilization
EPYC 7642 @2.80GHz w/ 1660Super = 71% GPU Utilization
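
For anyone wanting to collect comparable numbers, here is a small Python sketch that samples GPU utilization and memory use through nvidia-smi's CSV query mode. It is not how the figures above were measured (the post does not say); parse_gpu_rows and sample_gpus are hypothetical helper names, and PCIe bus utilization is not covered by this query.

```python
import subprocess

# nvidia-smi query mode: one CSV line per GPU, no header, no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse 'name, util, mem' lines into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        # Split from the right so a comma in the GPU name is harmless.
        name, util, mem = [field.strip() for field in line.rsplit(",", 2)]
        rows.append({"name": name, "util_pct": int(util), "mem_mib": int(mem)})
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    # Made-up sample line, so the parser can be tried without a GPU.
    print(parse_gpu_rows("GeForce RTX 2080 Ti, 85, 2300\n"))
```

The parser assumes numeric fields; a GPU that reports a field as unsupported would need extra handling.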

This needs more optimization IMO. The default app performs much better at keeping the GPU fully loaded.

Richard Haselgrove
Message 55946 - Posted: 10 Dec 2020, 16:03:34 UTC - in response to Message 55936.  

> Boy, mixing both regular acemd3 and the python anaconda tasks sure F*s up the APR for both tasks. The insanely low APR for the Python tasks is forcing all GPUGrid tasks into High Priority.
>
> The regular acemd3 tasks are getting 3-6 day estimated completions.

Actually, that won't be the cause. The APRs are kept separately for each application, and once you have an 'active' APR (11 or more 'completions', i.e. validated tasks for that app), they should keep out of each other's way.

What will F* things up is that this project still allows DCF to run free - and that's a single value which is applied to both task types.

Keith Myers
Message 55947 - Posted: 10 Dec 2020, 16:07:55 UTC - in response to Message 55946.  

Yeah, after I wrote that I realized I meant that the DCF is what is messing up the runtime estimates.

I wonder if the regular acemd3 tasks will ever get their DCFs back to normal.

I haven't run ANY of my other GPU project tasks since these anaconda python tasks showed up. I will eventually, of course, when the other projects' deadlines approach.

Ian&Steve C.
Message 55948 - Posted: 10 Dec 2020, 16:09:51 UTC - in response to Message 55946.  

what's DCF?

Keith Myers
Message 55949 - Posted: 10 Dec 2020, 16:29:16 UTC - in response to Message 55948.  

> what's DCF?

Task Duration Correction Factor.
Projects on older BOINC server versions, like Einstein, use it.
It messes up GPU tasks of different apps there too.

Richard Haselgrove
Message 55951 - Posted: 10 Dec 2020, 17:11:20 UTC - in response to Message 55947.  

You can't talk about 'their DCFs': there is only one (there could have been more than one, but that's the way David chose to play it).

You can see it in BOINC Manager, in the Projects | Properties dialog. If it gets really, really high (above 90), it'll inch downwards at 1% per task. Below 90, it'll speed up to 10% per task. The standard advice used to be "two weeks to stabilise", but with modern machines (multi-core, multi-GPU, and faster), the tasks fly by, and it should be quicker.
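
As a back-of-the-envelope illustration, this Python sketch takes the description above literally: 1% off per task while the DCF is above 90, 10% off per task below that. It is not the BOINC client's actual code. The floor of 1.0 and the premise that every later task has an accurate estimate are assumptions, and next_dcf and tasks_to_settle are hypothetical names.

```python
def next_dcf(dcf, floor=1.0):
    """Apply one completed task's worth of downward correction."""
    rate = 0.01 if dcf > 90 else 0.10
    return max(floor, dcf * (1 - rate))


def tasks_to_settle(dcf, floor=1.0):
    """Count the tasks needed for the DCF to fall back to the floor."""
    tasks = 0
    while dcf > floor:
        dcf = next_dcf(dcf, floor)
        tasks += 1
    return tasks


if __name__ == "__main__":
    for start in (100.0, 85.0, 10.0):
        print(start, tasks_to_settle(start))
```

Under these assumptions a DCF in the mid-80s needs a few dozen well-estimated tasks to settle, which fits the remark that fast modern hosts should stabilise quicker than the old two-week rule of thumb.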

Keith Myers
Message 55953 - Posted: 10 Dec 2020, 17:28:15 UTC - in response to Message 55951.  

What is also messed up is the estimated computation size of the Anaconda Python tasks, as shown in the task properties.

The ones I crunched were only set for 3,000 GFLOPs.

The regular acemd3 tasks are set for 5,000,000 GFLOPs.

This also probably influenced the wildly inaccurate DCF for the new Python tasks.
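
To put rough numbers on that, here is a simplified Python sketch of how a runtime estimate scales with the estimated computation size and the DCF. It assumes estimate = size / device speed x DCF, which glosses over details of the real client; the 1,000 GFLOPS device speed is a made-up figure and estimated_seconds is a hypothetical helper.

```python
GFLOP = 1e9  # floating-point operations in one GFLOP


def estimated_seconds(fpops_est, device_flops, dcf=1.0):
    """Simplified estimate: work size over device speed, scaled by DCF."""
    return fpops_est / device_flops * dcf


PYTHON_EST = 3_000 * GFLOP      # size reported for the Python tasks
ACEMD3_EST = 5_000_000 * GFLOP  # size reported for the acemd3 tasks
DEVICE = 1_000 * GFLOP          # hypothetical effective speed, per second

if __name__ == "__main__":
    print(estimated_seconds(PYTHON_EST, DEVICE))          # raw Python estimate
    print(estimated_seconds(ACEMD3_EST, DEVICE))          # raw acemd3 estimate
    print(estimated_seconds(ACEMD3_EST, DEVICE, dcf=85))  # acemd3 with DCF 85
```

With these made-up inputs a Python task that really runs for hours is estimated at seconds, so the shared DCF climbs, and that inflated DCF then stretches the acemd3 estimate into days.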

Keith Myers
Message 55954 - Posted: 10 Dec 2020, 17:33:17 UTC - in response to Message 55951.  

> You can't talk about 'their DCFs': there is only one (there could have been more than one, but that's the way David chose to play it).
>
> You can see it in BOINC Manager, in the Projects | Properties dialog. If it gets really, really high (above 90), it'll inch downwards at 1% per task. Below 90, it'll speed up to 10% per task. The standard advice used to be "two weeks to stabilise", but with modern machines (multi-core, multi-GPU, and faster), the tasks fly by, and it should be quicker.

On this daily driver, the GPUGrid DCF shown in the project properties is currently at 85 and change.

Ian&Steve C.
Message 55955 - Posted: 10 Dec 2020, 17:33:49 UTC - in response to Message 55953.  
Last modified: 10 Dec 2020, 17:37:11 UTC

> What is also messed up is the estimated computation size of the Anaconda Python tasks, as shown in the task properties.
>
> The ones I crunched were only set for 3,000 GFLOPs.
>
> The regular acemd3 tasks are set for 5,000,000 GFLOPs.
>
> This also probably influenced the wildly inaccurate DCF for the new Python tasks.

Can confirm.

Could this be why the credit reward is so high too?

I wonder what the flop estimate was on this one from Kevvy:
https://www.gpugrid.net/result.php?resultid=31679003
He got wrecked on this one: over 5 hours on a 2080 Ti, and a mere 20 credits lol.

biodoc
Message 55956 - Posted: 10 Dec 2020, 18:20:14 UTC

I've got one running now on an RTX 2070S, and the only real issue is low GPU utilization (60-70%). The current task is using ~2 GB of VRAM and ~3 GB of system RAM. I have one thread free on a Ryzen 3900X to support the GPU, and that thread is running at 100%. This computer has completed 3 of the new Python tasks successfully.

Linux Mint 20; Driver Version: 440.95.01; CUDA Version: 10.2

Ian&Steve C.
Message 55957 - Posted: 10 Dec 2020, 18:25:08 UTC - in response to Message 55956.  

> I've got one running now on an RTX 2070S, and the only real issue is low GPU utilization (60-70%). The current task is using ~2 GB of VRAM and ~3 GB of system RAM. I have one thread free on a Ryzen 3900X to support the GPU, and that thread is running at 100%. This computer has completed 3 of the new Python tasks successfully.
>
> Linux Mint 20; Driver Version: 440.95.01; CUDA Version: 10.2


What kind of BOINC install do you have? Does it run as a service, or is it a standalone install that runs from an executable?

What is the clock speed of your 3900X, and what is your memory speed?

Try leaving 2 threads free (so that one is doing nothing) to avoid maxing out the CPU at 100% utilization on all threads. That is known to slow down GPU work, and freeing a thread might increase your GPU utilization a bit.
