Message boards :
News :
Experimental Python tasks (beta) - task description
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 351
> I don't see any parameter in the jobin page that allocates the number of cpus the task will tie up.

You're right - it doesn't belong there. It will be set in the <app_version>, through the plan class - see https://boinc.berkeley.edu/trac/wiki/AppPlanSpec. And to amplify Ian's point: not only does BOINC not control the application, it merely allocates the specified amount of free resources for the task to run in. |
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 731
Since these apps aren't proper BOINC MT (multi-threaded) apps using an MT plan class, you wouldn't be using the <max_threads>N [M]</max_threads> parameter. It seems the proper parameter to use would be <cpu_frac>x</cpu_frac>. Do you concur? |
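For context, a plan class along the lines being discussed would be declared in the project's plan_class_spec.xml. The sketch below is my own illustration based on the AppPlanSpec wiki page linked above; the class name is hypothetical and the values are placeholders, not the project's actual settings:

```xml
<plan_classes>
    <plan_class>
        <name>cuda_python</name>          <!-- hypothetical class name -->
        <gpu_type>nvidia</gpu_type>
        <cuda/>
        <ngpus>1</ngpus>
        <!-- fraction of the app's computing done by the CPU; this is what
             drives the CPU reservation BOINC shows for the task -->
        <cpu_frac>0.5</cpu_frac>
    </plan_class>
</plan_classes>
```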
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 731
A bunch of the standard Python 4.03 versioned tasks have been going out and erroring out. I've had five so far today. The problems are in the main learner step, with the pytorch packages. https://www.gpugrid.net/result.php?resultid=33268830 |
Joined: 8 Aug 19 Posts: 252 Credit: 458,054,251 RAC: 0
Maybe this will help Abou with the scripting; I'm too green at Python to know. https://stackoverflow.com/questions/58666537/error-while-feeding-tf-dataset-to-fit-keyerror-embedding-input |
Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0
I was allocated two "ATM: Free energy calculations of protein-ligand binding v1.11 (cuda1121)" tasks, and both of them were cancelled by the server during transmission. What are these tasks about, and why were they cancelled? |
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 731
The researcher cancelled them because they recognized a problem with how the package was put together, and the tasks would have failed. Better to cancel them in the pipeline than to waste download bandwidth and the crunchers' resources. You can thank them for being responsible and diligent. |
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 731
I successfully ran one of the beta Python tasks after the first cruncher errored out the task. https://www.gpugrid.net/result.php?resultid=33268305 |
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
The beta tasks were the same size as the normal ones, so if they run faster, hopefully the future PythonGPU tasks will too. |
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
Thank you very much for pointing it out. Will look at the error this morning! |
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 5,269
Finally got some more beta tasks and they seem to be running fine, now limited to only 4 threads on the main run.py process. But I did notice that VRAM use has increased by about 30%: tasks are now using more than 4GB on the GPU, where before it was about 3.2GB. Was that intentional? Are these beta tasks going to be the same as the new batch? Beta is running fine, but the small number of non-beta ones going out seem to be failing.
Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0
I never blamed anyone, just asked a question for my own knowledge. Anyway, thank you. Now I wish I could get a task. |
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
... this is definitely bad news for GPUs with 8GB VRAM, like the two RTX 3070s in my case. Before, I could run 2 tasks on each GPU. It was quite tight, but it worked (with some 70-100MB left on the GPU the monitor is connected to). |
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
Yes, these latest beta tasks use a little more GPU memory; the AI agent has a bigger neural network. I hope it is not too big and that most machines can still handle it. What about the number of threads? Is it any better? I also fixed the problems with the non-beta app (the queue was empty, but I guess some failed jobs were added back to it after the new software version was released). Please let me know if more errors occur. |
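As a rough illustration of why a bigger network costs VRAM (my own back-of-the-envelope sketch, not the project's actual model sizes), weight storage alone scales linearly with parameter count, and activations, optimizer state, and the CUDA context come on top of it:

```python
def param_memory_mb(n_params: int, bytes_per_param: int = 4) -> float:
    """Approximate VRAM needed just to store the model weights
    (fp32 = 4 bytes per parameter). Activations, optimizer state,
    and the CUDA context add considerably more on top."""
    return n_params * bytes_per_param / (1024 ** 2)

# e.g. growing a hypothetical agent from 50M to 80M parameters
# adds ~114 MB for the weights alone
print(param_memory_mb(80_000_000) - param_memory_mb(50_000_000))
```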
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 5,269
I have 4 of the beta tasks running. The number of threads looks good: 4 threads per task, as specified in the run.py script. I just got an older non-beta task resend, and it's working fine so far (after I manually edited the threads again), but the setup with the beta tasks seems viable to push out to non-beta now. About VRAM use: so far it seems they use about 3800MB when they first get going, but that rises over time; at about 50% run completion the tasks are up to ~4600MB each. Not sure how high they will go. The old tasks did not have this behavior of VRAM increasing as the task progressed. Is it necessary, or is it leaking and not cleaning up?
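One way to track per-task VRAM growth like this over a run (my own monitoring sketch, not part of the project's scripts) is to poll `nvidia-smi` periodically and parse its per-process CSV output:

```python
import subprocess

def parse_vram_csv(csv_text: str) -> dict:
    """Parse `nvidia-smi` CSV lines like '12345, 4600' into {pid: MiB}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        pid, mib = (field.strip() for field in line.split(","))
        usage[int(pid)] = int(mib)
    return usage

def query_vram_mb() -> dict:
    """Return {pid: MiB used} for current GPU compute processes."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_vram_csv(out)

# Calling query_vram_mb() every few minutes and logging the result shows
# whether a task's footprint keeps climbing (possible leak) or plateaus.
```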
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
Great! Very helpful feedback, Ian, thanks. Since the scripts seem to run correctly, I will start sending tasks to the PythonGPU app with the current Python script version. In parallel, I will look into the VRAM increase by running a few more tests in PythonGPUbeta. I don't think it is a memory leak, but maybe there is a way for more efficient memory use in the code. I will dig a bit into it and post updates on that. |
Joined: 6 Mar 18 Posts: 38 Credit: 1,340,042,080 RAC: 27
Got one of the new betas; it's using about 28% on average of my 16-core 5950X in Windows 11, so roughly 9 threads? |
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
The scripts still spawn 32 Python threads. But I think before, with wandb and maybe without fixing some environment variables, even more were spawned. However, note that 32 cores are not necessary to run the scripts. I am not sure what the optimal number is, but it is much lower than 32. |
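For anyone capping threads locally, a common pattern is to pin the math-library thread-pool environment variables before the heavy imports. This is a general sketch with illustrative values; which variables the project's scripts actually fix is not stated in this thread:

```python
import os

# These must be set before numpy/pytorch are imported, otherwise the
# thread pools are already sized. The cap of 4 here is illustrative.
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS",
            "OPENBLAS_NUM_THREADS", "NUMEXPR_NUM_THREADS"):
    os.environ.setdefault(var, "4")
```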
Joined: 6 Mar 18 Posts: 38 Credit: 1,340,042,080 RAC: 27
Yeah, it definitely uses less overall CPU time than before. I've capped the apps at 10 cores now, which seems like the sweet spot to allow me to also run other apps. |
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 5,269
> Great! very helpful feedback Ian thanks.

Seeing up to 5.6GB VRAM use per task, but it doesn't seem consistent: some tasks will go up to ~4.8GB, others 4.5GB, etc. There doesn't seem to be a clear pattern to it. The previous tasks were very consistent and always used exactly the same amount of VRAM.
©2025 Universitat Pompeu Fabra