Python apps for GPU hosts 4.03 (cuda1131) using a LOT of CPU

Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 59346 - Posted: 27 Sep 2022, 13:33:46 UTC - in response to Message 59304.  
Last modified: 27 Sep 2022, 14:13:29 UTC

Keith Myers wrote:

Use an app_config.xml file.

<app_config>
...
<app>
<name>PythonGPU</name>
<gpu_versions>
<gpu_usage>0.33</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
</app_config>


I put the above into app_config.xml with the intention of running 3 Pythons on one GPU.
However, after downloading 2 tasks (which started right away) and trying to fetch a third, BOINC told me "the computer has reached a limit on tasks in progress", and no third task could be downloaded :-(

At this point I remembered that it has always been said that only 2 tasks per GPU can be downloaded from GPUGRID.

So, how did you manage to download 3 tasks per GPU?
ID: 59346
Keith Myers
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Message 59349 - Posted: 27 Sep 2022, 16:59:49 UTC - in response to Message 59346.  
Last modified: 27 Sep 2022, 17:01:14 UTC

Keith Myers wrote:

Use an app_config.xml file.

<app_config>
...
<app>
<name>PythonGPU</name>
<gpu_versions>
<gpu_usage>0.33</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
</app_config>


I put the above into app_config.xml with the intention of running 3 Pythons on one GPU.
However, after downloading 2 tasks (which started right away) and trying to fetch a third, BOINC told me "the computer has reached a limit on tasks in progress", and no third task could be downloaded :-(

At this point I remembered that it has always been said that only 2 tasks per GPU can be downloaded from GPUGRID.

So, how did you manage to download 3 tasks per GPU?


By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.
ID: 59349
Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 59350 - Posted: 27 Sep 2022, 18:08:12 UTC - in response to Message 59349.  

...
So, how did you manage to download 3 tasks per GPU?

By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.

When you say editing coproc_info.xml, are you talking about the entry

<warning>NVIDIA library reports 1 GPU</warning>

near the bottom? After changing this to "2", how would I lock it down?
ID: 59350
Bill F
Joined: 21 Nov 16
Posts: 36
Credit: 164,429,114
RAC: 18
Message 60054 - Posted: 10 Mar 2023, 4:27:28 UTC

Here is a URL to a BOINC message board thread on Hardware Accelerated GPU scheduling. Is this something that might give Windows-based systems a minor improvement?

https://boinc.berkeley.edu/dev/forum_thread.php?id=14235#104003

Thanks
Bill F
ID: 60054
Keith Myers
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Message 60055 - Posted: 10 Mar 2023, 8:14:57 UTC - in response to Message 59350.  

...
So, how did you manage to download 3 tasks per GPU?

By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.

When you say editing coproc_info.xml, are you talking about the entry

<warning>NVIDIA library reports 1 GPU</warning>

near the bottom? After changing this to "2", how would I lock it down?

No, not entirely. You would just duplicate the card detection section and then increment the card count in that line.

But you also have to prevent BOINC from changing the file afterwards, which would reset the count to the true detection.

You make the edit in the file and then mark the file immutable. In Linux, you would execute:

sudo chattr +i coproc_info.xml
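
For anyone following along, a rough sketch of the whole procedure, assuming a systemd-managed boinc-client service and a data directory of /var/lib/boinc-client (both are assumptions; paths and service names vary by distribution):

# stop the client so it cannot rewrite the file while you edit it
sudo systemctl stop boinc-client

# duplicate the GPU detection section and bump the reported count
sudo nano /var/lib/boinc-client/coproc_info.xml

# mark the file immutable so the client cannot overwrite the spoofed count
sudo chattr +i /var/lib/boinc-client/coproc_info.xml

sudo systemctl start boinc-client

# to revert later, clear the flag and let the client re-detect the GPUs:
# sudo chattr -i /var/lib/boinc-client/coproc_info.xml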
ID: 60055
Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 60056 - Posted: 10 Mar 2023, 15:18:53 UTC - in response to Message 60055.  

...
So, how did you manage to download 3 tasks per GPU?

By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.

When you say editing coproc_info.xml, are you talking about the entry

<warning>NVIDIA library reports 1 GPU</warning>

near the bottom? After changing this to "2", how would I lock it down?

No, not entirely. You would just duplicate the card detection section and then increment the card count in that line.

But you also have to prevent BOINC from changing the file afterwards, which would reset the count to the true detection.

You make the edit in the file and then mark the file immutable. In Linux, you would execute:

sudo chattr +i coproc_info.xml

Hi Keith, thanks for your reply.
In the meantime I had found out from a YouTube video how to spoof a GPU; in fact, that was months ago, and it has been working well ever since :-)
ID: 60056
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60057 - Posted: 10 Mar 2023, 15:56:15 UTC - in response to Message 59328.  
Last modified: 10 Mar 2023, 16:22:58 UTC

That way it will mix the projects, one from each on one GPU. 0.6+0.4=1. But 0.6+0.6>1 so it won’t start two from GPUGRID on the same GPU. It will go to the next GPU with open resources.

That will not fix anything.

I'm not certain what exact problem you are referring to, so I can only give as many pointers as I can think of. This might or might not help:
You might just try setting the GPUGrid tasks to <gpu_usage>0.6</gpu_usage> and not run a second project.
I actually don't know what happens then, but I'm interested in it.
So, if you try, please tell us. :-)
Else experiment with the tags <fetch_minimal_work>, <ngpus>, <max_concurrent>, try setting <gpu_usage> to 1.1 or 2.0, or whatever else comes to mind regarding client configuration.
You might also set up two BOINC instances and give one GPU to each of them.

*edit*
With BOINC client 7.14.x you could also set the resource share and the BOINC caches to 0. That way you should not get any work as long as a device is occupied.
*end edit*
Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.
I see complaints that I'm not being clear. Note that I only install one GPU per computer. Let me try it this way for a single computer:
IF ACEMD Tasks ready to send >= 1 THEN DL+RUN 2 ELSE
IF ATM Tasks ready to send >= 1 THEN DL+RUN 4 ELSE
IF Python Tasks ready to send = 1 THEN DL+RUN 1
So imagine a miracle occurs and there's a plethora of all manner of GG WUs. Then each of my computers would be running 2 ACEMD WUs. If ACEMD runs low then each computer would run from 1 to 4 ATM WUs. And if there's a dearth of both ACEMD and ATM WUs each computer might run some combination like: 1 ACEMD + 1 ATM, or 1 ACEMD + 1 Python, or 1 ATM + 1 Python, or 4 ATM.
If GG used the same minimal Project Preferences that LHC does then I could make a very productive compromise to get the most of each generation of my computers.
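
For reference, the closest a client-side workaround gets today is per-app <max_concurrent> caps in app_config.xml; that can limit what runs, but it cannot make the scheduler send a particular mix. A rough sketch along those lines (the acemd4 and ATM <name> values are assumptions; check the <app> entries in client_state.xml for the exact names, and the server-side 2-tasks-per-GPU limit still applies):

<app_config>
<app>
<name>acemd4</name> <!-- assumed app name; verify in client_state.xml -->
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
<app>
<name>PythonGPU</name>
<max_concurrent>1</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
<app>
<name>ATM</name> <!-- assumed app name; verify in client_state.xml -->
<max_concurrent>4</max_concurrent>
<gpu_versions>
<gpu_usage>0.25</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
</app_config>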
ID: 60057
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60058 - Posted: 10 Mar 2023, 16:06:21 UTC - in response to Message 60057.  
Last modified: 10 Mar 2023, 16:06:53 UTC

Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.


The only way to do this is to run multiple BOINC clients and manage the caches separately. It will require a good bit of manual intervention on your part.
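
A rough sketch of what a second client instance looks like on Linux (the directory and RPC port below are arbitrary examples):

# create a separate data directory for the second client instance
mkdir -p ~/boinc2

# start a second client bound to its own data directory and GUI RPC port
boinc --allow_multiple_clients --dir ~/boinc2 --gui_rpc_port 31418 --daemon

# talk to that instance with boinccmd, or add localhost:31418 as a host in BOINC Manager
boinccmd --host localhost:31418 --passwd "$(cat ~/boinc2/gui_rpc_auth.cfg)" --get_state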
ID: 60058
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60059 - Posted: 10 Mar 2023, 16:18:45 UTC - in response to Message 60058.  

Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.


The only way to do this is to run multiple BOINC clients and manage the caches separately. It will require a good bit of manual intervention on your part.
No, it's not the only way it can be done. I can do that on LHC today and have been able to do it for a long time.
ID: 60059
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60060 - Posted: 10 Mar 2023, 16:37:18 UTC - in response to Message 60059.  
Last modified: 10 Mar 2023, 16:38:24 UTC

Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.


The only way to do this is to run multiple BOINC clients and manage the caches separately. It will require a good bit of manual intervention on your part.
No, it's not the only way it can be done. I can do that on LHC today and have been able to do it for a long time.

Custom/modified BOINC server software.

But I was saying it's the only way to do that on GPUGRID, which is true.
ID: 60060
Greger
Joined: 6 Jan 15
Posts: 76
Credit: 25,499,534,331
RAC: 0
Message 60061 - Posted: 10 Mar 2023, 16:48:58 UTC - in response to Message 60059.  

If this works on LHC, you would need to point out HOW those preferences are set on that project.

I have run LHC for many years, and it uses the same logic for user preference settings as "GG" does.
What it adds is a separate application layer on top of the distribution of work units, so you can choose vbox or native separately for each sub-project.

An IF THEN ELSE statement would require additional coding, and it would not work for distributing units without a reject/abort script.

To accomplish such a thing, with a specific number of units for each app on one host or combined, would require app_config but also 3 instances of the BOINC client.

The project cannot handle these requests from the server by default, and it would not.
ID: 60061
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60062 - Posted: 10 Mar 2023, 17:08:16 UTC - in response to Message 60061.  

Projects like LHC and WCG (and others) have custom or modified BOINC server software. They give additional functionality not in the base code for normal BOINC projects; usually it's in the project preferences somewhere.

But it's really disingenuous to try to compare projects with custom software and say something like "they can do it, why can't you!?". Every project has its own priorities and idiosyncrasies (and budgets). Each user should just find what works for them to work around project-specific oddities.
ID: 60062
JStateson
Joined: 31 Oct 08
Posts: 186
Credit: 3,578,903,157
RAC: 0
Message 60063 - Posted: 11 Mar 2023, 20:49:13 UTC

I just downloaded my first Python app. It has been running for half an hour now on a 2080 Ti.

TechPowerUp shows a GPU load average of 14% and a power average of 74 watts.
That seems really low compared to Einstein's 88% and 190 watts.

Is this normal?

CUDA: NVIDIA GPU 0: NVIDIA GeForce RTX 2080 Ti (driver version 528.24, CUDA version 12.0, compute capability 7.5, 11264MB, 11264MB available, 13448 GFLOPS peak)


Try my performance program, the BoincTasks History Reader.
Find and read about it here: https://forum.efmer.com/index.php?topic=1355.0
ID: 60063
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60064 - Posted: 11 Mar 2023, 21:24:10 UTC - in response to Message 60063.  

Yes, it's normal. These tasks are mostly a CPU/memory app and only use the GPU intermittently for a small part of the overall computation.

Running two concurrently can help overall production.
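
For example, with an app_config.xml in the GPUGRID project directory along the lines of the one quoted earlier in the thread, just with gpu_usage set for two at a time (cpu_usage is only a scheduling hint; these tasks really do spawn many CPU threads, so leave headroom):

<app_config>
<app>
<name>PythonGPU</name>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
</app_config>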
ID: 60064
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60077 - Posted: 14 Mar 2023, 14:58:07 UTC - in response to Message 60061.  

IF THEN ELSE statement
Was only used in a futile attempt to explain what I'm asking GDF to do.
ID: 60077
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60078 - Posted: 14 Mar 2023, 14:59:03 UTC - in response to Message 60062.  

Projects like LHC and WCG (and others) have custom or modified BOINC server software. They give additional functionality not in the base code for normal BOINC projects; usually it's in the project preferences somewhere.

But it's really disingenuous to try to compare projects with custom software and say something like "they can do it, why can't you!?". Every project has its own priorities and idiosyncrasies (and budgets). Each user should just find what works for them to work around project-specific oddities.
I wasn't even talking to you; I was talking to GDF.
ID: 60078
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60079 - Posted: 14 Mar 2023, 15:34:31 UTC - in response to Message 60078.  
Last modified: 14 Mar 2023, 15:35:45 UTC

Projects like LHC and WCG (and others) have custom or modified BOINC server software. They give additional functionality not in the base code for normal BOINC projects; usually it's in the project preferences somewhere.

But it's really disingenuous to try to compare projects with custom software and say something like "they can do it, why can't you!?". Every project has its own priorities and idiosyncrasies (and budgets). Each user should just find what works for them to work around project-specific oddities.
I wasn't even talking to you; I was talking to GDF.

OK, and? My post wasn't even in reply to yours, lol.

But the point still stands.
ID: 60079