Python apps for GPU hosts 4.03 (cuda1131) using a LOT of CPU

Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 59346 - Posted: 27 Sep 2022, 13:33:46 UTC - in response to Message 59304.  
Last modified: 27 Sep 2022, 14:13:29 UTC

Keith Myers wrote:

Use an app_config.xml file.

<app_config>
...
<app>
<name>PythonGPU</name>
<gpu_versions>
<gpu_usage>0.33</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
</app_config>


I put the above into app_config.xml with the intention of running 3 Pythons on one GPU.
However, after downloading 2 tasks (which started right away) and trying to fetch a third, BOINC told me "the computer has reached a limit on tasks in progress", and no third task could be downloaded :-(

At this point I remembered that it has always been said that only 2 tasks per GPU can be downloaded from GPUGRID.

So, how did you manage to download 3 tasks per GPU?
ID: 59346
Keith Myers
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Message 59349 - Posted: 27 Sep 2022, 16:59:49 UTC - in response to Message 59346.  
Last modified: 27 Sep 2022, 17:01:14 UTC

Keith Myers wrote:

Use an app_config.xml file.

<app_config>
...
<app>
<name>PythonGPU</name>
<gpu_versions>
<gpu_usage>0.33</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
</app_config>


I put the above into app_config.xml with the intention of running 3 Pythons on one GPU.
However, after downloading 2 tasks (which started right away) and trying to fetch a third, BOINC told me "the computer has reached a limit on tasks in progress", and no third task could be downloaded :-(

At this point I remembered that it has always been said that only 2 tasks per GPU can be downloaded from GPUGRID.

So, how did you manage to download 3 tasks per GPU?


By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.
ID: 59349
Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 59350 - Posted: 27 Sep 2022, 18:08:12 UTC - in response to Message 59349.  

...
So, how did you manage to download 3 tasks per GPU?

By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.

When you say editing coproc_info.xml, are you talking about the entry

<warning>NVIDIA library reports 1 GPU</warning>

near the bottom? After changing this to "2", how would I lock it down?
ID: 59350
Bill F
Joined: 21 Nov 16
Posts: 36
Credit: 164,429,114
RAC: 18
Message 60054 - Posted: 10 Mar 2023, 4:27:28 UTC

Here is a URL to a BOINC message board thread on Hardware Accelerated GPU scheduling. Is this something that might give Windows-based systems a minor improvement?

https://boinc.berkeley.edu/dev/forum_thread.php?id=14235#104003

Thanks
Bill F
ID: 60054
Keith Myers
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Message 60055 - Posted: 10 Mar 2023, 8:14:57 UTC - in response to Message 59350.  

...
So, how did you manage to download 3 tasks per GPU?

By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.

When you say editing coproc_info.xml, are you talking about the entry

<warning>NVIDIA library reports 1 GPU</warning>

near the bottom? After changing this to "2", how would I lock it down?

No, not entirely. You would just duplicate the card detection section and then increment the card count in that line.

But you also have to prevent BOINC from changing the file afterwards, which would reset the count to the true detection.

You make the edit in the file and then mark the file immutable. In Linux, you would execute:

sudo chattr +i coproc_info.xml
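
For anyone following along, a rough sketch of the whole procedure, assuming a systemd-managed boinc-client service and a data directory of /var/lib/boinc-client (both are assumptions; paths and service names vary by distribution):

# stop the client so it cannot rewrite the file while you edit it
sudo systemctl stop boinc-client

# duplicate the GPU detection section and bump the reported count
sudo nano /var/lib/boinc-client/coproc_info.xml

# mark the file immutable so the client cannot overwrite the spoofed count
sudo chattr +i /var/lib/boinc-client/coproc_info.xml

sudo systemctl start boinc-client

# to revert later, clear the flag and let the client re-detect the GPUs:
# sudo chattr -i /var/lib/boinc-client/coproc_info.xml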
ID: 60055
Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 60056 - Posted: 10 Mar 2023, 15:18:53 UTC - in response to Message 60055.  

...
So, how did you manage to download 3 tasks per GPU?

By spoofing the card count using a custom client, or by editing coproc_info.xml and locking it down, or by having more than one card in the host.

When you say editing coproc_info.xml, are you talking about the entry

<warning>NVIDIA library reports 1 GPU</warning>

near the bottom? After changing this to "2", how would I lock it down?

No, not entirely. You would just duplicate the card detection section and then increment the card count in that line.

But you also have to prevent BOINC from changing the file afterwards, which would reset the count to the true detection.

You make the edit in the file and then mark the file immutable. In Linux, you would execute:

sudo chattr +i coproc_info.xml

Hi Keith, thanks for your reply.
In the meantime I had found out from a YouTube video how to spoof a GPU; in fact, that was months ago, and it has been working well ever since :-)
ID: 60056
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60057 - Posted: 10 Mar 2023, 15:56:15 UTC - in response to Message 59328.  
Last modified: 10 Mar 2023, 16:22:58 UTC

That way it will mix the projects, one from each on one GPU. 0.6+0.4=1. But 0.6+0.6>1 so it won’t start two from GPUGRID on the same GPU. It will go to the next GPU with open resources.

That will not fix anything.

I'm not certain what exact problem you are referring to, so I can only give as many pointers as I can think of. This might or might not help:
You might just try setting the GPUGrid tasks to <gpu_usage>0.6</gpu_usage> and not run a second project.
I actually don't know what happens then, but I'm interested in it.
So, if you try, please tell us. :-)
Else experiment with the tags <fetch_minimal_work>, <ngpus>, <max_concurrent>, try setting <gpu_usage> to 1.1 or 2.0, or whatever else comes to mind regarding client configuration.
You might also set up two BOINC instances and give one GPU to each of them.

*edit*
With BOINC client 7.14.x you could also set the resource share and the BOINC caches to 0. That way you should not get any work as long as a device is occupied.
*end edit*
Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.
I see complaints that I'm not being clear. Note that I only install one GPU per computer. Let me try it this way for a single computer:
IF ACEMD Tasks ready to send >= 1 THEN DL+RUN 2 ELSE
IF ATM Tasks ready to send >= 1 THEN DL+RUN 4 ELSE
IF Python Tasks ready to send = 1 THEN DL+RUN 1
So imagine a miracle occurs and there's a plethora of all manner of GG WUs. Then each of my computers would be running 2 ACEMD WUs. If ACEMD runs low then each computer would run from 1 to 4 ATM WUs. And if there's a dearth of both ACEMD and ATM WUs each computer might run some combination like: 1 ACEMD + 1 ATM, or 1 ACEMD + 1 Python, or 1 ATM + 1 Python, or 4 ATM.
If GG used the same minimal Project Preferences that LHC does then I could make a very productive compromise to get the most of each generation of my computers.
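
For reference, the closest a client-side workaround gets today is per-app <max_concurrent> caps in app_config.xml; that can limit what runs, but it cannot make the scheduler send a particular mix. A rough sketch along those lines (the acemd4 and ATM <name> values are assumptions; check the <app> entries in client_state.xml for the exact names, and the server-side 2-tasks-per-GPU limit still applies):

<app_config>
<app>
<name>acemd4</name> <!-- assumed app name; verify in client_state.xml -->
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
<app>
<name>PythonGPU</name>
<max_concurrent>1</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
<app>
<name>ATM</name> <!-- assumed app name; verify in client_state.xml -->
<max_concurrent>4</max_concurrent>
<gpu_versions>
<gpu_usage>0.25</gpu_usage>
<cpu_usage>1.0</cpu_usage>
</gpu_versions>
</app>
</app_config>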
ID: 60057
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60058 - Posted: 10 Mar 2023, 16:06:21 UTC - in response to Message 60057.  
Last modified: 10 Mar 2023, 16:06:53 UTC

Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.


The only way to do this is to run multiple BOINC clients and manage the caches separately. It will require a good bit of manual intervention on your part.
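
A rough sketch of what a second client instance looks like on Linux (the directory and RPC port below are arbitrary examples):

# create a separate data directory for the second client instance
mkdir -p ~/boinc2

# start a second client bound to its own data directory and GUI RPC port
boinc --allow_multiple_clients --dir ~/boinc2 --gui_rpc_port 31418 --daemon

# talk to that instance with boinccmd, or add localhost:31418 as a host in BOINC Manager
boinccmd --host localhost:31418 --passwd "$(cat ~/boinc2/gui_rpc_auth.cfg)" --get_state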
ID: 60058
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60059 - Posted: 10 Mar 2023, 16:18:45 UTC - in response to Message 60058.  

Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.


The only way to do this is to run multiple BOINC clients and manage the caches separately. It will require a good bit of manual intervention on your part.
No, it's not the only way it can be done. I can do that on LHC today and have been able to do it for a long time.
ID: 60059
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60060 - Posted: 10 Mar 2023, 16:37:18 UTC - in response to Message 60059.  
Last modified: 10 Mar 2023, 16:38:24 UTC

Thanks. My problem with GG is that I want 2 ACEMD, 1 Python, and 4 or more ATM. I know this is just a pipe dream on my part since GG hasn't lifted a finger to improve or repair their UI in years.


The only way to do this is to run multiple BOINC clients and manage the caches separately. It will require a good bit of manual intervention on your part.
No, it's not the only way it can be done. I can do that on LHC today and have been able to do it for a long time.

Custom/modified BOINC server software.

But I was saying it's the only way to do that on GPUGRID, which is true.
ID: 60060
Greger
Joined: 6 Jan 15
Posts: 76
Credit: 25,499,534,331
RAC: 0
Message 60061 - Posted: 10 Mar 2023, 16:48:58 UTC - in response to Message 60059.  

If this works on LHC, you would need to point out HOW those preferences are set on that project.

I have run LHC for many years, and it uses the same logic for user preference settings as "GG" does.
What it adds is a separate application layer on top of the distribution of work units, so you can choose vbox or native separately for each sub-project.

An IF THEN ELSE statement would require additional coding, and it would not work for distributing units without a reject/abort script.

To accomplish such a thing, with a specific number of units for each app on one host or combined, would require app_config but also 3 instances of the BOINC client.

The project cannot handle these requests from the server by default, and it would not.
ID: 60061
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60062 - Posted: 10 Mar 2023, 17:08:16 UTC - in response to Message 60061.  

Projects like LHC and WCG (and others) have custom or modified BOINC server software. They give additional functionality not in the base code for normal BOINC projects; usually it's in the project preferences somewhere.

But it's really disingenuous to try to compare projects with custom software and say something like "they can do it, why can't you!?". Every project has its own priorities and idiosyncrasies (and budgets). Each user should just find what works for them to work around project-specific oddities.
ID: 60062
JStateson
Joined: 31 Oct 08
Posts: 186
Credit: 3,578,903,157
RAC: 0
Message 60063 - Posted: 11 Mar 2023, 20:49:13 UTC

I just downloaded my first Python app. It has been running for half an hour now on a 2080 Ti.

TechPowerUp shows a GPU load average of 14% and a power average of 74 watts.
That seems really low compared to Einstein's 88% and 190 watts.

Is this normal?

CUDA: NVIDIA GPU 0: NVIDIA GeForce RTX 2080 Ti (driver version 528.24, CUDA version 12.0, compute capability 7.5, 11264MB, 11264MB available, 13448 GFLOPS peak)


Try my performance program, the BoincTasks History Reader.
Find and read about it here: https://forum.efmer.com/index.php?topic=1355.0
ID: 60063
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60064 - Posted: 11 Mar 2023, 21:24:10 UTC - in response to Message 60063.  

Yes, it's normal. These tasks are mostly a CPU/memory app and only use the GPU intermittently for a small part of the overall computation.

Running two concurrently can help overall production.
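
For example, with an app_config.xml in the GPUGRID project directory along the lines of the one quoted earlier in the thread, just with gpu_usage set for two at a time (cpu_usage is only a scheduling hint; these tasks really do spawn many CPU threads, so leave headroom):

<app_config>
<app>
<name>PythonGPU</name>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>3.0</cpu_usage>
</gpu_versions>
</app>
</app_config>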
ID: 60064
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60077 - Posted: 14 Mar 2023, 14:58:07 UTC - in response to Message 60061.  

IF THEN ELSE statement
Was only used in a futile attempt to explain what I'm asking GDF to do.
ID: 60077
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 60078 - Posted: 14 Mar 2023, 14:59:03 UTC - in response to Message 60062.  

Projects like LHC and WCG (and others) have custom or modified BOINC server software. They give additional functionality not in the base code for normal BOINC projects; usually it's in the project preferences somewhere.

But it's really disingenuous to try to compare projects with custom software and say something like "they can do it, why can't you!?". Every project has its own priorities and idiosyncrasies (and budgets). Each user should just find what works for them to work around project-specific oddities.
I wasn't even talking to you; I was talking to GDF.
ID: 60078
Ian&Steve C.
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Message 60079 - Posted: 14 Mar 2023, 15:34:31 UTC - in response to Message 60078.  
Last modified: 14 Mar 2023, 15:35:45 UTC

Projects like LHC and WCG (and others) have custom or modified BOINC server software. They give additional functionality not in the base code for normal BOINC projects; usually it's in the project preferences somewhere.

But it's really disingenuous to try to compare projects with custom software and say something like "they can do it, why can't you!?". Every project has its own priorities and idiosyncrasies (and budgets). Each user should just find what works for them to work around project-specific oddities.
I wasn't even talking to you; I was talking to GDF.

OK, and? My post wasn't even in reply to yours, lol.

But the point still stands.
ID: 60079