Message boards : Number crunching : Python apps for GPU hosts 4.03 (cuda1131) using a LOT of CPU

---

Joined: 21 Jan 10 · Posts: 46 · Credit: 1,388,234,528 · RAC: 0

My recent WUs are using 60-100% CPU on a Ryzen 9 3900X (12-core). Running BOINC 7.20.2 on Windows 11 (64GB DDR4), with the task using 7GB. The GPU (GTX 980 Ti) is using 4GB of its 6GB of local VRAM, but the CPU_0 and Copy graphs are spiky, running low (20%) then spiking to 80% every 7 seconds.

---

Joined: 27 Jul 11 · Posts: 138 · Credit: 539,953,398 · RAC: 0

Same on my Intel six-core. The WU is consuming five cores.

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

This is normal for the Python tasks. They use a lot of CPU resources.

---

Joined: 21 Jan 10 · Posts: 46 · Credit: 1,388,234,528 · RAC: 0

Well, then the status text, "Resources: 0.983 CPUs + 1 NVIDIA GPU", should be updated to set expectations. Traditionally this SEEMED to mean a fraction of one CPU core, not 98% of the whole CPU.

---

Joined: 4 Oct 09 · Posts: 6 · Credit: 1,109,686,172 · RAC: 0

Started GPUGrid again after a long pause and got the 4.03 Python apps for GPU hosts (cuda1131) on my RTX 3080. But these WUs use 60% of the CPU cores and put only 7% load on the GPU. This is not a GPU WU; it's a CPU WU which is wasting power on a GPU.

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

> Started GPUGrid again after a long pause and got the 4.03 Python apps for GPU hosts (cuda1131) on my RTX 3080.

It uses both, but you need a powerful CPU to properly feed a GPU. My system has 2x RTX 3060 and each GPU sees 25-100% load.

---

Joined: 4 Oct 09 · Posts: 6 · Credit: 1,109,686,172 · RAC: 0

OK, so you mean a 3900X is too slow to feed an RTX 3080. Perhaps your RTX 3060 is just slow enough to show some GPU load.

---

Joined: 4 Mar 18 · Posts: 53 · Credit: 2,815,476,011 · RAC: 0

As another reference, I have an i7-7820X (8 cores, 16 threads) overclocked to 4.4 GHz, with an EVGA 3080 (memory clock set to +500, no other overclock) running 2x of these WUs. The GPU is loaded between 50 and 85%, with rare dips down to 35%. (Linux Mint OS.)

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

> OK, so you mean a 3900X is too slow to feed an RTX 3080.

Well, my 3060 is vastly outproducing your 3080 (and everyone else's), so maybe I'm on to something? Switch to Linux, and use a CPU that's strong enough. You'll get the best production running multiples so that you can increase GPU utilization, but you'll need enough CPU, system memory, and GPU memory to handle more tasks. My 24-core CPU feeds 3x tasks to each of my 3060's; effective task times are 4-4.3 hrs each. You'll be able to run 2x tasks on your 3080, but probably won't be able to run 3: with its higher core count it will want to reserve more GPU memory per task than my 3060, and even the 12GB model wouldn't be enough for 3 tasks. IMO the power of a 3080 goes to waste here; "slower" lower-end cards are just as productive because you're ultimately CPU-bound.

---

Joined: 3 Jul 16 · Posts: 31 · Credit: 2,248,809,169 · RAC: 0

Good morning folks.

> My 24-core CPU feeds 3x tasks to each of my 3060's; effective task times are 4-4.3 hrs each.

You're talking about six Pythons running on your system, right? Do you use four threads per workunit, or is it actually eight threads? When I run a Python with eight threads on my Ryzen 7 2700 with a GTX 1080, the CPU isn't at its full potential. I just switched to only four threads to see what happens. I also run other GPU work alongside GPUGrid to get more stable GPU temperatures, as the GPU isn't fully used either.

- - - - - - - - - -
Greetings, Jens

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

> You're talking about six Pythons running on your system, right? Do you use four threads per workunit, or is it actually eight threads?

Yes, 6 total Pythons running on the system, 3x on each GPU. I'm not running any other projects on this system, only GPUGRID Python, so I haven't changed the "CPUs used" value from the default. Since no other work is running, changing that value would have no effect. This is a common misconception about this setting: you cannot control how much CPU an application uses through it. The application will use what it needs regardless. All this setting does is tell BOINC how many threads to set aside, or reserve, for the task. Your system isn't using less CPU for the Python tasks by lowering this value; you're just allowing more work from other projects to run. Each Python task will spawn 32 multiprocessing "spawn" processes, plus n more processes from the main run.py program, where n is the number of cores. If you have enough cores, each process will run on a separate thread; if not, they will get timesliced by the OS scheduler and tetris'd in with the other processes. No setting in BOINC can change this.
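To make that concrete, here is a minimal sketch of why no BOINC setting can cap this. The fixed count of 32 and the use of cpu_count() are illustrative assumptions modeled on the description above, not run.py's actual code; the point is only that the process count is decided inside the application, which never consults the client's reservation value.

```python
# Hypothetical sketch of an app that decides its own parallelism,
# the way the Python GPU tasks are described above.
import multiprocessing as mp
import os

def worker(task_id: int) -> int:
    # Stand-in for real work; each process consumes CPU on its own,
    # regardless of how many threads BOINC has "reserved".
    return sum(i * i for i in range(100_000)) + task_id

if __name__ == "__main__":
    mp.set_start_method("spawn")        # the same start method the post mentions
    n_fixed = 32                        # fixed worker count (assumption for illustration)
    n_scaled = os.cpu_count() or 1      # plus processes scaling with core count
    with mp.Pool(processes=n_fixed + n_scaled) as pool:
        results = pool.map(worker, range(n_fixed + n_scaled))
    # BOINC's cpu_usage value was never read anywhere in this program,
    # so lowering it cannot reduce the number of processes spawned.
    print(f"spawned {n_fixed + n_scaled} workers on {os.cpu_count()} hardware threads")
```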

---

Joined: 12 Jul 17 · Posts: 404 · Credit: 17,408,899,587 · RAC: 0

> my 3060 is vastly outproducing your 3080 (and everyone else's)

I'm up for the challenge :-) What figure of merit are you using to decide the winner?

---

Joined: 12 Jul 17 · Posts: 404 · Credit: 17,408,899,587 · RAC: 0

> my 24-core CPU feeds 3x tasks to each of my 3060's.

How do you get GG to send you 3 WUs per GPU??? I have no control over how many WUs GG sends me; it's 2 per GPU even when I only want one. What's the trick?

---

Joined: 13 Dec 17 · Posts: 1419 · Credit: 9,119,446,190 · RAC: 891

Use an app_config.xml file:

```xml
<app_config>
  <app>
    <name>acemd3</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemd4</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>PythonGPU</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>3.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```
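A usage note, based on stock BOINC behavior rather than anything GPUGRID-specific: app_config.xml belongs in the project's folder under the BOINC data directory (for GPUGRID, projects/www.gpugrid.net/) and takes effect after Options → Read config files in the Manager, or a client restart. The 0.33 gpu_usage is what lets the client schedule three PythonGPU tasks per GPU (3 × 0.33 = 0.99 ≤ 1), and cpu_usage 3.0 tells it to budget three CPU threads per task; as explained above, neither value limits what the app actually consumes.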

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

> I'm up for the challenge :-) What figure of merit are you using to decide the winner?

I guess total RAC per GPU? Or shortest effective task completion time? My longest task times are ~13 hrs, but I'm running 3x, so each GPU completes a task about every 4.3 hrs, excluding early completions. I haven't seen anyone able to beat that. Bare minimum, if I were to only get full-run tasks, each of my GPUs can do about 580,000/day, and based on current trends maybe up to ~800,000/GPU including the early-completion bonus tasks.

These are not normal tasks. There are a lot of factors that will determine overall production. Not all run for the same length of time, but all are awarded the same, so daily reward can vary a lot depending on how many short-running tasks you're lucky enough to get. So I've really only been paying attention to how long the longest full-run tasks take, excluding early completions. If you look for the longest-running tasks in your list, it will give you an idea of how long the longest ones are.

The other factor is CPU power. These tasks really are primarily CPU tasks with a small GPU component; that's why a lower-end GPU pairs nicely with a beefy CPU. CPU power is the ultimate bottleneck. Even when the CPU isn't maxed out, you can start to hit a limit in how many processes the CPU can handle when it doesn't have enough threads to service them all. If you were to run 3x on your 18-core Intel system on a single GPU, the CPU would have to juggle about 150 running processes, timeslicing them onto your 36 available threads, not even accounting for other tasks or processes running.

Another factor is what, if any, other projects you are running and how much of the CPU is available. These tasks will spin up more processes than you have threads available, so if you run any other work, it will fight for priority with other projects and likely hurt your production. That's why my system is dedicated to these Python tasks and nothing else. You're free to try, but I don't see your systems being able to overtake mine with the less powerful Intel CPUs you have. Look at the last few days' production numbers: my single system with 2x 3060's has outproduced all of yours combined. If you stop running other projects you can probably overtake me with all of them together, but not with any single one, unless you put together a different system.

> How do you get GG to send you 3 WUs per GPU???

Probably my custom BOINC client, which gives me better control over how many tasks a project sends. You can't run more than 2x on any of your GPUs anyway; not enough VRAM. Yes, your 3080 Ti has the same 12GB as my 3060, but a 3080 Ti requests more VRAM than a 3060 does because of its larger number of cores, so 3x on a 3080 Ti would exceed the maximum VRAM where it doesn't on a 3060. You could "trick" GG into sending you more tasks with a stock BOINC client by making it look like you have more than one GPU, but that is another can of worms that will require additional configuration to prevent tasks from running on the "fake" GPU. It's possible, but it won't be an elegant solution and would likely have impacts on other projects.
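For reference, the "additional configuration" half of that trick, keeping work off the phantom device, would use the stock `<exclude_gpu>` option in cc_config.xml. This is a hedged sketch of only that half: how one makes the extra GPU appear at all is client-specific and left aside here, the device number is an assumption, and whether an exclusion also suppresses the extra work fetch depends on client version, which is part of what makes the whole approach inelegant.

```xml
<!-- cc_config.xml (in the BOINC data directory): keep GPUGRID work off
     device 1, assuming the phantom GPU enumerates as device 1. -->
<cc_config>
  <options>
    <exclude_gpu>
      <url>http://www.gpugrid.net/</url>
      <device_num>1</device_num>
      <type>NVIDIA</type>
    </exclude_gpu>
  </options>
</cc_config>
```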

---

Joined: 3 Jul 16 · Posts: 31 · Credit: 2,248,809,169 · RAC: 0

> Yes, 6 total Pythons running on the system, 3x on each GPU.

Thanks for your answer. Even if your explanation is technically much more precise than the idea of using some number of threads per workunit, setting aside threads to keep CPU capacity reserved for Python work is what I'm doing to be able to do other things on my system, which utterly lacks the kind of GPUs yours is featuring. Running two Pythons beside two Milkyway workunits was pushing the VRAM limits of the GPU, and at one point a lot of the Milkyway tasks errored out, probably because of too little GPU memory left beside the Pythons. Since then I've been running one Python and one other GPU task side by side without further problems. Maybe I'll have to adjust the number of CPU threads again until I get some sort of 'best' performance out of my system. And maybe I'll just use another system for GPUGrid Python. I ran some tasks on an i7-6700K combined with a 1060 3GB some time ago. I don't know if that will still fit (possibly not), so I'm prepared to swap a 1050 Ti into the system. Thanks again for giving me additional ideas for this kind of BOINC work!

- - - - - - - - - -
Greetings, Jens

---

Joined: 12 Jul 17 · Posts: 404 · Credit: 17,408,899,587 · RAC: 0

Bug: PythonGPU does not know how to share two GPUs. If the CPU is 18c/36t, it shares all 36 threads between two WUs but will not start a third because of a lack of CPU threads. Fine. But the problem is that it assigns both WUs to GPU d0 and ignores the second GPU, d1. Maybe this is not actually a problem if it's impossible for both WUs to need the GPU at the same time, but it seems that at some points over the long course of completing a PythonGPU WU they will want a GPU at the same time. It should use the next available GPU, or at least assign one WU to d0 and the second WU to d1.

This is another example of how denying us any control over how many WUs get downloaded is inefficient. With two GPUs, GG insists on downloading four WUs when only two can run. Those WUs could be running on another computer rather than sitting idle for half a day. This may be harder to remedy than the other case, where only a single PythonGPU WU is allowed to run on a computer. It would really be donor-friendly to give us the ability to specify the number of WUs we want downloaded for a given Preferences group, as many other BOINC projects do. It just needs to be turned on.

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

You CAN do that, you just don't know how. It has nothing to do with GPUGRID; it's in how you configure the BOINC settings. You want to run two tasks on the same GPU but from different projects, and never have two GPUGRID tasks on the same GPU? Easy. Set the app_config for GPUGRID to 0.6 gpu_usage, and set the app_config for the other project to 0.4 gpu_usage. That way it will mix the projects, one from each on one GPU, since 0.6 + 0.4 = 1. But 0.6 + 0.6 > 1, so it won't start two from GPUGRID on the same GPU; the second will go to the next GPU with open resources. See the sketch below.
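A sketch of what that looks like on disk, one app_config.xml per project. The second project's folder and app name are placeholders for whatever appears in its client_state.xml, and the cpu_usage values are illustrative assumptions:

```xml
<!-- projects/www.gpugrid.net/app_config.xml -->
<app_config>
  <app>
    <name>PythonGPU</name>
    <gpu_versions>
      <gpu_usage>0.6</gpu_usage>   <!-- two GPUGRID tasks won't fit: 0.6 + 0.6 > 1 -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

<!-- projects/other.project.example/app_config.xml (placeholder names) -->
<app_config>
  <app>
    <name>other_gpu_app</name>
    <gpu_versions>
      <gpu_usage>0.4</gpu_usage>   <!-- pairs with one GPUGRID task: 0.6 + 0.4 = 1 -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```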

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423

> It would really be donor-friendly to give us the ability to specify the number of WUs we want downloaded for a given Preferences group, as many other BOINC projects do. It just needs to be turned on.

I've not seen this functionality on any other BOINC project. They all go by your local cache (number of days) setting in BOINC itself. What project lets you explicitly specify the number of tasks to download?

---

Joined: 12 Jul 17 · Posts: 404 · Credit: 17,408,899,587 · RAC: 0

> You CAN do that, you just don't know how. It has nothing to do with GPUGRID; it's in how you configure the BOINC settings.

You did not read what I actually wrote before making your snide remark. That will not fix anything.