Message boards :
News :
Experimental Python tasks (beta) - task description
| Author | Message |
|---|---|
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 0
|
These tasks report CPU time as elapsed time. Actually, that's not quite right: the report (made in sched_request_www.gpugrid.net.xml) is accurate - it's only after it lands on the server that it gets filed in the wrong pocket. I've got a couple of tasks finishing in the next hour or 90 minutes - I'll try to catch the report for one of them. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,876,970,595 RAC: 9,834
|
It's correct. You just misinterpreted my perspective: I was talking about what the website reports to us, not what we report to the server.
|
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 0
|
Anyway, I caught one just to clarify my perspective. That matches what it says in the job log: ct 151352.900000 et 54305.405065 But not what it says on the website: task 33116901 I'm going on about it because, if it were a problem in the client, we could patch the code and fix it. But because it happens on the server, it's not even worth trying. Precision in language matters. |
|
Joined: 16 Oct 22 Posts: 12 Credit: 1,382,500 RAC: 0
|
I have a question: currently I'm running a Python task with 1 core and one GPU. Would the crunching time decrease if I allocated more cores to this task - 2 cores equals 50% of the time, 4 cores equals 25%? I know how to tweak app_config.xml, but I want to ask before I waste time tinkering. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,876,970,595 RAC: 9,834
|
I assume you're talking about the app_config settings when you say "allocate". As a reminder, these settings do not change how much CPU is used by the app: the app uses whatever it needs no matter what settings you choose (up to physical constraints). The only way you can constrain CPU use is to do something like run a virtual machine with fewer cores allocated to it than the host has. Otherwise the app still has full access to all your cores, and if you monitor CPU use by the various processes you'll observe this. If you're not running any other tasks (other CPU projects) at the same time, then changing the CPU allocation likely won't have any impact on your completion times, since the app is already using all of your cores.
|
|
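For reference, an app_config.xml adjustment of the kind discussed above might look like the sketch below. The app name is an assumption (check the `<app_name>` entries in client_state.xml for the exact value on your host), and as noted in the post, these numbers only tell the BOINC scheduler how much to budget - they do not limit what the app actually uses.

```xml
<!-- Sketch only: app name "PythonGPU" is an assumption; verify it
     against client_state.xml before using this on a real host. -->
<app_config>
  <app>
    <name>PythonGPU</name>
    <gpu_versions>
      <!-- Scheduler budget, not a hard limit on actual CPU use -->
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>4.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```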
Joined: 16 Oct 22 Posts: 12 Credit: 1,382,500 RAC: 0
|
Thanks for the fast reply. I'm running MCM from WCG on my machine in parallel. I will do a short test and suspend all other tasks. The question is: will the Python task use more cores if the other cores become available? My system: Ryzen 9 5950X, NVIDIA RTX 3060 Ti, 64 GB RAM, Windows 10 |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,876,970,595 RAC: 9,834
|
Don't think of it in that sense. These tasks will spawn 32+ processes no matter how many cores you have or how much you allocate in BOINC, and those processes need to be serviced by the CPU. If you have many processes and not enough threads to service them all, they have to wait in the scheduler's queue alongside all other processes. Increasing the BOINC CPU allocation for the Python tasks will stop processing by other competing BOINC CPU tasks, leaving more free resources available to the Python processes. So they will get the opportunity to use more CPU in a shorter amount of time, though the total CPU time probably won't differ much - meaning the tasks should run faster, since they aren't competing with the other CPU work.
|
|
Joined: 1 Jan 15 Posts: 1168 Credit: 12,317,898,501 RAC: 91,654
|
...the only way you can constrain CPU use is to do something like run a virtual machine with fewer cores allocated to it than the host has. Otherwise the app still has full access to all your cores, and if you monitor CPU use by the various processes you'll observe this. However, you guys recently stated that the best way is not to run any other projects while processing Python tasks. I can confirm. A week ago I ran one LHC-ATLAS task, 2-core (virtual machine), together with 2 Pythons (1 per GPU), and after a while the system crashed. Since then, only Pythons are being processed - no crashes so far. |
|
Joined: 16 Oct 22 Posts: 12 Credit: 1,382,500 RAC: 0
|
Well, CPU load was 100% before, with 30 MCM tasks running in parallel. Now only the Python task is running and the CPU load is between 40 and 75%. GPU load has not changed and is between 18 and 22%, like before. Looks like it is progressing faster than before ;-) |
|
Joined: 16 Oct 22 Posts: 12 Credit: 1,382,500 RAC: 0
|
Found a nice balance between MCM and Python tasks. Now I run 7 MCM tasks and 1 Python task, and the CPU load is about 99%. |
|
Joined: 1 Jan 15 Posts: 1168 Credit: 12,317,898,501 RAC: 91,654
|
There was a task which ran for about 20 hours and yielded a credit of 45,000: https://www.gpugrid.net/result.php?resultid=33117861 How come? |
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
Currently, credits are not defined by execution time but by the maximum possible compute effort. In particular, these AI experiments consist of training AI agents, and a maximum number of learning steps is defined as a target. That means the agent interacts with its simulated environment and then learns from these interactions for a certain amount of time. However, if some condition is met earlier, the task ends. There is a certain amount of randomness in the learning process, but the amount of credit is defined by the upper bound of training steps, independently of whether the task finished earlier or not - that is, the number of learning steps the agent would do if the early-stopping condition were never met. In general, the condition is met more often by earlier RL agents in the population than by later ones, and it can also vary from experiment to experiment. Locally, tasks last 10-14 hours on average. |
|
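The credit rule described above can be sketched as follows. The function, the reward model, and the stopping threshold are all made up for illustration; the point is only that credit is tied to the step budget, not to the steps actually run.

```python
import random

def run_training(max_steps, credit_per_step=0.5, stop_threshold=0.95):
    """Illustrative sketch: credit is fixed by the step budget (max_steps),
    even when an early-stopping condition ends training sooner.
    The performance model and threshold here are hypothetical."""
    steps_done = 0
    for _ in range(max_steps):
        steps_done += 1
        performance = random.random()  # stand-in for the agent's eval score
        if performance > stop_threshold:
            break  # early-stopping condition met: task ends early
    credit = max_steps * credit_per_step  # budget-based, not time-based
    return steps_done, credit

steps, credit = run_training(max_steps=1000)
# credit is always 500.0 here, however few steps actually ran
```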
Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0
|
don't think of it in that sense. I have a question also; maybe Richard might understand better. I also run CPDN tasks, which are very few and far between, so I gave zero resources to Moo Wrapper and ran it in parallel: when there was no CPDN task, Moo would send me WUs. With GPUGrid tasks this is not the case - these tasks do not register in BOINC as a task for some reason. If I am crunching a GPUGrid task I should not get a Moo task; that is the correct procedure. But when I shifted from CPDN to here, I was running one GPUGrid task (on all cores) as well as twelve Moo tasks - that is thirteen tasks. I am not worried about whether it can be done, but why is this happening? |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 0
|
Without having full details of how your copy of BOINC is configured, and how the tasks from each project are configured to run (in particular, the resource assignment for each task type) it's impossible to say. This may help: That machine has six CPU cores, but it's only running five tasks. That's because BOINC has committed 3+1+0.5+0.5+1 = 6 cores, and there are none left. If one of the GPU applications had been configured to require 2.99 CPUs, or 0.49 CPUs, the total core allocation would have fallen "below six", and BOINC's rules say that another task can be started. |
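The core-commitment rule described above can be sketched in a few lines; the function name is made up, purely to show the arithmetic from the six-core example.

```python
def can_start_another(committed_cpus, ncpus):
    """Sketch of the scheduling rule described above: BOINC sums the
    CPU budget of all running tasks and starts another task only while
    the committed total stays below the number of cores."""
    return sum(committed_cpus) < ncpus

# The six-core example from the post: 3 + 1 + 0.5 + 0.5 + 1 = 6 cores
print(can_start_another([3, 1, 0.5, 0.5, 1], 6))   # False: fully committed
# If one GPU app had asked for 0.49 CPUs instead of 0.5:
print(can_start_another([3, 1, 0.49, 0.5, 1], 6))  # True: total is 5.99
```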
[AF] fansyl Joined: 26 Sep 13 Posts: 20 Credit: 1,714,356,441 RAC: 0
|
Example: https://www.gpugrid.net/result.php?resultid=33109419 I needed to push the swap file size up to 32 GB, but now it's OK. Even if the GPU activity rate is low and the Python task does not respect the number of threads allocated to it... no problem, go ahead science! |
|
Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0
|
Without having full details of how your copy of BOINC is configured, and how the tasks from each project are configured to run (in particular, the resource assignment for each task type) it's impossible to say. BOINC version 7.20.2, stock, out of the box. If there is a thread where I can learn mischief, let me know. It is stock BOINC, and I have allocated 100% of resources to GPUGrid plus 0% resources to Moo Wrapper; when there are no tasks from GPUGrid, I can get Moo tasks. I am in a hot, arid part of South Asia, so I have to keep an eye on temperatures - I don't want a puddle of plastic. Having too many cores working is not an advantage in my case. |
|
Joined: 17 Feb 09 Posts: 91 Credit: 1,603,303,394 RAC: 0
|
According to my work-in-progress listings, I received this WU listed as in progress: https://www.gpugrid.net/result.php?resultid=33134063 but it is non-existent on the computer. Since it doesn't exist, I can't abort it or anything, so the project will have to remove it from my queue and reassign it. |
|
Joined: 1 Jan 15 Posts: 1168 Credit: 12,317,898,501 RAC: 91,654
|
On one of my hosts a Python task has now been running for almost 3 times as long as all the "long" ones before. There is CPU activity, also GPU activity and VRAM usage in the usual range, and likewise RAM. The slot in the project folder is also filled with some 8.25 GB. Still, I am not sure whether this task has hung itself up in some way. Could this still be a valid task, or should I rather terminate it? |
|
Joined: 1 Jan 15 Posts: 1168 Credit: 12,317,898,501 RAC: 91,654
|
on one of my hosts a Python has now been running for almost 3 times as long as all the "long" ones before. I now looked up the task history - it failed on 7 other hosts. So I'd better cancel it :-) |
|
Joined: 18 Jul 13 Posts: 79 Credit: 218,778,292 RAC: 12,880
|
Can you check whether wrapper_run.out changes, and the number of samples collected? There should be a config file in the slot directory that contains the start sample number and the end sample number; subtracting one from the other gives the target number of samples.
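The subtraction suggested above could be scripted as below. The file layout and the key names (`start_sample`, `end_sample`) are assumptions for illustration, since the exact format of the config file in the slot directory isn't specified here.

```python
def target_samples(config_text):
    """Parse hypothetical 'start_sample'/'end_sample' lines from a slot
    config file and return the number of samples the task should collect.
    The key names are assumptions, not the real file format."""
    values = {}
    for line in config_text.splitlines():
        if ":" in line:
            key, _, val = line.partition(":")
            values[key.strip()] = val.strip()
    return int(values["end_sample"]) - int(values["start_sample"])

example = "start_sample: 1200\nend_sample: 4800\n"
print(target_samples(example))  # 3600
```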
©2026 Universitat Pompeu Fabra