New CPU Application for testing
Author | Message |
---|---|
Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 |
Hi, There's a new CPU app available for Linux clients. A few WUs are out now, with some more to come after I've received the first results back. The app is multithreaded; I think the default behaviour of the BOINC client is to allocate all cores to it. Please report any observations here. Matt |
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 117,924 |
Hi Matt, So far, all tasks are ending in error. Some stop immediately, some run for a while. Here's my main question. Over at Milkyway@home, they are able to control the threads assigned to a multi-thread task using an app_config.xml file. I put together a similar app_config.xml file for the cpumd tasks and tried it on a few tasks. My BOINC client has 10 threads assigned. I have an app_config.xml set up to limit the threads assigned to cpumd tasks to 6 threads. In BOINC Manager, the task shows that it is using 6 CPUs. However, the stderr.txt shows that it is using 10 OpenMP threads, and system usage indicates that it is using all 10 threads. My app_config: <app_config> <app> <name>acemdlong</name> <max_concurrent>9999</max_concurrent> <gpu_versions> <gpu_usage>1</gpu_usage> <cpu_usage>1.5</cpu_usage> </gpu_versions> </app> <app> <name>acemdbeta</name> <max_concurrent>9999</max_concurrent> <gpu_versions> <gpu_usage>1</gpu_usage> <cpu_usage>1.5</cpu_usage> </gpu_versions> </app> <app> <name>acemdshort</name> <max_concurrent>9999</max_concurrent> <gpu_versions> <gpu_usage>1.0</gpu_usage> <cpu_usage>1.5</cpu_usage> </gpu_versions> </app> <app> <name>android</name> <max_concurrent>4</max_concurrent> </app> <app_version> <app_name>cpumd</app_name> <plan_class>mt</plan_class> <avg_ncpus>6</avg_ncpus> </app_version> </app_config> Can you see anything that needs to be adjusted in my app_config? BOINC Manager is not giving me any error messages when it reads the app_config file. Sure would be nice if this would work here so we could leave some threads open to support GPU tasks. Thanks for all the help. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
Refer to the Application configuration documentation. For most multi-threaded applications you need <cmdline>--nthreads 6</cmdline> to control the behaviour of the application, in addition to the <avg_ncpus>6</avg_ncpus> (which only controls BOINC's scheduling, as you've found). I've only ever used --nthreads under Windows: I'm not sure whether it's applicable under Linux. Perhaps you or Matt could find out for us. |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
That's right - it's controlled by the command line option "--nthreads". It should default to using a single core if that's not specified. You'll be able to see in the stderr of the task's tombstone web page what arguments it received. Matt |
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 117,924 |
Matt and Richard, Thanks for the advice. The cmdline option seemed to do the trick. It is now running on 6 threads. Thanks for the help, captainjack |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
I've just updated the app to correct for crashes on clients with venerable Core2 CPUs. Matt |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
Could someone please report on the success or otherwise of suspend/resume of WUs? Matt |
Joined: 17 Dec 11 Posts: 11 Credit: 105,502,570 RAC: 0 |
Is there any real advantage in making the app multithreaded? It saves memory, but that's all that comes to mind. On the other hand I expect it to be less efficient than running several single-threaded tasks. Plus the BOINC scheduler seems to be bad at managing multithreaded apps. |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
Yes. For the use we intend for it we need the results back in a timely manner. Running these WUs on a single core will work, but the results are likely to come back too late to be useful. This application scales linearly for small N - I'm estimating 4-8 cores on most machines. "Plus the BOINC scheduler seems to be bad at managing multithreaded apps." Well, that's another thing entirely, and our problem to solve. I'm more concerned that the client doesn't give the user the desired level of control. Matt |
Joined: 17 Dec 11 Posts: 11 Credit: 105,502,570 RAC: 0 |
"For the use we intend for it we need the results back in a timely manner. Running these WUs on a single core will work, but the results are likely to come back too late to be useful." This can only work if you don't have to compete for resources. You'll lose your advantage at every task switch or when BOINC decides to delay the start of a cached task in favour of some other. A very short deadline could possibly avoid this, but I'm not sure if I could tolerate such hijacking. "Plus the BOINC scheduler seems to be bad at managing multithreaded apps." I was talking about the client's task scheduler actually. As you already mentioned, it will run multithreaded apps on all cores, effectively interrupting all other work. |
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 117,924 |
Matt said, "Could someone please report on the success or otherwise of suspend/resume of WUs?" Just successfully finished an 8.43 task and it shows validated. It was running with an app_config.xml which allocated 6 threads to the task. After it was running for a few minutes, I suspended then resumed the task. Looks like it worked fine. Link to task: http://www.gpugrid.net/result.php?resultid=12864432 I have another task running now. After it had been running for ~10 minutes, I shut down BOINC then restarted BOINC. It restarted from the beginning. Looks like it is running fine now. Will report back in later. Let me know if you need more information. |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
Super, thanks! Matt |
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 117,924 |
Matt, I had a second task that was running when I shut down BOINC and started it back up again. That task has finished and validated. System: Ubuntu 14.04, BOINC 7.2.42 installed using the Berkeley installer. Let us know if you want us to run some other tests. |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
Hi Everyone, The new test application is now available for Windows as well as Linux. Please do help us test it! This application does the same type of simulations as ACEMD, our GPU application. The reason why we are testing it now is that, in principle, modern CPUs are now finally fast enough to process some of our WUs within an acceptable amount of time. To get to that point though, it is essential to use multiple CPU cores simultaneously, so this is a multithreaded app. I'd encourage you to let the WUs run on all cores (which it will do by default). The performance scales linearly with core count. The main objective of this first phase is to test application stability and measure achieved simulation rates and total throughput. It's a Beta application - to get work for it, you'll need to have your profile set to allow CPU work, allow beta work and enable the application "Molecular Dynamics on CPU". The app is largely feature-complete. The only issue I know is outstanding is that % completion statistics are wrong. I'm sure you'll all find other issues - please post your experiences and observations here. Matt |
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 |
Hi Matt, My Haswell 4771 on Win7 did one, but it ended with an error. Five of my wing(wo)men had errors too. You can see it here: http://www.gpugrid.net/result.php?resultid=12877823 Off topic: my error page also still has one error on it from 31 Aug 2013 and one from 19 Nov 2013. Greetings from TJ |
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 117,924 |
Hi Matt, Just ran one of the multithread CPU tasks on a Windows 7 machine with 16 threads. Matt said: "I'd encourage you to let the WUs run on all cores (which it will do by default). The performance scales linearly with core count." Per your request, I let it run on all 16 threads. CPU utilization was pegged at 100% throughout the run. Task has uploaded and validated. Just started another one. I will keep an eye on it and let you know if anything changes. Let me know if you want me to try a different kind of test. captainjack |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
Those were with the previous version. 844 is current, and should already have fixed those problems. Matt |
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 |
Thanks, will wait for new ones. Greetings from TJ |
Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 |
There are plenty of unsent tasks - if you're not receiving them, best check your project settings as below. Matt |
Joined: 17 Dec 11 Posts: 11 Credit: 105,502,570 RAC: 0 |
Matt, you already know about the incorrect progress display, together with the elapsed and remaining run times, but you haven't mentioned if you see a way to fix this. If not, I'd like to point out that all calculations based on those values are of course wrong too, like computing speed, estimated run times and credits, possibly affecting system operations and user acceptance. For completeness, all my 8.43 WUs so far have finished and validated without further issues, including one that restarted after a system shutdown. WU size seems reasonable. |
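
For reference, here is a minimal sketch of the thread-limiting setup discussed earlier in this thread. It assumes the app name `cpumd` and plan class `mt` from captainjack's file, plus the `--nthreads` option that Richard suggested and Matt confirmed. The value 6 is only the example used in the thread, and the comments are illustrative and can be dropped. As noted above, `<avg_ncpus>` only affects how many cores the BOINC client budgets for the task, while `<cmdline>` is what the application itself receives.

```xml
<app_config>
  <app_version>
    <app_name>cpumd</app_name>
    <plan_class>mt</plan_class>
    <!-- Cores the BOINC client reserves for each task when scheduling -->
    <avg_ncpus>6</avg_ncpus>
    <!-- Passed to the application so it actually starts 6 threads -->
    <cmdline>--nthreads 6</cmdline>
  </app_version>
</app_config>
```

Keeping the two numbers equal should keep BOINC's bookkeeping in line with what the task really uses. Any `<app>` entries for the GPU applications, as in captainjack's file, can sit alongside this block unchanged. To check that the option was picked up, look at the arguments reported in the task's stderr, as Matt describes.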
©2025 Universitat Pompeu Fabra