Message boards :
News :
New CPU work units
Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
It's on the TODO list, yes. It'll be appearing in the Linux version first, as that's the easier one to develop. Matt
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
"I am thinking about hard-coding the CPU use to [number-of-CPU-cores] - [number-of-GPUs]. Opinions?"

For GPUGRID-only crunchers this is great, but I think those who run other projects on the CPU are not that happy. For me you can do it.

Edit: there are also projects that use the iGPU, which also takes up a thread on the CPU.

Greetings from TJ
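The rule being floated here can be sketched in a few lines. This is a hypothetical helper (the function name and the iGPU flag are my own, not anything in BOINC itself), just to make the arithmetic concrete:

```python
# Sketch of the proposed rule: give the MT CPU app
# [number-of-CPU-cores] - [number-of-GPUs] threads, optionally
# reserving one more core when an iGPU project also runs.
def cpu_threads_for_md(n_cores, n_gpus, igpu_feeds_cpu=False):
    """Cores left for the MT CPU task after reserving one per GPU."""
    reserved = n_gpus + (1 if igpu_feeds_cpu else 0)
    return max(1, n_cores - reserved)  # never go below one thread

print(cpu_threads_for_md(8, 2))        # 8 cores, 2 GPUs -> 6
print(cpu_threads_for_md(4, 1, True))  # quad core, 1 GPU + iGPU -> 2
```

As TJ's edit points out, the simple cores-minus-GPUs formula undercounts when an iGPU project is also feeding on a CPU thread, which is what the extra flag illustrates.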
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
I have it set to use 100% of the CPUs and only the CPU app for GPUGRID is in the queue, but strangely only 5 CPUs are used. It should be 8. I noticed this in the progress file:

Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: GenuineIntel
Brand: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz
Family: 6 Model: 26 Stepping: 5
Features: apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pdcm popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
Acceleration most likely to fit this hardware: SSE4.1
Acceleration selected at GROMACS compile time: SSE2
Binary not matching hardware - you might be losing performance.

Also, as the runtime estimation is not correct yet, the first 99% goes rather quickly and then the last 1% takes between 20-28 hours to finish. But Matt knows this already.

Greetings from TJ
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
TJ- from a post earlier:

"Only SSE2 at the moment. Will probably make builds with higher levels of optimisation later but for now I'm concerned about correctness rather than performance."

TJ or Jacob- for your SLI systems: have you noticed any task being kicked out while running an older CPUMD task with a new 10x compute cost task in cache? For me, with an old CPUMD running in high-priority mode and a new CPUMD in cache, one of the two running GPU tasks goes into "waiting to run" mode.
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
"TJ- from a post earlier."

Aha, thanks, now I understand.

"TJ or Jacob- for your SLI systems: have you noticed any task being kicked out while running an older CPUMD task with a new 10x compute cost task in cache? For me, with an old CPUMD running in high-priority mode and a new CPUMD in cache, one of the two running GPU tasks goes into 'waiting to run' mode."

I have only "old" ones running and in the queue; I will let them finish first. However, none is running at high priority yet.

Greetings from TJ
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
"TJ- from a post earlier."

Currently the CPU task from the first batch is NOT running in high priority, but when I download a "new" CPUMD 10x compute task, the "old" task goes into high priority and kicks out one of the two computing GPU tasks.
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
Ok, let's get a thing straight here. Client scheduling. The order, I believe, goes something like this:

1) "High Priority" coprocessor (GPU/ASIC) tasks
2) "High Priority" CPU tasks (up to ncpus + 1) (MT tasks allowed to overcommit)
3) "Regular" coprocessor (GPU/ASIC) tasks (up to ncpus + 1)
4) "Regular" CPU tasks (up to ncpus + 1) (MT tasks allowed to overcommit)

So... When one of the new GPUGrid MT CPU tasks comes in, if it is set to use all of the CPUs, and it runs high priority... it gets scheduled in "order 2", which is above the GPU tasks, which come in at "order 3". And then it will additionally schedule as many "order 3" GPU tasks as it can, but only up to the point that it budgets 1 additional CPU. (So, if your GPU tasks are set to use 0.667 CPUs, like I have scheduled mine via app_config, then it will run 1 GPU task, but not 2.)

This is NOT a problem of "oh wow, GPUGrid MT tasks are scheduling too many CPUs." This IS a problem of "oh wow, GPUGrid MT tasks go high-priority immediately. That throws off all of the scheduling on the client."

Hopefully that helps clarify.

PS: Here is some dated info that is a useful read:
http://boinc.berkeley.edu/trac/wiki/ClientSched
http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen
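For readers who haven't used it: the 0.667-CPU budgeting Jacob mentions is done with an app_config.xml in the project's directory. A minimal sketch along those lines; the app name "acemdlong" is an assumption here, so check client_state.xml for the real app names on your host:

```xml
<!-- Illustrative only: three GPU tasks at 0.667 CPUs each together
     budget two full CPUs, leaving room for an MT CPU task. -->
<app_config>
  <app>
    <name>acemdlong</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.667</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

The client re-reads this file on "Read config files" or restart; the fractions only affect scheduling budgets, not how much CPU the science app actually consumes.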
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
Jacob- thank you for the information about client scheduling.

Matt- I see you released a CPUMD app for Linux with support for SSE4/AVX. Will Windows also see an upgrade? Do you have an idea what the speed-up with the SSE4/AVX app will be compared to the standard SSE2 app?
Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
"Will Windows also see an upgrade?"

Probably within the week. 10-30% for AVX on Intel, I think.
Joined: 10 Nov 07 Posts: 10 Credit: 12,777,491 RAC: 0
ohmyohmy... http://www.gpugrid.net/result.php?resultid=13195959

Running on 3 out of 4 CPU cores, nsteps=5000000. At 57 hours:

Writing checkpoint, step 3770270 at Tue Oct 14 23:57:52 2014

Seems it will finish... in 24 more hours. Seems something went really weird with the estimated runtime.

Question: should I abort the 6 other workunits?
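A quick sanity check on that "24 more hours", extrapolating linearly from the checkpoint step quoted in the log (illustrative arithmetic only):

```python
# Linear extrapolation from the checkpoint in the log above:
# 3,770,270 of 5,000,000 steps done after 57 hours of running.
steps_done  = 3_770_270
steps_total = 5_000_000
hours_spent = 57.0

hours_left = (steps_total - steps_done) / steps_done * hours_spent
print(f"~{hours_left:.1f} h remaining at the current rate")
```

Constant-rate extrapolation gives closer to 19 hours than 24, so either the poster is padding the estimate or the task's step rate has slowed, e.g. from sharing the box with other work.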
Joined: 31 Aug 13 Posts: 11 Credit: 7,952,212 RAC: 0
I noticed this bit of info in the task I ran, using 3 of the 4 available cores on my computer: http://www.gpugrid.net/result.php?resultid=13201370

Using 1 MPI thread
Using 3 OpenMP threads

NOTE: The number of threads is not equal to the number of (logical) cores and the -pin option is set to auto: will not pin thread to cores. This can lead to significant performance degradation. Consider using -pin on (and -pinoffset in case you run multiple jobs).

Can this become an issue for computers that aren't running the task with all cores?
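For context, the flags in that GROMACS note look like this on a normal (non-BOINC) command line. Under BOINC the project's wrapper builds the mdrun invocation, so volunteers can't normally change this; the sketch below only illustrates what the note is referring to, using GROMACS 4.6-era options:

```
# Pin a 3-thread run to cores 0-2:
mdrun -nt 3 -pin on -pinoffset 0 ...
# A second concurrent 3-thread job would start at the next core:
mdrun -nt 3 -pin on -pinoffset 3 ...
```

Without pinning, the OS may migrate the three threads across all four cores, which is the "performance degradation" the note warns about; with only one job per machine the cost is usually modest.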
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
This is the last bit of the stderr file:

starting mdrun 'Protein in water'
5000000 steps, 10000.0 ps (continuing from step 3283250, 6566.5 ps).
Writing final coordinates.

             Core t (s)   Wall t (s)      (%)
Time:         32457.503    32458.000    100.0
                9h00:58
             (ns/day)     (hour/ns)
Performance:    9.140        2.626

gcq#0: Thanx for Using GROMACS - Have a Nice Day

16:39:45 (4332): called boinc_finish(0)

It ran on 5 CPUs (8 were allowed). Am I right in seeing that it took 9 hours to finish? It took a bit more, see this:

1345-MJHARVEY_CPUDHFR-0-1-RND9787_0 | 10159887 | 153309 | 12 Oct 2014 17:58:49 UTC | 15 Oct 2014 14:38:34 UTC | Completed and validated | 94,864.92 | 567,905.50 | 2,773.48 | Test application for CPU MD v8.46 (mt)

Greetings from TJ
Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
No - it took just over a day. The performance was ~9 ns/day, and the sim was 10 ns in length. Matt
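Matt's arithmetic, spelled out. The figures come from TJ's stderr output (performance 9.140 ns/day; 5,000,000 steps covering 10000.0 ps, i.e. a 10 ns simulation):

```python
# Wall time = simulation length / throughput.
ns_per_day = 9.140   # "Performance" line in the stderr output
sim_ns     = 10.0    # 10000.0 ps total trajectory

days = sim_ns / ns_per_day
print(f"{days:.2f} days (~{days * 24:.1f} h) of wall time")
```

That ~26 h matches the 94,864.92 s run time in the task table; the 9h00:58 in stderr only covers the final segment after the last restart, which is why the two numbers disagree.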
Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0
Had 3 errors on one of my PCs: http://www.gpugrid.net/results.php?hostid=185425

All errored out with:

"Program projects/www.gpugrid.net/mdrun.846, VERSION 4.6.3
Source code file: ..\..\..\gromacs-4.6.3\src\gmxlib\checkpoint.c, line: 1562
File input/output error: Cannot read/write checkpoint; corrupt file, or maybe you are out of disk space?
For more information and tips for troubleshooting, please check the GROMACS website at http://www.gromacs.org/Documentation/Errors"

This computer has no problems running other projects... including vLHC@Home, Rosetta, etc.
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
A few observations about CPUMD tasks:
- a dual-core CPU needs ~4 days (96 h) to complete a task
- a dual core with HT (4 threads) requires ~2 days (48 h)
- a quad core without HT (4 threads) takes ~16-36 h
- a quad core with HT (8 threads) completes tasks in ~24 h
- a 6-core (12 threads) finishes a task in ~8-16 h
- a 16-thread CPU manages CPUMD tasks in under ~12 h

Some CPUs finish faster thanks to overclocking and RAM clocked at 1833 MHz or higher. Disk usage is low for CPUMD; note that when running GPU tasks disk usage can be higher for certain tasks (unfold_Noelia). CPU temps are low with the SSE2 app; when the AVX CPUMD app is released, temps will be higher. For people running Intel AVX CPUs, a possible 10-30% speed-up is coming when the AVX app is released.

Some info: http://en.wikipedia.org/wiki/Dihydrofolate_reductase
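Those timings are roughly consistent with wall time scaling inversely with thread count. A toy model; the perfect-scaling assumption and the 96 h two-thread baseline (taken from the observations above) are simplifications, and real tasks deviate, especially across HT siblings:

```python
# Naive inverse-linear runtime model for the MT CPU tasks.
def est_hours(threads, base_threads=2, base_hours=96.0):
    """Estimated wall hours, assuming perfect thread scaling."""
    return base_hours * base_threads / threads

for t in (2, 4, 8, 12, 16):
    print(f"{t:2d} threads -> ~{est_hours(t):.0f} h")
```

The model reproduces the reported ~48 h for 4 threads and ~24 h for 8 threads, but undershoots slightly at the high end, which fits the usual diminishing returns of hyper-threaded cores.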
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
I completed my first CPU task on my main rig on Windows 10 Technical Preview x64.
http://www.gpugrid.net/result.php?resultid=13206356

Observations:
- It used app: Test application for CPU MD v8.46 (mtsse2)
- It had horrible estimates and an inability to report progress correctly, and it had a 1-week deadline, so it ran as high-priority the entire time, interfering with the BOINC client's scheduling of my GPUGrid GPU tasks. I will not be running this type of task again unless the estimation is fixed.
- It ran using 6 (of my 8) logical CPUs, as I had BOINC set to use 75% of CPUs, since I am running 2 RNA World VM tasks outside of BOINC
- It took 162,768.17 s (45.2 hours) of wall time
- It consumed 721,583.90 s (200.4 hours) of CPU time
- It did checkpoint every so often, which I was happy to see. It appeared to resume from checkpoints just fine.
- It completed successfully, with the output text below
- It validated successfully, and granted credit.
- It seems weird that the time values in the output do not match either the wall time or CPU time values that BOINC reported. Bug?

             Core t (s)   Wall t (s)      (%)
Time:         18736.176    18736.000    100.0
                5h12:16
             (ns/day)     (hour/ns)
Performance:    5.491        4.371

Let us know when the estimation and progress problems are fixed, and then maybe I'll run another one for you!

Thanks,
Jacob
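The two BOINC-reported times in that post imply the average number of busy logical CPUs, which helps make sense of a 6-thread run sharing the machine with VM tasks (illustrative arithmetic only):

```python
# CPU time divided by wall time gives the average core occupancy.
wall_s = 162_768.17  # BOINC-reported wall time
cpu_s  = 721_583.90  # BOINC-reported CPU time

avg_cores = cpu_s / wall_s
print(f"average busy logical CPUs: {avg_cores:.2f}")
```

About 4.4 of the 6 allowed logical CPUs were busy on average, consistent with the task competing for cores with the out-of-BOINC VM workload.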
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
Deleted post |
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
eXaPower: Your questions about Windows 10 should have been a PM. I'll send you a PM response. |
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
For the last couple of days I've had two GPU tasks and one CPUMD task running in high priority, and up until now all ran with no issues. Just now, seemingly at random, BOINC has decided to kill one of the GPU tasks, sending it into "waiting to run" mode. If I suspend the CPUMD task, both GPU tasks will run. Allowing the CPUMD task to run shuts down a GPU task.
©2025 Universitat Pompeu Fabra