Kepler - Not fully using CPU?
| Author | Message |
|---|---|
Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0
|
Hello: Possibly it is the same issue that has been happening on Linux for a month: low CPU use (about 12%), but with performance almost equal to when the load was 100% of one CPU per GPU. See: http://www.gpugrid.net/forum_thread.php?id=3601 |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Carlesa, did your change in GPUGrid task behavior... coincide with an updated driver version? Here in the Windows world, we only noticed the new behavior when we started using 334.67 BETA drivers. |
|
Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0
|
Sorry for not providing input on this sooner, but I've been swamped with other issues. I have 2 Linux rigs crunching GPUgrid. One has a 670 and a 660Ti in it (shows as two GTX 660Ti on the website) with driver version 331.20, and it is showing 99-100% CPU usage on each of 2 tasks. Those are real cores, no HT on that CPU. The other rig has one GTX 670 with driver 331.38, and it is showing 9-11% CPU usage. Again, those are real cores, no HT on that CPU. I don't think 331.38 is a beta driver, but I could be wrong. Take this report with a bit of caution, because something weird that I don't understand is going on. Due to events I'm not going to bother explaining because they're long and complicated, the rig with the older driver should actually have the newer driver, so something is fishy here, but I'm not sure what it is yet. I'm either confused, or NVIDIA released a driver and then retracted it, or something. I'll get back with more details when I know more, but I hope what I have provided so far sheds some light. I would be very happy to learn that we're getting the same production with less CPU time. BOINC <<--- credit whores, pedants, alien hunters |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I changed my CPU usage setting to 100%. As my two present GPUGrid tasks state that they require 0.701 and 0.778 CPU's, this means that I can run 7 CPU tasks, rather than 5 (as before the driver update), and the actual CPU usage is at ~97%. So in reality the GPU's are using around 0.5 CPU's to support each of my GTX670 and GTX770 cards. For lesser cards it would be less and for bigger cards it's going to be more.

I have a 210x-SANTI_MAR420cap310-0-32-RND0577_0 WU (CUDA5.5) that was under 3% complete when I changed the CPU settings. It has been running while the system-wide CPU usage is ~97% for a further 15% of its run. Unfortunately it appears that the run time will rise to around 38,000 seconds, which is significantly (~20%) longer than with the previous settings and driver:

313x-SANTI_MARwtcap310-2-32-RND3990_0 (5134323): sent 3 Feb 2014 13:53:54 UTC, reported 3 Feb 2014 23:37:50 UTC, Completed and validated, run time 31,446.83 s, CPU time 30,515.01 s, credit 115,650.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

848x-SANTI_MARwtcap310-1-32-RND6732_0 (5133355): sent 3 Feb 2014 3:52:21 UTC, reported 3 Feb 2014 14:53:19 UTC, Completed and validated, run time 31,495.69 s, CPU time 31,043.62 s, credit 115,650.00, Long runs (8-12 hours on fastest card) v8.15 (cuda42)

So, despite freeing a CPU thread or two, the GPUGrid WU's are every bit as dependent on CPU availability/responsiveness as before, and the more CPU apps are running, the slower the GPUGrid app will run. Using MSI Afterburner I can see that GPU usage is more jagged, which is typical when there is resource contention.

Dropping from 100% CPU usage to 95% (6 CPU tasks in my case instead of 7), the GPU usage rose from roughly 70% to 86% and 81% and became less jagged (but still a little). Dropping to 75% made it slightly more linear and utilization rose by a further 1%. Dropping to 50% did the same: another ~1% gain in GPU usage and an almost perfectly linear line. With CPU usage set to 40%, GPU usage was linear at 89% and 86%. This is typical of the way it used to be.

The only other thing you could fiddle with is the GPUGrid WU priorities... FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0
|
> Carlesa, did your change in GPUGrid task behavior... coincide with an updated driver version? Here in the Windows world, we only noticed the new behavior when we started using 334.67 BETA drivers.

Hi. The difference, as mentioned in the information above, is in the version of Linux in use (Ubuntu 13.10 or 14.04); the Nvidia driver is the same in both cases, 331.38. I also use Windows 8.1 (same hardware), and there operation is normal, using almost 100% of the CPU, with driver 332.21. |
|
Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0
|
> Carlesa, did your change in GPUGrid task behavior... coincide with an updated driver version? Here in the Windows world, we only noticed the new behavior when we started using 334.67 BETA drivers.

I am running Ubuntu 12.04 on both machines. I downloaded the drivers directly from NVIDIA; I did not install drivers from the "Additional drivers" utility. Anyway, the fishy thing I referred to in my previous post is probably irrelevant. The relevant point is that the newer driver seems to use a lot less CPU. On both rigs I am running other non-GPU projects. I'll experiment with turning those off while allowing GPUgrid tasks to run, to see if that affects CPU usage and/or runtimes. I can experiment with process priority (niceness) too. |
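The niceness experiment mentioned above could be scripted roughly as follows. This is only a sketch under assumptions: it is Linux-only, the process-name fragment `acemd` is an assumed name for the GPUGrid application (check your own process list), and the helper names `find_pids` and `set_niceness` are made up for illustration. Lowering niceness (raising priority) normally requires root.

```python
import os
from typing import List


def find_pids(name_fragment: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose command name (<proc_root>/<pid>/comm) contains name_fragment."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "cpuinfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if name_fragment in f.read().strip():
                    pids.append(int(entry))
        except OSError:
            continue  # process exited meanwhile, or is not readable
    return sorted(pids)


def set_niceness(pid: int, niceness: int) -> int:
    """Set the niceness of pid and return the value now in effect."""
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # "acemd" is an assumed fragment of the GPUGrid app's process name.
    for pid in find_pids("acemd"):
        try:
            print(pid, "-> niceness", set_niceness(pid, 0))
        except PermissionError:
            print(pid, "-> need root to lower niceness")
```

Running it once with the other projects active and once with them suspended, then comparing run times, would separate the effect of priority from the effect of simply having idle cores.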
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
... and fiddle with the GPUGrid WU priorities I did; I used Process Hacker to set them all to high, including the I/O priorities.

I ran a 715x-SANTI_MARwtcap310-6-32-RND2030_0 WU with CPU usage at 95% in Boinc (88% going by Task Manager). In effect, while using 1 more CPU core to crunch with, the run time was less than with a similar WU I ran with the old drivers, 917x-SANTI_MARwtcap310-1-32-RND4455_0.

Beta drivers (GTX770): run time 28,202.99, CPU time 11,046.48
Old drivers (GTX770): run time 29,690.77, CPU time 29,412.32

So, that's a 5% improvement in the GPUGrid WU while using 1 more CPU core.

715x-SANTI_MARwtcap310-6-32-RND2030_0 (5142448): sent 5 Feb 2014 18:58:13 UTC, reported 6 Feb 2014 4:11:30 UTC, Completed and validated, run time 28,202.99 s, CPU time 11,046.48 s, credit 115,650.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

917x-SANTI_MARwtcap310-1-32-RND4455_0 (5132820): sent 2 Feb 2014 22:00:26 UTC, reported 3 Feb 2014 7:07:43 UTC, Completed and validated, run time 29,690.77 s, CPU time 29,412.32 s, credit 115,650.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

Another comparison of the Beta against the old driver, this time for a GTX670. This time priorities were set some time into the run, and again overall 1 more CPU core was used to crunch CPU tasks:

313x-SANTI_MARwtcap310-2-32-RND3990_0 (5134323): sent 3 Feb 2014 13:53:54 UTC, reported 3 Feb 2014 23:37:50 UTC, Completed and validated, run time 31,446.83 s, CPU time 30,515.01 s, credit 115,650.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

533x-SANTI_MAR420cap310-0-32-RND9530_0 (5139521): sent 5 Feb 2014 23:34:59 UTC, reported 6 Feb 2014 9:31:07 UTC, Completed and validated, run time 31,439.30 s, CPU time 11,441.35 s, credit 115,650.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

Although Process Hacker does not allow the saving of thread priority settings, it does allow you to save Task Priority. Fortunately this seems to be sufficient:

Old drivers, gluilex4x4-NOELIA_DIPEPT1-0-2-RND2366_1 (5127910): sent 4 Feb 2014 0:18:57 UTC, reported 4 Feb 2014 8:45:15 UTC, Completed and validated, run time 26,130.57 s, CPU time 25,997.61 s, credit 93,000.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

Beta drivers, tyrglux6x44-NOELIA_DIPEPT1-1-2-RND3665_0 (5143384): sent 6 Feb 2014 3:16:34 UTC, reported 6 Feb 2014 11:22:11 UTC, Completed and validated, run time 25,796.40 s, CPU time 13,062.85 s, credit 93,000.00, Long runs (8-12 hours on fastest card) v8.15 (cuda55)

This NOELIA WU comparison suggests that with just the app priority set to high (not threads or I/O) the WU is slightly faster (1.3%) than the last one I ran on the same card with the old drivers. This is good news, as it means this works on at least 2 WU types. Even if the tasks took the same length of time to complete, I'm getting a CPU thread out of it.

A few things to note:

Different GPU's use the CPU to different extents during the run, so expect some performance variation.

Different WU types use the CPU to different extents. For my W7 system it's presently from 29% to 46%.

The actual CPU usage obviously depends on the GPU (small will be less and the GTX780Ti the most), and on the CPU, which could be 2GHz or 4GHz. So, some tasks might be 5% faster, others only 1%.

You need to save the priority for both the cuda4.2 and cuda5.5 versions of the app.

What to do next is to test with app priority only more thoroughly, and then without app priority, to make sure these results are not just due to having more free CPU cycles; then test again at 100% CPU in Boinc and with priority on. That will take a few days though.

PS. Setting priority is old hat, but it's an opportunity to have another look now that things have changed with this beta driver. |
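The percentages quoted in this post can be checked directly from the run time and CPU time figures in the results listed above. A minimal Python sketch follows; the helper name `pct_change` is made up for illustration, and the numbers are copied from the post.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new; negative means new is smaller."""
    return (new - old) / old * 100.0


# (label, old run time, beta run time, old CPU time, beta CPU time), in seconds
comparisons = [
    ("GTX770 SANTI_MAR", 29690.77, 28202.99, 29412.32, 11046.48),
    ("GTX670 SANTI_MAR", 31446.83, 31439.30, 30515.01, 11441.35),
    ("NOELIA_DIPEPT1", 26130.57, 25796.40, 25997.61, 13062.85),
]

for label, run_old, run_new, cpu_old, cpu_new in comparisons:
    print(
        f"{label}: run time {pct_change(run_old, run_new):+.1f}%, "
        f"CPU time {pct_change(cpu_old, cpu_new):+.1f}%"
    )
```

On these figures the run time changes by about -5.0%, roughly 0%, and -1.3% respectively, while CPU time falls by roughly half to almost two thirds. That matches the "5%" and "1.3%" quoted above. Each comparison is a single pair of work units, so treat the differences as indicative only.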
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
The purpose of the "SWAN_SYNC" knob is to set the CUDA runtime to use a low-CPU mode. In practice, this hasn't worked very well for quite some time (in terms of driver releases). Looks rather like "correct" behaviour is restored with version 334. You should see lower CPU load, without diminished GPU performance. MJH |
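To illustrate why a "low-CPU mode" can cut host CPU load from about 100% to a few percent without slowing the GPU work, here is a small analogy in plain Python. It is not the GPUGrid application or the NVIDIA driver: a sleeping thread stands in for a GPU kernel, and the host either busy-polls for completion ("spin") or sleeps until it is signalled ("block"). Whether the driver change discussed here works exactly this way is an assumption; the function name `simulate_wait` is made up for this sketch.

```python
import threading
import time


def simulate_wait(mode: str, kernel_seconds: float = 0.3) -> float:
    """Return CPU seconds the process burns while waiting for a fake 'kernel'."""
    done = threading.Event()

    def fake_kernel() -> None:
        time.sleep(kernel_seconds)  # stands in for work running on the GPU
        done.set()

    worker = threading.Thread(target=fake_kernel)
    cpu_start = time.process_time()
    worker.start()
    if mode == "spin":
        while not done.is_set():  # busy-poll: keeps a CPU core occupied
            pass
    elif mode == "block":
        done.wait()  # sleep until signalled: almost no CPU time used
    else:
        raise ValueError(f"unknown mode: {mode}")
    worker.join()
    return time.process_time() - cpu_start


if __name__ == "__main__":
    for mode in ("spin", "block"):
        print(f"{mode}: {simulate_wait(mode):.3f} CPU seconds")
```

In both modes the wall-clock wait is about the same, which mirrors the reports in this thread of similar run times with much lower CPU time. The usual price of blocking is a little extra wake-up latency, which may be why some hosts report slightly lower GPU usage.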
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
> The purpose of the "SWAN_SYNC" knob is to set the CUDA runtime to use a low-CPU mode. In practice, this hasn't worked very well for quite some time (in terms of driver releases). Looks rather like "correct" behaviour is restored with version 334.

I gave the v334.67 driver a try on my WinXPx64 / Core i7-4770k / 2x GTX 780Ti host, and the GPU usage on both cards dropped by 5% (the temperatures were also lower than before), so I've reverted to the v332.21 driver. See tasks 7743473 & 7739626. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
> The purpose of the "SWAN_SYNC" knob is to set the CUDA runtime to use a low-CPU mode. In practice, this hasn't worked very well for quite some time (in terms of driver releases). Looks rather like "correct" behaviour is restored with version 334.

I expect XP is different to W7 because the GPU utilization is higher on XP. On W7 my GTX770's GPU usage is around 79% running a long SANTI_MAR (cuda5.5). Did you try changing the priority, or perhaps you were doing that with the old drivers?

Performance seems to depend on what your settings are/were, and likely your OS (XP vs W7); when I first tested the Beta driver, the runtimes were the same but there was more CPU available. I didn't use the extra CPU at first. When I used some of this freed-up CPU, the GPUGrid runtimes increased.

The fact that runtime is the same but CPU usage is less is still a better situation from the user's point of view; the system is going to be more responsive to the user as it has more CPU threads available. But if you can't use these free CPU cycles without a detrimental impact on the GPUGrid WU's runtime, then it's not great.

The possible fix is to change the app's priority to high using Process Hacker (or similar). That way you might be able to use one more CPU core/thread and get the same runtime performance for GPUGrid WU's (while utilizing more of your CPU at a CPU project). Of course this undoes the app's default settings, so it might mean that system responsiveness isn't what it was. Again, this is likely to be different for different hardware; a GTX660 is more likely to be less responsive than a high end card.

To test the suggested priority fix, I have just started running 2 SANTI_MAR WU's with priority set to normal so I can compare these against WU's completed at high priority (again, 1 more CPU thread is being used than I was using with the non-Beta drivers)... |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Thank you for performing these tests, skgiven. I consider them very useful, and look forward to your additional results. It would be difficult for me to perform such tests, since I'd have to disable work on a heterogeneous GPU in order to ensure tasks stay on the Kepler. But I will surely use your results to modify my configuration appropriately.

So far, my configuration has been changed a little. I still run GPUGrid only on my GTX 660 Ti and my GTX 460, and I run Albert/Einstein/SETI/Beta (A/E/S/B) only on my GTS 240.

Old config:
Set GPUGrid tasks to 0.5 CPU, since when running 2, I for sure was using a core on the Kepler.
Set A/E/S/B to 0.5 CPU for most of their apps, since they don't use a full core.
Use 100% CPUs.
Result: 7 CPU tasks, 2 GPUGrid tasks, 1 A/E/S/B task; CPU slightly overloaded.

Current config:
Set GPUGrid tasks to 0.4 CPU, to better reflect values I'm seeing.
Set A/E/S/B to 0.3 CPU for most of their apps. Note: 0.4+0.4+0.3=1.1.
Use 100% CPUs.
Result: 7 CPU tasks, 2 GPUGrid tasks, 1 A/E/S/B task; CPU very slightly underloaded.

Note: There is 1 A/E app that I set to 1.0 CPU, "Gamma-ray pulsar search #2", and there is 1 S/B app that I set to 1.0 CPU, "AstroPulse v6". Process Monitor shows that those apps actually use a full core, which is why I have them set up to use 1.0 CPU. When one of those tasks runs, the result is: 7 CPU tasks, 2 GPUGrid tasks, 1 1-core A/E/S/B task; CPU moderately overloaded.

Regards, Jacob |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
I have been running the 334.67beta on WinXP (32-bit) long enough to complete a Long on both a GTX 660 (SANTI_MARwtcap310) and a GTX 650 Ti (SANTI_MAR422cap310). It seems fine on GPU usage, being its usual 97% on the 660 and 98% on the 650 Ti. The run times also seem identical to the previous driver (332.21); that is, they are no faster. The only real difference is the CPU usage. It is now down to 17% on the 660, and 12% on the 650 Ti (as measured by BoincTasks). That is very nice, since that motherboard is an older P45 with an E8400 Core2 Duo at 3.0 GHz. The next step will be to try to run a single WCG project also on the CPU. We will see about that. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 2
|
> I expect XP is different to W7 because the GPU utilization is higher on XP.

I suspect GPU utilization is higher on XP because the Windows XP driver model is so very different from that of Vista/Win 7. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
> I expect XP is different to W7 because the GPU utilization is higher on XP.

It's been long established (not least by you) that the introduction of the WDDM is the root of the performance difference between XP and more recent versions of Windows (Vista onwards). The WDDM introduced a large CPU overhead, hence the latency increase. For some 'non-GPUgrid' CUDA apps it's negligible; for other apps it's as high as 20%. Here it's around 12.5% now for a high-ish end GPU, but it is obviously dependent on the CPU and supporting hardware. The latency also impacts some OpenCL apps for ATI cards, but not any Boinc apps that I'm aware of.

In the case of the Beta vs older drivers, there appears to be a difference between how the tasks perform under XP and W7. With XP it's apparently worse, though I think I know why (below). So far as I can tell, for W7, an 8-thread Intel CPU and two high-ish end GPU's, the run times are the same under the Beta driver if you don't change any Boinc settings and hadn't saturated the CPU to begin with. In my case I was able to configure Boinc to use an extra thread to crunch CPU tasks on, and set the app priority to High, keeping the GPUGrid run times the same or slightly better.

Zoltan, it occurred to me that the difference might actually be the CPU projects you were running; I'm just running WCG tasks, which set the app priority to Idle. Many projects use higher settings, such as Normal. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
> Zoltan, it occurred to me that the difference might actually be the CPU projects you were running; I'm just running WCG task that set the app priority to Idle. Many projects use higher settings, such as Normal.

I'm running 5 SIMAP tasks on the CPU, and they are running at low priority (I didn't change anything other than the NVidia driver during this test). |
|
Joined: 4 Oct 12 Posts: 53 Credit: 333,467,496 RAC: 0
|
For those who may be interested, I deployed the 334.67 BETA drivers to my small machine with two GTX 650 Ti's. I agree that SWAN_SYNC appears to be enabled now, as CPU usage dropped to 8%, yet the run time for a long task appears to be only 2% longer: a good trade-off for the reduced power usage, not to mention that my GPU temps dropped a few degrees. http://www.gpugrid.net/results.php?hostid=166813 I set Boinc to run with an 'above normal' priority using: http://www.efmer.eu/boinc/download.html Incidentally, my machine is running headless; on this occasion I was able to install the drivers through an RDP session, though they did not take effect (as the cards cannot be seen). Hence, once rebooted, I just updated the cards directly through Device Manager. |
|
Joined: 4 Oct 12 Posts: 53 Credit: 333,467,496 RAC: 0
|
I measured the power draw at the wall with the 334.67 beta drivers on XP x86, and it's about 8% less (176 W) than with the previous WHQL drivers (192 W), whilst performance only dropped by about 2%: a good result in my opinion. |
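The trade-off reported here can be expressed as energy per work unit, taking energy as wall power multiplied by run time. The sketch below is a rough estimate that assumes the "2% performance drop" means a 2% longer run time and that the wall power stays constant over the run; the function name `energy_per_task_ratio` is made up for illustration.

```python
def energy_per_task_ratio(
    power_old_w: float, power_new_w: float, runtime_increase_pct: float
) -> float:
    """Ratio of new to old energy per task, with energy = power x run time."""
    return (power_new_w / power_old_w) * (1.0 + runtime_increase_pct / 100.0)


# Figures from the post: 192 W before, 176 W after, run time about 2% longer.
ratio = energy_per_task_ratio(192.0, 176.0, 2.0)
print(f"power change: {(176.0 / 192.0 - 1.0) * 100:+.1f}%")
print(f"energy per task change: {(ratio - 1.0) * 100:+.1f}%")
```

Under those assumptions the host draws about 8.3% less power and uses about 6.5% less energy per completed task, since the 2% longer run time gives back part of the saving.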
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Just wanted to chime in that, after repeated pressure by me, ManuelG is going to find more information on what has changed in the driver that has affected Kepler CPU usage. He hopes to have such information within 48 hours. https://forums.geforce.com/default/topic/679611/geforce-drivers/official-nvidia-334-67-beta-display-driver-feedback-thread-released-1-27-14-/post/4119864/#4119864 |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
ManuelG said he had to file a bug, to have the devs even look at it. https://forums.geforce.com/default/topic/690370/geforce-drivers/official-nvidia-334-89-whql-display-driver-feedback-thread-released-2-18-14-/post/4127820/#4127820 I tried to stress that it may not be a bug at all, and that we are simply looking for confirmation that something changed, and more information about that change: https://forums.geforce.com/default/topic/690370/geforce-drivers/official-nvidia-334-89-whql-display-driver-feedback-thread-released-2-18-14-/post/4128112/#4128112 I'll let you guys know when I know more. |
Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0
|
Hello: On Windows 8.1, since I installed the latest Nvidia driver, 334.89 (a couple of days ago), it behaves like Linux has for some time now: the CPU load is reduced to 10-20% (varying with the task type) instead of 100% previously. The overall performance of the GPU is hardly altered, and the change improves the load / performance / consumption ratio. |
©2026 Universitat Pompeu Fabra