1 Work Unit Per GPU

Message boards : Number crunching : 1 Work Unit Per GPU

Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47946 - Posted: 5 Oct 2017, 11:54:30 UTC - in response to Message 47945.  
Last modified: 5 Oct 2017, 12:07:34 UTC

Aleksey Belkov wrote:
Retvari Zoltan wrote:
Using SWAN_SYNC (in combination with no CPU tasks)...

..or just increase priority of GPU tasks.
I wrote earlier in this message how to automate the process of increasing priority of GPU tasks.
I see no reason to abandon computing CPU tasks, if there is a quite simple way to minimize their effect(or other programs, if it's not a dedicated host for computing) on GPU tasks.


As people stated in your link...

GPU tasks already run, by default, at a higher process priority (Below Normal process priority = 6) than CPU tasks (Idle process priority = 4). You can inspect the "Base Prio" column in Task Manager, or the "Priority" column in Process Explorer, to confirm. The Windows Task scheduler does a good job of honoring these priorities.
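As a plain lookup, the default scheme above (using the base-priority numbers quoted in this post) might be sketched like this — purely illustrative, not BOINC code:

```python
# Windows base priorities as described above: BOINC runs CPU tasks at
# Idle priority (base 4) and GPU tasks at Below Normal (base 6), while
# ordinary user processes run at Normal (base 8). A higher base priority
# wins CPU time when the scheduler has to choose.
BASE_PRIORITY = {
    "boinc_cpu_task": 4,   # Idle priority class
    "boinc_gpu_task": 6,   # Below Normal priority class
    "user_process":   8,   # Normal priority class
}

def preempts(a, b):
    """True if a process of kind `a` is scheduled ahead of kind `b`."""
    return BASE_PRIORITY[a] > BASE_PRIORITY[b]
```

So, by default, GPU tasks already preempt CPU tasks, but any Normal-priority program preempts both — which is why manually raising a task above Normal speeds it up at the expense of everything else on the desktop.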

So...

Are you seeing a speedup by hacking the priorities with Process Hacker? And, if so, isn't the real reason you see it that you're bumping priorities higher than the (Normal priority = 8) level Windows uses for all your other non-BOINC processes?

BOINC is meant to run tasks without interfering with the normal operations of a device. So, I think our defaults work well to do that, and I think your hacking may work well to achieve more GPU throughput at the expense of potentially slowing down normal operations of the device which now run at a priority lower than your hacked processes.

Regards,
Jacob
ID: 47946
mmonnin

Joined: 2 Jul 16
Posts: 338
Credit: 7,987,341,558
RAC: 158
Message 47947 - Posted: 5 Oct 2017, 14:28:16 UTC - in response to Message 47945.  

Retvari Zoltan wrote:
Using SWAN_SYNC (in combination with no CPU tasks)...

..or just increase priority of GPU tasks.
I wrote earlier in this message how to automate the process of increasing priority of GPU tasks.
I see no reason to abandon computing CPU tasks, if there is a quite simple way to minimize their effect(or other programs, if it's not a dedicated host for computing) on GPU tasks.


Or run #ofCPUThreads minus 2 to leave a full core for GPUGrid if you must. I sure wouldn't castrate CPU production from 7 to 3 tasks for a single GPU task. The 3 would run faster but production most likely would go down overall.

There just needs to be more tasks. Plain and simple. There is a Formula BOINC competition right now for the next 3 days and it's not going to be a competition of who has the most processing power but who can get the most work. :(
ID: 47947
[CSF] Aleksey Belkov

Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Message 47948 - Posted: 5 Oct 2017, 16:54:41 UTC - in response to Message 47946.  

Jacob Klein wrote:

Are you seeing a speedup by hacking the priorities with Process Hacker? And, if so, isn't the real reason that you see it is because you're bumping priorities to be higher than the (Normal priority = 8) tasks that Windows uses for all your other non-BOINC processes?

BOINC is meant to run tasks without interfering with the normal operations of a device. So, I think our defaults work well to do that, and I think your hacking may work well to achieve more GPU throughput at the expense of potentially slowing down normal operations of the device which now run at a priority lower than your hacked processes.

Perhaps I did not express myself clearly.

The method I described ensures that GPU tasks are allocated as much CPU time as they request, while CPU tasks receive whatever resources remain (after processes with higher priority have run).
In my opinion, the SWAN_SYNC method wastes some CPU resources, because a full core is rigidly assigned to each GPU task. In my earlier tests I saw no significant difference between using SWAN_SYNC and raising priority, which made me think that GPU tasks don't actually need that much CPU time. It is therefore sufficient to prioritise GPU tasks over other (non-time-critical) processes to improve performance relative to the standard mode (no SWAN_SYNC, no raised priority).
In my experience, raising the priority of GPU tasks has caused no problems with system responsiveness (in my case it is not a dedicated host) or any other negative effects (unless you try to play a demanding 3D game).

Of course, on different systems and in different usage scenarios, the effect can vary greatly.
I suggest you conduct your own tests on a dedicated host or home/work host.
I believe that this method is particularly useful for those who run GPUGRID on home computers.
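For those wanting to automate this without Process Hacker, here is a minimal sketch using the third-party psutil package — the process-name pattern ("acemd") and the target priority class are assumptions, not details confirmed in this thread:

```python
import sys

def is_gpugrid_task(name, pattern="acemd"):
    # Hypothetical match: GPUGrid science apps are assumed to contain
    # "acemd" in their process name; adjust to what Task Manager shows.
    return pattern in (name or "").lower()

if __name__ == "__main__" and sys.platform == "win32":
    import psutil  # third-party: pip install psutil
    for proc in psutil.process_iter(["name"]):
        if is_gpugrid_task(proc.info["name"]):
            try:
                # Bump from the default Below Normal to Above Normal,
                # the kind of raise discussed above.
                proc.nice(psutil.ABOVE_NORMAL_PRIORITY_CLASS)
            except psutil.Error:
                pass  # process exited or access denied
```

Run it periodically (e.g. via Task Scheduler) so newly started tasks get bumped as well.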
ID: 47948
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47949 - Posted: 5 Oct 2017, 17:02:54 UTC

Thanks for the reply. I'm curious how much of a speedup your process hacking actually gets?
ID: 47949
3de64piB5uZAS6SUNt1GFDU9dRhY
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Message 47950 - Posted: 5 Oct 2017, 22:17:21 UTC

well ..... back to my question and the topic ... is the 1 Work Unit Per GPU rule now set? If so, may I protest. That measure has an enormous impact on my GPU utilization.

I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.
ID: 47950
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47951 - Posted: 5 Oct 2017, 22:41:16 UTC
Last modified: 5 Oct 2017, 22:41:38 UTC

No, I don't think it is set. What evidence do you have? I have a PC with 2 GPUs and 4 GPUGrid GPU tasks.
ID: 47951
Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 47953 - Posted: 6 Oct 2017, 8:54:41 UTC - in response to Message 47950.  
Last modified: 6 Oct 2017, 8:55:24 UTC

is the 1 Work Unit Per GPU rule now set?
It's not set. There was a debate about this earlier when there was a shortage, with minimal response from the staff.
I think it would be much better if low-end cards were refused work from the long queue, since sending them such work is predictably futile given the 5-day deadline.
See this workunit.
ID: 47953
Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 47954 - Posted: 6 Oct 2017, 10:29:24 UTC - in response to Message 47949.  

Thanks for the reply. I'm curious how much of a speedup your process hacking actually gets?
I was experimenting with process priority and CPU affinity on my various CPUs (i7-870, i7-980x, i7-4770k, i3-4160, i3-4360, i7-4930k), but the GPU performance gain from these tricks was negligible compared to the gain from avoiding WDDM and from running fewer (or no) CPU tasks.
This led me to the conclusion that a single PC cannot excel at GPU and CPU performance simultaneously, so there's no need for a high-end CPU (though it should be reasonably up to date) to maximize high-end GPU performance. It is very hard to resist the temptation of running 12 CPU tasks on a very expensive (6-core + HT) CPU, which will reduce GPU performance; thus it's better to build (at least) two separate PCs: one for CPU tasks and one for GPU tasks. This is the reason my PCs have i3 CPUs: I'd rather spend the money on better GPUs than on better CPUs.
Regarding RAC (or PPD): the RAC a high-end GPU loses by running CPU tasks simultaneously on the same PC is much bigger than the RAC those CPU tasks gain, so the overall RAC of the PC will be lower if it runs CPU and GPU tasks at the same time.
Of course, if someone can have only one PC, their options for performance optimization are limited, and my advice is not worth applying (because of the huge loss in the user's overall CPU performance).
Sorry for writing the same thing in different ways so many times, but my experience is confirmed by the fact that my GTX 980 Ti's (driven by i3 CPUs) are competing with GTX 1080 Ti's on the performance page. Though the days of my GTX 980 Ti's (running under Windows XP x64) are numbered, as the next-generation (Volta) GPUs will wash them away from the top list even with WDDM, we can still use Linux to avoid WDDM, so my advice will still stand.
ID: 47954
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47955 - Posted: 6 Oct 2017, 12:57:20 UTC
Last modified: 6 Oct 2017, 12:57:44 UTC

Thanks Retvari.

You must understand that your goals of "maximized GPU performance" are radically different than my goal of "help out lots of projects with science, on my regular PC". I'm attached to all projects, do work for about 12 of them, I happily run lots of CPU tasks alongside the GPU tasks, and I use Windows 10 (Insider - Fast Ring - I love to beta test). I do not care at all about credit, but I do care about utility of helping the projects.

I adjust my settings to allow a little bit of a performance boost for GPUGrid.
My settings are:
- Use app_config to budget 0.500 CPU per GPUGrid task
- Main PC: Set BOINC to use (n-1) % CPUs
- Unattended PCs: Set BOINC to use 100% CPUs
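For reference, an app_config.xml for the first setting might look like this (a sketch; the app name "acemdlong" is hypothetical — use the app names from your own client_state.xml):

```xml
<app_config>
    <app>
        <name>acemdlong</name> <!-- hypothetical app name -->
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage>
            <cpu_usage>0.5</cpu_usage> <!-- budget 0.5 CPU per GPU task -->
        </gpu_versions>
    </app>
</app_config>
```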

One of the main reasons I use (n-1) there, also, is because there has always been a Windows 10 shell priority bug where right-clicking taskbar icons gets a delayed response - sometimes several seconds! That piece of shell code must be running at IDLE priority, and stalls if using 100% CPUs! Using (n-1) alleviates it.

Kind regards,
Jacob
ID: 47955
mesman21

Joined: 16 Apr 09
Posts: 4
Credit: 402,158,602
RAC: 0
Message 47956 - Posted: 6 Oct 2017, 13:32:48 UTC - in response to Message 47955.  

I can understand the different goals and mindsets with all the projects out there. I'd say my goals are more in line with Retvari's: the fastest GPU WU return times possible. With this in mind, I've detached from all CPU projects and have seen improvements. The faster/more GPUs you have in your system, the more this makes sense. An improvement of only 1% in the return times from my pair of GTX 1080s easily provides more RAC than my i7-7700K could running CPU tasks alone.

Regardless of your goals, Aleksey's advice of running "Process Hacker" can be extremely beneficial for those of us running tasks on a non-dedicated machine. For example, I run Plex media server on the same machine, and my WUs would get decimated whenever Plex started transcoding video. The Plex transcoder ran at "Normal" priority and WUs at "Below Normal". With "Process Hacker" I was able to permanently set the Plex transcoder to a lower priority and WUs to a higher one. Now the WUs hardly slow down at all during transcoding, and I've seen no performance reduction in Plex. Thank you!
ID: 47956
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47957 - Posted: 6 Oct 2017, 13:50:23 UTC - in response to Message 47956.  
Last modified: 6 Oct 2017, 13:52:16 UTC

I would encourage you to do some testing, using the following BOINC client options (which you'd use in cc_config.xml), especially <process_priority_special>, which might allow you to do what you want without hacking.

https://boinc.berkeley.edu/wiki/Client_configuration

<no_priority_change>0|1</no_priority_change>

If 1, don't change priority of applications (run them at same priority as client).
NB: This option can, if activated, impact system responsiveness for the user. Default, all CPU science apps run at lowest (idle) priority Nice 15.


<process_priority>N</process_priority>
<process_priority_special>N</process_priority_special>

The OS process priority at which tasks are run. Values are 0 (lowest priority, the default), 1 (below normal), 2 (normal), 3 (above normal), 4 (high) and 5 (real-time - not recommended). 'special' process priority is used for coprocessor (GPU) applications, wrapper applications, and non-compute-intensive applications, 'process priority' for all others. The two options can be used independently. New in 7.6.14
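Combining the two options, a cc_config.xml that keeps CPU tasks at the lowest priority while running GPU and wrapper apps at Above Normal might look like this (a sketch based on the documentation quoted above):

```xml
<cc_config>
    <options>
        <process_priority>0</process_priority>                 <!-- CPU tasks: lowest (default) -->
        <process_priority_special>3</process_priority_special> <!-- GPU/wrapper apps: above normal -->
    </options>
</cc_config>
```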


I bet they work for you!!
Try them and let us know :)
ID: 47957
mesman21

Joined: 16 Apr 09
Posts: 4
Credit: 402,158,602
RAC: 0
Message 47958 - Posted: 6 Oct 2017, 17:30:01 UTC - in response to Message 47957.  

Try them and let us know :)

I tried these changes to cc_config, and they worked for all of the other projects I tried, but not for GPUgrid. I was running CPU tasks on WCG and I run GPU tasks on Einstein when work is low here. I was able to manipulate the priority of each, setting a higher priority for the Einstein GPU tasks. No such luck on GPUgrid tasks; that's why I was so happy to hear about "Process Hacker".

Maybe it's just me, has anyone successfully changed the priority of GPUgrid tasks with cc_config?
ID: 47958
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47959 - Posted: 6 Oct 2017, 17:54:13 UTC - in response to Message 47958.  

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.
ID: 47959
mmonnin

Joined: 2 Jul 16
Posts: 338
Credit: 7,987,341,558
RAC: 158
Message 47960 - Posted: 6 Oct 2017, 21:06:55 UTC - in response to Message 47959.  

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.


Would swan_sync play a role with this priority option?
ID: 47960
mmonnin

Joined: 2 Jul 16
Posts: 338
Credit: 7,987,341,558
RAC: 158
Message 47961 - Posted: 6 Oct 2017, 21:06:57 UTC - in response to Message 47959.  
Last modified: 6 Oct 2017, 21:07:39 UTC

Edit: Double post :(
ID: 47961
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47962 - Posted: 6 Oct 2017, 21:15:45 UTC - in response to Message 47960.  

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.


Would swan_sync play a role with this priority option?


No, I don't think so.
ID: 47962
[CSF] Aleksey Belkov

Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Message 47963 - Posted: 6 Oct 2017, 22:14:46 UTC - in response to Message 47960.  
Last modified: 6 Oct 2017, 22:15:13 UTC

mmonnin wrote:

Would swan_sync play a role with this priority option?

This combination can lead to significant problems with system responsiveness.
ID: 47963
Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 47964 - Posted: 7 Oct 2017, 9:29:52 UTC - in response to Message 47959.  
Last modified: 7 Oct 2017, 10:12:41 UTC

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.
If I recall correctly, this behavior is intentional.
Originally, the GPUGrid app ran at the same process priority level ("Idle") as CPU tasks, but that turned out to hinder GPU performance. Back then <process_priority_special> did not exist (or the staff thought it wouldn't be used by many participants), so the "Below Normal" priority level was hard-coded into the app.

EDIT: it was the result of "iterating" toward the optimal process priority level: when the priority was hard-coded to "Above Normal", some systems (Core2 Duo and Core2 Quad using SWAN_SYNC) became sluggish. While it's not explicitly stated, see this post by GDF (and the whole thread).
EDIT2: See this thread also.
ID: 47964
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47965 - Posted: 7 Oct 2017, 11:47:07 UTC

Well, it should be fixed. If users are willing to set SWAN_SYNC manually via a system environment variable, they can just as well use cc_config manually to control the priority (if the app didn't override it, as it rudely does currently).

Fixable. Some staff required.
ID: 47965

©2025 Universitat Pompeu Fabra