Running multiple tasks per GPU

Message boards : Number crunching : Running multiple tasks per GPU - count=0.5

Author	Message
Richard Haselgrove Joined: 11 Jul 09 Posts: 1620 Credit: 8,932,421,670 RAC: 18,319,984	Message 23822 - Posted: 7 Mar 2012 | 19:29:25 UTC
	Has anyone ever tried this at GPUGrid? It's commonplace at other projects with smaller tasks, and Einstein@home have just announced an experiment in enabling the option via project preferences. I'm going to try it overnight on my GTX 470, first of all with small-task projects (SETI and Einstein). If that works well, and if nobody here comes up with any show-stoppers, I'll try adding GPUGrid to the mix tomorrow or the next day. I'm planning to set count=0.48 for the other two projects, and count=0.51 here. That would allow any combination of two tasks from {Einstein, GPUGrid, SETI} to run, except two GPUGrid tasks simultaneously (0.51 + 0.51 exceeds one whole GPU, while every other pair sums to less than 1). I'm currently running short tasks only on that host, and they take no longer than 5 hours. Since I'm already alternating the tasks with SETI, I don't expect that my return-rate will be made much, if any, slower by running concurrently - but testing that expectation is part of the reason for my experiment. Comments?
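The count arithmetic in the post above can be checked with a short sketch. It assumes a simplified model in which tasks may share one GPU as long as their count values sum to no more than 1; the dictionary and function names are illustrative, and the real BOINC client scheduler weighs other factors as well.

```python
from itertools import combinations_with_replacement

# GPU fraction reserved per task, using the values proposed in the post.
COUNT = {"Einstein": 0.48, "GPUGrid": 0.51, "SETI": 0.48}


def fits_on_one_gpu(projects, capacity=1.0):
    """Assumed rule: tasks can run together while their counts sum to <= capacity."""
    # A tiny tolerance guards against floating-point rounding at the boundary.
    return sum(COUNT[p] for p in projects) <= capacity + 1e-9


def allowed_pairs():
    """Return every two-task combination (same project allowed) that fits."""
    pairs = combinations_with_replacement(sorted(COUNT), 2)
    return [pair for pair in pairs if fits_on_one_gpu(pair)]


if __name__ == "__main__":
    for pair in allowed_pairs():
        print(pair, round(sum(COUNT[p] for p in pair), 2))
```

Under this model every pair is admitted except GPUGrid with GPUGrid, and no combination of three tasks fits, since even three tasks at 0.48 would need 1.44 GPUs.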

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 23823 - Posted: 7 Mar 2012 | 23:28:32 UTC - in response to Message 23822.
	This is the latest thread on the subject. You would at least need to change the app name. I made a suggestion in the past about running two tasks simultaneously, but it's not practical; here it's one app for many task types, and no task selection. Some tasks run at 99% GPU utilisation and thus wouldn't benefit; even at 85% the result would be about the same. Running other projects might slow the tasks down massively. We would really need greater modulation control of the GPU, possibly through Boinc, to better manage mixed projects. That is more attainable for AMD, and possibly for forthcoming NVidia cards. At present it's not like you can say project A, use 128 cuda cores, and project B, use the remaining 320.

Richard Haselgrove Joined: 11 Jul 09 Posts: 1620 Credit: 8,932,421,670 RAC: 18,319,984	Message 23825 - Posted: 7 Mar 2012 | 23:51:04 UTC - in response to Message 23823.
	I've already run tasks for 'ACEMD2: GPU molecular dynamics v6.16', so I have the executables and the <app_version> segment, which will allow me to construct app_info.xml with no bother at all. Been there, done that. I'm not specifically asking about 'benefit', certainly not for the GPUGrid processing itself. I'm more thinking about the other projects. Both SETI (during the startup phase of each - short - task), and Einstein (throughout the task) have periods when the app doesn't make efficient use of the CUDA cores - either because data is being pre-processed and loaded into video memory by the CPU, or because some parts of the calculation are not amenable to parallelisation and have been deliberately left for CPU processing. Running two tasks at once allows one of them to make use of the CUDA cores while the other is attending to other matters. My question is: can GPUGrid join the party without detriment?

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 23826 - Posted: 8 Mar 2012 | 1:54:05 UTC - in response to Message 23822.
	Think I tried this with MW+GPUGrid, and it was pointless; no benefit, but apps change and you're not talking about MW. Also tried GPUGrid + the non-Boinc GPU project, and found that GPUGrid tasks ran extremely slowly. Priorities might give some control. How GPUGrid tasks perform will be partially down to the other project. The issue with GPUGrid is that there are tasks that run from ~75% right up to 99% and you can't pick and choose. So expect variation in the performance of the other projects. As Ignasi's tasks use more CPU, that could be a factor too. So it really is test and see. Good luck.

nenym Joined: 31 Mar 09 Posts: 137 Credit: 1,308,230,581 RAC: 0	Message 23828 - Posted: 8 Mar 2012 | 10:49:21 UTC Last modified: 8 Mar 2012 | 10:51:28 UTC
	A bit OT: I ran many tests on this (9600GT, GTX 260, GTX 560Ti CC2.1) one year ago. The most useful partner for Einstein or SETI seemed to be F@H. I found it more efficient than concurrently running Einstein&Einstein, Einstein&SETI or SETI&SETI, due to the possibility of setting the GPU load in the F@H client. GPUGRID I didn't test.

Richard Haselgrove Joined: 11 Jul 09 Posts: 1620 Credit: 8,932,421,670 RAC: 18,319,984	Message 23830 - Posted: 8 Mar 2012 | 14:16:33 UTC - in response to Message 23828.
	It'll always be a bit of a struggle on a 9600 or a 260, because they don't have the hardware support for context switching that came in with the Fermis. The first one is running now - well, walking, but at least it's not crawling. 17.5% in 1 hr 20 min. I need to check my history on that type of task, but it feels better than half-speed, which is good enough for me so far. I can try other settings later.
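The progress figure quoted above (17.5% in 1 hr 20 min) can be turned into a rough completion estimate. This is a minimal sketch that assumes progress stays linear for the whole task, which may not hold in practice; the function name is hypothetical.

```python
def projected_runtime_seconds(elapsed_seconds, fraction_done):
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in the range (0, 1]")
    return elapsed_seconds / fraction_done


if __name__ == "__main__":
    elapsed = 1 * 3600 + 20 * 60  # 1 hr 20 min expressed in seconds
    total = projected_runtime_seconds(elapsed, 0.175)
    print(f"Projected total: {total / 1000:.1f} Ksec")
```

On those numbers the extrapolation comes to roughly 27.4 Ksec for the whole task.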

Richard Haselgrove Joined: 11 Jul 09 Posts: 1620 Credit: 8,932,421,670 RAC: 18,319,984	Message 23836 - Posted: 8 Mar 2012 | 22:10:11 UTC
	Well, the experimental host has completed A565-TONI_AGG1CMAP-5-50-RND4946_2. As you can see, it's a _2 resend, so two hosts have failed before me - so the accepted result has to be a bonus. It took just over 30 Ksec, compared with a usual ~13.2 Ksec when running these tasks on their own (unshared). So less than half speed, which won't impress the chasers after efficiency, but not a complete disaster either. I'll let the next task - a TONI_AGGMI - run in the same configuration, to get a second comparison point. Then, I'll try freeing up a CPU core, and see what difference it makes.

Richard Haselgrove Joined: 11 Jul 09 Posts: 1620 Credit: 8,932,421,670 RAC: 18,319,984	Message 23837 - Posted: 9 Mar 2012 | 9:16:34 UTC
	Second task complete - A1823-TONI_AGGMI1-34-100-RND3451_3. Again, I succeeded with a task that two hosts had crashed on (and a third had timed out on) before me. So, worthwhile for the science. This one took 41 Ksec, against ~17.5 Ksec, so fairly consistently taking about 2.3 times as long as a task running alone. Not worth doing normally, unless you're interested in these sorts of multi-project experiments. Next, to see what happens with a free CPU.
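The two measurements reported above can be compared with a few lines of arithmetic. This sketch only restates the posted figures; the first shared time is entered as 30.0 Ksec although the post says "just over 30", so that ratio is a slight underestimate, and the dictionary layout and function name are illustrative.

```python
# Elapsed times in kiloseconds, taken from the two posts above.
RUNS = {
    "TONI_AGG1CMAP": {"shared": 30.0, "alone": 13.2},
    "TONI_AGGMI1": {"shared": 41.0, "alone": 17.5},
}


def slowdown(shared, alone):
    """How many times longer the task took when sharing the GPU."""
    return shared / alone


if __name__ == "__main__":
    for name, times in RUNS.items():
        print(f"{name}: {slowdown(**times):.2f}x as long when sharing the GPU")
```

Both ratios land close to the "about 2.3 times" quoted in the post, so the GPUGrid task ran at a little under half its unshared speed in each case.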

Damaraland Joined: 7 Nov 09 Posts: 152 Credit: 16,181,924 RAC: 0	Message 24060 - Posted: 20 Mar 2012 | 23:32:17 UTC
	I did some testing. I found it easy on Einstein, since their units are all the same size. Here it looks more complicated. The results are very good, since the standard deviation is very small. http://einstein.phys.uwm.edu/forum_thread.php?id=9361&nowrap=true#116431
