Custom app_info.xml for these pesky low-utilisation 3EKO_paola tasks

Message boards : Number crunching : Custom app_info.xml for these pesky low-utilisation 3EKO_paola tasks
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26697 - Posted: 25 Aug 2012, 12:46:36 UTC

In the previous thread (this one: http://www.gpugrid.net/forum_thread.php?id=3116#26688) we established that the 3EKO Paola tasks have low GPU utilisation. In the case of my GTX 670 on Windows 7, I'm only seeing 30% load, whereas the Nathan tasks for example return 92% load.

Can someone help me make a custom app_info.xml to run multiple instances of 3EKO Paola tasks on each GPU? Ideally we'd make it so that it only changes the <coproc> number of Paola tasks, and does not affect Nathan, Noelia, or normal Paola tasks (the non-3EKO ones that use the whole GPU properly on their own), and doesn't affect the short (acemd) tasks either.

Ideally, please post a complete app_info.xml; I need a starting point. Once I get it running I'll be happy to run some experiments and post to this thread which parameters give the greatest throughput.
ID: 26697
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26705 - Posted: 25 Aug 2012, 23:19:33 UTC

Hmm. Silence. OK, I did all the thinking and testing myself. Here's a working app_info.xml that will run 2 tasks on one GPU. You're welcome :)

<app_info>
  <app>
    <name>acemdlong</name>
    <user_friendly_name>Long runs (8-12 hours on fastest card)</user_friendly_name>
  </app>
  <file_info>
    <name>acemd.2562.cuda42</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cudart32_42_9.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cufft32_42_9.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>tcl85.dll</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>acemdlong</app_name>
    <version_num>615</version_num>
    <platform>windows_intelx86</platform>
    <avg_ncpus>0.36</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <flops>6.0e11</flops>
    <plan_class>cuda42</plan_class>
    <api_version>6.7.0</api_version>
    <coproc>
      <type>CUDA</type>
      <count>.5</count>
    </coproc>
    <gpu_ram>1073741824.00000</gpu_ram>
    <file_ref>
      <file_name>acemd.2562.cuda42</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>cudart32_42_9.dll</file_name>
      <open_name>cudart32_42_9.dll</open_name>
      <copy_file/>
    </file_ref>
    <file_ref>
      <file_name>cufft32_42_9.dll</file_name>
      <open_name>cufft32_42_9.dll</open_name>
      <copy_file/>
    </file_ref>
    <file_ref>
      <file_name>tcl85.dll</file_name>
      <open_name>tcl85.dll</open_name>
      <copy_file/>
    </file_ref>
  </app_version>
</app_info>

Now, I tried to test it, but sadly I kept being given only Nathan tasks. Nathan tasks DO NOT WORK WELL two at a time with this app_info: they make the entire system very slow and unresponsive, and GPU load actually drops to 40%, whereas a single Nathan uses 90% of the GPU. So if you receive Nathan tasks, you'll have to stop two from running at the same time. Either suspend one of them in BOINC, or exit BOINC, open app_info.xml, change the line <count>.5</count> to <count>1</count>, save the file, and start BOINC again. You won't lose data: BOINC only discards tasks the first time it finds an app_info.xml. Changing values within the file (while BOINC is not running) doesn't make it discard tasks.
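If it helps to see it in isolation, the only thing that changes when toggling between one and two tasks per GPU is the <count> value inside the <coproc> block of app_info.xml:

```xml
<coproc>
  <type>CUDA</type>
  <!-- .5 tells BOINC each task uses half a GPU, so it schedules two at once.
       Change to 1 (with BOINC stopped) to go back to one task per GPU. -->
  <count>.5</count>
</coproc>
```

Everything else in the file stays the same between the two configurations.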

I have yet to receive any more 3EKO Paola tasks, so I can't test this app_info.xml with them. But if anyone reading this could try, please report back to us if things improve.
ID: 26705
flashawk

Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26708 - Posted: 26 Aug 2012, 18:15:48 UTC - in response to Message 26705.  

How did you make out? I noticed that they started sending out another batch of 3EKO Paola tasks.

I had one take over 27 hours on my GTX550Ti. This must be close to what it was like years ago on this project (I'm assuming) with the older cards: it could have taken days to finish a task for not many points, whereas the new cards now rip through them, in some cases in just a few hours.
ID: 26708
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26709 - Posted: 27 Aug 2012, 0:40:58 UTC - in response to Message 26708.  

Heh. Well, using the posted app_info does indeed run two tasks at the same time, but it's a disaster with Nathan tasks. I never actually received any Paola tasks to test it with. And yesterday I basically migrated to a project that gives me far more points per hour on my GTX 670 (once you make a few modifications, that is!). But thanks to your message I'm going to try to resume GPUgrid and get two Paola tasks.

I'll report back if I manage. Most likely though I'm not going to wait for them to finish. I'll time how long it takes to get to say 5% or 10% and extrapolate from there to give an estimate of total completion time.
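That extrapolation is just proportional scaling. As a quick sketch (the function name is mine, and it assumes the task progresses at a roughly constant rate, which holds reasonably well for these fixed-length simulations):

```python
def estimate_total_minutes(percent_done: float, elapsed_minutes: float) -> float:
    """Linearly extrapolate total runtime from early progress."""
    if percent_done <= 0:
        raise ValueError("need some progress before extrapolating")
    return elapsed_minutes * 100.0 / percent_done

# e.g. a task at 1.546% after 5 minutes
print(round(estimate_total_minutes(1.546, 5)))  # -> 323 minutes
```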
ID: 26709
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26710 - Posted: 27 Aug 2012, 0:52:55 UTC - in response to Message 26708.  

OK, I got one Nathan and one Paola. Running them both at the same time and no other projects I get:

CPU usage 20%
GPU load 99%

After 5 minutes:
Nathan progress: 1.546% (estimated 323 minutes or 5.39 hours to completion - usually takes me 4.84 hours when running alone)
Paola progress: 0.386% (estimated 1295 minutes or 21.5 hours to completion - usually takes me 11.51 hours when running it alone)

Now I'll try and get two paola tasks running at the same time.
ID: 26710
nate

Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Message 26711 - Posted: 27 Aug 2012, 9:39:55 UTC - in response to Message 26710.  

Thanks for keeping up on this and trying a few ideas, Luke. I'm looking at this again today and hopefully I can come up with some good news. :-/
ID: 26711
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26712 - Posted: 27 Aug 2012, 10:04:59 UTC - in response to Message 26711.  

You're welcome :)

I couldn't get a second Paola to try. I kept aborting Nathans and updating in the hopes of being given a second Paola, but the server kept sending me more Nathan tasks. I aborted a total of 3 of them and gave up for now.
ID: 26712
flashawk

Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26719 - Posted: 27 Aug 2012, 19:31:51 UTC - in response to Message 26712.  

I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.
ID: 26719
neilp62

Joined: 23 Nov 10
Posts: 14
Credit: 8,017,535,732
RAC: 0
Message 26725 - Posted: 28 Aug 2012, 6:21:47 UTC

Luke:

Thank you for your efforts on this issue!

I'd received three 3EKO Paola tasks almost in a row, so I adapted your app_info.xml to my 4GB GTX 680 setup early this morning. Before, my single-Paola average CPU/GPU utilisation was ~70%/32%, with an average GPU runtime of ~17.5 hours (over four Paolas), regardless of whether my other three CPU cores were processing other BOINC WUs or idle (the GTX 680 runs at a factory clock of 1188MHz but typically underclocks itself below that; the CPU is an older Intel Quad Q9300 @ 2.5GHz). After restarting BOINC with GPUGRID <count>.5</count>, I received two 3EKO Paolas. I am now about 70% complete on both Paola tasks, averaging ~80% CPU core utilisation per Paola, with total GPU utilisation of ~56%. Total GPU runtime is projected at ~16.75 hrs per Paola, with total GPU memory usage of ~1400 MB.

Looks like my setup sees a slight runtime improvement with two 3EKO Paola GPU tasks in parallel. I'm a lot happier with the effective throughput than with one at a time, and will tolerate having to swap app_info <count> values as needed. I only wish the project would permit three 3EKO Paolas at once :)
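A quick back-of-the-envelope check of that gain, using the figures above (the helper name is mine; it assumes both concurrent tasks finish at roughly the same time):

```python
def throughput_gain(solo_hours: float, paired_hours: float) -> float:
    """Ratio of tasks-per-hour running two concurrently vs. one at a time.

    solo_hours:   runtime of one task with the GPU to itself
    paired_hours: runtime of each task when two run concurrently
    """
    solo_rate = 1.0 / solo_hours      # tasks per hour, one at a time
    paired_rate = 2.0 / paired_hours  # two tasks complete together
    return paired_rate / solo_rate

# GTX 680 figures above: ~17.5 h solo vs ~16.75 h each when paired
print(f"{throughput_gain(17.5, 16.75):.2f}x")  # prints 2.09x
```

So each task only finishes slightly sooner, but effective throughput roughly doubles because two complete in the time one used to.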

Let me know if there are any other data points I can provide that might be helpful. Thanks again!
ID: 26725
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26726 - Posted: 28 Aug 2012, 10:39:18 UTC - in response to Message 26719.  

I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.


I never thought of that. Does it work for you? Which two folders do you mean when you say "both" BOINC folders?

I don't know if something will go wrong at upload time because of external files that boinc keeps task lists in. But your suggestion would be good to try for someone who doesn't mind potentially losing two tasks' worth of work (lately I've become one of these people because in the course of testing, I've been aborting more tasks than I've been completing hehe).
ID: 26726
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26727 - Posted: 28 Aug 2012, 10:54:17 UTC - in response to Message 26725.  

Luke:

Thank you for your efforts on this issue!

I'd received three 3EKO Paola tasks almost in a row, so I adapted your app_info.xml to my 4GB GTX 680 setup early this morning. Before, my single-Paola average CPU/GPU utilisation was ~70%/32%, with an average GPU runtime of ~17.5 hours (over four Paolas), regardless of whether my other three CPU cores were processing other BOINC WUs or idle (the GTX 680 runs at a factory clock of 1188MHz but typically underclocks itself below that; the CPU is an older Intel Quad Q9300 @ 2.5GHz). After restarting BOINC with GPUGRID <count>.5</count>, I received two 3EKO Paolas. I am now about 70% complete on both Paola tasks, averaging ~80% CPU core utilisation per Paola, with total GPU utilisation of ~56%. Total GPU runtime is projected at ~16.75 hrs per Paola, with total GPU memory usage of ~1400 MB.

Looks like my setup sees a slight runtime improvement with two 3EKO Paola GPU tasks in parallel. I'm a lot happier with the effective throughput than with one at a time, and will tolerate having to swap app_info <count> values as needed. I only wish the project would permit three 3EKO Paolas at once :)

Let me know if there are any other data points I can provide that might be helpful. Thanks again!


Welcome :). What adaptations did you have to make to run it on your 680? Would be good to compare notes.

The throughput does rise considerably when you run multiple tasks. This is why I hate GPUgrid's enforced limit of 2 tasks. In fact I'm no longer participating in GPUgrid. I'm running a project right now that is giving me 10 points per second with an experimental app_info.xml. I've tried up to 9 concurrent tasks on my single GTX 670 but the CPU becomes the limiting factor. GPUgrid was giving me 1, 2 or 4 points per second depending on task. My plan is to keep this up until I take #1 spot on the country-wide Malta leaderboard (should take a few days now, I've been #2 for a month because #1 guy earns 300,000 points a day and I was only earning 375,000 with GPUgrid. I'm now earning 800,000 a day just by changing project!!). Then I will temporarily switch back to running GPUGrid until I get my 10 million point badge (should take 13 days). And then I'll either give up GPUgrid for good or split the time between two projects.

I wonder if the two task limit is per-GPU? Would someone with many GPUs in their computer tell us? Because if so, we could perhaps trick BOINC into reporting a GPU count of 2 so that we could get four tasks. Another idea would be to move all the tasks out of the BOINC project folder. Then do a scheduler request so that BOINC downloads two new tasks. Then close boinc and move the two old tasks back into the folder, and hopefully you'd have four tasks in the queue.

Perhaps there's also a way to run many instances of BOINC on the same PC, and each one would act as a separate host and have an individual limit of 2 tasks, so maximum tasks would be 2 x number of BOINC instances.

What I know for sure is that the 2 task limit is not something they're willing to change. I had posted about this a few weeks ago and that was the answer I got. They need speedy returns of tasks (because one builds on the results of a previous one) so that's how they make it happen.
ID: 26727
Snow Crash

Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26731 - Posted: 28 Aug 2012, 14:33:59 UTC

I wonder if the two task limit is per-GPU?
Yes, it is 2 WU per GPU.

Perhaps the project would consider using the BETA queue for Paola tasks so we could isolate these task types and get multiple WUs going per GPU to help speed up the research? I know this requires a bit of work for both the project and for crunchers, only some of whom will understand enough to get an app_info working properly. I for one would gladly participate on my 3 GPUs (GTX670, GTX660Ti, GTX480). I might even consider resurrecting my GTX295 just for these tasks :-)

Side note: my PC crashed hard last night and dropped the GPUGrid entries entirely, so while GPUGrid thinks I have tasks in progress, I actually don't. I think I may have to wait until they clear (no reply) before I can get more, as I don't feel like changing the machine name just to trick the project into giving me more right away.
Thanks - Steve
ID: 26731
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26732 - Posted: 28 Aug 2012, 14:47:51 UTC - in response to Message 26731.  


Perhaps the project would consider using the BETA queue for Paola tasks so we could isolate these task types and get multiple WUs going per GPU to help speed up the research? I know this requires a bit of work for both the project and for crunchers, only some of whom will understand enough to get an app_info working properly. I for one would gladly participate on my 3 GPUs (GTX670, GTX660Ti, GTX480). I might even consider resurrecting my GTX295 just for these tasks :-)


I don't know if putting Paola tasks in the beta queue is such a good idea, because they'd lose a lot of throughput. I don't think beta is activated by default when you sign up, and I think many crunchers don't tick that checkbox because of the higher risk of getting a computation error and not receiving any points for beta tasks.

The ideal solution is for them to create a third queue (right alongside acemd2 and acemdlong), and make <coproc> number equal to 0.5 from THEIR side. Then everyone would be running two Paolas without having to mess around with any app_info.xml files themselves!
ID: 26732
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26733 - Posted: 28 Aug 2012, 15:05:43 UTC

Just tried removing the currently running tasks from the project folder in BOINC and restarting BOINC to get it to download more tasks. It didn't work: BOINC just re-downloads the two tasks I removed from the folder. We need to find another way of getting more than 2 tasks downloaded.
ID: 26733
Snow Crash

Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 26735 - Posted: 28 Aug 2012, 17:37:21 UTC

I'm not part of the project team, I'm not saying you don't have good ideas, and I really would like to see better efficiency, but I'm aiming for an approach that I think actually has a chance of happening.

I have been convinced over many conversations that the administrative overhead of adding another queue (beyond the effort just to get it up and running), combined with the general slowdown of the entire project from raising the max WU count, simply isn't in the best interest of the project's long-term goals. This run of WUs is limited and will be through the pipeline in the near future, but decisions taken to handle specific edge conditions have to be managed for the long haul, and they make any future upgrade all the more difficult to manage too.

I suggested using BETA because that queue already exists, and I believe its users are the very ones most likely to be running high-end cards and to have the ability to use a custom app_info file successfully. If this was in place and we brought it back to our respective teams and actively recruited high-end cards and team members, we CAN get the WUs processed for Paola.
Thanks - Steve
ID: 26735
flashawk

Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 26738 - Posted: 28 Aug 2012, 19:27:07 UTC - in response to Message 26726.  

I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.


I never thought of that. Does it work for you? Which two folders do you mean when you say "both" BOINC folders?

I don't know if something will go wrong at upload time because of external files that boinc keeps task lists in. But your suggestion would be good to try for someone who doesn't mind potentially losing two tasks' worth of work (lately I've become one of these people because in the course of testing, I've been aborting more tasks than I've been completing hehe).


BOINC creates 2 main folders when you install it: 1 in Program Files, and the other (its exact location depends on your OS, but it's on the same drive) labeled Data. I copy and paste everything out of those 2 folders into 2 other folders that I created, compress them, and store them on my data server. If anything goes wrong, you can copy and paste it all back as many times as you like.
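That routine can be scripted. Here's a small sketch (the function name and paths are mine, not part of BOINC; point it at wherever your program and data folders actually are):

```python
import shutil
from pathlib import Path

def backup_boinc(program_dir: str, data_dir: str, dest_dir: str) -> list[str]:
    """Zip the BOINC program and data folders so both can be
    restored later, as many times as needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archives = []
    for label, folder in [("boinc-program", program_dir), ("boinc-data", data_dir)]:
        # shutil.make_archive appends the .zip extension itself
        archives.append(shutil.make_archive(str(dest / label), "zip", folder))
    return archives

# Hypothetical default Windows locations:
# backup_boinc(r"C:\Program Files\BOINC", r"C:\ProgramData\BOINC", r"D:\backups")
```

Restoring is then just unzipping both archives back into place with BOINC stopped.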
ID: 26738
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26740 - Posted: 28 Aug 2012, 20:02:34 UTC - in response to Message 26738.  

I would try backing up both BOINC folders; you can reuse the same ones as many times as you like without aborting new tasks.


I never thought of that. Does it work for you? Which two folders do you mean when you say "both" BOINC folders?

I don't know if something will go wrong at upload time because of external files that boinc keeps task lists in. But your suggestion would be good to try for someone who doesn't mind potentially losing two tasks' worth of work (lately I've become one of these people because in the course of testing, I've been aborting more tasks than I've been completing hehe).


BOINC creates 2 main folders when you install it: 1 in Program Files, and the other (its exact location depends on your OS, but it's on the same drive) labeled Data. I copy and paste everything out of those 2 folders into 2 other folders that I created, compress them, and store them on my data server. If anything goes wrong, you can copy and paste it all back as many times as you like.


Ahh, you mean the BOINC program and BOINC data folders. Yes, that might work! Thanks for the idea.
ID: 26740
Luke Formosa

Joined: 11 Jul 12
Posts: 32
Credit: 33,298,777
RAC: 0
Message 26741 - Posted: 28 Aug 2012, 20:11:44 UTC - in response to Message 26735.  

I'm not part of the project team, I'm not saying you don't have good ideas, and I really would like to see better efficiency, but I'm aiming for an approach that I think actually has a chance of happening.

I have been convinced over many conversations that the administrative overhead of adding another queue (beyond the effort just to get it up and running), combined with the general slowdown of the entire project from raising the max WU count, simply isn't in the best interest of the project's long-term goals. This run of WUs is limited and will be through the pipeline in the near future, but decisions taken to handle specific edge conditions have to be managed for the long haul, and they make any future upgrade all the more difficult to manage too.

I suggested using BETA because that queue already exists, and I believe its users are the very ones most likely to be running high-end cards and to have the ability to use a custom app_info file successfully. If this was in place and we brought it back to our respective teams and actively recruited high-end cards and team members, we CAN get the WUs processed for Paola.


Of course, there's always option 3: they could let us have more than two workunits in the queue! Maybe just add a tickbox somewhere and explain why it's not on by default and why they want it limited to 2 tasks... but at least give us the option for extraordinary circumstances such as these :)
ID: 26741
neilp62

Joined: 23 Nov 10
Posts: 14
Credit: 8,017,535,732
RAC: 0
Message 26747 - Posted: 29 Aug 2012, 7:08:35 UTC - in response to Message 26727.  

What adaptations did you have to make to run it on your 680? Would be good to compare notes.


Just small changes specific to my rig based on my GPUGRID sched_request file, i.e. values for <avg_ncpus>, <flops>, <gpu_ram>, etc. I have not attempted any further optimization.
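For anyone making the same adaptation, these are the host-specific fields inside the <app_version> block. The values shown are the ones from the app_info posted earlier in this thread, marked only to show what to replace with the numbers from your own GPUGRID sched_request file:

```xml
<!-- Replace each value with the one reported for your host -->
<avg_ncpus>0.36</avg_ncpus>             <!-- fraction of a CPU core each task uses -->
<flops>6.0e11</flops>                   <!-- speed estimate, used for runtime predictions -->
<gpu_ram>1073741824.00000</gpu_ram>     <!-- GPU memory per task, in bytes (1 GB here) -->
```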
ID: 26747
spout23

Joined: 8 May 12
Posts: 2
Credit: 304,211,223
RAC: 0
Message 26802 - Posted: 6 Sep 2012, 16:49:18 UTC

Luke

Where does the app_info.xml file go? Is it the ProgramData directory that the GPUGRID application works from?

I know it does not go where the cc_config file lives in the BOINC program folder.

So in other words, where are you or other people putting the file to have 2 WUs work at once?

Thanks
spout23
ID: 26802

©2025 Universitat Pompeu Fabra