New CPU work units

TJ

Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 38466 - Posted: 13 Oct 2014, 17:56:19 UTC
Last modified: 13 Oct 2014, 17:57:41 UTC

Found it, I have some data

Step Time Lambda
4891000 9782.00000 0.00000
(99.895% done, 5CPU, 24:23:20h)

Step Time Lambda
3856000 7712.00000 0.00000
(99.887% done, 5CPU, 23:53:20h)

Step Time Lambda
2938000 5876.00000 0.00000
(97.075% done, 4CPU, 18:51:13h)

So these will take more time to finish. On one rig I got 4 more; they will not meet the deadline of 17 October.
Greetings from TJ
ID: 38466 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
TJ

Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 38467 - Posted: 13 Oct 2014, 18:56:44 UTC

Finally got the first one finished in 24h55m!
Greetings from TJ
ID: 38467 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38472 - Posted: 13 Oct 2014, 20:29:02 UTC - in response to Message 38451.  

exapower,

Actually it's dihydrofolate reductase http://en.wikipedia.org/wiki/Dihydrofolate_reductase

Matt
ID: 38472 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38473 - Posted: 13 Oct 2014, 20:31:04 UTC - in response to Message 38467.  

TJ - odd that it ran with 5 threads -- do you have a CPU limit set ?

Matt
ID: 38473 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38474 - Posted: 13 Oct 2014, 20:56:11 UTC

For the WUs going on tonight I've increased the expected compute cost by 10x, and the deadline to 7 days.

Matt
ID: 38474 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
eXaPower

Send message
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Level
His
Message 38477 - Posted: 13 Oct 2014, 22:23:32 UTC - in response to Message 38472.  
Last modified: 13 Oct 2014, 23:23:17 UTC

Thanks for the correction. Gives me another enzyme to learn about. These CPUMD tasks on Ivy Bridge with 4 threads look to have a ~48 hr runtime. My Westmere-generation Pentium's runtime is about 90 hr. (It is really a dual-core Xeon L3403 with a 30 W TDP.) The reason why: a 4.8 GT/s QPI internal link and an external DMI link. No Westmere Pentium has QPI, only DMI links. Intel has been rebadging Xeons for a while now.

I just received a new task with the 10x compute cost: 1084 hours estimated runtime! The old task estimated a 5 hr runtime while taking ~48 hr to finish. Going to be very interesting.
http://www.gpugrid.net/workunit.php?wuid=10165559

Edit: Task http://www.gpugrid.net/workunit.php?wuid=10157944 is now running in high priority mode, kicking one of my two GPU tasks into the "waiting to run" state. None of my BOINC settings impose any limits. Manually restarting the task and shutting off SLI has no effect. This started when the task went into high priority mode after I downloaded a new CPUMD task with the high compute cost.

I aborted http://www.gpugrid.net/workunit.php?wuid=10165559 and the GPU task that was waiting to run started back up. A new task http://www.gpugrid.net/workunit.php?wuid=10165641 downloaded and stopped the GPU task again, with BOINC showing it as "waiting to run". I aborted that task and the second GPU started its task again.
What's changed from the prior batch of CPUMD?
Having a new 10x compute cost task in the cache suspends one of the two running GPU tasks. If no 10x compute cost tasks are in the cache, both GPU tasks run normally.
All this happens while the older CPUMD task is running in high priority mode, with two GPU SDOERR tasks and one 10x compute cost CPUMD task in the cache.
ID: 38477 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
TJ

Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 38478 - Posted: 13 Oct 2014, 22:25:22 UTC - in response to Message 38473.  

TJ - odd that it ran with 5 threads -- do you have a CPU limit set ?

Matt

Yes, it is set to 70% so that 2 threads are free to feed the GPUs and 1 is left for the system to stay responsive, so I can use this site and do some other things.

I have another rig set to 100 so the next one should run at 100% when the current one has finished.
Greetings from TJ
ID: 38478 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Chilean
Avatar

Send message
Joined: 8 Oct 12
Posts: 98
Credit: 385,652,461
RAC: 0
Level
Asp
Message 38481 - Posted: 14 Oct 2014, 0:57:49 UTC
Last modified: 14 Oct 2014, 1:08:25 UTC

Running one @ 8 threads (4 cores + HT).

EDIT: is there a way to let it use only 6 cores and free up two cores (so vLHC can run as well...)?
ID: 38481 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jacob Klein

Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 38482 - Posted: 14 Oct 2014, 2:18:51 UTC
Last modified: 14 Oct 2014, 2:23:16 UTC

Hmm... I got a task, which will correctly run at 6 of my 8 CPUs on my rig (since I'm using "Use at most 75% CPUs" to presently accommodate 2 VM tasks outside of BOINC):
http://www.gpugrid.net/workunit.php?wuid=10166037

BUT...

It immediately started running in "High Priority" "Earliest Deadline First" mode. Could this mean that the 1-week-deadline is too short? The estimated runtime is 1130+ hours, which is, uhh, 6.7 weeks? Does that sound right?

This configuration must be incorrect. The deadline should never be less than the expected initial runtime, right?
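
To put numbers on that, here is a rough sketch of the arithmetic only. It is not the BOINC client's actual scheduling logic, which weighs more than these two values, and the function names are made up for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def estimate_in_weeks(estimated_hours: float) -> float:
    """Convert an estimated runtime in hours to weeks of continuous crunching."""
    return estimated_hours / HOURS_PER_WEEK


def fits_deadline(estimated_hours: float, deadline_days: float) -> bool:
    """True if the estimate fits inside the deadline, assuming 24/7 operation."""
    return estimated_hours <= deadline_days * 24


if __name__ == "__main__":
    print(f"{estimate_in_weeks(1130):.1f} weeks")  # 6.7 weeks
    print(fits_deadline(1130, 7))                  # False
```

With a 1130 hour estimate and a 7 day (168 hour) deadline the check fails by a wide margin, which would be consistent with the client switching to deadline-driven mode straight away.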
ID: 38482 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38484 - Posted: 14 Oct 2014, 7:08:38 UTC - in response to Message 38482.  

Yeah, no idea what went wrong with the runtime estimates. The last set got an estimate of 5h, but when I increased the cost by 10x the estimate went up 200x.

Matt
ID: 38484 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
eXaPower

Send message
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Level
His
Message 38487 - Posted: 14 Oct 2014, 9:41:02 UTC
Last modified: 14 Oct 2014, 9:41:47 UTC

Task http://www.gpugrid.net/workunit.php?wuid=10157944 has been at 99.978% complete with an estimated 3 minutes left for the last 12 hr, and currently has 2 million steps to go before it finishes. This task has been past 98% for about 25 hr out of 32 hr of running. The newer CPUMD tasks sitting in the cache stall one of the GPU tasks, keeping it in "waiting to run" mode. Once the 10x compute cost task is booted, all is normal again.
ID: 38487 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
TJ

Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 38489 - Posted: 14 Oct 2014, 12:38:52 UTC - in response to Message 38481.  

Running one @ 8 threads (4 cores + HT).

EDIT: is there a way to let it use only 6 cores and free up two cores (so vLHC can run as well...)?

Yes, that can be achieved, but not with the one currently running.
Eight threads, so 12.5% per thread. You want to use 6 of them, thus 12.5 × 6 = 75.
So in BOINC Manager go to Tools, Computing preferences, and at the bottom, under "On multiprocessor systems, use at most", set it to 75%.

This works only for new WUs that have not been started. Once your current WU has finished, the next one will use only 75%, thus 6 threads.
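
The same arithmetic as a small sketch. The helper names are hypothetical, and rounding down to a whole thread is an assumption on my part, though it matches the 5 threads seen earlier with a 70% setting on 8 threads.

```python
import math


def percent_for_threads(total_threads: int, wanted_threads: int) -> float:
    """Percentage to enter in 'use at most' so that wanted_threads are used."""
    return 100.0 * wanted_threads / total_threads


def threads_for_percent(total_threads: int, percent: float) -> int:
    """Threads usable at a given percentage.

    Assumption: the result is rounded down, with a minimum of 1 thread.
    """
    return max(1, math.floor(total_threads * percent / 100.0))


if __name__ == "__main__":
    print(percent_for_threads(8, 6))   # 75.0
    print(threads_for_percent(8, 75))  # 6
    print(threads_for_percent(8, 70))  # 5
```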
Greetings from TJ
ID: 38489 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38490 - Posted: 14 Oct 2014, 12:47:54 UTC - in response to Message 38489.  

I am thinking about hard-coding the CPU use to [number-of-CPU-cores] - [number-of-GPUs]. Opinions?

Matt
ID: 38490 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38491 - Posted: 14 Oct 2014, 12:47:59 UTC - in response to Message 38489.  

I am thinking about hard-coding the CPU use to [number-of-CPU-cores] - [number-of-GPUs]. Opinions?

Matt
ID: 38491 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
captainjack

Send message
Joined: 9 May 13
Posts: 171
Credit: 4,594,296,466
RAC: 130,244
Level
Arg
Message 38493 - Posted: 14 Oct 2014, 12:56:51 UTC

Running one @ 8 threads (4 cores + HT).

EDIT: is there a way to let it use only 6 cores and free up two cores (so vLHC can run as well...)?


Another way to accomplish this is to set up an app_config.xml file like the following:

<app_config>
    <app>
        <name>android</name>
        <max_concurrent>8</max_concurrent>
    </app>
    <app_version>
        <app_name>cpumd</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>6</avg_ncpus>
        <cmdline>--nthreads 6</cmdline>
    </app_version>
</app_config>


The "avg_ncpus" parameter sets the number of threads reserved in the BOINC scheduler, and the "--nthreads" parameter sets the number of threads you want a task to use when it runs.

The app_config.xml file goes into the /Boinc/projects/www.gpugrid.net folder.

And just as a reminder, if you are using Windows, use Notepad to edit the app_config.xml file. Do not use Word, as it puts in extra formatting characters that confuse the XML interpreter. If you are using Ubuntu, use Gedit to edit the app_config.xml file.

That will leave 2 threads free for BOINC to schedule other tasks.

Sometimes you can make this effective by opening BOINC Manager and clicking on "Advanced" then "Read config files". In the message log, you should get a message that says something like app_config.xml found for www.gpugrid.net. You might need to shut down BOINC and start it back up again to make the app_config.xml effective. Then the parameters will apply to any new work that is downloaded after the parameters are effective.

Hope that helps.
ID: 38493 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jacob Klein

Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 38494 - Posted: 14 Oct 2014, 12:57:59 UTC
Last modified: 14 Oct 2014, 13:10:15 UTC

MJH:
Yeah, no idea what went wrong with the runtime estimates. The last set got an estimate of 5h, but when I increased the cost by 10x the estimate went up 200x.

Did you increase the <rsc_fpops_bound> value appropriately? I believe it's used for task size, and hence task runtime estimation.

I am thinking about hard-coding the CPU use to [number-of-CPU-cores] - [number-of-GPUs]. Opinions?

I personally don't care either way, but I do care that you make absolutely sure that the number of cores used (via command line) matches the number of cores budgeted (via ncpus), so BOINC doesn't overcommit or undercommit the CPU. I hope that makes sense.
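
For anyone setting this by hand in app_config.xml, here is a minimal sketch of that consistency check. It assumes the thread count is passed as "--nthreads N" in <cmdline>, as in the example posted above; check_app_config is a made-up helper, not part of BOINC.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Sample taken from the app_version section posted earlier in this thread.
SAMPLE = """
<app_config>
    <app_version>
        <app_name>cpumd</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>6</avg_ncpus>
        <cmdline>--nthreads 6</cmdline>
    </app_version>
</app_config>
"""


def check_app_config(xml_text: str) -> List[str]:
    """Return one message per app_version whose budgeted CPUs and thread count differ."""
    problems = []
    root = ET.fromstring(xml_text)
    for version in root.iter("app_version"):
        name = version.findtext("app_name", default="?")
        avg_ncpus = version.findtext("avg_ncpus")
        cmdline = version.findtext("cmdline", default="")
        match = re.search(r"--nthreads\s+(\d+)", cmdline)
        if avg_ncpus is None or match is None:
            continue  # nothing to compare for this entry
        if float(avg_ncpus) != int(match.group(1)):
            problems.append(
                f"{name}: avg_ncpus={avg_ncpus} but --nthreads {match.group(1)}"
            )
    return problems


if __name__ == "__main__":
    print(check_app_config(SAMPLE))  # [] means budget and thread count agree
```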
ID: 38494 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
eXaPower

Send message
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Level
His
Message 38495 - Posted: 14 Oct 2014, 13:03:10 UTC - in response to Message 38491.  

I am thinking about hard-coding the CPU use to [number-of-CPU-cores] - [number-of-GPUs]. Opinions?

Matt


Doesn't hurt to try. Only Nvidia GPUs? How much would a GPU help with task time? If the runtime is cut by half or more, then I'd say this is workable. Would there be an option to run a task with half of the GPU cores, so that maybe two MD tasks can run at a time? Or is the whole GPU required?
ID: 38495 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jacob Klein

Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 38496 - Posted: 14 Oct 2014, 13:13:37 UTC

Side note:

Is it possible that progress % is not hooked up correctly?

For instance, progress.log says:
   nsteps               = 5000000

and if I scroll to the bottom:
           Step           Time         Lambda
        1254000     2508.00000        0.00000


... yet, the BOINC UI only says "1.777% done"
Shouldn't it say 25% done?
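
Here is roughly how I worked that out, as a sketch. It assumes progress.log always has an "nsteps = N" line and "Step Time Lambda" rows laid out like the excerpt above; progress_from_log is a made-up name.

```python
import re

# Excerpt in the same layout as the progress.log lines quoted above.
SAMPLE_LOG = """
   nsteps               = 5000000
           Step           Time         Lambda
        1254000     2508.00000        0.00000
"""


def progress_from_log(log_text: str) -> float:
    """Percent complete, taken as the last logged step divided by nsteps."""
    nsteps_match = re.search(r"nsteps\s*=\s*(\d+)", log_text)
    if nsteps_match is None:
        raise ValueError("no nsteps line found")
    nsteps = int(nsteps_match.group(1))
    # Data rows look like: <step> <time> <lambda>
    steps = re.findall(
        r"^[ \t]*(\d+)[ \t]+[\d.]+[ \t]+[\d.]+[ \t]*$",
        log_text,
        flags=re.MULTILINE,
    )
    if not steps:
        return 0.0
    return 100.0 * int(steps[-1]) / nsteps


if __name__ == "__main__":
    print(f"{progress_from_log(SAMPLE_LOG):.2f}% done")  # 25.08% done
```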
ID: 38496 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile MJH

Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 38497 - Posted: 14 Oct 2014, 13:20:16 UTC - in response to Message 38496.  

The app isn't reporting its progress to the client yet. It's just being estimated from the FLOPs.

Matt
ID: 38497 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jacob Klein

Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 38498 - Posted: 14 Oct 2014, 13:35:34 UTC

Is it possible to change it so that the app can report a better progress %?
ID: 38498 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote