Must set rsc_memory_bound correctly
| Author | Message |
|---|---|
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
GPUGrid Team: You need to change your work unit parameters to set `<rsc_memory_bound>` correctly. BOINC 7.3.14 alpha (and potentially future versions as well) reads that value, compares it to the working set size, and auto-aborts the work unit if the working set exceeds the bound. As of right now, I cannot do any GPUGrid work: every task errors out because of this invalid parameter setting. They all abort immediately, saying: `working set size > workunit.rsc_memory_bound: 194.14MB > 95.37MB` Your setting of `<rsc_memory_bound>100000000.000000</rsc_memory_bound>` ... is 95.37MB, which is too low. Could you please promptly fix this? Regards, Jacob Klein |
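The 95.37MB figure in the error message is what you get when the configured bound is read as bytes and divided by 1024 × 1024. A minimal Python sketch of that arithmetic, assuming "MB" in the client message means 1,048,576 bytes (the helper name `bytes_to_mb` is made up for illustration and is not part of BOINC):

```python
MB = 1024 * 1024  # assumption: the client's "MB" is 2**20 bytes


def bytes_to_mb(n_bytes: float) -> float:
    """Convert a byte count to MB, rounded to two decimals."""
    return round(n_bytes / MB, 2)


# Value taken from the <rsc_memory_bound> element quoted in the post.
configured_bound = 100000000.0
print(bytes_to_mb(configured_bound))  # 95.37
```

Under the same assumption, a bound meant to cover the reported 194.14MB working set would have to be a little over 203,570,000 bytes.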
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
Jacob, Just to be sure I understand: are you saying that this is new behaviour with release 7.3.14? Matt |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
That is correct, it is new behavior in the 7.3.14 alpha release. Previously, the client would compare the running working set against the user-specified BOINC RAM limit to determine whether a task needed to be suspended. But NOW, additionally, in 7.3.14+, the running working set value is also compared against wu.rsc_memory_bound, and the work unit is immediately auto-aborted if the bound is exceeded, with a message such as: `working set size > workunit.rsc_memory_bound: 194.14MB > 95.37MB` - Jacob |
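The additional 7.3.14 check described above can be sketched roughly as follows. This is an illustration only, not the BOINC client's actual code: the function name `check_memory_bound` and the choice of returning a message string are assumptions, and sizes are assumed to be in bytes with "MB" meaning 1,048,576 bytes.

```python
from typing import Optional

MB = 1024 * 1024  # assumption: the client's "MB" is 2**20 bytes


def check_memory_bound(working_set_bytes: float,
                       rsc_memory_bound: float) -> Optional[str]:
    """Return an abort message if the working set exceeds the bound, else None.

    Hypothetical helper that mirrors the behaviour described in the post.
    """
    if working_set_bytes > rsc_memory_bound:
        return ("working set size > workunit.rsc_memory_bound: "
                f"{working_set_bytes / MB:.2f}MB > {rsc_memory_bound / MB:.2f}MB")
    return None


# A working set of roughly 194.14MB against the 100000000-byte bound.
print(check_memory_bound(203_570_545, 100_000_000.0))
# The same working set against a 300MB bound is not aborted.
print(check_memory_bound(203_570_545, 300 * MB))
```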
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Hi Jacob, thanks for noting this in time. It affected the short WUs, and should be fixed for the forthcoming ones. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Toni, What do you mean by that? David thinks the change is correct, and will include it in a public release depending on how much trouble it causes. In the meantime, I'd like to continue to do work for your project. Would you mind fixing the input parameters for your work units? |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
Ok, thanks for the heads-up, Jacob. I've amended the limit to 300MB but can't modify any of the WUs that are already in the system, alas. Matt |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Thank you Matt. GPUGrid isn't the only project that will have to make wu parameter changes, if we do decide to keep this change. |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
|
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
It looks like this change is being reverted for now, per David's email below.

> Date: Mon, 31 Mar 2014 18:53:33 -0700
> From: d...a@ssl.berkeley.edu
> To: b...c_alpha@ssl.berkeley.edu
> Subject: Re: [boinc_alpha] 7.3.14 - Heads up - Memory bound enforcement
>
> On further thought, I'm going to change things back to the way they were, namely
>
> 1) workunit.rsc_memory_bound is used only by the server;
> it won't send a job if rsc_memory_bound > host's available RAM
> 2) the client aborts a job if working set size > host's available RAM
> 3) the client will run a set of jobs only if the sum of their WSSs
> fits in available RAM
> (i.e. if a job's WSS is close to all available RAM,
> it would run that job and nothing else)
>
> The reason for not aborting jobs when WSS > rsc_memory_bound is that
> it requires projects to come up with very accurate estimates of RAM usage,
> which I don't think is feasible in general.
> Also, it will lead to lots of aborted jobs, which is bad for volunteer morale.
>
> -- David

|
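The three rules in David's email can be sketched in Python as below. This is a hedged illustration, not BOINC's implementation: all function names are hypothetical, sizes are assumed to share one unit, and the greedy in-order selection in `pick_runnable` is an assumption, since the email does not say how the client chooses which set of jobs to run.

```python
from typing import List


def server_should_send(rsc_memory_bound: float, host_available_ram: float) -> bool:
    """Rule 1: the server won't send a job whose bound exceeds available RAM."""
    return rsc_memory_bound <= host_available_ram


def client_should_abort(working_set_size: float, host_available_ram: float) -> bool:
    """Rule 2: the client aborts a job whose working set exceeds available RAM."""
    return working_set_size > host_available_ram


def pick_runnable(working_set_sizes: List[float],
                  host_available_ram: float) -> List[int]:
    """Rule 3: run a set of jobs only if their working sets fit in RAM together.

    Assumption: jobs are considered in list order and added greedily.
    Returns the indices of the jobs that would run.
    """
    chosen: List[int] = []
    used = 0.0
    for i, wss in enumerate(working_set_sizes):
        if used + wss <= host_available_ram:
            chosen.append(i)
            used += wss
    return chosen


# A job whose working set is close to all available RAM runs alone.
print(pick_runnable([990.0, 200.0], 1000.0))  # [0]
```

Note that in this scheme `rsc_memory_bound` only affects scheduling on the server side, so an underestimated bound no longer causes the client to abort a running job.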