WU: OPM995 simulations

Stefan (Project administrator, developer, tester, scientist)
Message 43600 - Posted: 27 May 2016, 8:54:11 UTC

Here we go again :) This time with 33% more credit, plus corrected runtimes, which means an additional 2x credit for WUs that take more than 18 hours on a 780, and only WUs that take up to a maximum of 24 hours on a 780. I hope I don't seriously overshoot on credits this time, but it's really a bit hit & miss.
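
Read one way, the rule works out as below (a minimal illustrative Python sketch; the function name, base value, and exact thresholds are assumptions pieced together from this description, not project code):

    # One reading of the credit rule described above - illustrative only.
    # The 33% uplift applies to the whole batch; WUs that run longer than
    # 18 hours on a GTX 780 get a further 2x, and the batch is capped so
    # no WU should exceed ~24 hours on a 780.

    def opm995_credit(base_credit, hours_on_gtx780):
        credit = base_credit * 1.33        # 33% more credit than before
        if hours_on_gtx780 > 18.0:         # extra-long WU on the reference card
            credit *= 2.0
        return credit

    # Example: a WU worth 100,000 base credits that takes 20 h on a 780
    print(opm995_credit(100_000, 20.0))    # -> 266000.0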

Retvari Zoltan
Message 43602 - Posted: 27 May 2016, 9:19:12 UTC - in response to Message 43600.  

Thanks Stefan!
As there are plenty of workunits queued (7,920 at the moment), and some of them are very long, I suggest everyone reduce their work cache to 0.03 days to maximize throughput and the credits earned.

skgiven (Volunteer moderator, Volunteer tester)
Message 43604 - Posted: 27 May 2016, 13:51:39 UTC - in response to Message 43602.  
Last modified: 27 May 2016, 13:52:17 UTC

> Thanks Stefan!
> As there are plenty of workunits queued (7,920 at the moment), and some of them are very long, I suggest everyone reduce their work cache to 0.03 days to maximize throughput and the credits earned.

Good suggestion. Given the length of these tasks (at least some of them are extra-long), and with so many available, there is no point in people hoarding tasks - they will just miss the bonus deadlines and get less credit.
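
For reference, the bonus scheme usually described for GPUGrid is +50% credit for results returned within 24 hours and +25% within 48 hours (an assumption from memory - the project FAQ is authoritative). A Python sketch:

    # Hedged sketch of GPUGrid's return-time bonus as commonly described
    # (+50% if returned within 24 h, +25% within 48 h) - check the project
    # FAQ for the authoritative rule.

    def with_return_bonus(base_credit, hours_to_return):
        if hours_to_return <= 24:
            return base_credit * 1.50
        if hours_to_return <= 48:
            return base_credit * 1.25
        return base_credit

    print(with_return_bonus(100_000, 20))   # -> 150000.0
    print(with_return_bonus(100_000, 30))   # -> 125000.0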

WPrion
Message 43628 - Posted: 29 May 2016, 2:05:23 UTC - in response to Message 43602.  

> Thanks Stefan!
> As there are plenty of workunits queued (7,920 at the moment), and some of them are very long, I suggest everyone reduce their work cache to 0.03 days to maximize throughput and the credits earned.

Are you referring to the setting:

"Maintain enough work for an additional"

I set mine to 0.03 several hours ago and updated my client. Yet it downloaded another WU shortly after one finished, just as the running WU had barely started.

Is there something else to tweak?

Thanks,

Win

skgiven (Volunteer moderator, Volunteer tester)
Message 43632 - Posted: 29 May 2016, 13:14:27 UTC - in response to Message 43628.  
Last modified: 29 May 2016, 13:19:19 UTC

Yes. In BOINC Manager (Advanced view), open Options > Computing preferences and, on the Computing tab, set two values:
    Store at least [0.02] days of work
    Store up to an additional [0.01] days of work

If the combined values add up to less than 0.10 days, the settings should work reasonably well. It's likely that your second value was something like 0.25 or 0.5, and that caused you to download additional work (a second task).
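
For reference, the same two values can also be set client-wide in the global_prefs_override.xml file in the BOINC data directory (BOINC Manager can load it via its "read local prefs file" option). A minimal sketch, reusing the example numbers above:

    <!-- global_prefs_override.xml: local overrides for the web preferences.
         Values are in days and mirror the two Manager fields above. -->
    <global_preferences>
        <work_buf_min_days>0.02</work_buf_min_days>
        <work_buf_additional_days>0.01</work_buf_additional_days>
    </global_preferences>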

Jacob Klein
Message 43633 - Posted: 29 May 2016, 13:30:23 UTC
Last modified: 29 May 2016, 13:32:05 UTC

Please note that very low buffer settings cause increased load on the project scheduler servers of every project you are attached to.

I personally leave my buffers at something like "store at least 1 day, store up to 0.5 days more": I don't care about the GPUGrid credit bonus, short buffers don't really help GPUGrid throughput unless very few work units are available, and I don't want to add load to my attached projects' scheduler servers.

Richard Haselgrove
Message 43634 - Posted: 29 May 2016, 14:20:12 UTC - in response to Message 43633.  

> ... short buffers don't really help GPUGrid throughput ...

Not necessarily true. I'm not speaking specifically about the OPM simulations here, but I think most GPUGrid work is run as a sort of relay race - you hold the baton for a short while, complete your lap of the track, and then hand it back in for somebody else to take over.

If you sit at the side of the track for a day and a half before you even start running, that particular baton - series of linked tasks, each generated from the result of the previous lap - is permanently delayed, and the final results aren't available for the scientists to study until that much later.
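
To put rough numbers on that (an illustrative Python sketch; the generation counts and timings are made up, not GPUGrid figures):

    # Each generation of a simulation chain is seeded by the previous
    # result, so time a task spends waiting in a host's cache delays
    # every subsequent generation as well as its own.

    def chain_days(generations, compute_days_each, cache_wait_days):
        # Total wall-clock time for one chain of dependent workunits.
        return generations * (cache_wait_days + compute_days_each)

    # A 50-generation chain at 0.5 days of GPU time per generation:
    print(chain_days(50, 0.5, 0.03))   # small cache   -> 26.5 days
    print(chain_days(50, 0.5, 1.5))    # 1.5-day waits -> 100.0 days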

Jacob Klein
Message 43635 - Posted: 29 May 2016, 14:26:29 UTC - in response to Message 43634.  
Last modified: 29 May 2016, 14:28:33 UTC

That had slipped my mind. But if GPUGrid were having trouble getting the batons back for the next runners, and wanted to ensure the race kept running smoothly, they could tighten the deadlines on the relay chunks if need be.

So, I'm just going to stick with the deadlines they give me, and not micro-manage BOINC, and not add stress to my attached projects' servers. I actually have GPUGrid set to 99999 resource share, and GPUs crunching 2-at-a-time, so ... :) When I get tasks from this project, they are usually firing on all cylinders, top priority.

skgiven (Volunteer moderator, Volunteer tester)
Message 43639 - Posted: 29 May 2016, 19:25:46 UTC - in response to Message 43635.  
Last modified: 29 May 2016, 19:28:57 UTC

Until the scheduler is rewritten to work at a per-device level, there will be issues with attaching to multiple projects when using multiple devices. However, these have been addressed as far as is reasonably feasible with the existing manager.

I would add that many CPU projects have long tasks: some Einstein and WCG tasks, for example, take ~20 h to complete, and ClimatePrediction tasks run for several days to weeks. If you have a low cache and are running GPUGrid tasks on your GPU(s) and WCG tasks on your CPU, you won't badger the server for new work until you are almost out of it, which probably won't be very often (a few times per day, which isn't an issue).

Granted, there are/were some projects with very short runtimes, but that does not mean it's better to have a long queue/big cache of tasks. There are substantial issues with having hundreds or thousands of tasks in your queue too. For example, if you crunch for BU and your Internet goes down, all queued tasks will fail - not exactly great news for their server.

My opinion for here: a low cache is good for the project and for user/team credits; a higher (but still reasonably low) cache is not as good for either, yet still acceptable, and it's your choice. A high cache (3+ days) is bad news.
The bonus system is designed to reflect this project's need for a quick return. It can't take into account what else you crunch.

Jacob Klein
Message 43640 - Posted: 29 May 2016, 19:53:28 UTC - in response to Message 43639.  
Last modified: 29 May 2016, 20:00:50 UTC

> It can't take into account what else you crunch.

That's exactly why you shouldn't make blanket suggestions on cache settings that benefit GPUGrid most without also noting some of the drawbacks :) I digress.

For my particular scenario, I have modified my cache settings a bit, in order to try to keep all my GPUs sustained at 2-GPUGrid-tasks-per-GPU without taking on additional work from other attached GPU projects. I'm using 0.9d+0.9d on the PC that has GTX970+GTX660Ti+GTX660Ti, and 0.5d+0.5d on the PC that has GTX980Ti+GTX980Ti. To each their own.

Beyond
Message 43641 - Posted: 29 May 2016, 21:02:47 UTC

For years many have asked for per-project work buffer settings, or at LEAST separate settings for GPUs and CPUs. All to no avail, while a lot of effort has been spent on less important (IMO) issues.

skgiven (Volunteer moderator, Volunteer tester)
Message 43642 - Posted: 29 May 2016, 22:04:57 UTC - in response to Message 43640.  

> It can't take into account what else you crunch.
>
> That's exactly why you shouldn't make blanket suggestions on cache settings that benefit GPUGrid most without also noting some of the drawbacks :) I digress.
>
> For my particular scenario, I have modified my cache settings a bit, in order to try to keep all my GPUs sustained at 2-GPUGrid-tasks-per-GPU without taking on additional work from other attached GPU projects. I'm using 0.9d+0.9d on the PC that has GTX970+GTX660Ti+GTX660Ti, and 0.5d+0.5d on the PC that has GTX980Ti+GTX980Ti. To each their own.

My suggestions are predominantly for GPUGrid and are typically optimisations for GPUGrid throughput and user/team credit. I don't make suggestions at GPUGrid to cover every conceivable combination of BOINC-wide project mix, nor could I - it can't be done.
You have different views, values, opinions and objectives, which you are quite entitled to express and implement for yourself and to your own ends.
My advice is mostly aimed at new or novice crunchers, those new to GPUGrid, or people here with a specific problem. Usually they need a setup that facilitates crunching here, and often changes just to make it work.
Occasionally I digress too, to pass on an experience from crunching elsewhere or some observations or knowledge, but there is no catch-all super setup for BOINC.
I enjoy the fact that people crunch for a diversity of reasons, with different setups and takes on crunching. Highlighting different circumstances and experiences adds to my knowledge and to crunchers' knowledge as a whole, but one shoe doesn't fit all, and this is a GPUGrid forum, not the central BOINC forum where generic advice might be better propagated.

skgiven (Volunteer moderator, Volunteer tester)
Message 43643 - Posted: 29 May 2016, 22:10:02 UTC - in response to Message 43641.  

> For years many have asked for per-project work buffer settings, or at LEAST separate settings for GPUs and CPUs. All to no avail, while a lot of effort has been spent on less important (IMO) issues.

I don't bother any more. IMO it is what it is and that's just about all it will ever be.

Beyond
Message 43644 - Posted: 29 May 2016, 23:04:59 UTC - in response to Message 43643.  

> For years many have asked for per-project work buffer settings, or at LEAST separate settings for GPUs and CPUs. All to no avail, while a lot of effort has been spent on less important (IMO) issues.
>
> I don't bother any more. IMO it is what it is and that's just about all it will ever be.

Gave up too. However it is supremely important to devise more ways for people to burn up their phones while doing nothing useful.

klepel
Message 43654 - Posted: 30 May 2016, 14:21:55 UTC

Stefan,

Two of my computers have received SDOERR_opm995 tasks that are simultaneously being processed by another computer. They were sent out at more or less the same time.
https://www.gpugrid.net/workunit.php?wuid=11614785
https://www.gpugrid.net/workunit.php?wuid=11614829

Is this intentional on your part, because these SDOERR WUs have been so error-prone, or is it a fault of the scheduler? Please advise as soon as possible so I can kill them quickly - I don't like doing duplicate work if it isn't required.

skgiven (Volunteer moderator, Volunteer tester)
Message 43655 - Posted: 30 May 2016, 15:17:12 UTC - in response to Message 43654.  

> initial replication 2

https://www.gpugrid.net/workunit.php?wuid=11614785

That means two tasks are sent out, by design.

One of the OPM995s I'm running also has an initial replication of 2:
https://www.gpugrid.net/workunit.php?wuid=11614838

Jacob Klein
Message 43656 - Posted: 30 May 2016, 15:25:56 UTC - in response to Message 43655.  

Perhaps the question is:
Why was it set up with initial replication set to 2?

skgiven (Volunteer moderator, Volunteer tester)
Message 43659 - Posted: 30 May 2016, 20:48:32 UTC - in response to Message 43656.  
Last modified: 30 May 2016, 22:10:00 UTC

Probably for validation: any proof-of-concept experiment intended to demonstrate capability needs appropriate verification built in before it can be accepted as a model/framework for performing experiments.

WPrion
Message 43661 - Posted: 31 May 2016, 0:56:38 UTC - in response to Message 43632.  

Thanks!

Jacob Klein
Message 43662 - Posted: 31 May 2016, 1:14:43 UTC - in response to Message 43659.  

Hmm... validation deals with quorum, though, and I thought these GPUGrid tasks worked in such a way that the results couldn't really be validated against each other. I might be mistaken though.
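
For context: in stock BOINC server terms, "initial replication" (target_nresults) and the validation quorum (min_quorum) are separate settings, so a batch can send out two copies of each WU while a quorum of one still lets each returned result stand on its own - consistent with results not being compared against each other. A hedged sketch using the standard create_work tool (the app, WU, and file names are invented, and GPUGrid may well generate work differently):

    # Illustrative only - not GPUGrid's actual submission script.
    # --target_nresults 2 : initial replication; two copies are sent out.
    # --min_quorum 1      : one returned result completes the workunit,
    #                       so no cross-validation between copies is needed.
    # --delay_bound       : the deadline (seconds), which could be tightened
    #                       if the project needed the "batons" back sooner.
    bin/create_work \
        --appname acemd \
        --wu_name SDOERR_opm995_example \
        --min_quorum 1 \
        --target_nresults 2 \
        --delay_bound 432000 \
        opm995_example_input.tgz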