WARNING/CHALLENGE: VERY LONG WU (VERYLONG_CXCL12_confAna)
Author | Message |
---|---|
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 |
OK, so here are the KISS ('Keep it simple, stupid') instructions. Thanks Richard. I found 3 of the verylong WUs and executed your fix. One thing that may make this easier: I simply searched for _0_9 and in each case the first instance found was the correct one. Just make sure it's a verylong WU and not a Noelia that you're editing. All 3 of these are on 750 Ti cards and completion looks to be around 50 hours. Edit: Hmm, from the example above it looks like some of the WUs may have _1_9 instead of the _0_9 that all of mine had: <file> <name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_9</name> <nbytes>0.000000</nbytes> <max_nbytes>256000000.000000</max_nbytes> <status>0</status> <upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url> </file> |
Joined: 26 Mar 14 Posts: 101 Credit: 0 RAC: 0 |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. It should be fine for newly created WUs, but I'm not sure whether the change will propagate to automatically-generated replacements for tasks which fail - we'll need to keep an eye on those. It certainly won't be passed to tasks which are already 'out in the field' - on volunteers' computers. They will have to be modified manually, or allowed to fail. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
The upload of the first result is finished. Here are the details: 3x17-GERARD_VERYLONG_CXCL12_confAna-0-1-RND9026_1 |
Joined: 16 Jul 13 Posts: 56 Credit: 1,626,354,890 RAC: 0 |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. Even if you "update" the project? |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
My GTX 980 host is uploading the first result. Credit 600,000.00 Congratulations on the home run - and thanks for the confirmation that the file edit is effective. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. I've received a 1894-NOELIA_BI3_unbind-1-10-RND9593_0 just now, and the file info has the old size limit: <file_info> <name>1894-NOELIA_BI3_unbind-1-10-RND9593_0_8</name> <nbytes>0.000000</nbytes> <max_nbytes>256000000.000000</max_nbytes> <generated_locally/> <status>0</status> <upload_when_present/> <url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</url> </file_info> <file_info> <name>1894-NOELIA_BI3_unbind-1-10-RND9593_0_9</name> <nbytes>0.000000</nbytes> <max_nbytes>128000000.000000</max_nbytes> <generated_locally/> <status>0</status> <upload_when_present/> <url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</url> </file_info> |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
Congratulations on the home run - and thanks for the confirmation that the file edit is effective. Thank you! We had similar upload size problems before, and the solution was the same back then. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
Congratulations on the home run - and thanks for the confirmation that the file edit is effective. And we recently had the same thing at CPDN, which was why I checked - it had been bumped back to the top of my list of "things project administrators forget to do" when they're excited by an interesting bit of research. Which reminds me..... @ Gerard, If you find yourself having to re-generate all or part of this batch of 'verylong' tasks, could you please adjust <rsc_fpops_est> proportionately, so that our BOINC clients show a fair estimate of the task runtime from the beginning, and the task doesn't mess up DCF when it finishes? |
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
I've raised the limit in the DB for VERYLONG WUs. I'm not sure whether such changes propagate to clients at some point. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
My second very long workunit is uploading. 7x6-GERARD_VERYLONG_CXCL12_confAna-0-1-RND0829_0 |
Joined: 26 Feb 13 Posts: 7 Credit: 2,242,660,281 RAC: 0 |
I'm glad I checked this thread again. Two of my clients have one of the very long WUs. I implemented the fix as described by Richard. I'll keep an eye on it and check the status on completion. http://www.gpugrid.net/result.php?resultid=13737049 (GTX 770) http://www.gpugrid.net/result.php?resultid=13737185 (GTX 670) |
Joined: 26 Aug 08 Posts: 183 Credit: 10,085,929,375 RAC: 0 |
Thanks for the detailed guidance Richard and Retvari! I just got home from work and checked this thread. My WU was 90%+ done so just in time. |
Joined: 30 Jul 14 Posts: 225 Credit: 2,658,976,345 RAC: 0 |
4x6-GERARD_VERYLONG_CXCL12_confAna-0-1-RND1754_0_0 working now. It started out and continued to count down from about 18.5 hours till completion, then at 48.8% finished and about 13 hours it jumped to 13.5 hours left. Hah! Anyway, with just around 10 hours showing left, I am changing the XML according to Richard's instructions. I only have one, since I only have the one machine I run longs on. Good thing it has the 3 780s in it. I'll keep an eye out to see if I get any more in the near future also. Right now I have 3 queued and 3 working and only 1 of these. 1 Corinthians 9:16 "For though I preach the gospel, I have nothing to glory of: for necessity is laid upon me; yea, woe is unto me, if I preach not the gospel!" Ephesians 6:18-20, please ;-) http://tbc-pa.org |
Joined: 22 Nov 09 Posts: 114 Credit: 589,114,683 RAC: 0 |
We just launched 400 very long WU (they will take about 24h in a 780GTX) named VERYLONG_CXCL12_confAna whose results we need as soon as possible (we are in a hurry). Absolutely agree that this is an inappropriate way to handle these large work units. Either add another "very long" queue, like Retvari says, or have the server automatically figure out what computers should get them based on the installed graphics cards. |
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0 |
Finally got one, but on my GTX 560 Ti, so I aborted it; nothing for my GTX 970. Need to get a better system, no doubt about that. |
Joined: 30 Jul 14 Posts: 225 Credit: 2,658,976,345 RAC: 0 |
Absolutely agree that this is an inappropriate way to handle these large work units. Either add another "very long" queue, like Retvari says, or have the server automatically figure out what computers should get them based on the installed graphics cards. As far as that goes, the Notice that went out and started this thread is clear that they had a limited number of work units that needed immediate release and ASAP completion. The fact that they are very long is secondary to the fact that they are needed ASAP. Putting the priority on ASAP means that adding a different queue for them involves either voluntary addition to that queue by the end users, maybe in response to a notice that goes out calling for them, or forcing everyone onto that queue, which then ends in the exact thing you have right now: having them go out first come, first served, whether or not those machines can complete them. I don't think either of these is appropriate for an on-the-spot addition of a longer task that needs to be completed ASAP. So that leaves the other option, which is having the servers determine whether a machine can run the task in the time needed before assigning it to that machine. I suppose that could be done, but if it could have been done immediately and was not, it was just a bad judgment call. Based on that, I would assume that their side of the system does not currently have that ability beyond what users tell them their machines can do, via the queues they choose for their machines, e.g. Short, Long, Test, CPU, etc. I think assigning these tasks to the machines that are set to receive "normal" long work units and then sending out an official BOINC Notice to flash on the client IS the right way to have done this this time. And then, based on finances and manpower, they can work on adding more functionality to their back-end to determine what machines can do what tasks, to fine-tune what is already in place in the voluntary queues. 
It seems very clear that they were not expecting these VERYLONG work units far enough in advance to have done this any better, which makes the way it was done the best way it could have been done. As for the future, if this is only to be done on occasion, not much manpower and time needs to go into "correcting" the process; but if VERYLONG work units are to become a regular thing for the grid, then time should be invested to add a queue or to help their servers better determine the potential of machines to finish them in the time needed. All in all, people who have no better solutions but only want to share frustrations would do better, for everyone involved, to state that there is an issue and what they think the issue is, and then trust that someone saw the statement and will work to fix it if it needs fixing. We don't need to get emotionally involved unless we are singled out and overtly ignored. And then, there are always other projects, or more official channels than the fellow user base and volunteer workers on a forum board. Not flaming, I just always want to see solution makers making solutions and agitators making quiet. Life works better that way around. :-) |
Joined: 7 May 13 Posts: 1 Credit: 157,304,655 RAC: 0 |
I see that after the first VERYLONG units errored out, it was posted that the XML had to be edited before the task finished. By the time I read that, my GTX 770 had already spent 31 hours 24 minutes completing the task. Not amused to see the log: 24/01/2015 08:46:57 | GPUGRID | Output file 6x13-GERARD_VERYLONG_CXCL12_confAna-0-1-RND0906_0_9 for task 6x13-GERARD_VERYLONG_CXCL12_confAna-0-1-RND0906_0 exceeds size limit. 31 hours of wasted time, which I could have used for 1 abandoned and 1 suspended NOELIA, just because I responded to the notice in my BOINC Manager which asked for help to finish those VERYLONG WUs ASAP. :-( |
Joined: 13 Feb 11 Posts: 25 Credit: 7,516,466,698 RAC: 0 |
Hello. I got the same unhappy result. http://www.gpugrid.net/result.php?resultid=13737081 It took my Titan Black roughly 25 hours to complete the task. Next time, please be more careful with what you prepare for crunching. I am very keen to help, but this is a waste of time that could have been used for other tasks. |
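
For anyone who would rather script the manual fix discussed in this thread than hand-edit the file, here is a minimal Python sketch. It assumes the fix amounts to raising `<max_nbytes>` for the output files of VERYLONG tasks, as the XML snippets quoted in the posts suggest; Richard's full instructions are on an earlier page and are not reproduced here. The function name `raise_upload_limit`, the 512,000,000-byte value (one reading of the "512 Mb" mentioned above), and treating the file contents as plain text are all assumptions made for illustration, not anything published by the project.

```python
import re

# Assumed new limit in bytes; the thread only says the server limit was raised to "512 Mb".
NEW_MAX_NBYTES = 512_000_000.0

# One <file>...</file> or <file_info>...</file_info> block (both forms are quoted in the thread).
BLOCK_RE = re.compile(r"<(file|file_info)>.*?</\1>", re.DOTALL)
NAME_RE = re.compile(r"<name>([^<]*)</name>")
LIMIT_RE = re.compile(r"(<max_nbytes>)([^<]*)(</max_nbytes>)")

# Sample text assembled from the two snippets posted in the thread (shortened).
SAMPLE = """
<file>
    <name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_9</name>
    <nbytes>0.000000</nbytes>
    <max_nbytes>256000000.000000</max_nbytes>
    <status>0</status>
</file>
<file_info>
    <name>1894-NOELIA_BI3_unbind-1-10-RND9593_0_9</name>
    <nbytes>0.000000</nbytes>
    <max_nbytes>128000000.000000</max_nbytes>
    <status>0</status>
</file_info>
"""


def raise_upload_limit(state_text, name_marker="VERYLONG", new_limit=NEW_MAX_NBYTES):
    """Return (patched_text, changed_names).

    Only blocks whose <name> contains name_marker are touched, and only
    when their current limit is below new_limit.
    """
    changed = []

    def patch_block(block_match):
        block = block_match.group(0)
        name_match = NAME_RE.search(block)
        if name_match is None or name_marker not in name_match.group(1):
            return block  # e.g. a NOELIA task: leave it alone

        def bump(limit_match):
            if float(limit_match.group(2)) >= new_limit:
                return limit_match.group(0)  # already large enough
            changed.append(name_match.group(1))
            return f"{limit_match.group(1)}{new_limit:.6f}{limit_match.group(3)}"

        return LIMIT_RE.sub(bump, block)

    return BLOCK_RE.sub(patch_block, state_text), changed


if __name__ == "__main__":
    patched, changed = raise_upload_limit(SAMPLE)
    print("Raised limit for:", changed)
```

The name filter mirrors the advice above to make sure you are editing a verylong WU and not a Noelia. To use something like this on a real host you would read the client's state file into a string, call the function, and write the result back; that is presumably best done with the BOINC client shut down and a backup copy kept, since the client rewrites that file itself.
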
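
Richard's request above to adjust `<rsc_fpops_est>` "proportionately" comes down to simple scaling. The sketch below assumes the client's initial runtime estimate is roughly proportional to `rsc_fpops_est`, so a task expected to run N times longer needs an estimate N times larger. The function name and every number in it are hypothetical, except the 24 hours on a GTX 780 quoted from the launch notice; the project's real values are not given in this thread.

```python
def scaled_fpops_est(current_fpops_est, usual_runtime_hours, expected_runtime_hours):
    """Scale a workunit's FLOP estimate in proportion to its expected runtime."""
    if usual_runtime_hours <= 0:
        raise ValueError("usual_runtime_hours must be positive")
    return current_fpops_est * (expected_runtime_hours / usual_runtime_hours)


# Hypothetical figures: a normal long task estimated at 5e15 FLOPs that runs
# about 8 hours on a reference card, versus a VERYLONG task taking about 24 hours.
USUAL_FPOPS_EST = 5e15
USUAL_HOURS = 8.0
VERYLONG_HOURS = 24.0

new_estimate = scaled_fpops_est(USUAL_FPOPS_EST, USUAL_HOURS, VERYLONG_HOURS)
print(f"suggested rsc_fpops_est: {new_estimate:.3e}")
```

With an estimate scaled this way, the runtime shown at the start should be closer to reality, and the finished task should be less likely to push the duration correction factor (DCF) upward for everything else in the queue.
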
©2025 Universitat Pompeu Fabra