Duplicated work
| Author | Message |
|---|---|
Carlesa25; Joined: 13 Nov 10; Posts: 328; Credit: 72,619,453; RAC: 0
|
Hello: I would like to raise something I do not understand. I see that some tasks have been completed by TWO users, with both results validated and credited; i.e. we have duplicated the work. Best regards |
|
Joined: 5 Jan 09; Posts: 670; Credit: 2,498,095,550; RAC: 0
|
You'll find that the first WU is over 2 days old without a result, so it is issued to a second host. They do this because they require fast and reliable completion: the next WU in a job depends on the first one being completed and can't be sent out before then. Radio Caroline, the world's most famous offshore pirate radio station. Great music since April 1964. Support Radio Caroline Team - Radio Caroline |
Carlesa25; Joined: 13 Nov 10; Posts: 328; Credit: 72,619,453; RAC: 0
|
Hello: For the past few days a number of tasks have been performed in duplicate (by me and another volunteer), and not because two days passed between the two sends; in the latest cases only a few hours passed. If there is a coordination problem on the server, it seems a waste of work. Greetings. |
|
Joined: 9 Dec 08; Posts: 1006; Credit: 5,068,599; RAC: 0 |
Hi, if you can, please report the task numbers that seem to be duplicated. |
Carlesa25; Joined: 13 Nov 10; Posts: 328; Credit: 72,619,453; RAC: 0
|
Hello: So far I have found this one; there were two more, but they now show a computation error ...? I will report any others I find. Greetings. http://www.gpugrid.net/workunit.php?wuid=2462615 |
skgiven; Joined: 23 Apr 09; Posts: 3968; Credit: 1,995,359,260; RAC: 0 |
After the first task sent out was not returned for about 3 days, another task was sent out. The first task then returned with an error. I presume this triggered a second task to be sent out, at which time there would have been 2 tasks in progress. Now one of these tasks has been returned, but there is still one in progress. Given the nature of the project (dependency on fast turnaround of tasks) this is an acceptable situation; if several tasks are slow to return or fail, it slows down the overall research really badly. |
Carlesa25; Joined: 13 Nov 10; Posts: 328; Credit: 72,619,453; RAC: 0
|
Hello. Once again a task has been sent to another user before the first host had been processing it for two days ...! I am going to cancel the task because continuing with it would be wasted work, but the time I have already lost is unfortunate ...! Greetings. Workunit: http://www.gpugrid.net/workunit.php?wuid=2514023 |
Carlesa25; Joined: 13 Nov 10; Posts: 328; Credit: 72,619,453; RAC: 0
|
Hi, once again a task has been sent to two users on the same day (three hours apart). I think it is worth reporting in case it is a server failure or some special circumstance of the task ...? Greetings. Workunit 2521888 http://www.gpugrid.net/workunit.php?wuid=2521888 |
skgiven; Joined: 23 Apr 09; Posts: 3968; Credit: 1,995,359,260; RAC: 0 |
The work unit was sent out. The host did not complete the task within 2 days, so it was resent. The first host then returned the task as an Error. This prompted the server to send out an additional WU. Subsequently both resends returned as Valid results. This is normal server behavior. |
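The sequence described above can be replayed with a small sketch. It assumes, based only on the explanation in this thread and not on the actual GPUGrid server code, that each copy passing the 2-day mark without a result and each error return triggers one more copy, until a valid result exists. The function name `replay_workunit` and the event labels are hypothetical.

```python
def replay_workunit(events):
    """Replay one workunit's history and count copies issued and valid results.

    `events` is a chronological list of labels (all hypothetical):
      "slow"  - a copy passed the 2-day mark without returning a result
      "error" - a copy came back as an error
      "valid" - a copy came back as a valid result
    """
    issued = 1        # the original copy
    valid = 0
    for kind in events:
        if kind == "valid":
            valid += 1
        elif kind in ("slow", "error"):
            # Assumed rule: each trigger sends one more copy,
            # but only while no valid result has been received.
            if valid == 0:
                issued += 1
        else:
            raise ValueError(f"unknown event: {kind}")
    return issued, valid


# History as described for workunit 2521888 in the post above.
issued, valid = replay_workunit(["slow", "error", "valid", "valid"])
print(f"copies issued: {issued}, valid results: {valid}")
```

Under these assumptions the workunit ends up with three copies issued and two valid results, which is the duplication the original poster noticed.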
|
Joined: 17 Jul 07; Posts: 2; Credit: 332,470; RAC: 0
|
At the moment I suspect that a lot of work is being wasted due to this policy. Don't forget that BOINC is a multi-project environment and the scheduler takes into account the project resource share and the deadline when scheduling tasks. The problem is made worse in that BOINC can take up to a day to send the results back. A much better solution would be to make the deadline 2 days and let BOINC work as it was designed. You will get more complaints about the short deadline but it's better to be open about it instead of wasting resources. The users can then decide for themselves whether GPU is for them. It's only recently I've become aware of this policy and it's very frustrating to realise that probably every one of my tasks has been re-issued. What a waste! |
|
Joined: 10 Oct 08; Posts: 18; Credit: 39,100,916; RAC: 0
|
Hmm I agree with making the deadline 2 days instead of 5. I often get sent tasks which are still running on slower GPUs, my WU will complete and it'll still be running on the other host, just a waste of resources if you ask me. Would also, hopefully, stop boinc downloading 2 GPUGRID tasks (my buffer is set to 1 day because I need enough einstein WUs to feed my GPU which runs 4 of those at once) |
skgiven; Joined: 23 Apr 09; Posts: 3968; Credit: 1,995,359,260; RAC: 0 |
I almost agree in principle, though I would opt for a 3 day cutoff point and change the credit system slightly (150% for <1 day return, 100% for <2 day return, 50% for <3 day return). Anyway, it's really down to the scientists to work out what is best for the project overall, not us. They know things we don't. One concern with a low deadline would be getting tasks in the first place; Boinc's lack of up-time could prevent you getting a task. There is a recommendation to use report tasks immediately (see the FAQ section). It's important to remember that the next batch of tasks relies on the completion of all existing tasks, so following a failure it is essential to have another task sent out, to expedite the overall science, not just the one task. There have been plenty of suggestions on how to improve the science project. For example, if a task is completed by one person, send a signal out to any other crunchers to stop crunching that task and credit them for the percentage already completed. The problem is implementation, and many suggestions are just not doable with present Boinc versions. Running more than one GPU project often results in problems. You would be better off crunching GPUGrid tasks for a while and then Einstein tasks for a while (if you must), rather than both at the same time. A low cache is the recommended setting at GPUGrid. PS. I would have thought running 4 Einstein tasks on a GTX480 would be suboptimal (2 or perhaps 3, but not 4). |
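The tiered credit suggested above can be written down as a short sketch. The percentages come from the post; the function name `credit_multiplier`, the base credit value, and the choice of zero credit at or past the 3-day cutoff are assumptions made for illustration only.

```python
def credit_multiplier(return_days: float) -> float:
    """Credit multiplier for a result returned after `return_days` days."""
    if return_days < 0:
        raise ValueError("return_days must be non-negative")
    if return_days < 1:
        return 1.5   # 150% for a return in under 1 day
    if return_days < 2:
        return 1.0   # 100% for a return in under 2 days
    if return_days < 3:
        return 0.5   # 50% for a return in under 3 days
    return 0.0       # assumption: no credit at or past the 3-day cutoff


base_credit = 1000.0  # hypothetical base credit for one task
for days in (0.5, 1.5, 2.5, 3.5):
    print(days, base_credit * credit_multiplier(days))
```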
|
Joined: 10 Oct 08; Posts: 18; Credit: 39,100,916; RAC: 0
|
Yes, I usually don't run both GPU projects at the same time. Also use report tasks immediately. I myself have no trouble completing WUs in time. Running 4 einstein WUs works great btw. If I'm not using the pc it completes them in about 95 mins (with 4 CPU tasks running alongside). Running 3 or 2 only makes them complete a bit faster. But not enough to cancel out running 1 or 2 tasks less. Annoyingly enough even 4 einstein WUs only get GPU usage up to a max of about 80%. |
|
Joined: 17 Jul 07; Posts: 2; Credit: 332,470; RAC: 0
|
By all means reward users for a fast turnaround, but I don't think it's a good idea to penalise users for returning tasks within the deadline period. Not sure I understand what the issues are around BOINC's lack of up-time, however the last thing you want is for users to stock up their buffer above 2 days. For info I've never had any problems running multiple GPU projects at the same time. BOINC takes care of this rather well and ensures tasks are (almost) always completed within the deadline, which brings us back to the original problem. |
|
Joined: 16 Mar 11; Posts: 509; Credit: 179,005,236; RAC: 0
|
What's causing the waste here is the fact that tasks are resent before the first computer misses the deadline. The concept is simple...decide how soon you need results back and set the deadline to that many days. If they want them back in 2 days then grow a pair, draw the line and set a 2 day deadline. If they feel 3 days is adequate then make the deadline 3 days. Setting the deadline to 5 and then resending tasks after 2 days is just plain dumb because it's inevitably going to end up producing a lot of wasted (duplicated) effort. It's better for systems that can't return a task in 2 days to crunch somewhere else rather than waste their time duplicating work here. |
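To put a rough number on the duplicated effort, here is a minimal sketch. It assumes that every copy resent after the early-resend point also completes, so any first copy returned after that point but within the deadline counts as duplicated work. The return times are made-up illustration values, not project statistics, and `duplicated_fraction` is a hypothetical name.

```python
def duplicated_fraction(first_return_days, resend_after_days, deadline_days):
    """Fraction of workunits whose first copy returns after a resend was issued
    but still within the deadline (so both copies do the full work)."""
    if not first_return_days:
        return 0.0
    duplicated = sum(
        1 for t in first_return_days
        if resend_after_days < t <= deadline_days
    )
    return duplicated / len(first_return_days)


# Hypothetical return times (in days) of the first copy of four workunits.
sample = [0.5, 1.5, 2.5, 4.0]

# Policy discussed in the thread: 5-day deadline, resend after 2 days.
current_policy = duplicated_fraction(sample, resend_after_days=2, deadline_days=5)
# Alternative argued for above: deadline equal to the resend point.
aligned_policy = duplicated_fraction(sample, resend_after_days=2, deadline_days=2)

print(current_policy, aligned_policy)
```

With these numbers half of the workunits are done twice under the 5-day deadline with a 2-day resend, and none when the deadline matches the resend point; the two slow results would then simply be late.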
Retvari Zoltan; Joined: 20 Jan 09; Posts: 2380; Credit: 16,897,957,044; RAC: 0
|
I agree with Dagorath on the deadline. |
skgiven; Joined: 23 Apr 09; Posts: 3968; Credit: 1,995,359,260; RAC: 0 |
Boinc calculates the ratio of time the computer is on and running Boinc. If Boinc thinks it is not on long enough to finish a task in 2 days then Boinc will not ask for a task. Crunching for other GPU projects would also impact upon asking for GPUGrid tasks, as would things like task failures and your Boinc configuration, such as use GPU while computer is in use... 2 days would be long enough for many people for any tasks, but not long enough for many people trying to run long WUs. Boinc sees different tasks as all part of the same GPUGrid project, so just trying to run one long task might throw it out and prevent someone that would be capable of running normal tasks from getting any. Then there are different Boinc versions (some might behave slightly differently to others) and what would happen if you had a poor GPU and replaced it with a high end GPU? Basically I think a bit more leeway is called for than a 2 day cutoff. Ideally Boinc would better understand separate task types and their requirements and even allow the user to more exactly specify what type of task and when these tasks should run. So GPUGrid has to be a bit more flexible and try to facilitate more people and operate within Boinc's restrictions. The research team also spent a lot of time and effort getting to where we are now. At some stage (now) they have to concentrate more on the Science and maintenance (new GPUs/drivers/CUDA versions) and less on the project setup and development. Also, changes always annoy someone. |
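The availability argument can be illustrated with a simplified check. This is only a sketch of the idea, not the real BOINC client scheduling logic, which weighs more factors; `on_fraction`, `runtime_hours`, `wall_clock_days` and `can_finish_in_time` are names chosen for the example.

```python
def wall_clock_days(runtime_hours: float, on_fraction: float) -> float:
    """Estimated elapsed days to finish a task on a host that is only
    crunching for `on_fraction` of the time (0 < on_fraction <= 1)."""
    if not 0 < on_fraction <= 1:
        raise ValueError("on_fraction must be in (0, 1]")
    return runtime_hours / on_fraction / 24.0


def can_finish_in_time(runtime_hours: float, on_fraction: float,
                       deadline_days: float) -> bool:
    """True if the estimated elapsed time fits within the deadline."""
    return wall_clock_days(runtime_hours, on_fraction) <= deadline_days


# Hypothetical host: a 12-hour task, machine crunching 20% of the time.
estimate = wall_clock_days(12, 0.2)
print(round(estimate, 2), "days")
print("fits 2-day deadline:", can_finish_in_time(12, 0.2, 2))
print("fits 5-day deadline:", can_finish_in_time(12, 0.2, 5))
```

In this made-up case the task needs about two and a half elapsed days, so the host would be shut out by a 2-day deadline but not by a 5-day one, which is the concern raised in the post.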
|
Joined: 16 Mar 11; Posts: 509; Credit: 179,005,236; RAC: 0
|
"Boinc calculates the ratio of time the computer is on and running Boinc. If Boinc thinks it is not on long enough to finish a task in 2 days then Boinc will not ask for a task." How can that be when the deadline is currently 5 days? Was the 2 above a typo? Or do I not understand the scheduler?
"2 days would be long enough for many people for any tasks, but not long enough for many people trying to run long WUs." Then make the deadline 3 days. Or 4 days. Or 100 days, I don't care. But stop resending tasks before the deadline is up.
"Boinc sees different tasks as all part of the same GPUGrid project, so just trying to run one long task might throw it out and prevent someone that would be capable of running normal tasks from getting any." If they shortened the deadline to 2 days some people would find their ACEMD long tasks would get status Abandoned or No Response or whatever this server gives when a task goes over deadline. They would soon learn to go to their preferences and deselect ACEMD long so they get only ACEMD standard tasks. To make the transition less painful they could advertise in the threads and on the home page that the deadline is about to change and that those with slower GPUs and those who don't run BOINC 24/7 should deselect the long tasks.
"Then there are different Boinc versions (some might behave slightly differently to others) and what would happen if you had a poor GPU and replaced it with a high end GPU?" I don't see any problem with that. If they've deselected the long tasks because they have a slower GPU and then they get a faster GPU they can easily select the long tasks. Seems simple enough. Did I miss something?
"Basically I think a bit more leeway is called for than a 2 day cutoff." I mentioned 2 days in my last post because that seems to be the return time the admins want. I could live with a 3 day deadline. Or 4 days. Or 100 days. Whatever the admins want is fine with me. Just stop resending tasks before the deadline expires.
"Ideally Boinc would better understand separate task types and their requirements and even allow the user to more exactly specify what type of task and when these tasks should run. So GPUGrid has to be a bit more flexible and try to facilitate more people and operate within Boinc's restrictions." Indeed projects should be flexible and facilitate and operate within BOINC's restrictions. However, I think project admins' primary obligation is to use donated resources as efficiently as possible. That's not happening now.
"The research team also spent a lot of time and effort getting to where we are now. At some stage (now) they have to concentrate more on the Science and maintenance (new GPUs/drivers/CUDA versions) and less on the project setup and development. Also, changes always annoy someone." As I understand it, changing the deadline is as simple as changing a number in a config file. Resending tasks before the deadline expires seems to me to be custom code by GPUgrid devs but it shouldn't be too hard to go back to the standard BOINC server way. As for people being annoyed by changes...well...there are people annoyed now by the current policy of resending tasks before the deadline expires. So what's an admin to do? In my mind there is no contest. They MUST live up to their primary obligation to use donated resources as efficiently as possible. That means appease the people who want to eliminate duplicated work and to heck with those who support the status quo. Remember, less duplicated work means more throughput for the project. Who doesn't want more throughput? |
skgiven; Joined: 23 Apr 09; Posts: 3968; Credit: 1,995,359,260; RAC: 0 |
I was explaining what might happen if we had a 2-day deadline, speaking hypothetically, rather than explaining the existing 5-day system. I agree that the system should be reviewed periodically, and the deadline probably reduced, but it's not my call and I don't have all the info. Resends are always going to happen (tasks fail, as do GPUs, computers, routers). Even if the deadline was 2 or 3 days, tasks would still fail/not be returned in time and have to be resent. The number of resends might actually rise if the deadline was changed. I don't have the figures so I can't work it out, but the team could do some stats to work out in advance if it would expedite the project overall, or hinder it. If they were really up for the challenge they could write a program to continuously analyze returns and self-regulate the deadline within boundaries, but I expect Boinc would backfire. |
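One way to picture a deadline that self-regulates within boundaries is the sketch below. It is purely hypothetical: it picks the return time that covers a target share of recent results and clamps it between a lower and an upper bound. The name `adaptive_deadline`, the 90% target and the 2-day and 5-day bounds are illustrative assumptions, not anything the project is known to use.

```python
def adaptive_deadline(return_days, target_percent=90, lower=2.0, upper=5.0):
    """Pick a deadline (in days) that would have covered `target_percent`
    of the recent return times, clamped to the range [lower, upper]."""
    if not return_days:
        return upper  # no data yet: fall back to the most lenient bound
    ordered = sorted(return_days)
    n = len(ordered)
    # Index of the smallest return time covering the target share
    # (integer ceiling of target_percent * n / 100, minus one).
    k = max(0, (target_percent * n + 99) // 100 - 1)
    return min(upper, max(lower, ordered[k]))


# Made-up return times (in days) for ten recent results.
recent = [0.5, 1, 1.5, 2.5, 3, 3.5, 4, 6, 1, 2]
print(adaptive_deadline(recent))
```

For the made-up sample this yields a 4-day deadline; a pool of only fast hosts would be held at the 2-day lower bound, and a pool of only slow hosts at the 5-day upper bound.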
|
Joined: 16 Mar 11; Posts: 509; Credit: 179,005,236; RAC: 0
|
"I was explaining what might happen if we had a 2-day deadline, speaking hypothetically, rather than explaining the existing 5-day system. I agree that the system should be reviewed periodically, and the deadline probably reduced, but it's not my call and I don't have all the info." I realize it's not your call but maybe the project admins can be persuaded through a discussion here.
"Resends are always going to happen (tasks fail, as do GPUs, computers, routers)." Resends for the reasons you state will always be with us but that doesn't mean we should set a deadline of 5 days but resend if the result hasn't returned in 2 days.
"Even if the deadline was 2 or 3 days, tasks would still fail/not be returned in time and have to be resent." Indeed tasks would still fail but that's extraneous to discussion of the deadline. Tasks not returned in time is what we're focused on here.
"The number of resends might actually rise if the deadline was changed." No, the number would not rise, because the tasks are being resent now after 2 days anyway. How can reducing the real deadline to match the artificial deadline induce more late returns? There is no cause-effect relationship there that I can see. If the real deadline (now 5 days) were reduced to match the artificial deadline (now 2 days) then volunteers who do not return tasks in 2 days would get some feedback that their results are not returning in time. They might notice that their RAC decreases or they might see many "Abandoned" or "No Reply" outcomes and zero credits awarded in their list of tasks on the website. They'll figure it out for themselves or they'll inquire in the forums as to what's going wrong. Ideally someone would create a sticky thread and put an item on the home page news warning folks of an upcoming reduction in real deadline a month or more in advance and advise users what to watch for and do. Another mechanism would kick in too. BOINC would warn users that task XYZ is about to miss its deadline. That would get users asking questions too. One way or another, volunteers who cannot return a task in 2 days would come to realize that they need to select only the short tasks in their preferences or perhaps move their resources to a different project. With the way things are now they just keep missing the 2 day artificial deadline over and over but get credits anyway and no feedback that they've missed the artificial deadline. They have no reason/motivation to change. Therefore, in the long run, reducing the real deadline from 5 days to match the artificial deadline of 2 days will actually reduce the number of resends. As I said, that won't happen immediately. It will happen as volunteers become aware of the change. If we leave things as they are then there is almost zero chance of ever reducing the number of resends.
"I don't have the figures so I can't work it out, but the team could do some stats to work out in advance if it would expedite the project overall, or hinder it." It seems to me pertinent stats won't exist until the real deadline is reduced to match the artificial deadline. I don't see how they can predict from the stats they have now. At present we can only reason our way to the outcome based on what we know (or assume) about volunteer behavior.
"If they were really up for the challenge they could write a program to continuously analyze returns and self-regulate the deadline within boundaries, but I expect Boinc would backfire." Perhaps I don't fully understand what you're getting at but it sounds like that would result in a floating deadline. I think that could only confuse users in the sense that some would find their results return in time today but not the next day. I can't see that as being a good thing. Decide what the project needs for a return time (it seems to be 2), draw one line in the sand and stick to it. Don't fudge around with artificial deadlines or floating deadlines. |
©2025 Universitat Pompeu Fabra