Message boards : Graphics cards (GPUs) : Longer deadline for NVIDIA Quadro FX 570

Krzychu P. (Joined: 10 Jul 07, Posts: 3, Credit: 2,610,800, RAC: 0)

Hi! I've tried to crunch on an NVIDIA Quadro FX 570: http://www.ps3grid.net/workunit.php?wuid=64568

Four days to report one WU is too short. I'm now at 112 hours of crunching (73.67% progress), so to crunch WUs for GPUGRID I would need at least 8 days to report a unit. My question is: would it be possible to make the time to report a little longer, for example 10 days?

Krunchin-Keith [USA] (Joined: 17 May 07, Posts: 512, Credit: 111,288,061, RAC: 0)

Hi! That has been answered before, but I could not find the answer. The answer is no. The basic reason: the science is needed back quickly, hence the current deadline. Additionally, some results, once returned, are resent to another user for further processing, and this can happen many times. With long deadlines it takes too much time to get the final result.

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

I understood that the steps within each WU are timesteps, so you need a finished WU before you can generate and distribute the next WU with the same structure.

MrS
Scanning for our furry friends since Jan 2002

Krzychu P. (Joined: 10 Jul 07, Posts: 3, Credit: 2,610,800, RAC: 0)

Hi! OK, I can understand the reason. So maybe it would be possible to make shorter WUs with the same 4-day deadline?

[BOINC@Poland]AiDec (Joined: 2 Sep 08, Posts: 53, Credit: 9,213,937, RAC: 0)

Great idea!!! :)

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

It would take quite some work to make sure BOINC wouldn't distribute longer WUs to slower clients. I guess it's not very high on the priority list.

MrS
Scanning for our furry friends since Jan 2002

rebirther (Joined: 7 Jul 07, Posts: 53, Credit: 3,048,781, RAC: 0)

A 1-week deadline for slower cards (8600GT)? Every contribution is useful!

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

By doing that you would actually slow the project down. Sure, they could work on a bit more different work in parallel, but they'd have to wait longer for results. The point of this project is to build a fast supercomputer.

And if you start with longer deadlines for slower GPUs, why stop at 32 shaders? Wouldn't the owners of 16- and 8-shader GPUs demand the same special treatment?

MrS
Scanning for our furry friends since Jan 2002

rebirther (Joined: 7 Jul 07, Posts: 53, Credit: 3,048,781, RAC: 0)

> By doing that you would actually slow the project down. Sure, they could work on a bit more different work in parallel, but they'd have to wait longer for results. The point of this project is to build a fast supercomputer.

CUDA starts only with 64 shaders?

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

What do you mean? All nVidia chips with G80 or higher have some CUDA capability; this is not related to performance or the number of shaders. The limitation arises for GPU-Grid because the project team had to set some deadline: "if we get results back within x days we're fine and we can actually make some scientific progress". Running (e.g.) millions of slow WUs in parallel can only help you so much if you have to wait years for the results.

MrS
Scanning for our furry friends since Jan 2002

(Joined: 21 Oct 08, Posts: 144, Credit: 2,973,555, RAC: 0)

> It would take quite some work to make sure BOINC wouldn't distribute longer WUs to slower clients. I guess it's not very high on the priority list.

I think the original suggestion was to break ALL the work down into smaller units that slower cards could still run within the 4-day period. This would not be difficult from the perspective of BOINC, but might not be possible with the specific scientific application of the project. Indeed, given the suggestion elsewhere in the forum that 256 MB cards might not be able to handle workunits in the future, it sounds like the trend is in the direction of larger rather than shorter workunits.

If the workunits can be broken down into smaller units, however, then this should be seriously considered, given the widespread use of slightly slower and less expensive graphics cards (e.g., 16- and 32-shader cards are fairly common OEM components in current off-the-shelf systems).

Also, I think you are somewhat overstating the difficulty of distributing different kinds of work in BOINC, since other projects seem to be able to do this through various methods (the GPU applications add extra complications, but even here there is separate PS3 and GPU work?). Each card's model ID is already recorded by the client and reported with each task, so perhaps this could be used without too much additional database overhead?

[BOINC@Poland]AiDec (Joined: 2 Sep 08, Posts: 53, Credit: 9,213,937, RAC: 0)

NO to a longer deadline... NO to shorter WUs... Then maybe two different WU types? With the possibility to choose in your profile (as in, for example, the CPDN project)? It's just an idea - please don't kill me for it ;)

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

Mh, they could easily reduce the number of steps in each WU, trading file transfer overhead for shorter computation times. That would still slow the project down on average, though, as you'd get less work done in the same time. I don't know how much of a slowdown would be tolerable.

> Also, I think you are somewhat overstating the difficulty of distributing different kinds of work in BOINC, since other projects seem to be able to do this through various methods (the GPU applications add extra complications, but even here there is separate PS3 and GPU work?). Each card's model ID is already recorded by the client and reported with each task, so perhaps this could be used without too much additional database overhead?

OK, implementing something is probably not that difficult. But getting it right and making it robust enough that it just works? It took us (the BOINC devs and we as a tester community) more than a few weeks to properly teach BOINC a trick as simple as "keep the GPUs fed". So it seems that under the hood things are quite complicated; that's why I'm sceptical.

> Then maybe two different WU types? With the possibility to choose in your profile (as in, for example, the CPDN project)?

That's not in the spirit of BOINC ;) It's supposed to do these things for you, so you don't have to micro-manage your crunching and don't have to care about the underlying details. Just imagine all the forum posts: "Why does it not work properly?" "What did you set? How fast is your card, how many shaders, what are the core and shader clocks?" "What's a shader? I just want to run this project..." I guess that's not what they want.

It would be good to hear from the project team whether anything in this direction (support for slower cards) is planned.

MrS
Scanning for our furry friends since Jan 2002
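The trade-off between fewer steps per WU and fixed per-WU transfer overhead can be sketched as a toy calculation. All numbers here (step times, overhead) are invented for illustration, not project figures:

```python
# Toy model: fewer steps per WU means shorter WUs, but the fixed per-WU
# upload/download overhead eats a larger share of the total time.
# All numbers are illustrative assumptions.

def efficiency(steps_per_wu: int, seconds_per_step: float,
               overhead_seconds: float) -> float:
    """Fraction of wall-clock time spent on useful computation."""
    compute = steps_per_wu * seconds_per_step
    return compute / (compute + overhead_seconds)

# Assume a 10-minute transfer overhead per WU and 30 s per simulation step:
print(round(efficiency(10000, 30, 600), 3))  # long WU: overhead negligible
print(round(efficiency(500, 30, 600), 3))    # short WU: overhead more visible
```

The model only captures the overhead side; it ignores the latency argument made elsewhere in the thread, which is the stronger objection to shorter WUs.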

Krunchin-Keith [USA] (Joined: 17 May 07, Posts: 512, Credit: 111,288,061, RAC: 0)

I can't answer exactly for the project team, but my gut feeling is that we, users, project and developers, are concentrating on getting one aspect working first. If you attempt too much at once you just get more confusion and problems. Once things are working more smoothly with BOINC and the GPU handling, I'm sure effort will go into development in other areas, i.e. slower cards, ATI, etc.

As it stands now, BOINC is already capable of handling this, although it has not been tested. The client already sends complete information about the GPU to the server, at least everything returned by the CUDA API. This info is passed to the scheduler, which can use it to decide which app to send and to estimate job completion time. However, implementing this is not as easy as that makes it sound. Special handling is needed: an FPOPS estimate for the GPU requesting work, and then separate apps for slow vs. fast GPUs. The scheduler can then match short jobs to slow cards and long jobs to fast cards. But even with this, the time restrictions of the science (the necessity of results within a certain deadline) may mean that smaller or shorter tasks will not benefit the current application.

There are plans for additional uses of the GPU and CPU by this project; possibly those uses may benefit those not able to participate at the moment. There have been hints about this. But like I said before, we need to get one thing working well first. The thing to remember is that all this is really new: only about 8 months have gone into the development so far, so GPU computing is still in its infancy.
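The matching idea described above (estimate completion time from the reported GPU speed and the job's FPOPS count, then only send jobs that fit the deadline) can be sketched as follows. This is a toy model, not the real BOINC scheduler code; all names and numbers (`gpu_gflops`, job sizes) are illustrative assumptions:

```python
# Hypothetical sketch of deadline-aware job matching. Not BOINC's actual
# scheduler API; names and figures are invented for illustration.

DEADLINE_DAYS = 4
HOURS_ON_PER_DAY = 24  # assume a dedicated, always-on cruncher

def estimated_hours(job_gflop: float, gpu_gflops: float) -> float:
    """Rough wall-clock estimate from the FPOPS count and device speed."""
    return job_gflop / gpu_gflops / 3600.0

def can_meet_deadline(job_gflop: float, gpu_gflops: float,
                      hours_per_day: float = HOURS_ON_PER_DAY) -> bool:
    """Would this card return the job within the deadline?"""
    return estimated_hours(job_gflop, gpu_gflops) <= DEADLINE_DAYS * hours_per_day

# A fast card (~400 GFLOPS effective) vs. a slow one (~30 GFLOPS),
# with an illustrative 40-million-GFLOP job:
big_job = 4.0e7
print(can_meet_deadline(big_job, 400))  # fast card fits in 4 days
print(can_meet_deadline(big_job, 30))   # slow card misses the deadline
```

With per-size job classes, the same check would let the scheduler hand a smaller `job_gflop` to the slow card instead of rejecting it outright.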

Krzychu P. (Joined: 10 Jul 07, Posts: 3, Credit: 2,610,800, RAC: 0)

> Mh, they could easily reduce the number of steps in each WU, trading file transfer overhead for shorter computation times. That would still slow the project down on average, though, as you'd get less work done in the same time. I don't know how much of a slowdown would be tolerable.

"That would still slow the project down" - I don't think I can agree with that. An example: you have 20 strongmen, each carrying 50 stones at once from place to place. Then you send them help: 50 normal men, each of whom can carry 15 stones in the same time. When will the work be done faster, in the first situation or in the second?

@Keith: Thanks for the answer. It seems I have to wait for new things to be implemented in the future. I can't wait for that. :) I hope that when shorter WUs (with a reduced number of steps) become available, I will read about it in the news on the project's home page. ;)

Cheers

P.S. Sorry for my English ;)

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

Maybe I should have explained it in a bit more detail. There are two kinds of "speed":

Throughput: that's what you're referring to; it's the total amount of work done per time interval. This could increase if slower cards could contribute as well.

Latency: that's what this project is concerned with, apart from getting reasonable throughput. It's the time it takes to get an answer to your question. Since the WUs are based on each other, this is much longer than the time of a single WU.

As an extreme example, consider the following case: you may have millions of WUs running in parallel which all take several years to complete. Once the first one comes in, you discover that you made some mistake and all the results are worthless.

Allowing a longer deadline, or shorter WUs with the same deadline, would increase both latency and throughput, so a compromise has to be made (or different WUs distributed).

MrS
Scanning for our furry friends since Jan 2002
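The latency point above comes from the WUs being chained: each one continues the previous one's trajectory, so per-WU times add up no matter how many chains run in parallel. A toy calculation, with made-up numbers, shows why the deadline dominates the time to a scientific answer:

```python
# Toy model of chained WUs: latency to the final result is the sum of the
# per-WU turnaround times, regardless of how many trajectories run in
# parallel. All numbers are illustrative assumptions.

def trajectory_latency_days(chained_wus: int, days_per_wu: float) -> float:
    """Each WU depends on the previous one, so turnaround times add up."""
    return chained_wus * days_per_wu

# One trajectory split into 100 chained WUs:
print(trajectory_latency_days(100, 4))   # 4-day turnaround: 400 days
print(trajectory_latency_days(100, 10))  # 10-day turnaround: 1000 days
```

Adding slower cards raises throughput (more trajectories at once) but never shortens any single chain, which is why the project guards the per-WU deadline.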

Krunchin-Keith [USA] (Joined: 17 May 07, Posts: 512, Credit: 111,288,061, RAC: 0)

That would depend on how the stones are piled. You might have to wait for some to be moved before you could get to the others. Then you would just have men standing around doing nothing, waiting on others to move their load.

It's like this: in order for that to work here, a current task would have to be split, let's say into 6 parts, such that each part could be done without depending on the others. Then all 6 parts could be done by six older GPUs within the 4 days and all returned on the fourth day, then the parts reassembled, assuming no errors are produced. If one of the six parts errored on the host, it would have to be redone, causing a delay in getting the final result.

Now let's assume you can split the task into six parts, but you need the result from one part before the next can start. If you have to wait 4 days for part 1 to be returned before part 2 can be started, and so on, then it takes 24 days to finish one result. Even if each part took one day, it would still take too long to get the final result.

I do not know what is possible here. But the point is that the final result is needed back within the time frame the project has set.
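The 4-day vs. 24-day arithmetic above can be written out as a small sketch. The function and its numbers are illustrative, not project code:

```python
# Worked version of the splitting example above: dividing a task into
# 6 parts only helps if the parts are independent. Numbers illustrative.

def completion_days(parts: int, days_per_part: float, dependent: bool) -> float:
    """Wall-clock days to finish all parts of one split-up task."""
    if dependent:
        # part N+1 cannot start until part N is reported back
        return parts * days_per_part
    # independent parts can run on `parts` hosts at once
    return days_per_part

print(completion_days(6, 4, dependent=False))  # 4 days: all parts in parallel
print(completion_days(6, 4, dependent=True))   # 24 days: a 6x longer wait
```

Molecular dynamics trajectories fall into the dependent case, since each timestep needs the previous one, which is why splitting doesn't rescue slow cards here.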

Woyteck (Joined: 1 Nov 08, Posts: 2, Credit: 68,858, RAC: 0)

I've just had to abort two WUs because I couldn't manage to crunch them. I use my PC in a cycle: getting back from work, browsing and other stuff for 2-3 hours, then some games for 2 hours (I have to suspend crunching on my GPU during play), and then power off as I'm off to bed.

I've been crunching SETI@home since 1999 and have been on BOINC since the beginning, and I can tell you one thing: the deadline is too short. You will see many people coming in and leaving shortly afterwards just because of the far too short deadline. My 8800GT does a WU in 11 hours; in comparison, my CPU does a SETI@home WU in about 3 hours, and S@H gives a three-week deadline, yet I still manage to crunch one or two every evening...

And PLEASE remember that people who actually buy graphics cards capable of proper crunching are VERY likely into games, so the vast majority of the time the PC is on is when they play, and then they have to suspend computing because of bad performance in games.

I believe that's what BETA crunching is all about: finding problems and solving them. One of them is the deadline being too short, and it's one of the easiest problems to solve... I just hope that someone will listen to an experienced BOINC user. ;)

(Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0)

I know this thread is getting a bit long, but please read some of Keith's or my posts to understand why GPU-Grid is different from SETI and why just increasing the deadline won't work. I'm confident, though, that *something* will eventually be implemented once other problems are solved.

And regarding gaming and desktop use: they tried to reduce the impact of GPU-Grid, but it didn't work well (massive performance hit). But the GTX 260 is already fast enough that it doesn't disturb gaming any more, so as time goes by the problem will become less pronounced.

MrS
Scanning for our furry friends since Jan 2002

Kokomiko (Joined: 18 Jul 08, Posts: 190, Credit: 24,093,690, RAC: 0)

Shorter deadlines would make it impossible for me to crunch on my XP 64 PC. Every WU with a runtime of less than 8 hours consumes >= 71 MB of video memory. After 2 days, over 420 MB of video memory are in use and I have to restart the PC, otherwise it will crash all WUs on the third day. With a shorter deadline I would have to stop crunching, since I can't restart the PC every day. I'm not home often enough.
©2025 Universitat Pompeu Fabra