Redundant Result
| Author | Message |
|---|---|
|
Joined: 9 Mar 09 Posts: 25 Credit: 3,721,079,753 RAC: 200
|
Why do I have a bunch of "Redundant result" tasks?
1 Apr 2009 8:28:46 UTC; 1 Apr 2009 19:25:33 UTC; Over; Redundant result; Cancelled by server; 0.00; ---; ---
475819 350586; 1 Apr 2009 1:27:08 UTC; 1 Apr 2009 11:19:36 UTC; Over; Redundant result; Cancelled by server; 0.00; ---; ---
475532 350440; 1 Apr 2009 14:00:10 UTC; 6 Apr 2009 14:00:10 UTC; In progress; ---; New; ---; ---; ---
475146 350206; 31 Mar 2009 20:32:21 UTC; 1 Apr 2009 14:00:10 UTC; Over; Success; Done; 1,362.38; 2,883.44; 4,613.50
473604 349200; 1 Apr 2009 3:41:34 UTC; 6 Apr 2009 3:41:34 UTC; In progress; ---; New; ---; ---; ---
473175 349017; 31 Mar 2009 14:07:04 UTC; 1 Apr 2009 1:27:08 UTC; Over; Redundant result; Cancelled by server; 0.00 |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
The most common cause is that these tasks are flawed and were canceled because someone else already ran them into the wall. So, they got canceled to save you the trouble ... |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
That's interesting: these tasks have an initial replication of 2 or 3, and as soon as the 1st result is in, the others are canceled. It looks like the project has enough GPUs that enough WUs are started in parallel. But they need to get results back quickly to finish these WUs, which is why they went for a higher initial replication. MrS Scanning for our furry friends since Jan 2002 |
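The behaviour described above (several copies of a workunit sent out, the first returned result wins, the rest are cancelled) can be illustrated with a minimal sketch. The `Task` and `WorkUnit` classes and the host names are hypothetical and are not the BOINC server code; the sketch only assumes that replicas still in progress are the ones marked "Redundant result".

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One replica of a workunit sent to one host (hypothetical model)."""
    host: str
    outcome: str = "In progress"


@dataclass
class WorkUnit:
    """A workunit with several replicas; not the real BOINC data model."""
    wu_id: int
    tasks: list = field(default_factory=list)

    def send(self, hosts):
        # Initial replication equals the number of hosts that get a copy.
        self.tasks = [Task(host) for host in hosts]

    def report_success(self, host):
        # The first successful result completes the workunit; every replica
        # that is still in progress is then cancelled as redundant.
        for task in self.tasks:
            if task.host == host:
                task.outcome = "Success"
            elif task.outcome == "In progress":
                task.outcome = "Redundant result"


wu = WorkUnit(wu_id=1)
wu.send(["host_a", "host_b", "host_c"])  # initial replication of 3
wu.report_success("host_b")
outcomes = {task.host: task.outcome for task in wu.tasks}
print(outcomes)
```

In this toy model a host that sees "Redundant result" has lost nothing except the slot the task occupied in its queue.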
|
Joined: 9 Mar 09 Posts: 25 Credit: 3,721,079,753 RAC: 200
|
I am new to GPUGRID, just joined on March 9th with a GTX 260. Did I do anything wrong? It looks like I only had three successes today, where I had been averaging 4 a day before. - Mitch |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
"I am new to GPUGRID, just joined on March 9th with a GTX 260. Did I do anything wrong? It looks like I only had three successes today, where I had been averaging 4 a day before." Which of your two systems do you want to discuss? One has not returned errors but, through no fault of its own, had a lot of tasks canceled. The other system is having errors and missed deadlines. One looks to me like it is working and the other isn't. As to the 260, based on *MY* personal experience you can see between 2 and 4 tasks per day downloaded and processed. Depending on the time reported, you can see your daily number varying between 1 and 7 ... While I am up and in the computer room I force an update to push my work up, and I am slowly moving to 6.6.20 and, as I do, adding the "report results immediately" flag to do this more auto-magically ... Anyway, I am confused ... |
|
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0
|
"That's interesting: these tasks have initial replication 2 or 3. And as soon as the 1st result is in the others are canceled." That's right. It improves WU turnaround times a little for us, although we are going to try different approaches, since this "redundant result" status causes too much confusion for users. Thanks, ignasi |
|
Joined: 9 Mar 09 Posts: 25 Credit: 3,721,079,753 RAC: 200
|
The one the question is about is the one that is working, FOX-AMD-X4-940. The one that is not working is because I tried to connect an 8600 GTS to GPUGRID and did not realize it was not supported. It looks like I am still getting redundant results today, so the problem has not gone away. |
|
Joined: 21 Oct 08 Posts: 144 Credit: 2,973,555 RAC: 0
|
"The one that is not working is because I tried to connect an 8600 GTS to GPUGRID and did not realize it was not supported." The 8600 GTS is a supported card. It is not recommended due to its speed, but 32-shader cards will easily meet the new 5-day deadline in a single-core machine and, unless the shader clock is very slow (I'd say under 1000 MHz), should also have no problems in a dual-core. In an i7 such as yours, it will not be able to consistently meet deadlines as a single card due to the 4+ workunits downloaded, but if paired with another card it should have no problems. |
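The deadline reasoning above can be checked with simple arithmetic. The sketch below assumes each GPU runs one task at a time, that all queued tasks are downloaded together, and that both cards in the paired case run at the same speed. The 40 hours per task is a made-up placeholder, not a measured 8600 GTS figure; only the 5-day deadline and the "4+ workunits" queue size come from the post.

```python
DEADLINE_HOURS = 5 * 24  # the 5-day deadline mentioned in the post

# Assumption for illustration only: NOT a measured runtime for any card.
HYPOTHETICAL_HOURS_PER_TASK = 40.0


def last_finish_hours(queued_tasks, hours_per_task, gpus=1):
    """Hours until the last queued task finishes, one task per GPU at a time."""
    rounds = -(-queued_tasks // gpus)  # ceiling division
    return rounds * hours_per_task


def meets_deadline(queued_tasks, hours_per_task, gpus=1):
    """True if every task downloaded at the same moment finishes in time."""
    return last_finish_hours(queued_tasks, hours_per_task, gpus) <= DEADLINE_HOURS


scenarios = [
    ("single-core host, 1 card", 1, 1),
    ("dual-core host, 1 card", 2, 1),
    ("host with 4 queued tasks, 1 card", 4, 1),
    ("host with 4 queued tasks, 2 cards", 4, 2),
]
for label, queued, gpus in scenarios:
    hours = last_finish_hours(queued, HYPOTHETICAL_HOURS_PER_TASK, gpus)
    verdict = "ok" if hours <= DEADLINE_HOURS else "misses deadline"
    print(f"{label}: last task done after {hours:.0f} h ({verdict})")
```

With these placeholder numbers the fourth task on a single card would finish after 160 hours, past the 120-hour deadline, while the same queue split over two cards finishes in 80 hours.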
|
Joined: 9 Mar 09 Posts: 25 Credit: 3,721,079,753 RAC: 200
|
When I tried the 8600 GTS I only got errors for "Client detached". I had no successes over two days, so I disconnected it and attached it to SETI instead. I have 2 8600 GTS, 1 8500 GT, and an 8400 GS on SETI. I have the 260 GTX on GPUGRID, and it was working fine until yesterday and today, when I started to get a lot of "Redundant result" messages mixed in with successes. Here are my two days of trying the 8600 GTS; I had no successes.
22 Mar 2009 6:41:22 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; ---; ---
431335 325033; 21 Mar 2009 22:25:14 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; ---; ---
430860 324820; 22 Mar 2009 11:43:41 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; ---; ---
430235 324507; 22 Mar 2009 2:03:37 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; ---; ---
429495 324141; 21 Mar 2009 18:40:09 UTC; 22 Mar 2009 11:43:41 UTC; Over; Redundant result; Cancelled by server; 0.00; ---; ---
429222 323988; 21 Mar 2009 16:43:48 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; ---; ---
429087 323904; 21 Mar 2009 15:46:10 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; ---; ---
429085 323903; 21 Mar 2009 15:44:55 UTC; 21 Mar 2009 18:40:09 UTC; Over; Client error; Compute error; 301.04; 2,883.90; ---
429031 323870; 21 Mar 2009 15:22:47 UTC; 21 Mar 2009 16:43:47 UTC; Over; Client error; Compute error; 51.56; 2,478.99; ---
428976 323844; 21 Mar 2009 15:45:34 UTC; 22 Mar 2009 6:41:22 UTC; Over; Client error; Compute error; 986.61; 2,960.09; ---
428714 323679; 21 Mar 2009 15:46:10 UTC; 23 Mar 2009 14:09:26 UTC; Over; Client detached; New; 0.00; --- |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
OK, to point it out more clearly: the redundant results are intentional, they're nothing to worry about. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 9 Mar 09 Posts: 25 Credit: 3,721,079,753 RAC: 200
|
I understand they are intentional. I was only curious because all of a sudden I seemed to get a lot of them. Is a WU marked "Redundant result" before you ever start processing the WU, during the processing of the WU, or after you have completed processing the WU? - Mitch |
|
Joined: 21 Oct 08 Posts: 144 Credit: 2,973,555 RAC: 0
|
"I understand they are intentional. I was only curious because all of a sudden I seemed to get a lot of them. Is a WU marked Redundant result before you ever start processing the WU, during the processing of the WU, or after you have completed processing the WU?" Before. |
|
Joined: 9 Mar 09 Posts: 25 Credit: 3,721,079,753 RAC: 200
|
Then I really don't care :) Thanks, Mitch |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Ignasi & GDF, with the higher initial replication it sometimes happens that a WU is crunched by 2 hosts at the same time. I can see an interesting opportunity here:
- make the server check for such WUs
- compare their results
  -> if they're identical: great
  -> if there are differences:
     * trace the WU more closely
     * does it error shortly afterwards?
     * for some of them: issue both results as seeds for the following WUs and observe if the results converge

Maybe you already tested this carefully and extensively. And as I understand it, GDF is quite confident in the error finding mechanisms. But I think it would be quite interesting to see how reliable the GPU calculations are in the real world, the wild west of overclocking country. And as long as you don't issue new WUs, this error checking and tracing is basically free. MrS Scanning for our furry friends since Jan 2002 |
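The check proposed above could look roughly like the following sketch. Everything in it is hypothetical: the record layout (`wu_id`, `host`, `outcome`, `values`), the idea of reducing each result to a short list of numbers, and the tolerance are assumptions for illustration, not the GPUGRID result format or validator. Comparing real molecular dynamics output is likely to be harder than comparing a few floats.

```python
import math
from collections import defaultdict


def find_duplicate_results(results):
    """Group successful results by workunit; keep WUs returned by 2+ hosts."""
    by_wu = defaultdict(list)
    for result in results:
        if result["outcome"] == "Success":
            by_wu[result["wu_id"]].append(result)
    return {wu_id: rs for wu_id, rs in by_wu.items() if len(rs) >= 2}


def results_agree(a, b, rel_tol=1e-6):
    """Element-wise comparison of two numeric summaries within a tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def classify(results, rel_tol=1e-6):
    """Label each doubly-crunched WU as identical or worth tracing."""
    report = {}
    for wu_id, rs in find_duplicate_results(results).items():
        first = rs[0]["values"]
        same = all(results_agree(first, r["values"], rel_tol) for r in rs[1:])
        report[wu_id] = "identical" if same else "trace more closely"
    return report


# Made-up sample data; hosts, IDs and values are placeholders.
sample_results = [
    {"wu_id": 101, "host": "a", "outcome": "Success", "values": [1.0, 2.0]},
    {"wu_id": 101, "host": "b", "outcome": "Success", "values": [1.0, 2.0]},
    {"wu_id": 102, "host": "a", "outcome": "Success", "values": [1.0, 2.0]},
    {"wu_id": 102, "host": "c", "outcome": "Success", "values": [1.0, 2.5]},
    {"wu_id": 103, "host": "d", "outcome": "Success", "values": [3.0]},
    {"wu_id": 104, "host": "e", "outcome": "Redundant result", "values": []},
]
report = classify(sample_results)
print(report)
```

Workunits with a single successful result, or whose other replicas were cancelled, never enter the comparison, which matches the point that the check only uses results that already exist.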
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
We have done a couple of weeks of tests with redundant results and I did not like it much. It provides a better return time but it generates confusion. We are implementing a better and cleverer way to do it, which does not waste so many resources and guarantees better balancing between WUs belonging to the same batch. As for replication for validation, it is not easy to create a general validator for MD (if it is possible at all), and in fact it would not even be that useful, as we are practically validating by hand when we do the analysis every week. gdf |
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
I forgot. We have implemented and are testing a remote submission mechanism for BOINC and GPUGRID, which seems to work very well and will in the future provide load balancing of workunits (as mentioned in the previous message). GDF |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
OK.. thanks for the answer! MrS Scanning for our furry friends since Jan 2002 |
©2025 Universitat Pompeu Fabra