Have work units gotten longer????
| Author | Message |
|---|---|
|
Joined: 20 Aug 07 Posts: 18 Credit: 1,319,274 RAC: 0
|
My SLI system was completing two work units every 19-20 hours. Now it appears they are taking 24 to 38 hours. Has anyone else seen a jump in their completion times? |
Michael Goetz Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 0
|
Forgive the dumb question, but just checking the obvious, easy stuff first: you didn't accidentally leave SLI turned on, right? That would reduce the number of GPUs available to CUDA to 1. Mike |
|
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
My time to complete a WU jumped from about 8h to a little more than 10h on Vista 64-bit, and from about 6h to 7h on XP 32-bit. Both with a GTX 295. |
|
Joined: 20 Aug 07 Posts: 18 Credit: 1,319,274 RAC: 0
|
No, both XFX 9600 GSOs are operating independently. I just thought it was kind of odd to see a jump in times when they were pretty steady before and both are doing separate work units. |
|
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0
|
If WUs have the same name, you can assume they are of the same length. However, between different names (which refer to different experiments) there may be differences. We are usually running different systems that have different computational costs, so completion times may vary, though we intend to homogenize them as much as possible. ignasi |
|
Joined: 21 Oct 08 Posts: 144 Credit: 2,973,555 RAC: 0
|
"If WUs do have the same name, you can assume they are of the same length." Any possibility of knowing what the naming schemes mean? |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
mscharmack, something odd is happening with your machine. If you take a look at one of your recent results you'll see that there are many "No heartbeat from core client for 30 sec - exiting" messages. This means that the GPU-Grid client loses its connection to the BOINC client and subsequently shuts down. So the actual time per step didn't grow, and the WUs didn't grow (at least not that much), but the wall clock time until WU completion increased. This may mean either that the BOINC client stopped working (you'd have to restart it manually) or that the communication is disrupted, which could be due to an overzealous firewall or any other strange Windows / network issue. Sorry, can't be more specific than this. MrS Scanning for our furry friends since Jan 2002 |
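One way to check whether a result is affected by the heartbeat problem described above is to count those messages in the task's stderr output. What follows is only a minimal sketch in Python, assuming the stderr text has been copied from the result page into a string. The function name `count_heartbeat_exits` and the sample text are hypothetical; only the message string itself comes from the post above.

```python
# The exact message quoted in the post above.
HEARTBEAT_MSG = "No heartbeat from core client for 30 sec - exiting"


def count_heartbeat_exits(stderr_text: str) -> int:
    """Count the lines that report a shutdown after losing the core client."""
    return sum(1 for line in stderr_text.splitlines() if HEARTBEAT_MSG in line)


# Hypothetical sample: placeholder lines stand in for the rest of the output.
sample = """(other output)
No heartbeat from core client for 30 sec - exiting
(other output)
No heartbeat from core client for 30 sec - exiting
"""

print(count_heartbeat_exits(sample))  # prints 2
```

A count well above zero on a slow result would point at interrupted runs rather than longer work units, which is the distinction the post draws between time per step and wall clock time.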
Megacruncher TSBT Joined: 7 Aug 08 Posts: 8 Credit: 5,694,345,812 RAC: 0
|
I've certainly found that WU are taking longer and that as a result my daily credit is plummeting. What gives? The Scottish Boinc Team
|
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
"I've certainly found that WU are taking longer and that as a result my daily credit is plummeting. What gives?" Are you using BOINC 6.6.20? It has a reported problem; the best alternative is to roll back to 6.5.0 ... 6.6.23 is a potential alternative, but it looks to ME like they screwed up work fetch, so you have to watch and reset debts (for me every 24-48 hours), which means that these later versions are not yet ready for prime time ... 6.6.24 will not use the second GPU if you have more than one ... Waiting for 6.6.25 ... :) |
JockMacMad TSBT Joined: 26 Jan 09 Posts: 31 Credit: 3,877,912 RAC: 0
|
Paul, I am using 6.6.24 and all 4 GPUs (2x295) are working (still!). Now, I have tweaked the registry to force all the GPUs, so maybe the same will work for you. Or is it just in the situation where they are 2 different GPUs? |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
No, I have the same ... two GTX 295s ... though I think that they are from different MFGRs (pretty sure they are) ... which may be the cause. The trouble is the new test does not tell you WHAT failed. I suggested a change that would do that ... though I don't know if they are going to add that code to 6.6.25 or not ... There may be a different change with a cc_config flag to force use of all GPUs ... |
Megacruncher TSBT Joined: 7 Aug 08 Posts: 8 Credit: 5,694,345,812 RAC: 0
|
"I've certainly found that WU are taking longer and that as a result my daily credit is plummeting. What gives?" I'm using 6.6.15. I've got no problem getting work and on my 2 x GTX 260 machine both GPUs are crunching away. Nothing is crashing and there are no errors. It's just that WUs are taking two or three times longer to run, for the same credit. The Scottish Boinc Team
|
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
"I've certainly found that WU are taking longer and that as a result my daily credit is plummeting. What gives?" You may be running into the bug we thought was in 6.6.20 ... roll back to 6.5.0 and see if the times go back to normal. That is what I did when the problem hit me. I went from 16 tasks a day to 8-10. Rolled back to 6.5.0 and my 6-hour run times came back. (Some tasks took as long as 24 hours, though the time step did not seem to change.) Seriously, you should consider it. If this changes things, then we know the long run bug was introduced earlier than 6.6.20 ... |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
"Seriously, you should consider it. If this changes things, then we know the long run bug was introduced earlier than 6.6.20 ..." I second that. A runtime 2-3 times larger is not normal. MrS Scanning for our furry friends since Jan 2002 |
Megacruncher TSBT Joined: 7 Aug 08 Posts: 8 Credit: 5,694,345,812 RAC: 0
|
I downgraded to 6.5.0 and it seems to have sped things up no end! Thanks very much for the advice. The Scottish Boinc Team
|
|
Joined: 20 Aug 07 Posts: 18 Credit: 1,319,274 RAC: 0
|
Reverted to 6.4.7 today. The system seems to be much more stable than under 6.6.20. I will keep it there for a few days and check the results. |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
Thank you for the feedback. I wish it were otherwise, but sometimes the latest and greatest isn't ... That is why, unless there is a compelling new feature you MUST have, it is usually best to wait at least 30 days before adopting a new version. The problems with a version don't always come out in the first few days, unless they are glaring and major. The more subtle issues are harder to pin down and thus take longer to figure out. The only reason I always suggest 6.5.0 over the 6.4.x versions is that there is no need for the configuration file hacks to get the GPU running. Anyway, good that you are back running. |
|
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Hmm, to be honest, I see that some units really do take much longer to complete nowadays. These GIANNI units cost my little machine much more time than the older units, which I almost always finished in less than 24 hours. These newer units start with an estimate of 17 hours, and by the looks of it this one is going to take more than 24. As I post, it is at 21 hours 45 minutes and reports 5 hours still to go. Even with good old BOINC 6.5.0. |
|
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0
|
It would be good to have some repeatable test units on GPUGrid that you could subscribe to for testing purposes (in CPDN Beta you can subscribe to different tests). Having 3 or 4 workunits that have been verified as clean and reliable, and that could be retrieved repeatedly, would make comparisons a lot easier. It would also stop me running out of units for the day when they start erroring out and I cannot get any more. Rob |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
I'd say +/-10% is normal. It could also be that you'll get more credits for WUs which lasted longer. MrS Scanning for our furry friends since Jan 2002 |
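To put the numbers from this thread next to the +/-10% rule of thumb, the relative change in runtime can be worked out directly. The following is only a rough sketch in Python: the helper names `relative_change` and `within_normal_variation` are hypothetical, the 10% tolerance is the figure from the post above, and the hours are the approximate values quoted earlier for a GTX 295.

```python
def relative_change(old_hours: float, new_hours: float) -> float:
    """Fractional change in runtime; 0.25 means the task takes 25% longer."""
    return (new_hours - old_hours) / old_hours


def within_normal_variation(old_hours: float, new_hours: float,
                            tolerance: float = 0.10) -> bool:
    """True if the change stays inside the assumed +/-10% band."""
    return abs(relative_change(old_hours, new_hours)) <= tolerance


# Approximate figures reported earlier in the thread.
print(f"{relative_change(8, 10):.0%}")  # Vista 64-bit, 8 h -> 10 h: prints 25%
print(f"{relative_change(6, 7):.0%}")   # XP 32-bit, 6 h -> 7 h: prints 17%
print(within_normal_variation(8, 10))   # prints False
```

By this measure both reported jumps fall outside the band, while a runtime that doubles or triples, as in the 6.6.x reports, is far beyond it.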