Long Running Tasks
| Author | Message |
|---|---|
|
Joined: 26 Dec 10 Posts: 115 Credit: 416,576,946 RAC: 0
|
I just completed task 4092363, and it ran for 51,000 seconds, or about 14 hours. I have never had a task run this long. Are these longer-running tasks? In the past the 8 - 12 hour tasks required about 5 - 6 hours. Has anyone else run into this situation? Thank you. |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
Yes, they vary a bit depending on the task. I've had "long" work units that take my factory-OC'ed 570 over 12 hours and others that come in under 6. It just depends on the work unit. Check the task name: they usually create a batch of them with the same name, and the quicker ones will be from a different batch with a different name. Looking at the WU you linked to, there were a couple of exits due to "no heartbeat from core client". I also notice you have the card running at 1.81 GHz, so presumably you've OC'ed it, which might be affecting it. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Paul, your GTX570 must be downclocking. I expect the OC is too high, or the GPU is being overtaxed (voltage/power). Most likely you are running at half speed or less, as the card tries to protect itself. TONI_AGGdense tasks use lots of power and push GPUs close to their limit, so when you OC you might be using more power than these GPUs were designed to take. I'm running a TONI_AGGdense task now on my GTX470 and it will take less than 7h, so your GPU should take less than 6h. I'm using SWAN_SYNC and have 2 CPU cores freed up, so it's well optimized. Utilization on W2003 is 98%, with no OC. I occasionally get typing lag, but I'm not using the system for anything other than light work. I suggest you run with at most a 5% OC for now. Good luck. |
|
Joined: 26 Dec 10 Posts: 115 Credit: 416,576,946 RAC: 0
|
Thanks for the note. How do I dedicate more CPUs to the GPU? I have Swan_Sync=0. Can I set it to 1 and dedicate two processors to the GPU? Thank you. |
|
Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0
|
No. Go into the Preferences in BOINC manager, click on the Processor Usage tab and set the value for "On multiprocessor systems use at most _ % of the processors" to the appropriate value. Your computer has 4 processors so to leave 2 processors free for feeding the GPU you would put 50 in the box. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
I had a strange, long GIANNI_KKFREE5 type WU. It was restarted several times because of system restarts for different reasons (a Windows update, and possibly two crash-recovery restarts; maybe the latter explains the strange behavior of this WU). In the end it took 61077s (2 minutes less than 17 hours) to process; the normal running time for this kind of WU is 9 hours. My GPU was not downclocking and was not overheated, and the progress indicator didn't go back to 0% during processing (as it usually does after crash-recovery restarts, possibly when the GPUGrid client crashes). There are two relevant error messages in the WU's log: 1. MDIO: read error for file "restart.coor", byte number 4: number of atoms (0) != (36497) expected (at least now I know the number of atoms for this one :) ) 2. No heartbeat from core client for 30 sec - exiting. Any ideas to comfort me? :) |
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 57
|
Long running tasks are supposed to take 8 to 12 hours on the fastest cards, but the latest TONI long units have been running about 6 to 7 hours on my GTX 480 under Windows 7, which is not exactly the fastest combination of card and platform. The fastest combinations can finish the tasks in under 5 hours. So shouldn't these work units be enlarged (by incorporating more useful computational work, not just slowing them down or adding useless stuff)? |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Zoltan, updates run at the highest CPU priority, so if they took a while (especially if you are running BOINC with higher priority using efmer) they could have caused the no-heartbeat problem. If BOINC autostarts while Windows is still finishing the update installation after a reboot and trying to recover, there is a good chance this can happen. The read error sounds like a corrupt file (more likely the system rebooted while the file was in use and the file was then restored to an earlier state). So next time, close BOINC before doing the updates. |
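
For anyone wanting to check the arithmetic in this thread, here is a minimal sketch in Python. It works out the percentage to enter in the "use at most _ % of the processors" box for a given number of free cores, and converts the run times quoted above from seconds into hours and minutes. The function names are made up for illustration and are not part of BOINC; rounding the percentage down is an assumption chosen to stay on the conservative side.

```python
def cpu_percent_for_free_cores(total_cores: int, free_cores: int) -> int:
    """Percentage of processors to allow so that `free_cores` stay idle.

    Hypothetical helper, not a BOINC API. Rounds down (an assumption).
    """
    if total_cores <= 0 or not 0 <= free_cores < total_cores:
        raise ValueError("need 0 <= free_cores < total_cores")
    return (total_cores - free_cores) * 100 // total_cores


def format_runtime(seconds: int) -> str:
    """Convert a run time in seconds to an 'Hh MMm SSs' string."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# 4 processors with 2 left free for feeding the GPU, as in the reply above
print(cpu_percent_for_free_cores(4, 2))  # 50

# Run times quoted in the thread
print(format_runtime(51000))  # 14h 10m 00s
print(format_runtime(61077))  # 16h 57m 57s
```

On a machine with a core count that does not divide evenly (say 8 cores with 1 free), the helper returns 87 rather than 87.5; whether the client accepts fractional values is not covered in this thread, so treat the rounding as a starting point only.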