Message boards :
Number crunching :
Incorrectly reported run time
Message board moderation
| Author | Message |
|---|---|
|
Send message Joined: 1 Feb 09 Posts: 4 Credit: 427,753,832 RAC: 2 Level ![]() Scientific publications ![]() ![]() ![]() ![]()
|
I notice that the run times reported for Python tasks on Win10 are much larger than the actual elapsed clock time; rather it is mirroring the value of CPU time. Example, a recently completed WU took just shy of 14 hours to complete (50739 seconds exactly based on first and last time reported in the log) while the Completed Tasks list shows the task taking 208,842.7 seconds to complete. Curious, is this a new anomaly? Task is here: https://www.gpugrid.net/result.php?resultid=33337456 |
|
Send message Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0 Level ![]() Scientific publications ![]()
|
I notice that the run times reported for Python tasks on Win10 are much larger than the actual elapsed clock time; rather it is mirroring the value of CPU time. Example, a recently completed WU took just shy of 14 hours to complete (50739 seconds exactly based on first and last time reported in the log) while the Completed Tasks list shows the task taking 208,842.7 seconds to complete. Curious, is this a new anomaly? ________ Not a new issue. Go to the news section and then the Python concerning thread. Every issue is covered in that thread. |
|
Send message Joined: 1 Feb 09 Posts: 4 Credit: 427,753,832 RAC: 2 Level ![]() Scientific publications ![]() ![]() ![]() ![]()
|
I notice that the run times reported for Python tasks on Win10 are much larger than the actual elapsed clock time; rather it is mirroring the value of CPU time. Example, a recently completed WU took just shy of 14 hours to complete (50739 seconds exactly based on first and last time reported in the log) while the Completed Tasks list shows the task taking 208,842.7 seconds to complete. Curious, is this a new anomaly? Thanks. Searched a bit, but obviously not in the correct location. |
|
Send message Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891 Level ![]() Scientific publications ![]() ![]() ![]() ![]()
|
The posts about this ARE in there. The gist is that since these tasks are of dual cpu-gpu nature and primarily compute on the cpu, BOINC has no mechanism for handling or understanding these tasks. It only knows that a task is either run on the cpu or run on the gpu. It does not compute correctly the cpu_time or run_time for any task that is run on both components. Ignore the cpu_times and run_times, they are meaningless values. Just pay attention to the delta between time sent and time returned to determine the period of time the task takes to crunch. |
|
Send message Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 428 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
Not quite. Runtime and CPU time are recorded locally by the BOINC client, and reported accurately to the project server. At some point, the server throws its hands up, and says "the runtime can't be a low as that, if the CPU time is so high", and 'corrects' the runtime to what it regards is the true value - even if that's impossible given the clock times when the task was issued and returned. I've never summoned up the energy to investigate further, and find out whether that's 1) an error by GPUGrid 2) a deliberate decision by BOINC 3) because GPUGrid is running an outdated server 4) something else But it could be worked out. |
|
Send message Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891 Level ![]() Scientific publications ![]() ![]() ![]() ![]()
|
But neither the run_time or cpu_time is correctly reported on ANY of my hosts. I return a PythonGPU task in 8.5 - 9 hours consistently on my fastest host. Times reported to the scheduler are ~80000 seconds or equivalent to 22.2 hours. So off by a factor of 3X there consistently. My slowest hosts report using 135K seconds or 37.8 hours of run_time. Tasks actually run and report in 12 - 13 hours. So also off by ~3X factor. So I ignore the reported values as nonsense. [Edit] OK, I see what you are talking about. You are referring the time recorded in the job_log for the task aren't you? But even that is not correct. This is the job_log output for this task. 1678063389 ue 3024664.119761 ct 80540.580000 et 32929.492896 ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8739_0 ue = Estimated runtime = 3024664 seconds = !wrong! ct CPU time = 80540 seconds = !correct! = that is correctly reported value to the project from the job_log. et = elapsed time = 32929 seconds = ! actual correct runtime not reported to the project. But does not reflect any true relation to the actual time the calculation needs to run and report. And here is the task at the project. https://www.gpugrid.net/result.php?resultid=33336333 |
|
Send message Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 428 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
[Edit] OK, I see what you are talking about. You are referring the time recorded in the job_log for the task aren't you? Actually, no - I'm not relying on historical logs. I'm looking at the real-time data interchange between BOINC client and BOINC server. I've just reported task e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1 The event log messages relating to that task are: Mon 06 Mar 2023 12:34:06 GMT | GPUGRID | Computation for task e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1 finished Mon 06 Mar 2023 12:34:26 GMT | GPUGRID | Sending scheduler request: To report completed tasks. Mon 06 Mar 2023 12:34:26 GMT | GPUGRID | Reporting 1 completed tasks Mon 06 Mar 2023 12:34:27 GMT | GPUGRID | [sched_op] handle_scheduler_reply(): got ack for task e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1 That report was accomplished by sending a file called sched_request_www.gpugrid.net.xml The significant content of that file was: <result>
<name>e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1</name>
<final_cpu_time>79615.790000</final_cpu_time>
<final_elapsed_time>30189.014614</final_elapsed_time>
...So, the CPU and elapsed times were different when the report left my machine. But they are the same on the website: Run time 79,615.79 CPU time 79,615.79 Neglecting the possibility that it was intercepted by a trojan router between northern England and eastern Spain, I'm asserting that the change happened within the GPUGrid server. |
|
Send message Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891 Level ![]() Scientific publications ![]() ![]() ![]() ![]()
|
OK, poTaytoe - poTahtoe Same values in the job log culled from the sched_reply. But my statement about what the website displaying is nonsense values is correct. |
©2025 Universitat Pompeu Fabra