Tasks returning compute error
| Author | Message |
|---|---|
|
Joined: 15 Aug 19 Posts: 7 Credit: 27,732,011 RAC: 41,631
|
I received some work units (tasks) today, but most of them failed with an "error while computing" error. The error usually occurs within 1-2 minutes, though one task ran for 7-8 minutes. Another task is still running (80% done). The error occurred on two Windows 11 PCs, both running BOINC v 8.2.8, with different NVIDIA GPUs. The first is an older PC where a task is still running: https://gpugrid.net/gpugrid/show_host_detail.php?hostid=648899 with 3 tasks failed and 1 still processing (type is ATM: Free energy calculations of protein...). The other is a newer PC: https://gpugrid.net/gpugrid/show_host_detail.php?hostid=638123 with 3 tasks failed. |
Steve Dodd Joined: 26 Dec 08 Posts: 19 Credit: 4,622,334,506 RAC: 167,146
|
Having same issue.
|
|
Joined: 13 Dec 17 Posts: 1423 Credit: 9,186,946,190 RAC: 1,288,374
|
It usually takes the researchers a few small batches of tasks to sort out the proper configuration parameters. Only when the small batches are mostly successful do they release a larger, longer-lasting batch. |
|
Joined: 15 Aug 19 Posts: 7 Credit: 27,732,011 RAC: 41,631
|
Thank you Keith! Glad to see progress and glad to help. The first task did complete successfully. A second task last night failed after multiple hours. |
|
Joined: 12 Aug 25 Posts: 2 Credit: 55,500,000 RAC: 541,603
|
Slot 0 was already occupied by another program, so the task was aborted after 7 minutes. Work unit 31550481: WARNING: The script pyaml.exe is installed in 'C:\ProgramData\BOINC\slots\0\Scripts' which is not on PATH |
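The warning quoted above has the wording pip uses when it installs a console script into a directory that is not listed in PATH; on its own it is a notice rather than a failure. As a minimal sketch for checking this on your own machine, the hypothetical helper `is_on_path` below (not part of BOINC or the GPUGRID application) tests whether a directory appears among the entries of a PATH-style string:

```python
import os


def is_on_path(directory, path_value=None, sep=os.pathsep):
    """Return True if `directory` is one of the entries of a PATH-style string.

    `path_value` defaults to the current process's PATH; `sep` defaults to
    the platform separator (';' on Windows, ':' elsewhere).
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Strip whitespace and quotes, then normalise case and trailing slashes.
        return os.path.normcase(os.path.normpath(p.strip().strip('"')))

    target = norm(directory)
    return any(
        norm(entry) == target
        for entry in path_value.split(sep)
        if entry.strip()  # skip empty entries such as a trailing separator
    )
```

On the affected Windows host, `is_on_path(r"C:\ProgramData\BOINC\slots\0\Scripts")` would be expected to return `False`, matching the warning. Whether the missing PATH entry has anything to do with the abort is not established by the log excerpt.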
|
Joined: 13 Dec 17 Posts: 1423 Credit: 9,186,946,190 RAC: 1,288,374
|
Yes, the tasks are much harder to set up correctly on Windows. OTOH, in Linux it is much easier to get the configuration right because of the way the OS sets up support files and applications. That still doesn't help when the researcher plainly forgets to include a needed resource in the task package and then attempts to use it when it isn't there. At least most of those fail fast, in less than a minute. The irksome ones are those that seem to be running correctly but eventually hit a NaN error after many hours. Again, expect many errors in the first batch of work that gets sent out. |
|
Joined: 1 Jan 15 Posts: 1168 Credit: 12,311,898,501 RAC: 331,341
|
"...The irksome ones are the ones that seem to be running correctly but eventually hits a NaN error after many hours..." I have had exactly those quite frequently within the past 2 days :-( https://gpugrid.net/gpugrid/results.php?userid=125700&offset=40&show_names=0&state=0&appid= |
|
Joined: 12 Aug 25 Posts: 2 Credit: 55,500,000 RAC: 541,603
|
It's a shame that the p38_A15_A13 etc. tasks don't even reach 1%. 17 succeeded, 22 failed. Too bad; it had always been going smoothly. |
|
Joined: 1 Jan 15 Posts: 1168 Credit: 12,311,898,501 RAC: 331,341
|
"It's a shame that all the p38_A15_A13 etc. don't even reach 1%, 17 succeeded, 22 failed, too bad it was always going smoothly." As I already wrote in the GPUGRID Discord channel: I am surprised that, evidently, none of the tasks are tested before a new batch is sent out. |
©2026 Universitat Pompeu Fabra