Message boards :
Number crunching :
failing tasks lately
| Author | Message |
|---|---|
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
any idea why all tasks downloaded within the last few hours fail immediately? Yes, I had checked that before I wrote my post above. I wonder whether the GPUGRID team has noticed this problem yet. |
|
Joined: 18 Oct 13 Posts: 53 Credit: 406,647,419 RAC: 0
|
Same here: all WUs fail with the same error code: <core_client_version>7.14.2</core_client_version> <![CDATA[ <message> (unknown error) - exit code -44 (0xffffffd4)</message> ]]> |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
It seems that the licence for Windows 10 (and maybe for Windows 7/8, too) has expired. Why do I think so? My Windows XP host downloaded a new task a few minutes ago, and it works well. |
JStateson Joined: 31 Oct 08 Posts: 186 Credit: 3,578,903,157 RAC: 0
|
any idea why all tasks downloaded within the last few hours fail immediately? Things left to themselves tend to go from bad to worse. |
robertmiles Joined: 16 Apr 09 Posts: 503 Credit: 769,991,668 RAC: 0
|
Several more tasks with computation errors, but nothing definite about just what kind of error. At least they didn't use much CPU or GPU time. http://www.gpugrid.net/result.php?resultid=21242466 http://www.gpugrid.net/result.php?resultid=21242065 http://www.gpugrid.net/result.php?resultid=21241863 http://www.gpugrid.net/result.php?resultid=21233480 And so on. Could more diagnostics be added to v9.22 (cuda80) to show what caused this error, if you can't fix it instead? This appears for both short and long runs. |
|
Joined: 7 Jun 10 Posts: 3 Credit: 208,405,467 RAC: 0
|
Same here... |
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 57
|
I actually got one to finish successfully: http://www.gpugrid.net/workunit.php?wuid=16709219 I changed the date to before the license expired, right after the WU started crunching and before it crashed, and then changed it back. It's actually tricky to do, because BOINC acts strangely when the date is moved back. My two other attempts failed, so I had enough of this. BTW, the video card that I used was a GTX 980 Ti, not the RTX 2080 Ti. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
I actually got one to finish successfully: So it's clear that the license has expired. Changing the date of the host can indeed be tricky, even more so if other BOINC projects are also running, since they can be totally confused by this. It happened to me the last time the license expired, and it all ended up in a total mess. Let's hope that it won't take too long until there is a new acemd with a valid license. |
|
Joined: 2 Jul 16 Posts: 338 Credit: 7,987,341,558 RAC: 213
|
I actually got one to finish successfully: I thought one of the reasons for the new app was to not need the license that keeps expiring. Plus Turing support in a BOINC wrapper to separate the science part from the BOINC part. |
|
Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 0
|
They are not using the new app yet; the app expired because it's still the old app. |
|
Joined: 2 Jul 16 Posts: 338 Credit: 7,987,341,558 RAC: 213
|
They are not using the new app yet, the reason the app expired is because it's still the old app. And? I was replying to this part "new acemd with a valid license." The new app won't need a license from what I recall. |
robertmiles Joined: 16 Apr 09 Posts: 503 Credit: 769,991,668 RAC: 0
|
I've seen some mentions of tasks still completing properly on some rather old versions of Windows, such as Windows XP. Could some people with at least one computer with such a version give more details? Perhaps the older versions don't include an expiration check, and therefore have to assume that it is not expired. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
The "older versions" also include an expiration check. However, for XP, a different acemd.exe is used (running with CUDA 65), the license for which seems to expire at a later date. No idea at what date exactly; it could be tomorrow, or in a week, or next month ... |
|
Joined: 12 Dec 11 Posts: 91 Credit: 2,730,095,033 RAC: 0
|
I'm using Win XP 64 and having just errors as well. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
No, you are using Windows 7 x64. |
|
Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0
|
Stderr output: <core_client_version>7.14.2</core_client_version> <![CDATA[ <message> (unknown error) - exit code -44 (0xffffffd4)</message> ]]> Workunit details: name: e18s22_e7s95p0f111-PABLO_V4_UCB_p27_sj403_no_salt_IDP-0-2-RND0646; application: Long runs (8-12 hours on fastest card); created: 8 Aug 2019, 21:02:41 UTC; minimum quorum: 1; initial replication: 1; max # of error/total/success tasks: 7, 10, 6; errors: Too many errors (may have bug). 100% failure rate for the last three days. |
|
Joined: 11 Feb 18 Posts: 41 Credit: 579,891,424 RAC: 0
|
Hello everyone, please read the post in "News" about the "expired licence". The problem is not on our side, but on the server side. The admins have known about it for two days already. |
|
Joined: 12 Dec 11 Posts: 91 Credit: 2,730,095,033 RAC: 0
|
No, you are using Windows 7 x64. You are right, my bad. But I was having errors with the new drivers. Then I rolled back to the 378.94 driver and it's running fine now. http://www.gpugrid.net/show_host_detail.php?hostid=413063 http://www.gpugrid.net/workunit.php?wuid=16717273 |
|
Joined: 2 Jan 09 Posts: 303 Credit: 7,321,800,090 RAC: 270
|
Hello everyone, That's fixed now. But the errors continue: 2 seconds into a Pablo unit and poof, they error out. I turned off the long run units and it seems there aren't any short run units to do for the GPUs. |
|
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0
|
But the errors continue, 2 seconds into a Pablo unit and poof they error out mikey, the tasks with errors were run on a Turing-based card (GTX 1660 Ti). These GPUs are not currently supported by the ACEMD2 app. The admins are working on the ACEMD3 app, which will support Turing-based GPUs. Hopefully this will be released soon. There are currently no short tasks in the queue. |
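
A side note on the error line quoted several times in this thread, "exit code -44 (0xffffffd4)": the two numbers are the same value, shown once as a signed 32-bit integer and once as its unsigned hexadecimal form. The sketch below only illustrates that arithmetic and pulls the pair out of a stderr snippet. The helper names (`to_unsigned32`, `to_signed32`, `parse_exit_code`) and the regular expression are made up for this example and are not part of BOINC or acemd, and nothing here says what -44 means inside the application.

```python
import re

def to_unsigned32(code: int) -> int:
    """Reinterpret a signed exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF

def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed exit code."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value

# Hypothetical pattern matching the message format quoted in this thread.
EXIT_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")

def parse_exit_code(stderr_text: str):
    """Return (signed, unsigned) exit codes found in the text, or None."""
    match = EXIT_RE.search(stderr_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2), 16)

sample = "<message> (unknown error) - exit code -44 (0xffffffd4)</message>"
signed, unsigned = parse_exit_code(sample)
print(signed, hex(to_unsigned32(signed)))  # the two forms of the same code
print(to_signed32(unsigned))
```

Run against the message from the posts above, both conversions agree, so the hex value carries no extra information beyond the -44.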
©2025 Universitat Pompeu Fabra