process exited with code 212 (0xd4, -44)

Message boards : Number crunching : process exited with code 212 (0xd4, -44)

Easton West

Joined: 26 Sep 09
Posts: 4
Credit: 197,335,031
RAC: 0
Level
Ile
Message 48809 - Posted: 4 Feb 2018, 3:19:33 UTC

Got same error code on my Linux box for several tasks:

http://gpugrid.net/results.php?hostid=447186

For the most recent task, I suspended it before it began, rolled my computer's date back a month (February to January), then resumed the task to let it start.

It didn't stop with an error immediately, which is a good sign.

I waited a minute, then returned my computer's date to real time. It's still running; hope it finishes okay.

http://gpugrid.net/workunit.php?wuid=13116635
ID: 48809 · Rating: 0
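
For anyone puzzled by the three numbers in the thread title: they appear to be one exit status written three ways (decimal, hex, and the same byte read as a signed 8-bit value). A minimal Python sketch of that arithmetic follows. The function name `describe_exit_code` is made up for illustration, and this is not taken from BOINC's own formatting code.

```python
def describe_exit_code(code: int) -> str:
    """Format an exit status as decimal, hex, and signed 8-bit value."""
    if not 0 <= code <= 255:
        raise ValueError("exit status must fit in one byte")
    # Values 128..255 wrap to negative numbers when the byte is read as signed.
    signed = code - 256 if code >= 128 else code
    return f"{code} (0x{code:x}, {signed})"


print(describe_exit_code(212))  # 212 (0xd4, -44)
```

So 212, 0xd4 and -44 all name the same status; none of them is a separate error.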
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 318
Level
Trp
Message 48810 - Posted: 4 Feb 2018, 10:30:41 UTC

More than half the current research projects on the server status page are now showing amber for the error rate. That's the sort of signal that might attract the admins' attention.

I've had no response to my PM, but if there's no sign of change by tomorrow morning I might try again.
ID: 48810 · Rating: 0
Easton West

Joined: 26 Sep 09
Posts: 4
Credit: 197,335,031
RAC: 0
Level
Ile
Message 48811 - Posted: 4 Feb 2018, 14:46:48 UTC - in response to Message 48809.  

http://gpugrid.net/result.php?resultid=16993864

It was going so well, but suspending and resuming with the date set to February caused the error again. The app seems to check the date at the beginning, and upon resuming as well. Ten hours of work wiped out, d'oh.

Trying again, starting with a January date and then trying to never suspend. Maybe it'll work this time.

http://www.gpugrid.net/workunit.php?wuid=13117301
ID: 48811 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 318
Level
Trp
Message 48812 - Posted: 4 Feb 2018, 18:49:11 UTC - in response to Message 48811.  

BOINC never keeps a suspended GPU task in memory (there's no equivalent of paging or a swap file for graphics memory), so every 'resume' is actually a complete application relaunch. Try to avoid letting it get suspended or swapped out.
ID: 48812 · Rating: 0
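
A toy Python model of the behaviour described above may help show why a resume loses the work. Everything here is an assumption for illustration: the cutoff date, the names `launch`, `resume_gpu_task` and `DateCheckError`, and the idea that the check is a simple date comparison. The real application's check is not shown anywhere in this thread.

```python
from datetime import date

# Hypothetical cutoff, chosen only so that January passes and February fails.
CUTOFF = date(2018, 2, 1)


class DateCheckError(RuntimeError):
    """Stands in for the application exiting with code 212."""


def launch(today: date) -> str:
    """Model of application start-up: the date is checked once, at launch."""
    if today >= CUTOFF:
        raise DateCheckError(f"date check failed on {today}")
    return "running"


def resume_gpu_task(today: date) -> str:
    """A 'resume' of a GPU task is modelled as a full relaunch."""
    return launch(today)


print(launch(date(2018, 1, 4)))  # started under a January date: running
try:
    resume_gpu_task(date(2018, 2, 4))  # resumed under the real February date
except DateCheckError as err:
    print("resume failed:", err)
```

In this model a task that starts under a January date keeps running after the clock is corrected, because nothing re-checks the date until the next launch. That matches what was reported above, but it is only a model.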
Easton West

Joined: 26 Sep 09
Posts: 4
Credit: 197,335,031
RAC: 0
Level
Ile
Message 48813 - Posted: 5 Feb 2018, 8:41:46 UTC
Last modified: 5 Feb 2018, 9:00:11 UTC

Success! So fiddling with the date is a possible workaround for Linux users.

http://www.gpugrid.net/workunit.php?wuid=13117301

Edit: P.S. Suspend other tasks before rolling the date back. It's okay if they stay in memory, but if they're running, the rollback will lock them up. There are probably other ways to do it, but to be safe I did it in this order: suspended everything, rolled the date back, started the GPUGRID task (the manager won't show it running, but it is), corrected the date (the manager now shows GPUGRID running), resumed the other tasks, and made sure the GPUGRID task never suspended.
ID: 48813 · Rating: 0
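
The order of steps in the P.S. above can be sketched as a dry-run plan in Python. This only builds and prints the commands; it does not run them. Assumptions, none of which come from the original post: "suspend everything" is done per project with `boinccmd --project <url> suspend`, the GPUGRID task starts out suspended and is started with `boinccmd --task <url> <name> resume`, the clock is set with `date -s` (which needs root on Linux), and the task name, the other project URL and the helper `rollback_plan` are hypothetical.

```python
import shlex
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def rollback_plan(project_url, gpu_task, other_project_urls, now,
                  days_back=31, settle_seconds=60):
    """Return the ordered commands for the date-rollback workaround (dry run)."""
    rolled_back = now - timedelta(days=days_back)
    corrected = now + timedelta(seconds=settle_seconds)
    plan = []
    # 1. Suspend the other work first so it is not running during the rollback.
    for url in other_project_urls:
        plan.append(["boinccmd", "--project", url, "suspend"])
    # 2. Roll the system date back.
    plan.append(["date", "-s", rolled_back.strftime(FMT)])
    # 3. Start the (previously suspended) GPUGRID task under the old date.
    plan.append(["boinccmd", "--task", project_url, gpu_task, "resume"])
    # 4. Give the application a moment to get past start-up.
    plan.append(["sleep", str(settle_seconds)])
    # 5. Put the clock back to (approximately) real time.
    plan.append(["date", "-s", corrected.strftime(FMT)])
    # 6. Resume the other work; the GPUGRID task must not be suspended again.
    for url in other_project_urls:
        plan.append(["boinccmd", "--project", url, "resume"])
    return plan


if __name__ == "__main__":
    steps = rollback_plan(
        "http://www.gpugrid.net/",
        "hypothetical_task_name_0",     # placeholder, not a real task name
        ["http://example.org/other/"],  # placeholder for other attached projects
        now=datetime.now(),
    )
    for step in steps:
        print(shlex.join(step))
```

Printing the plan rather than executing it is deliberate: changing the system clock affects everything on the machine, so check each command by hand before running anything.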
Keith Myers

Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 662
Level
Tyr
Message 48817 - Posted: 5 Feb 2018, 19:21:25 UTC
Last modified: 5 Feb 2018, 19:35:41 UTC

Got a response from Gianni to my trouble ticket at acellera.com about the app error:

the error message is misleading, there is no license to request. We are already fixing the application in gpugrid.


So I hope that we see a new Linux application from him shortly that doesn't error out.

[Edit] There is also a new News item mentioning the issue.

[Edit2] Question from me:

Thanks for the reply Gianni. Any estimate when that will be made available?


Gianni's reply:

not yet, but it will be posted on gpugrid forum. We have a problem that the person that builds it is out until the end of the week. We have to see if the others can do it without him.
ID: 48817 · Rating: 0
Keith Myers

Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 662
Level
Tyr
Message 48869 - Posted: 7 Feb 2018, 17:11:20 UTC

Looks like the same issue with the Short Acemd tasks. I thought they might use a different application; guess not. Same error in Linux as before.

Will have to wait for the developer to fix the application.
ID: 48869 · Rating: 0
[AF>Libristes]Maeda

Joined: 5 May 12
Posts: 6
Credit: 650,185,478
RAC: 0
Level
Lys
Message 48923 - Posted: 12 Feb 2018, 21:51:48 UTC

Should we turn off GPUs in the meantime to avoid failed WUs, or doesn't it matter?
ID: 48923 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 318
Level
Trp
Message 48924 - Posted: 12 Feb 2018, 22:11:24 UTC - in response to Message 48923.  

Should we turn off GPUs in the meantime to avoid failed WUs, or doesn't it matter?

Perhaps better to help out another project temporarily until this one is ready for you again?
ID: 48924 · Rating: 0
Easton West

Joined: 26 Sep 09
Posts: 4
Credit: 197,335,031
RAC: 0
Level
Ile
Message 48966 - Posted: 16 Feb 2018, 1:06:17 UTC

Just completed a WU with the new version 9.19; it seems to be working fine.
ID: 48966 · Rating: 0
Keith Myers

Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 662
Level
Tyr
Message 48967 - Posted: 16 Feb 2018, 2:08:15 UTC

I've already done and validated two new tasks with the 919 application. Original problem is resolved.
ID: 48967 · Rating: 0