Message boards : Number crunching : process exited with code 212 (0xd4, -44)
Author | Message |
---|---|
Hi all, starting a couple of days ago, all the GPUGrid tasks on one of my computers (http://www.gpugrid.net/show_host_detail.php?hostid=178360) error out with this message: <core_client_version>7.2.47</core_client_version> <![CDATA[ <message> process exited with code 212 (0xd4, -44) </message> <stderr_txt> </stderr_txt> ]]> The same computer has no problems crunching Collatz or PrimeGrid tasks. Any hints? [edit] Running the program manually, I got the following: ./acemd.914-80.bin # ACEMD Molecular Dynamics Version [3212u2] # Basic license will expire soon. Contact info@acellera.com for licensing details # Basic license has expired. Contact info@acellera.com for licensing details | |
ID: 48771 | Rating: 0 | |
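The three numbers in "code 212 (0xd4, -44)" look like one value written three ways. Here is a minimal sketch of that arithmetic, assuming the third number is the same byte read as a signed 8-bit integer; `describe_exit_code` is a hypothetical helper name, not part of BOINC.

```python
def describe_exit_code(code: int) -> str:
    """Format an exit status as decimal, hex and signed 8-bit values."""
    # Keep only the low byte, which is all a process exit status carries here.
    unsigned = code & 0xFF
    # Assumption: the negative number is the two's-complement reading of that byte.
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return f"{unsigned} (0x{unsigned:x}, {signed})"


print(describe_exit_code(212))
```

So 212, 0xd4 and -44 all describe the same exit status; the number by itself does not say why the app quit.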
Looking at the 'workunit' column for those errors, every machine which has attempted them has failed right at startup. That looks like a bad batch of work, rather than any problem with your machine. | |
ID: 48772 | Rating: 0 | |
I've had 7 of these in a row since the 30th. Before that I had only 1 bad WU against nearly 100 million points' worth of good tasks. The system has been crunching away at E@H in the meantime. | |
ID: 48773 | Rating: 0 | |
After a long absence from GPUGrid, I tried getting some work yesterday for my Linux rig with two GTX 1080 GPUs. After several hours of trying, I did get one work unit, and it failed with the same error code as posted above. Another computer running Windows finished that WU successfully. | |
ID: 48774 | Rating: 0 | |
After a long absence from GPUGrid, I tried getting some work yesterday for my Linux rig with two GTX 1080 GPUs. After several hours of trying, I did get one work unit, and it failed with the same error code as posted above. Another computer running Windows finished that WU successfully. If the overclocks are reasonable and not on the bleeding edge, the error rates are basically nothing (excluding errors caused by the project, like this one). I have not had any tasks that crashed due to my own hardware, just these ones that don't even start, and one that failed after about 3 seconds. | |
ID: 48775 | Rating: 0 | |
Here as well! The last 4 WUs have errored out after a few seconds with: | |
ID: 48776 | Rating: 0 | |
Edit again - the vast majority of the failures among your wingmates are also running some flavour of Linux. That feels like a Clue. Sure! The license of the Linux app has expired. It should be renewed; most probably a new Linux app (new at least in its version number) will be needed as soon as possible. The same thing happened in April 2017 with the Windows XP app (and the license of the present Windows XP app will expire again in April 2018, forever). Perhaps Linux users could try setting their clocks to an earlier date to fool the app's license expiration check. | |
ID: 48777 | Rating: 0 | |
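For anyone who wants to confirm they are hitting the licence expiry rather than a hardware fault, here is a rough sketch that scans the app's console output for the expiry line. The sample text is copied from the manual run quoted in the first post; `license_expired` is a hypothetical helper, and the exact wording of the message is assumed to stay as shown there.

```python
import re

# Output of the manual run, as quoted in the first post of this thread.
SAMPLE_OUTPUT = """\
# ACEMD Molecular Dynamics Version [3212u2]
# Basic license will expire soon. Contact info@acellera.com for licensing details
# Basic license has expired. Contact info@acellera.com for licensing details
"""

# Assumption: the app always uses this wording when the licence check fails.
EXPIRED = re.compile(r"license has expired", re.IGNORECASE)


def license_expired(output: str) -> bool:
    """Return True if any line of the output reports an expired licence."""
    return any(EXPIRED.search(line) for line in output.splitlines())


print(license_expired(SAMPLE_OUTPUT))
```

The "will expire soon" warning on its own does not match, so the check only fires once the licence has actually lapsed.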
I've had 27 consecutive failures since Jan 30 on three different Linux machines (GTX-1060). Review of the results in the wingman column indicates all failures to be from v9.14 (ACEMD and Linux version?) and the successful completions using v9.18 Windows version I assume. | |
ID: 48778 | Rating: 0 | |
Has anyone sent a PM to Gianni to suggest he deprecates the Linux app (to save wasted bandwidth) until the licence problem has been sorted out? | |
ID: 48779 | Rating: 0 | |
Has anyone sent a PM to Gianni to suggest he deprecates the Linux app (to save wasted bandwidth) until the licence problem has been sorted out? I haven't. The better way to address this issue is to put the question this way: Who will send a PM to Gianni to suggest he deprecates the Linux app? Perhaps they have already noticed this issue on their own Linux machines? | |
ID: 48780 | Rating: 0 | |
Up to 17 failed tasks for me. The project output is really low. Looks like a test to see if the admins actually give 2 cents. | |
ID: 48781 | Rating: 0 | |
Has anyone sent a PM to Gianni to suggest he deprecates the Linux app (to save wasted bandwidth) until the licence problem has been sorted out? I haven't. Not from what people are saying here. In general, and from speaking with BOINC developers/administrators, we crunchers are much more aware of the small details of how a project is running than the admins are. Although Gianni's failure rate on the server status page has crept up from 23% to 24% overnight, it's still in the green, and work is being returned (my Windows resend went up this morning), so there is nothing obvious to ring alarm bells. I'll send the PM... ...done Hi Gianni, | |
ID: 48782 | Rating: 0 | |
Thanks for sending the PM Richard. | |
ID: 48783 | Rating: 0 | |
This is the reply I got from the Acellera Help Desk | Dear Keith, So apparently Gianni doesn't have a license to run Acellera software at GPUGrid.net. Thus the message. | |
ID: 48785 | Rating: 0 | |
I believe Gianni is the scientific founder of Acellera so I would think he would get a steep discount on licenses. | |
ID: 48786 | Rating: 0 | |
I believe Gianni is the scientific founder of Acellera so I would think he would get a steep discount on licenses. That's choice. Ha ha. Didn't know. | |
ID: 48787 | Rating: 0 | |
Somebody must be listening as I haven't had a gpu wu since my last post here on any of my 3 gpu capable machines. Just crunching QC, WCG and E@H right now. Also noticed I haven't received an upgrade to ACEMD v 9.14 yet (original download date 5/6/17 still on file). | |
ID: 48793 | Rating: 0 | |
Keep an eye on https://www.gpugrid.net/apps.php for new versions. That standard BOINC url works here as on all BOINC projects, even though there isn't an obvious link. | |
ID: 48794 | Rating: 0 | |
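Since the applications page sits at the same path on every BOINC project, the address can be built from a project's base URL. A small sketch, assuming only the `apps.php` path mentioned above; `apps_page_url` is a made-up helper name.

```python
from urllib.parse import urljoin


def apps_page_url(project_url: str) -> str:
    """Build the applications page URL from a BOINC project's base URL."""
    # urljoin replaces the last path segment unless the base ends with a slash.
    base = project_url if project_url.endswith("/") else project_url + "/"
    return urljoin(base, "apps.php")


print(apps_page_url("https://www.gpugrid.net"))
```

For GPUGrid this gives back the link from the post above; other projects would need their own base URL passed in.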
Keep an eye on https://www.gpugrid.net/apps.php for new versions. That standard BOINC url works here as on all BOINC projects, even though there isn't an obvious link. Thanks Richard. Very helpful info and link, added it to my favorites. | |
ID: 48795 | Rating: 0 | |
Rats... I am getting more GPU work and the tasks are all failing. Since I don't have any Windows machines, I am out of luck until this is resolved. | |
ID: 48807 | Rating: 0 | |
Got same error code on my Linux box for several tasks: | |
ID: 48809 | Rating: 0 | |
More than half the current research projects on the server status page are now showing amber for the error rate. That's the sort of signal that might attract the admins' attention. | |
ID: 48810 | Rating: 0 | |
http://gpugrid.net/result.php?resultid=16993864 | |
ID: 48811 | Rating: 0 | |
BOINC never keeps a task scheduled for a GPU in memory (there's no equivalent of paging or a swap file for graphics memory), so every 'resume' is actually a complete application relaunch. Try to avoid letting it swap out. | |
ID: 48812 | Rating: 0 | |
Success! So fiddling with the date is a possible workaround for Linux users. | |
ID: 48813 | Rating: 0 | |
Got a response from Gianni to my trouble ticket at acellera.com about the app error: "The error message is misleading, there is no license to request. We are already fixing the application in gpugrid." So I hope that we see a new Linux application from him shortly that doesn't error out. [Edit] There is also a new News item mentioning the issue. [Edit2] Question from me: "Thanks for the reply Gianni. Any estimate when that will be made available?" Gianni's reply: "Not yet, but it will be posted on gpugrid forum. We have a problem that the person that builds it is out until the end of the week. We have to see if the others can do it without him." | |
ID: 48817 | Rating: 0 | |
Looks like the same issue with the Short Acemd tasks. Thought it might use a different application, guess not. Same error in Linux as before. | |
ID: 48869 | Rating: 0 | |
Should we turn off GPUs in the meantime to avoid failed WUs, or doesn't it matter? | |
ID: 48923 | Rating: 0 | |
Should we turn off GPUs in the meantime to avoid failed WUs, or doesn't it matter? Perhaps better to help out another project temporarily until this one is ready for you again? | |
ID: 48924 | Rating: 0 | |
Just completed a WU for the new version 9.19, seems to be working fine. | |
ID: 48966 | Rating: 0 | |
I've already done and validated two new tasks with the 919 application. Original problem is resolved. | |
ID: 48967 | Rating: 0 | |