Message boards :
Number crunching :
New app update (acemd3)
| Author | Message |
|---|---|
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
I am testing the new acemd3 app. The app is entirely new: faster and more general. The idea is to replace the old one as soon as possible. We'll also try to make it more maintainable (a long-standing issue) by using the BOINC wrapper. I've sent a handful of test WUs for now (CUDA 8.0, Linux). The goal is that it should work on properly configured machines, i.e. those with relatively recent drivers, where the previous app was already working. So far we have one success, i.e. 20962989. |
|
Joined: 30 Apr 19 Posts: 54 Credit: 168,971,875 RAC: 0
|
Do you mean this one? It was crunched in 6 or 7 minutes: http://www.gpugrid.net/result.php?resultid=20962989 But I can't see which GPU was used to crunch this task. |
|
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 9,935
|
How to select work for the new app? "New version of ACEMD" app is not a choice under project preferences. The current choices are only these:
* ACEMD long runs (8-12 hours on fastest GPU)
* ACEMD Beta
* Quantum Chemistry (CPU)
* Quantum Chemistry (CPU, beta)
* Python Runtime
Reno, NV Team: SETI.USA |
|
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0
|
So far we got one success, i.e. 20962989. The other 5 test tasks seem "stuck". They have been in progress for quite a while now. They must be really long, have errored, or the hosts downloaded the tasks and were then turned off. Can our Linux crunchers check their Linux hosts for progress? EDIT: The successful task above was also accepted by 2 Windows hosts ("New version of ACEMD v1.19"), but failed there. It also failed on 2 Linux hosts ("New version of ACEMD v2.00"). So it seems the test tasks are being accepted by both Windows and Linux hosts. The successful Linux host has Nvidia driver v430.14. The failed hosts had Nvidia drivers ranging from v375.70 to v418.19. |
|
Joined: 2 Jul 16 Posts: 338 Credit: 7,987,341,558 RAC: 213
|
How to select work for the new app? "New version of ACEMD" app is not a choice under project preferences. The current choices are only these: I've just selected everything, including test apps, with only "Use GPUs" selected. Nothing yet, though I would think that should be enough. Devs can sneak in just about anything under the test apps option. |
|
Joined: 30 Apr 19 Posts: 54 Credit: 168,971,875 RAC: 0
|
My next build (in 6-10 months) will probably have 4, 5 or 6 RTX cards. Hopefully the app will be mature enough by then to justify investing a couple of thousand euros for GPUGRID. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
The number of failures, and the existence of one success, is odd. Doesn't seem to be explained by driver versions alone. |
|
Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 0
|
Try sending out more experimental WUs and see whether it comes down to one driver version. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Recent changes:
* sent 100 more test WUs
* deprecated the Windows "acemd3" app
* marked acemd3 as beta
* fixed its name in prefs |
|
Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 0
|
|
|
Joined: 26 Aug 08 Posts: 183 Credit: 10,085,929,375 RAC: 0
|
Multiple failures of this task on both Windows and Linux: http://www.gpugrid.net/workunit.php?wuid=16517304
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
15:19:27 (30109): wrapper (7.7.26016): starting
15:19:27 (30109): wrapper (7.7.26016): starting
15:19:27 (30109): wrapper: running acemd3 (--boinc input --device 0)
# Engine failed: Error launching CUDA compiler: 32512
sh: 1: : Permission denied
15:19:28 (30109): acemd3 exited; CPU time 0.186092
15:19:28 (30109): app exit status: 0x1
15:19:28 (30109): called boinc_finish(195)
</stderr_txt>
Why is the app launching the CUDA compiler? |
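The numbers in this log can be decoded with a little arithmetic. The sketch below assumes that `32512` is the raw status returned by a C `system()` call on Linux, which the log itself does not confirm. Under that assumption the exit code sits in the high byte, giving 127, the value shells conventionally return when a command cannot be found. The same snippet shows why 195 is also printed as 0xc3 and -61. The helper names are made up for illustration.

```python
def exit_code_from_wait_status(status: int) -> int:
    """Extract the exit code from a POSIX wait status of a normally exited process."""
    # Assumption: the status has the classic Linux layout (exit code in bits 8-15).
    return (status >> 8) & 0xFF


def as_signed_byte(code: int) -> int:
    """Reinterpret an 8-bit exit code as a signed value."""
    return code - 256 if code >= 128 else code


if __name__ == "__main__":
    print(exit_code_from_wait_status(32512))  # 127
    print(hex(195), as_signed_byte(195))      # 0xc3 -61
```

If the assumption holds, an exit code of 127 together with the empty command name in `sh: 1: : Permission denied` would be consistent with the app trying to run a compiler whose path was not set, though the log alone cannot prove that.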
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 351
|
My host 43404 got one of WU 16517259. Like all the others, it failed within one second:
03/06/2019 15:36:58 | GPUGRID | Starting task a27-TONI_TEST3-0-1-RND0985_6
03/06/2019 15:36:58 | GPUGRID | [cpu_sched] Starting task a27-TONI_TEST3-0-1-RND0985_6 using acemd3 version 119 (cuda80) in slot 0
03/06/2019 15:36:59 | GPUGRID | [sched_op] Deferring communication for 00:01:03
03/06/2019 15:36:59 | GPUGRID | [sched_op] Reason: Unrecoverable error for task a27-TONI_TEST3-0-1-RND0985_6
03/06/2019 15:36:59 | GPUGRID | Computation for task a27-TONI_TEST3-0-1-RND0985_6 finished
03/06/2019 15:36:59 | GPUGRID | Output file a27-TONI_TEST3-0-1-RND0985_6_0 for task a27-TONI_TEST3-0-1-RND0985_6 absent
03/06/2019 15:36:59 | GPUGRID | Output file a27-TONI_TEST3-0-1-RND0985_6_9 for task a27-TONI_TEST3-0-1-RND0985_6 absent
with no further information than "Incorrect function. (0x1) - exit code 1 (0x1)".
But I did capture all the specifications and downloaded files between download and run, so I can recreate the attempt offline and see what additional crash information I can collect. May take me a little time...
Windows 7/64, GTX 970, runs v9.22 just fine.
Edit - all I can get in offline runs is "ACEMD can run with Boinc only!" - even when I supply a dummy init_data.xml file, which has worked in other standalone test environments. I'll go out for a walk and see if that activates the little grey cells. |
|
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 9,935
|
Recent changes: Can you please explain which app we have to select in our project preferences to get these tasks? The app name "New version of ACEMD" is not an option in the project preferences. Reno, NV Team: SETI.USA |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Should be called "ACEMD3 Beta". It's for Linux only (for now). Windows machines should soon stop receiving it. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 351
|
Can you please explain which app we have to select in our project preferences to get these tasks? The app name "New version of ACEMD" is not an option in the project preferences. The computer I got a test app on has "If no work for selected applications is available, accept work from other applications?" set to yes. Nothing else out of the ordinary. The app name appeared as 'acemd3'. |
|
Joined: 26 Aug 08 Posts: 183 Credit: 10,085,929,375 RAC: 0
|
I got 1 task but it failed. :-( http://www.gpugrid.net/result.php?resultid=20974689
Linux Mint 19.1
GTX 1080
Driver: 390.116
CUDA version 9.1 |
|
Joined: 26 Aug 08 Posts: 183 Credit: 10,085,929,375 RAC: 0
|
All but 2 of the libraries that were downloaded are marked as executable. Should libgcc and libOpenCL also be executable?
-rwxr-xr-x 1 boinc boinc 425056 Jun 3 11:52 libcudart.so.8.0.61.46fcfd92ffc5c805d076b5e2b17e9647
-rwxr-xr-x 1 boinc boinc 146772120 Jun 3 11:56 libcufft.so.8.0.61.b142ab8797d534b619ef19c7e98cffc7
-rwxr-xr-x 1 boinc boinc 1647707 Jun 3 11:53 libfftw3f.so.3.4.4.a4580ddf9efebaad56fab49847a8c899
-rwxr-xr-x 1 boinc boinc 31467 Jun 3 11:52 libfftw3f_threads.so.3.4.4.dd0c6fcfa550371acf730db2d9d5a270
-rw-r--r-- 1 boinc boinc 819744 Jun 3 11:52 libgcc_s.so.1.d7f787a9bf6c3633eaebb9015c6d9044
-rwxr-xr-x 1 boinc boinc 937656 Jun 3 11:52 libgomp.so.1.0.0.efdf718669edc7fff00e0c5f7f0b8791
-rwxr-xr-x 1 boinc boinc 9659424 Jun 3 11:54 libnvrtc-builtins.so.8.0.61.ef79235263e650333dd8c573faa47432
-rwxr-xr-x 1 boinc boinc 18517368 Jun 3 11:54 libnvrtc.so.8.0.61.1ac77468cd8086b8cd1a6c855da50f8c
-rw-r--r-- 1 boinc boinc 31696 Jun 3 11:52 libOpenCL.so.1.0.0.343dee45a7d7eb4b9016b6cd9d1bd8d5
-rwxr-xr-x 1 boinc boinc 655240 Jun 3 11:54 libOpenMMCPU.so.19849b4ff1cf4d33f75d9433b4d5c6bb
-rwxr-xr-x 1 boinc boinc 37096 Jun 3 11:53 libOpenMMCudaCompiler.so.aaed781fe4caa9d1099312d458a9b902
-rwxr-xr-x 1 boinc boinc 2774560 Jun 3 11:52 libOpenMMCUDA.so.8867021fdc0daf2e39f1b7228ece45af
-rwxr-xr-x 1 boinc boinc 2979224 Jun 3 11:52 libOpenMMOpenCL.so.6a31fa1ff5ae3a26ea64f2abfb5a66cc
-rwxr-xr-x 1 boinc boinc 80808 Jun 3 11:53 libOpenMMPME.so.3208e45e71567824e8390ab1c79c6a66
-rwxr-xr-x 1 boinc boinc 4062370 Jun 3 11:53 libOpenMM.so.5406dfd716045d08ad6369e2399a98e2
-rwxr-xr-x 1 boinc boinc 9536208 Jun 3 11:54 libstdc++.so.6.0.25.e344f48acfbd4f5abbf99b2c75cc5e50 |
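The same check can be scripted rather than read off an `ls -l` listing. This is only a rough sketch: the function name and the project directory path are hypothetical, and it simply reports files whose name contains `.so` and whose owner execute bit is unset.

```python
import os
import stat
from pathlib import Path


def libs_without_exec_bit(directory: str) -> list[str]:
    """Return names of regular files containing '.so' that lack the owner execute bit."""
    missing = []
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file() or ".so" not in entry.name:
            continue
        mode = entry.stat().st_mode
        if not mode & stat.S_IXUSR:
            missing.append(entry.name)
    return missing


if __name__ == "__main__":
    # Hypothetical path; substitute the actual GPUGRID project directory.
    project_dir = "/var/lib/boinc-client/projects/www.gpugrid.net"
    if os.path.isdir(project_dir):
        for name in libs_without_exec_bit(project_dir):
            print(name)
```

On Linux the dynamic loader normally needs only read permission on a shared library, so a missing execute bit on these two files may well be harmless. That is a general observation, not something this thread has verified for acemd3.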
|
Joined: 26 Aug 08 Posts: 183 Credit: 10,085,929,375 RAC: 0
|
Regarding the error on my task:
# Engine failed: Error launching CUDA compiler: 32512
sh: 1: : Permission denied
Is this solution relevant? https://github.com/pandegroup/openmm/issues/1352 |
|
Joined: 30 Apr 19 Posts: 54 Credit: 168,971,875 RAC: 0
|
|
|
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0
|
Hi Toni, are you explicitly naming the path to libnvrtc-builtins.so when compiling? Perhaps include the BOINC project folder in LD_LIBRARY_PATH. |
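The LD_LIBRARY_PATH suggestion could be tried out as follows. This is a minimal sketch, not how the BOINC wrapper or acemd3 actually set up their environment; the helper name and the project directory are assumptions made for illustration.

```python
import os


def env_with_library_dir(lib_dir: str, base_env=None) -> dict:
    """Return a copy of the environment with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # LD_LIBRARY_PATH is a colon-separated list on Linux.
    env["LD_LIBRARY_PATH"] = lib_dir if not current else lib_dir + ":" + current
    return env


if __name__ == "__main__":
    # Hypothetical directory; the real project folder depends on the BOINC install.
    project_dir = "/var/lib/boinc-client/projects/www.gpugrid.net"
    env = env_with_library_dir(project_dir)
    print(env["LD_LIBRARY_PATH"])
    # The environment could then be handed to a child process, for example
    # subprocess.run(["./acemd3", "--boinc", "input", "--device", "0"], env=env),
    # using the command line seen in the wrapper log earlier in the thread.
```

Prepending rather than appending means the libraries shipped with the project would be found before any system copies, which may or may not be what is wanted here.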
©2025 Universitat Pompeu Fabra