New app update (acemd3)

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 51939 - Posted: 30 May 2019, 16:01:47 UTC

I am testing the new acemd3 app. The app is entirely new: faster and more general. The idea is to replace the old one as soon as possible. We'll also try to make it more maintainable (a long-standing issue) by using the BOINC wrapper.

I've sent a handful of test WUs for now: CUDA 8.0, Linux.

The goal is that it should work on properly configured machines, i.e. with relatively recent drivers, where the previous app was already working. So far we got one success, i.e. 20962989.

erik
Message 51940 - Posted: 30 May 2019, 19:36:59 UTC - in response to Message 51939.  
Last modified: 30 May 2019, 19:39:14 UTC

Do you mean this one? It was crunched in 6 or 7 minutes.

http://www.gpugrid.net/result.php?resultid=20962989

But I can't see which GPU was used to crunch this task.

zombie67 [MM]
Message 51942 - Posted: 30 May 2019, 21:32:24 UTC

How to select work for the new app? "New version of ACEMD" app is not a choice under project preferences. The current choices are only these:


    ACEMD short runs (2-3 hours on fastest card)
    ACEMD long runs (8-12 hours on fastest GPU)
    ACEMD Beta
    Quantum Chemistry (CPU)
    Quantum Chemistry (CPU, beta)
    Python Runtime



"ACEMD Beta" looks likely, but the name doesn't match "New version of ACEMD", which is how it is being reported over on wuprop. And also it does not match the name on the app page. In fact, the app page indicates that "ACEMD Beta" and "New version of ACEMD" are completely different apps.


Reno, NV
Team: SETI.USA

rod4x4
Message 51947 - Posted: 30 May 2019, 23:38:22 UTC
Last modified: 31 May 2019, 0:21:02 UTC

So far we got one success, i.e. 20962989.


The other 5 Test tasks seem "stuck". They have been in progress now for quite a while.

They must be really long, have errored, or hosts have downloaded the tasks and then been turned off.

Can our Linux crunchers check their Linux hosts for progress?

EDIT: The successful task above was also accepted by 2 Windows hosts ("New version of ACEMD v1.19") but failed there. It also failed on 2 Linux hosts ("New version of ACEMD v2.00"). So it seems the test tasks are being accepted by both Windows and Linux hosts. The successful Linux host has Nvidia driver v430.14. The failed hosts had Nvidia drivers ranging from v375.70 to v418.19.

mmonnin
Message 51948 - Posted: 31 May 2019, 1:16:13 UTC - in response to Message 51942.  

How to select work for the new app? "New version of ACEMD" app is not a choice under project preferences. The current choices are only these:


    ACEMD short runs (2-3 hours on fastest card)
    ACEMD long runs (8-12 hours on fastest GPU)
    ACEMD Beta
    Quantum Chemistry (CPU)
    Quantum Chemistry (CPU, beta)
    Python Runtime



"ACEMD Beta" looks likely, but the name doesn't match "New version of ACEMD", which is how it is being reported over on wuprop. And also it does not match the name on the app page. In fact, the app page indicates that "ACEMD Beta" and "New version of ACEMD" are completely different apps.



I've just selected everything, including test apps, with only "Use GPUs" selected. Nothing yet, but I would think that should be enough. Devs can sneak in just about anything under the test apps option.

erik
Message 51957 - Posted: 31 May 2019, 19:01:43 UTC

My next build (in 6-10 months) will probably have 4, 5 or 6 RTX cards. Hopefully the app will be mature enough by then to justify investing a couple of thousand euros for GPUGRID.

Toni
Message 51966 - Posted: 3 Jun 2019, 10:57:57 UTC - in response to Message 51957.  

The number of failures, and the existence of one success, is odd. Doesn't seem to be explained by driver versions alone.

PappaLitto
Message 51967 - Posted: 3 Jun 2019, 12:20:32 UTC

Try sending out more experimental WUs to see whether it comes down to one driver version.

Toni
Message 51968 - Posted: 3 Jun 2019, 13:19:29 UTC - in response to Message 51967.  

Recent changes:
* sent 100 more test WUs
* deprecated the Windows "acemd3" app
* marked acemd3 as beta
* fixed its name in prefs

PappaLitto
Message 51969 - Posted: 3 Jun 2019, 14:16:23 UTC

Errored WUs on multiple different drivers and OSes:

http://www.gpugrid.net/results.php?userid=306281

biodoc
Message 51970 - Posted: 3 Jun 2019, 14:25:12 UTC

Multiple failures of this task on both Windows and Linux:

http://www.gpugrid.net/workunit.php?wuid=16517304

<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
15:19:27 (30109): wrapper (7.7.26016): starting
15:19:27 (30109): wrapper (7.7.26016): starting
15:19:27 (30109): wrapper: running acemd3 (--boinc input --device 0)
# Engine failed: Error launching CUDA compiler: 32512
sh: 1: : Permission denied

15:19:28 (30109): acemd3 exited; CPU time 0.186092
15:19:28 (30109): app exit status: 0x1
15:19:28 (30109): called boinc_finish(195)

</stderr_txt>


Why is the app launching the CUDA compiler?
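
A note on the numbers in that log, as a minimal Python sketch. It assumes that 32512 is a raw POSIX wait status of the kind `system()` returns on Linux, which the log itself does not confirm. `decode_wait_status` and `as_signed_byte` are hypothetical helper names used only for illustration.

```python
def decode_wait_status(status: int) -> dict:
    """Split a raw POSIX wait status (Linux layout) into its parts."""
    return {
        "exit_code": (status >> 8) & 0xFF,  # high byte: exit code of the child
        "signal": status & 0x7F,            # low 7 bits: terminating signal, 0 if none
    }


def as_signed_byte(code: int) -> int:
    """Reinterpret an 8-bit exit code as a signed value."""
    return code - 256 if code > 127 else code


# Assumption: 32512 from "Error launching CUDA compiler" is a wait status.
compiler = decode_wait_status(32512)
print(compiler)

# The wrapper reported 195, which the log also shows as 0xc3 and -61.
print(hex(195), as_signed_byte(195))
```

Under that assumption the compiler command ended with exit code 127 and no signal. Shells conventionally use 127 when a command could not be found, which would fit the empty command name in `sh: 1: : Permission denied`, but that reading is a guess from the log rather than something the project has confirmed.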

Richard Haselgrove
Message 51971 - Posted: 3 Jun 2019, 14:50:42 UTC
Last modified: 3 Jun 2019, 15:29:06 UTC

My host 43404 got one of WU 16517259.

Like all the others, it failed within one second:

03/06/2019 15:36:58 | GPUGRID | Starting task a27-TONI_TEST3-0-1-RND0985_6
03/06/2019 15:36:58 | GPUGRID | [cpu_sched] Starting task a27-TONI_TEST3-0-1-RND0985_6 using acemd3 version 119 (cuda80) in slot 0
03/06/2019 15:36:59 | GPUGRID | [sched_op] Deferring communication for 00:01:03
03/06/2019 15:36:59 | GPUGRID | [sched_op] Reason: Unrecoverable error for task a27-TONI_TEST3-0-1-RND0985_6
03/06/2019 15:36:59 | GPUGRID | Computation for task a27-TONI_TEST3-0-1-RND0985_6 finished
03/06/2019 15:36:59 | GPUGRID | Output file a27-TONI_TEST3-0-1-RND0985_6_0 for task a27-TONI_TEST3-0-1-RND0985_6 absent
03/06/2019 15:36:59 | GPUGRID | Output file a27-TONI_TEST3-0-1-RND0985_6_9 for task a27-TONI_TEST3-0-1-RND0985_6 absent

with no further information than

Incorrect function.
 (0x1) - exit code 1 (0x1)
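
The event log lines quoted earlier in this post follow a `timestamp | project | message` layout, so the "failed within one second" observation can be checked mechanically. A minimal Python sketch, assuming the timestamps are day/month/year (consistent with the 3 Jun 2019 post date); `parse_line` is a hypothetical helper, not part of BOINC.

```python
from datetime import datetime

# Two lines copied from the event log above.
LOG = """\
03/06/2019 15:36:58 | GPUGRID | Starting task a27-TONI_TEST3-0-1-RND0985_6
03/06/2019 15:36:59 | GPUGRID | Computation for task a27-TONI_TEST3-0-1-RND0985_6 finished
"""


def parse_line(line):
    """Split one event log line into (timestamp, project, message)."""
    timestamp, project, message = (part.strip() for part in line.split("|", 2))
    # Assumption: the client logged dates as day/month/year.
    return datetime.strptime(timestamp, "%d/%m/%Y %H:%M:%S"), project, message


events = [parse_line(line) for line in LOG.splitlines() if line.strip()]
start = next(t for t, _, msg in events if msg.startswith("Starting task"))
end = next(t for t, _, msg in events if msg.startswith("Computation for task"))
elapsed_seconds = (end - start).total_seconds()
print(elapsed_seconds)
```

The same parsing could be pointed at a full event log to list every task that finished within a second or two of starting.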

But I did capture all the specifications and downloaded files between download and run, so I can recreate the attempt offline and see what additional crash information I can collect. May take me a little time...

Windows 7/64, GTX 970, runs v9.22 just fine.

Edit - all I can get in offline runs is

ACEMD can run with Boinc only!

- even when I supply a dummy init_data.xml file which has worked in other standalone test environments. I'll go out for a walk and see if that activates the little grey cells.

zombie67 [MM]
Message 51972 - Posted: 3 Jun 2019, 15:18:12 UTC - in response to Message 51968.  

Recent changes:
* sent 100 more test WUs
* deprecated the Windows "acemd3" app
* marked acemd3 as beta
* fixed its name in prefs


Can you please explain which app we have to select in our project preferences to get these tasks? The app name "New version of ACEMD" is not an option in the project preferences.
Reno, NV
Team: SETI.USA

Toni
Message 51973 - Posted: 3 Jun 2019, 15:22:39 UTC - in response to Message 51972.  
Last modified: 3 Jun 2019, 15:24:02 UTC

Should be called "ACEMD3 Beta". It's for Linux only (for now).
Windows machines should soon stop receiving it.

Richard Haselgrove
Message 51974 - Posted: 3 Jun 2019, 15:23:46 UTC - in response to Message 51972.  
Last modified: 3 Jun 2019, 15:25:40 UTC

Can you please explain which app we have to select in our project preferences to get these tasks? The app name "New version of ACEMD" is not an option in the project preferences.

The computer I got a test app on has

If no work for selected applications is available, accept work from other applications?
yes

Nothing else out of the ordinary.

The app name appeared as 'acemd3'.

biodoc
Message 51975 - Posted: 3 Jun 2019, 16:08:19 UTC

I got 1 task, but it failed. :-(

http://www.gpugrid.net/result.php?resultid=20974689

Linux Mint 19.1
GTX 1080
Driver: 390.116
CUDA version 9.1

biodoc
Message 51976 - Posted: 3 Jun 2019, 16:28:07 UTC

All but 2 of the libraries that were downloaded are marked as executable. Should libgcc and libOpenCL also be executable?


-rwxr-xr-x  1 boinc boinc    425056 Jun  3 11:52 libcudart.so.8.0.61.46fcfd92ffc5c805d076b5e2b17e9647
-rwxr-xr-x  1 boinc boinc 146772120 Jun  3 11:56 libcufft.so.8.0.61.b142ab8797d534b619ef19c7e98cffc7
-rwxr-xr-x  1 boinc boinc   1647707 Jun  3 11:53 libfftw3f.so.3.4.4.a4580ddf9efebaad56fab49847a8c899
-rwxr-xr-x  1 boinc boinc     31467 Jun  3 11:52 libfftw3f_threads.so.3.4.4.dd0c6fcfa550371acf730db2d9d5a270
-rw-r--r--  1 boinc boinc    819744 Jun  3 11:52 libgcc_s.so.1.d7f787a9bf6c3633eaebb9015c6d9044
-rwxr-xr-x  1 boinc boinc    937656 Jun  3 11:52 libgomp.so.1.0.0.efdf718669edc7fff00e0c5f7f0b8791
-rwxr-xr-x  1 boinc boinc   9659424 Jun  3 11:54 libnvrtc-builtins.so.8.0.61.ef79235263e650333dd8c573faa47432
-rwxr-xr-x  1 boinc boinc  18517368 Jun  3 11:54 libnvrtc.so.8.0.61.1ac77468cd8086b8cd1a6c855da50f8c
-rw-r--r--  1 boinc boinc     31696 Jun  3 11:52 libOpenCL.so.1.0.0.343dee45a7d7eb4b9016b6cd9d1bd8d5
-rwxr-xr-x  1 boinc boinc    655240 Jun  3 11:54 libOpenMMCPU.so.19849b4ff1cf4d33f75d9433b4d5c6bb
-rwxr-xr-x  1 boinc boinc     37096 Jun  3 11:53 libOpenMMCudaCompiler.so.aaed781fe4caa9d1099312d458a9b902
-rwxr-xr-x  1 boinc boinc   2774560 Jun  3 11:52 libOpenMMCUDA.so.8867021fdc0daf2e39f1b7228ece45af
-rwxr-xr-x  1 boinc boinc   2979224 Jun  3 11:52 libOpenMMOpenCL.so.6a31fa1ff5ae3a26ea64f2abfb5a66cc
-rwxr-xr-x  1 boinc boinc     80808 Jun  3 11:53 libOpenMMPME.so.3208e45e71567824e8390ab1c79c6a66
-rwxr-xr-x  1 boinc boinc   4062370 Jun  3 11:53 libOpenMM.so.5406dfd716045d08ad6369e2399a98e2
-rwxr-xr-x  1 boinc boinc   9536208 Jun  3 11:54 libstdc++.so.6.0.25.e344f48acfbd4f5abbf99b2c75cc5e50
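
A check like the listing above can be scripted. The following is a minimal Python sketch; `files_without_exec_bit` is a hypothetical helper, and the demo uses made-up file names in a temporary directory rather than the real project folder. As far as I know, the Linux dynamic loader only needs read permission on a shared library, so the missing execute bit on libgcc_s and libOpenCL may well be harmless, but treat that as an assumption.

```python
import stat
import tempfile
from pathlib import Path


def files_without_exec_bit(directory: str) -> list:
    """Return names of regular files in `directory` that have no execute bit set."""
    missing = []
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue
        mode = entry.stat().st_mode
        if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
            missing.append(entry.name)
    return missing


# Demo on throwaway files (hypothetical names), one 0755 and one 0644.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, mode in (("libexample_a.so", 0o755), ("libexample_b.so", 0o644)):
        path = Path(demo_dir) / name
        path.write_bytes(b"")
        path.chmod(mode)
    demo_result = files_without_exec_bit(demo_dir)

print(demo_result)
```

On a POSIX system the demo reports only the 0644 file. To inspect a real host, pass the GPUGRID project directory of the local BOINC installation instead of the temporary one.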

biodoc
Message 51977 - Posted: 3 Jun 2019, 16:45:41 UTC

Regarding the error on my task:

# Engine failed: Error launching CUDA compiler: 32512
sh: 1: : Permission denied

Is this solution relevant?

https://github.com/pandegroup/openmm/issues/1352

erik
Message 51978 - Posted: 3 Jun 2019, 22:11:46 UTC

http://www.gpugrid.net/result.php?resultid=20974104

Failed on an MSI GTX 1070 8 GB ITX card, Windows 10.

rod4x4
Message 51979 - Posted: 4 Jun 2019, 0:04:55 UTC
Last modified: 4 Jun 2019, 0:07:29 UTC

Hi Toni,

Are you explicitly naming the path to libnvrtc-builtins.so when compiling?

Perhaps include the BOINC project folder in LD_LIBRARY_PATH.
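
The LD_LIBRARY_PATH idea could look roughly like this from a launcher's point of view. This is only a sketch in Python: `env_with_library_dir` is a hypothetical helper, the project path is a placeholder that depends on the BOINC installation, and nothing here reflects how the actual wrapper starts acemd3.

```python
import os
from typing import Optional


def env_with_library_dir(library_dir: str, base_env: Optional[dict] = None) -> dict:
    """Return a copy of the environment with `library_dir` first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so libraries in the project folder are found before system ones.
    env["LD_LIBRARY_PATH"] = library_dir + (os.pathsep + existing if existing else "")
    return env


# Hypothetical project directory; the real location depends on the BOINC installation.
PROJECT_DIR = "/var/lib/boinc-client/projects/www.gpugrid.net"
example_env = env_with_library_dir(PROJECT_DIR, {"LD_LIBRARY_PATH": "/usr/local/cuda/lib64"})
print(example_env["LD_LIBRARY_PATH"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting a child process, so that the loader searches the project folder for libraries such as libnvrtc-builtins.so.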