New app update (acemd3)

Nick Name

Message 51980 - Posted: 4 Jun 2019, 5:18:42 UTC

name a11-TONI_TEST3-0-1-RND0663
https://www.gpugrid.net/workunit.php?wuid=16517242
Failure on all machines.

My result here: https://www.gpugrid.net/result.php?resultid=20976177

My log:
50 GPUGRID 6/4/2019 2:44:30 AM Started download of acemd3.119.exe
51 GPUGRID 6/4/2019 2:44:30 AM Started download of boost_filesystem-vc140-mt-1_65_1.119.dll
52 GPUGRID 6/4/2019 2:44:32 AM Finished download of acemd3.119.exe
53 GPUGRID 6/4/2019 2:44:32 AM Started download of boost_system-vc140-mt-1_65_1.119.dll
54 GPUGRID 6/4/2019 2:44:33 AM Finished download of boost_filesystem-vc140-mt-1_65_1.119.dll
55 GPUGRID 6/4/2019 2:44:33 AM Finished download of boost_system-vc140-mt-1_65_1.119.dll
56 GPUGRID 6/4/2019 2:44:33 AM Started download of cufft64_80.119.dll
57 GPUGRID 6/4/2019 2:44:33 AM Started download of msvcp140.119.dll
58 GPUGRID 6/4/2019 2:44:38 AM Finished download of msvcp140.119.dll
59 GPUGRID 6/4/2019 2:44:38 AM Started download of nvrtc64_80.119.dll
60 GPUGRID 6/4/2019 2:45:06 AM Finished download of nvrtc64_80.119.dll
61 GPUGRID 6/4/2019 2:45:06 AM Started download of nvrtc-builtins64_80.119.dll
62 GPUGRID 6/4/2019 2:45:28 AM Finished download of nvrtc-builtins64_80.119.dll
63 GPUGRID 6/4/2019 2:45:28 AM Started download of OpenMMCPU.119.dll
64 GPUGRID 6/4/2019 2:45:30 AM Finished download of OpenMMCPU.119.dll
65 GPUGRID 6/4/2019 2:45:30 AM Started download of OpenMMCudaCompiler.119.dll
66 GPUGRID 6/4/2019 2:45:32 AM Finished download of OpenMMCudaCompiler.119.dll
67 GPUGRID 6/4/2019 2:45:32 AM Started download of OpenMMCUDA.119.dll
68 GPUGRID 6/4/2019 2:45:39 AM Finished download of OpenMMCUDA.119.dll
69 GPUGRID 6/4/2019 2:45:39 AM Started download of OpenMM.119.dll
70 GPUGRID 6/4/2019 2:45:48 AM Finished download of OpenMM.119.dll
71 GPUGRID 6/4/2019 2:45:48 AM Started download of OpenMMOpenCL.119.dll
72 GPUGRID 6/4/2019 2:45:54 AM Finished download of OpenMMOpenCL.119.dll
73 GPUGRID 6/4/2019 2:45:54 AM Started download of OpenMMPME.119.dll
74 GPUGRID 6/4/2019 2:45:58 AM Finished download of OpenMMPME.119.dll
75 GPUGRID 6/4/2019 2:45:58 AM Started download of psprolib.119.dll
76 GPUGRID 6/4/2019 2:46:00 AM Finished download of psprolib.119.dll
77 GPUGRID 6/4/2019 2:46:00 AM Started download of vcruntime140.119.dll
78 GPUGRID 6/4/2019 2:46:01 AM Finished download of vcruntime140.119.dll
79 GPUGRID 6/4/2019 2:46:01 AM Started download of a11-TONI_TEST3-0-conf_file_enc
80 GPUGRID 6/4/2019 2:46:02 AM Finished download of a11-TONI_TEST3-0-conf_file_enc
81 GPUGRID 6/4/2019 2:46:02 AM Started download of a11-TONI_TEST3-0-coor_file
82 GPUGRID 6/4/2019 2:46:03 AM Finished download of a11-TONI_TEST3-0-coor_file
83 GPUGRID 6/4/2019 2:46:03 AM Started download of a11-TONI_TEST3-0-vel_file
84 GPUGRID 6/4/2019 2:46:04 AM Finished download of a11-TONI_TEST3-0-vel_file
85 GPUGRID 6/4/2019 2:46:04 AM Started download of a11-TONI_TEST3-0-idx_file
86 GPUGRID 6/4/2019 2:46:05 AM Finished download of a11-TONI_TEST3-0-idx_file
87 GPUGRID 6/4/2019 2:46:05 AM Started download of a11-TONI_TEST3-0-xsc_file
88 GPUGRID 6/4/2019 2:46:06 AM Finished download of a11-TONI_TEST3-0-xsc_file
89 GPUGRID 6/4/2019 2:46:06 AM Started download of a11-TONI_TEST3-0-pdb_file
90 GPUGRID 6/4/2019 2:46:11 AM Finished download of a11-TONI_TEST3-0-pdb_file
91 GPUGRID 6/4/2019 2:46:11 AM Started download of a11-TONI_TEST3-0-psf_file
92 GPUGRID 6/4/2019 2:46:24 AM Finished download of a11-TONI_TEST3-0-psf_file
93 GPUGRID 6/4/2019 2:46:24 AM Started download of a11-TONI_TEST3-0-par_file
94 GPUGRID 6/4/2019 2:46:26 AM Finished download of a11-TONI_TEST3-0-par_file
95 GPUGRID 6/4/2019 2:46:26 AM Started download of a11-TONI_TEST3-0-prmtop_file
96 GPUGRID 6/4/2019 2:46:27 AM Finished download of a11-TONI_TEST3-0-prmtop_file
97 GPUGRID 6/4/2019 2:49:48 AM Finished download of cufft64_80.119.dll
98 GPUGRID 6/4/2019 2:49:49 AM Starting task a11-TONI_TEST3-0-1-RND0663_5
99 GPUGRID 6/4/2019 2:49:50 AM Computation for task a11-TONI_TEST3-0-1-RND0663_5 finished
100 GPUGRID 6/4/2019 2:49:50 AM Output file a11-TONI_TEST3-0-1-RND0663_5_0 for task a11-TONI_TEST3-0-1-RND0663_5 absent
101 GPUGRID 6/4/2019 2:49:50 AM Output file a11-TONI_TEST3-0-1-RND0663_5_9 for task a11-TONI_TEST3-0-1-RND0663_5 absent
Team USA forum | Team USA page
Join us and #crunchforcures. We are now also folding: join team ID 236370!
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 51981 - Posted: 4 Jun 2019, 15:49:25 UTC - in response to Message 51980.  

I think I debugged it (app version 201). 100 new WUs sent. Progress bar should also work (please report if not).
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 51982 - Posted: 4 Jun 2019, 16:03:29 UTC - in response to Message 51981.  
Last modified: 4 Jun 2019, 17:21:54 UTC

There are many more successes now.

Edit.

The reason for failures is not really clear. Question for anybody who has seen a success: do you have the CUDA Toolkit installed?
PappaLitto

Message 51983 - Posted: 4 Jun 2019, 18:31:02 UTC - in response to Message 51982.  
Last modified: 4 Jun 2019, 18:32:31 UTC

There are many more successes now.

Edit.

The reason for failures is not really clear. Question for anybody who has seen a success: do you have the CUDA Toolkit installed?

Hello Toni, I have received many successes and when I typed "nvcc -V" to verify the CUDA Toolkit version, it says "The program 'nvcc' is currently not installed. You can install it by typing:

sudo apt install nvidia-cuda-toolkit"

My system seems to not have it installed.

This is the list of the successful tasks: http://www.gpugrid.net/results.php?userid=306281
Erich56

Message 51984 - Posted: 4 Jun 2019, 18:36:35 UTC - in response to Message 51983.  

This is the list of the successful tasks: http://www.gpugrid.net/results.php?userid=306281

access denied :-(
PappaLitto

Message 51985 - Posted: 4 Jun 2019, 18:48:15 UTC - in response to Message 51984.  

This is the list of the successful tasks: http://www.gpugrid.net/results.php?userid=306281

access denied :-(

Perhaps you can view a single WU? http://www.gpugrid.net/result.php?resultid=20978809
biodoc

Message 51986 - Posted: 4 Jun 2019, 19:06:44 UTC

I have 2 machines. Both have linux mint 19.1 installed, same nvidia driver (390.116), cuda toolkit release 9.1 (both tested as functional), same boinc version 7.14.2.

The hardware is different:

dual GTX 1080's on 2700X: All tasks are failing.

http://www.gpugrid.net/results.php?hostid=482792

dual GTX 1080 Ti's on E5-2690 v2: All tasks are completing successfully!

http://www.gpugrid.net/results.php?hostid=464987


There must be a clue here. Any ideas?
Aurum
Message 51987 - Posted: 4 Jun 2019, 20:07:29 UTC - in response to Message 51982.  

Question for anybody who has seen a success: do you have the CUDA Toolkit installed?


No. I installed the Nvidia 430.14 drivers as Linux metapackages. According to the Synaptic Package Manager I do not have the CUDA Toolkit installed.

biodoc

Message 51988 - Posted: 4 Jun 2019, 21:58:14 UTC - in response to Message 51986.  

I have 2 machines. Both have linux mint 19.1 installed, same nvidia driver (390.116), cuda toolkit release 9.1 (both tested as functional), same boinc version 7.14.2.

The hardware is different:

dual GTX 1080's on 2700X: All tasks are failing.

http://www.gpugrid.net/results.php?hostid=482792

dual GTX 1080 Ti's on E5-2690 v2: All tasks are completing successfully!

http://www.gpugrid.net/results.php?hostid=464987


There must be a clue here. Any ideas?


I can't find anything in the logs. I was running Rosetta on the machine that had the failed GPUGrid tasks. There was no other project running on the machine that had the successful GPUGrid tasks.
Keith Myers
Message 51989 - Posted: 5 Jun 2019, 0:05:22 UTC - in response to Message 51983.  

There are many more successes now.

Edit.

The reason for failures is not really clear. Question for anybody who has seen a success: do you have the CUDA Toolkit installed?

Hello Toni, I have received many successes and when I typed "nvcc -V" to verify the CUDA Toolkit version, it says "The program 'nvcc' is currently not installed. You can install it by typing:

sudo apt install nvidia-cuda-toolkit"

My system seems to not have it installed.

This is the list of the successful tasks: http://www.gpugrid.net/results.php?userid=306281

Even though nvcc is actually present on my Jetson Nano, nvcc -V yielded program not found. It is located at /usr/local/cuda-10.0/bin/nvcc

I had to export the directory where nvcc was located for it to be found. That enabled a program to find nvcc.
keith@Nano:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sun_Sep_30_21:09:22_CDT_2018
Cuda compilation tools, release 10.0, V10.0.166

But as soon as I rebooted, nvcc could not be found. So I ended up adding the library directory as an export in .bashrc and then I could find nvcc after reboots.
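
For anyone following along, the .bashrc approach described above might look roughly like the sketch below. It assumes the goal is to make nvcc itself resolvable, which is governed by PATH rather than by the library search path. The directory is the one reported in this post, and the helper name add_to_path is purely illustrative.

```shell
# Lines one might append to ~/.bashrc (a sketch, not a verified fix).
# add_to_path is a hypothetical helper: it prepends a directory to PATH
# only when that directory is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;
    esac
}

# Directory where nvcc was found on the Jetson Nano in this post;
# adjust it to match the local CUDA version.
add_to_path /usr/local/cuda-10.0/bin
export PATH
```

After opening a new shell, `command -v nvcc` should print the full path, provided the directory really contains nvcc.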
mmonnin

Message 51990 - Posted: 5 Jun 2019, 0:34:07 UTC
Last modified: 5 Jun 2019, 0:34:21 UTC

I completed one while 5 others had errors.
https://www.gpugrid.net/workunit.php?wuid=16520276

nvcc -V results
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
biodoc

Message 51991 - Posted: 5 Jun 2019, 1:13:16 UTC - in response to Message 51989.  


But as soon as I rebooted, nvcc could not be found. So I ended up adding the library directory as an export in .bashrc and then I could find nvcc after reboots.


Another option is to place the cuda library path in a file in /etc/ld.so.conf.d.

you could name the file cuda.conf

then:

sudo ldconfig
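
The ld.so.conf.d approach above can be sketched as follows. Note that it changes where the dynamic loader looks for shared libraries such as libcudart; it does not by itself put nvcc on PATH. The library directory /usr/local/cuda-10.0/lib64 is an assumption based on the nvcc location mentioned earlier in the thread, and write_cuda_conf is a hypothetical helper name.

```shell
# write_cuda_conf CONF_DIR LIB_DIR
# Writes LIB_DIR as the only line of CONF_DIR/cuda.conf.
write_cuda_conf() {
    printf '%s\n' "$2" > "$1/cuda.conf"
}

# On a real system this needs root, for example (assumed library path):
#   sudo sh -c 'echo /usr/local/cuda-10.0/lib64 > /etc/ld.so.conf.d/cuda.conf'
#   sudo ldconfig                  # rebuild the loader cache
#   ldconfig -p | grep libcudart   # check that the library is now listed

# Dry run against a scratch directory, so nothing under /etc is touched.
scratch=$(mktemp -d)
write_cuda_conf "$scratch" /usr/local/cuda-10.0/lib64
cat "$scratch/cuda.conf"
```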



Keith Myers
Message 51992 - Posted: 5 Jun 2019, 6:00:16 UTC - in response to Message 51991.  


But as soon as I rebooted, nvcc could not be found. So I ended up adding the library directory as an export in .bashrc and then I could find nvcc after reboots.


Another option is to place the cuda library path in a file in /etc/ld.so.conf.d.

you could name the file cuda.conf

then:

sudo ldconfig


Correct. That is the other method I researched as a popular solution.

So am I correct in understanding that one now has to install the CUDA toolkit to run the new acemd application?

That the wrapper download itself is insufficient?
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 51993 - Posted: 5 Jun 2019, 7:33:01 UTC - in response to Message 51992.  

Hi all, thanks for the reports.

The app SHOULD not require the cuda toolkit (which includes nvcc), yet on SOME hosts it is looking for it, and fails (the error message is more or less the same).

I still don't understand the conditions under which this occurs. In particular, as biodoc's previous example shows, there is no clear relationship between the card generation, driver, and success/failure.

@biodoc, can you see other obvious differences between the two machines? E.g.

- boinc installation method
- presence of the gcc package
Keith Myers
Message 51994 - Posted: 5 Jun 2019, 7:50:41 UTC

Well, I see I attempted to run a task that failed on one host. I looked over all the downloaded files and thought to do a sanity check on the executable. This is what ldd showed.

keith@Numbskull:~/Desktop/BOINC/projects/www.gpugrid.net$ ldd '/home/keith/Desktop/BOINC/projects/www.gpugrid.net/acemd.919-80.bin'
linux-vdso.so.1 (0x00007ffdf14d5000)
libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007fa630a0c000)
libcudart.so.8.0 => not found
libcufft.so.8.0 => not found
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa630808000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa6305e9000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fa630260000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa62fec2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa62fcaa000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa62f8b9000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa62f6b1000)
libnvidia-fatbinaryloader.so.418.56 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.418.56 (0x00007fa62f463000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa631b63000)
keith@Numbskull:~/Desktop/BOINC/projects/www.gpugrid.net$


So right off the bat, the app had no chance of succeeding when it can't find its own downloaded libcudart.so.8.0 and libcufft.so.8.0 files in the project directory.

I don't think it would make any difference if/when all the files and work unit get copied into a slot.
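
To pull just the unresolved entries out of ldd output like the listing above, a filter along these lines may help. missing_libs is a hypothetical helper; the sample input is copied from the listing, and on a real host one would pipe the output of ldd into it instead.

```shell
# missing_libs: read ldd output on stdin and print the name of every
# library that the loader reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Sample lines taken from the ldd output above.
missing_libs <<'EOF'
libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007fa630a0c000)
libcudart.so.8.0 => not found
libcufft.so.8.0 => not found
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa630808000)
EOF
```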
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 51995 - Posted: 5 Jun 2019, 7:51:07 UTC - in response to Message 51992.  



So am I correct in understanding that one now has to install the CUDA toolkit to run the new acemd application?

That the wrapper download itself is insufficient?


You don't (shouldn't) need to install any additional software (neither the wrapper nor the CUDA toolkit) if everything works as intended.

You may need to update the drivers, though.
Richard Haselgrove

Message 51996 - Posted: 5 Jun 2019, 8:02:27 UTC - in response to Message 51994.  

libcudart.so.8.0 => not found
libcufft.so.8.0 => not found

So right off the bat, the app had no chance of succeeding when it can't find its own downloaded libcudart.so.8.0 and libcufft.so.8.0 files in the project directory.

If somebody can post or upload the three components of a test workunit specification:

* <app_version>
* <workunit>
* <result>

all from client_state.xml - make sure you get the right (latest) version of <app_version>, there will be several of them - I can proofread that there are no bugs in the BOINC deployment of the app files. This one could be a problem with the version renaming or copying.
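
For those unsure how to pull these sections out, something like the following could work. It is a sketch that assumes each opening and closing tag sits on its own line in client_state.xml; extract_blocks is a hypothetical helper and the file path in the comment is an assumption.

```shell
# extract_blocks TAG FILE
# Print every <TAG> ... </TAG> block found in FILE, tags included.
extract_blocks() {
    awk -v tag="$1" '
        $0 ~ ("<" tag ">")  { inside = 1 }   # opening tag: start printing
        inside              { print }
        $0 ~ ("</" tag ">") { inside = 0 }   # closing tag: stop after it
    ' "$2"
}

# Example on a real host (the path is an assumption):
#   for t in app_version workunit result; do
#       extract_blocks "$t" /var/lib/boinc/client_state.xml
#   done
```

This prints every matching block, so the right (latest) app_version and the relevant workunit and result still have to be picked out by hand.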
biodoc

Message 51997 - Posted: 5 Jun 2019, 8:42:59 UTC - in response to Message 51993.  

Hi all, thanks for the reports.

@biodoc, can you see other obvious differences between the two machines? E.g.

- boinc installation method
- presence of the gcc package


No, the boinc installation method is the same (repository meta package) and gcc is installed on both machines (build-essential package). I ran ldd on wrapper_26198_x86_64-pc-linux-gnu and acemd3.e72153abf98cb1fcd0f05fc443818dfc on both machines and the output is identical.

Working machine:

mark@x20-linux:/var/lib/boinc/projects/www.gpugrid.net$ ldd ./wrapper_26198_x86_64-pc-linux-gnu
	linux-vdso.so.1 (0x00007ffc1bfab000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7ab23ba000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7ab21a2000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7ab1f83000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7ab1b92000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7ab2758000)
mark@x20-linux:/var/lib/boinc/projects/www.gpugrid.net$ ldd ./acemd3.e72153abf98cb1fcd0f05fc443818dfc 
	linux-vdso.so.1 (0x00007ffda9bfe000)
	libOpenMM.so => not found
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ffb4cb37000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007ffb4c7ae000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ffb4c410000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ffb4c1f8000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffb4be07000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ffb4cd3b000)


machine with failures:

mark@x16-linux:/var/lib/boinc/projects/www.gpugrid.net$ ldd ./wrapper_26198_x86_64-pc-linux-gnu 
	linux-vdso.so.1 (0x00007ffd96952000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd300b09000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fd3008f1000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd3006d2000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd3002e1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fd300ea7000)
mark@x16-linux:/var/lib/boinc/projects/www.gpugrid.net$ ldd ./acemd3.e72153abf98cb1fcd0f05fc443818dfc 
	linux-vdso.so.1 (0x00007ffef0097000)
	libOpenMM.so => not found
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fabe9b83000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fabe97fa000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fabe945c000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fabe9244000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fabe8e53000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fabe9d87000)
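
Since the load addresses in parentheses usually differ from run to run, comparing ldd listings by eye is error-prone. A sketch like the one below strips the addresses so the rest can be diffed; normalize_ldd is a hypothetical helper name and the file names in the comment are placeholders.

```shell
# normalize_ldd: read ldd output on stdin and drop the trailing
# "(0x...)" load address from each line.
normalize_ldd() {
    sed -E 's/[[:space:]]*\(0x[0-9a-f]+\)$//'
}

# Example: save one normalized listing per machine, then compare them.
#   ldd ./acemd3.e72153abf98cb1fcd0f05fc443818dfc | normalize_ldd > working.txt
#   (repeat on the failing machine, writing failing.txt)
#   diff working.txt failing.txt && echo "no differences"

# Demonstration with one line from each machine in this post.
printf '%s\n' 'libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7ab23ba000)' | normalize_ldd
printf '%s\n' 'libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd300b09000)' | normalize_ldd
```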



Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 51998 - Posted: 5 Jun 2019, 9:16:14 UTC - in response to Message 51994.  
Last modified: 5 Jun 2019, 9:19:03 UTC


So right off the bat, the app had no chance of succeeding when it can't find its own downloaded libcudart.so.8.0 and libcufft.so.8.0 files in the project directory.

I don't think it would make any difference if/when all the files and work unit get copied into a slot.


We are distributing the two files with the app. They are copied (via copy_file) into the slot, and the slot is added to LD_LIBRARY_PATH. It works locally and on many machines; I am inclined to think it's not the problem.
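
One way to check this by hand is to repeat the ldd call with the directory holding the bundled libraries prepended to LD_LIBRARY_PATH, which is roughly what the description above implies happens at run time. This is only a sketch: with_lib_dir is a hypothetical helper, and the paths in the commented example are assumptions about a typical BOINC layout.

```shell
# with_lib_dir DIR CMD [ARGS...]
# Run CMD with DIR prepended to LD_LIBRARY_PATH, for that command only.
with_lib_dir() {
    dir=$1
    shift
    env LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}

# Example on a real host (slot number and paths are assumptions):
#   with_lib_dir /var/lib/boinc/slots/0 ldd /var/lib/boinc/slots/0/<app binary>

# Harmless demonstration: show the value the child process would see.
with_lib_dir /tmp/example-slot sh -c 'printf "%s\n" "$LD_LIBRARY_PATH"'
```

A plain ldd run from the project directory does not see whatever LD_LIBRARY_PATH is set at run time, so a "not found" there is a hint rather than proof of a missing file.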

The "permission denied" bit seems related to a later stage, possibly an attempt to compile the CUDA bytecode into the form necessary for the specific graphics card (done via nvrtc).

If anybody is able to capture the "progress.log" file before it's deleted, please post it. Thanks!
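
A small polling loop is one way to grab the file before cleanup. The sketch below is untested against a real task; the slots directory /var/lib/boinc/slots and the helper name snapshot_progress_logs are assumptions, and it presumes progress.log is written into the task's slot directory.

```shell
# snapshot_progress_logs SLOTS_DIR DEST
# Copy every progress.log found one level below SLOTS_DIR into DEST,
# tagging each copy with the slot name so copies do not collide.
snapshot_progress_logs() {
    slots_dir=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$slots_dir"/*/progress.log; do
        [ -f "$f" ] || continue          # glob matched nothing
        slot=$(basename "$(dirname "$f")")
        cp -p "$f" "$dest/progress.slot$slot.log"
    done
}

# Example loop, to be started before the task runs and stopped with
# Ctrl-C (GNU sleep accepts fractional seconds; paths are assumptions):
#   while sleep 0.2; do
#       snapshot_progress_logs /var/lib/boinc/slots "$HOME/gpugrid-logs"
#   done
```

Failing tasks in this thread finish within about a second, so a short polling interval matters, and reading the slots directory may require running as the BOINC user or root.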

T
biodoc

Message 51999 - Posted: 5 Jun 2019, 9:38:29 UTC

I did find a "messy" install of the nvidia driver on the offending machine. There seem to be remnants of a driver installed via direct download from nvidia. I'll clean that up.

'sudo apt search nvidia' showed significant differences between the 2 machines.
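
To make that kind of comparison repeatable, one could dump the NVIDIA-related packages known to dpkg on each machine and compare the lists. This is a sketch for Debian-based systems such as Linux Mint; list_diff and the file names are hypothetical.

```shell
# list_diff FILE_A FILE_B
# Print lines unique to either file: entries only in FILE_A start at
# column one, entries only in FILE_B are indented by a tab.
list_diff() {
    a=$(mktemp)
    b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# On each machine, something like this produces the input list:
#   dpkg-query -W -f='${Package} ${Version}\n' | grep -i nvidia > host.txt
# Then, with both files copied to one place:
#   list_diff working.txt failing.txt
```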


©2026 Universitat Pompeu Fabra