New "testX-RAIMIS" WUs, all erroring
| Author | Message |
|---|---|
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
New "testX-RAIMIS" tasks for Anaconda Python 3 Environment v4.01 (cuda 100) arrived today. I received 4 tasks on four different Linux hosts of mine, and all 4 finished with an error after several minutes: test2-RAIMIS_PYTEST6-0-1-RND6445_4, test2-RAIMIS_NNPMM-0-1-RND9981_4, test3-RAIMIS_NNPMM-0-1-RND4429_1, test4-RAIMIS_NNPMM-0-1-RND7976_7. I watched some strange behavior in the last of them: it started to progress quickly, reaching 100% after about 3 minutes, then it dropped back to 10% progress for a while, then went to 100% again for a couple of minutes, and finally it finished with an error... (?) |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
I got one of these too; it errored out after 1200 seconds or so. I wonder whether the tasks really are completely self-contained within the wrapper. From the errors in the stderr.txt output, I wonder if the problem is the app trying to run within the system's original python environment, which is deprecated in modern distros. I can't run python apps with the plain python command anymore; I always have to remember to run them as python3 since python2 was deprecated. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
We are debugging those workunits. They don't depend on the system's python. There is a relatively large initial download that fetches a full environment (which should then be reused). |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
In fact, I received these other two tasks, and both succeeded: test15-RAIMIS_NNPMM-1-2-RND4792_0 and test7-RAIMIS_NNPMM-0-1-RND7929_0. Well done! |
|
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0
|
"In fact, I received these other two tasks, and both succeeded:" There is a lot of information in the stderr output. Much of it describes the atoms, time period, calculations, box size, etc. What I found interesting is that it can identify the GPU (`#Platform properties:`). It checks the platform for both OpenCL and CUDA (`# WARNING: there is no library for "OpenCL" plugin`) and then states the calculation platform (`# Setting up platform: CUDA`). Dare I say Gpugrid is preparing an OpenCL version... AMD compatibility? EDIT: Or maybe not. Just before the OpenCL warning, there is this: `# Plugin directory: /var/lib/boinc-client/slots/4/gpugridpy/lib/acemd3`. The wrapper has only preloaded CUDA plugins; no OpenCL plugins are packaged with the app. So I think I got excited too early. |
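
For anyone who wants to check their own task results the same way, here is a rough Python sketch that pulls the plugin directory, the missing-plugin warnings, and the selected platform out of a stderr dump. The patterns are modeled only on the three lines quoted in the post above; the function name `summarize_stderr` and the `SAMPLE` text are hypothetical, and real output may be formatted differently.

```python
import re

# Patterns modeled on the stderr lines quoted in this thread.
# Assumption: real task output uses exactly this wording.
PLUGIN_DIR_RE = re.compile(r'^#\s*Plugin directory:\s*(\S+)')
MISSING_LIB_RE = re.compile(r'^#\s*WARNING: there is no library for "([^"]+)" plugin')
PLATFORM_RE = re.compile(r'^#\s*Setting up platform:\s*(\S+)')


def summarize_stderr(text):
    """Collect plugin directory, missing plugin libraries, and chosen platform."""
    summary = {"plugin_dir": None, "missing_plugins": [], "platform": None}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = PLUGIN_DIR_RE.match(line)
        if match:
            summary["plugin_dir"] = match.group(1)
            continue
        match = MISSING_LIB_RE.match(line)
        if match:
            summary["missing_plugins"].append(match.group(1))
            continue
        match = PLATFORM_RE.match(line)
        if match:
            summary["platform"] = match.group(1)
    return summary


# Hypothetical sample built from the lines quoted in the post.
SAMPLE = '''# Plugin directory: /var/lib/boinc-client/slots/4/gpugridpy/lib/acemd3
# WARNING: there is no library for "OpenCL" plugin
# Setting up platform: CUDA
'''

if __name__ == "__main__":
    print(summarize_stderr(SAMPLE))
```

A missing-plugin entry for OpenCL together with a CUDA platform would match the reading in the EDIT above: the app looks for an OpenCL plugin, but only CUDA plugins ship in the plugin directory.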
©2025 Universitat Pompeu Fabra