Update acemd3 app

Boca Raton Community HS
Joined: 27 Aug 21 · Posts: 38 · Credit: 7,254,068,306 · RAC: 0 · Level: Tyr
Message 59097 - Posted: 11 Aug 2022, 19:51:28 UTC - in response to Message 59095.  

Is there a way to tell whether it is using single or double precision by inspecting the process in Task Manager (Windows)?

Keith Myers
Joined: 13 Dec 17 · Posts: 1419 · Credit: 9,119,446,190 · RAC: 891 · Level: Tyr
Message 59098 - Posted: 11 Aug 2022, 20:10:13 UTC

You would have to ask Toni whether the acemd3 application uses single or double precision.

All I've seen mentioned in this thread is that they use FP32 registers.

Without a specific answer from the developer or a look at the source code of the app, we are just guessing.

Ian&Steve C.
Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423 · Level: Trp
Message 59100 - Posted: 11 Aug 2022, 22:42:54 UTC

I'm going to guess that the vast majority is FP32 and INT32. I have not observed any correlation between FP64 performance and GPUGRID task run times across devices, so if any FP64 operations are being done, their share of the compute time should be so small as to be marginal.
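
For readers unsure what the FP32 versus FP64 distinction means numerically, here is a minimal sketch in plain Python. It says nothing about what acemd3 actually does internally; the helper name `to_fp32` is made up for this illustration. It rounds a double-precision (FP64) value to single precision (FP32) and measures how much is lost.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value.

    Packing with the "f" format stores the number in 4 bytes (single
    precision); unpacking converts that 4-byte value back to a Python float.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 0.1                # not exactly representable in binary floating point
fp64 = value               # Python floats are already FP64
fp32 = to_fp32(value)      # the same number after rounding to FP32

# Relative error introduced by dropping from FP64 to FP32.
rel_error = abs(fp32 - fp64) / fp64

print(f"FP64: {fp64:.20f}")
print(f"FP32: {fp32:.20f}")
print(f"relative error: {rel_error:.3e}")
```

FP32 keeps roughly 7 significant decimal digits and FP64 roughly 16, which is why the precision an application uses matters when comparing cards with very different FP64 throughput.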

kotenok2000
Joined: 18 Jul 13 · Posts: 79 · Credit: 210,528,292 · RAC: 0 · Level: Leu
Message 59448 - Posted: 13 Oct 2022, 0:13:25 UTC

I have received 5 acemd tasks and they all failed.
And not only on my computer.
https://www.gpugrid.net/workunit.php?wuid=27319802
When I ran the executable manually, it said that the licence has expired.

gemini8
Joined: 3 Jul 16 · Posts: 31 · Credit: 2,248,809,169 · RAC: 0 · Level: Phe
Message 59629 - Posted: 21 Dec 2022, 14:03:19 UTC

I have several of those on two machines:
Stderr output

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
14:52:08 (347837): wrapper (7.7.26016): starting
14:52:29 (347837): wrapper (7.7.26016): starting
14:52:29 (347837): wrapper: running bin/acemd3 (--boinc --device 0)
14:52:30 (347837): bin/acemd3 exited; CPU time 0.003638
14:52:30 (347837): app exit status: 0x1
14:52:30 (347837): called boinc_finish(195)

</stderr_txt>
]]>
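
The numbers in a log like this can be read off mechanically. Below is a minimal sketch, assuming only the line formats visible in the output above; the function names `summarize` and `as_signed_byte` are made up for this illustration and are not part of BOINC. It pulls out the application's own exit status, the code passed to `boinc_finish`, and the CPU time, and shows why 195 also appears as 0xc3 and -61. As far as I know, 195 is the code the wrapper reports when its child process exits with an error, but treat that as an assumption.

```python
import re

# Lines copied from the stderr output above.
STDERR_SAMPLE = """\
process exited with code 195 (0xc3, -61)
14:52:29 (347837): wrapper: running bin/acemd3 (--boinc --device 0)
14:52:30 (347837): bin/acemd3 exited; CPU time 0.003638
14:52:30 (347837): app exit status: 0x1
14:52:30 (347837): called boinc_finish(195)
"""


def summarize(stderr_text):
    """Extract the exit information the wrapper printed, or None if absent."""
    app_status = re.search(r"app exit status: (0x[0-9a-fA-F]+)", stderr_text)
    finish = re.search(r"called boinc_finish\((\d+)\)", stderr_text)
    cpu = re.search(r"exited; CPU time ([\d.]+)", stderr_text)
    return {
        "app_exit_status": int(app_status.group(1), 16) if app_status else None,
        "boinc_finish_code": int(finish.group(1)) if finish else None,
        "cpu_seconds": float(cpu.group(1)) if cpu else None,
    }


def as_signed_byte(code):
    """Reinterpret an exit code in 0..255 as a signed 8-bit value."""
    return code - 256 if code > 127 else code


summary = summarize(STDERR_SAMPLE)
print(summary)
# 195 decimal, its hex form, and the same byte read as a signed value.
print(195, hex(195), as_signed_byte(195))
```

The point of the exercise: acemd3 itself exited with status 1 after a few milliseconds of CPU time, and the 195 is only what the wrapper passed on, so the log alone does not say why the application stopped.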

Nice to have ACEMD back, but I'd consider it even nicer if the ACEMD tasks didn't crash. ;-)
- - - - - - - - - -
Greetings, Jens

Boca Raton Community HS
Joined: 27 Aug 21 · Posts: 38 · Credit: 7,254,068,306 · RAC: 0 · Level: Tyr
Message 59630 - Posted: 21 Dec 2022, 14:21:12 UTC

Looks like I received 37 work units overnight; all failed after about 10-30 seconds.

Ian&Steve C.
Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 6,423 · Level: Trp
Message 59631 - Posted: 21 Dec 2022, 15:39:37 UTC

Same here. All the acemd3 tasks failed on Linux without any informative error message.

Pop Piasa
Joined: 8 Aug 19 · Posts: 252 · Credit: 458,054,251 · RAC: 0 · Level: Gln
Message 59632 - Posted: 21 Dec 2022, 16:00:34 UTC
Last modified: 21 Dec 2022, 16:03:48 UTC

3skh-ADRIA_KDeepMD_100ns_2489-0-1-RND9110_5

Stderr output
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
09:43:36 (18020): wrapper (7.9.26016): starting
09:43:36 (18020): wrapper: running bin/acemd3.exe (--boinc --device 0)
09:43:37 (18020): bin/acemd3.exe exited; CPU time 0.000000
09:43:37 (18020): app exit status: 0x1
09:43:37 (18020): called boinc_finish(195)

I have over 40 of these so far.

Host OS (win10) and drivers are up-to-date.

Are these ACEMD apps also failing under Linux?
"Together we crunch
To check out a hunch
And wish all our credit
Could just buy us lunch"


Piasa Tribe - Illini Nation

Boca Raton Community HS
Joined: 27 Aug 21 · Posts: 38 · Credit: 7,254,068,306 · RAC: 0 · Level: Tyr
Message 59633 - Posted: 21 Dec 2022, 16:05:57 UTC - in response to Message 59632.

Are these ACEMD apps also failing under Linux?

Yes, per Ian&Steve C.

bjstateson
Joined: 9 Sep 20 · Posts: 1 · Credit: 60,303,468 · RAC: 0 · Level: Thr
Message 59634 - Posted: 21 Dec 2022, 16:09:19 UTC - in response to Message 57041.  
Last modified: 21 Dec 2022, 16:10:49 UTC

I just had 13 of the apps crash with a computation error (195).
Running CUDA 12.

5			12/20/2022 2:03:07 PM	CUDA: NVIDIA GPU 0: NVIDIA GeForce GTX 1660 Ti (driver version 526.86, CUDA version 12.0, compute capability 7.5, 6144MB, 6144MB available, 5530 GFLOPS peak)	
6			12/20/2022 2:03:07 PM	OpenCL: NVIDIA GPU 0: NVIDIA GeForce GTX 1660 Ti (driver version 526.86, device version OpenCL 3.0 CUDA, 6144MB, 6144MB available, 5530 GFLOPS peak)	
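
When comparing failing hosts, it helps to pull the driver, CUDA, and compute capability fields out of event log lines like the ones above. The following is a minimal sketch that assumes only the `CUDA:` line format shown in this post; the pattern and the function name `parse_gpu_line` are made up for this illustration and are not a BOINC API.

```python
import re

# The CUDA line from the event log above (leading row number and timestamp dropped).
GPU_LINE = (
    "CUDA: NVIDIA GPU 0: NVIDIA GeForce GTX 1660 Ti (driver version 526.86, "
    "CUDA version 12.0, compute capability 7.5, 6144MB, 6144MB available, "
    "5530 GFLOPS peak)"
)

# Pattern written against the line format shown above; other clients may differ.
PATTERN = re.compile(
    r"CUDA: NVIDIA GPU (?P<index>\d+): (?P<name>.+?) \("
    r"driver version (?P<driver>[\d.]+), "
    r"CUDA version (?P<cuda>[\d.]+), "
    r"compute capability (?P<cc>[\d.]+), "
    r"(?P<mem_mb>\d+)MB, (?P<avail_mb>\d+)MB available, "
    r"(?P<gflops>\d+) GFLOPS peak\)"
)


def parse_gpu_line(line):
    """Return the GPU fields as a dict of strings, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


info = parse_gpu_line(GPU_LINE)
print(info["name"], info["driver"], info["cuda"], info["cc"])
```

Collecting these fields from several hosts makes it easy to see whether the failures follow a particular driver or CUDA version. In this thread they do not appear to, since hosts reporting CUDA 12.0 and CUDA 11.4 both failed.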

JStateson
Joined: 31 Oct 08 · Posts: 186 · Credit: 3,578,903,157 · RAC: 0 · Level: Arg
Message 59636 - Posted: 21 Dec 2022, 18:07:34 UTC

Linux: Took almost 30 minutes to download but only seconds to error out


GPUGRID	x86_64-pc-linux-gnu__cuda1121.zip.b4692e2ec3b7e128830af5c05a9f0037	98.225	1013587.50 K	00:28:10	587.44 KBps	Downloading	dual-linux	
GPUGRID	2.19 ACEMD 3: molecular dynamics simulations for GPUs (cuda1121)	3sni-ADRIA_KDeepMD_100ns_3150-0-1-RND1427_7	00:00:24 (-)	0.00	100.000	-	12/26/2022 11:33:32 AM	0.993C + 1NV	Computation error			d
12/21/2022 9:59:43 AM	CUDA: NVIDIA GPU 0: NVIDIA P102-100 (driver version 470.99, CUDA version 11.4, compute capability 6.1, 5060MB, 5060MB available, 10771 GFLOPS peak)	
12/21/2022 9:59:43 AM	OS: Linux Ubuntu: Ubuntu 20.04.5 LTS [5.4.0-135-generic|libc 2.31]	

try my performance program, the BoincTasks History Reader.
Find and read about it here: https://forum.efmer.com/index.php?topic=1355.0

Erich56
Joined: 1 Jan 15 · Posts: 1166 · Credit: 12,260,898,501 · RAC: 1 · Level: Trp
Message 59645 - Posted: 23 Dec 2022, 6:07:29 UTC

Will there be more acemd3 tasks in the near future?

Keith Myers
Joined: 13 Dec 17 · Posts: 1419 · Credit: 9,119,446,190 · RAC: 891 · Level: Tyr
Message 59647 - Posted: 23 Dec 2022, 17:55:43 UTC - in response to Message 59645.  

Will there be more acemd3 tasks in the near future?

I would hope so. It would be nice to return to quick-running acemd3 tasks that run only on the GPU.

Let's hope the developer can rework the parameters for these new tasks so that they don't fail instantly on everyone's hosts.

Erich56
Joined: 1 Jan 15 · Posts: 1166 · Credit: 12,260,898,501 · RAC: 1 · Level: Trp
Message 59649 - Posted: 24 Dec 2022, 12:26:00 UTC - in response to Message 59647.  

Let's hope the developer can rework the parameters for these new tasks so that they don't fail instantly on everyone's hosts.

+ 1

Pop Piasa
Joined: 8 Aug 19 · Posts: 252 · Credit: 458,054,251 · RAC: 0 · Level: Gln
Message 59650 - Posted: 24 Dec 2022, 22:34:02 UTC

Here's an alternate perspective: the time it took to reach the error is probably a valid way to compare the speed of the various hosts that ran the same scripts, assuming, of course, that the error is in the script.
"Together we crunch
To check out a hunch
And wish all our credit
Could just buy us lunch"


Piasa Tribe - Illini Nation

Keith Myers
Joined: 13 Dec 17 · Posts: 1419 · Credit: 9,119,446,190 · RAC: 891 · Level: Tyr
Message 59651 - Posted: 25 Dec 2022, 3:18:34 UTC

Well, since neither the wrapper nor the app has changed, the issue with the tasks must be that the configuration of the task parameters is doing something rude.

Or, unlikely but possible, the task parameters have uncovered an "edge case flaw" in the acemd3 application that hasn't been exposed up to this point.

I would bet that the task generation configuration script is simply generating some values that are "out of bounds" in memory access.

Erich56
Joined: 1 Jan 15 · Posts: 1166 · Credit: 12,260,898,501 · RAC: 1 · Level: Trp
Message 59654 - Posted: 26 Dec 2022, 7:02:11 UTC

I am surprised that the Server Status page still shows some 140 tasks "in process".
How come?

gemini8
Joined: 3 Jul 16 · Posts: 31 · Credit: 2,248,809,169 · RAC: 0 · Level: Phe
Message 59660 - Posted: 27 Dec 2022, 9:06:44 UTC - in response to Message 59651.  

I would bet that the task generation configuration script is simply generating some values that are "out of bounds" in memory access.

This might well be the case, but I have a different shot at an explanation:
IIRC the certificates usually had to be renewed sometime in summer or early autumn, and I think it may be possible there's no working certificate laid down for ACEMD at all.
- - - - - - - - - -
Greetings, Jens

Erich56
Joined: 1 Jan 15 · Posts: 1166 · Credit: 12,260,898,501 · RAC: 1 · Level: Trp
Message 59668 - Posted: 29 Dec 2022, 7:41:33 UTC - in response to Message 59660.  

[quote]... and I think it may be possible there's no working certificate laid down for ACEMD at all.[/quote]

This has happened numerous times in the past :-(
So it would be no surprise if this is the reason for the problem now, too.

Bill F
Joined: 21 Nov 16 · Posts: 36 · Credit: 164,429,114 · RAC: 18 · Level: Ile
Message 59717 - Posted: 12 Jan 2023, 18:33:09 UTC

I managed to get one of the ACEMD3 tasks that came out and it ran fine on my old GTX 1060.
In October of 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.



©2025 Universitat Pompeu Fabra