ATMML

Bedrich Hajek

Message 61582 - Posted: 5 Jul 2024, 14:15:37 UTC

I just finished crunching a task for this new application successfully.

https://www.gpugrid.net/result.php?resultid=35379717

What exactly are we crunching here?

Keith Myers
Message 61583 - Posted: 5 Jul 2024, 15:08:01 UTC

Going by the name of the app, it somehow uses machine learning.

Erich56
Message 61584 - Posted: 6 Jul 2024, 16:38:31 UTC - in response to Message 61582.  

Bedrich Hajek wrote: "I just finished crunching a task for this new application successfully."

How did you manage to download such a task?
The list from which you can choose the various subprojects does NOT include ATMML.

Steve
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Message 61585 - Posted: 6 Jul 2024, 18:36:43 UTC

This app is in testing mode, so it does not yet appear in the list of applications you can select. You will only get the WUs if you have opted in to run test applications. It is a different version of the existing ATM app that includes machine-learning-based force fields for the molecular dynamics.

Keith Myers
Message 61586 - Posted: 6 Jul 2024, 19:59:54 UTC - in response to Message 61585.  

Thanks for the progress update and explanation of just what kind of ML is being used for the ATM tasks, Steve.


I see also you released a new beta ATM app yesterday to go along with the ATMML app. Already did one of those today.

Drago
Message 61594 - Posted: 15 Jul 2024, 9:34:47 UTC - in response to Message 61586.  

Is it Windows, Linux or both?

ServicEnginIC
Message 61595 - Posted: 15 Jul 2024, 10:32:05 UTC

You can verify OS compatibility for the different applications on the GPUGRID apps page.

Bedrich Hajek
Message 61602 - Posted: 17 Jul 2024, 1:10:16 UTC

I noticed that this batch of ATMML units takes almost 3 times longer to complete than the previous batches. I suspended one of them, and when I resumed it, it would not start. I kept it "running" for over an hour with no progress, so I had no option but to abort it.


Pascal
Message 61604 - Posted: 17 Jul 2024, 7:45:34 UTC - in response to Message 61602.  

Indeed, they take a very long time to compute. I am going to stop them too.
9h20 on my RTX 4060 and 14h20 on my RTX A2000.

Pascal
Message 61607 - Posted: 17 Jul 2024, 20:37:40 UTC

I cancelled the 4 ATMML tasks I had because they took too long to compute: between 16 and 24 hours.
Programmers, I hope you will look into this problem?

Keith Myers
Message 61609 - Posted: 18 Jul 2024, 16:30:26 UTC

Didn't have any issues with the new ATMML tasks I received. Rescued one at "the last chance saloon" as the _7 wingman.

They don't seem to have an "unreasonable" crunch time for the hardware used. About 7 hours or so. I've had acemd tasks that went for 12-14 hours before.

mmonnin
Message 61611 - Posted: 20 Jul 2024, 1:46:11 UTC

I don't recall a larger executable from a BOINC project. 4.67 GB! That is larger than some LHC VDI files.

roundup
Message 61612 - Posted: 20 Jul 2024, 4:15:20 UTC

I had 57 units so far without a single error. Great!
The fastest unit took 4,197 seconds (1.17 hours) on a 4080 Super; the longest one took a bit over 30,000 seconds (8.33 hours) on a 4060 Ti.
More than reasonable.

Drago
Message 61615 - Posted: 22 Jul 2024, 13:05:22 UTC

Hello everyone! My four hosts running 3060, 3060 Ti and 3070 Ti cards have not been able to complete a single unit so far. They all fail at the very beginning with the following STDERR output: "Error loading CUDA module". I am running Linux Mint and Ubuntu with Nvidia driver 470. The newer drivers produce errors in other projects, so I decided to stick to that driver version. I noticed that a lot of my wingmen successfully crunch the units with driver 530 or 535. Is that a driver issue? All other projects run just fine on version 470.


Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
Traceback (most recent call last):
  File "/var/lib/boinc-client/slots/24/bin/rbfe_explicit_sync.py", line 10, in <module>
    rx.setupJob()
  File "/var/lib/boinc-client/slots/24/lib/python3.11/site-packages/sync/atm.py", line 85, in setupJob
    self.worker = OMMWorkerATM(ommsystem, self.config, self.logger)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/boinc-client/slots/24/lib/python3.11/site-packages/sync/worker.py", line 34, in __init__
    self.simulation = Simulation(self.topology, self.ommsystem.system, self.integrator, platform, properties)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/boinc-client/slots/24/lib/python3.11/site-packages/openmm/app/simulation.py", line 106, in __init__
    self.context = mm.Context(self.system, self.integrator, platform, platformProperties)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/boinc-client/slots/24/lib/python3.11/site-packages/openmm/openmm.py", line 12171, in __init__
    _openmm.Context_swiginit(self, _openmm.new_Context(*args))
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
openmm.OpenMMException: Error loading CUDA module: CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)
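
For anyone comparing failed tasks across hosts, the CUDA error name and code can be pulled out of a STDERR dump like the one above with a few lines of Python. This is only a minimal sketch: the function name `find_cuda_errors` and the hint text are made up for illustration and are not part of OpenMM, BOINC or the ATMML app.

```python
import re

# Matches driver-API style error names followed by the numeric code,
# e.g. "CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)".
CUDA_ERROR_RE = re.compile(r"(CUDA_ERROR_[A-Z0-9_]+) \((\d+)\)")

# Hint wording is this sketch's own, not output from any tool.
HINTS = {
    "CUDA_ERROR_UNSUPPORTED_PTX_VERSION":
        "PTX was produced by a toolchain the installed driver does not "
        "support; a newer driver is the usual remedy.",
}


def find_cuda_errors(stderr_text):
    """Return a list of (error_name, error_code) pairs found in the text."""
    return [(name, int(code)) for name, code in CUDA_ERROR_RE.findall(stderr_text)]


if __name__ == "__main__":
    sample = ("openmm.OpenMMException: Error loading CUDA module: "
              "CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)")
    for name, code in find_cuda_errors(sample):
        print(name, code, "-", HINTS.get(name, "no hint available"))
```

The same function can be pointed at the full task output copied from a result page.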

Ian&Steve C.
Message 61616 - Posted: 22 Jul 2024, 15:07:38 UTC - in response to Message 61615.  

with that error, yes, i would assume the old driver version is the issue.

CUDA historically has not been forward compatible. as in, a CUDA 10 binary could not run on a system with only CUDA 8 drivers. but the opposite was true in most cases: backward compatibility is fine and you can run even very old CUDA code with the latest drivers.

only starting with CUDA 11.1 was forward compatibility introduced, and only within the same major version. So a system with only CUDA 11.1 drivers could still run up to CUDA 11.8 binaries. Same goes for CUDA 12, where all CUDA 12 drivers will be compatible with all CUDA 12+ binaries.

I have a feeling that some parts of this new ATMML app, and probably OpenMM in particular (based on what's throwing the error), actually require CUDA 12+ drivers, and that the app is misidentified at the project as being CUDA 11 compatible.

you could test this by installing the newer drivers and seeing if the tasks then run.

which other projects have issues with the newer drivers?
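
The rule of thumb described above can be written down as a small Python check. This is a rough sketch under stated assumptions, not a definitive compatibility test: the driver-branch table is a partial list taken from NVIDIA's Linux release notes as best remembered and should be verified there, the helper names are invented for illustration, and the version strings in the usage lines are examples only. NVIDIA's compatibility notes also describe extra limits for kernels that are JIT-compiled from PTX, which this sketch does not model.

```python
import subprocess

# Driver branch -> highest CUDA version that branch supports (partial list,
# an assumption to be checked against NVIDIA's release notes).
DRIVER_BRANCH_TO_CUDA = {
    470: (11, 4),
    520: (11, 8),
    525: (12, 0),
    530: (12, 1),
    535: (12, 2),
}


def query_driver_version():
    """Ask nvidia-smi for the installed driver version, e.g. '535.183.01'."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()[0].strip()


def driver_cuda(driver_version):
    """Map a driver version string to its CUDA version, or None if unknown."""
    branch = int(driver_version.split(".")[0])
    return DRIVER_BRANCH_TO_CUDA.get(branch)


def can_run(driver_version, app_cuda):
    """Apply the rule of thumb from the post; returns True, False or None."""
    drv = driver_cuda(driver_version)
    if drv is None:
        return None  # branch not in the table, no verdict
    if app_cuda <= drv:
        return True  # newer drivers run older CUDA code
    # Newer minor version is tolerated only within the same major version,
    # and only for drivers at CUDA 11.1 or later.
    return app_cuda[0] == drv[0] and drv >= (11, 1)


if __name__ == "__main__":
    print(can_run("470.239.06", (11, 8)))  # same major version
    print(can_run("470.239.06", (12, 0)))  # app needs a newer major version
```

On a host with the NVIDIA driver installed, `can_run(query_driver_version(), (12, 0))` would apply the same check to the real driver.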

Pascal
Message 61617 - Posted: 22 Jul 2024, 16:10:31 UTC - in response to Message 61616.  

On my machine the system's stock drivers work very well. These are the 535 drivers supplied when installing Linux Mint.

https://www.gpugrid.net/results.php?userid=563937

Drago
Message 61619 - Posted: 23 Jul 2024, 12:25:52 UTC - in response to Message 61617.  

Pascal wrote: "On my machine the system's stock drivers work very well. These are the 535 drivers supplied when installing Linux Mint.

https://www.gpugrid.net/results.php?userid=563937"

I tried to install the 535 driver, but after that my GPU was no longer recognised by Amicable, Einstein and Asteroids. GPUGRID lets me start new WUs, but they fail after 43 seconds saying that no Nvidia GPU was found. Do I have to install additional libraries or something like that? I also noticed that there is an open driver package from Nvidia, a regular meta package and a server version of that driver. Which one are you guys using?

Ian&Steve C.
Message 61620 - Posted: 23 Jul 2024, 13:13:11 UTC - in response to Message 61619.  

Drago wrote: "I tried to install the 535 driver, but after that my GPU was no longer recognised by Amicable, Einstein and Asteroids. GPUGRID lets me start new WUs, but they fail after 43 seconds saying that no Nvidia GPU was found. Do I have to install additional libraries or something like that? I also noticed that there is an open driver package from Nvidia, a regular meta package and a server version of that driver. Which one are you guys using?"


if you're running OpenCL applications then yes, you need an additional OpenCL package.

sudo apt install ocl-icd-libopencl1

535 drivers work fine for einstein, most of my hosts are on that driver and I contribute to einstein primarily.
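
To see whether the OpenCL pieces are actually in place after installing that package, a quick check along these lines may help. It is a minimal sketch that assumes the conventional Linux layout, where the ICD loader is `libOpenCL.so.1` and vendor drivers register themselves through `.icd` files under `/etc/OpenCL/vendors`; the function name `opencl_status` is made up for illustration.

```python
import ctypes.util
from pathlib import Path


def opencl_status(vendors_dir="/etc/OpenCL/vendors"):
    """Report the OpenCL loader library and the registered vendor ICD files."""
    # find_library returns something like 'libOpenCL.so.1', or None if absent.
    loader = ctypes.util.find_library("OpenCL")
    directory = Path(vendors_dir)
    if directory.is_dir():
        icd_files = sorted(p.name for p in directory.glob("*.icd"))
    else:
        icd_files = []
    return {"loader": loader, "icd_files": icd_files}


if __name__ == "__main__":
    status = opencl_status()
    print("OpenCL loader:", status["loader"] or "not found")
    print("Vendor ICD files:", status["icd_files"] or "none found")
```

A missing loader points at the package above; a missing vendor file points at the driver installation instead.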

Pascal
Message 61621 - Posted: 23 Jul 2024, 15:40:38 UTC - in response to Message 61620.  

I don't use any extra packages.
I installed Linux Mint normally and applied the system and driver updates.
I installed the 535 drivers through the driver manager and everything works very well.
BOINC recognizes my RTX 4060, my RTX A2000 and my GTX 1650 in the same PC.
I crunch for GPUGRID and Amicable Numbers without problems.
Either your system installation is faulty or you have a hardware problem.

Pascal
Message 61622 - Posted: 23 Jul 2024, 15:55:08 UTC

To start with, I advise you to test your RAM modules with the free memtest and not with another program. It works very well and is reliable.

https://www.memtest86.com/

©2025 Universitat Pompeu Fabra