ACEMD updated app
Author | Message |
---|---|
Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
As I said, we are currently compiling the Windows version. GDF |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
Might as well compile it for CUDA 11.8 to bring Ada (40-series) support. |
Joined: 23 Nov 08 Posts: 1 Credit: 612,500 RAC: 0 |
Hello everyone! I am in Shanghai, China. How can I get the GPU to work at 100%? I have noticed that while tasks are running, the GPU stays at around 30%. |
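Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
"Hello everyone! I am in Shanghai, China. How can I get the GPU to work at 100%? I have noticed that while tasks are running, the GPU stays at around 30%." This is normal for this Python application. It uses the CPU more than the GPU, so GPU utilization is limited by the CPU. If you run two tasks at the same time, you can raise GPU utilization. But with this Python application you cannot get the GPU to 100%. |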
Joined: 17 Mar 10 Posts: 1 Credit: 5,362,500 RAC: 0 |
My Nvidia GPU gets to 80%. I am also running other CPU (20%) and Intel GPU (97%) projects at the same time. After setting the power plan to best performance, the CPU goes to 50%. Intel i7, 12th generation. |
Joined: 8 Aug 19 Posts: 252 Credit: 458,054,251 RAC: 0 |
Looking around I see the present batch of protein ligand sims are crashing... DARNIT! process exited with code 195 (0xc3, -61) Anything else found? "Together we crunch To check out a hunch And wish all our credit Could just buy us lunch" Piasa Tribe - Illini Nation |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
"Looking around I see the present batch of protein ligand sims are crashing... DARNIT!" If someone can preserve the data files and slot directory before the result gets uploaded and subsequently wiped from your system, it should be easy to figure out what's wrong. My guess is they didn't name that run.sh file properly (via open_name probably), or didn't add a task to extract the file in the wrapper config file (job.xml), or something along those lines. |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
Actually I have some on my system, so I took a look. There appear to be many things wrong. The job.xml file is calling just tar, with no path to say which tar that is. This should probably be /bin/tar to use the system tar. The extracted run.sh script looks woefully lacking in detail. I can see it trying to call python and conda from 'bin/', but that is not included in the input package, so it will fail. The input tarball only includes some text/config files and not the whole Python package. |
Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
What app exactly? |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
"What app exactly?" The new free energy one ('ATM' moniker), using the wrapper to call the run.sh script. Also, it would be a good idea to add a checkbox for this app in project preferences. This app showed up with no warning and no announcement from the project, and it seems there is no way to opt out of it. I'm not sure if it's marked as beta or not. |
Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
Yes, we should have made a beta, but this app is not related to this thread. |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
"Yes, we should have made a beta, but this app is not related to this thread." You're right, but there is no announcement thread for this app, so there is nowhere else appropriate in the News section to get your attention about it. |
Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
Soon we will announce it. This is just testing to see if it works, which should have been done on a beta app. I expect tons of workunits using this app. Soon I will introduce a new postdoc running the simulations. g |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
Interesting to see that Ada "should" run on the Ampere cubins. I know the app has an architecture compatibility check, and it may fail there even if it could otherwise work. You could also consider compiling your apps with the PTX version included for forward compatibility, like this: -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_86,code=compute_86 and the user can set the environment variable as needed. Or you could set it in the wrapper config file. |
Joined: 1 Jan 15 Posts: 1162 Credit: 12,205,098,501 RAC: 9,135,494 |
I am successfully running the current ACEMD_3 tasks on a GTX980ti, on a Quadro P5000, and on two RTX3070. However, they fail on a GTX1650 after a few seconds: https://www.gpugrid.net/result.php?resultid=33263379 https://www.gpugrid.net/result.php?resultid=33263343 Can anyone tell me what might be the reason? |
Joined: 24 Sep 10 Posts: 588 Credit: 11,396,036,510 RAC: 11,719,261 |
As a first step, you can try resetting the GPUGRID project on the failing host. But the likely reason is that 4 GB of RAM is too little to run these tasks. |
Joined: 1 Jan 15 Posts: 1162 Credit: 12,205,098,501 RAC: 9,135,494 |
... that's what I am guessing, too. However, I was closely watching the RAM usage (via MemInfo) when the tasks started: at the moment the task crashed, about 2 GB were still free. Further, for the tasks running on the other hosts mentioned above, the Windows Task Manager shows a RAM usage between 60 MB and 400 MB per task. Maybe the CPU, an Intel Core2 Duo E7400 @ 2.80GHz, is too old for these tasks? (However, some other GPU projects like Einstein, WCG and Primegrid are running well.) |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,469,283,595 RAC: 3,993,807 |
... it could very well be that the CPU is too old. It does not support AVX extensions, for example, and if the application is built with this requirement then that could be a reason. |
Joined: 1 Jan 15 Posts: 1162 Credit: 12,205,098,501 RAC: 9,135,494 |
Perhaps one of the GPUGRID people could tell me if this is the case? |
Joined: 6 Mar 18 Posts: 38 Credit: 1,323,842,080 RAC: 328,325 |
Just had one and it failed after 26 seconds on my 4090 |
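
A note on the crash message quoted earlier in the thread, "process exited with code 195 (0xc3, -61)": the three numbers look like one and the same exit status printed three ways. The sketch below assumes the log shows the low byte of the status as unsigned decimal, as hex, and as a signed 8-bit value; the function name `describe_exit_code` is made up for illustration and is not part of BOINC or the wrapper.

```python
def describe_exit_code(code: int) -> str:
    """Format an exit status the way the quoted log line appears to.

    Assumption: only the low 8 bits matter, shown as unsigned decimal,
    hex, and two's-complement signed 8-bit.
    """
    unsigned = code & 0xFF  # keep the low byte only
    # Reinterpret the byte as a signed 8-bit integer.
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return f"{unsigned} (0x{unsigned:x}, {signed})"


if __name__ == "__main__":
    print(describe_exit_code(195))  # prints: 195 (0xc3, -61)
```

So 195, 0xc3 and -61 do not point to three different errors; a search for any one of them in other task logs should turn up the same failure.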
©2025 Universitat Pompeu Fabra