PYSCFbeta: Quantum chemistry calculations on GPU

Ian&Steve C.
Message 61312 - Posted: 21 Feb 2024, 13:22:20 UTC - in response to Message 61311.  

Scheduler requests from your host are not specific about what they are asking for. The host just asks for work for "Nvidia", and the scheduler on the project side decides what to send based on your preferences. The way the scheduler is set up right now, you won't be sent both types of work when both are available, only ATM.

You will need to move the GPUs to different hosts and set up different project preferences for each of them, or run two clients on one host with one GPU attached to each,

or just stay with ATM on both cards.


Pascal
Message 61313 - Posted: 21 Feb 2024, 13:32:31 UTC - in response to Message 61312.  

OK, thanks.

ServicEnginIC
Message 61319 - Posted: 21 Feb 2024, 22:59:26 UTC - in response to Message 61307.  

QChem does not seem to be classified in the scheduler as "test" or beta, despite being treated as such by the staff and despite the app name literally having the word beta in it. If you disable test tasks and enable only QChem, you will still get them.

Adding some variety to the current range of GPUGRID apps: I happened to be watching the Server status page when a limited number (about 215) of "ATM: Free energy calculations of protein-ligand binding" tasks showed up. These are distinct from the previously existing ATMbeta branch.
I managed to configure a venue on the GPUGRID preferences page to catch one of them before the unsent tasks vanished.
Task: tnks2_m5f_m5l_1_RE-QUICO_ATM_GAFF2_1fs-0-5-RND3367_1
To achieve this, I disabled test apps and enabled only the (somewhat paradoxically ;-) "ATM (beta)" app.
That task is currently running on my GTX 1660 Ti GPU at an estimated rate of 9.72% per hour.

And quickly returning to the PYSCFbeta (QChem) topic: the number of tasks for this app grew today to a noticeable 80K+ ready to send.
After peaking, QChem unsent tasks are now decreasing again.

Pascal
Message 61320 - Posted: 22 Feb 2024, 9:50:55 UTC

Hello
Are there any work units available for Windows?

ServicEnginIC
Message 61321 - Posted: 22 Feb 2024, 11:05:33 UTC - in response to Message 61320.  
Last modified: 22 Feb 2024, 11:28:25 UTC

Yes, ATM and ATMbeta apps have both Windows and Linux versions currently available.

Edit: Regarding Quantum chemistry, there is still no Windows version.

Erich56
Message 61322 - Posted: 22 Feb 2024, 16:47:17 UTC - in response to Message 61321.  

Regarding Quantum chemistry, there is still no Windows version.

:-( :-( :-(

Bedrich Hajek
Message 61353 - Posted: 2 Mar 2024, 1:00:26 UTC

Skip Da Shu
Message 61360 - Posted: 3 Mar 2024, 0:58:43 UTC

https://imgur.com/evCBB73

GPUGRID error rate across 2x 3070 8GB, 2x 3080 10GB and 1x 4070 Super 12GB (the early part is with 3x 3070 8GB, one of which was replaced by the 4070S on 2/20).

Skip

Skip Da Shu
Message 61363 - Posted: 4 Mar 2024, 13:00:29 UTC - in response to Message 61360.  
Last modified: 4 Mar 2024, 13:04:08 UTC




Going the wrong direction :-(

Skip

Ian&Steve C.
Message 61364 - Posted: 4 Mar 2024, 13:07:01 UTC - in response to Message 61363.  

That's to be expected with 8-10GB cards.

You might get better context if you split the graphs up by card type, so you can see the relative error rate for different VRAM sizes. I'm guessing most errors come from the 8GB cards.

[BAT] Svennemans
Message 61370 - Posted: 4 Mar 2024, 16:02:45 UTC

On my GTX 1080 Ti 11GB, I've only got about a 1% error rate due to memory.

But watching 'nvidia-smi dmon', there are a lot of close shaves where I'm only a couple of MB below the limit...

So on a 10GB card, I'd already expect a non-trivial error rate.
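
For anyone who wants to put a number on those close shaves rather than eyeball them, here is a minimal sketch. It assumes samples are captured with `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits` (values in MiB). The function name `min_headroom_mib` and the sample numbers are made up for illustration and are not real measurements.

```python
def min_headroom_mib(csv_text):
    """Return the smallest (total - used) gap in MiB across all samples.

    Each non-empty line is expected to look like "10950, 11264",
    i.e. memory.used, memory.total in MiB.
    """
    gaps = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between samples
        used, total = (int(field) for field in line.split(","))
        gaps.append(total - used)
    if not gaps:
        raise ValueError("no samples found")
    return min(gaps)


# Hypothetical samples for an 11GB card (illustrative values only).
sample = """
8120, 11264
11201, 11264
9870, 11264
"""
print(min_headroom_mib(sample))
```

Logging one sample per second while a task runs and feeding the log to this function would show how close the worst moment came to the card's limit.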

Skip Da Shu
Message 61401 - Posted: 9 Mar 2024, 20:10:53 UTC - in response to Message 61364.  
Last modified: 9 Mar 2024, 20:13:28 UTC

That's to be expected with 8-10GB cards.

You might get better context if you split the graphs up by card type, so you can see the relative error rate for different VRAM sizes. I'm guessing most errors come from the 8GB cards.


They do:

8GB – last 2 checks of 2 cards 44.07
10GB – last 2 checks of 2 cards 30.80
12GB – last 2 checks of 1 card 7.62

But I need to look at the last day or two as rates have been going up.
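
A per-card breakdown like the one above could be produced along these lines. This is only a sketch: the `(vram_gb, failed)` input format, the function name `error_rate_by_vram`, and the task list are all assumptions for illustration, not how the figures in this post were actually computed.

```python
from collections import defaultdict


def error_rate_by_vram(tasks):
    """Map VRAM size (GB) -> percentage of failed tasks.

    `tasks` is an iterable of (vram_gb, failed) pairs, where `failed`
    is True for a task that ended in an error.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for vram_gb, failed in tasks:
        totals[vram_gb] += 1
        if failed:
            failures[vram_gb] += 1
    return {gb: 100.0 * failures[gb] / totals[gb] for gb in sorted(totals)}


# Made-up task outcomes, purely illustrative.
tasks = [(8, True), (8, False), (10, False), (10, True),
         (10, False), (10, False), (12, False)]
for gb, pct in error_rate_by_vram(tasks).items():
    print(f"{gb}GB: {pct:.2f}% errors")
```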
- da shu @ HeliOS,
"A child's exposure to technology should never be predicated on an ability to afford it."

Skip Da Shu
Message 61402 - Posted: 9 Mar 2024, 20:22:16 UTC
Last modified: 9 Mar 2024, 20:23:07 UTC

Anyone have insight into this error:

<stderr_txt>
09:06:00 (130033): wrapper (7.7.26016): starting
[x86_64-pc-linux-gnu__cuda1121.zip]
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of x86_64-pc-linux-gnu__cuda1121.zip or
x86_64-pc-linux-gnu__cuda1121.zip.zip, and cannot find x86_64-pc-linux-gnu__cuda1121.zip.ZIP, period.
boinc_unzip() error: 9

It looks like every WU since the afternoon of the 7th (Zulu) is getting this, but only on my single 12GB 4070S.

Skip

Keith Myers
Message 61403 - Posted: 9 Mar 2024, 20:35:22 UTC - in response to Message 61402.  

That's a download error: the zip file is corrupted, since it is missing the end-of-central-directory signature.

I was getting that on a Google Drive zip archive a couple of days ago. Switching browsers let me download the archive correctly so it would unpack.
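
If you want to check a downloaded archive before the wrapper trips over it, something like the following sketch should work. It uses only Python's standard `zipfile` module; the helper name `zip_is_intact` is made up here, and where the file lives in your BOINC project directory is left for you to fill in.

```python
import zipfile


def zip_is_intact(path):
    """Return True if `path` has a readable central directory and every
    member passes its CRC check; False for truncated or corrupt files."""
    if not zipfile.is_zipfile(path):  # no end-of-central-directory record
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the first bad member name, or None if all are OK.
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
```

Calling `zip_is_intact(...)` on the downloaded `x86_64-pc-linux-gnu__cuda1121.zip` would return False for a truncated download like the one in the log above, in which case deleting the file so the client fetches it again is the usual next step.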

Skip Da Shu
Message 61404 - Posted: 9 Mar 2024, 23:13:30 UTC - in response to Message 61403.  

That's a download error: the zip file is corrupted, since it is missing the end-of-central-directory signature.

I was getting that on a Google Drive zip archive a couple of days ago. Switching browsers let me download the archive correctly so it would unpack.


Well, after 100+ of these errors, I finally got 3 good ones out of that box following a reboot done for an unrelated reason.

Thanx, Skip

Pascal
Message 61405 - Posted: 11 Mar 2024, 8:33:44 UTC

Hello
Are there any work units available for Windows?

Boca Raton Community HS
Message 61406 - Posted: 11 Mar 2024, 11:59:23 UTC - in response to Message 61405.  
Last modified: 11 Mar 2024, 12:00:00 UTC

There are none for this project (at this time).

Skip Da Shu
Message 61454 - Posted: 10 Apr 2024, 11:35:30 UTC

Error rates skyrocketed on me for this app... even on the 10GB cards (the 12GB card will be back on Thursday). This started late on April 7th.

The error rate is now over 50%, so I will have to set No New Work (NNW) until I can figure it out.

Skip
- da shu @ HeliOS,
"A child's exposure to technology should never be predicated on an ability to afford it."

Ian&Steve C.
Message 61455 - Posted: 10 Apr 2024, 13:17:01 UTC - in response to Message 61454.  

Error rates skyrocketed on me for this app... even on the 10GB cards (the 12GB card will be back on Thursday). This started late on April 7th.

The error rate is now over 50%, so I will have to set No New Work (NNW) until I can figure it out.

Skip


It's not you. It's that the new v4 tasks require more VRAM. I asked about this on their Discord.

I asked:
it seems the newer "v4" tasks on average require a bit more VRAM than the previous v3 tasks. I'm seeing a higher error percentage on 12GB cards.

v3 had about 5% failure from OOM on 12GB VRAM
v4 is more like 15% failure from OOM on 12GB VRAM
no failures with 16GB VRAM

what changed in V4?


Steve replied:
Yes, this makes sense, unfortunately. In the previous round of "inputs_v3**" it was calculating things incorrectly for any molecule containing iodine. This is the heaviest element in our dataset. The computational cost of this QM method scales with the size of the elements (it depends on the number of electrons). We are resending the incorrect calculations for iodine-containing molecules in this round of "v4" work units. Therefore the v4 set is a subset of the previous v3 WUs containing heavier elements, hence there are more OOM errors.
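
To get a feel for why a single iodine atom matters, here is a rough sketch that simply counts electrons in a neutral molecule from its formula. It is not the project's method and says nothing about how cost or VRAM actually scale; the function name `electron_count` and the two example molecules are my own picks for illustration, not taken from the GPUGRID dataset.

```python
import re

# Atomic numbers (electrons per neutral atom) for a few common elements.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9,
                 "S": 16, "Cl": 17, "Br": 35, "I": 53}


def electron_count(formula):
    """Total electrons in a neutral molecule, given a simple formula
    such as "C6H5I" (no parentheses, no charges)."""
    total = 0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_NUMBER[symbol] * (int(count) if count else 1)
    return total


# Illustrative molecules only.
for name, formula in [("benzene", "C6H6"), ("iodobenzene", "C6H5I")]:
    print(name, electron_count(formula))
```

Swapping one hydrogen for iodine takes the count from 42 to 94 electrons, which gives an idea of why the iodine-only v4 subset is heavier on average.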

Skip Da Shu
Message 61456 - Posted: 10 Apr 2024, 16:15:25 UTC - in response to Message 61455.  
Last modified: 10 Apr 2024, 16:15:57 UTC

Thank you. You probably just saved me hours of wasted time.

Error %
AVG ALL: 29.1
AVG – last 3: 59.0

8GB – last 2 72.76
10GB – last 2 66.52
12GB – last 2 3.55 (card out for a week)

©2025 Universitat Pompeu Fabra