PYSCFbeta: Quantum chemistry calculations on GPU

Ian&Steve C.

Message 61098 - Posted: 26 Jan 2024, 16:00:09 UTC - in response to Message 61097.  

The work-units require a lot of GPU memory.


How much is "a lot" exactly? I have a Pascal card, so it meets the compute capability requirement, but it has only 2 GB of VRAM. Without knowing the amount of VRAM required, I am not sure whether it will work.

The highest being used today on my Pascal cards is 795 MB.


Might want to watch that on a longer time scale; the VRAM use is not static, it fluctuates up and down.
Aurum

Message 61099 - Posted: 26 Jan 2024, 16:18:23 UTC
Last modified: 26 Jan 2024, 16:32:42 UTC

Retraction: I'm monitoring with BoincTasks Js 2.4.2.2, and it has bugs.
I loaded NVITOP, and the task does use 2 GB of VRAM at 100% GPU utilization.

BTW, if anyone wants to try NVITOP, here are my notes to install it on Ubuntu 22.04:
sudo apt update
sudo apt upgrade -y
sudo apt install python3-pip -y
python3 -m pip install --user pipx
python3 -m pip install --user --upgrade pipx
python3 -m pipx ensurepath
# if requested: sudo apt install python3.8-venv -y
# for Linux Mint 21.3: sudo apt install python3.10-venv -y
# open a new terminal, then:
pip3 install --upgrade nvitop
pipx run nvitop --colorful -m full
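
Since the VRAM use fluctuates, a single instantaneous reading can miss the peaks. As a rough sketch (my own suggestion, not part of the project; the 1-second interval and 5-minute window are arbitrary), you can also poll nvidia-smi from Python and record the peak:

# Poll nvidia-smi once a second and track the peak VRAM use, since an
# instantaneous value misses the short spikes discussed above.
import subprocess, time

peak = 0
for _ in range(300):  # ~5 minutes at 1-second intervals
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    used = max(int(v) for v in out.split())  # MiB, max across GPUs
    peak = max(peak, used)
    time.sleep(1)

print(f"Peak VRAM observed: {peak} MiB")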
Ian&Steve C.

Message 61100 - Posted: 26 Jan 2024, 16:26:34 UTC - in response to Message 61099.  

I'm not seeing any different behavior on my Titan Vs. The VRAM use still exceeds 3 GB at times, but it's spiky: you have to watch it for a few minutes, since instantaneous measurements might not catch it.
Boca Raton Community HS

Message 61101 - Posted: 26 Jan 2024, 17:04:26 UTC - in response to Message 61100.  

I am seeing spikes to ~7.6 GB with these. Not long-lasting (in the context of the whole work unit), but consistently elevated during that part of the work unit. I want to say that I saw that spike at about 5% complete and then again at 95% complete, but that could also be coincidence rather than a real pattern.
Ian&Steve C.

Message 61102 - Posted: 26 Jan 2024, 17:11:14 UTC - in response to Message 61101.  
Last modified: 26 Jan 2024, 17:14:12 UTC

I am seeing spikes to ~7.6 GB with these. Not long-lasting (in the context of the whole work unit), but consistently elevated during that part of the work unit. I want to say that I saw that spike at about 5% complete and then again at 95% complete, but that could also be coincidence rather than a real pattern.


To add on to this, for everyone's info:

these tasks (and a lot of CUDA applications in general) do not require any set absolute amount of VRAM. VRAM use scales with the individual GPU: generally, the more SMs you have, the more VRAM will be used. It's not linear, but some portion of the allocated VRAM scales directly with how many SMs are being used.

To put it simply, different GPUs with different core counts will show different amounts of VRAM utilization.

So even if a powerful GPU like an RTX 4090, with 100+ SMs on the die, needs 7+ GB, that doesn't mean something much smaller like a GTX 1070 needs that much. It has to be evaluated on a case-by-case basis.
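
As a rough illustration of that scaling (my own sketch with made-up numbers, not the project's actual allocator), you can read the SM count with CuPy and see how a per-SM workspace model plays out:

# Hypothetical model: many CUDA libraries size scratch buffers per SM, so
# total VRAM use ~= fixed overhead + per-SM workspace * SM count.
import cupy

dev = cupy.cuda.Device()
n_sm = dev.attributes["MultiProcessorCount"]  # SM count of the active GPU

FIXED_MB = 600   # assumed fixed overhead (context, kernels, constants)
PER_SM_MB = 48   # assumed per-SM workspace; purely illustrative

print(f"SMs: {n_sm}")
print(f"Estimated VRAM: {FIXED_MB + PER_SM_MB * n_sm} MB")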
Boca Raton Community HS

Message 61103 - Posted: 26 Jan 2024, 17:20:59 UTC - in response to Message 61102.  

I am seeing spikes to ~7.6 GB with these. Not long-lasting (in the context of the whole work unit), but consistently elevated during that part of the work unit. I want to say that I saw that spike at about 5% complete and then again at 95% complete, but that could also be coincidence rather than a real pattern.


To add on to this, for everyone's info:

these tasks (and a lot of CUDA applications in general) do not require any set absolute amount of VRAM. VRAM use scales with the individual GPU: generally, the more SMs you have, the more VRAM will be used. It's not linear, but some portion of the allocated VRAM scales directly with how many SMs are being used.

To put it simply, different GPUs with different core counts will show different amounts of VRAM utilization.

So even if a powerful GPU like an RTX 4090, with 100+ SMs on the die, needs 7+ GB, that doesn't mean something much smaller like a GTX 1070 needs that much. It has to be evaluated on a case-by-case basis.



Thanks for this! I did not know about the scaling; the correlation between SM count and VRAM usage is not something I had ever thought about.
bibi

Message 61108 - Posted: 29 Jan 2024, 13:55:27 UTC

Why do I always get a segmentation fault
on Windows/WSL2/Ubuntu 22.04.3 LTS?
12 processors, 28 GB memory, 16 GB swap, RTX 4070 Ti Super GPU with 16 GB, driver version 551.23

https://www.gpugrid.net/result.php?resultid=33759912
https://www.gpugrid.net/result.php?resultid=33758940
https://www.gpugrid.net/result.php?resultid=33759139
https://www.gpugrid.net/result.php?resultid=33759328
Ian&Steve C.

Message 61109 - Posted: 29 Jan 2024, 14:01:46 UTC - in response to Message 61108.  

Why do I always get a segmentation fault
on Windows/WSL2/Ubuntu 22.04.3 LTS?
12 processors, 28 GB memory, 16 GB swap, RTX 4070 Ti Super GPU with 16 GB, driver version 551.23

https://www.gpugrid.net/result.php?resultid=33759912
https://www.gpugrid.net/result.php?resultid=33758940
https://www.gpugrid.net/result.php?resultid=33759139
https://www.gpugrid.net/result.php?resultid=33759328


Likely something wrong with your environment or drivers.

Try running a native Linux OS install; WSL might not be well supported.
Ian&Steve C.

Message 61117 - Posted: 30 Jan 2024, 12:54:49 UTC - in response to Message 61109.  
Last modified: 30 Jan 2024, 13:15:58 UTC

Steve,

these TEST units you have out right now seem to be using a ton of reserved memory. One process right now is using 30+ GB, which seems much higher than usual, and I even have another one reserving 64 GB of memory. That's way too high.


Freewill

Message 61118 - Posted: 30 Jan 2024, 13:17:30 UTC
Last modified: 30 Jan 2024, 13:19:20 UTC

Here's one that died on my Ubuntu system, which has 32 GB of RAM:
https://www.gpugrid.net/result.php?resultid=33764282
Ian&Steve C.

Message 61119 - Posted: 30 Jan 2024, 14:33:26 UTC - in response to Message 61117.  
Last modified: 30 Jan 2024, 15:13:20 UTC

I see v3 being deployed now.

The memory limiting you're trying isn't working; I'm seeing it spike to near 100%.

I see you put export CUPY_GPU_MEMORY_LIMIT=50%

A quick Google search seems to indicate that you need to put the percentage in quotes, like this: export CUPY_GPU_MEMORY_LIMIT="50%". Alternatively, you can set a discrete memory amount as the limit, for example export CUPY_GPU_MEMORY_LIMIT="1073741824" to limit it to 1 GB.
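
For what it's worth, the same cap can also be set from inside Python through CuPy's default memory pool, which is what that environment variable controls; a minimal sketch, assuming the stock CuPy allocator is in use:

# In-process equivalent of CUPY_GPU_MEMORY_LIMIT: cap the default
# CuPy memory pool, by absolute size or by fraction of device memory.
import cupy

pool = cupy.get_default_memory_pool()
pool.set_limit(size=1 * 1024**3)    # hard cap: 1 GiB
# or: pool.set_limit(fraction=0.5)  # 50% of total device memory

print("limit bytes:", pool.get_limit())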

And the system memory use is still a little high, around 10 GB each. EDIT: system memory use still climbed to ~30 GB by the end.
Ian&Steve C.

Message 61120 - Posted: 30 Jan 2024, 16:01:04 UTC - in response to Message 61119.  
Last modified: 30 Jan 2024, 16:01:30 UTC

v4 report.

I see you attempted to add some additional VRAM limiting, but the task is still trying to allocate more VRAM; instead of using more, the process gets killed for trying to allocate beyond the limit.

https://gpugrid.net/result.php?resultid=33764464
https://gpugrid.net/result.php?resultid=33764469
Steve (Project scientist)

Message 61121 - Posted: 30 Jan 2024, 16:11:32 UTC

Yes, I was doing some testing to see how large a molecule we can compute properties for.

The previous batches have been for small molecules, which all work very well.

The memory use scales very quickly with increasing molecule size.
This test today had molecules 3 to 4 times the size of the previous batches. As you can see, I have not solved the memory-limiting issue yet. It should be possible to limit instantaneous GPU memory use (at the cost of runtime performance and increased CPU memory use), but due to the different levels of CUDA libraries in play in this code it is rather complicated. I will work on this locally for now and resume sending out the batches that were working well tomorrow!
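
For context, the kind of calculation each task runs is a single-point DFT job of the sort PySCF does on the GPU. The following is a minimal illustrative sketch only (made-up water molecule, basis set, and functional, assuming gpu4pyscf is installed), not the project's actual input:

# Minimal sketch of a GPU DFT single-point energy with PySCF + gpu4pyscf.
# Molecule, basis, and functional are illustrative, not the project's inputs;
# GPU memory use grows steeply with atom count and basis size.
from pyscf import gto
from gpu4pyscf.dft import rks

mol = gto.M(
    atom="O 0 0 0; H 0 0.757 0.587; H 0 -0.757 0.587",  # water, Angstrom
    basis="def2-tzvpp",
)
mf = rks.RKS(mol, xc="b3lyp")
energy = mf.kernel()
print(f"E(B3LYP) = {energy:.6f} Hartree")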

Thank you for the assistance and compute availability, it is much appreciated!
Ian&Steve C.

Message 61122 - Posted: 30 Jan 2024, 16:13:47 UTC - in response to Message 61121.  

No problem! Glad to see you were monitoring my feedback and making changes.

Looking forward to another stable batch tomorrow :) It should be similar to previous runs like yesterday's, right?
Steve (Project scientist)

Message 61123 - Posted: 30 Jan 2024, 16:18:55 UTC - in response to Message 61122.  

Yes, it will be the same as yesterday, but with roughly 10x the work units released.

Each work unit contains 100 small molecules.
Ian&Steve C.

Message 61124 - Posted: 30 Jan 2024, 16:19:50 UTC - in response to Message 61123.  

looking forward to it :)

Richard Haselgrove

Message 61126 - Posted: 31 Jan 2024, 12:38:25 UTC

I have Task 33765246 running on an RTX 3060 Ti under Linux Mint 21.3.

It's running incredibly slowly, and with zero GPU usage. I've found this in stderr.txt:

+ python compute_dft.py
/hdd/boinc-client/slots/5/lib/python3.11/site-packages/pyscf/dft/libxc.py:771: UserWarning: Since PySCF-2.3, B3LYP (and B3P86) are changed to the VWN-RPA variant, corresponding to the original definition by Stephens et al. (issue 1480) and the same as the B3LYP functional in Gaussian. To restore the VWN5 definition, you can put the setting "B3LYP_WITH_VWN5 = True" in pyscf_conf.py
warnings.warn('Since PySCF-2.3, B3LYP (and B3P86) are changed to the VWN-RPA variant, '
/hdd/boinc-client/slots/5/lib/python3.11/site-packages/gpu4pyscf/lib/cutensor.py:174: UserWarning: using cupy as the tensor contraction engine.
warnings.warn(f'using {contract_engine} as the tensor contraction engine.')
/hdd/boinc-client/slots/5/lib/python3.11/site-packages/pyscf/gto/mole.py:1280: UserWarning: Function mol.dumps drops attribute charge because it is not JSON-serializable
warnings.warn(msg)
Exception:
Fallback to CPU
Exception:
Fallback to CPU
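
The repeated "Exception: / Fallback to CPU" lines suggest the GPU path failed to initialize, so the task grinds on in CPU mode. A quick check that CuPy can actually see and use the GPU (my own diagnostic sketch, not part of the project's app):

# If either of these lines raises, gpu4pyscf's GPU path would fail
# and the code would fall back to the CPU, as in the log above.
import cupy

print("CUDA devices:", cupy.cuda.runtime.getDeviceCount())
print("device result:", (cupy.arange(10) ** 2).sum())  # forces a kernel launch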
Ian&Steve C.

Message 61127 - Posted: 31 Jan 2024, 12:40:53 UTC - in response to Message 61124.  
Last modified: 31 Jan 2024, 13:08:03 UTC

Steve,

this new batch, right off the bat, is loading up the GPU VRAM nearly full again.

edit: that was for a v1 task; I'll check out the v2s.
Ian&Steve C.

Message 61128 - Posted: 31 Jan 2024, 13:12:40 UTC - in response to Message 61127.  

OK, looks like the v2 tasks are back to normal. It was only that v1 task that was using lots of VRAM.

Steve (Project scientist)

Message 61129 - Posted: 31 Jan 2024, 13:19:52 UTC - in response to Message 61127.  

OK, my previous post was incorrect.

It turns out the previous large batch was not a representative test set: it only contained very small molecules, which is why the GPU RAM usage was low. As per my previous post, these tasks use a lot of GPU memory. You can see more detail in this post: http://gpugrid.org/forum_thread.php?id=5428&nowrap=true#60945

The work units are now just 10 molecules each. They vary in size from 10 to 20 atoms per molecule; all molecules in a WU are the same size. Test WUs (smallest and largest molecule sizes) pass on my GTX 1080 (8 GB) test machine without failing.

The CPU fallback part was left over from testing; it should have been removed, but it appears it was not.