Message boards : Graphics cards (GPUs) : RTX 5090 performance
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
The longest tasks of the present batch (QUICO_ATM_GAFF2_new_cdk8) take 7,500-7,700s (2h 5m - 2h 8m) on my RTX 5090. The power consumption of the GPU fluctuates between 200W and 400W (the power limit of the card is 575W). The same selection of the present batch takes 8,826-8,853s (2h 27m) on my RTX 4090, whose power consumption fluctuates between 212W and 283W (power limit 450W). So on this selection of the present batch, the 5090 is 15-17% faster than the 4090. This batch utilizes the GPU sparsely.

I've also tested the performance difference between the 5090 and the 4090 with FAH: the full potential of the 5090 is utilized only by the most demanding project (#16525, the largest structure at 714,096 atoms). There the 5090 is 35-37% faster than the 4090, while consuming 30% more power (520W vs 398W). So the 5090 vs 4090 feels like the 3080 Ti vs 2080 Ti: the new card is faster, but it isn't more power efficient.
(Joined: 18 Mar 10, Posts: 28, Credit: 41,810,583,419, RAC: 13,276)
How do the 4090 and 5090 cards compare if you run 2 tasks at once? 3 tasks at once?
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
Good question. I've recreated the app_config.xml files on both hosts, and now 2 workunits are running simultaneously. The power consumption went up a little on both cards, so I don't think it helped that much. We'll see in the morning; I hope there will be enough workunits on my hosts.
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
The runtimes almost doubled when running two tasks simultaneously. It's hard to tell exactly, because there's significant variation in runtimes even within the same selection of the present batch. I prefer to finish tasks as fast as they can be finished, so I've reverted to one workunit per GPU.
(Joined: 18 Mar 10, Posts: 28, Credit: 41,810,583,419, RAC: 13,276)
Thanks for checking the 2X configuration option.
(Joined: 29 Aug 24, Posts: 71, Credit: 3,321,790,989, RAC: 1,408)
KeithM gave me the protocol for running multiple jobs on Linux. As usual, it was involved to get set up; I assume you did that as well. The cards ran pegged at 100%, so the benefit depends on the usage profile of a single task. I can't remember exactly, but I think the old ATMMLs were running at 90% on average, so I got about a 10% boost running (don't quote me) MPS. Depending on when I received a task, I was running into trouble finishing it within 24h, so the 10% boost was being countered by the time to finish. The other question for me was whether I want the cards running at 100% almost nonstop.
(Joined: 13 Dec 17, Posts: 1419, Credit: 9,119,446,190, RAC: 891)
Just use an app_config.xml file for the app with a gpu_usage factor of 0.5. That will cause the GPU to load up two concurrent tasks. On Linux you can also benefit from the Nvidia MPS server software included in the Nvidia drivers.

```
<app_config>
  <app>
    <name>ATM</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

```
export CUDA_VISIBLE_DEVICES=0,1
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d
```

Adjust the CUDA_VISIBLE_DEVICES list to the number of GPU cards in the host as necessary. Run the mps-server commands as root in a terminal before launching BOINC. https://docs.nvidia.com/deploy/mps/index.html
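A quick way to sanity-check that the control daemon is up (a minimal sketch, assuming the same pipe and log locations as above; the exact log file name can vary by driver version):

```
# Ask the MPS control daemon for the PIDs of any running MPS servers.
# An empty reply is normal until the first CUDA client has connected.
echo get_server_list | nvidia-cuda-mps-control

# The control daemon also writes status messages under CUDA_MPS_LOG_DIRECTORY.
cat /tmp/nvidia-log/control.log
```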
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
To add on to what Keith said: make sure you don't run any OpenCL tasks while using MPS. MPS is very useful for making your computation more efficient on the GPU, but it's for CUDA code only, and any OpenCL tasks will fail; you cannot run OpenCL alongside CUDA tasks under MPS. You will need to stop the MPS server daemon before resuming any OpenCL computation.
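For reference, the daemon is stopped through the same control interface; a minimal sketch (run it as the user that started the daemon, typically root):

```
# Shut down the MPS control daemon (and its servers) before running OpenCL work.
echo quit | nvidia-cuda-mps-control
```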
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
Thanks for the CUDA MPS info. I'll give it a try when there's plenty of work. Due to the non-linear bonus system of FAH, it's more "reasonable" to crunch there with the top-end cards, as a 35% faster GPU receives 60% more credit. BTW, my RTX 5090 was suspiciously slow (by 5.5%) with FAH tasks (project 16525) after I had crunched a few GPUGrid workunits (some of them simultaneously). It was fixed by a restart. It could be unrelated to GPUGrid, as I've observed a similar (but less significant) slowdown when crunching only FAH tasks.
**koschi** (Joined: 14 Aug 08, Posts: 127, Credit: 913,858,161, RAC: 18)
Is there any benefit from MPS when running a single WU on a single card?
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> Is there any benefit from MPS when running a single WU on a single card?

Not for increasing overall performance, no. But the longer answer is: it depends. In rare instances you may have a type of task that needs a lot of VRAM, say just slightly more than you have, and this causes you to error out most tasks, so you get almost no credit for this type of work since they will all fail with out-of-memory errors. However, a lot of CUDA applications actually allocate VRAM based on how many active SMs or active cores you have, since each parallel chunk needs its own memory; more parallel chunks means more memory needed. You can use the same method of decreasing the active thread percentage through MPS to effectively clamp down the amount of VRAM needed to process the task. This slows the task down slightly, but if you clamp it enough that you can now complete the majority of tasks, you will get more credit and complete more work overall, even if each task no longer uses the whole GPU and runs a little slower. In this case getting work done slower is better than no work at all.

That second, rare situation is (or was) actually helpful for running Quantum Chemistry tasks here at GPUGRID on GPUs with large core counts and low-ish VRAM, like the Titan V. I believe clamping the active thread percentage to 60 or 70% was enough to allow most tasks to complete, whereas most failed when you tried to run with the full GPU enabled.
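A minimal sketch of that clamp, reusing the environment-variable approach Keith posted above; the 60% figure is just the illustrative value mentioned here, not a tested recommendation:

```
# Start the MPS control daemon with a reduced default active thread percentage.
# Clients launched under it see fewer active SMs, so applications that size
# their buffers per SM allocate less VRAM, at the cost of some throughput.
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=60
nvidia-cuda-mps-control -d
```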
**koschi** (Joined: 14 Aug 08, Posts: 127, Credit: 913,858,161, RAC: 18)
Thanks, understood.
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
The MPS server doesn't seem to work:
https://gpugrid.net/gpugrid/result.php?resultid=38431975
https://gpugrid.net/gpugrid/result.php?resultid=38431034
https://gpugrid.net/gpugrid/result.php?resultid=38429714
https://gpugrid.net/gpugrid/result.php?resultid=38429343

Openmm.OpenMMException: Error initializing CUDA: CUDA_ERROR_MPS_CONNECTION_FAILED (805)
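The usual first checks for CUDA_ERROR_MPS_CONNECTION_FAILED are whether the control daemon is actually running and whether the client sees the same pipe directory it was started with; a sketch, assuming the /tmp/nvidia-mps location from earlier in the thread:

```
# Is the MPS control daemon (and possibly a server) running at all?
pgrep -af nvidia-cuda-mps

# Does the pipe directory exist and match CUDA_MPS_PIPE_DIRECTORY as seen
# by the BOINC client and its science apps?
ls -l /tmp/nvidia-mps
```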
(Joined: 29 Aug 24, Posts: 71, Credit: 3,321,790,989, RAC: 1,408)
I don't believe it's MPS related. The subsequent errors on the first WU above include:
Error loading CUDA module: CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)
(Joined: 13 Dec 17, Posts: 1419, Credit: 9,119,446,190, RAC: 891)
Did you set up the mps-server environment variables beforehand? I just run a script after booting and before I start BOINC.

```
export CUDA_VISIBLE_DEVICES=0,1
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d
```

You need to use a root terminal to set the environment variables and start the mps-server daemon running in the background.

[EDIT] Maybe the 5090 is too new and not yet supported by the mps-server. None of my teammates using the mps-server have anything that new in their hosts; 2000, 3000 and 4000 series cards run it fine.
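The kind of boot script described above could look roughly like this (start-mps.sh is a hypothetical name; the values simply mirror the variables in the post):

```
#!/bin/bash
# start-mps.sh -- hypothetical helper: run as root after boot, before the BOINC client.
if [ "$(id -u)" -ne 0 ]; then
    echo "please run this script as root" >&2
    exit 1
fi
export CUDA_VISIBLE_DEVICES=0,1                  # adjust to the number of GPUs in the host
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d                       # start the MPS control daemon in the background
```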
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
> Error loading CUDA module: CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)

I can't find this message in the stderr of my 4 tasks.

EDIT: So this error showed up on other hosts. I don't think I was so unlucky that my host got 4 bad tasks in succession.
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> The MPS server doesn't seem to work:

You'd need to explain how you set up MPS. What environment variables did you use? The more information you can provide, the better.
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> Maybe the 5090 is too new and not yet supported by the mps-server. None of my teammates using the mps-server have anything that new in their hosts; 2000, 3000 and 4000 series cards run it fine.

I've rented a 5090 before. MPS works fine on it.
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> I don't believe it's MPS related.

That's specific to the hosts that have that error; they have a driver version that is too old. It's not related to the WU/task at all.
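For anyone hitting that error: the header of plain nvidia-smi output shows the installed driver version and the highest CUDA version it supports, which is the quickest way to confirm whether the driver is new enough for the app's CUDA build:

```
# The first lines of output list "Driver Version" and "CUDA Version".
# CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222) typically means the app was built
# against a newer CUDA toolkit than the installed driver supports.
nvidia-smi
```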
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
> I've rented a 5090 before. MPS works fine on it.

It's working for me now, too. I had to start the daemon as root: `sudo nvidia-cuda-mps-control -d`. After that I restarted the BOINC daemon to be safe, but I don't think that's necessary (only the actual CUDA tasks need to be started after the MPS daemon).

EDIT: the power consumption doesn't fluctuate as much as before, but it's still far from the power limit (~350W).
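If you do restart the client, on most systemd-based distributions the BOINC service is named boinc-client (treat the exact service name as an assumption for your distribution), so the order would be roughly:

```
# Start MPS first, then restart the BOINC client so new GPU tasks attach to it.
sudo nvidia-cuda-mps-control -d
sudo systemctl restart boinc-client   # service name may differ by distribution
```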