Message boards : Graphics cards (GPUs) : RTX 5090 performance
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
The longest tasks of the present batch (QUICO_ATM_GAFF2_new_cdk8) take 7,500-7,700s (2h 5m - 2h 8m) on my RTX 5090. The power consumption of the GPU fluctuates between 200W and 400W (the power limit of the card is 575W). The same selection of the present batch takes 8,826-8,853s (2h 27m) on my RTX 4090, whose power consumption fluctuates between 212W and 283W (power limit 450W). So on this selection of the present batch, the 5090 is 15-17% faster than the 4090. This batch utilizes the GPU sparsely.

I've also tested the performance difference between the 5090 and the 4090 with FAH: the full potential of the 5090 is utilized only by the most demanding project (#16525, the largest structure at 714,096 atoms). There the 5090 is 35-37% faster than the 4090, while consuming 30% more power (520W vs 398W). So the 5090 vs 4090 feels like the 3080 Ti vs 2080 Ti: the new card is faster, but it isn't more power efficient.
(Joined: 18 Mar 10, Posts: 28, Credit: 41,810,583,419, RAC: 13,276)
How do the 4090 and 5090 cards compare if you run 2 tasks at once? 3 tasks at once?
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
Good question. I've recreated the app_config.xml files on both hosts, and now 2 workunits are running simultaneously. The power consumption went up a little on both cards, so I don't think it helped that much. We'll see in the morning; I hope there will be enough workunits on my hosts.
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
The runtimes almost doubled when running two tasks simultaneously. It's hard to tell exactly, because there's significant variation in runtimes even within the same selection of the present batch. I prefer to finish tasks as fast as they can be finished, so I've reverted to one workunit per GPU.
(Joined: 18 Mar 10, Posts: 28, Credit: 41,810,583,419, RAC: 13,276)
Thanks for checking the 2X configuration option.
(Joined: 29 Aug 24, Posts: 71, Credit: 3,321,790,989, RAC: 1,408)
KeithM gave me the protocol for running multiple jobs on Linux. As usual, it was involved to get set up; I assume you did that as well. The cards ran pegged at 100%, so the benefit depends on the usage profile of a single task. I can't remember exactly, but I think the old ATMMLs were running at 90% on average, so I got about a 10% boost running (don't quote me) MPS. Depending on when I received a task, I was running into trouble finishing it within 24h, so the 10% boost was being countered by the time to finish. The other question for me was whether I want the cards running at 100% almost nonstop.
(Joined: 13 Dec 17, Posts: 1419, Credit: 9,119,446,190, RAC: 891)
Just use an app_config.xml file for the app with a gpu_usage factor of 0.5. That will cause the GPU to load up two concurrent tasks. On Linux you can also benefit from the Nvidia MPS server software included in the Nvidia drivers.

```
<app_config>
  <app>
    <name>ATM</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

```
export CUDA_VISIBLE_DEVICES=0,1
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d
```

Adjust the CUDA_VISIBLE_DEVICES list to the number of GPU cards in the host as necessary. Run the mps-server commands as root in a terminal before launching BOINC. https://docs.nvidia.com/deploy/mps/index.html
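A quick way to sanity-check that the control daemon is up (a minimal sketch, assuming the same pipe and log locations as above; the exact log file name can vary by driver version):

```
# Ask the MPS control daemon for the PIDs of any running MPS servers.
# An empty reply is normal until the first CUDA client has connected.
echo get_server_list | nvidia-cuda-mps-control

# The control daemon also writes status messages under CUDA_MPS_LOG_DIRECTORY.
cat /tmp/nvidia-log/control.log
```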
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
To add on to what Keith said: make sure you don't run any OpenCL tasks while using MPS. MPS is very useful for making your computation more efficient on the GPU, but it's for CUDA code only, and any OpenCL tasks will fail; you cannot run OpenCL alongside CUDA tasks under MPS. You will need to stop the MPS server daemon before resuming any OpenCL computation.
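For reference, the daemon is stopped through the same control interface; a minimal sketch (run it as the user that started the daemon, typically root):

```
# Shut down the MPS control daemon (and its servers) before running OpenCL work.
echo quit | nvidia-cuda-mps-control
```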
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
Thanks for the CUDA MPS info. I'll give it a try when there's plenty of work. Due to the non-linear bonus system of FAH, it's more "reasonable" to crunch there with the top-end cards, as a 35% faster GPU receives 60% more credit. BTW, my RTX 5090 was suspiciously slow (by 5.5%) with FAH tasks (project 16525) after I had crunched a few GPUGrid workunits (some of them simultaneously). It was fixed by a restart. It could be unrelated to GPUGrid, as I've observed a similar (but less significant) slowdown when crunching only FAH tasks.
**koschi** (Joined: 14 Aug 08, Posts: 127, Credit: 913,858,161, RAC: 18)
Is there any benefit from MPS when running a single WU on a single card?
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> Is there any benefit from MPS when running a single WU on a single card?

Not for increasing overall performance, no. But the longer answer is: it depends. In rare instances you may have a type of task that needs a lot of VRAM, say just slightly more than you have, and this causes you to error out most tasks, so you get almost no credit for this type of work since they will all fail with out-of-memory errors. However, a lot of CUDA applications actually allocate VRAM based on how many active SMs or active cores you have, since each parallel chunk needs its own memory; more parallel chunks means more memory needed. You can use the same method of decreasing the active thread percentage through MPS to effectively clamp down the amount of VRAM needed to process the task. This slows the task down slightly, but if you clamp it enough that you can now complete the majority of tasks, you will get more credit and complete more work overall, even if each task no longer uses the whole GPU and runs a little slower. In this case getting work done slower is better than no work at all.

That second, rare situation is (or was) actually helpful for running Quantum Chemistry tasks here at GPUGRID on GPUs with large core counts and low-ish VRAM, like the Titan V. I believe clamping the active thread percentage to 60 or 70% was enough to allow most tasks to complete, whereas most failed when you tried to run with the full GPU enabled.
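A minimal sketch of that clamp, reusing the environment-variable approach Keith posted above; the 60% figure is just the illustrative value mentioned here, not a tested recommendation:

```
# Start the MPS control daemon with a reduced default active thread percentage.
# Clients launched under it see fewer active SMs, so applications that size
# their buffers per SM allocate less VRAM, at the cost of some throughput.
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=60
nvidia-cuda-mps-control -d
```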
**koschi** (Joined: 14 Aug 08, Posts: 127, Credit: 913,858,161, RAC: 18)
Thanks, understood.
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
The MPS server doesn't seem to work:
https://gpugrid.net/gpugrid/result.php?resultid=38431975
https://gpugrid.net/gpugrid/result.php?resultid=38431034
https://gpugrid.net/gpugrid/result.php?resultid=38429714
https://gpugrid.net/gpugrid/result.php?resultid=38429343

Openmm.OpenMMException: Error initializing CUDA: CUDA_ERROR_MPS_CONNECTION_FAILED (805)
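The usual first checks for CUDA_ERROR_MPS_CONNECTION_FAILED are whether the control daemon is actually running and whether the client sees the same pipe directory it was started with; a sketch, assuming the /tmp/nvidia-mps location from earlier in the thread:

```
# Is the MPS control daemon (and possibly a server) running at all?
pgrep -af nvidia-cuda-mps

# Does the pipe directory exist and match CUDA_MPS_PIPE_DIRECTORY as seen
# by the BOINC client and its science apps?
ls -l /tmp/nvidia-mps
```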
(Joined: 29 Aug 24, Posts: 71, Credit: 3,321,790,989, RAC: 1,408)
I don't believe it's MPS related. The subsequent errors on the first WU above include:
Error loading CUDA module: CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)
(Joined: 13 Dec 17, Posts: 1419, Credit: 9,119,446,190, RAC: 891)
Did you set up the mps-server environment variables beforehand? I just run a script after booting and before I start BOINC.

```
export CUDA_VISIBLE_DEVICES=0,1
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d
```

You need to use a root terminal to set the environment variables and start the mps-server daemon running in the background.

[EDIT] Maybe the 5090 is too new and not yet supported by the mps-server. None of my teammates using the mps-server have anything that new in their hosts; 2000, 3000 and 4000 series cards run it fine.
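The kind of boot script described above could look roughly like this (start-mps.sh is a hypothetical name; the values simply mirror the variables in the post):

```
#!/bin/bash
# start-mps.sh -- hypothetical helper: run as root after boot, before the BOINC client.
if [ "$(id -u)" -ne 0 ]; then
    echo "please run this script as root" >&2
    exit 1
fi
export CUDA_VISIBLE_DEVICES=0,1                  # adjust to the number of GPUs in the host
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log
nvidia-cuda-mps-control -d                       # start the MPS control daemon in the background
```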
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
> Error loading CUDA module: CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)

I can't find this message in the stderr of my 4 tasks.

EDIT: So this error showed up on other hosts. I don't think I was so unlucky that my host got 4 bad tasks in succession.
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> The MPS server doesn't seem to work:

You'd need to explain how you set up MPS. What environment variables did you use? The more information you can provide, the better.
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> Maybe the 5090 is too new and not yet supported by the mps-server. None of my teammates using the mps-server have anything that new in their hosts; 2000, 3000 and 4000 series cards run it fine.

I've rented a 5090 before. MPS works fine on it.
(Joined: 21 Feb 20, Posts: 1116, Credit: 40,839,470,595, RAC: 6,423)
> I don't believe it's MPS related.

That's specific to the hosts that have that error; they have a driver version that is too old. It's not related to the WU/task at all.
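For anyone hitting that error: the header of plain nvidia-smi output shows the installed driver version and the highest CUDA version it supports, which is the quickest way to confirm whether the driver is new enough for the app's CUDA build:

```
# The first lines of output list "Driver Version" and "CUDA Version".
# CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222) typically means the app was built
# against a newer CUDA toolkit than the installed driver supports.
nvidia-smi
```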
**Retvari Zoltan** (Joined: 20 Jan 09, Posts: 2380, Credit: 16,897,957,044, RAC: 0)
> I've rented a 5090 before. MPS works fine on it.

It's working for me now, too. I had to start the daemon as root: `sudo nvidia-cuda-mps-control -d`. After that I restarted the BOINC daemon to be safe, but I don't think that's necessary (only the actual CUDA tasks need to be started after the MPS daemon).

EDIT: the power consumption doesn't fluctuate as much as before, but it's still far from the power limit (~350W).
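If you do restart the client, on most systemd-based distributions the BOINC service is named boinc-client (treat the exact service name as an assumption for your distribution), so the order would be roughly:

```
# Start MPS first, then restart the BOINC client so new GPU tasks attach to it.
sudo nvidia-cuda-mps-control -d
sudo systemctl restart boinc-client   # service name may differ by distribution
```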