Message boards : Number crunching : RTX performance on Windows
Author | Message |
---|---|
First of all, it's awesome that there's plenty of work again. I am happily crunching again on my favourite GPU project. | |
ID: 53713 | Rating: 0 | rate: / Reply Quote | |
I run all Linux hosts and the most powerful card I have is an RTX 2080. | |
ID: 53716 | Rating: 0 | rate: / Reply Quote | |
I know Linux is generally faster, however I don't remember the difference being that stark and core load was always at least around 90% (admittedly my experience is naturally limited to the old app). Also, as I said, the Pascal card seems more or less the same on Windows. | |
ID: 53727 | Rating: 0 | rate: / Reply Quote | |
"a) Interrupted WUs cannot be continued, I get an instant error. I guess this is nothing new, though."
It happens if you have different GPUs in the same system and the task restarts on a different GPU than the one it was running on before. To avoid this, suspend the queued GPUGrid tasks first, then the running GPUGrid tasks one by one, and note which GPU becomes unused before you suspend the next running task. After the restart, first resume the task that was running on device 0, then the task that was running on device 1, and so on, then the unprocessed tasks.
"b) My RTX 2080 can't seem to get maxed out, I'm lucky to make it above 80% (whereas a 1660 Ti and a 1060 3GB in the same system don't have that problem, although the other Turing also usually stays below 90%)."
It's usually the result of an overcommitted CPU. Depending on the other (CPU+GPU) tasks running, it is advised to reduce the number of simultaneous CPU tasks to the number of CPU cores (50% of CPUs in BOINC manager / Computing settings), or even less. High core count CPUs (AMD Threadripper, Intel i9-9900, AMD Ryzen 9 3950x) can use up their memory bandwidth when many CPU tasks are running simultaneously, even with 4 channel memory. This results in increased runtime of the CPU app and reduced performance of the GPU app. Reduced PCIe bandwidth can also be the cause of reduced GPUGrid performance. CPUs with 20 PCIe lanes can't provide PCIe 3.0 x16 for all GPUs in a multi-GPU setup. The other factor is the atom count of the given simulation (we don't have any info on that with the new GPUGrid app), but I think the present batch has a low atom count, and that can cause lower GPU utilization on high-end GPUs. The third factor is the host OS: Linux tends to be faster (however, the previous batch was just as fast on Windows). I see 89% GPU usage (RTX 2080 Ti) under Windows 10, and 97-98% GPU usage (RTX 2080 Ti) under Linux.
"For the record, SWAN_SYNC is enabled and a full thread reserved for each task."
SWAN_SYNC is always on for acemd3; this environment variable is ignored. | |
ID: 53730 | Rating: 0 | rate: / Reply Quote | |
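The suspend/resume procedure described above boils down to an ordering rule: resume the tasks that were running in ascending device order, then the ones that were only queued. Here is a minimal Python sketch of that rule. The task names and the `resume_order` helper are hypothetical, made up for illustration; the device numbers are the ones you noted by hand while suspending, and nothing here talks to BOINC itself.

```python
from typing import Dict, List


def resume_order(running: Dict[str, int], queued: List[str]) -> List[str]:
    """Return task names in the order they should be resumed.

    running: task name -> GPU device index it occupied before the restart
             (noted manually while suspending the tasks one by one).
    queued:  tasks that had not started yet; they go last, in their given order.
    """
    # Sort the previously running tasks by device index: device 0 first.
    by_device = sorted(running.items(), key=lambda item: item[1])
    return [name for name, _device in by_device] + list(queued)
```

For example, `resume_order({"task_b": 1, "task_a": 0}, ["task_c"])` gives `["task_a", "task_b", "task_c"]`, so the task from device 0 is resumed first and should land on the same GPU it ran on before.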
"It happens if you have different GPUs in the same system, and the task restarts on a different GPU than it was running before."
Ah right, gotcha. I remember now that resuming on a different GPU was the problem.
"It's usually the result of an overcommitted CPU. Depending on the other (CPU+GPU) tasks running it is advised to reduce the number of simultaneous CPU tasks to the number of CPU cores (50% of CPUs in BOINC manager / Computing settings), or even less. High core count CPUs (AMD Threadripper, Intel i9-9900, AMD Ryzen 9 3950x) can use up their memory bandwidth when many CPU tasks are running simultaneously, even with 4 channel memory. This results in increased runtime of the CPU app, and reduced performance of the GPU app."
I set cores used to 50% and the RTX load went up by maybe 1-2%. Setting it even lower adds maybe another percent, adding up to a whoopin' 82% for this particular WU. I'm running Rosetta on the CPU right now, which I guess is quite memory intensive, but I was doing more forgiving stuff previously too and GPU usage was the same.
"Reduced PCIe bandwidth can also be the cause of reduced GPUGrid performance. CPUs with 20 PCIe lanes can't provide PCIe 3.0 x16 for all GPUs in a multi-GPU setup."
I have PCI-E 2.0, three cards. Bus load is below 10% for the RTX anyway. The 1950X has 64 lanes, apparently?
"The other factor is the atom count of the given simulation - we don't have any info on that with the new GPUGrid app - but I think the present batch has low atom count, and it can cause lower GPU utilization on high-end GPUs."
I think the best I got so far was maybe 85%. | |
ID: 53731 | Rating: 0 | rate: / Reply Quote | |
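The lane argument above is simple arithmetic, so here is a rough Python sketch of it. It assumes every GPU wants a full x16 link and that all CPU lanes are available to the GPUs, which real boards don't guarantee (slot wiring, chipset and NVMe devices also take lanes). The per-lane figures come from the PCIe 2.0 and 3.0 signalling rates and encodings (5 GT/s with 8b/10b, 8 GT/s with 128b/130b); the function names are made up for illustration.

```python
# Raw transfer rate in GT/s and encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),      # PCIe 2.0: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),   # PCIe 3.0: 8 GT/s, 128b/130b encoding
}


def all_gpus_get_x16(cpu_lanes: int, gpu_count: int) -> bool:
    """True if the CPU has enough lanes to give every GPU its own x16 link."""
    return gpu_count * 16 <= cpu_lanes


def link_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a link in GB/s."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return rate_gt_s * efficiency / 8 * lanes
```

With these numbers, a 20-lane CPU can't feed two GPUs at x16 (`all_gpus_get_x16(20, 2)` is false), while 64 lanes are enough on paper for three. A PCIe 2.0 x16 link tops out at 8 GB/s in theory, about half of a 3.0 x16 link.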
Hm okay then, now I got a WU that's running around 90% on the RTX. | |
ID: 53743 | Rating: 0 | rate: / Reply Quote | |
I have two nearly identical systems: same CPU, almost identical motherboard, same GPU. The main hardware difference is RAM. One runs Windows 8 and the other Linux, and the Linux box has the lesser quality RAM. GPU load under Linux is ~90% or higher; under Windows it's ~50-60%, depending on the job. I can get it marginally higher if I completely stop CPU work, but that's not a reasonable trade-off in my opinion. The Windows app just seems to be much less efficient, whatever the reason. This has been the case since the ACEMD3 release. | |
ID: 53744 | Rating: 0 | rate: / Reply Quote | |
"GPU load under Linux is ~90% or higher, under Windows it's ~50-60% depending."
This is way too low. How did you get this readout? In Windows Task Manager the main display for GPU usage doesn't show the right value. You should change one of the sub-displays to "CUDA" to get the right one. | |
ID: 53749 | Rating: 0 | rate: / Reply Quote | |
"GPU load under Linux is ~90% or higher, under Windows it's ~50-60% depending."
"This is way too low. How did you get this readout?"
System Information Viewer and Nvidia Inspector. I've verified these values with nvidia-smi too. There is quite a bit of variance in these jobs; since my previous post I've seen a job or two using ~80% of the card, but that's not usual. I've tried running two at a time but it didn't really improve run times. | |
ID: 53753 | Rating: 0 | rate: / Reply Quote | |
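For anyone who wants to cross-check the readout with nvidia-smi as mentioned above, here is a small Python sketch that asks it for per-GPU utilization and parses the CSV output. It assumes `nvidia-smi` is on the PATH and that GPU names contain no commas; the helper names are made up for illustration.

```python
import subprocess
from typing import List, Optional, Tuple

# Query index, name and GPU utilization as plain CSV without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text: str) -> List[Tuple[int, str, Optional[int]]]:
    """Parse lines like '0, GeForce RTX 2080, 82' into (index, name, percent)."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, rest = line.split(",", 1)
        name, util = rest.rsplit(",", 1)
        try:
            percent: Optional[int] = int(util.strip())
        except ValueError:
            percent = None  # value not reported as a number for this GPU
        rows.append((int(index.strip()), name.strip(), percent))
    return rows


def read_utilization() -> List[Tuple[int, str, Optional[int]]]:
    """Run nvidia-smi once and return the parsed per-GPU utilization."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(result.stdout)
```

Calling `read_utilization()` a few times during a WU and averaging gives a fairer number than a single glance, since the load varies quite a bit between and within jobs.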
"I have PCI-E 2.0"
Err, never mind me, it's 3.0, durr | |
ID: 53755 | Rating: 0 | rate: / Reply Quote | |
"a) Interrupted WUs cannot be continued, I get an instant error. I guess this is nothing new, though."
Thank you very much, Retvari Zoltan. Your ingenious method works fine for me on this system with three mixed graphics cards. I keep it in mind every time I reboot the system for any reason. It is a known problem with wrapper-based ACEMD3 tasks, already announced by Toni in a previous post. Can I use it on multi-GPU systems? | |
ID: 54031 | Rating: 0 | rate: / Reply Quote | |