Tesla
| Author | Message |
|---|---|
Mumak (Joined: 7 Dec 12, Posts: 92, Credit: 225,897,225, RAC: 0)
|
Just in case somebody is interested in how some exotic GPUs perform, I did a run on a Tesla K20c. Task: e4s166_e3s4f169-NOELIA_27x3-0-2-RND0924_0 (https://www.gpugrid.net/result.php?resultid=13938855). Runtime: 24,097 s; GPU load: ~95%; temperature: ~60 C; power: ~120 W. For comparison, my 750 Ti does the same task in 40,300 s. Sure, Teslas are better at DPFP. The only such app I tried was Milkyway, which however uses OpenCL, so it's not ideal. The performance there was comparable to a Radeon HD 7970/280X. |
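To put the two runtimes side by side, here is a minimal sketch of the arithmetic. It uses only the figures quoted in the post; the helper names `speedup` and `energy_kwh` are made up for illustration, and the ~120 W reading is treated as constant over the whole run, which is an assumption.

```python
# Figures quoted in the post above.
K20C_RUNTIME_S = 24_097
GTX750TI_RUNTIME_S = 40_300
K20C_POWER_W = 120  # approximate reading; assumed constant for the whole run


def speedup(slow_s: float, fast_s: float) -> float:
    """How many times faster the quicker card finishes the same task."""
    return slow_s / fast_s


def energy_kwh(power_w: float, runtime_s: float) -> float:
    """Energy used for one task, converted from watt-seconds to kWh."""
    return power_w * runtime_s / 3_600_000


print(f"K20c vs 750 Ti: {speedup(GTX750TI_RUNTIME_S, K20C_RUNTIME_S):.2f}x faster")
print(f"K20c energy per task: {energy_kwh(K20C_POWER_W, K20C_RUNTIME_S):.2f} kWh")
```

The 750 Ti's power draw is not given in the thread, so no energy figure is computed for it.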
|
Joined: 20 Jul 14, Posts: 732, Credit: 130,089,082, RAC: 0
|
Thanks Mumak! Really interesting :) I expected better performance from a GPU of this quality... [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres |
Mumak
|
Oh, I had ECC enabled on the Tesla. Switching it off and giving it another run... |
|
Joined: 25 Sep 13, Posts: 293, Credit: 1,897,601,978, RAC: 0
|
Excellent power/performance ratio for 13 SMX (2496 CUDA cores) while computing FP32. Your GK110's 120 W operating point works out to about 0.048 W per core, or 9.23 W per 192-core SMX. 120 W is 1024-core GTX 960 territory. A 12-SMX GTX 780 can sip 145 W at 836 MHz (the reference base clock) or lower. Not counting Maxwell's GM204 GTX 970, cut-down GK110s are the best eco-tuners NVIDIA has produced. Cut-down GK104s are also able eco-tuners, as the GTX 660 Ti and GTX 760 have proven. Will a full GK110 see ~150 W at 95% core? The lowest is about 165 W or so. This is really good for an eco-tune, even though GK110s are capable of using every available ounce of power at 1.2 GHz/250 W. There are a lot of FP64-enabled GK110s running FP32 ACEMD. Maybe an FP64 ACEMD app will be created for those specific high-performance DPFP GPUs? |
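The per-core and per-SMX figures can be checked with a few lines. This is only a sketch using the numbers from the thread (120 W, 13 SMX, 192 cores per SMX); the variable names are illustrative. Note that the 120 W reading is whole-board power, memory included, so the per-core value is an upper bound rather than a measurement of the cores alone.

```python
BOARD_POWER_W = 120   # reading reported for the K20c in this thread
SMX_COUNT = 13        # active SMX units on the K20c
CORES_PER_SMX = 192   # CUDA cores per Kepler SMX

cuda_cores = SMX_COUNT * CORES_PER_SMX        # 2496 cores in total
watts_per_core = BOARD_POWER_W / cuda_cores   # board power spread over all cores
watts_per_smx = BOARD_POWER_W / SMX_COUNT     # board power spread over all SMX

print(f"{cuda_cores} cores, {watts_per_core:.3f} W/core, {watts_per_smx:.2f} W/SMX")
```

Carried to three decimals the per-core value comes out near 0.048 W, and the per-SMX value is 9.23 W.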
|
Joined: 15 Feb 07, Posts: 134, Credit: 1,349,535,983, RAC: 0
|
You can use nvidia-smi to increase the application clocks. This will extract another 10% or so. |
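For anyone who wants to script this, below is a rough sketch that only builds the `nvidia-smi` command lines without running them. The options used (`-q -d SUPPORTED_CLOCKS`, `-ac`, `-rac`) are standard `nvidia-smi` options, but the helper function names are made up here, and the 2600 MHz memory clock in the example is an assumption: query the supported clock pairs on your own card first, since `-ac` only accepts pairs from that list. The 758 MHz value is the maximum clock mentioned in this thread.

```python
def supported_clocks_cmd(gpu_index: int = 0) -> list[str]:
    """Command that lists the memory/graphics clock pairs the GPU accepts."""
    return ["nvidia-smi", "-i", str(gpu_index), "-q", "-d", "SUPPORTED_CLOCKS"]


def set_app_clocks_cmd(mem_mhz: int, graphics_mhz: int, gpu_index: int = 0) -> list[str]:
    """Command that sets application clocks as a 'memory,graphics' pair in MHz."""
    return ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{mem_mhz},{graphics_mhz}"]


def reset_app_clocks_cmd(gpu_index: int = 0) -> list[str]:
    """Command that restores the default application clocks."""
    return ["nvidia-smi", "-i", str(gpu_index), "-rac"]


# Print the commands instead of executing them.
# 2600 is an ASSUMED memory clock; replace it with a value reported by the query.
print(" ".join(supported_clocks_cmd()))
print(" ".join(set_app_clocks_cmd(2600, 758)))
print(" ".join(reset_app_clocks_cmd()))
```

To actually apply a setting, each list can be passed to `subprocess.run(cmd, check=True)`; changing clocks usually requires administrator rights.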
Mumak
|
Thanks for the hint. Current/default clock is 705 MHz, max should be 758. Will first finish the current WU to see the difference between ECC/Non-ECC, then will try some OC ;-) |
Mumak
|
Decided to try the max GPU clock (758 MHz); the WU has not finished yet. Just for comparison (running NOELIA_PO now): 705 MHz: 133 W; 758 MHz: 150 W. |
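A quick sketch of what those two operating points imply. The clock and power values are the ones reported above; the efficiency estimate assumes throughput scales linearly with core clock, which is only an assumption since the work unit had not finished when the post was written.

```python
# (clock in MHz, power in W) pairs reported in the post.
BASE_CLOCK_MHZ, BASE_POWER_W = 705, 133
MAX_CLOCK_MHZ, MAX_POWER_W = 758, 150

clock_gain = MAX_CLOCK_MHZ / BASE_CLOCK_MHZ - 1   # relative clock increase
power_gain = MAX_POWER_W / BASE_POWER_W - 1       # relative power increase

# ASSUMPTION: performance is proportional to clock, so MHz per watt
# stands in for performance per watt.
relative_efficiency = (MAX_CLOCK_MHZ / MAX_POWER_W) / (BASE_CLOCK_MHZ / BASE_POWER_W)

print(f"clock +{clock_gain:.1%}, power +{power_gain:.1%}")
print(f"perf/W at 758 MHz relative to 705 MHz: {relative_efficiency:.2f}")
```

Under that assumption the higher clock costs more power than it returns in speed, so performance per watt drops slightly.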
Mumak
|
|
|
Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0 |
"I expected better performance from a GPU of this quality..." There's no reason to do so. It's just the same GK110 chip as in the GTX 780/Ti and Titan/Z. It gains energy efficiency by being run at very low clock speeds and voltages. To some extent this could easily be done on other cards as well, although most people will prefer a high-performance state anyway. A comparison to a stock GTX 960 sounds impressive, but that GPU is driven quite hard, up to maximum voltages around 1.20 V, and has a lot of room to run more efficiently for a minor performance loss (down to 1.10 - 1.00 V). "Maybe a FP64 ACEMD app will be created for those specific high performance DPFP GPU's?" Why? Even the super-expensive Titan loses 2/3 of its maximum throughput in DP mode. If the app can get by with 32 bit, it's always better to use only 32 bit. That's why "mixed precision" with 16-bit FP enhancements will become a topic for NVIDIA with Pascal. A valid reason would be to use new physical models which might not be possible in FP32. But I don't think precision is the limit; it's probably more often the flow control which makes these tasks better suited to CPUs. MrS Scanning for our furry friends since Jan 2002 |