GTX 970, 2 WU (Long), 2 CPU.
| Author | Message |
|---|---|
Francois Normandin Joined: 8 Mar 11 Posts: 71 Credit: 654,432,613 RAC: 0
|
I average 19 hours each for 2 WUs (long ones) on my GTX 970 with a dedicated CPU core for each. Does anyone know if this seems like OK performance? WU: GERARD_FXCXCL12_LIG_1035426. Also running Rosetta@home on 6 cores. GPU load 84%. Overclocked to 1440 MHz, stable, with +150 MHz on memory. Windows 7 64-bit, FX-8350 4 GHz, Asus M5A97 R2.0, 8 GB RAM. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
"I average 19 hours each for 2 WU (long one) on my GTX970 with a dedicated cpu core for each, did someone know if this seem ok performance?" 19 hours is a bit too long for this GPU; it should be around 12 hours (even on a WDDM OS like Windows 7). "Also Running Rosetta@home on 6 core." Rosetta@home has the most demanding CPU app in terms of CPU and memory usage (bandwidth and working set size), so this could be the reason the GPU tasks take this long to finish on your host. If the Rosetta@home project usually grants less credit for finished tasks than your host claims, it is a sign that the host is overcommitted. See your host vs. my laptop (lately my laptop runs only CPU tasks). As GPU tasks are more rewarding, I usually prioritize them (i.e. I reduce the number of CPU tasks running until the GPU tasks no longer suffer from a lack of bandwidth). 8 CPU cores need a lot of RAM bandwidth, so this CPU's dual-channel memory controller could be a serious bottleneck when all cores run many instances of the same demanding application (like Rosetta@home's). If you have only one RAM module in this host, I suggest adding an identical one to get dual-channel memory (as it will really double the RAM's bandwidth). |
|
Joined: 25 Mar 12 Posts: 103 Credit: 14,948,929,771 RAC: 17
|
Not sure, but I think he is saying that the 19 hours is for two WUs running at the same time. If that is the case, it would not be that bad, but I'm just guessing. |
Francois Normandin Joined: 8 Mar 11 Posts: 71 Credit: 654,432,613 RAC: 0
|
Yes, my bad. Two WUs running at the same time complete in 19-20 hours of work (about 10 hours per WU). On the CPU, 2 cores are at around 50%, 2 cores at around 90%, and the last 4 cores at 99-100% (the CPU seems to be feeding the card; CPU usage 80%). RAM is 2 x 4 GB. I will test later whether Rosetta@home is hurting anything on the GPUGRID side. |
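The arithmetic behind "10 hours per WU" can be sketched as follows. This is only an illustration: the function names are made up here, and it assumes both WUs start and finish together, so wall time divided by the number of concurrent tasks gives the effective time per task. The 12-hour figure is the single-task estimate given earlier in the thread.

```python
def effective_hours_per_task(wall_hours: float, concurrent_tasks: int) -> float:
    """Effective time per task when several tasks share the GPU for the whole run."""
    return wall_hours / concurrent_tasks


def tasks_per_day(wall_hours: float, concurrent_tasks: int) -> float:
    """Throughput in finished tasks per 24 hours."""
    return 24.0 * concurrent_tasks / wall_hours


if __name__ == "__main__":
    # Two WUs at a time, 19-20 hours of wall time (figures from the post above).
    for wall in (19.0, 20.0):
        print(wall, effective_hours_per_task(wall, 2), tasks_per_day(wall, 2))
    # One WU at a time at the ~12 hours estimated earlier in the thread.
    print(12.0, effective_hours_per_task(12.0, 1), tasks_per_day(12.0, 1))
```

Comparing tasks per day rather than raw runtimes avoids mixing up a 19-hour run of two tasks with a 19-hour run of one.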
|
Joined: 25 Mar 12 Posts: 103 Credit: 14,948,929,771 RAC: 17
|
It would be nice if some people with the right cards could do this same exercise with the GTX 980 and GTX 980 Ti, i.e. two simultaneous GERARD WUs on a single card. The results would be interesting for reevaluating the cost/efficiency relationship among these three card models. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
Then it's ok. :) |
Francois Normandin Joined: 8 Mar 11 Posts: 71 Credit: 654,432,613 RAC: 0
|
Thanks Trotador. If I remember correctly, I went from 71% GPU load to 84%, so maybe a 10% gain? My only problem is that GPUGRID lets me download just two WUs at a time, so when one finishes, I have to wait for the upload and the download of the next one before crunching again at full GPU load. Thanks, Retvari Zoltan. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
For reference/comparison:

970 on WinXP (one task at a time)*:
- GERARD_FXCXCL12_LIG_6644051-0-1-RND1102_1: 34,159.06 s (~9.5 h), 255,000.00 credit
- NOELIA_ETQunboundx2-0-2-RND2064_0: 14,912.33 s (~4.1 h), 75,000.00 credit

970 on Win7 (two tasks at a time)†:
- GERARD_FXCXCL12_LIG: 75,167.70 s (~20.9 h), 255,000.00 credit
- NOELIA_ETQunboundx1: 23,301.72 s (~6.5 h), 75,000.00 credit

\* Slower system (CPU/system bus), only 90% power, 85% GPU usage.

† I have these GPUs throttled to 80% power, though they sneak an extra 5%. That makes them roughly 7% slower while using 15% less energy each (based on two-tasks-at-a-time runtimes), so about 9% more efficient on top of any overall performance gain from running 2 tasks at a time.

Note that running two tasks at a time doesn't (and can't) make up for the WDDM overhead. |
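One way to put these runtimes on a common footing is credit per day. The sketch below is a rough illustration, not an official metric: the function name is made up, and it assumes each Win7 runtime is the wall time of one task while a second identical task runs alongside it for the whole run. The two hosts also differ in CPU, power cap and NOELIA sub-type, so the numbers are indicative only.

```python
SECONDS_PER_DAY = 86_400


def credit_per_day(runtime_s: float, credit: float, concurrent_tasks: int = 1) -> float:
    """Credit earned per 24 h if the GPU keeps running `concurrent_tasks` such tasks."""
    return credit * concurrent_tasks * SECONDS_PER_DAY / runtime_s


# (runtime in seconds, credit, tasks at a time), copied from the post above.
runs = {
    "WinXP GERARD x1": (34_159.06, 255_000.0, 1),
    "WinXP NOELIA x1": (14_912.33, 75_000.0, 1),
    "Win7 GERARD x2": (75_167.70, 255_000.0, 2),
    "Win7 NOELIA x2": (23_301.72, 75_000.0, 2),
}

if __name__ == "__main__":
    for label, (runtime_s, credit, n) in runs.items():
        print(f"{label}: {credit_per_day(runtime_s, credit, n):,.0f} credit/day")
```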
|
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
|
PCIe 2.0 x4 vs. PCIe 3.0 x8, GTX 970, one task at a time, Win8.1, for reference/comparison:

My 970 is pushed to its ACEMD-stable OC limit: 1519 MHz for NOELIAs and 1506 MHz for GERARDs. No CPU tasks are running, nor is SWAN enabled. Everything is the same except the PCIe link:

- 3.0 x8 NOELIA_ETQunbound: 13,258 s (135 W), 73% core
- 2.0 x4 NOELIA_ETQunbound: 16,903 s (120 W), 73% core

NOELIAs are more than 20% slower on 2.0 x4. GPU core temps rose another 4 C with 3.0 x8 compared to 2.0 x4 (30 C ambient), giving a +20 C delta between core and ambient; the average delta used to be +14 to 18 C depending on ambient. If ambient is 20-25 C, the GPU core would be 35-45 C depending on the ACEMD WU type.

- 2.0 x4 GERARD_FXCXCL: 46,500 s (140 W), 72% core
- 3.0 x8 GERARD_FXCXCL: estimated 36,500 s (160 W), 78% core

2.0 x4 is again more than 20% slower for GERARD. |
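The "more than 20% slower" figures can be checked directly from the runtimes, and the wattages allow a rough energy-per-task estimate. This is a sketch under assumptions: the helper names are invented for illustration, the wattages are taken to be average GPU power over the whole run, and the 3.0 x8 GERARD runtime is the poster's estimate rather than a measurement.

```python
def slowdown_percent(slow_s: float, fast_s: float) -> float:
    """How much longer the slow run takes, as a percentage of the fast run."""
    return (slow_s / fast_s - 1.0) * 100.0


def energy_kwh(runtime_s: float, watts: float) -> float:
    """Energy for one task: watts * seconds = joules, and 3.6e6 J = 1 kWh."""
    return runtime_s * watts / 3.6e6


if __name__ == "__main__":
    # NOELIA_ETQunbound: 3.0 x8 = 13258 s at 135 W, 2.0 x4 = 16903 s at 120 W
    print(slowdown_percent(16903, 13258))
    print(energy_kwh(13258, 135), energy_kwh(16903, 120))
    # GERARD_FXCXCL: 2.0 x4 = 46500 s at 140 W, 3.0 x8 = 36500 s (estimated) at 160 W
    print(slowdown_percent(46500, 36500))
    print(energy_kwh(36500, 160), energy_kwh(46500, 140))
```

Under these assumptions the lower power draw on the x4 link does not offset the longer NOELIA runtime, so energy per task still comes out higher there.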
©2025 Universitat Pompeu Fabra