Message boards :
Graphics cards (GPUs) :
GPU Task Performance (vs. CPU core usage, app_config, multiple GPU tasks on 1 GPU, etc.)
| Author | Message |
|---|---|
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Hello everyone, I'm creating this thread to document my GPUGrid GPU task performance variances, while testing things such as:

- GPU task with no other tasks
- GPU task with full CPU load
- GPU task with overloaded CPU load
- Multiple GPU tasks on 1 video card

My system (as of right now) is:

CPU: Intel Core i7 965 Extreme (quad-core, hyper-threaded; Windows sees 8 processors)
Memory: 6GB
GPU device 0: eVGA GeForce GTX 660 Ti 3GB FTW (primary display)
GPU device 1: eVGA GeForce GTX 460 (not connected to any display)
OS: Windows 8 Pro x64 with Media Center

So far, I have some interesting results to share, and would like to "get the word out". If you'd like to share your results within this thread, feel free. Regards, Jacob |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
I originally did some performance testing in another thread, but wanted the results consolidated into this "GPU Task Performance" thread. That thread is titled "app_config.xml", and is located here: http://www.gpugrid.net/forum_thread.php?id=3319 Note: The post within that thread, which contains the app_config values that I recommend using, can be found here: http://www.gpugrid.net/forum_thread.php?id=3319#29216 |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Here are the first results (from running only on my GTX 660 Ti), copied from that thread:

========================================================================
Running with no other tasks (every other BOINC task and project was suspended, so the single GPUGrid task was free to use the whole CPU core):

Task: 6669110
Name: I23R54-NATHAN_dhfr36_3-17-32-RND2572_0
URL: http://www.gpugrid.net/result.php?resultid=6669110
Run time (sec): 19,085.32
CPU time (sec): 19,043.17

========================================================================
Running at <cpu_usage>0.001</cpu_usage>, BOINC set at 100% processors, along with a full load of other GPU/CPU tasks:

Task: 6673077
Name: I11R21-NATHAN_dhfr36_3-18-32-RND5041_0
URL: http://www.gpugrid.net/result.php?resultid=6673077
Run time (sec): 19,488.65
CPU time (sec): 19,300.91

Task: 6674205
Name: I25R97-NATHAN_dhfr36_3-13-32-RND4438_0
URL: http://www.gpugrid.net/result.php?resultid=6674205
Run time (sec): 19,542.35
CPU time (sec): 19,419.97

Task: 6675877
Name: I25R12-NATHAN_dhfr36_3-19-32-RND6426_0
URL: http://www.gpugrid.net/result.php?resultid=6675877
Run time (sec): 19,798.77
CPU time (sec): 19,606.33

========================================================================
CONCLUSION: As expected, there is some minor CPU contention under full load, but not much (task run time is maybe ~3% slower). It isn't affected much because the ACEMD process actually runs at a higher priority than other BOINC task processes; it is therefore never starved for CPU, and likely loses only a little time to contention during CPU process context switching. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Here are some more results, where I focused on the "short" Nathan units:

========================================================================
Running with no other tasks (every other BOINC task and project was suspended, so the single GPUGrid task was free to use the whole CPU core):

Task: 6678769
Name: I1R110-NATHAN_RPS1_respawn3-10-32-RND4196_2
URL: http://www.gpugrid.net/result.php?resultid=6678769
Run time (sec): 8,735.43
CPU time (sec): 8,710.61

Task: 6678818
Name: I1R42-NATHAN_RPS1_respawn3-12-32-RND1164_1
URL: http://www.gpugrid.net/result.php?resultid=6678818
Run time (sec): 8,714.75
CPU time (sec): 8,695.18

========================================================================
Running at <cpu_usage>0.001</cpu_usage>, BOINC set at 100% processors, along with a full load of other GPU/CPU tasks:

Task: 6678817
Name: I1R436-NATHAN_RPS1_respawn3-13-32-RND2640_1
URL: http://www.gpugrid.net/result.php?resultid=6678817
Run time (sec): 8,949.63
CPU time (sec): 8,897.27

Task: 6679874
Name: I1R414-NATHAN_RPS1_respawn3-7-32-RND6785_1
URL: http://www.gpugrid.net/result.php?resultid=6679874
Run time (sec): 8,828.17
CPU time (sec): 8,786.48

Task: 6679828
Name: I1R152-NATHAN_RPS1_respawn3-5-32-RND8187_0
URL: http://www.gpugrid.net/result.php?resultid=6679828
Run time (sec): 8,891.22
CPU time (sec): 8,827.11

========================================================================
CONCLUSION: Again, as expected, there is only slight contention under full CPU load, because the ACEMD process actually runs at a higher priority than other BOINC task processes; it is therefore never starved for CPU, and likely loses only a little time to contention during CPU process context switching. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
So, previously I was only running 1 GPU task on that GPU (and the GPU load would usually be around 87%-88%). But I wanted to find out what would happen when I run 2. So, the following tests use <gpu_usage>0.5</gpu_usage> in my app_config.xml. Note: The GPU load goes to ~97% when I do this, and I believe that's a good thing!

========================================================================
Long-run Nathan tasks...
Running at <cpu_usage>0.001</cpu_usage>, <gpu_usage>0.5</gpu_usage>, BOINC set at 100% processors, along with a full load of other GPU/CPU tasks:

Name: I19R1-NATHAN_dhfr36_3-22-32-RND2354_0
URL: http://www.gpugrid.net/result.php?resultid=6684711
Run time (sec): 35,121.51
CPU time (sec): 34,953.33

Name: I6R6-NATHAN_dhfr36_3-18-32-RND0876_0
URL: http://www.gpugrid.net/result.php?resultid=6685136
Run time (sec): 39,932.98
CPU time (sec): 39,549.67

Name: I22R42-NATHAN_dhfr36_3-15-32-RND5482_0
URL: http://www.gpugrid.net/result.php?resultid=6685907
Run time (sec): 35,077.12
CPU time (sec): 34,889.61

Name: I31R89-NATHAN_dhfr36_3-21-32-RND1236_0
URL: http://www.gpugrid.net/result.php?resultid=6687190
Run time (sec): 35,070.94
CPU time (sec): 34,901.26

Name: I8R42-NATHAN_dhfr36_3-22-32-RND2877_1
URL: http://www.gpugrid.net/result.php?resultid=6688517
Run time (sec): 32,339.90
CPU time (sec): 32,082.15

========================================================================
Short-run Nathan tasks...
Running at <cpu_usage>0.001</cpu_usage>, <gpu_usage>0.5</gpu_usage>, BOINC set at 100% processors, along with a full load of other GPU/CPU tasks:

Name: I1R318-NATHAN_RPS1_respawn3-11-32-RND9241_0
URL: http://www.gpugrid.net/result.php?resultid=6684931
Run time (sec): 12,032.03
CPU time (sec): 11,959.47

Name: I1R303-NATHAN_RPS1_respawn3-14-32-RND0610_0
URL: http://www.gpugrid.net/result.php?resultid=6690144
Run time (sec): 14,621.04
CPU time (sec): 10,697.88

========================================================================
CONCLUSIONS:

Long-run Nathan units:
1-at-a-time + full CPU load: ~19,600 sec run time per task
2-at-a-time + full CPU load: ~35,100 sec run time per task
Speedup: 1 - (35,100 / (19,600 * 2)) = 10.5% improvement

Short-run Nathan units:
1-at-a-time + full CPU load: ~8,900 sec run time per task
2-at-a-time + full CPU load: ~13,300 sec run time per task
Speedup: 1 - (13,300 / (8,900 * 2)) = 25.3% improvement

So far, it looks like running multiple tasks at a time... GETS WORK DONE QUICKER! Now, admittedly, I am estimating from very few results here, but I'll continue using this "2-at-a-time" approach, and will reply here if I find anything different. |
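The speedup arithmetic in the conclusions above can be sketched as a small calculation. (This is an illustrative Python helper of my own, not part of any BOINC tooling; the run times are the approximate figures quoted in the post.)

```python
def multi_task_speedup(t_single, t_multi, n=2):
    """Fractional throughput improvement from running n tasks
    concurrently on one GPU.

    t_single: average run time per task running 1-at-a-time (sec)
    t_multi:  average run time per task running n-at-a-time (sec)

    With n tasks finishing together, the effective time per task is
    t_multi / n; the speedup is the fraction by which that beats the
    1-at-a-time run time.
    """
    return 1.0 - t_multi / (n * t_single)

# Long-run Nathan units: ~19,600 s alone vs ~35,100 s 2-at-a-time
long_gain = multi_task_speedup(19_600, 35_100)

# Short-run Nathan units: ~8,900 s alone vs ~13,300 s 2-at-a-time
short_gain = multi_task_speedup(8_900, 13_300)

print(f"long: {long_gain:.1%}, short: {short_gain:.1%}")
# -> long: 10.5%, short: 25.3%
```

A negative result from this helper would mean 2-at-a-time is actually hurting throughput, which is a quick sanity check to apply to any new task-type combination.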
|
Joined: 16 Jul 07 Posts: 209 Credit: 5,496,860,456 RAC: 8,998
|
This is very good info. However, I need to point out a couple of potential downside issues: 1) Even with 2 tasks per GPU via app_config.xml, it does not increase the number of tasks you can download. For example, my 4-GPU machine normally has 4 tasks running and 4 waiting to run. Running 8 at once means all 8 are running, so now there is a delay between the time a task completes, uploads, reports, a new task is downloaded (a big file), and starts running. That *may* wipe out any utilization advantage. 2) The longer run time with 2 tasks per GPU *may* cause them to miss the credit bonus for early returns. YMMV Reno, NV Team: SETI.USA |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Point 1: ideally this would average out after some time, so that the different WUs per GPU finish at different times. Depending on your upload speed this might provide enough overlap to avoid running dry. Having more GPUs & WUs in flight should help with this issue. Point 2: correct! MrS Scanning for our furry friends since Jan 2002 |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
Point 1: ideally this would average out after some time, so that the different WUs per GPU finish at different times. Depending on your upload speed this might provide enough overlap to avoid running dry. Having more GPUs & WUs in flight should help with this issue. To clarify, a simple example: a machine with 1 GPU would get 2 WUs and if these are not in sync, then while uploading/downloading 1 WU the other WU would run at 2x the speed. A real workaround would be to run the 2x WUs on a box with 1 NV and 1 ATI running on a different project, then 4 WUs would be allocated for the machine. As an aside I think running GPUGrid WUs 2x is a bad idea due to longer turn around time and possible errors. A machine reboot or GPU error (or as Jacob pointed out on the BOINC list, a BOINC restart) would be more likely to take out 2 of these long WUs instead of 1. |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
After some setup difficulties, I now have two long run tasks running - one on each of my GTX 650 Ti GPUs. GPUGrid runs 24/7 on this AMD A10 based PC and there are always two tasks running with either one or two waiting to run. As each GTX 650 processes at a slightly different rate the number of tasks waiting to run varies. I believe this will maximize output from my PC enabling me to make the maximum contribution to the research. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
John, my research indicates that you might be able to contribute more to the project if you run 2 tasks on each of your GPUs, assuming the tasks don't result in computation errors. You might try that, using the app_config.xml file, and see if your overall performance increases. I was able to see gains in GPU load (seen via a program called GPU-Z), as well as increased throughput (seen by looking at task times, as noted within this thread). Regards, Jacob |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
Hi, Jacob. I am very inexperienced in writing .xml files and fear losing running tasks through syntax errors. I would like to take it one step at a time for now and, maybe in a couple of weeks, try your suggestion. I will likely ask for help..... Thanks for the suggestion. John |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
No problem. It's really not that hard, so don't be afraid, and... when you're ready, I encourage you to read this entire thread, which has details and examples: "app_config.xml" located here: http://www.gpugrid.net/forum_thread.php?id=3319 - Jacob |
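For anyone who shares John's worry about syntax errors: one low-risk habit is to run the file through a standard XML parser before restarting BOINC, so a typo is caught before the client rejects the file. Below is an illustrative Python sketch of my own (not a BOINC tool); the checks it performs beyond raw XML parsing are just the obvious ones for an app_config.xml.

```python
import xml.etree.ElementTree as ET

def check_app_config(xml_text):
    """Parse app_config.xml content and report obvious problems.
    Returns a list of warning strings (an empty list means it looks OK)."""
    problems = []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as e:
        # Malformed XML -- the most common hand-editing mistake.
        return ["XML syntax error: %s" % e]
    if root.tag != "app_config":
        problems.append("root element is <%s>, expected <app_config>" % root.tag)
    for app in root.findall("app"):
        if app.find("name") is None:
            problems.append("an <app> block is missing its <name> element")
    return problems

# A minimal (valid) example:
good = "<app_config><app><name>acemdlong</name></app></app_config>"
# A typo'd version -- mismatched closing tag:
bad = "<app_config><app><name>acemdlong</app></app_config>"

print(check_app_config(good))  # []
print(check_app_config(bad))   # ['XML syntax error: ...']
```

To use it on a real file, read the file's contents into a string first (BOINC reads app_config.xml from the project's directory under the BOINC data folder) and print whatever the checker returns.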
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Careful, guys. The GTX 650 Ti (John's GPUs) sounds like it's almost the same as a GTX 660 Ti (Jacob's GPUs), but it's actually about a factor of 2 slower. Currently, 70k-credit long-runs take John 33,000 seconds; running 2 of them might require ~60,000 seconds. That's almost one day, so we're getting close to missing the deadline for the credit bonus here, and some even longer tasks give 150k credits, so those should take over twice as long. And this is not only about credits: the credit bonus is there to encourage people to return results early. The project needs this as much as it needs many WUs done in parallel. As long as we're still making the deadline for the credit bonus, we can be sure we're returning results as quickly as the project wants them. MrS Scanning for our furry friends since Jan 2002 |
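MrS's arithmetic can be checked directly with a rough estimate. This is an illustrative Python helper of my own; the ~10% overlap gain and the per-task run times come from the measurements quoted earlier in the thread, and the 24-hour window is the early-return bonus threshold being discussed.

```python
def fits_bonus_window(single_run_s, n=2, overlap_gain=0.10,
                      window_s=24 * 3600):
    """Rough check: if n tasks share one GPU, does each still finish
    inside the early-return bonus window?

    n tasks finishing together take roughly n * single_run_s, reduced
    by the measured overlap gain (~10% for long-runs in this thread).
    Returns (estimated seconds per task, fits_in_window).
    """
    est = n * single_run_s * (1.0 - overlap_gain)
    return est, est <= window_s

# Jacob's GTX 660 Ti long-runs: ~19,600 s each
print(fits_bonus_window(19_600))   # ~35,280 s -> fits in 24 h easily
# John's GTX 650 Ti long-runs: ~33,000 s each
print(fits_bonus_window(33_000))   # ~59,400 s -> still fits, less margin
# A task twice as long (like the 150k-credit units mentioned above)
print(fits_bonus_window(66_000))   # ~118,800 s -> misses the 24 h window
```

So on slower cards, 2-at-a-time can be fine for ordinary long-runs yet push the longest units past the bonus cutoff, which matches MrS's caution.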
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Sure, in order to get maximum bonus credits, you'll have to be careful to complete all your tasks within 24 hours. And, in general, they want results returned quickly. But, in order to help the project the most, throughput (how quickly you can complete tasks) is the factor to measure, and the real "deadline" is the task's deadline, which is usually a few days, I think. If the administrators deem that a task must be done by a certain time, then I hope they are setting task deadlines appropriately. |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
Thanks, Gentlemen: I will leave this alone for now..... With falling prices for the GTX 660 Ti, I may add one to my other AMD A10 based PC in September around my birthday. John |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
You still have plenty of testing to do; all the possible same and mixed WU combinations would need to be looked at:
NATHAN_dhfr36 (Long) + NOELIA_TRYP (Short) NATHAN_dhfr36 (Long) + NATHAN_stpwt1 (Short) NOELIA_148n (Long) + NOELIA_TRYP (Short) NOELIA_148n (Long) + NATHAN_stpwt1 (Short) NOELIA_TRYP (Short) + NATHAN_stpwt1 (Short)
FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Yes, I still have testing to do. You can/should test too! It's not easy to cherry-pick certain task-type combinations -- I usually just let any task types run together. Maybe once I find even more time to test, I'll attempt the specific-combination testing, using custom suspending and more vigilant monitoring. As far as "freeing up a core" goes, my research indicates that, at this point, doing so is COMPLETELY UNNECESSARY, at least for me. If you look at the acemd processes in Process Explorer, you'll see that the process priority is 6, and the CPU-intensive thread's priority is either 6 or 7. This ensures that the thread and process do not get swapped out of the processor, even when I'm running a full load of other CPU tasks, since those CPU tasks are usually priority 1 or 4. Watching how the CPU time gets divvied up (in Process Explorer, or in Task Manager) also proves it -- you'll see the other processes getting less than a core, but you won't see the acemd process suffer much. Plus, as you said, sometimes the GPUGrid tasks don't require much CPU at all (like when a NATHAN long-run is on my GTX 460), so reserving a core is sheer waste at that point, at least for my goals. So I won't do it. I'm not trying to speculate here, and I'm certainly not trying to find reasons not to run multiple tasks on the same GPU. I think it's worth it. What I'm trying to do is show the results that I have achieved, given my goals (maximize throughput for GPUGrid, without sacrificing any throughput for my other projects), and I encourage others to do the same. Thanks, Jacob |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I don't have much time to test, but OK, I'll do a little bit... System: GTX 660 Ti @ 1202 MHz, i7-3770K CPU @ 4.2 GHz, 8GB DDR3-2133, SATA III drive, W7 x64, 310.90 drivers, BOINC 7.0.60. I've started using your suggested app_config.xml file:

<app_config>
  <app>
    <name>acemdbeta</name>
    <max_concurrent>9999</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.001</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdlong</name>
    <max_concurrent>9999</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.001</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemd2</name>
    <max_concurrent>9999</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.001</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <max_concurrent>9999</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.001</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
|
FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Sounds good, thanks for testing. Note: When running 2-at-a-time, I expect tasks to take slightly less than double their normal time, which would mean they are being processed faster overall. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Ah, you bring up a good point; I forgot to mention my clocking experiences with my Kepler-architecture eVGA GTX 660 Ti 3GB FTW card...

- Its base clock is 1045 MHz, which I think is the lowest clock it runs at while a 3D application or GPU task is active.
- When GPU load is not great (~60-75%), it usually upclocks a little (maybe up to 1160 MHz), but because it sees the application as "not demanding a lot", it doesn't try hard to upclock.
- When GPU load is decent-ish (86%), it auto-upclocks a bit (usually to around 1215 MHz or 1228 MHz, I think), with power consumption around 96-98% TDP.
- When GPU load is better saturated (97%-99%), it usually tries to upclock higher, but reaches a thermal limit. It usually ends up clocked at around 1180-1215 MHz, with a temperature of 84°C-89°C, at a power consumption around 96%.
- TIP: At that saturation, if you want, you can usually allow it to auto-upclock just a tad more by using whatever overclocking tool you have (I have eVGA Precision X) and adjusting the "Power Target". By default, I think the driver sets a Power Target of 100%; what I usually do is adjust it to 140%. This lets it auto-clock higher, until it really starts hitting those thermal limits.

My end result: My card usually runs at 1215 MHz, 86°C-90°C, with power consumption around 106% TDP. So, running at higher GPU load keeps it clocked high, as high as the thermal limits allow... which is a good thing, if you care more about GPUGrid throughput than the lifespan of your GPU. :) Regards, Jacob |
©2025 Universitat Pompeu Fabra