Message boards : Graphics cards (GPUs) : New nvidia beta application

GDF (Joined: 14 Mar 07, Posts: 1958, Credit: 629,356, RAC: 0)

Dear all,

We have developed a new Nvidia application which so far performs 60% faster than the current one. It is stable, so we will release it in the next few days (next week). Today we also compiled it for Windows. I expect it to become the standard application quite quickly, but for now please accept beta work from us.

GDF

Stoneageman (Joined: 25 May 09, Posts: 224, Credit: 34,057,374,498, RAC: 0)

Excellent news. That should blow the cobwebs off the server :) What effect will this have on card temperatures?

K1atOdessa (Joined: 25 Feb 08, Posts: 249, Credit: 444,646,963, RAC: 0)

Sounds good. Is this 60% improvement estimate an average across numerous card types, or will only a subset of users see it? Just for those interested, can you give some details on what the changes are -- how you got this improvement, whether you took advantage of CUDA differently, etc.?

[AF>Libristes>Jip] Elgrande71 (Joined: 16 Jul 08, Posts: 45, Credit: 78,618,001, RAC: 0)

I hope it will be a fantastic step towards reducing CPU usage. I am ready to test it on my Linux computers.

Joined: 24 Dec 08, Posts: 738, Credit: 200,909,904, RAC: 0

Does this beta build address the issues with the CUDA/FFT bugs? That is, should those of us with the 65nm GTX 260 try them?

BOINC blog

JockMacMad TSBT (Joined: 26 Jan 09, Posts: 31, Credit: 3,877,912, RAC: 0)

skgiven (Joined: 23 Apr 09, Posts: 3968, Credit: 1,995,359,260, RAC: 0)

From what was said in other threads, it will be about 60% faster for Compute Capability 1.3 cards and slightly less for CC1.1 (but still very significant). This should help the lesser cards return more work units on time, and therefore in itself reduce the likelihood of failures over time, hopefully rendering some cards useful once again. Note that some other recent changes also seem to improve the reliability of the CC1.1 cards. Only long-term testing will determine that for sure, and how well the new application works for the GTX 260 sp192 cards.

It was successfully compiled for Linux before Windows. I suggest you enable beta work units; the application should download automatically through BOINC.

I would expect some rise in temperatures for many cards, so keep an eye on that.

Although the CUDA bugs are in CUDA itself, and this is not a rewrite of CUDA but of the GPUGrid application, I think the bugs crept in due to using far too small timeout settings (but I don't know this for sure), so perhaps there is scope for a workaround. Hopefully one of the gurus will reply here come Monday. If you think I am wrong about anything, shout up!

Michael Goetz (Joined: 2 Mar 09, Posts: 124, Credit: 124,873,744, RAC: 0)

> From what was said in other threads...

What was said was that the new app is about 60% faster when compiled for 1.3 (double precision) and somewhat less fast when compiled for 1.1 (single precision). What was not said was which version or versions would actually be released.

> I would expect some rise in temperatures for many cards, so keep an eye on that.

There's been nothing from the project saying this. If you have more information, please share it.

GPUGRID currently runs at 77% utilization on my GTX 280. Of the projects I currently run, Milkyway@Home has the highest utilization, approximately 90%. The temperature difference between GPUGRID and Milkyway is only 2 or 3 degrees, and that is with the GPU running some 25 degrees Centigrade below its maximum temperature and the fan well below full speed. Even at 100% utilization, it's unlikely the running temperature would change significantly.

On older architectures (which are less massively parallel than the G200/G200b) the application is more likely to already be running close to 100% utilization than it is on the G200s; it's harder to keep a large number of parallel processors busy than a small number. So, most likely, the 77% utilization figure I'm seeing on the G200 is close to the lowest number you would see on any GPU. On older cards, there should be less room for improvement simply by increasing GPU utilization.

The bottom line is that on the G200-based cards, an increase in utilization probably won't raise the temperature a lot, and they have a lot of headroom. On older cards, you're not likely to raise the utilization (since they're probably closer to 100% to start with), so there's unlikely to be any rise in temperature.

And all that, of course, rests on one huge assumption: that the performance increase comes from improving the efficiency of the parallelization so as to raise GPU utilization. That's not necessarily true. The new application may use ordinary optimization techniques to perform fewer (or faster) calculations for the same result, which would not increase the operating temperature.

In any event, we'll see soon enough. We can replace the speculation and assumptions with real observations as soon as the beta apps are released.

Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.
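
For anyone who wants to log those utilization and temperature numbers themselves rather than reading them off GPU-Z, here is a minimal, hedged sketch using NVIDIA's NVML library. It has nothing to do with the GPUGRID application itself; it assumes the NVML header and library are installed, and the file name and build line are made up:

```cuda
/* gpumon.cu -- hypothetical stand-alone monitor, not GPUGRID code.
 * Build (assumption): nvcc gpumon.cu -o gpumon -lnvidia-ml           */
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    nvmlDevice_t dev;
    nvmlUtilization_t util;     /* .gpu and .memory, both in percent  */
    unsigned int temp;

    if (nvmlInit() != NVML_SUCCESS) {
        fprintf(stderr, "failed to initialise NVML\n");
        return 1;
    }

    nvmlDeviceGetHandleByIndex(0, &dev);           /* first GPU        */
    nvmlDeviceGetUtilizationRates(dev, &util);     /* core + mem load  */
    nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &temp);

    printf("GPU load %u%%, memory controller %u%%, temperature %u C\n",
           util.gpu, util.memory, temp);

    nvmlShutdown();
    return 0;
}
```

Logging these figures while beta tasks run would show directly whether the new app changes utilization or temperature.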

Joined: 17 Aug 08, Posts: 2705, Credit: 1,311,122,549, RAC: 0

Michael, you're not seeing much higher temperatures at MW because it uses double precision, and GT200 has only 1 DP unit for every 8 normal SP shader units. If GPUGrid went from 77% to 90% utilization, the temperatures would surely increase quite a bit.

Edit: that's also why MW is so slow on current nVidia hardware. Only a tiny fraction of the chip can be used at all, whereas ATI can use 40% of their regular shader power.

MrS
Scanning for our furry friends since Jan 2002
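
To make the single- versus double-precision point concrete, here is a hedged toy example (not code from MW or GPUGRID; the kernel and file names are invented). The two kernels are nearly identical in source, but on G200 the double version is confined to the single DP unit per multiprocessor, while the float version can use all eight SP shader units:

```cuda
// axpy.cu -- illustration only; compile with something like
//   nvcc -arch=sm_13 axpy.cu -o axpy   (sm_13 needed for double support)
#include <cstdio>

__global__ void axpy_sp(float *y, const float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += a * x[i];          // runs on the SP shader units
}

__global__ void axpy_dp(double *y, const double *x, double a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += a * x[i];          // serialises onto the lone DP unit
}

int main()
{
    const int n = 1 << 20;
    float  *xs, *ys;
    double *xd, *yd;
    cudaMalloc(&xs, n * sizeof(float));   cudaMalloc(&ys, n * sizeof(float));
    cudaMalloc(&xd, n * sizeof(double));  cudaMalloc(&yd, n * sizeof(double));

    axpy_sp<<<(n + 255) / 256, 256>>>(ys, xs, 2.0f, n);
    axpy_dp<<<(n + 255) / 256, 256>>>(yd, xd, 2.0,  n);
    cudaDeviceSynchronize();
    printf("kernels finished: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(xs); cudaFree(ys); cudaFree(xd); cudaFree(yd);
    return 0;
}
```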

Michael Goetz (Joined: 2 Mar 09, Posts: 124, Credit: 124,873,744, RAC: 0)

Interesting. Thanks for the information.

skgiven (Joined: 23 Apr 09, Posts: 3968, Credit: 1,995,359,260, RAC: 0)

Unless the GPUGrid team start writing different applications for different cards, there will only be one release: it should increase performance by around 60% for CC1.3 cards and by slightly less for CC1.1 cards.

Running an IBUCH task on my (40nm) GT 240: GPU load averages 82%, memory controller utilisation is 42%, memory usage is 171 MB and the average GPU temperature is 43 degrees C (readings taken from GPU-Z). Compare this to my GTX 260 216sp (55nm): GPU load averages 80%, memory controller utilisation is only 12%, RAM used is 273 MB, and the average temperature is 72 degrees C.

So a 60% increase in performance cannot come simply from raising the GPU load by 18 to 20%. It more likely comes primarily from coding optimisations, which may or may not result in a GPU load increase (we will see soon enough). But if the load does rise at all there will be an increase in heat. With CPU projects, the processors can be at 100% for two different projects yet the system power consumption can differ by up to 10% depending on the project, and higher wattage means more heat.

Joined: 15 Feb 07, Posts: 134, Credit: 1,349,535,983, RAC: 0

Hi,

We've improved the application by making several algorithmic and implementation changes, with the effect that it now requires less computation to calculate the same results as before. Just to clarify GDF's statement about the performance: the new app will complete WUs in ~60% of the time of the current version.

I'd expect the overall GPU utilisation to remain around the same level as before, maybe a little higher (though I don't know exactly what GPU-Z is measuring to derive that figure). Similarly, I'd not expect any significant change to GPU operating temperatures with the new app.

The difference between 1.1 and 1.3, incidentally, is not an issue of floating-point precision (we hardly use d.p. at all) but rather one of compiler efficiency. When we compile for 1.1 -- which is necessary for producing a binary that will run on both G80 and G200 hardware -- the compiler produces slightly less efficient code than if it is targeting 1.3, presumably because there are some G200-specific optimisations it is unable to apply.

Lastly, a word about the FFT bug on the 260s: for now we are still using cuFFT but, because of the extensive changes we've made to the code which uses the FFTs, it may be that the problem will no longer occur. Unfortunately we won't know for sure until we have some beta WUs out, because we don't have the equipment to reproduce the bug in our lab.

Matt
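
For readers wondering what "compiling for 1.1" versus "1.3" means in practice, here is a hedged sketch rather than GPUGRID's actual build setup; the file and kernel names are invented, but the nvcc -gencode options shown are the standard way to pack code for both architectures into one binary:

```cuda
// toy.cu -- hypothetical example only
#include <cstdio>

__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= a;              // identical source for every target
}

int main()
{
    const int n = 1 << 20;
    float *d_x;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemset(d_x, 0, n * sizeof(float));
    scale<<<(n + 255) / 256, 256>>>(d_x, 2.0f, n);
    cudaDeviceSynchronize();
    printf("kernel finished: %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaFree(d_x);
    return 0;
}

// Targeting only compute_11 keeps G80/G92 cards supported but, as Matt
// describes, denies the compiler some G200-specific optimisations.
// A "fat" binary carrying code for both generations could be built with:
//
//   nvcc -gencode arch=compute_11,code=sm_11 \
//        -gencode arch=compute_13,code=sm_13 \
//        toy.cu -o toy
```

At run time the driver picks the best matching code for the card, which is how one release can serve both CC1.1 and CC1.3 hardware.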

skgiven (Joined: 23 Apr 09, Posts: 3968, Credit: 1,995,359,260, RAC: 0)

Thanks for clearing that up. I suspect GPU-Z just looks at whether the GPU core is busy or not, working or waiting, so it does not reflect how much of the card is being used, just whether it is being used (but there is no literature on it). The missing 18% may actually reflect the time the CPU is being used, which going by Task Manager would be about right.

PS. I thought CC1.0 applied to the now obsolete G80, CC1.1 is G92, CC1.3 is G200, and the CC1.2 cards use G215 and G216 cores?

Joined: 15 Feb 07, Posts: 134, Credit: 1,349,535,983, RAC: 0

> I thought CC1.0 applied to the now obsolete G80, CC1.1 is G92, CC1.3 is G200, and the CC1.2 cards use G215 and G216 cores?

Yes, I was playing fast and loose with terms there. Here's the deal: with the exception of the very first G80 cards (8800GTX 768MB, IIRC), all of the G8x and G9x are compute 1.1 cards. As far as the CUDA programmer is concerned, all of the 1.1 silicon has pretty much the same performance characteristics and capabilities.

The second-generation G2xx silicon is rather more capable and has more features that are very useful to us[1], which is why we care to make the distinction for our app. Initially all of those devices described themselves as compute 1.3, but recently some low- to mid-range GPUs have been released that appear as compute 1.2. In practice, these seem to have the same performance characteristics as the original 1.3 devices, minus double-precision support (we've not had any of these in the lab to test, though, so don't quote me).

Matt

[1] More on-chip registers so kernels can be more complex, relaxed memory access rules, better atomic memory ops and double precision, to name the highlights.
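
If you're not sure which compute capability your own card reports, the CUDA runtime will tell you directly. This is a minimal, hedged sketch of a standalone query utility, not part of the GPUGRID app:

```cuda
// devquery.cu -- hypothetical helper, illustration only
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        // major.minor is the compute capability discussed above:
        // 1.1 = G8x/G9x, 1.2 = newer low/mid-range G2xx parts, 1.3 = G200.
        printf("Device %d: %s, compute %d.%d, %d multiprocessors, "
               "%d registers per block\n",
               i, p.name, p.major, p.minor,
               p.multiProcessorCount, p.regsPerBlock);
    }
    return 0;
}
```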

GDF (Joined: 14 Mar 07, Posts: 1958, Credit: 629,356, RAC: 0)

We are now testing the Windows build on our local server. Possibly in the afternoon we will upload some WUs with the new application.

gdf

Joined: 17 Dec 08, Posts: 3, Credit: 333,482, RAC: 0

Hi, I have an Nvidia 9800GT 512MB; will it be able to crunch the new WUs? (Is it CC1.1, CC1.3...?) One more question: in "Maximum CPU % for graphics 0 ... 100", what value should I enter, and what does changing this % affect when computing a WU? Thank you.

Michael Goetz (Joined: 2 Mar 09, Posts: 124, Credit: 124,873,744, RAC: 0)

> One more question: in "Maximum CPU % for graphics 0 ... 100", what value should I enter, and what does changing this % affect when computing a WU?

That parameter applies only to projects that have a screensaver (SETI, Einstein, CPDN, Rosetta, Docking, etc.). Also, as far as I know, only CPU-based tasks have screensavers; there are no GPU applications with screensavers. The parameter sets how much of the CPU may be used for the screensaver graphics (taking time away from the number crunching). For GPUGRID, this setting has no effect.

skgiven (Joined: 23 Apr 09, Posts: 3968, Credit: 1,995,359,260, RAC: 0)

Maximum CPU % for graphics should be set to 100%.

The 9800GT 512MB is Compute Capability 1.1. It will take plenty of time to run a task on that card, perhaps a couple of days, but it should work. Remember to disable "Use GPU while computer is in use", as this can cause problems with that card. When the new application is out and any fine tuning has been done, your card should run faster than at present.

GDF (Joined: 14 Mar 07, Posts: 1958, Credit: 629,356, RAC: 0)

On Windows it does not work well. We will release the Linux version first.

gdf

GDF (Joined: 14 Mar 07, Posts: 1958, Credit: 629,356, RAC: 0)

We have uploaded the new application for Windows and Linux, with some 50 workunits to test. First come, first served.

gdf