BOINC 6.6.20 underestimating card GFLOPs
Joined: 9 Dec 08 · Posts: 29 · Credit: 18,754,468 · RAC: 0
Hi, I noticed that in BOINC's startup messages, where it says 'Found CUDA device' etc., it also says 'est. 60GFlops'. Now, this is an 8800GT, which should have about 500 GFlops theoretically (according to Wikipedia), and given my 20% overclock, at most 600 GFlops. Judging by the temperatures, more than 60 GFlops is being used. Is this a problem at all? I gather this only really affects the number of queued WUs, right?

P.S. Temperatures are higher with F@H, so maybe that uses the GPU better, and then Crysis gets temperatures even higher (perhaps from the memory / texture units?). I suppose the WUs are optimized for the most common GPUs - from the boards it looks like the GF200 series would be the target.
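For reference, the ~500 GFlops figure can be reproduced from the card's published specs. A minimal sketch, assuming the stock 8800GT configuration (112 shaders at a 1.512 GHz shader clock) and the theoretical 3 flops/clock per shader that NVIDIA quoted for G8x/G9x (a MAD plus a dual-issued MUL):

```python
# Theoretical peak single-precision throughput for a G92-based 8800GT.
# Assumed stock specs: 112 stream processors, 1.512 GHz shader clock.
# G8x/G9x can in theory issue a MAD (2 flops) plus a MUL (1 flop)
# per shader per clock, hence 3 flops/clock.
shaders = 112
shader_clock_ghz = 1.512
flops_per_clock = 3

peak_gflops = shaders * shader_clock_ghz * flops_per_clock
print(f"Stock peak: {peak_gflops:.0f} GFlops")                 # ~508 GFlops
print(f"With 20% overclock: {peak_gflops * 1.2:.0f} GFlops")   # ~610 GFlops
```

This lines up with the "about 500 GFlops / at most 600 GFlops" figures above, and shows how far off BOINC's 60 GFlops estimate is from the theoretical peak.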
X1900AIW · Joined: 12 Sep 08 · Posts: 74 · Credit: 23,566,124 · RAC: 0
You mentioned F@H; have you seen the FLOP-FAQ?
Michael Goetz · Joined: 2 Mar 09 · Posts: 124 · Credit: 124,873,744 · RAC: 0
> I suppose the WUs are optimized for the most common GPUs - from the boards it looks like the GF200 series would be the target.

Actually, the design of CUDA makes it sort of irrelevant which card you design for. The drivers hide the differences pretty well, and the application code scales nicely from small cards to big cards. There are a few differences, but for the most part the difference between the most modest CUDA card and the monster CUDA cards is that the big cards have more shaders and can therefore execute more calculations in parallel, as well as being able to do each calculation faster. Other than having more shaders and being faster, all the CUDA cards have pretty much the same architecture.

The exception to this is the difference in compute capabilities, but that's not so much an optimization thing as a 'this card can't run this application' thing.

Mike
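That scaling can be illustrated with a first-order model: for cards of the same architecture, throughput is roughly proportional to shader count times shader clock. A sketch comparing a stock 8800GT with a stock GTX 280 (specs assumed from the published datasheets; real scaling also depends on memory bandwidth):

```python
# First-order scaling model: same architecture, throughput roughly
# proportional to (number of shaders) x (shader clock in GHz).
def relative_throughput(shaders, clock_ghz):
    return shaders * clock_ghz

gt8800 = relative_throughput(112, 1.512)   # 8800GT, stock specs
gtx280 = relative_throughput(240, 1.296)   # GTX 280, stock specs
print(f"GTX 280 is ~{gtx280 / gt8800:.1f}x an 8800GT")   # ~1.8x
```

The same kernel runs on both; the bigger card simply has more shaders to spread the blocks across.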
Joined: 9 Dec 08 · Posts: 29 · Credit: 18,754,468 · RAC: 0
Hi, I hadn't seen that FAQ; thanks. OK, I was thinking about optimization more in terms of shader vs. memory bandwidth, which is important when optimizing graphics shaders because the ratio varies from card to card; perhaps it's not so important with CUDA. However, does BOINC's drastic underestimation of my graphics card's FLOPS affect anything?
Joined: 17 Aug 08 · Posts: 2705 · Credit: 1,311,122,549 · RAC: 0
> However, does BOINC's drastic underestimation of my gfx card FLOPS affect anything?

Not that I know of. And you're not the only one: I remember my 9800GTX+ scored somewhere around 80, whereas theoretically it would be ~750 GFlops.

MrS
Scanning for our furry friends since Jan 2002
Joined: 9 Dec 08 · Posts: 29 · Credit: 18,754,468 · RAC: 0
Cheers
X1900AIW · Joined: 12 Sep 08 · Posts: 74 · Credit: 23,566,124 · RAC: 0
Some results from the tool CUDA-Z (0.5.95): http://cuda-z.sourceforge.net/
BOINC: CUDA device: GeForce 9800 GX2 (driver version 18565, CUDA version 1.1, 512MB, est. 69GFLOPS)
- Multiprocessors: 16
- GPU Core Performance:
  - Single-precision Float: 384907 Mflop/s
  - Double-precision Float: Not Supported
  - 32-bit Integer: 77326.2 Miop/s
  - 24-bit Integer: 385118 Miop/s

BOINC: CUDA device: GeForce GTX 260 (driver version 18122, CUDA version 1.3, 896MB, est. 104GFLOPS)
- Multiprocessors: 24
- GPU Core Performance:
  - Single-precision Float: 575468 Mflop/s
  - Double-precision Float: 71788.9 Mflop/s
  - 32-bit Integer: 103170 Miop/s
  - 24-bit Integer: 575465 Miop/s

BOINC: CUDA device: GeForce GTX 260 (driver version 18206, CUDA version 1.3, 896MB, est. 117GFLOPS)
- Multiprocessors: 27
- GPU Core Performance:
  - Single-precision Float: 649640 Mflop/s
  - Double-precision Float: 80490.2 Mflop/s
  - 32-bit Integer: 114848 Miop/s
  - 24-bit Integer: 649810 Miop/s
Joined: 9 Dec 08 · Posts: 29 · Credit: 18,754,468 · RAC: 0
Thanks for that. I noticed you have CUDA 1.3, so I'll update my drivers to see whether that helps my 8800GT. CUDA-Z reported that it was running at 377 Gflops (with the 20% overclock). I suspect F@H uses the GPU more efficiently, reaching higher Gflops, since it makes more heat, but I'm not going to start a new WU to find out! That's what I use the overclock for: to speed along GPUGrid. It would be too hot for F@H.
Joined: 17 Aug 08 · Posts: 2705 · Credit: 1,311,122,549 · RAC: 0
The "1.3" refers to the CUDA hardware compute capability level of the G200 chips. G80 has 1.0 and G9x has 1.1. This is different from the software CUDA version, which is currently 2.1 or 2.2.

MrS
Scanning for our furry friends since Jan 2002
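The hardware/software distinction can be sketched in code. This is only an illustration; the chip-to-capability mapping follows the post above, and the example cards are assumptions based on the GPUs mentioned in this thread:

```python
# Compute capability is a property of the chip (hardware);
# the CUDA toolkit version is a property of the installed software.
compute_capability = {
    "G80":  (1, 0),   # e.g. 8800 GTX
    "G9x":  (1, 1),   # e.g. 8800GT, 9800 GX2
    "G200": (1, 3),   # e.g. GTX 260/280
}

def supports_double_precision(chip):
    """Double-precision floating point arrived with compute capability 1.3."""
    return compute_capability[chip] >= (1, 3)

print(supports_double_precision("G9x"))    # False
print(supports_double_precision("G200"))   # True
```

This matches the CUDA-Z outputs earlier in the thread: the 9800 GX2 (CUDA 1.1) reports "Double-precision Float: Not Supported", while the GTX 260 (CUDA 1.3) reports a double-precision figure.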
X1900AIW · Joined: 12 Sep 08 · Posts: 74 · Credit: 23,566,124 · RAC: 0
Yes, I should have labelled it better: each line starting "BOINC: ..." is taken from BOINC Manager (when it recognizes the card's abilities and initializes it for the projects), not from CUDA-Z. Sorry for mixing up two different outputs; I thought it was clear.
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0
Boinc uses its own benchmarks, so don't expect Boinc's GPU rating to match Nvidia's or anyone else's. The same is true for Boinc's CPU ratings: they are not the same as SiSoft Sandra, CPUBench, or Vista's strange points system. Why? Because Boinc measures a device's capability for doing Boinc tasks, which is exactly what you want to know! From the point of view of running Boinc, you don't need to know how good your CPU is in a desktop environment, at crunching prime numbers, or at rendering video. Similarly, it's not important to know how well your video card performs in a flight simulator. So Boinc has produced its own benchmarking system.
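The general idea behind such a benchmark is simple: run a fixed, known workload and divide the operation count by the elapsed time. A toy sketch of that principle (this is only an illustration, not BOINC's actual Whetstone-style benchmark, and a Python loop mostly measures interpreter overhead):

```python
import time

def measured_mflops(iterations=1_000_000):
    """Toy timing-based benchmark: time a fixed number of
    multiply-add operations and report Mflop/s."""
    x = 1.0000001
    acc = 0.0
    start = time.perf_counter()
    for _ in range(iterations):
        acc = acc * x + 1.0   # one multiply + one add = 2 flops
    elapsed = time.perf_counter() - start
    return (2 * iterations) / elapsed / 1e6

print(f"~{measured_mflops():.0f} Mflop/s")
```

Whether such a number is meaningful depends entirely on how well the benchmark workload resembles the real workload, which is exactly the point of contention in the next reply.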
Joined: 17 Aug 08 · Posts: 2705 · Credit: 1,311,122,549 · RAC: 0
> Because Boinc measures a device's capability for doing Boinc tasks, which is exactly what you want to know!

Well... no. It's actually an almost meaningless low-level measurement, similar in nature to Sandra or Everest, just with a different specific implementation. Similarly, the GFlops rating for GPUs is no more useful than the theoretical maximum GFlops.

MrS
Scanning for our furry friends since Jan 2002
©2025 Universitat Pompeu Fabra