NVIDIA BigKepler

Author	Message
ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 25191 - Posted: 21 May 2012, 19:46:07 UTC - in response to Message 25187.
The article says it all: for simple code, AMD gives you more bang for the buck. However, GPU-Grid does not fit into this category (as far as I can see). They already built an AMD app. However, performance was quite bad (on VLIW chips, no GCN yet), and it wasn't stable due to bugs in the driver or SDK. In short: they're trying to support AMD cards, but it's not as easy as the article might lead one to believe.
MrS
Scanning for our furry friends since Jan 2002

Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0	Message 25205 - Posted: 22 May 2012, 14:22:15 UTC
Hi, What I find is that AMD inmegable is getting the batteries. Greetings.
http://blogs.amd.com/developer/2012/05/21/opencl%E2%84%A2-1-2-and-c-static-kernel-language-now-available/

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 25214 - Posted: 23 May 2012, 20:16:47 UTC - in response to Message 25205.
> Hi, What I find is that AMD inmegable is getting the batteries.
Is that a machine translation? Sorry, I can't figure out any sense in this sentence.
MrS

Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0	Message 25220 - Posted: 24 May 2012, 13:27:01 UTC - in response to Message 25214.
>> Hi, What I find is that AMD inmegable is getting the batteries.
> Is that a machine translation? Sorry, I can't figure out any sense in this sentence. MrS
Hi, I'm sorry, I mean that AMD is getting better and pushing Nvidia. Greetings.
http://stats.free-dc.org/cpidtagb.php?cpid=b4bdc04dfe39b1028b9c5d6fef3082b8&theme=9&cols=1

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 25379 - Posted: 31 May 2012, 14:54:06 UTC - in response to Message 25126.
>> BTW looking at the GK110 architecture, I can see that it is superscalar (in single precision) as well as the GK104, so only 960 of its 2880 shaders could be utilized by GPUGrid.
> Isn't that "1920 out of 2880"? Assuming the superscalar capabilities can still not be used (newer architecture, newer software.. not sure this still holds true for Kepler). MrS
Looking at the performance of the beta workunits, I came to the conclusion that Kepler can utilize only 1/3 of its shaders.

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 25385 - Posted: 31 May 2012, 18:55:10 UTC - in response to Message 25379.
You're referring to this post? If so (I don't want to read the entire thread now, it's quite long and the forum says I haven't read it yet)... you forgot an important factor of 2!
GTX680 is 150ns/115ns = 1.30434 times faster than GTX580. It's got 3 times as many shaders, so we might have expected a performance increase by a factor of 3. That would be 1.3/3 = 43% shader utilization by now.
However, GTX580 runs its core at 772 MHz and its shaders at 1544 MHz. GTX680 runs the shaders at 1006 MHz, so we should actually expect a performance increase of 3*1006/1544 = 1.95. So the shader utilization on Kepler appears to be 1.3/1.95 = 66.7%.
That's 2/3 rather than 1/3 and fits very well with the assumption that the superscalar capability still cannot be used by GPU-Grid, despite the changes nVidia made to the compiler and scheduling (which were actually meant to save power, not increase performance).
BTW: the lower clock speeds of Kepler chips are key to its power efficiency.
MrS
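
The utilization estimate in the post above can be reproduced with a short calculation. This is only a sketch in Python that uses the figures quoted there (the 150 vs. 115 timings, a 3x shader count, and shader clocks of 1544 MHz and 1006 MHz); the function and variable names are illustrative and do not come from the thread, and the model assumes throughput scales linearly with shader count and clock.

```python
def expected_speedup(shader_ratio, new_clock_mhz, old_clock_mhz):
    """Speedup expected if every shader did the same work per clock cycle."""
    return shader_ratio * new_clock_mhz / old_clock_mhz


# Figures as quoted in the post (GTX580 vs. GTX680).
measured_speedup = 150 / 115   # ratio of the two quoted timings
shader_ratio = 3               # GTX680 has 3x the shaders of GTX580
gtx580_shader_mhz = 1544       # hot clock: twice the 772 MHz core clock
gtx680_shader_mhz = 1006       # Kepler has no hot clock

# Ignoring clocks: measured speedup relative to the shader count alone.
naive_utilization = measured_speedup / shader_ratio

# Accounting for the lower Kepler shader clock.
expected = expected_speedup(shader_ratio, gtx680_shader_mhz, gtx580_shader_mhz)
clock_adjusted_utilization = measured_speedup / expected

print(f"naive utilization:          {naive_utilization:.1%}")
print(f"expected speedup:           {expected:.2f}")
print(f"clock-adjusted utilization: {clock_adjusted_utilization:.1%}")
```

With these inputs the clock-adjusted figure comes out close to 2/3, which is the point the post makes about the unused superscalar shaders.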

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 25398 - Posted: 31 May 2012, 21:33:53 UTC - in response to Message 25385.
I got your point. I assumed that nVidia had redesigned their shaders so that the shaders of the Kepler chip can do as much work as the shaders of the Fermi can do at doubled clock speed (to compensate for the eliminated hot clocks). Apparently I wasn't right about it. (They put in 3 times as many shaders to compensate for the lack of hot clocks.)
My 'reviewed' conclusion is that while it can utilize 2/3rd of its shaders, the Kepler performs as if it could utilize only the 1/3rd of its shaders, because of the eliminated hot clocks. On the other hand it's much more power efficient for the same reason.
Scaling the (theoretical) performance indices of the Kepler (GK104) and the Fermi (GF110) looks like:
(1536*2/3*1006)/(512*772*2) = 1.3031
Apparently we get the same result if we take out the "2" from both performance indices at the same time:
(1536/3*1006)/(512*772) = 1.3031
> BTW: the lower clock speeds of Kepler chips are key to its power efficiency.
Yes, it's the key for every chip. That's why Intel couldn't reach 10 GHz with the Pentium 4, as was planned when the NetBurst architecture was introduced. I wonder how much overclock the Kepler could take?
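
The performance-index comparison in the post above can be checked the same way. The following is a rough sketch in Python, assuming the index is simply usable shaders times effective shader clock; the `performance_index` helper and its parameter names are hypothetical and serve only to make the arithmetic explicit.

```python
def performance_index(shaders, core_clock_mhz, usable_fraction=1.0, clock_multiplier=1):
    """Theoretical index: usable shaders times effective shader clock."""
    return shaders * usable_fraction * core_clock_mhz * clock_multiplier


# Kepler GK104: 1536 shaders at 1006 MHz, 2/3 usable (no superscalar issue).
kepler_gk104 = performance_index(1536, 1006, usable_fraction=2 / 3)

# Fermi GF110: 512 shaders, 772 MHz core clock, shaders at 2x (hot clock).
fermi_gf110 = performance_index(512, 772, clock_multiplier=2)

ratio = kepler_gk104 / fermi_gf110

# Measured speedup from the timings quoted earlier in the thread.
measured = 150 / 115

print(f"theoretical ratio: {ratio:.4f}")
print(f"measured ratio:    {measured:.4f}")
```

The theoretical ratio reduces to 1006/772, and it lands within about 0.1% of the measured 150/115, which is why both readings ("2/3 at the real clock" and "1/3 at a doubled clock") describe the same numbers.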

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 25400 - Posted: 31 May 2012, 22:05:37 UTC Last modified: 31 May 2012, 22:06:38 UTC
Depends on the chip, but not a lot. 1300 MHz seems to be about the limit. At this point, the locked voltage becomes an issue. Realistically, however, most are seeing 1200-1250 MHz. My 680 can go up to about 1275 MHz, but I tend to keep it at 1200 MHz with +100 on the memory (3100).
With a flashed BIOS and extreme cooling, they're getting into "hot clock" territory, around 1400+ MHz.

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 25404 - Posted: 31 May 2012, 22:41:07 UTC - in response to Message 25400.
Won't be until next year that we see the full potential of the Keplers, in any format. Looking forward to seeing a full-fat Kepler, or at least the results. I'm not saving for one however.
FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 25446 - Posted: 2 Jun 2012, 12:48:55 UTC - in response to Message 25398.
> My 'reviewed' conclusion is that while it can utilize 2/3rd of its shaders, the Kepler performs as if it could utilize only the 1/3rd of its shaders, because of the eliminated hot clocks. On the other hand it's much more power efficient for the same reason.
While we're talking about the same numbers, I prefer to use the "real" clock speed and say "2/3 of the shaders", as this corresponds directly to not being able to use the superscalar ones. This is more straightforward and easier to understand. Saying "can use only 1/3 of its shaders" makes it seem like a really, really bad design, which it isn't.
MrS