NVIDIA BigKepler

Author	Message
Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0	Message 24499 - Posted: 20 Apr 2012, 19:41:31 UTC Last modified: 20 Apr 2012, 19:45:18 UTC Hi, NVIDIA plans to introduce GK100 (or GK110), known as Big Kepler, at the upcoming GPU Technology Conference in San Jose (California, USA), May 14-17, 2012. It has 7,000 million transistors. Greetings. NVIDIA-GPU Technology Conference

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24500 - Posted: 20 Apr 2012, 21:14:32 UTC Last modified: 20 Apr 2012, 21:20:33 UTC Well that's good news!! Wonder how the design will be in regards to the FP32 vs. FP64 core layout, meaning I'm wondering if it will just be a monster at FP32, and only slightly so at FP64 (like 680), or if they will make this one a more well-rounded GPU for crunching purposes. Hate to see how long it will take for any to actually become available though!! 680 has been out for over a month, and they're still hard to get. Maybe a Christmas gift for me? lol. 7 billion though, EXCITING!! EDIT: Hmmmm.... read some other forums, and many are discussing that this may be ONLY for their compute-designed chips, Quadro and Tesla, which would make sense from a business standpoint : ( GPU compute conference = expensive series, meant for scientists with grant money......

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 24504 - Posted: 20 Apr 2012, 22:30:39 UTC Suggestion: they handle it just as with Fermi. The big chip gets FP64 at 1/2 speed. It's active on Tesla and Quadro, but restricted on Geforce. Probably not 1/24 (as GK104) though, more like the previous 1/8. MrS Scanning for our furry friends since Jan 2002

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24505 - Posted: 20 Apr 2012, 22:51:44 UTC I mean the 680 isn't restricted like other cards though, if I'm not mistaken. 680 only has 8 fp64 cores which run at 1/1. They're not throttled, they just don't exist, and they're not even included in the core count.
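The 1/24 figure quoted for GK104 can be checked with simple unit counting. The sketch below assumes GK104's layout of 8 SMX blocks with 192 FP32 cores and 8 FP64 units each; these per-SMX counts are not stated in the thread, and the helper name `fp64_ratio` is made up for illustration. It also assumes every unit retires one operation per clock.

```python
from fractions import Fraction

# Assumed GK104 (GTX 680) layout; these per-SMX counts are not given in the thread.
SMX_COUNT = 8
FP32_PER_SMX = 192
FP64_PER_SMX = 8

def fp64_ratio(fp64_units: int, fp32_units: int) -> Fraction:
    """FP64 throughput relative to FP32, assuming one op per unit per clock."""
    return Fraction(fp64_units, fp32_units)

gk104_ratio = fp64_ratio(FP64_PER_SMX, FP32_PER_SMX)
total_fp32 = SMX_COUNT * FP32_PER_SMX
total_fp64 = SMX_COUNT * FP64_PER_SMX

print(f"FP64:FP32 rate = {gk104_ratio}")
print(f"FP32 cores: {total_fp32}, separate FP64 units: {total_fp64}")
```

Under these assumptions the FP64 units run at full speed but are few in number, which is how a chip can be "not throttled" and still have a low FP64 rate.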

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24506 - Posted: 21 Apr 2012, 0:23:09 UTC This should be interesting. Probably should be somewhere between 25-30% faster than 680? My guess anyways. Hope they have better yields than the 680! Rollout has been way too slow.

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24507 - Posted: 21 Apr 2012, 5:08:58 UTC Now that I think about it, with 7B transistors, couldn't this be the 690?

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0	Message 24508 - Posted: 21 Apr 2012, 10:25:51 UTC - in response to Message 24507. Last modified: 21 Apr 2012, 10:26:24 UTC I am surprised to see 7000 M. A GTX680 has only 3500 M, while the number of cores is only 50% more (2304) on GK110 according to some websites. Maybe they will actually have 2048 cores and not 2304, but these are actually more similar to a 110 chip than to a 104, so better performing for us. It would be great, because then I would expect almost a factor of 2x in performance against a GTX680 at the same clock. gdf

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24510 - Posted: 21 Apr 2012, 13:33:11 UTC Last modified: 21 Apr 2012, 14:25:48 UTC If it is a 685, and if it does have 2304 instead of 2048, those extra cores could be the FP64 cores? Seems like there's no way they could fit this on a 520 die though. IMHO. Also saw the release date was rumored at August to September, which at the current rate of availability of the 680 would probably make them AVAILABLE around Xmas. EDIT: Further, if this is the case, this would begin to bring it dangerously close to Maxwell, if I'm not mistaken. Guess I'll have to wait and see.

frankhagen Joined: 18 Sep 08 Posts: 65 Credit: 3,037,414 RAC: 0	Message 24511 - Posted: 21 Apr 2012, 14:48:39 UTC - in response to Message 24510. "If it is a 685, and if it does have 2304 instead of 2048, those extra cores could be the FP64 cores?" one thing for sure: they will come up with a chip that has decent DP-capabilities for the quadro/tesla line. did you read the announcement? "low-overhead ECC" will definitely mean something not to be seen on consumer-cards.... ..so most likely it's not a GTX-something.

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24512 - Posted: 21 Apr 2012, 15:48:06 UTC Very true. Seems weird to me that they would post it on their Twitter feed if it wasn't consumer based. Have seen dev-based betas 301.26 and 301.27 for them to build their tools. My 680 is using 301.25, which isn't even on their website yet, and which has Quadro features in it, as well as 670 or 660 Ti specs.

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 24519 - Posted: 21 Apr 2012, 20:43:47 UTC I still think nVidia won't release BigKepler as a consumer card (i.e. GeForce). Besides the crunchers, there is no need for such a card in this market segment. We crunchers are a minority among the consumer card buyers, so we do not present such an urge for nVidia to release a cheap cruncher card built on an expensive chip. GDF's reaction was kind of a confirmation of my opinion, naming the GTX 680 as the flagship product.

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 24521 - Posted: 21 Apr 2012, 23:47:13 UTC - in response to Message 24519. Non-professional video editors might disagree, as might their favourite software developer.

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 24530 - Posted: 22 Apr 2012, 11:37:43 UTC 7B transistors seem fit for typical nVidia strategy: build it as large as TSMC will allow you. And 2x the transistors for 1.5x the shaders makes sense considering these will have to be cores which can do FP64 at 1/2 the FP32 rate, just like on GF100 and GF110, which requires more transistors per shader. And I wouldn't be surprised if they moved away from the superscalar execution again, just like on GF100 and GF110, which improves the utilization of the shaders in typical HPC code, but requires more control logic (i.e. more transistors) per shader, again. Support for ECC memory may add some transistors. There may also be larger caches and other tweaks we do not yet know about. And considering the market for Quadros is still large compared to the market for pure Teslas, I'm sure these chips will still have graphics capabilities. This way they can be used for both markets, lowering design cost. And if there's graphics hardware in there and the chip is faster in games than GK104 (which it should be at this transistor count, although by far less than a factor of 2), they will introduce consumer Geforce cards based on them. The profit margin on these high end cards is huge and since they already have the chip, it doesn't cost them much. And the high end GPUs will be bought, just like in previous generations. Despite the fact that GK104 should be much more efficient (power and price) for games. Name of the product: who knows, maybe even nVidia themselves have not decided this yet. Maybe straight GTX780 (which would make GTX680 look bad), or GTX685 (which would make big Kepler look weak) or "GTX680XT XXX Ultra-Monster-Core Golden Sample Edition" (which would make their other names look pretty good).
Personally I'd bet the name of my sister's first-born on the latter ;) Regarding yields: they'll be bad for such a huge chip, but there'll be plenty of units to deactivate. No big deal. And the scarce availability of current 28 nm cards is not primarily a yield issue (otherwise we'd already be seeing more cut-down versions of the chip and no fully activated ones), but rather an issue of overall 28 nm capacity. This capacity will improve as TSMC converts more fabs or lines within each fab, but demand for 28 nm chips will also increase as more designs transition to the newer process. Anyway, TSMC expects supplies of any 28 nm chips to be tight until the end of the year. MrS Scanning for our furry friends since Jan 2002

GPUGRID Role account Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0	Message 24533 - Posted: 22 Apr 2012, 13:00:43 UTC - in response to Message 24508. "I am surprised to see 7000 M. A gtx680 has only 3500 M" Nowhere does it say that those 7B transistors are all on a single die. In marketing-speak at least, a dual-GK104 card would satisfy the description "7B transistor GPU". MJH

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 24535 - Posted: 22 Apr 2012, 13:49:21 UTC - in response to Message 24533. Last modified: 22 Apr 2012, 13:50:48 UTC Sure, but GK104 would be far worse than an unlocked GF110 for FP64 compute. You can't offer a compute chip without good FP64 performance, the usage cases would be far too few. And I don't think it's got ECC either, as this is not needed for gaming. MrS Scanning for our furry friends since Jan 2002

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 24536 - Posted: 22 Apr 2012, 14:41:09 UTC - in response to Message 24530. "7B transistors seem fit for typical nVidia strategy: build it as large as TSMC will allow you." They learned a lesson from the Fermi-fiasco not to come out with the largest chip first.
"And 2x the transistors for 1.5x the shaders makes sense considering these will have to be cores which can do FP64 at 1/2 the FP32 rate, just like on GF100 and GF110, which requires more transistors per shader. And I wouldn't be surprised if they moved away from the superscalar execution again, just like on GF100 and GF110, which improves the utilization of the shaders in typical HPC code, but requires more control logic (i.e. more transistors) per shader, again. Support for ECC memory may add some transistors. There may also be larger caches and other tweaks we do not yet know about." I agree. However if all of this is true, I would be surprised if the BigKepler had more than 1024 cores.
"And considering the market for Quadros is still large compared to the market for pure Teslas, I'm sure these chips will still have graphics capabilities. This way they can be used for both markets, lowering design cost." Or they return to the design of the GT200, and use a discrete chip for this purpose.
"And if there's graphics hardware in there and the chip is faster in games than GK104 (which it should be at this transistor count, although by far less than a factor of 2), they will introduce consumer Geforce cards based on them." I hope it's right, and then we could have a nice cruncher card.
"The profit margin on these high end cards is huge and since they already have the chip, it doesn't cost them much." The profit margin is high on Teslas and Quadros, but it's low on the top end GeForces (like the GTX 295, the GTX 590, or even the GTX 580).
"And the high end GPUs will be bought, just like in previous generations. Despite the fact that GK104 should be much more efficient (power and price) for games." It's true, but they still could build a dual GPU card on the GK104, which would be very fast and very efficient at the same time.
"Name of the product: who knows, maybe even nVidia themselves have not decided this yet. Maybe straight GTX780 (which would make GTX680 look bad), or GTX685 (which would make big Kepler look weak) or 'GTX680XT XXX Ultra-Monster-Core Golden Sample Edition' (which would make their other names look pretty good). Personally I'd bet the name of my sister's first-born on the latter ;)" :) I'm sure they will find a fully satisfying name for this product, if there will be a product to name. What I meant was that if the GTX 680 is the flagship (GeForce) product, then we won't have a better (single-chip) GeForce this time.

5pot Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0	Message 24537 - Posted: 22 Apr 2012, 14:56:08 UTC Safe to say they gave just enough to get us excited, didn't they!!!

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 24539 - Posted: 22 Apr 2012, 16:03:40 UTC - in response to Message 24536. "They learned a lesson from the Fermi-fiasco not to come out with the largest chip first." This may be wise and intentional, although some rumors were floating around that they tried to introduce the chip earlier, but couldn't due to some problems.
"I agree. However if all of this is true, I would be surprised if the BigKepler had more than 1024 cores." If they just expanded GF110 to 7B transistors, that would indeed yield ~1024 shaders. However, by getting rid of the hot clock they can save some transistors, as well as everything already implemented in "little Kepler". Starting from GK104 we could remove the superscalar capability, i.e. 1/3 of the shaders and, as a first-order approximation, 1/3 of the transistors. That yields 1024 shaders for 2.33 billion transistors, so for 2048 shaders only 4.7 billion would be necessary, and ~2400 could be possible at 5.5 billion transistors. These would be FP32 only, so now add some of the stuff I mentioned in my post above and I think it works out.
"Or they return to the design of the GT200, and use a discrete chip for this purpose." Wasn't that the same chip for all of them? And the dual GK104 will come, I'm sure. It will rock for gaming and it will sell. The purpose of building GK104 fast & efficient, yet not so big as to reach 250 W again, was probably to have a decent dual chip GPU again (without heavy binning and downclocking to stay below 300 W). MrS Scanning for our furry friends since Jan 2002
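The transistor estimate in the post above can be reproduced with a few lines of arithmetic. This is only a sketch of that first-order reasoning, not a statement about the real chip: it takes the 3.5 billion transistor figure for GK104 from the thread, uses 1536 shaders for GK104 (implied by "2304 is 50% more"), and assumes transistor count scales linearly with shader count. The function names are made up for illustration.

```python
from fractions import Fraction

# Figures from the thread: GK104 (GTX 680) has about 3.5 billion transistors,
# and 2304 cores is described as 50% more than GK104, which implies 1536 shaders.
GK104_TRANSISTORS = 3_500_000_000
GK104_SHADERS = 1536
BIG_KEPLER_TRANSISTORS = 7_000_000_000

def strip_superscalar(transistors, shaders, removed=Fraction(1, 3)):
    """First-order approximation: dropping a fraction of the shaders
    drops the same fraction of the transistors."""
    kept = 1 - removed
    return transistors * kept, int(shaders * kept)

def transistors_for(shaders, base_transistors, base_shaders):
    """Scale the transistor count linearly with the shader count."""
    return base_transistors * shaders / base_shaders

def shaders_for(transistors, base_transistors, base_shaders):
    """Inverse of transistors_for: shaders that fit into a transistor budget."""
    return transistors * base_shaders / base_transistors

# FP32-only baseline derived from GK104 without the superscalar third.
base_t, base_s = strip_superscalar(GK104_TRANSISTORS, GK104_SHADERS)
t_2048 = transistors_for(2048, base_t, base_s)
s_at_5_5b = shaders_for(5_500_000_000, base_t, base_s)
headroom = BIG_KEPLER_TRANSISTORS - t_2048

print(f"Baseline: {base_s} shaders, {float(base_t) / 1e9:.2f}B transistors")
print(f"2048 shaders need about {float(t_2048) / 1e9:.2f}B transistors")
print(f"5.5B transistors fit about {float(s_at_5_5b):.0f} shaders")
print(f"Left over from 7B at 2048 shaders: {float(headroom) / 1e9:.2f}B")
```

The leftover budget in the last line is what the post proposes spending on FP64 capability, ECC support, larger caches and extra control logic.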

frankhagen Joined: 18 Sep 08 Posts: 65 Credit: 3,037,414 RAC: 0	Message 24543 - Posted: 22 Apr 2012, 18:04:10 UTC - in response to Message 24539. Last modified: 22 Apr 2012, 18:04:42 UTC "And the dual GK104 will come, I'm sure. It will rock for gaming and it will sell. The purpose of building GK104 fast & efficient, yet not so big as to reach 250 W again, was probably to have a decent dual chip GPU again (without heavy binning and downclocking to stay below 300 W)." SIGNED! until now they had a single chip design for consumer and professional purposes and simply limited the consumer cards on DP-performance. is it such a wild guess that this will no longer be the case? they will have absolutely no problem scaling down from GK104 to feed every range they want. and they will come up with something to replace the current quadro/tesla line. but we will know by mid-May..

Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0	Message 24546 - Posted: 22 Apr 2012, 20:03:43 UTC - in response to Message 24539. "If they just expanded GF110 to 7B transistors, that would indeed yield ~1024 shaders. However, by getting rid of the hot clock they can save some transistors, as well as everything already implemented in 'little Kepler'. Starting from GK104 we could remove the superscalar capability, i.e. 1/3 of the shaders and, as a first-order approximation, 1/3 of the transistors. That yields 1024 shaders for 2.33 billion transistors, so for 2048 shaders only 4.7 billion would be necessary, and ~2400 could be possible at 5.5 billion transistors. These would be FP32 only, so now add some of the stuff I mentioned in my post above and I think it works out." From what you say I got the feeling that the GF100 and the GF110 are very wasteful designs regarding the transistor count. I think nVidia wouldn't develop their professional product line from the consumer product line, which was derived from the previous professional product line.