Message boards : Graphics cards (GPUs) : Which graphic card
Joined: 28 Jul 12 · Posts: 819 · Credit: 1,591,285,971 · RAC: 0
> You are speculating; cite a source that says that Nvidia supplies binned chips, and what card makers are using them.

That is another way of saying that you have no evidence to prove the assertion that the chips are binned. But you may believe it if you want to. As for heatsinks, power supply components, etc., they often are larger on overclocked cards to handle the extra heat and current load, but that has nothing to do with error rates (except to make them worse if they weren't oversized). I think you need to look at your error rates, which are the relevant data, and stop speculating about what happens in any given factory.
Joined: 12 Dec 11 · Posts: 34 · Credit: 86,423,547 · RAC: 0
> Yes that is the one. But I saw a GTX 660 for only a little more money. Same 192-bit bus but more stream processors. So I am still a bit in doubt.

The GTX 660 is great. The 650 Ti has a 128-bit bus (not 192), less L2 cache, and fewer ROPs, in addition to the reduced stream processor count. The 650 Ti Boost matches the 660 in those other regards, but has the same number of stream processors as the 650 Ti.
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
> Yes that is the one. But I saw a GTX 660 for only a little more money. Same 192-bit bus but more stream processors. So I am still a bit in doubt.

I agree; if the price is close, get the 660. Did you see skgiven's test and analysis of the 650 Ti, 650 Ti Boost, 660 and 660 Ti a few days ago? Read it, it's good information. There's also companion information in the later posts of the same thread.

Edit: Here's a link to the thread: http://www.gpugrid.net/forum_thread.php?id=3156&nowrap=true#29914
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
> You are speculating; cite a source that says that Nvidia supplies binned chips, and what card makers are using them.

Do you really not understand how you turned this around, or are you just trying to be difficult? I hope you're trying to be difficult; that's easier to fix ;-)
Joined: 12 Dec 11 · Posts: 34 · Credit: 86,423,547 · RAC: 0
> You are speculating; cite a source that says that Nvidia supplies binned chips, and what card makers are using them.

Be nice, guys. We are all nerds on the same team. Why not chalk it up to "I feel comfortable having an overclocked GPU" versus "I feel more comfortable having a reference-clocked GPU"? Debating slight performance differences between 650 Ti models is really splitting hairs.
Joined: 26 Jun 09 · Posts: 815 · Credit: 1,470,385,294 · RAC: 0
> Yes that is the one. But I saw a GTX 660 for only a little more money. Same 192-bit bus but more stream processors. So I am still a bit in doubt.

Yes, I saw that. I read most of skgiven's technical information; seeing his credits and RAC, he knows his stuff :) Also at other projects. But I have read all the information from the other "specialists" as well, and I will go for the 660 from EVGA, not the SC or SSC, as they are too expensive for me at the moment.

@matlock: I guess I was mistaken; the 650 Ti Boost has a 192-bit bus and the 650 Ti 128, right?

Greetings from TJ
Joined: 17 Aug 08 · Posts: 2705 · Credit: 1,311,122,549 · RAC: 0
Jim, you're of course right that higher clock speeds increase the risk of calculation errors. But in a well-designed chip, and with proper error-detecting tests (not sure we have these for GPUs..), this is not a smooth transition but almost a step function. For my GTX 660 Ti that's "1228 MHz still works at GPU-Grid, 1241 MHz won't". While I do get occasional errors here (I've been watching it closer lately), these are always WUs which also fail for everyone else. So I conclude that I'm safe at 50 MHz over the top clock speed the manufacturer chose for my (heavily) factory-OCed card, even for Noelia tasks. I may have to adjust this down by 13 MHz in a year or two due to chip degradation.. I won't mind.

And you may rightfully say "but that's just one example". To which I'd reply that the point is "there is a certain operating point for every card just before the unstable region starts [when clocking up]". That's where you're running most efficiently (unless tasks fail, of course). Find this point, keep some safe distance (generally I'd recommend more like 26 MHz rather than 13 MHz, but I don't always practice what I preach..). If one follows this, it really doesn't matter what clock speed the manufacturer has set. Well, a factory-OC'ed card was at least tested at higher frequencies in games. This includes the shaders.. and makes it more likely the card will also perform better in DC. Actually, I found that in practice I could achieve higher clock speeds in DC than in games, depending on the project.

@Beyond: this "only uses 2/3 of its shaders" was not a software error. It's a design choice nVidia makes, which turns out to be more or less helpful depending on the code. All chips after GF100 and GF110 use this scheme (so even Titan). MrS

Scanning for our furry friends since Jan 2002
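The tuning procedure described above can be sketched as a toy model. All numbers here are hypothetical examples, not measurements: `BIN_MHZ` reflects the roughly 13 MHz boost bins mentioned in the post, and `is_stable` is a stand-in for actually crunching a batch of WUs at a given clock and checking that they validate.

```python
# Toy model of the "step function" stability curve: find the highest
# stable boost bin, then back off two bins (~26 MHz) as a safety margin.
# HYPOTHETICAL numbers; STABLE_LIMIT stands in for the card's real,
# unknown stability limit.

BIN_MHZ = 13
STABLE_LIMIT = 1228  # e.g. "1228 MHz still works, 1241 MHz won't"

def is_stable(clock_mhz):
    """Stand-in for running a batch of WUs at this clock and checking
    that they all validate; real testing takes days per bin."""
    return clock_mhz <= STABLE_LIMIT

def find_safe_clock(start_mhz, margin_bins=2):
    """Step up one boost bin at a time until the next bin would fail,
    then back off margin_bins for long-term stability."""
    clock = start_mhz
    while is_stable(clock + BIN_MHZ):
        clock += BIN_MHZ
    return clock - margin_bins * BIN_MHZ

print(find_safe_clock(1100))  # 1191 with these example numbers
```

The two-bin margin matters because, as noted above, the stability limit drifts downward over the years as the chip degrades.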
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
> in a well designed chip and with proper error-detecting tests (not sure we have these for GPUs..) this is not a smooth transition, but rather almost a step function. For my GTX660Ti that's "1228 MHz still works at GPU-Grid, 1241 MHz won't".

This fits my experience too. It's also true across projects, not only here. As my GPUs get older, I usually have to slowly step down the OC speed to keep them running 100% trouble-free.

> @Beyond: this "only uses 2/3 of its shaders" was not a software error. It's a design choice nVidia makes, which turns out to be more or less helpful depending on the code. All chips after GF100 and GF110 use this scheme (so even Titan). MrS

But then explain why ACEMD properly uses all the shaders now. As I understand it, ACEMD improperly detected the number of shaders in the GF106 (or, more precisely, those settings were not included in ACEMD, or ACEMD was hardwired to always assume a particular shader ratio). Then, a couple of years later, ACEMD was finally updated (corrected) to properly detect and use all the shaders in GF106-based GPUs. Sure sounds like a software problem to me. At least that was the way it was explained to me in a different thread. Someone correct me if I have the details wrong.
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0
I think the super-scalar cards are now preferentially favoured by the application. When the top GPUs were the GTX 480 to GTX 590, it made sense to favour those Compute Capability 2.0 architectures, for project optimization reasons. It now makes more sense to use an app that favours the CC 3.0 GeForce 600 GPUs, which are all super-scalar. This just happens to make the CC 2.1 cards (super-scalar GeForce 400 and 500 series GPUs) perform better than they did, and also makes my old GPU comparison tables (with the older apps) obsolete.

FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
> I think the super-scalar cards are now preferentially favoured by the application. When the top GPUs were the GTX 480 to GTX 590, it made sense to favour those Compute Capability 2.0 architectures, for project optimization reasons. It now makes more sense to use an app that favours the CC 3.0 GeForce 600 GPUs, which are all super-scalar.

Interesting. If this is the way it works, it would seem a good idea to allow sending different optimized apps to different GPU types. That's the way other projects handle this kind of situation, AFAIK.
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0
That's just my take on the situation. Only the researchers could tell you whether that is the case, or how close it is to the situation. There are many potential issues with having multiple apps (queues, resources, management), but the biggest may be that any piece of research has to be performed on the same app (or one that is essentially the same for the purpose of analysis) for the research to be presentable/acceptable.

FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
> That's just my take on the situation. Only the researchers could tell you whether that is the case, or how close it is to the situation.

Other projects seem to handle the same issues easily. What's so hard about setting up a different queue? What's so hard about having different apps optimized for different classes of NVidia cards? I'm just not seeing why it should be so difficult. Why not ask for help from some of the other projects, or if need be even hire one of them to set up the queues, etc.?
Joined: 26 Jun 09 · Posts: 815 · Credit: 1,470,385,294 · RAC: 0
> I'm just not seeing why it should be so difficult. Why not ask for help from some of the other projects, or if need be even hire one of them to set up the queues, etc?

That is very difficult in a scientific setting. It has a lot to do with funding and research groups. Working for another science group, even on hire, means that work for one's own group is put on hold. Personally, I think it is more important for the GPUGRID project to get the science right, as that will help cure some terrible diseases. If there is time left, a trainee or intern could work on additional apps. Making one good app is better for updating and maintenance, though, than making several, with the risk that something is forgotten, etc.

Greetings from TJ
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
> I'm just not seeing why it should be so difficult. Why not ask for help from some of the other projects, or if need be even hire one of them to set up the queues, etc?

You may be right. The other side of the coin, though, is that the work is going to get done much faster if more crunchers and GPUs are accommodated. That's assuming the project needs the crunching capacity; maybe it doesn't. How much work is lost, and computing time wasted, by apps that don't work correctly with many GPUs and WUs that aren't formatted correctly? I think we can see from our experience that a lot is lost. Trade-offs? I'd say there are always trade-offs. Seriously though, setting up new queues should be a simple matter in BOINC. There's more than one forum / e-mail list where developers can get help from others who have already climbed the mountain. One can't be too proud to ask, though...
Joined: 26 Jun 09 · Posts: 815 · Credit: 1,470,385,294 · RAC: 0
> I'm just not seeing why it should be so difficult. Why not ask for help from some of the other projects, or if need be even hire one of them to set up the queues, etc?

I agree with you, Beyond: the more that is crunched, the better it is for the project.

Greetings from TJ
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0
I don't think the issue is the creation of new queues; they have been added and deleted before. It would be more of a maintenance issue. The project has an inherent need for longer WUs with more steps and greater detail, which makes shorter, more accommodating experiments less useful. The solution there is to diversify, which is something it appears Gianni is trying to do. Then there is the app situation: GPUGrid is basically a one-app project, with different WU types and lengths. From a scientific point of view this means results are comparable, and it allows the researchers to extend runs in order to get more detail. BTW, I'm not arguing for or against more queues, apps, or better stability (I crunch too); I'm just trying to give my take on the situation.

FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
Joined: 17 Aug 08 · Posts: 2705 · Credit: 1,311,122,549 · RAC: 0
Let's take a look at the hard numbers. I've compared the average runtimes of 5 recent Noelia WUs on fast Linux hosts (GPUs are less likely to be overclocked there) and took the theoretical SP performance into account:

| Card | Runtime (ks) | Theoretical performance (TFlops) | TFlops × runtime |
|---|---|---|---|
| GTX 580 | 39.49 | 1.58 | 62.4 |
| GTX 570 | 41.92 | 1.40 | 58.7 |
| GTX 480 | 46.00 | 1.34 | 61.6 |
| GTX 660Ti | 38.07 | 2.46 | 93.7 |
| GTX 560Ti | 67.71 | 1.26 | 85.3 |

The quantity "TFlops × runtime" may not be the most intuitive, but it makes sense if we want to compare architecture efficiencies. A low value signals a highly efficient architecture:

- for a given theoretical speed, the longer the WUs take, the less the hardware is actually being used
- for a given runtime, the more TFlops were needed to achieve it, the less efficient the hardware is

What we see here is still a clear ~50% advantage for the CC 2.0 cards, or, put the other way around, a 2/3 penalty for the superscalar GPUs. Just as it has been since the introduction of these cards! I don't have such hard numbers for the older apps at hand, but I suspect they'd be equivalent.

The solution to this apparent paradox is pretty simple, IMO: the Keplers have added so much theoretical performance and gained so much power efficiency (due to architecture as well as process node) that they're clearly superior to the Fermis. And there are no new non-superscalar GPUs. Hence we forgive the new cards that 2/3 penalty and consider them quite good (which they are, IMO). Hence the suspicion "the client must have improved for the newer cards". I think in the light of these numbers we can quickly stop the discussion about new queues and multiple apps :) MrS

Scanning for our furry friends since Jan 2002
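The last column of the table is just the product of the first two; a quick sanity check in Python, with the numbers copied from the table:

```python
# Reproduce the "TFlops x runtime" efficiency column from the table.
# Lower = more of the card's theoretical speed turns into finished work.
cards = {
    "GTX 580":   (39.49, 1.58),
    "GTX 570":   (41.92, 1.40),
    "GTX 480":   (46.00, 1.34),
    "GTX 660Ti": (38.07, 2.46),
    "GTX 560Ti": (67.71, 1.26),
}

for name, (runtime_ks, tflops) in cards.items():
    print(f"{name}: {runtime_ks * tflops:.1f}")  # matches the last column
```

Note the two Fermi CC 2.0 cards cluster around 59–62 while the super-scalar cards land around 85–94, which is the ~50%/2/3 gap discussed above.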
Joined: 26 Jun 09 · Posts: 815 · Credit: 1,470,385,294 · RAC: 0
If I have understood correctly, the lower the result of TFlops times runtime, the more efficient the card? Then the 660Ti would perform worst?

Greetings from TJ
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0
> I think the super-scalar cards are now preferentially favoured by the application. When the top GPUs were the GTX 480 to GTX 590, it made sense to favour those Compute Capability 2.0 architectures, for project optimization reasons. It now makes more sense to use an app that favours the CC 3.0 GeForce 600 GPUs, which are all super-scalar. This just happens to make the CC 2.1 cards (super-scalar GeForce 400 and 500 series GPUs) perform better than they did, and also makes my old GPU comparison tables (with the older apps) obsolete.

You disagree? Going back a year, to when the top GPUs were CC 2.0, a GTX 470 did 29% more work than a GTX 560 Ti. Now, with the newer apps, a GTX 470 can only do 7% more work than a GTX 560 Ti. With super-scalar cards you can never utilize all the shaders fully, but I think utilization has gone up from 66% to around 80%. Theoretical GFLOPS are not a good indicator of performance here; otherwise the GTX 650 Ti would have been 16% faster than a GTX 470 from the outset (as suggested by their GFLOPS), and we wouldn't have needed correction factors to accurately compare GPUs of different compute capabilities.

FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
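For context, the theoretical GFLOPS figures being compared in this thread come from a simple formula: shaders × clock × 2, since a fused multiply-add counts as two floating-point operations. The shader counts and reference clocks below are approximate spec values I'm assuming, not from the posts; with them, the formula reproduces the TFlops column in ETA's table:

```python
# Theoretical peak SP performance = shaders x clock x 2 (FMA = 2 flops).
# Approximate reference clocks; note Fermi shaders run at the "hot clock",
# Kepler shaders at the core clock.
def peak_gflops(shaders, clock_mhz):
    return shaders * clock_mhz * 2 / 1000.0

print(peak_gflops(512, 1544))   # GTX 580:   ~1581 GFLOPS (~1.58 TFlops)
print(peak_gflops(384, 1645))   # GTX 560Ti: ~1263 GFLOPS (~1.26 TFlops)
print(peak_gflops(1344, 915))   # GTX 660Ti: ~2459 GFLOPS (~2.46 TFlops)
```

The formula counts every shader as fully busy every cycle, which is exactly what a super-scalar design cannot guarantee; that is why raw GFLOPS overstate the CC 2.1 and CC 3.0 cards here, and why correction factors were needed.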
Retvari Zoltan · Joined: 20 Jan 09 · Posts: 2380 · Credit: 16,897,957,044 · RAC: 0
> Going back a year to when the top GPUs were CC 2.0, a GTX 470 did 29% more work than a GTX 560 Ti.

I think that's because the newer apps are built with CUDA 4.2.

> With super-scalar cards you can never utilize all the shaders fully, ...

That's true even for a non-super-scalar card :) I would say that the non-super-scalar cards (still) have a significant advantage over the super-scalar cards in shader utilization.

> ... but I think utilization has gone up from 66% to around 80%.

This advantage is less than it was with the CUDA 3.1 apps (it was around 33%). It's too bad from the cruncher's perspective that nVidia doesn't make non-super-scalar GPUs anymore, but (as a kind of compensation) the good news is that CUDA 4.2 can utilize the super-scalar architecture better than CUDA 3.1 could.

This discussion is difficult because we're talking about the performance of a system consisting of many parts, all of which change continuously over time, and this change could alter (as it did in the past) their order of significance:

- I/a: The GPU
- I/b: The code running on the GPU
- II: The computer (also a system consisting of many parts)
- II/a: The operating system of the computer
- II/b: The optimization of the BOINC client for the hardware it's running on
- II/c: The hardware components of the computer (besides the GPU)

This is the current order. Except for item I/b, the participants can optimize this system, but item I/b is the foundation of that optimization: I changed my Core 2 Quad systems to Core i7 systems to achieve better overall performance by eliminating the PCIe bandwidth bottleneck which reduced the performance of the CUDA 3.1 client. The CUDA 4.2 client is also better in regard to this issue, so no such change (read: investment) is needed from the participants now. But you still have to spare a CPU thread per GPU (item II/b).
©2026 Universitat Pompeu Fabra