Credit per € / $

Message boards : Number crunching : Credit per € / $

Previous · 1 · 2 · 3 · 4 · 5 · 6 · Next

AuthorMessage
Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 22056 - Posted: 11 Sep 2011, 16:11:08 UTC - in response to Message 22055.  

I still don't think X8 makes all that much difference; well not in itself for 2 cards. For two GTX590's perhaps.

I have 4 GTX 480s in two PCs:
1. Core2 Quad 9650 @4GHz (FSB 444MHz), X48 chipset, dual native PCIe x16 slots.
2. Core i7-870 @3.92GHz (BCLK 180), P55 chipset (PCIe x16 controller in the CPU)
After overclocking the Core 2 Quad, it is faster (it gets a higher RAC) than the Core i7-870, whether I put the 2 GTX 480s in the x16 slots (so both run at only x8), or put the second card in the third PCIe slot (so the first card gets x16 and the second x4). The lower the GPU usage of a WU, the bigger the impact of a slower PCIe link on performance. As far as I can recall, the difference was around 10% between x8 and x16. You can see the difference (it's now around 20%) between x4 and x16 among my host's results.

Anyway, when the Sandy Bridge E type CPUs turn up, their boards will support Quad Channel Memory, PCIe 3.0 and multiple x16 PCIe lanes. Obviously these will become the CPU/board combination to have.

Of course, but I have to add that they will be quite expensive.
ID: 22056
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22058 - Posted: 11 Sep 2011, 18:56:28 UTC - in response to Message 22056.  

I see your Core2 Quad cards are about 12% slower than your i7-870's X16 GPU, but your i7-870's X16 GPU is 17% faster than the X4 GPU (just going by a few tasks). So overall your C2Q is only around 3.5% slower. Not sure if you are using HT or not, and what else you are crunching? HT might make a slight difference between your CPU types, C2Q and i7-870, but only a little if any between X16, X8 and X4. What the CPU is being used for and what else the controller is spending time on could also make a difference. The faster the CPU and GPU, the more it would potentially expose the weakness in PCIE lanes.
While I don't consider an 8.5% loss to be massive when at x8, it's not my system and there is a hidden catch; it's either 8.5% for both cards or 17% for one GTX480, so I take your point. Adding a second GPU on an x16/2x-x8 board really only adds 83% of a GPU.

When you also consider that a fully optimized Win7 system is still around 10 to 15% (say 13%) slower than XP, adding a second GTX480 to a W7 system that only supports PCIe x8 for 2 GPUs would mean the setup does around 22% less overall compared to a similar XP system that was x16 capable for 2 cards.
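For what it's worth, here is a quick sketch of how those two penalties combine (the percentages are the rough estimates quoted above, not measurements). Simply adding them gives the ~22% figure; treating them as independent multiplicative factors gives slightly less:

```python
# Combining a ~13% OS penalty (Win7 vs XP) with a ~8.5% PCIe x8 penalty.
# Both figures are rough estimates from this thread, not benchmarks.
os_penalty = 0.13     # fully optimized Win7 vs XP
pcie_penalty = 0.085  # x8 vs x16, per card

additive = os_penalty + pcie_penalty
compounded = 1 - (1 - os_penalty) * (1 - pcie_penalty)

print(f"additive:   {additive:.1%}")    # 21.5%
print(f"compounded: {compounded:.1%}")  # about 20.4%
```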

I expect the difference varies by card. The PCIe x8 deficiency might be slightly less for a GTX 460 and cards such as the GTS 450, but more for a GTX580.

So a good setup includes a good operating system, GPU, CPU and other components: the motherboard's PCIe implementation (two x16 channels), RAM and HDD.

Presumably this would apply to a GTX590 as well; performance would be around 8.5% less than expected, as the dual card shares a single x16 PCIe lane set.

Would be interesting to know what this is for other cards in practice under different architectures (LGA 1366, 1155, AMD rigs) as well as your LGA 1156 setup.

At the minute I think it means anyone with an older dual X16 board and one GPU should hold off getting a new system just to add a new card. Just buy the second card, unless you want to fork out for the faster CPU and can find a dual X16 motherboard at a reasonable price.
ID: 22058
Profile Fred J. Verster

Joined: 1 Apr 09
Posts: 58
Credit: 35,833,978
RAC: 0
Message 22116 - Posted: 15 Sep 2011, 20:23:54 UTC - in response to Message 22058.  
Last modified: 15 Sep 2011, 21:02:42 UTC

I'm still satisfied with my 2-year-old Core2 Extreme X9650, now running @3.51GHz with the stock air cooler, but no case. (It draws 335 Watt doing a GPUgrid WU; running 2 or 3 SETI MB WUs as well, it needs 410 Watt.)
It's on an ASUS P5E motherboard with the X38 chipset; the DDR2 memory runs just below 400MHz (per stick), FSB=1524MHz. O.S. Windows XP64 Pro, 4 GiB DDR2.
And a GTX480 running in a PCIe 2.0 x16 slot. (It also runs SETI@home and Einstein; this type "eats anything".)

And this year I built an i7-2600 system with 2 ATI 5870 GPUs (CrossFire) on an Intel DP67BG motherboard. O.S. multi-boot: Ubuntu/Debian/Windows 7, all 64-bit (OEM); USB 2.0 & 3.0, eSATA; DDR3 1333MHz. It does SETI, and SETI Beta ATI GPGPU using OpenCL, which can handle anything from VLAR to 'other' Multi Beam WUs. And CPDN (~500 hour) work.

Also AstroPulse, avoided by a lot of people ;-). It's time consuming on the CPU, taking about 8 hours on an X9650 (@3.51GHz), and half that time, 2 at a time, on a 5870 GPU!
Those Sandy Bridge CPUs, like the i7-2600(K), sure are efficient, doing 1.65x the work of an X9650 (@3.51GHz, HT on) at 85 vs. 120 Watt.

MilkyWay and Collatz Conjecture work great too. MW hasn't much work at the moment.

Knight Who Says Ni N!
ID: 22116
Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 22160 - Posted: 24 Sep 2011, 21:21:02 UTC - in response to Message 22058.  

Not sure if you are using HT or not, and what else you are crunching?

I'm using HT, but only 4 tasks are running at the same time. (2 GPUGrid and 2 Rosetta@home)

I expect the difference might be different for different cards. PCI x8 deficiency might be slightly less for a GTX 460 and cards such as the GTS 450, but more for a GTX580.

It's obvious. That's why the GTX 590 (made of 2 underclocked GTX 580s) has its own NForce 200 bridge chip splitting one PCIe x16 link into 2 PCIe x16 links; this way both chips get PCIe x16. NVidia could have put 2 GPUs on a single board without this bridge chip, but then both GPUs would only have PCIe x8, which would hurt the overall performance too much. It's impractical for a top-end dual GPU card.

Would be interesting to know what this is for other cards in practice under different architectures (LGA 1366, 1155, AMD rigs) as well as your LGA 1156 setup.

I couldn't resist buying the ASUS P7P55 WS SuperComputer MB I've mentioned earlier, so now my LGA1156 host runs with it.
I kept the old P7P55D Deluxe MB too (and I bought another Core i7-870 for it), so I can put the two GTX 480s from my Core2 Quad host into this MB, and we can compare the dual PCIe x8 setup with the dual PCIe x16 setup.
ID: 22160
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22200 - Posted: 2 Oct 2011, 13:02:21 UTC - in response to Message 22160.  
Last modified: 2 Oct 2011, 14:16:22 UTC

Tested one of my i7-2600 systems with a GTX470 @657MHz. It turns out the Gigabyte PH67A-UD3-B3 has one x16 slot and one x4 slot! The difference was around 13%, so a GTX580 would be >13% slower and a GTX460 would lose less. I looked at several 1155 motherboards, and on some the second slot is only x1 when both are occupied.
ID: 22200
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22210 - Posted: 4 Oct 2011, 0:38:16 UTC - in response to Message 22200.  

PCIe 3.0 is also now available on some motherboards (MSI mostly). While these boards still tend to offer one x16 slot and either one x8 or x4, the improvement over PCIe 2.0 is simple: the bandwidth doubles from PCIe 2.0 to PCIe 3.0.
So a PCI Express x16 Gen 3 slot provides 32GB per sec compared to the 16GB per sec of PCIe 2.0, and a PCIe 3.0 slot at x8 is just as fast as a PCIe 2.0 slot at x16.
The LGA1155 Z68A-G45 (G3), for example, has two PCIe 3.0 slots. If one is used it operates at x16 (32GB per sec); if two are used, both run at x8 (16GB per sec each).
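For reference, the per-direction numbers can be worked out from the link rate and encoding overhead of each PCIe generation (a quick Python sketch; note the 16/32 GB per sec figures above count both directions at once):

```python
# Usable PCIe bandwidth per direction: link rate (GT/s) x encoding
# efficiency x lanes / 8 bits per byte.
# Gen 1/2 use 8b/10b encoding (80% efficient); Gen 3 uses 128b/130b.
GEN = {1: (2.5, 8 / 10), 2: (5.0, 8 / 10), 3: (8.0, 128 / 130)}

def pcie_gbs(gen, lanes):
    """Usable GB/s in one direction for a given generation and lane count."""
    rate, eff = GEN[gen]
    return rate * eff * lanes / 8

for gen, lanes in [(1, 16), (2, 16), (2, 8), (3, 16), (3, 8)]:
    print(f"PCIe {gen}.0 x{lanes:<2}: {pcie_gbs(gen, lanes):5.2f} GB/s per direction")
```

So a PCIe 3.0 x8 slot (~7.9 GB/s each way) does indeed roughly match a PCIe 2.0 x16 slot (8 GB/s each way).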
ID: 22210
toms83

Joined: 12 Oct 09
Posts: 3
Credit: 4,026,093
RAC: 0
Message 22441 - Posted: 31 Oct 2011, 23:16:17 UTC

I'm thinking about a Gigabyte GA-890FXA-UD7 and 3x GTX570.

Slot speeds would be x8, x4, x8 (with one slot free between the graphics cards, for better cooling). I'm worried about the one at x4. How much of a performance drop should I expect?
ID: 22441
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22447 - Posted: 1 Nov 2011, 22:25:16 UTC - in response to Message 22441.  

A guesstimate, but for a drop from PCIe x16 to x8 you might see a performance drop of around 8% on each GTX570. A further PCIe bandwidth drop to x4 would take the loss to around 16% below x16. So overall you would be losing about 1/3rd of a GPU.
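A quick sketch of that arithmetic (the per-slot loss figures are the guesstimates above, not benchmarks):

```python
# Estimated throughput of three GTX 570s in x8/x4/x8 slots, assuming
# ~8% loss at x8 and ~16% at x4 relative to x16 (guesstimates).
loss = {"x16": 0.00, "x8": 0.08, "x4": 0.16}
slots = ["x8", "x4", "x8"]

effective = sum(1 - loss[s] for s in slots)
print(f"effective GPUs: {effective:.2f} of 3")  # 2.68
print(f"lost: {3 - effective:.2f} of a GPU")    # 0.32, about a third
```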
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help
ID: 22447
Profile microchip
Joined: 4 Sep 11
Posts: 110
Credit: 326,102,587
RAC: 0
Message 22631 - Posted: 5 Dec 2011, 13:40:59 UTC
Last modified: 5 Dec 2011, 13:43:37 UTC

Since we're talking about PCIe lately, I've also got a question.

I'm running a GTX 560 on a board that only has PCIe 1.0 x16 support (old nForce 630a chipset). Roughly how much of a performance penalty am I getting for running the GTX 560 on a PCIe 1.0 bus? I've already ordered another mobo which has PCIe 2.0, but I'm curious about the penalty I'm getting currently on the PCIe 1.0 bus. Does it even matter much for GPUGRID?
ID: 22631
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22632 - Posted: 5 Dec 2011, 16:14:06 UTC - in response to Message 22631.  

My guess is between 4 and 8%, but let us know when you get your motherboard and run a few tasks.
PCIe 2.0 x16 has twice the bandwidth of PCIe 1.0 x16; 8GB/s vs 4GB/s. As your GTX560 is not a very high-end crunching GPU (mid range), performance will not be impacted as much as it would be for a GTX570, for example, which would lose around 8%.

ID: 22632
Mike Hamm

Joined: 23 Aug 12
Posts: 1
Credit: 913,492,326
RAC: 0
Message 27132 - Posted: 23 Oct 2012, 17:13:55 UTC - in response to Message 22632.  

Looks like the NVidia GTX 680 and 690 are the way to go.

http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units

Best gflops per watt.
ID: 27132
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 27141 - Posted: 24 Oct 2012, 9:55:02 UTC - in response to Message 27132.  

Going by wiki, a GTX660Ti and a GTX660 are better than a GTX680 and a GTX670!

    GTX660 16.22GFlops/Watt
    GTX660Ti 16.40GFlops/Watt

    GTX670 14.47GFlops/Watt
    GTX680 15.85GFlops/Watt

    GTX690 18.74GFlops/Watt


To state the obvious: the GTX690 is effectively two cards in one, and rather expensive. Purchase cost and running cost are important too.

Anyway, these are GFlops per Watt for the card only, not the system.
You might be able to get 3 GTX660Ti cards for around the same price as one GTX690.
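Sorting the quoted figures makes the ranking easy to see (the numbers are the Wikipedia values cited above, as of late 2012; they may have been revised since):

```python
# GFlops per Watt as quoted from the Wikipedia comparison page (Oct 2012).
cards = {
    "GTX 660":    16.22,
    "GTX 660 Ti": 16.40,
    "GTX 670":    14.47,
    "GTX 680":    15.85,
    "GTX 690":    18.74,
}

for name, gfw in sorted(cards.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10} {gfw:5.2f} GFlops/Watt")
```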


ID: 27141
mikey

Joined: 2 Jan 09
Posts: 303
Credit: 7,321,800,090
RAC: 330
Message 27460 - Posted: 28 Nov 2012, 15:39:42 UTC - in response to Message 22632.  

My guess is between 4 and 8%, but let us know when you get your motherboard and run a few tasks.
PCIe 2.0 x16 has twice the bandwidth of PCIe 1.0 x16; 8GB/s vs 4GB/s. As your GTX560 is not a very high-end crunching GPU (mid range), performance will not be impacted as much as it would be for a GTX570, for example, which would lose around 8%.


I have seen webpages where people also say the difference is not as great as one would think. 10% is not that big a deal when talking about a $150 US motherboard. Now going to a new MB and to 16GB of RAM too IS a big deal! The RAM of course is not used for GPU crunching very much, but it does help when crunching CPU units too. Getting a new MB, new RAM AND a new CPU along with that PCIe 2.0 slot... now THAT is DEFINITELY worth it!!
ID: 27460
Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 28874 - Posted: 28 Feb 2013, 18:14:22 UTC

Anyone up to updating the OP?

Question: supposedly the GPUGrid app only utilizes 2/3 of the shaders on the GTX 460 (GF104). Yet the GPU utilization on all my 460s runs at 88-89%, and the cards run hotter than at other projects. As I remember, when this was first identified as a problem the 460s ran with low GPU utilization and very cool. Are you SURE it's still an issue? If only 2/3 of the shaders are being used, why are the temps relatively high, as is the GPU utilization? Just wondering...
ID: 28874
Profile MJH

Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 28877 - Posted: 28 Feb 2013, 19:21:48 UTC - in response to Message 28874.  


If only 2/3 of the shaders are being used, why are the temps relatively high, as is the GPU utilization?


That's not an issue any more. The 4.2 apps do a good job of using the cc2.1 Fermis.

MJH
ID: 28877
Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 28882 - Posted: 28 Feb 2013, 21:08:44 UTC - in response to Message 28877.  

If only 2/3 of the shaders are being used, why are the temps relatively high, as is the GPU utilization?

That's not an issue any more. The 4.2 apps do a good job of using the cc2.1 Fermis. MJH

Thanks for the update, I thought something had changed for the better.
ID: 28882
Profile Raubhautz*
Joined: 18 Nov 12
Posts: 9
Credit: 1,867,450
RAC: 0
Message 30136 - Posted: 21 May 2013, 4:55:36 UTC - in response to Message 28874.  

Anyone up to updating the OP?

Question: supposedly the GPUGrid app only utilizes 2/3 of the shaders on the GTX 460 (GF104). Yet the GPU utilization on all my 460s runs at 88-89%, and the cards run hotter than at other projects. As I remember, when this was first identified as a problem the 460s ran with low GPU utilization and very cool. Are you SURE it's still an issue? If only 2/3 of the shaders are being used, why are the temps relatively high, as is the GPU utilization? Just wondering...


Interesting... I notice that the GIANNI (short) tasks use just under 84% CPU! Not only that, one takes 124k sec per WU!!! That is almost 2x what it takes to do a NATHAN (long) task, which averages 72k sec on my Quadro K2000M (Kepler w/2GB RAM).

Can anyone explain why the 'short' tasks take almost 2x as long as the 'long' versions?

My next step, out of sheer curiosity: I will try the same tests on my other machine, which has a similar video card but of the Fermi design (Quadro 2000M).

Thank you in advance.

Phil
ID: 30136
Stefan
Project administrator
Project developer
Project tester
Project scientist

Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 30144 - Posted: 21 May 2013, 8:28:26 UTC - in response to Message 30136.  
Last modified: 21 May 2013, 8:28:47 UTC

http://www.gpugrid.net/forum_thread.php?id=3370&nowrap=true#30017
ID: 30144
Profile Raubhautz*
Joined: 18 Nov 12
Posts: 9
Credit: 1,867,450
RAC: 0
Message 30193 - Posted: 22 May 2013, 10:56:36 UTC - in response to Message 30144.  
Last modified: 22 May 2013, 10:57:33 UTC

http://www.gpugrid.net/forum_thread.php?id=3370&nowrap=true#30017



Uh, yeah. Hello. Here is a copy of the linked 'response' you posted:

There are two main reasons why more credit is awarded on the long queue:
1) Greater risk for the cruncher to crunch (typically) much longer WUs. If a simulation crashes after 18 hours, that's a much bigger loss than a crash after 2 or 6 hours. This is especially true for older/slower cards.
2) To encourage/reward users who dedicate their computers to the long hours of crunching that long WUs require. With the short queue, you can run a WU while you sleep or run errands, for example, and by the time you wake up or come home it's finished, and you can use your computer for other things. Dedicating a gpu/cpu/computer to run on the long queue means you basically can't use it for other things, such as work, entertainment, etc., and so higher credits reward them for that.

Gianni's WUs may be slightly long for the short queue, but my recent tasks were definitely short for the long queue. The reason was that, at the time, my WUs were the only work for anyone to crunch, so I didn't want to make them too long in case people who typically crunch on short tasks wanted WUs to do, but couldn't sacrifice the time. Basically, it was due to the circumstances at that time. We have had a few weeks recently where I was the only one who had work for you guys, which was never a common situation for us. However, we keep adding more users, and it is becoming harder to come up with ideas fast enough (which is a good problem to have!). We are also trying to bring in new scientists!


Very neat-o. This explains how GPUGrid came up with their terminology... it does NOT answer my question.

Using the same equipment, these GIANNI tasks are taking 2x as long to run as the NATHAN tasks. Why? You are supposedly compiling with the same options, and both are CUDA42.

Can somebody with more than 15 posts of experience answer this question; someone who will read the question before attempting to provide a single-word answer? Thank you.
ID: 30193
Stefan
Project administrator
Project developer
Project tester
Project scientist

Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 30195 - Posted: 22 May 2013, 11:05:08 UTC - in response to Message 30193.  

Gianni's WUs may be slightly long for the short queue, but my recent tasks were definitely short for the long queue


Other than that, if you are asking why system A is slower than system B, then the answer is probably that system A contains more atoms / is more complicated, or the person sending the simulations asked for more simulation steps. Not all simulations are born equal :)
I think the new NATHANs are indeed long enough for the long queue.

I hope that answered your question. I just wanted to point out that the NATHANs you were talking about were not "standard" for the long queue.

If you suspect it's your hardware setup then I unfortunately cannot help you and maybe you should make a new thread about it for someone to help you with.

Cheers
ID: 30195

©2025 Universitat Pompeu Fabra