GTX 460

Author	Message
skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 18008 - Posted: 16 Jul 2010, 21:56:38 UTC - in response to Message 18005. Last modified: 16 Jul 2010, 21:58:35 UTC Although it does not work here yet, it still sounds like a very good card. The GF100s did not work straight out of the box either, so don't fret about it. Four times faster than a GT240, and using 160W! I have 4 GT240s in one system and they use 260W. The GT240 was as efficient as the bigger G200 cards, so these still look like a good buy. What does BOINC report its performance as? ID: 18008 · Rating: 0 · rate: / Reply Quote

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 18010 - Posted: 16 Jul 2010, 22:22:32 UTC - in response to Message 18008. Although it does not work here yet, it still sounds like a very good card. The GF100s did not work straight out of the box either, so don't fret about it. Four times faster than a GT240, and using 160W! I have 4 GT240s in one system and they use 260W. The GT240 was as efficient as the bigger G200 cards, so these still look like a good buy. What does BOINC report its performance as? > NVIDIA GPU 0: GeForce GTX 460 (driver version 25856, CUDA version 3010, compute capability 2.1, 738MB, 363 GFLOPS peak) Notice also that it's listed as compute capability 2.1. I think the other Fermis were 2.0. What's the difference? I just stuck the Kill-A-Watt on it and the total system draw is 227 watts running Collatz and 4 CPU projects at 100% with a Phenom 9600BE CPU. ID: 18010 · Rating: 0 · rate: / Reply Quote
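A minimal sketch of how the fields in that BOINC startup line can be read. The CUDA version is decoded with the usual 1000*major + 10*minor convention (so 3010 is CUDA 3.1); reading the driver number 25856 as 258.56 is an assumption based on how the number looks, and the function names here are hypothetical, not part of BOINC.

```python
import re

# Startup line as quoted in the post above (closing parenthesis added).
LINE = ("NVIDIA GPU 0: GeForce GTX 460 (driver version 25856, CUDA version 3010, "
        "compute capability 2.1, 738MB, 363 GFLOPS peak)")

PATTERN = re.compile(
    r"driver version (?P<driver>\d+), CUDA version (?P<cuda>\d+), "
    r"compute capability (?P<cc>\d+\.\d+), (?P<mem>\d+)MB, "
    r"(?P<gflops>\d+) GFLOPS peak"
)


def decode_cuda_version(raw):
    """Split a CUDA version integer (1000*major + 10*minor) into (major, minor)."""
    return raw // 1000, (raw % 1000) // 10


def decode_driver_version(raw):
    """Assumption: the driver integer is major*100 + minor, e.g. 25856 -> '258.56'."""
    return f"{raw // 100}.{raw % 100:02d}"


def parse_startup_line(line):
    """Extract driver, CUDA version, compute capability, memory and peak GFLOPS."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a BOINC GPU startup message")
    return {
        "driver": decode_driver_version(int(match["driver"])),
        "cuda": decode_cuda_version(int(match["cuda"])),
        "compute_capability": match["cc"],
        "memory_mb": int(match["mem"]),
        "peak_gflops": int(match["gflops"]),
    }


if __name__ == "__main__":
    print(parse_startup_line(LINE))
```

Read this way, "CUDA version 3010" means CUDA 3.1 and "CUDA version 3000" means CUDA 3.0.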

M J Harvey Send message Joined: 12 Feb 07 Posts: 9 Credit: 0 RAC: 0 Level Scientific publications	Message 18012 - Posted: 16 Jul 2010, 22:54:51 UTC - in response to Message 18010. Notice also it's listed as compute capability 2.1. I think the other Fermis were 2.0. What's the difference? Ok, that explains why the current app isn't working on the 460s. We can get that fixed pretty quickly. MJH ID: 18012 · Rating: 0 · rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 18013 - Posted: 16 Jul 2010, 22:54:53 UTC - in response to Message 18010. Last modified: 16 Jul 2010, 23:15:17 UTC Yeah, the GF100 Fermis are only CC 2.0. I'm guessing the GF104 cards will better exploit CUDA 3.1 and 3.2. That 363 GFLOPS is a joke though (a correction factor is called for there). It will do about 4 times what a GT240 does; a GT240 only has 96 shaders, whereas a GTX460 has 336 shaders and an improved architecture! It should be 907 (http://en.wikipedia.org/wiki/GeForce_400_Series). For comparison, my GT240: 16/07/2010 23:50:01 NVIDIA GPU 0: GeForce GT 240 (driver version 19621, CUDA version 3000, compute capability 1.2, 512MB, 307 GFLOPS peak) - OC'd a bit but nothing special. A system draw of 227 watts running Collatz and 4 CPU projects at 100% is good going! My Phenom II 940 with the 4 GT240s pulls 355W at the wall. You are doing about the same work for 36% less power. Even my i7-920 (with Turbo off) uses 300 to 320W with my GTX470 running GPUGrid. This is one of the failed GTX460 WUs: Name 347-KASHIF_HIVPR_auto_spawn_2_90_ba1-26-100-RND1903_1 Workunit 1704860 Created 16 Jul 2010 20:03:26 UTC Sent 16 Jul 2010 20:06:38 UTC Received 16 Jul 2010 20:13:36 UTC Server state Over Outcome Client error Client state Compute error Exit status -40 (0xffffffffffffffd8) Computer ID 67635 Report deadline 21 Jul 2010 20:06:38 UTC Run time 1.450003 CPU time 0.920406 stderr out <core_client_version>6.10.57</core_client_version> <![CDATA[ <message> - exit code -40 (0xffffffd8) </message> <stderr_txt> # Using device 0 # There is 1 device supporting CUDA # Device 0: "GeForce GTX 460" # Clock rate: 0.81 GHz # Total amount of global memory: 774307840 bytes # Number of multiprocessors: 7 # Number of cores: 56 SWAN : Module load result [.fastfill.cu.] 
[200] SWAN: FATAL : Module load failed The core-to-multiprocessor ratio is still out; it's 6 to 1 and should be 48 to 1: 7 multiprocessors and 7 groups of 48 shaders (3x16). ID: 18013 · Rating: 0 · rate: / Reply Quote
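The two peak figures in this post (363 and 907 GFLOPS) can be reproduced with the usual estimate of multiprocessors × cores per multiprocessor × 2 FLOPs per clock × clock rate. This is only a sketch: that BOINC used 32 cores per multiprocessor and the 0.81 GHz clock from the stderr output is an assumption that happens to match the reported 363, and the 1.35 GHz reference shader clock behind the 907 figure is not stated in the thread.

```python
def peak_gflops(multiprocessors, cores_per_mp, clock_ghz, flops_per_clock=2):
    """Theoretical single-precision peak: one multiply-add (2 FLOPs) per core per clock."""
    return multiprocessors * cores_per_mp * flops_per_clock * clock_ghz


# Assumed inputs that reproduce the under-reported figure:
# 7 multiprocessors, the GF100 value of 32 cores each, 0.81 GHz clock.
reported = peak_gflops(7, 32, 0.81)

# Assumed inputs for the corrected figure:
# 7 multiprocessors, 48 cores each (336 shaders), 1.35 GHz shader clock.
corrected = peak_gflops(7, 48, 1.35)

if __name__ == "__main__":
    print(f"under-reported: {reported:.0f} GFLOPS")
    print(f"corrected:      {corrected:.0f} GFLOPS")
```

The gap between the two numbers comes from both the core count per multiprocessor and the clock value used, so correcting the core count alone would not be enough to reach 907.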

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 18014 - Posted: 16 Jul 2010, 23:29:08 UTC - in response to Message 18012. Notice also it's listed as compute capability 2.1. I think the other Fermis were 2.0. What's the difference? Ok, that explains why the current app isn't working on the 460s. We can get that fixed pretty quickly. MJH I actually stumbled across something useful? What's the difference between 2.0 & 2.1? ID: 18014 · Rating: 0 · rate: / Reply Quote

M J Harvey Send message Joined: 12 Feb 07 Posts: 9 Credit: 0 RAC: 0 Level Scientific publications	Message 18015 - Posted: 17 Jul 2010, 1:03:53 UTC - in response to Message 18014. I actually stumbled across something useful? Yep, ta. What's the difference between 2.0 & 2.1? With CUDA 3.1 it looks like there's a gnat's crotchet of difference between code targeted for 2.0 and 2.1, but I expect it will change in future compiler releases. Although the ISA likely hasn't changed between the GF100 and GF104, with the latter being superscalar, instruction ordering is going to be much more important than on the GF100 and will mean more optimisation work in the compilation. The only reason the current app isn't working is that it doesn't know that the 2.0 Fermi kernels can be used on 2.1 devices. MJH PS Intriguingly, the compiler also accepts 2.2, 2.3 and 3.0 as valid compute capabilities. Make of that what you will. ID: 18015 · Rating: 0 · rate: / Reply Quote
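The failure MJH describes can be sketched as the difference between an exact-match lookup and a compatible-match lookup. This is an illustration only, not the actual GPUGrid application code: the list of available kernel builds and both function names are hypothetical. The fallback rule assumed here is that a kernel built for compute capability X.y can run on a device of capability X.z when z >= y.

```python
# Hypothetical set of kernel builds shipped with an application,
# as (major, minor) compute capability targets.
AVAILABLE_KERNELS = [(1, 1), (1, 3), (2, 0)]


def pick_kernel_strict(device_cc, available=AVAILABLE_KERNELS):
    """Exact match only: a new minor revision finds nothing and the load fails."""
    return device_cc if device_cc in available else None


def pick_kernel_compatible(device_cc, available=AVAILABLE_KERNELS):
    """Pick the newest build with the same major version and minor <= the device's."""
    major, minor = device_cc
    candidates = [(a, b) for (a, b) in available if a == major and b <= minor]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    gtx460 = (2, 1)
    print("strict:    ", pick_kernel_strict(gtx460))      # no 2.1 build available
    print("compatible:", pick_kernel_compatible(gtx460))  # falls back to the 2.0 build
```

With the compatible rule, a CC 2.1 card picks up the 2.0 build without a new set of kernels having to be compiled.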

MarkJ Volunteer moderator Volunteer tester Send message Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 Level Scientific publications	Message 18017 - Posted: 17 Jul 2010, 4:28:12 UTC - in response to Message 17991. Anyone care to drop the BOINC alpha mailing list a note that their number of multiprocessors is correct, but they have to multiply it by 32 for GF100 and by 48 for GF104 to get the correct number of shaders? MrS Have just done so. Also ordered one card (a 768 MB version, factory OC'ed). BOINC blog ID: 18017 · Rating: 0 · rate: / Reply Quote

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 Level Scientific publications	Message 18018 - Posted: 17 Jul 2010, 7:30:24 UTC - in response to Message 18017. We will try to give a new application for the 460 on Monday. gdf ID: 18018 · Rating: 0 · rate: / Reply Quote

bigtuna Volunteer moderator Send message Joined: 6 May 10 Posts: 80 Credit: 98,784,188 RAC: 0 Level Scientific publications	Message 18019 - Posted: 17 Jul 2010, 8:45:03 UTC Awesome! Can't wait to try it out. Not having any luck with the 460 on Folding. It is running 3dmark 06 right now so the drivers must be working, correct? ID: 18019 · Rating: 0 · rate: / Reply Quote

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Send message Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 Level Scientific publications	Message 18021 - Posted: 17 Jul 2010, 9:29:13 UTC @MarkJ: sorry, this incorrect reporting is an issue of GPU-Grid, not BOINC! SK reported this to them yesterday or so.. don't be surprised if they're a little p***** off. Just tell them it was my fault ;) @GDF: will the correct number of shaders be reported in the new Monday app? @GTX460 & Collatz: don't forget that CC is a comparatively light workload, i.e. the cards draw considerably less power here than under GPU-Grid (or Milkyway for ATIs). Anyway, 14:30 is a nice result! The 1 GB version may even improve this a bit, as CC loves memory bandwidth. For comparison: my HD4870 takes about 13 mins at 800 / 950 MHz core / mem. That's for 1.28 TFlops and 122 GB/s bandwidth. The GTX460 768 MB weighs in at 0.9 TFlops and 86 GB/s. Apparently we're seeing slightly better utilization of the nVidia shaders here. MrS Scanning for our furry friends since Jan 2002 ID: 18021 · Rating: 0 · rate: / Reply Quote
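The "slightly better utilization" remark can be checked with rough arithmetic: multiply each card's run time by its peak throughput to see how much theoretical capacity one work unit consumes. This is a crude sketch that assumes both cards processed work units of the same size and uses only the figures quoted in the post above; the dictionary layout and function name are made up for illustration.

```python
# Figures quoted in the post: run time per Collatz WU, peak TFLOPS, memory bandwidth.
CARDS = {
    "HD4870": {"minutes": 13.0, "tflops": 1.28, "gb_per_s": 122.0},
    "GTX460": {"minutes": 14.5, "tflops": 0.9, "gb_per_s": 86.0},
}


def peak_budget(card):
    """Return (TFLOPS-minutes, GB/s-minutes) of theoretical capacity spent per WU."""
    spec = CARDS[card]
    return spec["minutes"] * spec["tflops"], spec["minutes"] * spec["gb_per_s"]


hd_flops, hd_bw = peak_budget("HD4870")
gtx_flops, gtx_bw = peak_budget("GTX460")

# A ratio below 1 means the GTX 460 finishes a WU with less theoretical capacity.
flops_ratio = gtx_flops / hd_flops
bandwidth_ratio = gtx_bw / hd_bw

if __name__ == "__main__":
    print(f"peak-FLOPS budget ratio (GTX460 / HD4870): {flops_ratio:.2f}")
    print(f"bandwidth budget ratio  (GTX460 / HD4870): {bandwidth_ratio:.2f}")
```

On these numbers the GTX 460 spends roughly a fifth less of its theoretical compute and bandwidth per work unit, which is consistent with the observation in the post.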

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 18025 - Posted: 17 Jul 2010, 11:41:41 UTC - in response to Message 18021. Last modified: 17 Jul 2010, 11:51:22 UTC Not having any luck with the 460 on Folding. It is running 3dmark 06 right now so the drivers must be working, correct? The GTX 460 works very nicely in both Collatz & DNETC. @GTX460 & Collatz: don't forget that CC is a comparatively light workload, i.e. the cards draw considerably less power here than under GPU-Grid (or Milkyway for ATIs). Just checked DNETC with the Kill-A-Watt. The GTX 460 is now running at 800 core & 1600 shaders, stock voltage (a bump from the 763/1526 factory OC). 239 watts total system draw at 97% GPU. The core and shader speeds seem to be locked together, at least in MSI Afterburner v1.61, and unlike on earlier cards cannot be pushed separately. Not sure if this is a Fermi requirement or a problem in Afterburner. Not a big deal though. Anyway, 14:30 is a nice result! The 1 GB version may even improve this a bit, as CC loves memory bandwidth. For comparison: my HD4870 takes about 13 mins at 800 / 950 MHz core / mem. That's for 1.28 TFlops and 122 GB/s bandwidth. The GTX460 768 MB weighs in at 0.9 TFlops and 86 GB/s. Apparently we're seeing slightly better utilization of the nVidia shaders here. With half a day's results to average with the card set to 800/1600, Collatz is now averaging 13:40/WU and DNETC 21:50/WU. Temps still running at 53C. ID: 18025 · Rating: 0 · rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 18026 - Posted: 17 Jul 2010, 13:02:02 UTC - in response to Message 18025. Then Fermi cores and shaders are linked for both GF100 and GF104 cards alike; we have to overclock both at the same time. 53C - I wish! I'm running a GPUGrid WU that is only using 60% GPU (going by EVGA Precision) and my GTX470 is at 67°C. It would default to about 91°C if I did not up the fan speed. By the way, anyone with a GF100 Fermi should increase the fan speed while crunching. There are 2 reasons: 1. The card stays cooler and should last longer. 2. When it is cooler it uses less energy (10 to 20W) - GF100s are leaky! The power usage of running a fan faster is more than offset by the savings from leakage by the GPU. It will still leak, but not as much. Think of it like an offshore oil well cap. Don't use one and it leaks everywhere. Only use one a bit and it still leaks, badly. Use a good one the correct way, and you have stemmed the flow as much as you can. ID: 18026 · Rating: 0 · rate: / Reply Quote

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 18035 - Posted: 18 Jul 2010, 12:42:56 UTC - in response to Message 18025. Last modified: 18 Jul 2010, 12:53:06 UTC Anyway, 14:30 is a nice result! The 1 GB version may even improve this a bit, as CC loves memory bandwidth. For comparison: my HD4870 takes about 13 mins at 800 / 950 MHz core / mem. That's for 1.28 TFlops and 122 GB/s bandwidth. The GTX460 768 MB weighs in at 0.9 TFlops and 86 GB/s. Apparently we're seeing slightly better utilization of the nVidia shaders here. With half a day's results to average with the card set to 800/1600, Collatz is now averaging 13:40/WU and DNETC 21:50/WU. Temps still running at 53C. Since I use the optimized Collatz app on my ATI cards, I decided to give it a try on the GTX 460. After finding the optimum settings, it has now averaged just under 10:58 for the last 20 WUs (with a range of 10:38 - 11:20). GPU use has increased to 99% and temps to 55C - 57C depending on ambient (we're in the middle of a heat wave for us in Minnesota); the fan bumps up a bit to 44% at 57C. Core/shaders still at 800/1600. Memory is stock (for the Superclocked version) at 1900 according to Afterburner. Total system draw has increased to 247 watts with 4 CPU projects running at 100%. The PSU is an Antec EarthWatts 380 watt, which is considerably less than recommended. ID: 18035 · Rating: 0 · rate: / Reply Quote
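A rough way to compare the two Collatz setups is energy per work unit at the wall: system draw multiplied by run time. This is only a sketch under assumptions: it pairs the earlier 227 W reading with the 14:30 run time (the thread does not say they were measured at the same settings), and the wall draw also includes the 4 CPU projects, so the absolute numbers overstate what the GPU alone uses.

```python
def wh_per_wu(system_watts, minutes, seconds):
    """Energy drawn at the wall during one work unit, in watt-hours."""
    runtime_s = minutes * 60 + seconds
    return system_watts * runtime_s / 3600.0


# Assumed pairing: earlier stock-app figures from the thread.
stock_app = wh_per_wu(227, 14, 30)

# Figures from the post above: optimized app, 247 W, 10:58 per WU.
optimized_app = wh_per_wu(247, 10, 58)

if __name__ == "__main__":
    print(f"stock app:     {stock_app:.1f} Wh per WU")
    print(f"optimized app: {optimized_app:.1f} Wh per WU")
```

Under these assumptions the optimized app draws more power but uses less energy per work unit (about 45 Wh against about 55 Wh), because the shorter run time more than offsets the extra 20 W.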

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 18038 - Posted: 18 Jul 2010, 18:01:10 UTC - in response to Message 18035. Last modified: 18 Jul 2010, 18:03:44 UTC Keep the memory at stock. If you overclock the memory on the GTX460 it will end up slowing the card down due to increased errors (the controller is almost at its max as is, which is why they used RAM rated for a 4000 MHz maximum rather than 5000 MHz)! Your Antec EarthWatts 380 is a good PSU with enough amps. I have one supporting a Q6600 and an OC'd GTX260-216, which uses more power than your GTX460. I ran a Folding@home task again today, while running GPUGrid (on my GTX470), and the temps did not go up from 72°C (GPU fan at 83%, mind you). I think most tools report the actual GPU usage of GPUGrid incorrectly (or at least give a skewed figure). EVGA Precision reported GPUGrid as only using 60% GPU, but the temp was just the same, 72°C. When I suspend tasks it drops below 40°C. When running Folding it reported 99% GPU usage (as with your 460 running Collatz), but the temp was just the same, 72°C. ID: 18038 · Rating: 0 · rate: / Reply Quote

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Send message Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 Level Scientific publications	Message 18041 - Posted: 18 Jul 2010, 20:14:34 UTC - in response to Message 18026. By the way, anyone with a GF100 Fermi should increase the fan speed while crunching. There are 2 reasons: 1. The card stays cooler and should last longer. 2. When it is cooler it uses less energy (10 to 20W) - GF100s are leaky! The power usage of running a fan faster is more than offset by the savings from leakage by the GPU. It will still leak, but not as much. Completely agreed! Think of it like an offshore oil well cap. Don't use one and it leaks everywhere. Only use one a bit and it still leaks, badly. Use a good one the correct way, and you have stemmed the flow as much as you can. I'd rather put it this way: temperature is equivalent to movement of particles, including the atoms (which should have fixed positions in your chip) and the free electrons. If it's hotter, the latter are kicked around randomly more often and thus sometimes go where they shouldn't - and that's your leakage. MrS Scanning for our furry friends since Jan 2002 ID: 18041 · Rating: 0 · rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 2 Level Scientific publications	Message 18042 - Posted: 19 Jul 2010, 8:25:27 UTC - in response to Message 18021. @MarkJ: sorry, this incorrect reporting is an issue of GPU-Grid, not BOINC! SK reported this to them yesterday or so.. don't be surprised if they're a little p***** off. Just tell them it was my fault ;) Report-back from boinc_alpha mailing list: Tell it to NVIDIA; they don't provide an API for getting the number of cores. I've asked NVIDIA to confirm that all CC 2.1 chips have 48 cores/MP; if this is the case I'll add that logic to the client. Note: this matters only for providing an accurate peak FLOPS message on startup. Everything related to scheduling and credit is determined by the actual performance, not the peak FLOPS. -- David On 17-Jul-2010 2:11 AM, Richard Haselgrove wrote: > The 32x part of that is already hard-coded into the BOINC client, as a fix > for the original GF100-based Fermi cards (GTX 470 and GTX 480) - the > original hard-coded value of 8 for compute capability 1.x cards produced > under-reporting as well. (Changesets 21034, 21036) > > Cards based on the GF104 chip - the GTX 460 released five days ago, and the > GTX 475 due later in Q3, have a shader count of 48 per multiprocessor, so > need yet another hard-coded CC 2.1 test. > > Surely this is a prehistoric way of doing things? We shouldn't have to > change the infrastructure framework every time a new chip is released. > Shader count detection belongs in the NVidia driver and API, not at the > application level. > > Is there any way BOINC itself, and the projects directly affected, can join > together and make representations to NVidia to get their API extended? ID: 18042 · Rating: 0 · rate: / Reply Quote
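The hard-coded logic discussed in this exchange can be sketched as a lookup from compute capability to cores per multiprocessor, using only the three values named in the thread (8 for CC 1.x, 32 for CC 2.0, 48 for CC 2.1). This is an illustration, not the BOINC client's actual code, and the function names are hypothetical. Unknown capabilities return None rather than a guess, which is exactly the maintenance burden Richard objects to.

```python
def cores_per_mp(major, minor):
    """Cores per multiprocessor for the compute capabilities named in the thread."""
    if major == 1:
        return 8          # all CC 1.x cards
    if (major, minor) == (2, 0):
        return 32         # GF100-based Fermi (GTX 470, GTX 480)
    if (major, minor) == (2, 1):
        return 48         # GF104-based cards such as the GTX 460
    return None           # unknown chip: needs yet another hard-coded entry


def shader_count(multiprocessors, major, minor):
    """Total shaders, or None when the compute capability is not in the table."""
    per_mp = cores_per_mp(major, minor)
    return None if per_mp is None else multiprocessors * per_mp


if __name__ == "__main__":
    # GTX 460: 7 multiprocessors at CC 2.1.
    print("correct count:", shader_count(7, 2, 1))
    # Applying the CC 1.x value to the same card gives the 56 cores
    # seen in the failed work unit's stderr output.
    print("with 8 per MP:", shader_count(7, 1, 0))
```

Every new chip family means another branch in a table like this, since, per David's reply, the number of cores cannot be queried through an NVIDIA API.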

trn-xs Send message Joined: 12 Feb 10 Posts: 8 Credit: 17,551,984 RAC: 0 Level Scientific publications	Message 18048 - Posted: 19 Jul 2010, 12:36:39 UTC Last modified: 19 Jul 2010, 12:37:20 UTC SLI stock 460s; currently running DNETC while waiting for GPUGrid. CUDA 3.1 WUs complete in 13 min (Win7 x64). ID: 18048 · Rating: 0 · rate: / Reply Quote

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 Level Scientific publications	Message 18053 - Posted: 19 Jul 2010, 14:07:41 UTC - in response to Message 18048. Now it runs on gtx460 with acemdbeta 6.36. gdf ID: 18053 · Rating: 0 · rate: / Reply Quote

[TiDC] Revenarius Send message Joined: 19 Aug 09 Posts: 4 Credit: 4,582,561 RAC: 0 Level Scientific publications	Message 18065 - Posted: 19 Jul 2010, 19:41:36 UTC First unit OK on a GTX 460: http://www.gpugrid.net/result.php?resultid=2692716 completed in less than 946 s. The new version is running OK. ID: 18065 · Rating: 0 · rate: / Reply Quote

bigtuna Volunteer moderator Send message Joined: 6 May 10 Posts: 80 Credit: 98,784,188 RAC: 0 Level Scientific publications	Message 18066 - Posted: 19 Jul 2010, 20:17:00 UTC - in response to Message 18053. Now it runs on gtx460 with acemdbeta 6.36. gdf How does one run "acemdbeta 6.36"? I downloaded some work units and they all failed. They said something about CUDA 30? Does that mean I am running CUDA 3.0? How can you tell? ID: 18066 · Rating: 0 · rate: / Reply Quote