
Message boards : Graphics cards (GPUs) : Maxwell now

flashawk
Send message
Joined: 18 Jun 12
Posts: 297
Credit: 3,550,610,997
RAC: 743,480
Level
Arg
Scientific publications
watwatwatwatwatwatwatwat
Message 34732 - Posted: 19 Jan 2014 | 20:25:30 UTC

There's a rumor going around that Maxwell is coming out next month. I wonder if this was planned, or if AMD's sales are hurting them?

Jim1348
Send message
Joined: 28 Jul 12
Posts: 697
Credit: 1,371,999,968
RAC: 25
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 34736 - Posted: 20 Jan 2014 | 2:33:11 UTC - in response to Message 34732.

It looks more like a delaying action to hold off AMD until the 20 nm process arrives, probably later than they had originally hoped. A GTX 750 Ti won't set the world on fire in performance, and won't make them a ton of money. But it gives them a chance to see how well the design works in practice, and to give the software developers a head start before the real Maxwell arrives.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 34807 - Posted: 24 Jan 2014 | 22:21:47 UTC

It's likely just a false rumor. No proof has been shown that these cards use Maxwell chips, even though fairly complete benchmarks have already appeared. It's probably just GK106 with 768 shaders.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 34823 - Posted: 26 Jan 2014 | 11:21:18 UTC - in response to Message 34807.
Last modified: 26 Jan 2014 | 11:33:19 UTC

Producing a Maxwell on a 28nm process would be a complete change of direction for NVidia, so I agree this is likely to be a false rumor. There are two revision models (Rev. 2) of GPUs in the GF600 lineup (the GT630 and GT640), so perhaps NVidia wants to fill out its GF700 range with a lower-end card; if there is a Rev. 2 version of the GTX650Ti en route, it makes more sense to shift it to the GF700 range.

The idea of constructing a Maxwell on 28nm does make a lot of sense, however; GM could be tested directly against GK, and they could produce more competitive entry- to mid-range cards earlier. Small cards are the largest part of the GPU market, so why produce a big, immature card first? As GMs will have a built-in CPU (of sorts), it would be better to test these (and their usefulness/scalability) on smaller cards first - no point producing a fat GM with an insufficient CPU to support it.

I've always wondered why they produced the larger cards first. It's just been a flag-waving exercise IMO. While that might be marketable, it makes no sense when dealing with savvy buyers and other businesses (OEMs), especially for supercomputers.

NVidia could also have produced GF cards at 28nm, and they would have had a market. Perhaps they did, just for engineering, design and testing purposes, and managed to keep those chips completely in-house. While such designs might have been/will be marketable, from a business point of view they would primarily be competing against other NVidia products - probably a bad thing - better to focus your development skill set on one controllable, forward-looking objective rather than on tweaking.

The eventual 40% reduction in die size will probably facilitate cooler GPUs. In the main, GK temperatures are significantly less of an issue than GF temps, but with several high-end cards in one system it's still a problem. So while NVidia doesn't have temperature licked now, it should fall into place at 20nm.

In the meantime, entry- to mid-range 28nm cards are easy to produce and easy to cool. 28nm Maxwells might be easier to work with now and for early 20nm products. When moving to 20nm, yields will inevitably be low (they always are), so it would make sense to start at the small end, where you are actually going to get enough samples to test with and enough product to release. The lesser bins tend to go to OEMs anyway, so it might be better to start there and get a product out which will compete with AMD's and Intel's integrated GPU processors ASAP. Let's face it, this is where the competition is highest and NVidia is weakest. So the first 28nm Maxwells could well be for laptops and other mobile devices. ARM can already be used to support an OS, so IMO it's inevitable that ARM will bolster their CPU with an NVidia GPU. That's what the market really wants; sufficient CPU processing power to start up and run an OS and a high end GPU for the video-interface, gaming... isn't it?
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Dagorath
Send message
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 34825 - Posted: 26 Jan 2014 | 16:21:02 UTC - in response to Message 34823.
Last modified: 26 Jan 2014 | 16:23:00 UTC

ARM can already be used to support an OS, so IMO it's inevitable that ARM will bolster their CPU with an NVidia GPU. That's what the market really wants; sufficient CPU processing power to start up and run an OS and a high end GPU for the video-interface, gaming... isn't it?


I guess a fairly large part of the market wants that. I would be happy with a motherboard with a BIOS that can boot from PXE, no SuperIO (USB, RS-232, parallel port, PS/2), an I2C bus for onboard temperature sensors, no IDE or SATA (no disks), just lots of RAM, an RJ-45 connector and gigabit ethernet, no wifi, and enough CPU processing power to start up and run a minimal OS that has a good terminal, SSH, and can run the BOINC client and project apps. I don't need a desktop or anything to do with a GUI, no TWAIN or printer drivers/daemons, no PnP or printer service, no extra fonts (just a decent terminal and 1 font), only the required network services; Python or some other scripting language would be nice, but not much more.

If they could fit all that onto a slightly larger video card I'd be happy; otherwise put it on a 2" x 5" board with a PCIe slot and power connectors and call it a cruncher. Something so no-frills that IKEA would consider stocking it.

What else would be unnecessary... no RTC (get the time off the LAN), no sound, no disk activity LED.
____________
BOINC <<--- credit whores, pedants, alien hunters

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35023 - Posted: 13 Feb 2014 | 20:09:11 UTC

Seems like the cat is out of the bag.. and we were all wrong, as usual for a new generation ;)
It's still not official, but far more solid than any rumors before this:

- at least 1, probably 2 small chips in 28 nm soon
- the bigger ones later in 20 nm
- architectural efficiency improvements
- much larger L2 cache
- the compute to texture ratio increases from 12:1 to 16:1 (like AMD's)
- the SMX goes down to 128 shaders (192 in Kepler)
-> that could mean they're going back to non-superscalar (i.e. just scalar)

If the latter is true, this could mean significant per-clock, per-shader performance improvements here and in many other BOINC projects :)

MrS
____________
Scanning for our furry friends since Jan 2002

Dagorath
Send message
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35024 - Posted: 13 Feb 2014 | 23:05:14 UTC - in response to Message 35023.

Sounds like a big performance per watt increase will be coming too. I think I'll put planned purchases on hold, build savings and see what the picture looks like 4 months from now.
____________
BOINC <<--- credit whores, pedants, alien hunters

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35029 - Posted: 14 Feb 2014 | 8:12:58 UTC - in response to Message 35024.

That's not what nVidia would like you to do.. but I agree ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35030 - Posted: 14 Feb 2014 | 10:11:16 UTC - in response to Message 35024.
Last modified: 14 Feb 2014 | 11:48:48 UTC

For GPUGrid, performance touting is premature - we don't even know if it will work with the current app. It could take 6 months of development and debugging. It took ages before the Titans worked.

As the GTX750Ti will only have 640 CUDA cores, the 128-bit bus probably won't be an issue. The CUDA-core-to-bus-lane ratio is about the same as a GTX670's (see the quick comparison below). However, the 670 is super-scalar, and the GTX480 had a 384-bit bus. Suggesting a 60W GTX750Ti will be slightly faster than a GTX480 still sounds unrealistic, but assuming the non-super-scalar CUDA cores aren't 'semi-skimmed' it might be powerful enough. I suspect they will not be 'full fat' in the way GF110 was, and there could be additional bottlenecks, driver bugs... So it's wait and see.
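
For what it's worth, a quick back-of-the-envelope check of that cores-per-bus-lane ratio; the core counts and bus widths below are assumed from public spec sheets, not taken from this thread.

// Cores per memory-bus lane for the cards mentioned above.
#include <cstdio>

int main() {
    struct Card { const char* name; int cores; int bus_bits; };
    const Card cards[] = {
        {"GTX 750 Ti", 640, 128},
        {"GTX 670",   1344, 256},
        {"GTX 480",    480, 384},
    };
    for (const Card& c : cards)
        printf("%-10s %4d cores / %3d-bit bus = %.2f cores per lane\n",
               c.name, c.cores, c.bus_bits, (double)c.cores / c.bus_bits);
    return 0;
}

That gives 5.0 for the GTX750Ti versus 5.25 for the GTX670, and only 1.25 for the GTX480.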

Having a 6-pin power connector means the GTX750Ti could be powered directly from the PSU, rather than through the motherboard. This is good if you want to use the card on a riser in, say, a 3rd slot (which might not actually be capable of supplying 75W).

Avoid cards with small fans - they don't last.

I still say 'stay clear of the GTX750' if it's only got 1GB GDDR5.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 35038 - Posted: 14 Feb 2014 | 13:53:07 UTC

I still feel like I'm stuck between a rock and a hard place. Haswell-E will have an 8-core variant in Q3, so that is definitely going to be bought. However, I would like this to be my last system build for more than a year, as pumping in 5k annually is something I cannot keep doing. Every other year, sure.

But with Volta and its stacked DRAM... I'm very cautious about dropping 1.8k+ on GPUs that most likely won't be that large of a change. We'll see, I suppose.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35097 - Posted: 16 Feb 2014 | 14:24:22 UTC - in response to Message 35038.

But with Volta and its stacked DRAM... I'm very cautious about dropping 1.8k+ on GPUs that most likely won't be that large of a change. We'll see, I suppose.

Volta will still take some time, as GPUs have matured quite a bit (compared to the wild early days of a new chip every 6 months!) and progress is generally slower. That's actually not so bad, because we can keep GPUs longer and the software guys have some time to actually think about using those beasts properly.

If you still have Fermis or older running, get rid of them while you can still find (casual) gamers willing to pay something for them. If you're thinking about upgrading from Kepler to Maxwell and don't want to spend too much, I propose the following: replace 2 Keplers with 1 Maxwell for about the same throughput, which should hopefully be possible with 20 nm and the architectural improvements. This way you don't have to spend as much, and you reduce power usage significantly (further savings). Your throughput won't increase, but so what? If you feel like spending again you could always add another GPU.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35103 - Posted: 17 Feb 2014 | 18:48:21 UTC - in response to Message 35023.

There is no ARM CPU on the block diagram of the GM107.


After reading the article it seems to me that this is only a half step towards the new generation: it has a better performance/watt ratio because of the evolution of the 28nm process and because of the architectural changes (probably these two aspects are bound together: this architecture can achieve a higher CUDA core/chip area ratio than the GK architecture).
As its performance is expected to be like the GTX480's performance, perhaps there is no need for an on-chip CPU to fully utilize this GPU.
Also, it's possible that there is no need for big changes in the GPUGrid application to work with this GPU.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35104 - Posted: 17 Feb 2014 | 21:13:20 UTC - in response to Message 35103.

As far as I remember, this "ARM on chip" story was still a complete rumor. It could well be that someone mistook material about future nVidia server chips with GPUs (Project Denver) for information about the regular GPUs.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35114 - Posted: 18 Feb 2014 | 13:38:14 UTC - in response to Message 35104.
Last modified: 18 Feb 2014 | 13:42:48 UTC

I would be a bit concerned about the 1306 GFlops rating for the GTX750Ti. That's actually below the GTX650Ti (1420). The 750Ti also has a 128-bit bus and bandwidth of 86.4GB/s. While the theoretical SP GFLOPS/W is 21.8, it's still an entry-level card; it would take 4 of these cards to have the overall performance of a GTX780Ti. There should be plenty of OC models and potential for these GPUs to boost further. (A quick sketch of where those GFLOPS numbers come from follows below.)
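
For reference, here is where the quoted GFLOPS figures come from: theoretical single-precision throughput is 2 FLOPs (one fused multiply-add) per CUDA core per clock. A rough sketch; the core counts, clocks and TDPs are assumed from public spec sheets, not from this thread.

// Theoretical SP GFLOPS and GFLOPS/W for the two cards being compared.
#include <cstdio>

int main() {
    struct Card { const char* name; int cores; double clock_ghz; double tdp_w; };
    const Card cards[] = {
        {"GTX 750 Ti", 640, 1.020, 60.0},    // base clock; boost is higher
        {"GTX 650 Ti", 768, 0.925, 110.0},
    };
    for (const Card& c : cards) {
        double gflops = 2.0 * c.cores * c.clock_ghz;   // 2 FLOPs per core per clock
        printf("%-10s %6.0f GFLOPS SP, %5.1f GFLOPS/W\n",
               c.name, gflops, gflops / c.tdp_w);
    }
    return 0;
}

This reproduces the ~1306 and ~1420 GFLOPS ratings and the ~21.8 GFLOPS/W figure quoted above.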

There may also be a 1GB version of the GTX750Ti (avoid).

My confusion over ARM came from fudge reports which presumed Maxwell and ARM are joined at the hip. Just because Teslas might get an ARM this decade does not mean any other card will. It hasn't even been announced that Maxwell-based Teslas will - it has just been interpreted that way.
The use of ARM doesn't require the Maxwell architecture; the Tegra K1 is based on Kepler and uses a quad-core ARM Cortex-A15 R3, and previous Tegras also used ARM.
It is the case that NVidia want to do more on the discrete GPU and be less reliant on the underlying system, but that doesn't in itself require an ARM processor.
The only really interesting change is that the Shader Model is 5.0 - so it's CC5.0. This non-super-scalar architecture probably helped throw people into thinking that these GPUs would come with ARM processors, but when you think about it, there is no sense putting a discrete processor onto an entry-level GPU. A potential obstacle to the use of ARM might be Windows licences, as these typically limit your software use to 2 CPUs (which makes a second card a no-no).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35128 - Posted: 18 Feb 2014 | 20:48:02 UTC - in response to Message 35114.
Last modified: 18 Feb 2014 | 20:48:31 UTC

I see EVGA are selling a GTX750Ti with a 1268MHz Boost. In theory that's 16.8% faster than the reference model, though I would expect the reference card to boost higher than the quoted 1085MHz (if it works)!
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 35139 - Posted: 19 Feb 2014 | 1:00:46 UTC

I have some GTX750Tis on order; should have them in my hands next week.
It's not yet clear whether we'll need to issue a new application build.
Stay tuned!

Matt

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35147 - Posted: 19 Feb 2014 | 14:02:13 UTC - in response to Message 35139.
Last modified: 19 Feb 2014 | 14:03:26 UTC

I read that the 128-bit bus is a bottleneck, but as the card uses 6GHz GDDR5 a 10% OC is a given. The GPU also OCs well (as the temps are low). So these cards could be tweaked to be significantly more competitive than the reference model.

Compute is a bit mixed going by Anandtech, so it's wait and see about the performance (if they work):

http://anandtech.com/show/7764/the-nvidia-geforce-gtx-750-ti-and-gtx-750-review-maxwell/22
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35157 - Posted: 19 Feb 2014 | 20:51:05 UTC

Don't be fooled by the comparatively low maximum Flops. We got many of those with Kepler, and complained initially that we couldn't make proper use of them, as the performance per shader per clock was significantly below the non-superscalar Fermis. Now we're going non-superscalar again and gain some efficiency through that, as well as through other tweaks.

And this shows in the compute benchmarks at Anandtech: the GTX750Ti beats the GTX650Ti easily and consistently, often hangs with the GTX650Ti Boost and GTX660, and sometimes performs more than twice as fast as the GTX660! None of those benchmarks is GPU-Grid, but this bodes well for Maxwell here, since GPU-Grid never really liked the super-scalarity all that much. Let's wait for Matt's test.. but I expect Maxwell to do pretty well.

The 128-bit memory bus on GM107 is somewhat limiting, but mitigated by the far larger L2 cache. To what extent for GPU-Grid.. I don't know. And those chips seem to clock ridiculously high. I've seen up to almost 1.3 GHz at stock voltage (1.13 - 1.17 V). I wish the testers had lowered the voltage to see what the chips really can do, instead of being limited by the software sliders. The bigger chips naturally won't clock as well, but 20 nm should shake things up anyway.

Bottom line: don't rush to buy those cards, since they're only mainstream models after all. But don't buy any other cards for GPU-Grid until we know how good Maxwell really is over here.

MrS
____________
Scanning for our furry friends since Jan 2002

Dagorath
Send message
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35158 - Posted: 19 Feb 2014 | 21:23:05 UTC - in response to Message 35157.

OK, no purchases, but I would rather have a professional or a Ph.D. test the pretend Maxwells so we can be sure of what we're looking at ;-)

____________
BOINC <<--- credit whores, pedants, alien hunters

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35159 - Posted: 19 Feb 2014 | 22:02:53 UTC - in response to Message 35158.

Professional enough? Or shall I search for a review written by someone with a PhD in ancient Greek history? ;)

MrS
____________
Scanning for our furry friends since Jan 2002

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35162 - Posted: 19 Feb 2014 | 22:33:29 UTC

EVGA (Europe) just announced they have the Titan Black and the GTX 750 and 750Ti for sale. The latter for 150 euro, really cheap with 2GB. However, they are not in stock so can't be ordered yet - not that I will.
____________
Greetings from TJ

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35165 - Posted: 20 Feb 2014 | 0:12:35 UTC - in response to Message 35162.

The GPU memory latency is supposedly better, so the GTX750Ti's memory bandwidth bottleneck might not be as bad as I first suspected. That said, compute performance is a bit 'all over the place'. It's definitely a wait-and-see situation for here.

My concerns for these cards are first and foremost about compatibility: will they work straight out of the box, will an app revamp be required, will we have to wait on drivers, or might they never be compatible (time, money and effort spent developing for an entry-level GPU might well be better spent elsewhere)?

If the apps work, the memory bandwidth may or may not be an issue, but the performance/Watt should be very good nonetheless. Some of the Folding benchmarks are promising, so if they are not up to much here, they are good over there (SP), and possibly for a Boinc project or two.

I get the distinct feeling that the 20nm Maxwell cards will bring exceptional performance here, when they turn up. They won't all have memory bottlenecks, and performance/Watt is likely to be much better than with the 28nm versions (which are already great). I think it's really a good time to watch this space, and to start thinking about and preparing for future purchases; sell on existing hardware while the value is still good!
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1919
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 35171 - Posted: 20 Feb 2014 | 12:52:48 UTC - in response to Message 35165.

Some more info on Maxwell.

http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce-GTX-750-Ti-Whitepaper.pdf

gdf

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35173 - Posted: 20 Feb 2014 | 14:05:00 UTC - in response to Message 35171.

NVidia are comparing the GTX750Ti to a three-generations-old GTX480 for performance and to a GT640 for power usage, but not to a GTX650Ti! For some games it's roughly equal to a GTX480, and in terms of performance/Watt the GTX750Ti is 1.7 times better than a GT640 (and similar cards). While it is a GM107 product, the name suggests it's an upgrade to the GTX650Ti.

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile Coleslaw
Send message
Joined: 24 Jul 08
Posts: 35
Credit: 225,538,586
RAC: 64,238
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35236 - Posted: 22 Feb 2014 | 22:52:05 UTC
Last modified: 22 Feb 2014 | 22:52:54 UTC

One of my teammates has a couple of 750Tis and they keep failing here. They are running fine at Einstein. I have encouraged our team members who have the new Maxwell cards to post in here.

http://www.gpugrid.net/show_host_detail.php?hostid=167781
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35237 - Posted: 22 Feb 2014 | 23:07:56 UTC - in response to Message 35236.
Last modified: 22 Feb 2014 | 23:10:50 UTC

Thanks. That is important information to share!

Both long- and short-run tasks are failing. All present task types are failing. Tasks fail quickly, in 0 to 4 seconds.

This strongly suggests that the present app will need to be modified to support these GPUs. It's likely that the different architecture (non-super-scalar...) is the reason for the GPUGrid failures on the CUDA 5.5 apps. At Einstein they use a CUDA 3.x app IIRC - a much less refined CUDA version; more tolerant, but slower.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ChelseaOilman
Send message
Joined: 6 Jan 14
Posts: 2
Credit: 1,103,657,775
RAC: 6,991
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwat
Message 35238 - Posted: 22 Feb 2014 | 23:33:16 UTC

I think I might be the one Coleslaw is referring to. I installed a pair of 750Ti cards in my computer yesterday and tried to run GPUGRID. No go, instant fail within a couple of seconds. Einstein seems to run just fine. I bought these cards to run GPUGRID and I'm not too happy that they can't. I'm not that interested in running Einstein, and other than F@H there isn't much else out there for GPUs. I refuse to participate in F@H anymore. If you need any info feel free to ask.

Profile Coleslaw
Send message
Joined: 24 Jul 08
Posts: 35
Credit: 225,538,586
RAC: 64,238
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35239 - Posted: 23 Feb 2014 | 0:03:29 UTC - in response to Message 35238.

I think I might be the one Coleslaw is referring to. I installed a pair of 750Ti cards in my computer yesterday and tried to run GPUGRID. No go, instant fail within a couple of seconds. Einstein seems to run just fine. I bought these cards to run GPUGRID and I'm not too happy that they can't. I'm not that interested in running Einstein, and other than F@H there isn't much else out there for GPUs. I refuse to participate in F@H anymore. If you need any info feel free to ask.


Yes you are. Thanks for volunteering. :)

Gilthanis
____________

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 35240 - Posted: 23 Feb 2014 | 0:37:05 UTC - in response to Message 35237.


This strongly suggests that the present app would need to be modified to support these GPU's.


That's pretty annoying - it likely means that we'll not be able to use them until CUDA 6 goes public.

Matt

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35241 - Posted: 23 Feb 2014 | 1:07:07 UTC - in response to Message 35240.

Sorry for being annoying.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 35270 - Posted: 23 Feb 2014 | 13:47:49 UTC - in response to Message 35238.

I bought these cards to run GPUGRID and I'm not too happy that they can't.

Well, it's a new architecture (or more precisely: significantly tweaked and rebalanced) so some "unexpected problems" can almost be expected. Be a bit patient, I'm sure this can be fixed. Maxwell is the new architecture for all upcoming nVidia chips in the next 1 - 2 years, after all.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 35272 - Posted: 23 Feb 2014 | 14:14:01 UTC - in response to Message 35270.


I bought these cards to run GPUGRID and I'm not too happy that they can't.


Don't worry, support is coming just as soon as possible. These new cards are very exciting for us! Unfortunately, because of the way we build our application, we need to wait for the next version of CUDA which contains explicit Maxwell support.
The other GPU-using projects don't have this limitation because they build their applications in a different way. The other side of that is that they aren't able to make GPU-specific optimisations the way we do.

Matt

ChelseaOilman
Send message
Joined: 6 Jan 14
Posts: 2
Credit: 1,103,657,775
RAC: 6,991
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwat
Message 35275 - Posted: 23 Feb 2014 | 14:50:43 UTC - in response to Message 35272.

Don't worry, support is coming just as soon as possible. These new cards are very exciting for us! Unfortunately, because of the way we build our application, we need to wait for the next version of CUDA which contains explicit Maxwell support.

I hope Nvidia comes out with the new version of CUDA soon. I expect GPUs that use much less electricity will become very popular quickly. I'll be switching back to GPUGRID when you get the Maxwell compatible client out.

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37914 - Posted: 15 Sep 2014 | 13:07:42 UTC

The new Haswell-E 6-core is available in the Netherlands, but pricey. Any idea when the real Maxwell will be launched? I read "soon" in some articles on the net, but did not find any date.
____________
Greetings from TJ

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37915 - Posted: 15 Sep 2014 | 13:58:15 UTC - in response to Message 37914.
Last modified: 15 Sep 2014 | 14:03:28 UTC

The new Haswell-E 6-core is available in the Netherlands, but pricey. Any idea when the real Maxwell will be launched? I read "soon" in some articles on the net, but did not find any date.


Rumor has it the GTX980 (or whatever the board will be called) will be showcased (or released) at NVidia's Game24 event on September 18th, along with a 343-branch driver. The GTX 970/960 could be released by early/mid October. Leaked benchmarks (if they're not fake) show GM204 Maxwell to be at reference GTX780Ti (5 teraFLOPS) performance levels with a lower TDP. Maxwell's integer/AES-256/TMU/ROP performance is higher than Kepler's per core. The GTX 980 will have a 256-bit memory interface. Float (double/single) will be similar to the GK110 cards with disabled DP cores (GTX780/780Ti). A Titan with the 64-DP-core SMX enabled for double precision tasks won't be replaced until another Maxwell stack is created for the Titan's market position. A dual Maxwell board with 11/12 single-precision teraFLOPS and 3/4 teraFLOPS for double would be an ultimate board.

Jozef J
Send message
Joined: 7 Jun 12
Posts: 112
Credit: 1,035,582,756
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 37919 - Posted: 16 Sep 2014 | 14:35:32 UTC



So here the question is deciding which card would be best for GPUGrid.
The boost clock for the GTX 980 looks very good, even the 64 ROPs.. but the 780Ti has 2880 CUDA cores...?

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37920 - Posted: 16 Sep 2014 | 14:51:45 UTC - in response to Message 37919.
Last modified: 16 Sep 2014 | 14:54:53 UTC

Thanks for this Jozef J.
No hurry for me now to build a new rig as this is still not the "real" Maxwell with the 20nm chip.
Despite using more energy, the 780Ti is still the best card in my opinion.

PS: I have a factory-overclocked EVGA 780Ti and it runs at 1137MHz.
____________
Greetings from TJ

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37921 - Posted: 16 Sep 2014 | 15:02:19 UTC - in response to Message 37919.
Last modified: 16 Sep 2014 | 15:02:50 UTC

So here the question is deciding which card would be best for GPUGrid.
The boost clock for the GTX 980 looks very good, even the 64 ROPs.. but the 780Ti has 2880 CUDA cores...?

The GTX 780Ti is superscalar, so not all of its 2880 CUDA cores can be utilized by the GPUGrid client. The actual number of utilized CUDA cores on the GTX 780Ti is somewhere between 1920 and 2880 (most likely near the lower end). This could be different for each workunit batch. If they really manufacture the GM204 on 28nm lithography, then this is only a half step towards a new GPU generation. The performance per watt ratio of the new GPUs will be slightly better, and (if the data in this chart are correct) I expect the GTX980 could be 15~25% faster than the GTX780Ti (here at GPUGrid). When we have the real GPUGrid performance of the GTX980, we'll know how many of the 2880 CUDA cores of the GTX780Ti are actually utilized by the GPUGrid client. But as NVidia chose to move back to a scalar architecture, I expect that the superscalar architecture of the Keplers (and the later Fermis) wasn't as successful as expected.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37923 - Posted: 16 Sep 2014 | 15:53:33 UTC - in response to Message 37921.

So here the question is deciding which card would be best for GPUGrid.
The boost clock for the GTX 980 looks very good, even the 64 ROPs.. but the 780Ti has 2880 CUDA cores...?

The GTX 780Ti is superscalar, so not all of the 2880 CUDA cores can be utilized by the GPUGrid client. The actual number of the utilized CUDA cores of the GTX 780Ti is somewhere between 1920 and 2880 (most likely near the lower end). This could be different for each workunit batch. If they really manufacture the GM204 on 28nm lithography, than this is only a half step towards a new GPU generation. The performance per power ratio will be slightly better of the new GPUs, and (if the data in this chart are correct) I expect the GTX980 could be 15~25% faster than the GTX780Ti (here at GPUGrid). When we'll have the real GPUGrid performance of the GTX980, we'll know how much of the 2880 CUDA cores of the GTX780Ti is actually utilized by the GPUGrid client. But as NVidia choose to move back to scalar architecture, I expect that the superscalar architecture of the Keplers (and the later Fermis) wasn't as successful as expected.


Is NVidia skipping 20nm for 16nm? After a couple of years of development, TSMC is struggling badly to find proper die size(s) for 20nm. Nvidia changes lithography every two years or so. Now, after two and a half years, boards are still at 28nm, after three series releases (600, 700, 800M) on the 28nm generation, and the GTX980 will be the fourth 28nm release. What could be the problem with finding a pattern to fit cores on 20nm? The change from superscalar to scalar?

How does a 5-SMM, 640-core/40-TMU/60W-TDP GTX750Ti perform (~7%) better than a 4-SMX, 768-core/110-130W-TDP Kepler with more TMUs (64), while smashing the GTX650Ti/Boost compute time/power consumption ratios? Core/memory speed differences? The GTX 750Ti is close (~5%) to GTX660 (5-SMX/960-core/140W-TDP) compute times. Are Maxwell's cache subsystem architecture and TMU rendering that much better than Kepler's when running GPUGRID code? Maxwell's core architecture may be more efficient than Kepler's, but is Maxwell really more advanced, when float processing is similar to Kepler's? Maxwell's integer performance is higher, due to having more integer cores in an SMM vs. an SMX, and the added barrel shifter, which is missing in Kepler.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37924 - Posted: 16 Sep 2014 | 19:53:45 UTC - in response to Message 37923.

How does a 5-SMM, 640-core/40-TMU/60W-TDP GTX750Ti perform (~7%) better than a 4-SMX, 768-core/110-130W-TDP Kepler with more TMUs (64), while smashing the GTX650Ti/Boost compute time/power consumption ratios? Core/memory speed differences? The GTX 750Ti is close (~5%) to GTX660 (5-SMX/960-core/140W-TDP) compute times.

That's very easy to answer:
The SMXes of the GTX650Ti and the GTX660 are superscalar, so only (approximately) 2/3rd of their cores can be utilized (512 and 640, respectively).
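
For reference, the arithmetic behind those effective-core figures; a rough sketch that simply assumes ~2/3 utilization of superscalar Kepler SMXes, as described above (an assumption, not a measured number).

// Effective CUDA cores under the ~2/3 superscalar-utilization assumption.
#include <cstdio>

int main() {
    struct Card { const char* name; int cores; bool superscalar; };
    const Card cards[] = {
        {"GTX 650 Ti", 768,  true},
        {"GTX 660",    960,  true},
        {"GTX 780 Ti", 2880, true},
        {"GTX 750 Ti", 640,  false},   // Maxwell SMM, scalar
    };
    for (const Card& c : cards) {
        double effective = c.superscalar ? c.cores * 2.0 / 3.0 : c.cores;
        printf("%-10s %4d cores -> ~%4.0f effectively usable\n",
               c.name, c.cores, effective);
    }
    return 0;
}

That yields the 512 and 640 figures above, and ~1920 for the GTX780Ti mentioned earlier in the thread.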

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37925 - Posted: 16 Sep 2014 | 20:42:34 UTC - in response to Message 37924.
Last modified: 16 Sep 2014 | 20:46:10 UTC

How does a 5-SMM, 640-core/40-TMU/60W-TDP GTX750Ti perform (~7%) better than a 4-SMX, 768-core/110-130W-TDP Kepler with more TMUs (64), while smashing the GTX650Ti/Boost compute time/power consumption ratios? Core/memory speed differences? The GTX 750Ti is close (~5%) to GTX660 (5-SMX/960-core/140W-TDP) compute times.

That's very easy to answer:
The SMXes of the GTX650Ti and the GTX660 are superscalar, so only (approximately) 2/3rd of their cores can be utilized (512 and 640, respectively).


If this is the case, then why do GPU utilization programs (MSI Afterburner, EVGA Precision) show 90%+ for most GPUGRID tasks? Are these programs not accounting for the type of architecture (scalar or superscalar)? If only 2/3rds of the cores are active, wouldn't GPU utilization be at ~66%, instead of the typical 90%? These programs are capable of monitoring bus usage, memory controller load (frame buffer), video processing, power draw, and much more.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 37926 - Posted: 16 Sep 2014 | 23:26:25 UTC - in response to Message 37919.

I estimate that the new GM204 will be about 45% faster than a 780ti.

Matt

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37928 - Posted: 17 Sep 2014 | 0:05:37 UTC - in response to Message 37925.

How does a 5-SMM, 640-core/40-TMU/60W-TDP GTX750Ti perform (~7%) better than a 4-SMX, 768-core/110-130W-TDP Kepler with more TMUs (64), while smashing the GTX650Ti/Boost compute time/power consumption ratios? Core/memory speed differences? The GTX 750Ti is close (~5%) to GTX660 (5-SMX/960-core/140W-TDP) compute times.

That's very easy to answer:
The SMXes of the GTX650Ti and the GTX660 are superscalar, so only (approximately) 2/3rd of their cores can be utilized (512 and 640, respectively).

If this is the case, then why do GPU utilization programs (MSI Afterburner, EVGA Precision) show 90%+ for most GPUGRID tasks? Are these programs not accounting for the type of architecture (scalar or superscalar)? If only 2/3rds of the cores are active, wouldn't GPU utilization be at ~66%, instead of the typical 90%? These programs are capable of monitoring bus usage, memory controller load (frame buffer), video processing, power draw, and much more.

The "GPU utilization" is not equivalent of the "CUDA cores utilization". These monitoring utilities are right in showing that high GPU utilization, as they showing the utilization of the untis which feeding the CUDA cores with work. I think the actual CUDA cores utilization can't be monitored.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37929 - Posted: 17 Sep 2014 | 0:22:59 UTC - in response to Message 37926.

I estimate that the new GM204 will be about 45% faster than a 780ti.

That's a bit of an optimistic estimate, as (1216/928)*(16/15) = 1.3977 (the arithmetic is sketched below),
but...
1. my GTX780Ti is always boosting to 1098MHz,
2. the 1219MHz boost clock seems to be a bit high, as the GTX750Ti's boost clock is only 1085MHz, and it's a lesser chip.

We'll see it soon.

BTW there's an error in the chart, as the GTX780Ti has 15*192 CUDA cores.
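
For reference, the arithmetic behind that estimate; a rough sketch that assumes an SMM and an SMX contribute equally per clock, using the clock figures quoted in this thread (not a GPUGrid measurement).

// Naive GTX980-vs-GTX780Ti scaling: boost-clock ratio times SM-count ratio.
#include <cstdio>

int main() {
    const double gtx780ti_ref_boost  = 928.0;   // reference boost clock, MHz
    const double gtx780ti_real_boost = 1098.0;  // the boost observed in practice (see above)
    const double gtx980_boost        = 1216.0;  // rumored boost clock, MHz
    const int gk110_sms = 15;                   // 15 SMX * 192 cores = 2880
    const int gm204_sms = 16;                   // 16 SMM * 128 cores = 2048

    double ref  = (gtx980_boost / gtx780ti_ref_boost)  * ((double)gm204_sms / gk110_sms);
    double real = (gtx980_boost / gtx780ti_real_boost) * ((double)gm204_sms / gk110_sms);
    printf("Against the reference 780Ti: %.4f (~%.0f%% faster)\n", ref,  (ref  - 1.0) * 100.0);
    printf("Against a 1098 MHz 780Ti:    %.4f (~%.0f%% faster)\n", real, (real - 1.0) * 100.0);
    return 0;
}

The first line reproduces the 1.3977 factor; against a 780Ti that actually boosts to 1098MHz it drops to roughly 18%.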

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37931 - Posted: 17 Sep 2014 | 9:44:20 UTC
Last modified: 17 Sep 2014 | 9:45:22 UTC

http://images.anandtech.com/doci/7764/SMMrecolored_575px.png
http://images.anandtech.com/doci/7764/SMX_575px.png

Here are the Maxwell "crossbar", "dispatch" and "issue" differences compared to Kepler in how the CUDA cores, LD/ST units and SFUs are fed.

http://www.anandtech.com/show/7764/the-nvidia-geforce-gtx-750-ti-and-gtx-750-review-maxwell/3

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37933 - Posted: 17 Sep 2014 | 14:04:39 UTC - in response to Message 37931.

http://images.anandtech.com/doci/7764/SMMrecolored_575px.png
http://images.anandtech.com/doci/7764/SMX_575px.png

Here are the Maxwell "crossbar", "dispatch" and "issue" differences compared to Kepler in how the CUDA cores, LD/ST units and SFUs are fed.

http://www.anandtech.com/show/7764/the-nvidia-geforce-gtx-750-ti-and-gtx-750-review-maxwell/


Same text but made it clickable.
____________
Greetings from TJ

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37934 - Posted: 17 Sep 2014 | 15:01:53 UTC - in response to Message 37933.

http://images.anandtech.com/doci/7764/SMMrecolored_575px.png
http://images.anandtech.com/doci/7764/SMX_575px.png

Here are the Maxwell "crossbar", "dispatch" and "issue" differences compared to Kepler in how the CUDA cores, LD/ST units and SFUs are fed.

http://www.anandtech.com/show/7764/the-nvidia-geforce-gtx-750-ti-and-gtx-750-review-maxwell/


Same text but made it clickable.



Thank you for fixing links.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37941 - Posted: 19 Sep 2014 | 8:32:28 UTC

The GTX980 does quite well in the Folding@home benchmarks.

http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/20


Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37942 - Posted: 19 Sep 2014 | 11:09:52 UTC - in response to Message 37941.

The GTX980 does quite well in the Folding@home benchmarks.

http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/20

Wow! Then it's possible that the GTX980's performance improvement over the GTX780Ti will be in the 25-45% range.

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37943 - Posted: 19 Sep 2014 | 15:52:04 UTC - in response to Message 37942.
Last modified: 19 Sep 2014 | 15:52:22 UTC

The GTX980 does quite well in the Folding@home benchmarks.

http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/20

Wow! Then it's possible that the GTX980's performance improvement over the GTX780Ti will be in the 25-45% range.

Well, if I read the graph correctly, then it's 6.2 and the 780Ti is 11.

The GTX 980 is available in the Netherlands, and about €80 cheaper than a GTX 780Ti. However, no EVGA boards are available yet. I am anxious to see the results of the GTX 980 here.
____________
Greetings from TJ

popandbob
Send message
Joined: 18 Jul 07
Posts: 65
Credit: 10,972,900
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37946 - Posted: 20 Sep 2014 | 1:46:55 UTC - in response to Message 37943.

The GTX980 does quite well in the Folding@home benchmarks.

http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review/20

Wow! Then it's possible that the GTX980's performance improvement over the GTX780Ti will be in the 25-45% range.

Well, if I read the graph correctly, then it's 6.2 and the 780Ti is 11.

Double precision performance is lower, but SP, which is what folding mostly uses, is higher.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37948 - Posted: 20 Sep 2014 | 15:13:33 UTC

OK guys, who's got the first one up and running? I'd like to pull the trigger on a GTX970, but would prefer to know beforehand that it works as well as, or better than, expected over here.

MrS
____________
Scanning for our furry friends since Jan 2002

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37949 - Posted: 20 Sep 2014 | 16:12:42 UTC - in response to Message 37948.
Last modified: 20 Sep 2014 | 16:16:51 UTC

Well, I am waiting for the 20nm Maxwell. However, I will buy a GTX980 to replace my GTX770 as soon as there is one from EVGA with a radial fan. As soon as I have it installed I will let you all know.

Aha you changed the name of the thread ETA?
____________
Greetings from TJ

Profile dskagcommunity
Avatar
Send message
Joined: 28 Apr 11
Posts: 456
Credit: 815,576,358
RAC: 0
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37951 - Posted: 20 Sep 2014 | 20:00:37 UTC

Hopefully they develop an XP Driver too for these.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37952 - Posted: 20 Sep 2014 | 21:18:39 UTC - in response to Message 37949.

Aha you changed the name of the thread ETA?

Yes :)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37953 - Posted: 20 Sep 2014 | 22:26:32 UTC - in response to Message 37951.
Last modified: 20 Sep 2014 | 22:27:40 UTC

Hopefully they develop an XP Driver too for these.

The 344.11 driver has been released for Windows XP (and for x64 too), and the GTX 980 and 970 are in it (I've checked the nv_dispi.inf file).
However, if you search for drivers on the NVidia homepage, it won't display any results for WinXP / GTX 980.

ext2097
Send message
Joined: 3 Jul 14
Posts: 5
Credit: 5,618,275
RAC: 0
Level
Ser
Scientific publications
watwat
Message 37954 - Posted: 21 Sep 2014 | 4:55:25 UTC

GTX970 with driver 344.11
http://www.gpugrid.net/result.php?resultid=13109131

Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -59 (0xffffffc5)
</message>
<stderr_txt>
# GPU [GeForce GTX 970] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 970
# ECC : Disabled
# Global mem : 4095MB
# Capability : 5.2
# PCI ID : 0000:01:00.0
# Device clock : 1215MHz
# Memory clock : 3505MHz
# Memory width : 256bit
# Driver version : r343_98 : 34411
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 520

</stderr_txt>
]]>

I also tried with the latest beta driver, 344.16, but had the same error.

Is that a problem with my computer, or is GTX970 not supported yet?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37956 - Posted: 21 Sep 2014 | 7:40:50 UTC - in response to Message 37954.

GTX970 with driver 344.11
http://www.gpugrid.net/result.php?resultid=13109131

Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -59 (0xffffffc5)
</message>
<stderr_txt>
# GPU [GeForce GTX 970] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 970
# ECC : Disabled
# Global mem : 4095MB
# Capability : 5.2
# PCI ID : 0000:01:00.0
# Device clock : 1215MHz
# Memory clock : 3505MHz
# Memory width : 256bit
# Driver version : r343_98 : 34411
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 520

</stderr_txt>
]]>

I also tried with latest beta driver 344.16, but had same error.

Is that a problem with my computer, or is GTX970 not supported yet?

I think the problem is that Compute Capability 5.2 is not supported by the app yet.
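
For anyone wondering what "device version 520" in that SWAN error means: it's the GPU's compute capability, 5.2, as reported by the CUDA runtime. Below is a minimal, hypothetical host-side sketch (not GPUGrid's actual code) of how an app can query it; a fat binary built without a kernel image for that architecture - e.g. built before the CUDA 6.5 update for GTX 9xx added "-gencode arch=compute_52,code=sm_52" - has nothing it can load on such a device.

// Minimal sketch (hypothetical, not the GPUGrid app): query each device's
// compute capability with the CUDA runtime API. Compile with nvcc.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        printf("No CUDA driver/runtime available\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // "device version 520" in the SWAN log corresponds to major*100 + minor*10
        printf("Device %d: %s, compute capability %d.%d (device version %d)\n",
               dev, prop.name, prop.major, prop.minor,
               prop.major * 100 + prop.minor * 10);
    }
    return 0;
}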

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37958 - Posted: 21 Sep 2014 | 9:42:33 UTC - in response to Message 37956.

I've sent Matt a PM. Hopefully this is easy to fix!

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37962 - Posted: 21 Sep 2014 | 14:28:49 UTC

ext2097, did you also try other projects like Einstein@Home?

MrS
____________
Scanning for our furry friends since Jan 2002

HA-SOFT, s.r.o.
Send message
Joined: 3 Oct 11
Posts: 100
Credit: 5,533,954,251
RAC: 115,992
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37963 - Posted: 21 Sep 2014 | 15:01:01 UTC - in response to Message 37962.

It's a new compute capability. CC 5.2

ext2097
Send message
Joined: 3 Jul 14
Posts: 5
Credit: 5,618,275
RAC: 0
Level
Ser
Scientific publications
watwat
Message 37965 - Posted: 21 Sep 2014 | 16:04:02 UTC - in response to Message 37962.

SETI@home - http://setiathome.berkeley.edu/results.php?hostid=7376773
SETI@home Beta - http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=72657
Einstein@Home - http://einstein.phys.uwm.edu/results.php?hostid=11669559

GTX970 working those projects seems ok.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37966 - Posted: 21 Sep 2014 | 18:34:51 UTC

It's a bit off topic, but it's a quite interesting Maxwell advertisement:
GAME24: Debunking Lunar Landing Conspiracies with Maxwell and VXGI

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37969 - Posted: 21 Sep 2014 | 20:40:35 UTC - in response to Message 37966.

It's a bit off topic, but it's a quite interesting Maxwell advertisement:
GAME24: Debunking Lunar Landing Conspiracies with Maxwell and VXGI


Cool.

However, the folks at the flat earth society are not impressed with Nvidia's effort. :)

http://forum.tfes.org/index.php?topic=1914.0


While we are waiting for GPUgrid GTX980/970 numbers, F@H performance looks encouraging.

http://forums.evga.com/Someone-needs-to-post-980-andor-970-folding-numbers-here-when-they-get-one-m2218148.aspx

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 37972 - Posted: 22 Sep 2014 | 9:36:48 UTC - in response to Message 37963.

Ok gang,

Looks like we'll need an app update - let's see if I can push one out later today. Who's got the cards, do I need to do Linux or Windows first?


Matt

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 37974 - Posted: 22 Sep 2014 | 9:49:35 UTC

..or not. The current CUDA release seems not to support that architecture yet.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37975 - Posted: 22 Sep 2014 | 10:37:10 UTC - in response to Message 37974.
Last modified: 22 Sep 2014 | 10:37:31 UTC

Have you tried this yet? https://developer.nvidia.com/cuda-downloads-geforce-gtx9xx - driver 343.98 is included, offering support for CC 5.2 cards (GTX980/970).

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 37976 - Posted: 22 Sep 2014 | 10:51:04 UTC - in response to Message 37975.

Yeah, looked straight through that.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37978 - Posted: 22 Sep 2014 | 12:07:33 UTC - in response to Message 37969.

It's a bit off topic, but it's a quite interesting Maxwell advertisement:
GAME24: Debunking Lunar Landing Conspiracies with Maxwell and VXGI


Cool.

However, the folks at the flat earth society are not impressed with Nvidia's effort. :)

http://forum.tfes.org/index.php?topic=1914.0


While we are waiting for GPUgrid GTX980/970 numbers, F@H performance looks encouraging.

http://forums.evga.com/Someone-needs-to-post-980-andor-970-folding-numbers-here-when-they-get-one-m2218148.aspx


NVidia showing off VX Global Illuminati... err, Illumination power. The Metro Redux games utilize this tech on Kepler and later generations (even the Xbox/PS4 ports).

GM204's technical advances compared to Kepler are rather striking. The GPU-GRID performance jump from the Fermi 480/580 to the 680 won't be anywhere near the GTX980's jump compared to the GTX 680. Filtering and sorting performance - for images, or for programs working with atoms and DNA/RNA strands - is higher than Kepler's. Maxwell also has more internal memory bandwidth, and third-generation color compression enhancements offer more ROP performance, with better cache latency.

A single GM204 CUDA core is 25-40% faster than a single GK104 CUDA core, thanks to the added atomic memory enhancements and more registers per thread. For gaming, a GTX 980 is equal to two GTX 680s.

In every area, the GPGPU performance jump from GK104 to GM204 is significant. GK110 now only offers higher TMU performance, but without the new type of filtering offered by GM204, unless Nvidia brings that to the Kepler generation. GK110 has higher double precision. A full-fat Maxwell (20nm/16nm FinFET?, 250W, ~3072 CUDA cores) that offers a DP core ratio like GK110's 1/3 is going to be a card for the ages. (Will it first be a Tesla, Quadro or Titan variant?) Maybe AMD will come out with a board shortly to challenge GM204 at a similar power consumption. Continuous performance competition should raise standards for each company. These next few years are key.

GM204 replaces the GK104 stack, not the GK110 [GTX 780(Ti)] stack with its disabled 64-DP-core SMXes (the driver allows 8 DP cores to be active per SMX). GM204's transistor density is only a few percent higher than GK104's (3.54B transistors/294mm2), with GK110 at 7.1B transistors/551mm2.

Kepler's GK110 desktop cards (4500-6000 GFLOPS single precision) are near 20 GFLOPS/W, while GK104 (2500-3500 GFLOPS) is at 15-18 GFLOPS/W, depending on clock speeds and voltage. Maxwell GM107 (~1400 GFLOPS) is at 20-23 GFLOPS/W. Maxwell GM204 (~5000 GFLOPS) is at 30-35 GFLOPS/W, depending on voltage and GDDR5/core speeds. GK104's highest-rated mobile card (GTX880M, ~2900 GFLOPS) is at 27-29 GFLOPS/W. (A quick sanity check of these ratios is sketched below.)
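
A rough sanity check of those GFLOPS/W ranges, dividing theoretical SP GFLOPS (2 FLOPs per core per clock) by board TDP; the core counts, boost clocks and TDPs are assumed from public spec sheets, not GPUGrid measurements.

// Theoretical SP GFLOPS per watt for one representative card of each chip.
#include <cstdio>

int main() {
    struct Card { const char* name; int cores; double boost_ghz; double tdp_w; };
    const Card cards[] = {
        {"GTX 780 Ti (GK110)", 2880, 0.928, 250.0},
        {"GTX 680    (GK104)", 1536, 1.058, 195.0},
        {"GTX 750 Ti (GM107)",  640, 1.085,  60.0},
        {"GTX 980    (GM204)", 2048, 1.216, 165.0},
    };
    for (const Card& c : cards) {
        double gflops = 2.0 * c.cores * c.boost_ghz;
        printf("%-20s %6.0f GFLOPS SP, %5.1f GFLOPS/W\n",
               c.name, gflops, gflops / c.tdp_w);
    }
    return 0;
}

That lands each chip roughly in the ranges quoted above (about 21, 17, 23 and 30 GFLOPS/W respectively).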

GM204's compute-time/power-usage ratio with a new app will be world class compared to more power-hungry cards. For crunchers in states or countries with high electricity taxes and power bills, a GTX970/980 is a top-notch choice.
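As a rough sanity check of those ratios using reference specs (actual cards will vary with boost clocks and voltage): a GTX 780 Ti at ~5000 single-precision GFLOPS and 250W TDP is ~20 GFLOPS/W, a GTX 750 Ti at ~1300 GFLOPS and 60W is ~22 GFLOPS/W, and a GTX 980 at ~4980 GFLOPS and 165W is ~30 GFLOPS/W - right in line with the figures above.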

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 37979 - Posted: 22 Sep 2014 | 12:10:56 UTC - in response to Message 37976.
Last modified: 22 Sep 2014 | 12:22:43 UTC

Yeah, looked straight through that.


CUDA 6.5.19 comes with the 343.98 driver. Are the 344 drivers the same? Updated documents for PTX, the programming guide and many others are included with the 343.98 / CUDA 6.5 SDK.
Before updating to 6.5.19, the GPUGrid tasks I completed were on CUDA 6.5.12.

Linux also has a new CUDA 6.5 driver available for download.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 37981 - Posted: 22 Sep 2014 | 13:35:19 UTC - in response to Message 37979.
Last modified: 22 Sep 2014 | 14:26:11 UTC

Right. New app version cuda65 for acemdlong. Windows only, needs driver version 344.

Matt

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37983 - Posted: 22 Sep 2014 | 15:04:49 UTC

I ordered a GTX980 from Newegg. Zotac and Gigabyte were my only options, so I went with Gigabyte. All other manufacturers' cards were "out of stock".

tomba
Send message
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37984 - Posted: 22 Sep 2014 | 15:30:25 UTC - in response to Message 35238.
Last modified: 22 Sep 2014 | 16:28:10 UTC

Time to replace my trusty GTX 460, which has been GPUGrid-ing for years! At ~£100 the GTX 750Ti fits my budget nicely but I need some guidance.

1. Is the power feed only from the mobo enough for 24/7 or should I go for one with a 6-pin connection?
2. One fan or two?
3. WHICH 750Ti do you recommend?

I installed a pair of 750Ti cards in my computer yesterday and tried to run GPUGRID. No go, instant fail within a couple seconds.

This quote is from January. I hope the problem has been fixed!

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 37985 - Posted: 22 Sep 2014 | 15:35:16 UTC - in response to Message 37984.

The 750tis are great and work just fine.

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37986 - Posted: 22 Sep 2014 | 16:08:13 UTC - in response to Message 37983.

I ordered a GTX980 from Newegg. Zotac and gigabyte were my only options so I went with gigabyte. All other manufacturers cards were "out of stock".

Good luck with your card biodoc.

In the Netherlands only Asus and MSI, but I will wait for EVGA. They are not out of stock, but just in production. A few weeks more is no problem. Moreover I first want to see some results.
____________
Greetings from TJ

Jozef J
Send message
Joined: 7 Jun 12
Posts: 112
Credit: 1,035,582,756
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 37989 - Posted: 22 Sep 2014 | 19:15:34 UTC

http://www.techpowerup.com/

On this webpage there's a massive summary of reviews for the 970/980 graphics cards.

But this is my insider tip ;-)
http://www.zotac.com/products/graphics-cards/geforce-900-series/gtx-980/product/gtx-980/detail/geforce-gtx-980-amp-extreme-edition/sort/starttime/order/DESC/amount/10/section/specifications.html

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37990 - Posted: 22 Sep 2014 | 19:52:31 UTC
Last modified: 22 Sep 2014 | 19:53:48 UTC

Does anybody already have a working GTX 980 or 970?

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37991 - Posted: 22 Sep 2014 | 20:13:24 UTC

ext2097, would you mind giving GPU-Grid another try?

And regarding those Einstein@Home results: did you run 1 WU at a time, which is the default setting?

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37992 - Posted: 22 Sep 2014 | 20:43:29 UTC
Last modified: 22 Sep 2014 | 20:50:45 UTC

I had a CUDA6.5 task on one of my GTX680s, but it failed after 6 sec with the following error:

# The simulation has become unstable. Terminating to avoid lock-up (1)
40x35-NOELIA_5bisrun2-2-4-RND5486_0
Has anybody had a successful CUDA6.5 task on an older card?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37993 - Posted: 22 Sep 2014 | 20:49:32 UTC - in response to Message 37992.

And there's another failed CUDA6.5 task on my GTX780Ti:
I4R6-SDOERR_BARNA5-32-100-RND1539_0
It has failed after 2 sec.

# Simulation unstable. Flag 11 value 1 # The simulation has become unstable. Terminating to avoid lock-up # The simulation has become unstable. Terminating to avoid lock-up (2)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37994 - Posted: 22 Sep 2014 | 21:11:41 UTC - in response to Message 37993.

There were two more CUDA6.5 workunits on my GTX780Ti, both of them failed the same way:

# Simulation unstable. Flag 11 value 1 # The simulation has become unstable. Terminating to avoid lock-up # The simulation has become unstable. Terminating to avoid lock-up (2)

I16R23-SDOERR_BARNA5-32-100-RND7031_0
I12R83-SDOERR_BARNA5-32-100-RND2687_0
Now this host received a CUDA6.0 task, so I don't want to try again, but I think that the CUDA6.5 app has a bug.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37995 - Posted: 22 Sep 2014 | 22:46:37 UTC - in response to Message 37986.

I ordered a GTX980 from Newegg. Zotac and gigabyte were my only options so I went with gigabyte. All other manufacturers cards were "out of stock".

Good luck with your card biodoc.

In the Netherlands only Asus and MSI, but I will wait for EVGA. They are not out of stock, but just in production. A few weeks more is no problem. Moreover I first want to see some results.


Yes, my preference would have been EVGA or PNY (lifetime warranty but fixed core voltage). This will be my first Gigabyte card so I hope it works out.

The F@H numbers sold me. I think Nvidia GPU performance on F@H generally translates to GPUGrid. Besides, I usually spend the month of December folding, so the GTX980 will be a nice companion to my 780Ti.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37996 - Posted: 22 Sep 2014 | 23:07:12 UTC - in response to Message 37994.

I have more failed CUDA6.5 tasks on my GTX680:
19x56-NOELIA_5bisrun2-2-4-RND7637_2
I14R24-SDOERR_BARNA5-25-100-RND2569_0
20mgx12-NOELIA_20MG2-9-50-RND2493_0
I12R11-SDOERR_BARNA5-24-100-RND0763_0
I2R35-SDOERR_BARNA5-31-100-RND8916_0

# The simulation has become unstable. Terminating to avoid lock-up (1)

Actually all CUDA6.5 tasks are failing on my GTX680 (OC) and GTX780Ti (Non-OC)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37997 - Posted: 22 Sep 2014 | 23:20:25 UTC - in response to Message 37996.
Last modified: 22 Sep 2014 | 23:33:22 UTC

I haven't found any successfully finished CUDA6.5 tasks (obviously these are too fresh).
But there are five which failed on others' hosts the same way they failed on mine:
19x56-NOELIA_5bisrun2-2-4-RND7637_1 (GTX780Ti OC)
I7R106-SDOERR_BARNA5-32-100-RND7602_1 (GTX TITAN non-OC)
19x64-NOELIA_5bisrun2-3-4-RND8765_0 (GTX TITAN non-OC)

# Simulation unstable. Flag 11 value 1 # The simulation has become unstable. Terminating to avoid lock-up # The simulation has become unstable. Terminating to avoid lock-up (2)

I2R31-SDOERR_BARNA5-30-100-RND8191_0 (GTX770 OC)
I11R57-SDOERR_BARNA5-32-100-RND3266_0 (GTX770 non-OC)
# The simulation has become unstable. Terminating to avoid lock-up (1)

KSUMatt
Avatar
Send message
Joined: 11 Jan 13
Posts: 216
Credit: 832,764,443
RAC: 0
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 37998 - Posted: 23 Sep 2014 | 3:22:02 UTC
Last modified: 23 Sep 2014 | 3:38:55 UTC

I just had one of these errors on a 780Ti as well.

20mgx76-NOELIA_20MG2-6-50-RND7695_0

Edit: Now three Cuda65 in a row have failed.

20mgx76-NOELIA_20MG2-6-50-RND7695_0

I8R16-SDOERR_BARNA5-29-100-RND7841_2

Jozef J
Send message
Joined: 7 Jun 12
Posts: 112
Credit: 1,035,582,756
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 37999 - Posted: 23 Sep 2014 | 4:41:12 UTC

I7R110-SDOERR_BARNA5-32-100-RND5097_1 10082727 181608 22 Sep 2014 | 17:59:10 UTC 23 Sep 2014 | 0:04:53 UTC Error while computing 6.07 2.70 --- Long runs (8-12 hours on fastest card) v8.41 (cuda65)
I11R16-SDOERR_BARNA5-29-100-RND4924_1 10097574 179755 22 Sep 2014 | 16:41:21 UTC 23 Sep 2014 | 0:45:03 UTC Error while computing 10.75 3.53 --- Long runs (8-12 hours on fastest card) v8.41 (cuda65)
I12R49-SDOERR_BARNA5-30-100-RND2140_0 10097949 181608 22 Sep 2014 | 17:59:10 UTC 23 Sep 2014 | 0:04:53 UTC Error while computing 13.20 2.86
19x48-NOELIA_5bisrun2-3-4-RND9893_0 10097397 176407 22 Sep 2014 | 11:03:22 UTC 22 Sep 2014 | 11:08:37 UTC Error while computing 2.05 0.13
I6R67-SDOERR_BARNA5-31-100-RND9535_1 10089657 176407 22 Sep 2014 | 6:36:59 UTC 22 Sep 2014 | 11:08:37 UTC Error while computing 171.94 27.72 --- Long runs (8-12 hours on fastest card) v8.41 (cuda60)
43x63-NOELIA_5bisrun2-3-4-RND0357_0 10096302 176407 22 Sep 2014 | 1:39:43 UTC 22 Sep 2014 | 11:03:22 UTC Error while computing 14,762.32 3,420.89

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38000 - Posted: 23 Sep 2014 | 7:11:20 UTC - in response to Message 37999.

Jozef, there's a CUDA 6.0 task among these. Maybe you need to reboot the host after those CUDA 6.5 failures? Or is it running normally again?

MrS
____________
Scanning for our furry friends since Jan 2002

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38001 - Posted: 23 Sep 2014 | 7:21:03 UTC

CUDA 65s should only have been going to the new Maxwells, sorry about that.
You shouldn't see any new ones going to older cards - please shout if you do, it'll be a scheduler bug.

Matt

ext2097
Send message
Joined: 3 Jul 14
Posts: 5
Credit: 5,618,275
RAC: 0
Level
Ser
Scientific publications
watwat
Message 38002 - Posted: 23 Sep 2014 | 8:00:24 UTC - in response to Message 37991.

I had two CUDA65 tasks, but same "FATAL: cannot find image for module [.nonbonded.cu.] for device version 520" error.
I3R87-SDOERR_BARNA5-32-100-RND2755_1
I16R41-SDOERR_BARNA5-29-100-RND8564_0

Two CUDA60 task had "ERROR: file mdioload.cpp line 81: Unable to read bincoordfile".
I957-SANTI_p53final-15-21-RND0451_5
I864-SANTI_p53final-17-21-RND8261_5

And I'm running only 1 WU at a time except setiathome_v7 tasks.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38003 - Posted: 23 Sep 2014 | 8:11:13 UTC - in response to Message 38001.

Right. The new 65 app is failing for non-obvious reasons, so I've moved it to the acemdbeta queue. If you have a GTX9x0, please get some work from that queue.

ext2097
Send message
Joined: 3 Jul 14
Posts: 5
Credit: 5,618,275
RAC: 0
Level
Ser
Scientific publications
watwat
Message 38004 - Posted: 23 Sep 2014 | 9:46:33 UTC - in response to Message 38003.

I have GTX970 and checked at "Run test applications?" and "ACEMD beta", but BOINC says "No tasks are available for ACEMD beta version".
Is there something I need to do in order to get beta tasks?

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38005 - Posted: 23 Sep 2014 | 9:52:15 UTC - in response to Message 38004.

I have GTX970 and checked at "Run test applications?" and "ACEMD beta", but BOINC says "No tasks are available for ACEMD beta version".
Is there something I need to do in order to get beta tasks?

On the GPUGRID preference page, roughly in the middle, there is an option called "Run test applications?". You have to set that to yes as well.
But perhaps you already did - in that case, sorry about this post.
____________
Greetings from TJ

ext2097
Send message
Joined: 3 Jul 14
Posts: 5
Credit: 5,618,275
RAC: 0
Level
Ser
Scientific publications
watwat
Message 38006 - Posted: 23 Sep 2014 | 11:10:14 UTC - in response to Message 38005.

Use NVIDIA GPU : yes
Run test applications? yes
ACEMD beta: yes
Use Graphics Processing Unit (GPU) if available : yes

No beta tasks downloaded yet.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38016 - Posted: 23 Sep 2014 | 17:07:13 UTC

Should I open that ESD bag? :)

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38017 - Posted: 23 Sep 2014 | 17:32:34 UTC - in response to Message 38016.

These are two GK104/GK110 times from your hosts, to compare with the new GM204.

20mgx58-NOELIA_20MG2-6-50-RND1410_0
Time per step (avg over 5000000 steps): 5.600 ms
Approximate elapsed time for entire WU: 27997.766 s
Device clock : 1201MHz
Memory clock : 3004MHz
Memory width : 256bit
Driver version : r343_98 : 34411
GeForce GTX 680 Capability 3.0





20mgx81-NOELIA_20MG2-1-50-RND0462_1 Ti
time per step (avg over 5000000 steps): 3.306 ms
Approximate elapsed time for entire WU: 16531.969 s
Device clock : 928MHz
Memory clock : 3500MHz
Driver version : r340_00 : 34043
GeForce GTX 780 Ti Capability 3.5

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38018 - Posted: 23 Sep 2014 | 17:45:12 UTC

The good news is that I've successfully installed the GTX980 under Windows XP x64.
The bad news is that I could not get beta work for it.

23/09/2014 19:38:23 | GPUGRID | Requesting new tasks for NVIDIA
23/09/2014 19:38:25 | GPUGRID | Scheduler request completed: got 0 new tasks
23/09/2014 19:38:25 | GPUGRID | No tasks sent
23/09/2014 19:38:25 | GPUGRID | No tasks are available for ACEMD beta version
23/09/2014 19:38:25 | GPUGRID | No tasks are available for the applications you have selected.
23/09/2014 19:41:53 | GPUGRID | update requested by user
23/09/2014 19:41:55 | GPUGRID | Sending scheduler request: Requested by user.
23/09/2014 19:41:55 | GPUGRID | Requesting new tasks for NVIDIA
23/09/2014 19:41:57 | GPUGRID | Scheduler request completed: got 0 new tasks

Before you ask: I did all the necessary settings.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38019 - Posted: 23 Sep 2014 | 18:03:15 UTC

Seems like Matt has to fill the beta queue, or already got enough failed results from the batch he submitted :p

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38022 - Posted: 23 Sep 2014 | 18:08:57 UTC - in response to Message 38019.
Last modified: 23 Sep 2014 | 18:09:32 UTC

Seems like Matt has to fill the beta queue, or already got enough failed results from the batch he submitted :p

MrS

According to the server status page there are 100 unsent beta workunits, and the applications page shows that there is only the v8.42 CUDA6.5 beta app. Somehow these aren't being bound together. I think this could be another scheduler issue.

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38024 - Posted: 23 Sep 2014 | 18:40:22 UTC - in response to Message 38016.

Nice card you have there Zoltan. Would love to see the results as soon as Matt has got it working.
I hope you will have good luck with this card right from the start.
____________
Greetings from TJ

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38026 - Posted: 23 Sep 2014 | 18:46:26 UTC - in response to Message 38024.

Nice card you have there Zoltan. Would love to see the results as soon as Matt has got it working.
I hope you will have good luck with this card right from the start.

I second that! BTW: what are you currently running on the card? Any results from other projects to share? :)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38027 - Posted: 23 Sep 2014 | 18:49:27 UTC - in response to Message 38024.
Last modified: 23 Sep 2014 | 18:56:12 UTC

Thank you TJ & ETA!
It's already crunching for Einstein@home.
As this card is a standard NVidia design, there's a good chance it won't have such problems as my Gigabyte GTX780Ti OC....

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38028 - Posted: 23 Sep 2014 | 19:01:27 UTC - in response to Message 38027.

Thank you TJ & ETA!
It's already crunching for Einstein@home.
As this card is a standard NVidia design, there's a good chance it won't have such problems as my Gigabyte GTX780Ti OC....


Any comment on your phenomenal card's wattage usage for tasks, or temps?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38030 - Posted: 23 Sep 2014 | 19:22:05 UTC - in response to Message 38028.
Last modified: 23 Sep 2014 | 19:35:39 UTC

Any comment on you're phenomenal card's wattage usage for tasks, or temps?

It's awesome! :)
The Einstein@home app is CUDA3.2 - ancient in terms of GPU computing, as this version is released for the GTX 2xx series - so the data you've asked for is almost irrelevant, but here it is:
Ambient temperature: 24.8°C
Task: p2030.20140610.G63.60-00.95.S.b6s0g0.00000_3648_1 Binary Radio Pulsar Search (Arecibo, GPU) v1.39 (BRP4G-cuda32-nv301)
GPU temperature: 53°C
GPU usage: 91-92% (muhahaha)
GPU wattage: 90W (the difference between the idle GPU and the GPU in use, but the CPU is consuming a little to keep the GPU busy)
GPU clock: 1240MHz
GPU voltage: 1.218V
GPU power 55%

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38031 - Posted: 23 Sep 2014 | 19:55:51 UTC - in response to Message 38030.

Any comment on you're phenomenal card's wattage usage for tasks, or temps?

It's awesome! :)
The Einstein@home app is CUDA3.2 - ancient in terms of GPU computing, as this version is released for the GTX 2xx series - so the data you've asked for is almost irrelevant, but here it is:
Ambient temperature: 26°C
Task: p2030.20140610.G63.60-00.95.S.b6s0g0.00000_3648_1 Binary Radio Pulsar Search (Arecibo, GPU) v1.39 (BRP4G-cuda32-nv301)
GPU temperature: 53°C
GPU usage: 91-92% (muhahaha)
GPU wattage: 90W (the difference between the idle GPU and the GPU in use, but the CPU is consuming a little to keep the GPU busy)


An integer task? 90 watts (at 91-92% usage) for 1024 cores on the CUDA 3.2 API shows off the second-generation Maxwell (GM204) internal core structure enhancements. Other components will also be under less "stress" thanks to the drop in energy usage.
With electricity prices and taxes only going up, any efficiency gains help.

Running 24/7 for weeks, months or years at a time - a 250W TDP card or a 175W one? That's a 50-105W wattage change for GM204 compared to the 225W/250W GK110 cards, and the 1664-core GTX970 is only 145W TDP. The GTX980's TDP is within 30 watts of a 6/8-core Haswell-E at 140W (a few 6/8-core Haswell E5 Xeons are 85W). With multiple cards the energy savings add up - higher motherboard/PSU efficiency included.
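To put a rough number on it (assuming, say, €0.20 per kWh - substitute your own tariff): a ~100W saving running 24/7 is about 876 kWh a year, or roughly €175 per year per card, before even counting the lower PSU and cooling losses.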

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38032 - Posted: 23 Sep 2014 | 20:07:04 UTC

I think I know why we don't receive beta tasks.
The acemd.841-65.exe file is 3.969.024 bytes long, but the acemd.842-65.exe is only 1.112.576 bytes long, so something went wrong with the latter.

Profile tito
Send message
Joined: 21 May 09
Posts: 14
Credit: 908,132,123
RAC: 34,212
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 38036 - Posted: 24 Sep 2014 | 5:26:47 UTC

Zoltan - how many WUs are you crunching at once at Einstein?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38039 - Posted: 24 Sep 2014 | 8:48:23 UTC - in response to Message 38036.

Zoltan - how many WU are You crunching at once at Einstein?

Only one.
Now I've changed my settings to run two simultaneously, but the power consumption hasn't changed; only the GPU usage has risen to 97%.

Profile tito
Send message
Joined: 21 May 09
Posts: 14
Credit: 908,132,123
RAC: 34,212
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 38040 - Posted: 24 Sep 2014 | 8:56:34 UTC

May I quote all this data at Einstein forum?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38041 - Posted: 24 Sep 2014 | 9:40:05 UTC - in response to Message 38040.

May I quote all this data at Einstein forum?

Sure.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38042 - Posted: 24 Sep 2014 | 9:43:32 UTC - in response to Message 38030.

Any comment on you're phenomenal card's wattage usage for tasks, or temps?

It's awesome! :)
The Einstein@home app is CUDA3.2 - ancient in terms of GPU computing, as this version is released for the GTX 2xx series - so the data you've asked for is almost irrelevant, but here it is:
Ambient temperature: 24.8°C
Task: p2030.20140610.G63.60-00.95.S.b6s0g0.00000_3648_1 Binary Radio Pulsar Search (Arecibo, GPU) v1.39 (BRP4G-cuda32-nv301)
GPU temperature: 53°C
GPU usage: 91-92% (muhahaha)
GPU wattage: 90W (the difference between the idle GPU and the GPU in use, but the CPU is consuming a little to keep the GPU busy)
GPU clock: 1240MHz
GPU voltage: 1.218V
GPU power 55%


The Einstein numbers look great. Congrats on the new card Zoltan!

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38045 - Posted: 24 Sep 2014 | 9:58:01 UTC
Last modified: 24 Sep 2014 | 10:53:32 UTC

What does BOINC say about the (peak) FLOPS in the event log for a GTX980? Near 5 TeraFLOPS? Over at Mersenne trial-factoring, a GTX980 is listed at 1.126GHz and 4,710 GFLOPS.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 912
Credit: 2,216,536,145
RAC: 255,119
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38048 - Posted: 24 Sep 2014 | 13:43:24 UTC - in response to Message 38045.

What does Boinc say, about amount of (peak) FLOPS in event log for GTX980? Near 5TeraFLOPS? Over at Mersenne trial-factoring--- a GTX980 is listed @ 1,126GHz and 4,710 GFLOPS.

Could somebody running a Maxwell-aware version of BOINC check and report this, please, and do a sanity-check of whether BOINC's figure is correct from what you know of the card's SM count, cores per SM, shader clock, flops_per_clock etc. etc? We got the figures for the 'baby Maxwell' 750/Ti into BOINC on 24 February (3edb124ab4b16492d58ce5a6f6e40c2244c97ed6), but I think that was just too late to catch v7.2.42

We're in a similar position this time, with v7.4.22 at release-candidate stage - I'd say that one was safe to test with, if nobody here has upgraded yet. TIA.
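For anyone who wants to do the check by hand, this is the arithmetic BOINC should effectively be doing - a minimal sketch (Python), assuming the usual GM204 reference figures of 16 SMMs x 128 cores, 2 flops per core per clock and a 1216 MHz boost clock:

smm_count       = 16      # SMMs on a full GM204 (assumed)
cores_per_smm   = 128     # CUDA cores per SMM (assumed)
flops_per_clock = 2       # one FMA counted as 2 flops
clock_ghz       = 1.216   # reference boost clock; BOINC uses whatever clock it reads

peak_gflops = smm_count * cores_per_smm * flops_per_clock * clock_ghz
print(round(peak_gflops)) # ~4981, so anything close to 5000 GFLOPS would look sane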

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38049 - Posted: 24 Sep 2014 | 13:58:43 UTC

No idea why the scheduler wasn't giving out the 842 beta app. Look out for 843 now.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38050 - Posted: 24 Sep 2014 | 13:59:39 UTC - in response to Message 38032.


The acemd.841-65.exe file is 3.969.024 bytes long, but the acemd.842-65.exe is only 1.112.576 bytes long, so something went wrong with the latter.


no, that's deliberate. It's a Maxwell-only build

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38051 - Posted: 24 Sep 2014 | 14:28:56 UTC - in response to Message 38050.

There's now a linux build on acemdbeta. You'll definitely be needing to use a Linux client that reports the right driver version.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38052 - Posted: 24 Sep 2014 | 16:16:18 UTC - in response to Message 38049.

No idea why the scheduler wasn't giving out the 842 beta app. Look out for 843 now.

I still could not get beta work.
24/09/2014 18:16:35 | GPUGRID | update requested by user
24/09/2014 18:16:38 | GPUGRID | Sending scheduler request: Requested by user.
24/09/2014 18:16:38 | GPUGRID | Requesting new tasks for NVIDIA
24/09/2014 18:16:41 | GPUGRID | Scheduler request completed: got 0 new tasks
24/09/2014 18:16:41 | GPUGRID | No tasks sent
24/09/2014 18:16:41 | GPUGRID | No tasks are available for ACEMD beta version
24/09/2014 18:16:41 | GPUGRID | No tasks are available for the applications you have selected.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38054 - Posted: 24 Sep 2014 | 17:24:00 UTC
Last modified: 24 Sep 2014 | 17:27:43 UTC

I gave Folding@home a try, and the power consumption rose by 130W when I started folding on the GPU (GTX980) only. When I started folding on the CPU as well, the power consumption went up by a further 68W (Core i7-870@3.2GHz, 7 threads).
GPU usage: 90-95%
GPU power 64-66%
GPU temperature: 56°C
GPU voltage: 1.218V
GPU core clock: 1240MHz

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38055 - Posted: 24 Sep 2014 | 18:49:25 UTC
Last modified: 24 Sep 2014 | 18:52:57 UTC

Thanks Zoltan! Those numbers are really encouraging and show GM204 power consumption to be approximately where we expected it to be. This is in stark contrast to the ~250 W THG has measured under "some GP-GPU load". Maybe it was FurMark? With these results we can rest assured that the cards won't draw more than their power target to run GPU-Grid.

And while we're at it: what about memory controller load? It should be comparably high at Einstein and will limit unbalanced cards badly. For reference:

GT640: 99% at Einstein, ~60% at GPU-Grid
GTX660Ti: ~60% at Einstein, ~40% at GPU-Grid

Edit: concerning the Einstein tasks: about 1740 s per Arecibo task is what you get running 1 WU at a time (RAC -> 50k), whereas 2740 s per task was achieved running 2 of them concurrently (RAC -> 63k)?
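If so, two at 2740 s is an effective 1370 s per WU, i.e. roughly 27% more throughput than 1740 s running singly - which would match the 50k -> 63k RAC jump nicely.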

MrS
____________
Scanning for our furry friends since Jan 2002

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38056 - Posted: 24 Sep 2014 | 18:52:59 UTC - in response to Message 38048.

What does Boinc say, about amount of (peak) FLOPS in event log for GTX980? Near 5TeraFLOPS? Over at Mersenne trial-factoring--- a GTX980 is listed @ 1,126GHz and 4,710 GFLOPS.

Could somebody running a Maxwell-aware version of BOINC check and report this, please, and do a sanity-check of whether BOINC's figure is correct from what you know of the card's SM count, cores per SM, shader clock, flops_per_clock etc. etc? We got the figures for the 'baby Maxwell' 750/Ti into BOINC on 24 February (3edb124ab4b16492d58ce5a6f6e40c2244c97ed6), but I think that was just too late to catch v7.2.42

We're in a similar position this time, with v7.4.22 at release-candidate stage - I'd say that one was safe to test with, if nobody here has upgraded yet. TIA.


The GPU info is in the project's sched_request file or in the slot's init_data.xml file. Also, doesn't client_state.xml provide the working size?

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 436
Credit: 510,585,896
RAC: 176,762
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38057 - Posted: 24 Sep 2014 | 19:49:16 UTC - in response to Message 37984.

Time to replace my trusty GTX 460, which has been GPUGrid-ing for years! At ~£100 the GTX 750Ti fits my budget nicely but I need some guidance.

1. Is the power feed only from the mobo enough for 24/7 or should I go for one with a 6-pin connection?
2. One fan or two?
3. WHICH 750Ti do you recommend?

I installed a pair of 750Ti cards in my computer yesterday and tried to run GPUGRID. No go, instant fail within a couple seconds.

This quote is from January. I hope the problem has been fixed!


Something you didn't mention: If possible, get one that blows the hot air out of the case rather than blowing it around within the case. That should reduce the temperature for both the graphics board and the CPU, and therefore make both of them last longer.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 436
Credit: 510,585,896
RAC: 176,762
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38058 - Posted: 24 Sep 2014 | 19:56:15 UTC

Something I'm having trouble finding: How well do the new cards using PCIE3 work if the motherboard has only PCIE2 sockets?

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 912
Credit: 2,216,536,145
RAC: 255,119
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38059 - Posted: 24 Sep 2014 | 20:45:42 UTC - in response to Message 38058.

Something I'm having trouble finding: How well do the new cards using PCIE3 work if the motherboard has only PCIE2 sockets?

Physically, the sockets are the same. Electrically, they're compatible.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38060 - Posted: 24 Sep 2014 | 22:18:25 UTC - in response to Message 38058.

Something I'm having trouble finding: How well do the new cards using PCIE3 work if the motherboard has only PCIE2 sockets?

We'll find out when there's a working GPUGrid app, as I will move my GTX 980 to another host which has PCIe3.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38061 - Posted: 24 Sep 2014 | 22:55:59 UTC - in response to Message 38055.

And while we're at it: what about memory controller load?

Folding@home: 23%
Einstein@home 2 tasks: 62-69% (Perseus arm survey/BRP5 & Arecibo, GPU/BRP4G)
Einstein@home 1 task : 46-48% (Perseus arm survey/BRP5)
Einstein@home 1 task : 58-64% (Arecibo, GPU/BRP4G)

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38062 - Posted: 25 Sep 2014 | 0:21:49 UTC

I've got my GTX980 running on linux.

I'm also unable to get any beta work on GPUGrid.


Profile @tonymmorley
Send message
Joined: 10 Mar 14
Posts: 24
Credit: 1,215,128,812
RAC: 10
Level
Met
Scientific publications
watwatwatwatwatwatwat
Message 38066 - Posted: 25 Sep 2014 | 7:54:38 UTC

Just got two GTX 980's, will install tomorrow. Should be interesting to see how we go!

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38067 - Posted: 25 Sep 2014 | 8:47:13 UTC - in response to Message 38048.

What does Boinc say, about amount of (peak) FLOPS in event log for GTX980? Near 5TeraFLOPS? Over at Mersenne trial-factoring--- a GTX980 is listed @ 1,126GHz and 4,710 GFLOPS.

Could somebody running a Maxwell-aware version of BOINC check and report this, please, and do a sanity-check of whether BOINC's figure is correct from what you know of the card's SM count, cores per SM, shader clock, flops_per_clock etc. etc? We got the figures for the 'baby Maxwell' 750/Ti into BOINC on 24 February (3edb124ab4b16492d58ce5a6f6e40c2244c97ed6), but I think that was just too late to catch v7.2.42

We're in a similar position this time, with v7.4.22 at release-candidate stage - I'd say that one was safe to test with, if nobody here has upgraded yet. TIA.


Here's what boinc 7.4.22 (64bit-linux version) is reporting:

Starting BOINC client version 7.4.22 for x86_64-pc-linux-gnu
CUDA: NVIDIA GPU 0: GeForce GTX 980 (driver version 343.22, CUDA version 6.5, compute capability 5.2, 4096MB, 3557MB available, 4979 GFLOPS peak)
OpenCL: NVIDIA GPU 0: GeForce GTX 980 (driver version 343.22, device version OpenCL 1.1 CUDA, 4096MB, 3557MB available, 4979 GFLOPS peak)

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38068 - Posted: 25 Sep 2014 | 10:17:13 UTC - in response to Message 38067.

What does Boinc say, about amount of (peak) FLOPS in event log for GTX980? Near 5TeraFLOPS? Over at Mersenne trial-factoring--- a GTX980 is listed @ 1,126GHz and 4,710 GFLOPS.

Could somebody running a Maxwell-aware version of BOINC check and report this, please, and do a sanity-check of whether BOINC's figure is correct from what you know of the card's SM count, cores per SM, shader clock, flops_per_clock etc. etc? We got the figures for the 'baby Maxwell' 750/Ti into BOINC on 24 February (3edb124ab4b16492d58ce5a6f6e40c2244c97ed6), but I think that was just too late to catch v7.2.42

We're in a similar position this time, with v7.4.22 at release-candidate stage - I'd say that one was safe to test with, if nobody here has upgraded yet. TIA.


Here's what boinc 7.4.22 (64bit-linux version) is reporting:

Starting BOINC client version 7.4.22 for x86_64-pc-linux-gnu
CUDA: NVIDIA GPU 0: GeForce GTX 980 (driver version 343.22, CUDA version 6.5, compute capability 5.2, 4096MB, 3557MB available, 4979 GFLOPS peak)
OpenCL: NVIDIA GPU 0: GeForce GTX 980 (driver version 343.22, device version OpenCL 1.1 CUDA, 4096MB, 3557MB available, 4979 GFLOPS peak)


OpenCL 1.1! A spec from 2010 (Fermi era).
The OpenCL 2.0 spec has been out for almost a year. This is NVidia telling Intel and AMD they don't give a hoot about OpenCL, because of CUDA.

localizer
Send message
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38069 - Posted: 25 Sep 2014 | 10:43:59 UTC

............. Any more thoughts on when we might see a revised app for the 980 - mine looks very nice, but I'd like to put it to work!!

Thanks,
P.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38070 - Posted: 25 Sep 2014 | 12:21:07 UTC - in response to Message 38066.
Last modified: 25 Sep 2014 | 12:21:27 UTC

Just got two GTX 980's, will install tomorrow. Should be interesting to see how we go!

We're waiting for a working app, so prepare a spare project for a while.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38071 - Posted: 25 Sep 2014 | 13:49:27 UTC - in response to Message 38051.

There's now a linux build on acemdbeta. You'll definitely be needing to use a Linux client that reports the right driver version.


I've got the latest boinc client for linux but am still getting no tasks for my GTX 980.

Thu 25 Sep 2014 07:29:53 AM EDT | | Starting BOINC client version 7.4.22 for x86_64-pc-linux-gnu
Thu 25 Sep 2014 07:29:53 AM EDT | | log flags: file_xfer, sched_ops, task
Thu 25 Sep 2014 07:29:53 AM EDT | | Libraries: libcurl/7.35.0 OpenSSL/1.0.1f zlib/1.2.8 libidn/1.28 librtmp/2.3
Thu 25 Sep 2014 07:29:53 AM EDT | | Data directory: /home/mark/BOINC
Thu 25 Sep 2014 07:29:53 AM EDT | | CUDA: NVIDIA GPU 0: GeForce GTX 980 (driver version 343.22, CUDA version 6.5, compute capability 5.2, 4096MB, 3566MB available, 4979 GFLOPS peak)
Thu 25 Sep 2014 07:29:53 AM EDT | | OpenCL: NVIDIA GPU 0: GeForce GTX 980 (driver version 343.22, device version OpenCL 1.1 CUDA, 4096MB, 3566MB available, 4979 GFLOPS peak)
Thu 25 Sep 2014 09:48:27 AM EDT | GPUGRID | Sending scheduler request: Requested by user.
Thu 25 Sep 2014 09:48:27 AM EDT | GPUGRID | Requesting new tasks for NVIDIA GPU
Thu 25 Sep 2014 09:48:29 AM EDT | GPUGRID | Scheduler request completed: got 0 new tasks
Thu 25 Sep 2014 09:48:29 AM EDT | GPUGRID | No tasks sent
Thu 25 Sep 2014 09:48:29 AM EDT | GPUGRID | No tasks are available for ACEMD beta version
Thu 25 Sep 2014 09:48:29 AM EDT | GPUGRID | No tasks are available for the applications you have selected.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38073 - Posted: 25 Sep 2014 | 18:43:47 UTC
Last modified: 25 Sep 2014 | 18:48:29 UTC

Regarding the power consumption of the new BigMaxwell:
I had a different GPU workunit from folding@home (project 7621) on my GTX980, and it had different readouts:
GPU usage: 99-100% (this WU had a much lower CPU thread utilization ~1%)
GPU power 87-88% (~150W increase measured at the wall outlet)
GPU temperature: 63°C (ambient: 24°C)
GPU memory controller load: 26%
GPU memory used: 441MB
GPU voltage: 1.218V
GPU core clock: 1240MHz

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38074 - Posted: 25 Sep 2014 | 19:45:29 UTC - in response to Message 38073.

Regarding the power consumption of the new BigMaxwell:
I had a different GPU workunit from folding@home (project 7621) on my GTX980, and it had different readouts:
GPU usage: 99-100% (this WU had a much lower CPU thread utilization ~1%)
GPU power 87-88% (~150W increase measured at the wall outlet)
GPU temperature: 63°C (ambient: 24°C)
GPU memory controller load: 26%
GPU memory used: 441MB
GPU voltage: 1.218V
GPU core clock: 1240MHz


Project 7621 uses the GPU "core 15" (Fahcore 0x15) version, which is the oldest GPU client and runs exclusively on Windows machines. Those WUs generally run hot and use very little CPU, as you've noticed. They are fixed-credit WUs, so PPD is low.

Core 17 WUs are more efficient since they use a more recent version of OpenMM and are distributed to both Windows and Linux via an OpenCL app. They generally use 100% of a CPU core on machines with an Nvidia card. These WUs offer a quick return bonus (QRB) and are very popular because the faster the card, the higher the bonus.

My GTX980 has finished several 9201 project (core17) WUs and is averaging 330,000 ppd. Amazing.

Linux users have an advantage in that only core 17 WUs are delivered to linux machines.

There are core 18 WUs now available to windows users. I don't know anything about them yet.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38075 - Posted: 26 Sep 2014 | 0:18:05 UTC - in response to Message 38050.

The acemd.841-65.exe file is 3.969.024 bytes long, but the acemd.842-65.exe is only 1.112.576 bytes long, so something went wrong with the latter.

no, that's deliberate. It's a Maxwell-only build

I've made my BOINC client start this acemd.842-65.exe as acemd.841-60.exe by overwriting the latter and setting <dont_check_file_sizes> in cc_config.xml, and I've modified client_state.xml so that cudart32_65.dll and cufft32_65.dll are copied to the slot with the app, but I've got the same result as before with the 841-65 app.
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 520

http://www.gpugrid.net/result.php?resultid=13130843
http://www.gpugrid.net/result.php?resultid=13132835
http://www.gpugrid.net/result.php?resultid=13135543
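For reference, the trick relies on the standard BOINC option <dont_check_file_sizes>1</dont_check_file_sizes> inside the <options> section of cc_config.xml - without it the client notices the size mismatch and treats the swapped executable as corrupt.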

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38079 - Posted: 26 Sep 2014 | 8:46:35 UTC

I have data comparing my 780Ti with the 980 at Folding@home.

hardware:
780Ti (only gpu in system) in 2600K, PCIE2 slot, 64-bit linux mint 17 LTS, nvidia driver 343.22
980 (only gpu in system) in 3930K, PCIE2 slot, 64-bit linux mint 17 LTS, nvidia driver 343.22

Project 9201 (core 17)
780Ti@1106MHz (+100MHz OC): TPF=117 seconds, PPD=255,750
980@1352MHz (+100MHz OC): TPF=93 seconds, PPD=360,510

Looks like a 20.5% reduction in TPF. Seems to correlate with difference in clock speed (22%)?

It's not a perfect comparison but it's the best I can do.
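A rough way to look at it, using the reference core counts (2880 on the 780Ti, 2048 on the 980): the 780Ti has 2880 x 1106 ≈ 3.19M core-MHz on tap versus 2048 x 1352 ≈ 2.77M for the 980, yet the 980 finishes ~26% faster (117/93). That works out to roughly 1.4-1.5x more work per core per clock, in line with the Maxwell-vs-Kepler per-core estimates earlier in the thread.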

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38080 - Posted: 26 Sep 2014 | 9:46:19 UTC - in response to Message 38075.

Well, I've not fixed the scheduler, but would you like to try that trick again with the new version 844?

Matt

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38081 - Posted: 26 Sep 2014 | 10:39:07 UTC - in response to Message 38080.

Well, I've not fixed the scheduler, but would you like to try that trick again with the new version 844?

Matt

At once, sire. :)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38082 - Posted: 26 Sep 2014 | 10:48:33 UTC - in response to Message 38080.
Last modified: 26 Sep 2014 | 10:56:35 UTC

Well, I've not fixed the scheduler, but would you like to try that trick again with the new version 844?

Matt

...aaaaand we have a lift-off!
It's crunching.
# GPU [GeForce GTX 980] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 980
# ECC : Disabled
# Global mem : 4095MB
# Capability : 5.2
# PCI ID : 0000:03:00.0
# Device clock : 1215MHz
# Memory clock : 3505MHz
# Memory width : 256bit
# Driver version : r343_98 : 34411
# GPU 0 : 41C
# GPU 0 : 43C
# GPU 0 : 44C
# GPU 0 : 46C
# GPU 0 : 47C
# GPU 0 : 49C
# GPU 0 : 50C
# GPU 0 : 52C
# GPU 0 : 53C
# GPU 0 : 54C
# GPU 0 : 55C
# GPU 0 : 56C
# GPU 0 : 57C
# GPU 0 : 58C
# GPU 0 : 59C
# GPU 0 : 60C
# GPU 0 : 61C
# GPU 0 : 62C
# GPU 0 : 63C


709-NOELIA_20MGWT-1-5-RND4766_0
GPU usage: 93-97% (CPU 100%, PCIe2.0x16)
GPU power 93% (~160W increase measured at the wall outlet)
GPU temperature: 64°C (ambient: 24°C)
GPU memory controller load: 50%
GPU memory used: 825MB
GPU voltage: 1.218V
GPU core clock: 1240MHz

I estimate it will take 19.200 sec to finish this workunit (5h20m), which is more than it takes on a GTX780Ti (16.712), so I really should move this card to another host with PCIe3.0.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38083 - Posted: 26 Sep 2014 | 11:34:32 UTC

Good news that the app is working but disappointing performance.

Time to move the windows and (linux?) app to "non-beta"?

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38084 - Posted: 26 Sep 2014 | 12:09:02 UTC - in response to Message 38083.
Last modified: 26 Sep 2014 | 12:17:33 UTC

Good news that the app is working but disappointing performance.

Time to move the windows and (linux?) app to "non-beta"?


Disappointing compared to GK110, or to GK104 boards? The GTX980 (64 DP cores: 4 per SMM, i.e. 1 per 32-core block) is the replacement for the GTX680 (64 DP cores: 8 per SMX), NOT for the 96-DP-core GTX780 or the 120-DP-core GTX780Ti. The 250W-TDP Titan (Black) has 896/960 DP cores (64 per SMX).
Compared to the GTX680, I'd say the GTX980 is an excellent performer, apart from double precision.
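Expressed as ratios (using the reference core counts): 64/2048 = 1/32 DP:SP on the GTX980, versus 64/1536 = 1/24 on the GTX680 and 896/2688 = 1/3 on the original Titan.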

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38085 - Posted: 26 Sep 2014 | 12:24:04 UTC - in response to Message 38084.

Good news that the app is working but disappointing performance.

Time to move the windows and (linux?) app to "non-beta"?


Disappointing compared to GK110? Or GK104 boards? GTX980 (64DP cores/4DPperSMM/1DPper32coreblock) is replacement for GTX680 (64DP/8DPperSMX), NOT 96DPcore GTX780 or 120DPcore GTX780ti. Titan(Black)250TDP have 896/960 DP cores (64DPperSMX)
Compared to GTX680, I'd say GTX980 is an excellent performer, other than Double Float.


I believe the GPUGrid app uses SP floating point calculations.

F@H also uses SP.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38086 - Posted: 26 Sep 2014 | 12:59:37 UTC - in response to Message 38085.
Last modified: 26 Sep 2014 | 13:02:20 UTC

Good news that the app is working but disappointing performance.

Time to move the windows and (linux?) app to "non-beta"?

Disappointing compared to GK110? Or GK104 boards? GTX980 (64DP cores/4DPperSMM/1DPper32coreblock) is replacement for GTX680 (64DP/8DPperSMX), NOT 96DPcore GTX780 or 120DPcore GTX780ti. Titan(Black)250TDP have 896/960 DP cores (64DPperSMX)
Compared to GTX680, I'd say GTX980 is an excellent performer, other than Double Float.

I believe the GPUGrid app uses SP floating point calculations.

F@H also uses SP.

You're right about GPUGrid.
I'll swap my GTX670 and GTX980 and we'll see how it performs in a PCIe3.0x16 slot.
I expect a GTX980 to be faster than a GTX780Ti by at least 10%.
Maybe it won't be faster in the beginning, but in time the GPUGrid app could be refined for Maxwell. Besides, different workunit batches will show different gains (it could even be a loss of performance).

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38087 - Posted: 26 Sep 2014 | 13:09:14 UTC - in response to Message 38086.
Last modified: 26 Sep 2014 | 13:10:13 UTC

Good news that the app is working but disappointing performance.

Time to move the windows and (linux?) app to "non-beta"?

Disappointing compared to GK110? Or GK104 boards? GTX980 (64DP cores/4DPperSMM/1DPper32coreblock) is replacement for GTX680 (64DP/8DPperSMX), NOT 96DPcore GTX780 or 120DPcore GTX780ti. Titan(Black)250TDP have 896/960 DP cores (64DPperSMX)
Compared to GTX680, I'd say GTX980 is an excellent performer, other than Double Float.

I believe the GPUGrid app uses SP floating point calculations.

F@H also uses SP.

You're right about GPUGrid.
I'll swap my GTX670 and GTX980 and we'll see how's its performance in a PCIe3.0x16 slot.
I expect that a GTX980 should be faster than a GTX780Ti at least by 10%.
Maybe it won't be faster in the beginning, but in time the GPUGrid app could be refined for Maxwells. Besides different workunit batches will gain different performance (it could be even a loss of performance).


What do you think the difference between PCIe2 x16 and PCIe3 x16 is for GPUGRID and similar programs? Also, do you have an idea how many of those "scalar" GM204 cores are cooking? Earlier in this thread you estimated 1920-2880 cores are being utilized on the "superscalar" GK110.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38088 - Posted: 26 Sep 2014 | 14:12:00 UTC - in response to Message 38082.

Could you crop me the performance information from the *0_0 output file, please?

Matt

klepel
Send message
Joined: 23 Dec 09
Posts: 165
Credit: 2,835,189,088
RAC: 234,082
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38090 - Posted: 26 Sep 2014 | 15:30:31 UTC - in response to Message 38085.

biodoc, I sent you an off-topic PM just this moment.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38091 - Posted: 26 Sep 2014 | 16:23:14 UTC - in response to Message 38088.

Could you crop me the Performance information the *0_0 output file, please?

Matt

It's already finished, and uploaded.
I'll swap my cards when I get home.
709-NOELIA_20MGWT-1-5-RND4766_0 18,458.00 sec

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38094 - Posted: 26 Sep 2014 | 21:40:24 UTC - in response to Message 38088.

Could you crop me the Performance information the *0_0 output file, please?

Matt

I've successfully swapped my GTX670 and GTX980 and hacked this client, so now I have another workunit in progress.

The workunit is 13.103% complete at 40 minutes; the estimated total computing time is 18.316 sec (5h5m).
A similar workunit took 16.616 sec (4h37m) to finish on my GTX780Ti (@1098MHz)
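(That estimate is just the straight-line extrapolation: 40 min / 0.13103 ≈ 305 min, i.e. about 5h05m.)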

CPU: Core i7-4770K @4.3GHz, 8GB DDR3 1866MHz
GPU usage: 98% (CPU thread 100%, PCIe3.0x16)
GPU Temperature: 62°C
GPU Memory Controller load: 52%
GPU Memory usage: 804MB
GPU Voltage: 1.218V
GPU Power: 95% (Haven't measured at the wall outlet)
GPU Core Clock: 1240MHz

# Simulation rate 83.10 (ave) 83.10 (inst) ns/day. Estimated completion Sat Sep 27 11:06:30 2014
# Simulation rate 88.80 (ave) 95.34 (inst) ns/day. Estimated completion Sat Sep 27 10:19:41 2014
# Simulation rate 91.00 (ave) 95.75 (inst) ns/day. Estimated completion Sat Sep 27 10:03:10 2014
# Simulation rate 92.05 (ave) 95.34 (inst) ns/day. Estimated completion Sat Sep 27 09:55:35 2014
[...]
# Simulation rate 94.77 (ave) 95.34 (inst) ns/day. Estimated completion Sat Sep 27 09:36:39 2014
# Simulation rate 94.78 (ave) 94.93 (inst) ns/day. Estimated completion Sat Sep 27 09:36:38 2014
(remainder of the log trimmed: the average rate settles at ~94.7-94.8 ns/day, with estimated completion around 09:36-09:37 on Sat Sep 27 2014)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38097 - Posted: 27 Sep 2014 | 9:10:11 UTC

My GTX980 is crunching fine, a little slower than a GTX780Ti, while consuming much less power. So probably the GPUGrid client can use more than 1920 CUDA cores of the GTX780Ti (or it can't use all CUDA cores in Maxwell).
Here is the task list of this host.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38098 - Posted: 27 Sep 2014 | 11:24:11 UTC
Last modified: 27 Sep 2014 | 11:27:32 UTC

Great work, Zoltan!

For comparison, my GTX660Ti with "Eco-tuning" running a NOELIA_20MGWT, which yields the same credits as the WUs you used:
GPU usage: 93% (CPU 1-2% of an 8-threaded i7 3770K, PCIe3.0x16)
GPU power 100% of its 110 W limit (-> ~121W at the wall outlet, increase over idle ~105 W)
GPU temperature: 64°C (ambient: 22°C)
GPU memory controller load: 39%
GPU memory used: 978MB
GPU voltage: 1.05V
GPU core clock: 1084MHz

Runtime will be ~39000s, as usual. Taking a Win 8.1 tax of ~7% for my system into account, you achieve just about double the performance. The cards' power consumption is 110 W vs. 165*0.93 = 153.5 W, i.e. your card consumes only about 40% more! (not taking PSU efficiencies into account here)

I'd be interested in how your numbers change if you eco-tune your card to ~1.1 V by reducing the power target. If you don't want to run such tests, don't worry; I'll probably measure this myself soon with a GTX970 ;)

biodoc wrote:
Good news that the app is working but disappointing performance.

I would say it's only disappointing if your expectations were set really high. So far GM204 is not performing miracles here, but it's performing solidly at almost the performance level of GK110 for far less power used.

biodoc wrote:
I believe the GPUGrid app uses SP floating point calculations.

Correct.

eXaPower wrote:
Also, do you have any idea how many of those "scalar" GM204 cores are cooking? Earlier in this thread you estimated 1920-2880 cores are being utilized for the "superscalar" GK110.

It was always hard for GPU-Grid to use the superscalar shaders, which amount to 1/3 of all shaders in "all but the high-end Fermis" and in all Keplers. That's where this number comes from. Maxwell has no such restrictions, hence all shaders can be used in principle. This says nothing about other potential bottlenecks, however: PCIe bus, memory bandwidth, CPU support etc. Translating these limitations into statements along the lines of "can only use xxxx shaders" would be misleading.
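To make "superscalar" concrete, here is a toy kernel (my own hedged sketch, not GPU-Grid/ACEMD code; the names and iteration counts are made up). On Kepler a warp scheduler can only dual-issue a second instruction from the same warp if it is independent of the first, so code needs instruction-level parallelism, like the two independent FMA chains below, to have any chance of feeding that extra third of the cores; Maxwell instead gives each scheduler its own core partition.

    // toy_ilp.cu - illustrative sketch only, not project code
    #include <cuda_runtime.h>

    __global__ void fma_chains(float *out, int iters)
    {
        // Two independent accumulators give a dual-issue (superscalar)
        // scheduler a second, independent instruction to pair with the first.
        float a = 1.0f, b = 2.0f;
        for (int i = 0; i < iters; ++i) {
            a = a * 1.000001f + 0.5f;   // chain 1
            b = b * 0.999999f + 0.25f;  // chain 2, independent of chain 1
        }
        out[blockIdx.x * blockDim.x + threadIdx.x] = a + b;
    }

    int main()
    {
        float *d_out;
        cudaMalloc(&d_out, 1024 * 256 * sizeof(float));
        fma_chains<<<1024, 256>>>(d_out, 100000);
        cudaDeviceSynchronize();
        cudaFree(d_out);
        return 0;
    }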

Edit: BTW, what's the memory controller load for GTX780Ti running such tasks?

MrS
____________
Scanning for our furry friends since Jan 2002

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38099 - Posted: 27 Sep 2014 | 12:15:46 UTC

There are more potential variables in Zoltan's tests so far:

Cuda 6.5 vs 6.0
Zoltan's 780Ti cards: Are they reference cards or overclocked?

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38100 - Posted: 27 Sep 2014 | 12:46:34 UTC
Last modified: 27 Sep 2014 | 12:47:02 UTC

http://www.anandtech.com/show/8568/the-geforce-gtx-970-review-feat-evga/13

Very interesting comments about GTX970 GPC partition(s), requiring further investigation.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 912
Credit: 2,216,536,145
RAC: 255,119
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38101 - Posted: 27 Sep 2014 | 12:55:44 UTC - in response to Message 38100.

http://www.anandtech.com/show/8568/the-geforce-gtx-970-review-feat-evga/13

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38102 - Posted: 27 Sep 2014 | 12:57:04 UTC

@Biodoc: valid points. Regarding the clockspeed Zoltan said his GTX780Ti was running at 1098MHz, so it's got a "typical" overclock. And the new app claims to be CUDA 6.5. However, I don't think Matt changed the actual crunching code for this release, so any differences would come from changes in built-in functions. During the last few CUDA releases we haven't seen any large changes of GPU-Grid performance, so I don't expect it this time either. Anyway, for the best comparison both cards should run the new version.

MrS
____________
Scanning for our furry friends since Jan 2002

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38104 - Posted: 27 Sep 2014 | 13:23:13 UTC - in response to Message 38102.

@Biodoc: valid points. Regarding the clockspeed Zoltan said his GTX780Ti was running at 1098MHz, so it's got a "typical" overclock. And the new app claims to be CUDA 6.5. However, I don't think Matt changed the actual crunching code for this release, so any differences would come from changes in built-in functions. During the last few CUDA releases we haven't seen any large changes of GPU-Grid performance, so I don't expect it this time either. Anyway, for the best comparison both cards should run the new version.

MrS


Has dynamic parallelism (C.C 3.5/5.0/5.2) been introduced to ACEMD? Or Unified Memory from CUDA 6.0? Unified memory is a C.C 3.0+ feature.
Quoted from newest CUDA programming guide-- "new managed memory space in which all processors see a single coherent memory image with a common address space. A processor refers to any independent execution unit with a dedicated MMU. This includes both CPUs and GPUs of any type and architecture. "
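For anyone who hasn't seen it, a minimal unified-memory sketch looks like the following (my own illustration, assuming a C.C 3.0+ card and CUDA 6.0+; the kernel and sizes are made up and this is not ACEMD code): a single cudaMallocManaged allocation is visible to both CPU and GPU, with the runtime migrating the data behind the scenes.

    // um_sketch.cu - illustrative sketch only
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float *x, int n, float s)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= s;
    }

    int main()
    {
        const int n = 1 << 20;
        float *x;
        cudaMallocManaged(&x, n * sizeof(float)); // one pointer, valid on host and device
        for (int i = 0; i < n; ++i) x[i] = 1.0f;  // host writes directly, no cudaMemcpy
        scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);
        cudaDeviceSynchronize();                  // must sync before the host reads again
        printf("x[0] = %f\n", x[0]);
        cudaFree(x);
        return 0;
    }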

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38106 - Posted: 27 Sep 2014 | 13:32:39 UTC

I posted some power consumption data for my GTX980 (+/- overclock) at the F@H forum.

Also, there's some early numbers for a GTX970 in the same thread.

https://foldingforum.org/viewtopic.php?f=38&t=26757&p=269043#p269043

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38108 - Posted: 27 Sep 2014 | 17:21:01 UTC - in response to Message 38104.

Has dynamic parallelism (C.C 3.5/5.0/5.2) been introduced to ACEMD? Or Unified Memory from CUDA 6.0? Unified memory is a C.C 3.0+ feature.

Dynamic parallelism: no. It would break compatibility with older cards or require two separate code paths. Besides, GPU-Grid doesn't have much of a problem occupying all shader multiprocessors (SM, SMX etc.).
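(For readers unfamiliar with the feature: dynamic parallelism means a kernel launching another kernel from the device, roughly as in the hedged sketch below - my own illustration, not project code. It needs a cc 3.5+ card plus relocatable device code, which is exactly the compatibility cost mentioned above.)

    // dp_sketch.cu - illustrative sketch only
    // build: nvcc -arch=sm_35 -rdc=true dp_sketch.cu -lcudadevrt
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void child(int parentBlock)
    {
        printf("child kernel launched by block %d\n", parentBlock);
    }

    __global__ void parent()
    {
        // Device-side launch: only cc 3.5+ hardware supports this.
        if (threadIdx.x == 0)
            child<<<1, 1>>>(blockIdx.x);
    }

    int main()
    {
        parent<<<4, 32>>>();
        cudaDeviceSynchronize();
        return 0;
    }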

Unified memory: this is only meant to ease programming for new applications, at the cost of some performance. For any existing code with optimized manual memory management (e.g. GPU-Grid) this would actually be a drawback.

MrS
____________
Scanning for our furry friends since Jan 2002

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38109 - Posted: 27 Sep 2014 | 18:13:10 UTC - in response to Message 38108.
Last modified: 27 Sep 2014 | 18:19:46 UTC

Has dynamic parallelism (C.C 3.5/5.0/5.2) been introduced to ACEMD? Or Unified Memory from CUDA 6.0? Unified memory is a C.C 3.0+ feature.

Dynamic parallelism: no. It would break compatibility with older cards or require two separate code paths. Besides, GPU-Grid doesn't have much of a problem occupying all shader multiprocessors (SM, SMX etc.).

Unified memory: this is only meant to ease programming for new applications, at the cost of some performance. For any existing code with optimized manual memory management (e.g. GPU-Grid) this would actually be a drawback.

MrS


In your opinion: how can GPUGRID's SM/SMX/SMM occupancy be further enhanced and refined for generational (CUDA C.C.) differences? Compatibility is important, as is finding the most efficient code path in CUDA programming. How can we further advance ACEMD? CUDA 5.0/PTX 3.1 ~~~> 6.5/4.1 provides new commands/instructions.
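(Not a developer answer, just an aside: one of the additions in CUDA 6.5 is an occupancy API that reports how many blocks of a given kernel fit on one SM/SMX/SMM. A hedged sketch of its use follows; the dummy kernel and block size are made up and this is not ACEMD code.)

    // occupancy_sketch.cu - illustrative sketch only, requires CUDA 6.5+
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void dummy(float *x)
    {
        x[blockIdx.x * blockDim.x + threadIdx.x] += 1.0f;
    }

    int main()
    {
        int blockSize = 256;
        int blocksPerSM = 0;
        // Ask the runtime how many resident blocks of 'dummy' fit per multiprocessor.
        cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, dummy, blockSize, 0);

        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("%d blocks/SM across %d SMs at block size %d\n",
               blocksPerSM, prop.multiProcessorCount, blockSize);
        return 0;
    }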

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2688
Credit: 1,172,901,099
RAC: 144,879
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38110 - Posted: 27 Sep 2014 | 20:20:48 UTC - in response to Message 38109.

That is a good question. One which I can unfortunately not answer. I'm just a forum mod and long-term user, not a GPU-Grid developer :)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38111 - Posted: 27 Sep 2014 | 20:40:32 UTC - in response to Message 38109.


In your opinion: how can GPUGRID's SM/SMX/SMM occupancy be further enhanced and refined for generational (CUDA C.C.) differences? Compatibility is important, as is finding the most efficient code path in CUDA programming. How can we further advance ACEMD? CUDA 5.0/PTX 3.1 ~~~> 6.5/4.1 provides new commands/instructions.



We have cc-specific optimisations for each of the most performance-sensitive kernels. Generally we don't use any of the features introduced post CUDA 4.2 though; nothing there we particularly need.

I expect the GM204 performance will be markedly improved once I have my hands on one.

Matt

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38112 - Posted: 27 Sep 2014 | 20:58:20 UTC - in response to Message 38111.

I expect the GM204 performance will be markedly improved once I have my hands on one.

Matt

I can give you remote access to my GTX980 host, if you want to.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38113 - Posted: 27 Sep 2014 | 21:08:28 UTC - in response to Message 38109.
Last modified: 27 Sep 2014 | 21:10:26 UTC

In your opinion: how can GPUGRID's SM/SMX/SMM occupancy be further enhanced and refined for generational (CUDA C.C.) differences? Compatibility is important, as is finding the most efficient code path in CUDA programming. How can we further advance ACEMD? CUDA 5.0/PTX 3.1 ~~~> 6.5/4.1 provides new commands/instructions.

There was a huge jump in performance (around 40%) when the GPUGrid app was upgraded from CUDA3.1 to CUDA4.2.
I think this huge change doesn't come very often.
I think the GM204 can run older code more efficiently than the Fermi or Kepler based GPUs; that's why other projects benefit more than GPUGrid, as this project already got its big jump at the CUDA3.1 to CUDA4.2 transition.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38115 - Posted: 27 Sep 2014 | 21:43:11 UTC - in response to Message 38111.


In your opinion: how can GPUGRID's SM/SMX/SMM occupancy be further enhanced and refined for generational (CUDA C.C.) differences? Compatibility is important, as is finding the most efficient code path in CUDA programming. How can we further advance ACEMD? CUDA 5.0/PTX 3.1 ~~~> 6.5/4.1 provides new commands/instructions.



We have cc-specific optimisations for each of the most performance sensitive kernels. Generally don't use any of the features introduced post CUdA 4.2 though, nothing there we particularly need.

I expect the GM204 performance will be markedly improved once I have my hands on one.

Matt


I found one of the many papers written by you and others -- "ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale", from the golden days of GT200. A Maxwell update, if applicable, would be very informative.

Profile @tonymmorley
Send message
Joined: 10 Mar 14
Posts: 24
Credit: 1,215,128,812
RAC: 10
Level
Met
Scientific publications
watwatwatwatwatwatwat
Message 38118 - Posted: 28 Sep 2014 | 1:40:01 UTC

Hey guys, I can't get any work for my two GTX 980's. Any thoughts, I'm a bit lost in the feed.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 280
Credit: 1,449,568,667
RAC: 73
Level
Met
Scientific publications
watwatwatwatwatwatwatwat
Message 38121 - Posted: 28 Sep 2014 | 11:03:35 UTC - in response to Message 38113.
Last modified: 28 Sep 2014 | 11:11:26 UTC

You don't see these jumps often. A 32-core block with an individual warp scheduler, rather than Kepler's flat design (all cores shared among the warp schedulers), contributes to better core management, as do Maxwell's redesigned crossbar, dispatch and issue.
Even so, GM204 (2048c/1664c) is providing performance levels close to (within ~1600s of) the 2880-core GK110, while being ~2 hours faster than a GTX780 (2304 cores). I think GM204, once tuned properly, will excel.
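(Side note: the CUDA runtime reports multiprocessors but not cores, so core counts like those above are usually derived by hand. A hedged sketch of my own, with the per-SM figures hardcoded from the published specs: 192 cores per Kepler SMX, 128 per Maxwell SMM.)

    // cores_sketch.cu - illustrative sketch only
    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, 0);

        // Cores per multiprocessor are not exposed by the API; these are the
        // published figures for the architectures discussed in this thread.
        int coresPerMP = 0;
        if (p.major == 3) coresPerMP = 192;      // Kepler SMX
        else if (p.major == 5) coresPerMP = 128; // Maxwell SMM
        printf("%s (cc %d.%d): %d MPs x %d cores = %d CUDA cores\n",
               p.name, p.major, p.minor, p.multiProcessorCount, coresPerMP,
               p.multiProcessorCount * coresPerMP);
        return 0;
    }

For example, a GTX980 reports 16 SMMs (16 x 128 = 2048 cores) and a GTX780 reports 12 SMXs (12 x 192 = 2304 cores), matching the numbers quoted above.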

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38122 - Posted: 28 Sep 2014 | 11:50:13 UTC - in response to Message 38118.

http://www.gpugrid.net/forum_thread.php?id=3603&nowrap=true#38075
If this doesn't make sense then I would suggest waiting until the project can update the scheduler, etc. as the details of what Retvari did are a bit twisty.
____________
Thanks - Steve

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38123 - Posted: 28 Sep 2014 | 11:50:33 UTC - in response to Message 38113.


There was a huge jump in performance (around 40%) when the GPUGrid app was upgraded from CUDA3.1 to CUDA4.2.
I think this huge change doesn't come very often.


That change marked the transition to a new code base. The improvement wasn't down to the change in CUDA version so much as to us developing improved algorithms.

Matt

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38124 - Posted: 28 Sep 2014 | 11:51:30 UTC - in response to Message 38118.


Hey guys, I can't get any work for my two GTX 980's. Any thoughts, I'm a bit lost in the feed.


It's not ready just yet...

Matt

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38125 - Posted: 28 Sep 2014 | 11:54:08 UTC - in response to Message 38115.


I found one of the many papers written by you and others -- "ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale", from the golden days of GT200. A Maxwell update, if applicable, would be very informative.


I'm doing a bit of work to improve the performance of the code for Maxwell hardware - expect an update before the end of the year.

Matt

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38126 - Posted: 28 Sep 2014 | 11:55:02 UTC - in response to Message 38112.


I can give you remote access to my GTX980 host, if you want to.


Most kind, but I've got some on order already. Just waiting for the slow boat from China to wend its way across the Med.

Matt

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 82,949
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38128 - Posted: 28 Sep 2014 | 11:58:45 UTC - in response to Message 38118.
Last modified: 28 Sep 2014 | 19:03:51 UTC

Not had the time to look into this in great detail but my tuppence worth:

The GTX980 and GTX970 are GM204 (non-super-scalar) but not GM210, so they are really the latest mid-range GPUs and very much aimed at the gaming community (1/32-rate FP64 and 4GB).

These are generational updates to the GK104 models, and both the big brother of, and a revision of, the GM107 (GTX750 and GTX750Ti).
As such they should be seen as gaming replacements/upgrades to GPU's such as the GTX670 and even the GTX770.

As usual there is some naming inconsistency; the GTX980 is GM204 while the GTX780 is GK110, so it's not a straight comparison or upgrade there. However, if you go back to a GTX680 the comparison is somewhat more like-for-like (GK104 vs GM204). Note that the GM107 trailblazed Maxwell.

GPU Memory Controller load: 52%

That is very high and I expect it's impacting performance, and I don't think it's simply down to bus width (256-bit) but also down to architectural changes. I was hoping this would not be the case, and it's somewhat surprising seeing as the GTX900's have 2MB of L2 cache.

That said, some of Noelia's WU's are more memory intensive than other WU's, and on my slightly underclocked GTX770 a 147-NOELIA_20MGWT WU's load is presently 30% (which is higher than other WU's).

This suggests the GTX970 is a better choice than the GTX980, certainly when you consider the ~50% price difference (UK) for ~80% of the performance. That said, I would want to know what the memory controller utilization is on a GTX970 before concluding that it is definitely a problem and recommending the 970 over the 980 (which will still do more work despite the constraints). For the 970 it might be ~43%, which isn't great and suggests another problem (architecture/code) besides the 256-bit limitation.
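(For anyone who wants to collect these readings outside GPU-Z or Afterburner, the same counter can be read programmatically through NVML. A hedged sketch of my own, assuming the NVML header and library are installed; error checking omitted.)

    // mcl_probe.cpp - illustrative sketch only; link with -lnvidia-ml
    #include <cstdio>
    #include <nvml.h>

    int main()
    {
        nvmlDevice_t dev;
        nvmlUtilization_t util;

        nvmlInit();
        nvmlDeviceGetHandleByIndex(0, &dev);
        nvmlDeviceGetUtilizationRates(dev, &util); // sampled over the last period
        printf("GPU load %u%%, memory controller load %u%%\n", util.gpu, util.memory);
        nvmlShutdown();
        return 0;
    }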

Any readings for other WU's?

In terms of performance these GPUs appear to be only on par with the high-ish end GTX700's, and performance is basically in line with the number of CUDA cores. This again suggests that there is some potential for app improvement.

It's possible that if the apps are recompiled with new CUDA Development Tools the new drivers will inherently offer improvements for the GTX900 series, but given that these are GM204 I'm not expecting miracles.

The big question was always going to be: what's the performance per Watt like here?
Apparently, when gaming, a GTX970 uses up to 30W less than a GTX770 and significantly outperforms it (on reviewed games), but the TDPs are 145W and 230W. So a GTX970 might use ~63% of a GTX770's power and at first glance appears to outperform it by ~10%. Thus I'm expecting the performance/Watt to be about 1.75 times that of a GTX770 (ball park). So from that point of view it's a winner, and maybe app tweaks can increase that further.
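(Working those numbers explicitly, and treating TDP as a stand-in for real draw: 145 W / 230 W ≈ 0.63, and 1.10 / 0.63 ≈ 1.75, so the ball-park figure above is self-consistent.)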

PS. My GTX770's Memory Controller load is only 22% for a trphisx3-NOELIA_SH2 WU, so I'm guessing the same type of WU would have a 38% load on a GTX980.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38136 - Posted: 28 Sep 2014 | 20:54:54 UTC

Trying to fix the scheduler now - if you have a 980, please sub to the acemdbeta app, accept beta work, and try again. It won't work, but I'm logging the problems now.

Matt

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38138 - Posted: 28 Sep 2014 | 22:06:44 UTC - in response to Message 38136.
Last modified: 28 Sep 2014 | 22:16:04 UTC

Did as you requested and it's now crunching what looks to be a test WU: MJHARVEY_TEST

EDIT: The scheduler worked!

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38140 - Posted: 29 Sep 2014 | 0:28:07 UTC

My GTX980 has finished 2 of the beta WUs successfully.

http://www.gpugrid.net/results.php?hostid=142719

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2058
Credit: 15,019,398,669
RAC: 4,416,896
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38144 - Posted: 29 Sep 2014 | 7:37:44 UTC - in response to Message 38140.
Last modified: 29 Sep 2014 | 7:38:36 UTC

My GTX980 has finished 2 of the beta WUs successfully.

http://www.gpugrid.net/results.php?hostid=142719

The 8.44 CUDA65 application is available for the short queue, perhaps you should give it a try too.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38147 - Posted: 29 Sep 2014 | 9:37:28 UTC - in response to Message 38144.

My GTX980 has finished 2 of the beta WUs successfully.

http://www.gpugrid.net/results.php?hostid=142719

The 8.44 CUDA65 application is available for the short queue, perhaps you should give it a try too.


I'm getting "no tasks available" for either the beta or the short run WUs.

9/29/2014 5:36:32 AM | GPUGRID | Requesting new tasks for CPU and NVIDIA GPU
9/29/2014 5:36:33 AM | GPUGRID | Scheduler request completed: got 0 new tasks
9/29/2014 5:36:33 AM | GPUGRID | No tasks sent
9/29/2014 5:36:33 AM | GPUGRID | No tasks are available for Short runs (2-3 hours on fastest card)

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38149 - Posted: 29 Sep 2014 | 9:58:02 UTC

It looks like Matt has just added a new beta app (version 8.45).

I'll keep my preferences for both beta (test applications) and short runs for now unless he requests just beta.

biodoc
Send message
Joined: 26 Aug 08
Posts: 160
Credit: 1,405,920,847
RAC: 11
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 38159 - Posted: 29 Sep 2014 | 11:17:41 UTC - in response to Message 38149.

Just got a test WU with the new beta app (8.45).

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 38160 - Posted: 29 Sep 2014 | 11:43:02 UTC - in response to Message 38144.


The 8.44 CUDA65 application is available for the short queue


Not any more; the CUDA65 error rate is suspiciously high for non-GM204 cards.

Matt

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,348,955
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat