Message boards : Graphics cards (GPUs) : Big disappointment after exchange of graphic card

Author Message
Erich56
Joined: 1 Jan 15
Posts: 638
Credit: 3,156,114,142
RAC: 806,902
Message 42509 - Posted: 29 Dec 2015 | 21:08:25 UTC
Last modified: 29 Dec 2015 | 21:41:30 UTC

Some 11 months ago, I bought a new graphics card for one of my PCs in order to participate in GPUGRID crunching.
Unfortunately, I didn't choose a really fast card: it was the NVIDIA Quadro K620.
Only 384 CUDA cores, core clock: 1000 MHz, memory clock: 900 MHz. 768 GFLOPS (single). No overclocking possible. Any "long run" task by Gerard took some 45 hours of crunching time.

So, I gave myself a Christmas gift and bought a GTX 750 Ti - any bigger card would not have fit in my PC.
But okay, this card comes with 640 CUDA cores, and I have overclocked it to 1233 MHz core clock and 1365 MHz memory clock. The GFLOPS figure indicated by the manufacturer (Zotac) before overclocking is 1322 (single).

Taking all that into account, one would estimate that crunching time for a "long run" task by Gerard should be around 30 hours (or even less), right?

However, it is exactly the same as before, i.e. some 45 hours. What a shame!!! How come?
Can anyone give me a reasonable explanation for this?
BTW, I am using the latest BOINC version 7.6.9 (64-bit), and the latest GPU driver by NVIDIA.

In fact, I was also planning to exchange the graphics card in one of my other PCs - this is a much bigger one, and I could even put in a GTX 980 - the price is around 600 Euros, but I would do it for GPUGRID, honestly. I am not a gamer, so I could easily get along with my old ASUS EN 8600GTS, which is an excellent card, but unfortunately not able to do GPUGRID crunching.

However, the disappointing results from the recent graphics card exchange make me wonder whether I should really spend 600 Euros - maybe for nothing ???
Does a "long run" by Gerard take 45 hours in any case, regardless of what GPU is being used?

Can anyone enlighten me on these mysteries?

P.S. It just occurred to me: could this have to do with the fact that the PCIe slot is PCIe 1.1 x16 (and NOT PCIe 2 or 3)?
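
For reference, the naive estimate in this post can be reproduced by scaling the old run time by the ratio of the two theoretical GFLOPS figures. This is only a sketch: it assumes run time is inversely proportional to theoretical GFLOPS and that nothing else in the system limits the GPU, and the function name is made up for illustration.

```python
def scaled_runtime_hours(old_hours, old_gflops, new_gflops):
    """Naive estimate: run time scales inversely with theoretical GFLOPS."""
    return old_hours * old_gflops / new_gflops

# Figures quoted in the post: Quadro K620 (768 GFLOPS, about 45 h per long run)
# versus GTX 750 Ti (1322 GFLOPS before overclocking).
estimate = scaled_runtime_hours(45.0, 768.0, 1322.0)
print(f"Naive estimate for the GTX 750 Ti: {estimate:.1f} h")  # 26.1 h
```

The result is in line with the "around 30 hours (or even less)" expectation, so the gap to the observed 45 hours has to come from somewhere other than the card's raw throughput.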

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2078
Credit: 15,130,925,390
RAC: 4,594,881
Message 42510 - Posted: 29 Dec 2015 | 23:18:51 UTC - in response to Message 42509.
Last modified: 29 Dec 2015 | 23:21:57 UTC

> NVIDIA Quadro K620: Only 384 CUDA cores, core clock: 1000 MHz, memory clock: 900 MHz. 768 GFLOPS (single). No overclocking possible. Any "long run" task by Gerard took some 45 hours of crunching time.
>
> GTX 750 Ti: 640 CUDA cores, I have overclocked it to 1233 MHz core clock and 1365 MHz memory clock. The GFLOPS figure indicated by the manufacturer (Zotac) before overclocking is 1322 (single).
>
> Taking all that into account, one would estimate that crunching time for a "long run" task by Gerard should be around 30 hours (or even less), right?
>
> However, it is exactly the same as before, i.e. some 45 hours. What a shame!!! How come?
> Can anyone give me a reasonable explanation for this?

It's very simple:
The GPU is just a coprocessor; several bottlenecks can be present in the system that hinder the overall performance of the GPUGrid app.
Hardware-related bottlenecks:
1. PCIe bandwidth roughly doubles with each generation (i.e. PCIe 3 is about 4 times faster than PCIe 1), so you figured that one out correctly.
2. CPU architecture has evolved since your Core 2 Quad 9550: the memory controller and the PCIe bus controller have been integrated into the CPU, which made them faster.
Software-related bottlenecks:
3. Other CPU tasks can make the GPU app slower (especially on older CPUs).
4. By setting the SWAN_SYNC environment variable, you can make the GPUGrid app use a CPU thread (core) to continuously poll the GPU, which makes the processing faster; at the same time you should reduce the number of CPU tasks by lowering the usable CPU percentage.
5. The WDDM overhead of modern Windows OSes (Vista~10) keeps the computing speed from scaling in direct proportion to the theoretical GFLOPS number (its effect varies between different GPUGrid batches). The only way to eliminate the WDDM overhead is to use a different OS (Linux or Windows XP).
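
To put rough numbers on point 1, here is a minimal sketch of the theoretical one-direction bandwidth of an x16 link per PCIe generation. The signalling rates and line encodings are the published PCIe figures, not something stated in this thread, and the names are made up for illustration; real-world throughput is lower.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    "1.x": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}

def link_bandwidth_gb_s(generation, lanes=16):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return gigatransfers * efficiency * lanes / 8

for gen in PCIE_GENERATIONS:
    print(f"PCIe {gen} x16: {link_bandwidth_gb_s(gen):.2f} GB/s")
```

With these figures an x16 slot goes from 4 GB/s (PCIe 1.x) to 8 GB/s (PCIe 2.0) to a little under 16 GB/s (PCIe 3.0), which is where the "about 4 times faster" comes from.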

> Does a "long run" by Gerard take 45 hours in any case, regardless of what GPU is being used?

No. Check the task list of this host.
It's a Core 2 Duo E8500 @ 3.16 GHz, 2x2 GB DDR2 800 MHz RAM, PCIe 2.0 (Q45 chipset), Windows XP x64, NVIDIA GeForce GTX 980.
SWAN_SYNC set, no other CPU tasks running.
22,500 sec = 6h 15m
30,000 sec = 8h 20m
It should take 20~30 hours for a GTX 750 Ti.

> Can anyone enlighten me on these mysteries?

Your system suffers from all of the bottlenecks above; however, you can eliminate the 3rd and 4th (and the 5th...) without changing your hardware.
I suggest setting the crunching hours in BOINC Manager instead of using "suspend GPU computing when computer is in use", as the latter makes the GPUGrid app suspend several times, which could lead to computing errors. Alternatively, you can manually suspend GPU crunching while you are using this computer by right-clicking the BOINC Manager icon in the taskbar's notification area and selecting "snooze GPU" (for 2 hours).

Erich56
Message 42512 - Posted: 30 Dec 2015 | 6:53:05 UTC - in response to Message 42510.

Hello Zoltan, many thanks for your thorough late-night reply. All your explanations sound very reasonable to me, and now I know a lot more about the technical environment and circumstances of GPUGRID computing, and about the situation with my PC in particular.

What I did immediately thereafter was:
1) setting SWAN_SYNC
2) reducing the percentage of CPU use from 100% to 95%. This immediately reduced the number of CPU cores used for other BOINC projects from 4 to 3. So, besides the GPUGRID task, 3 other tasks are now running at the same time, versus 4 tasks before (however, for GPUGRID the status line still says "0.951 CPUs + 1 NVIDIA GPU" - is this okay?)

So, if I now add what is shown under "elapsed" and "remaining" time for the current GPUGRID task, it adds up to some 31.5 hours (a clear improvement over the 45 hours before). I am curious what it will be for the next task, when the settings I just changed apply to the complete task.

As mentioned in my previous posting, I am still thinking about exchanging the NVIDIA 8600GTS in one of my other PCs; that card has only 32 CUDA cores and is therefore not able to do GPU computing.
However, the most severe bottleneck I see here is again the PCIe 1.1 (x16) on my ABIT IP35 PRO mainboard. The processor is a Core 2 Duo E8400 @ 3.74 GHz, with 2x2 GB DDR2 800 MHz RAM (probably the next larger bottleneck).
OS is Vista 32-bit; however, I will, in parallel, install Win10 64-bit (with dual boot) shortly.

So, after having read your explanations from last night, I am somewhat unsure which new graphics card should replace the old one.
Originally, I had planned to buy the GTX 980 (which should hopefully work with my 450W power supply, as there are no other really power-hungry parts in the PC).
However, particularly in view of the PCIe 1.1, would it make sense at all to buy a 600 Euro card which could then deliver only a fraction of its power? Wouldn't I be much better off with a smaller Maxwell card like the GTX 950 or 960?
Or, in other words: under the given circumstances (PCIe 1.1), would a GTX 980 deliver a noticeably better result than a smaller card? Or would the difference be just marginal?

Could you (or anyone else) please give me comments on this?

Retvari Zoltan
Message 42514 - Posted: 30 Dec 2015 | 12:13:29 UTC - in response to Message 42512.
Last modified: 30 Dec 2015 | 12:20:51 UTC

> 2) reducing the percentage of CPU use from 100% to 95%. This immediately reduced the number of CPU cores used for other BOINC projects from 4 to 3. So, besides the GPUGRID task, 3 other tasks are now running at the same time, versus 4 tasks before

I suggest experimenting a little more by further reducing the number of other (CPU) tasks running, to see whether the improvement in GPU performance is worth it. The Core 2 Quad CPU is basically two Core 2 Duo CPUs in a single package, so it has two separate level 2 caches (2x6 MB) (see this picture). It therefore moves a process to a core on the other chip much more slowly than a Core i5 (or i7) CPU, which has all of its cores on a single chip sharing the same level 3 cache. I wrote a little batch program to set the process affinities for my BOINC CPU tasks to avoid switching them between cores; it has very little effect on my Core i7 CPUs, but perhaps your Core 2 Quad could gain some more.
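
The batch program itself is not shown in this thread. Purely as an illustration of the idea of pinning tasks to cores, a hypothetical Python sketch could look like the following. It is not the program mentioned above; the function names, the process IDs and the choice of cores are assumptions, and `os.sched_setaffinity` is only available on Linux (on Windows a different mechanism would be needed).

```python
import os

def plan_affinity(pids, cores):
    """Assign each process ID to one core, round-robin, so tasks stay put."""
    if not cores:
        raise ValueError("at least one core is required")
    return {pid: cores[i % len(cores)] for i, pid in enumerate(pids)}

def apply_affinity(plan):
    """Pin each process to its planned core (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available on this OS")
    for pid, core in plan.items():
        os.sched_setaffinity(pid, {core})

# Hypothetical example: three CPU tasks pinned to cores 1-3, leaving core 0
# free for the thread that feeds the GPU. The PIDs are placeholders.
plan = plan_affinity([1001, 1002, 1003], cores=[1, 2, 3])
print(plan)
```

Only the planning step runs here; `apply_affinity` would have to be called with the real process IDs of the BOINC CPU tasks.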

> (however, for GPUGRID the status line still says "0.951 CPUs + 1 NVIDIA GPU" - is this okay?)

No, it's just an estimate; when SWAN_SYNC is set, the app is using "1 CPU + 1 NVIDIA GPU" in reality (you can check it with the Task Manager). Without SWAN_SYNC it's "0.25~0.50 CPUs + 1 NVIDIA GPU". You can monitor the real GPU usage with third-party tools like GPU-Z, MSI Afterburner, NVIDIA Inspector, etc. The GPU usage varies with the task type and with the bottlenecks present in the system.

> So, if I now add what is shown under "elapsed" and "remaining" time for the current GPUGRID task, it adds up to some 31.5 hours (a clear improvement over the 45 hours before). I am curious what it will be for the next task, when the settings I just changed apply to the complete task.

You can get a much more precise estimate if you divide the elapsed time by the fraction done.
Say 32.1% is done after 4h 35m 23s; then your estimate would look like this:
((4*60+35)*60+23)/0.321 = 51473.52 s
51473.52/3600 = 14.2982 h
0.2982*60 = 17.892 m
0.892*60 = 53.52 s
Total: 14h 17m 53.52s
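
The same calculation as a small Python sketch, for anyone who wants to plug in their own numbers. The function names are made up for illustration, and the estimate assumes the task progresses at a constant rate.

```python
def estimate_total_seconds(elapsed_seconds, fraction_done):
    """Total run time estimate: elapsed time divided by the fraction done."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds / fraction_done

def format_hms(seconds):
    """Format a duration in seconds as 'Hh Mm S.SSs'."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{int(hours)}h {int(minutes)}m {secs:.2f}s"

elapsed = (4 * 60 + 35) * 60 + 23            # 4h 35m 23s in seconds
total = estimate_total_seconds(elapsed, 0.321)
print(format_hms(total))                      # 14h 17m 53.52s
print(format_hms(total - elapsed))            # estimated time remaining
```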

> As mentioned in my previous posting, I am still thinking about exchanging the NVIDIA 8600GTS in one of my other PCs; that card has only 32 CUDA cores and is therefore not able to do GPU computing.
> However, the most severe bottleneck I see here is again the PCIe 1.1 (x16) on my ABIT IP35 PRO mainboard. The processor is a Core 2 Duo E8400 @ 3.74 GHz, with 2x2 GB DDR2 800 MHz RAM (probably the next larger bottleneck).
> OS is Vista 32-bit; however, I will, in parallel, install Win10 64-bit (with dual boot) shortly.

Believe me, WDDM is a much more severe bottleneck for high-end GPUs, so if you install Windows XP (for crunching purposes only) it will outperform a Core i7 running Windows Vista~10. The only difficulty is that there's no driver support for high-end GPUs under Windows XP, but the GTX 960's driver can be used by adding a line to the driver's nv4_dispi.inf file (described in this and this post).

> So, after having read your explanations from last night, I am somewhat unsure which new graphics card should replace the old one.
> Originally, I had planned to buy the GTX 980 (which should hopefully work with my 450W power supply, as there are no other really power-hungry parts in the PC).
> However, particularly in view of the PCIe 1.1, would it make sense at all to buy a 600 Euro card which could then deliver only a fraction of its power? Wouldn't I be much better off with a smaller Maxwell card like the GTX 950 or 960?

If you are not planning to buy a state-of-the-art CPU+MB+RAM in the near future, I would recommend buying a GTX 960 or a GTX 970.

> Or, in other words: under the given circumstances (PCIe 1.1), would a GTX 980 deliver a noticeably better result than a smaller card?

It depends on how far you want to go to achieve the best GPU performance.
If you install Linux or Windows XP and don't crunch CPU tasks, then it is worth buying a GTX 980 for your existing hardware.

fractal
Joined: 16 Aug 08
Posts: 87
Credit: 1,182,467,326
RAC: 202,295
Message 42517 - Posted: 31 Dec 2015 | 1:27:18 UTC

The 750 Ti was a workhorse at GPUGRID for years. Unfortunately, that time has passed. Some batches of work can be completed and uploaded in under 24 hours. Many cannot. There was one batch that would complete in just under 24 hours, but the half hour to upload pushed it over the bonus limit. Overclocking would help for this one batch. I have a pair of them in a machine that got moved to a different project last month. They are still decent FLOPS/$ and FLOPS/W but a bit too wimpy for GPUGRID.

Looking at your second machine, I would ask you the following. What power supply is in that machine and do you want to purchase a new supply? I'll offer the following observation.

GTX 960 : 8 pin PCIe power plug. I bought one for my son for xmas and paid 20 US$ after rebate for a 420w power supply with the right plug.
GTX 970 : 2 x 6 pin PCIe power plugs. This is the most popular card on GPUGRID at the moment for good reason. Good FLOPS/$ and FLOPS/W. Great card.
GTX 980 : Still 2 x 6 pin PCIe power plugs. Good FLOPS/W but not so great FLOPS/$. The 970 is a better card if cost is a concern.

I do not like the adapter plugs that often come with cards. I would never, ever use them for a card you intend to use for GPUGRID. Make sure your power supply has the power plugs your card needs as shipped by the manufacturer.

I like Linux for crunch boxes. They save the Microsoft tax and return better performance than new Windows versions. The disadvantage is that it is more difficult to overclock the video card under Linux than under Windows.

Good luck and crunch on.

Erich56
Message 42518 - Posted: 31 Dec 2015 | 6:20:15 UTC - in response to Message 42517.

> The 750 Ti was a workhorse at GPUGRID for years. Unfortunately, that time has passed. Some batches of work can be completed and uploaded in under 24 hours. Many cannot. ... They are still decent FLOPS/$ and FLOPS/W but a bit too wimpy for GPUGRID.

Unfortunately, this PC is too small for any card larger than the 750 Ti.
So, my thought was: better than nothing :-)


> Looking at your second machine, I would ask you the following. What power supply is in that machine

It is a Corsair 450W; however, only one 6-pin plug is available. I would have to get any second PCIe plug (either 6-pin or 8-pin) through a Molex adapter.
Or change the power supply.
Just for my knowledge: for what reason exactly is your advice NOT to use such adapters?

Erich56
Message 42519 - Posted: 31 Dec 2015 | 6:40:30 UTC - in response to Message 42514.

> If you install Linux or Windows XP and don't crunch CPU tasks, then it is worth buying a GTX 980 for your existing hardware.

Too bad that I got rid of a PC with XP installed some months ago (however, I seem to remember it had only AGP, not PCIe - so no good for GPU computing anyway).
While I think it should be no problem to still get an XP CD from some source for a new installation, I am wondering how safe this really is, since the system would be connected to the internet all the time. How did you solve this problem?

Retvari Zoltan
Message 42520 - Posted: 31 Dec 2015 | 15:33:40 UTC - in response to Message 42519.
Last modified: 31 Dec 2015 | 15:42:27 UTC

> While I think it should be no problem to still get an XP CD from some source for a new installation, I am wondering how safe this really is, since the system would be connected to the internet all the time. How did you solve this problem?

That's why I wrote "if you install Windows XP (for crunching purposes only)", meaning that you have a dual-boot computer: the 1st OS is Windows XP for crunching, the 2nd OS is any later Windows (preferably 10) for general purposes. But if you install Windows XP x86 (32-bit), you can still get security updates for it by applying this method.
If your hosts are behind a router (i.e. your computers don't have a public IP address and don't reside on a LAN which has unknown computers on it), then your computers can't be attacked directly from the internet. Attacking a computer is usually done by attacking its user. This can be done through e-mail or instant message attachments or links, "free" downloads where you can't really find the right download button at first, "free" games, content or browser plugins, or fraudulent web pages. All of these work on any modern OS, because it's easier to trick the user. The best tool to counter these risks is Malwarebytes Anti-Malware. However, I don't recommend using Windows XP for anything besides crunching.

Retvari Zoltan
Message 42521 - Posted: 31 Dec 2015 | 15:49:36 UTC - in response to Message 42518.
Last modified: 31 Dec 2015 | 15:49:51 UTC

> Just for my knowledge: for what reason exactly is your advice NOT to use such adapters?

The reason is contact resistance: these power connectors don't have a gold coating, yet high currents go through them (240W/12V=20A).
The Molex connectors were designed for hard disk drives, which usually consume no more than 2A from the 12V and the 5V power supply pins.
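
The arithmetic behind this can be sketched in a few lines of Python. The 240W/12V figure and the 2A hard disk figure come from the post above; the contact resistance of 10 milliohms and the 10A per-contact current are hypothetical values chosen only to show how the heat in a contact grows with the square of the current.

```python
def current_amps(power_watts, voltage_volts):
    """Current drawn for a given power at a given voltage: I = P / V."""
    return power_watts / voltage_volts

def contact_heat_watts(current, resistance_ohms):
    """Heat dissipated in a contact: P = I^2 * R."""
    return current ** 2 * resistance_ohms

total_current = current_amps(240, 12)  # figure used in the post: 20 A

# Hypothetical contact resistance, for illustration only.
ASSUMED_CONTACT_RESISTANCE = 0.010  # ohms

hdd_heat = contact_heat_watts(2, ASSUMED_CONTACT_RESISTANCE)    # HDD-level current
gpu_heat = contact_heat_watts(10, ASSUMED_CONTACT_RESISTANCE)   # hypothetical per-contact current

print(f"Total 12 V current: {total_current:.0f} A")
print(f"Heat per contact at 2 A:  {hdd_heat:.2f} W")
print(f"Heat per contact at 10 A: {gpu_heat:.2f} W")
```

Under these assumptions, five times the current through the same contact means 25 times the heat, which is why a connector that is fine for a hard disk can become a weak point when feeding a graphics card.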

fractal
Message 42525 - Posted: 31 Dec 2015 | 18:12:18 UTC - in response to Message 42521.

> Just for my knowledge: for what reason exactly is your advice NOT to use such adapters?
> The reason is contact resistance: these power connectors don't have a gold coating, yet high currents go through them (240W/12V=20A).
> The Molex connectors were designed for hard disk drives, which usually consume no more than 2A from the 12V and the 5V power supply pins.

This is part of it. I am too lazy to look up the current-carrying capacity of Molex pins of different materials. They are rated by how many connectors are in a housing as well as by the material. Tin is rated lower than gold. Even tin is good for half a dozen amps, which is likely adequate. You do NOT want to mix materials. A tin plug with a gold socket is worse than tin on tin.

The biggest issue is points of failure. Every connector is a point of failure. Every connector is a source of resistance. And, since resistance turns into heat, thin cables and extra connectors mean extra heat in the case. And, extra cables and adapters block air flow.

And, for the absolutely most important part: THEY LOOK FUGLY!!

Seriously though, you are spending hundreds of dollars/yen/rubles/your local currency on a graphics card that is going into a computer you spent even more on, and you are going to cheap out on the power supply? Buy something that is designed to run the card. And that means it has the proper plugs in addition to having enough current capacity.

nanoprobe
Joined: 26 Feb 12
Posts: 181
Credit: 222,317,747
RAC: 0
Message 42580 - Posted: 8 Jan 2016 | 21:55:19 UTC

> Unfortunately, this PC is too small for any card larger than the 750 Ti.


There are smaller versions of the 970 that may fit into your case.
Here's one. http://www.newegg.com/Product/Product.aspx?Item=N82E16814121912&cm_re=gtx_970-_-14-121-912-_-Product

Look around for others. I know I've seen them advertised to fit into an ITX case.

Retvari Zoltan
Message 42582 - Posted: 9 Jan 2016 | 13:17:14 UTC - in response to Message 42580.
Last modified: 9 Jan 2016 | 13:27:50 UTC

> There are smaller versions of the 970 that may fit into your case.
> Here's one. http://www.newegg.com/Product/Product.aspx?Item=N82E16814121912&cm_re=gtx_970-_-14-121-912-_-Product
>
> Look around for others. I know I've seen them advertised to fit into an ITX case.

I don't recommend using this card (or its Gigabyte equivalent) and/or an ITX case for continuous GPUGrid crunching.
The ASUS GTX970-DCMOC-4GD5 is a bit better, as it has a larger rear grille.
But any of these cards will dissipate most of their heat inside the case, which will quickly result in very high temperatures inside an ITX case.
The higher the temperature, the shorter the lifetime of the components.
Another drawback is that they have only one PCIe power connector, so higher currents go through its pins, which could shorten the lifetime of the connector and thus of the whole card.
