Message boards : Graphics cards (GPUs) : nVidia driver 340.52
Author | Message |
---|---|
Ever since I installed these newest drivers for my MSI GTX 670, I've been spitting out nothing but cuda60 errors and even a few cuda42 errors. I did a complete wipe of the drivers and went back to the previous 337.88 drivers, and cuda60 units are validating. Have there been any problems reported with the 670 and the 340.52 drivers? | |
ID: 37718 | Rating: 0 | rate: / Reply Quote | |
I have not been having those problems with the newer drivers. But, the newer drivers do push the GPUs harder, and it's possible that your current clocks can't handle them. Please read on. | |
ID: 37719 | Rating: 0 | rate: / Reply Quote | |
I'm puking up the cuda60 errors again. One thing that stands out is they crash immediately with CPU time = 0. I've been using MSI Afterburner with auto fan control and the highest the temp has gotten is 79. However with these units crashing right away the GPU isn't given the chance to get hot. Nor is the cuda42 having this insta-crash problem. | |
ID: 37720 | Rating: 0 | rate: / Reply Quote | |
Have you tried the Heaven 4.0 suggestions I gave? | |
ID: 37721 | Rating: 0 | rate: / Reply Quote | |
"Have you tried the Heaven 4.0 suggestions I gave?" 1) Yes, but so far I haven't been able to get thru a 5 hour segment with the gaming I do at night, and I work during the day. It's only been a single day since you made that suggestion. 2) If you mean the CPU portion of this project, I don't run those. If you mean other projects like WCG, no. If this project can't work with other CPU projects I'll dump this one, or at least after it's moved past cuda42. But let me ask you a question about Heaven. Will that tell me why they are crashing immediately? | |
ID: 37723 | Rating: 0 | rate: / Reply Quote | |
It will tell you immediately if your GPU is unstable. | |
ID: 37725 | Rating: 0 | rate: / Reply Quote | |
"It will tell you immediately if your GPU is unstable." Too bad it only allows single diagnostic runs. Anything at -26 MHz or less will crash immediately. I was doing diagnostics of 10 runs in a row up to -35 and thought I found a sweet spot at -31 based on FPS and score. Then at that speed I was only able to run it for a few hours before the 'computer hog' family complaints started, so I had to abort. Looks like during the night the cuda60 was still failing with 0 CPU time, so I'll be at it again. I do however have games that will run with Lucid Virtu MVP now that were crashing before. So at least that problem has been solved. | |
ID: 37732 | Rating: 0 | rate: / Reply Quote | |
If you don't click "benchmark", you can run it overnight. | |
ID: 37735 | Rating: 0 | rate: / Reply Quote | |
I thought of that. Unfortunately when the app stops responding it leaves me with a black screen, no crunching and just eating electricity. So I'll take the long weekend for this. Meanwhile I've dropped the card -60 to stock reference specs. When I manage to pick up some cuda60 units, if they still error then it shouldn't be the card. | |
ID: 37741 | Rating: 0 | rate: / Reply Quote | |
With some BOINC GPU projects (e.g., Einstein and maybe POEM), I have had problems with anti-viruses. I normally don't use any of them, since I have dedicated machines. But on my main PC, I have tried various AVs, and they all cause problems of different sorts eventually. I don't run GPUGrid on my main PC, so can't offer specific advice, but the AV exclusions usually don't do any good since they are for scanning, but don't necessarily preclude real-time protection. I think the exclusions in avast! also precluded real-time protection, but it caused some other problem that I don't recall at the moment (all on Win7 64-bit). | |
ID: 37742 | Rating: 0 | rate: / Reply Quote | |
I'm currently using the free Avast. I like that and AVG. Avira had too many false positives. | |
ID: 37749 | Rating: 0 | rate: / Reply Quote | |
I'm glad it's starting to work for you! | |
ID: 37750 | Rating: 0 | rate: / Reply Quote | |
The 340.52 drivers are working very well on my GTX 750 Ti's (Asus, not overclocked) under WinXP. I am now getting 520,000 RAC, whereas with the 337.88 drivers it was under 500,000 RAC for the two cards. And I no longer get "unstable machine" restarts. This may all be due to the work units themselves, but at least the new drivers are not hurting anything. The temps are also good, 55 to 58C with a 120 mm side fan in a 20C room running NOELIA_5MG. | |
ID: 37768 | Rating: 0 | rate: / Reply Quote | |
"The temps are also good, 55 to 58C with a 120 mm side fan in a 20C room running NOELIA_5MG." I must have read the wrong machine. These cards are now running 63 to 67C; still quite low enough on the NOELIA_5MG. | |
ID: 37773 | Rating: 0 | rate: / Reply Quote | |
Well, I had a few good units and then everything went south with more unknown errors. Digging deeper in my research, MSI lists a different reference base clock than what nVidia shows for its reference, that being 915. So I've dropped my card -105 to match that reference. Now looking at GPU-Z it shows: | |
ID: 37775 | Rating: 0 | rate: / Reply Quote | |
Instead of guessing at the clock speeds, you could do the Heaven steps, to ensure stability... I tried really hard to make them simple. | |
ID: 37776 | Rating: 0 | rate: / Reply Quote | |
"Instead of guessing at the clock speeds, you could do the Heaven steps, to ensure stability... I tried really hard to make them simple." Hi Jacob. Please don't assume I failed to follow your steps. I did follow them. It was running stable at -60. I was still getting errors, which is why I dropped it down to where it is now. I can run that program all day, which makes me think it's not the GPU speed. So yes, you succeeded in making the steps simple. Unfortunately your solution has failed. I watched as this result crapped out in 6 seconds of elapsed time. This was the first unit that caused a pop-up error with Windows error reporting: "Acemd.841-60.exe has stopped working" | |
ID: 37777 | Rating: 0 | rate: / Reply Quote | |
Sorry. So, were you able to determine a "Max Boost Clock" speed that was completely stable for 5+ hours in Heaven? | |
ID: 37778 | Rating: 0 | rate: / Reply Quote | |
"Sorry. So, were you able to determine a 'Max Boost Clock' speed that was completely stable for 5+ hours in Heaven?" I am at a loss as well. Unfortunately, since my card is ignoring my settings and auto-boosting, it's impossible to give the factual number. Researching the issue, the card wants to boost up to a percentage of the TDP, which will change based on the need of the WU. Lowering the power allowed in Afterburner hasn't been effective with that. The only OC I've cared about, researched and bought, and worked out all the nitty-gritty BIOS bits was for my CPU. My 3.4 i5 (Intel turbo to 3.8) I have OC'd to 4.5 stable. When it comes to video cards I look at build quality, cooling solution and customer service. It came down to MSI or Gigabyte. My previous card was a Gigabyte on my older system, but with this new build I chose an MSI mobo, so I went with the matching card that just happened to have its own OC built in. Just for giggles I clocked the i5 down to 3.4 stock with the GPU still set at 915 stock, and cuda60 still died. I know it's not a speed/temp issue with either the GPU or CPU. Using Heaven, the difference between unstable (being able to benchmark at least once) and fully stable was 3 FPS. I don't think my eyes will notice that. In fact the only thing that caught my attention was the errors I've been having with cuda60. Had I not seen that I wouldn't even know there was a problem. Check this out. I'm not the only one having problems with cuda60 errors. Many of my cuda60's also have errored out with other crunchers. With the application error I had yesterday that caused the Windows error reporting service to pop up (and not the normal system tray popup if a driver crashed), I wonder if there may be something wrong with these units or the app. I would guess the app, but I think it's something the devs should at least take a look at. | |
ID: 37783 | Rating: 0 | rate: / Reply Quote | |
If you adjust the "GPU CLOCK OFFSET" in Precision-X, that limits how high it will "auto-boost" while under load. Each 13 MHz decrement is a step. I had to decrease one of my GTX 660 Ti GPUs by 4 steps (-52 MHz) before Heaven was stable at maximum settings for 5+ hours. | |
ID: 37784 | Rating: 0 | rate: / Reply Quote | |
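The step arithmetic in the post above can be sketched in a few lines of Python. This is only an illustration: the 13 MHz step size is the figure reported in this thread from observation, and the function name `offset_for_steps` is made up for the example, not part of Precision-X or Afterburner.

```python
# Step size reported in this thread for Kepler cards (an observation, not a spec).
KEPLER_STEP_MHZ = 13


def offset_for_steps(steps_down: int) -> int:
    """Return the clock offset in MHz for a number of 13 MHz steps downward."""
    if steps_down < 0:
        raise ValueError("steps_down must be non-negative")
    return -steps_down * KEPLER_STEP_MHZ


# Four steps down, as in the GTX 660 Ti example above.
print(offset_for_steps(4))  # -52
```

The result is the value you would type into the clock offset field of whichever tuning tool you use.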
I don't have (evga) Precision-X. Should I dump (msi) Afterburner for it? | |
ID: 37794 | Rating: 0 | rate: / Reply Quote | |
I believe you can use Afterburner, as it functions nearly identically to Precision-X. You are looking to control the "GPU Clock" to ensure maximum stability, and Kepler GPUs ramp up and down in 13 MHz intervals. | |
ID: 37795 | Rating: 0 | rate: / Reply Quote | |
Back to a full night test of Heaven: going by the 13 MHz interval, at -39 it ran for 9 hours before the Unigine engine stopped responding. Interestingly enough, at the next interval, -52, the Unigine engine stopped responding 7 hours in. Now they both pass the 5 hour mark, but I don't know if technically it's supposed to go on like the Energizer bunny. "I believe you can use Afterburner, as it functions nearly identically to Precision-X. You are looking to control the 'GPU Clock' to ensure maximum stability, and Kepler GPUs ramp up and down in 13 MHz intervals." The Core Clock slider changes the base speed, which is then boosted. However the boost always seems to be 78 MHz and always seems to be on if the card is running 3D apps. What was your source of the 13 MHz interval? | |
ID: 37860 | Rating: 0 | rate: / Reply Quote | |
Source is by experience. You can slide the slider around in 1 MHz intervals, and then watch whether your GPU's clock goes up/down or not. The registered clock should only "move" in 13 MHz intervals. I believe it works much like a CPU, where the resulting clock is 13 MHz times some multiplier. | |
ID: 37861 | Rating: 0 | rate: / Reply Quote | |
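One way to check the 13 MHz claim against your own card is to log the clock readings while nudging the slider and confirm that every change is a whole multiple of 13. A minimal sketch follows; the readings in `sample` are hypothetical numbers made up for illustration (not taken from any card in this thread), and `non_step_changes` is an invented helper name.

```python
STEP_MHZ = 13  # step size claimed in the thread


def non_step_changes(readings_mhz, step=STEP_MHZ):
    """Return the clock changes that are NOT a whole multiple of `step`."""
    # Differences between consecutive readings, ignoring repeats of the same clock.
    changes = [b - a for a, b in zip(readings_mhz, readings_mhz[1:]) if b != a]
    return [c for c in changes if c % step != 0]


# Hypothetical log of clock readings in MHz.
sample = [1058, 1058, 1071, 1084, 1071, 1045]
print(non_step_changes(sample))  # []
```

An empty list means every observed change fits the 13 MHz pattern; anything else lists the changes that do not.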
Jacob, would the similar Unigine Valley benchmark test work as well as Heaven for this purpose? | |
ID: 37868 | Rating: 0 | rate: / Reply Quote | |
Valley would work, but it has been my experience that Heaven pushes the GPU harder than Valley. | |
ID: 37869 | Rating: 0 | rate: / Reply Quote | |
Per GPU-Z, it showed a 1 MHz boost change per 1 MHz clock change. So the boost is staying constant at 78 MHz above the base clock. | |
ID: 37871 | Rating: 0 | rate: / Reply Quote | |
You need to be clearer in your findings. | |
ID: 37872 | Rating: 0 | rate: / Reply Quote | |
I've tried to DL your Precision-X, however it's currently unavailable. Some sort of copyright controversy with it. "Source is by experience." Are you SeanPoe? "Are you saying your GPU doesn't increment like that? Can you confirm with exact steps?" Conditionally yes and no. When I was adjusting the GPU speed I wasn't under a full load. Many times when I've adjusted with GPUGrid running it crashed the WU. When I was running Heaven fullscreen there was also no way of adjusting speed without, at a minimum, exiting the window, which would instantly change what was shown in GPU-Z. So going from your last post and the guide in the link, my max Kepler boost before things start getting dropped by temperature is 7 offset steps (91 MHz). Running the GPU-Z render test does confirm the 13 MHz drop when I manually drop it by 1, as shown in your example. | |
ID: 37903 | Rating: 0 | rate: / Reply Quote | |
Precision X v4 can be downloaded here: | |
ID: 37904 | Rating: 0 | rate: / Reply Quote | |
Heaven ran overnight at -65 with no crashes and no watchdog dumps. However even at 80% fan speed it was holding at 71C. With the noise at that speed I pretty much have a hair dryer at arm's length, which I can't have since this is my gaming rig and not a cruncher in the garage. Max temp under Heaven has never reached 80C, even when it was 80F inside the house and the fan speed was set to auto; it will sometimes hit 60%, which is audible but not bothersome. I can live with the card throttling down a step. (And my max setting will be somewhere between -65 and -53.) | |
ID: 37909 | Rating: 0 | rate: / Reply Quote | |
You can set up a fan curve to suit your tastes. I prefer the curve to eventually hit max fan before the thermal limit, so MHz don't get limited. We were doing that here, for testing, just to ensure that the GPU did not downclock due to the thermal limit. | |
ID: 37913 | Rating: 0 | rate: / Reply Quote | |
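The fan-curve idea above (reach 100% fan before the thermal limit) can be sketched as simple linear interpolation between curve points. Everything numeric here is an assumption for illustration: the curve points and the 80C `THERMAL_LIMIT_C` are hypothetical, and real tools such as Afterburner or Precision-X apply their own curves internally.

```python
def fan_percent(temp_c, curve):
    """Linearly interpolate fan duty (%) from sorted (temp_c, percent) points."""
    if temp_c <= curve[0][0]:
        return float(curve[0][1])
    if temp_c >= curve[-1][0]:
        return float(curve[-1][1])
    for (t0, p0), (t1, p1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return p0 + (p1 - p0) * (temp_c - t0) / (t1 - t0)
    raise ValueError("curve points must be sorted by temperature")


THERMAL_LIMIT_C = 80                      # hypothetical limit, check your own card
CURVE = [(40, 30), (60, 50), (75, 100)]   # hypothetical points, 100% fan at 75C

# The last point should be 100% fan at a temperature below the thermal limit.
assert CURVE[-1][1] == 100 and CURVE[-1][0] < THERMAL_LIMIT_C

print(fan_percent(50, CURVE))  # 40.0
```

Moving the last point to a lower temperature trades more noise for more headroom before the card starts limiting clocks.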
It makes sense. Thanks Jacob. | |
ID: 37927 | Rating: 0 | rate: / Reply Quote | |
I would envision that those files wouldn't matter. They are symbol files that are downloaded, I think, whenever a BOINC debug version crashes or is debugged. | |
ID: 37930 | Rating: 0 | rate: / Reply Quote | |
Just wanted to confirm the 13MHz steps in Kepler boost. My GTX 680 and GTX 780Ti cards both step up and down in 13MHz increments. I have a Temp Target of 68C set on my 780Ti cards and they will clock up or down in 13MHz steps until they reach a speed where they can maintain the desired temperature. | |
ID: 37936 | Rating: 0 | rate: / Reply Quote | |
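The behaviour described above, stepping the clock up or down by 13 MHz until the temperature target holds, can be modelled with a toy function. This is not how NVIDIA's boost logic is implemented; it is only a sketch of the observed pattern, and the `min_mhz` and `max_mhz` bounds are hypothetical values chosen for the example.

```python
STEP_MHZ = 13        # step size reported in the thread
TEMP_TARGET_C = 68   # temperature target mentioned in the post above


def next_clock(clock_mhz, temp_c, min_mhz=705, max_mhz=1100):
    """Move one 13 MHz step toward the temperature target, within bounds.

    min_mhz and max_mhz are hypothetical limits chosen for illustration.
    """
    if temp_c > TEMP_TARGET_C:
        return max(min_mhz, clock_mhz - STEP_MHZ)   # too hot: step down
    if temp_c < TEMP_TARGET_C:
        return min(max_mhz, clock_mhz + STEP_MHZ)   # headroom: step up
    return clock_mhz                                # on target: hold


print(next_clock(1000, 70))  # 987
```

Calling it repeatedly with fresh temperature readings walks the clock toward whatever speed keeps the card at the target.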
"I would envision that those files wouldn't matter. They are symbol files that are downloaded, I think, whenever a BOINC debug version crashes or is debugged." True - but I can't help but wonder maybe some file (the cuda60 exe) got borked. I've turned in 4 straight valid WUs since the wipe and clean install. Well at least now I know more about my card than I ever thought I would need. I know where the stable point is. And the problem appears to be fixed. Overnight I'll return the card to its default values and see if the instability churns up any errors. At least that way I can determine if the card contributed to the problem. | |
ID: 37937 | Rating: 0 | rate: / Reply Quote | |
Cuda60 has been churning good, and I've ditched the MSI app for the one from EVGA. Thanks for the help. | |
ID: 38141 | Rating: 0 | rate: / Reply Quote | |