nVidia driver 340.52
| Author | Message |
|---|---|
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
Ever since I installed these newest drivers for my MSI GTX 670 I've been spitting out nothing but cuda60 errors and even a few cuda42 errors. I did a complete wipe of the drivers and went back to the previous 337.88 drivers, and the cuda60 tasks are validating again. Have there been any problems reported with the 670 and the 340.52 drivers? me@rescam.org |
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
I have not been having those problems with the newer drivers. However, the newer drivers do push the GPUs harder, and it's possible that your card can't handle them at its current clocks. Please read on.

First, use a tool like Precision-X to set up a fan curve where the GPU reaches max fan before 70°C. So, at 69°C, the fan should already be at maximum.

Then try this: install the newer drivers, and see if you can run the Heaven 4.0 benchmark at 1920x1080, Ultra Quality, Extreme Tessellation, 8x Antialiasing for 5 hours solid, with no crashes and no watchdog dumps reported in C:\Windows\LiveKernelReports\WATCHDOG.

If you do get a crash, use a tool like Precision-X to back off the GPU core clock by 13 MHz. Keep backing off in 13 MHz steps until you can run Heaven at those settings for 5+ hours with no crashes. Take notes on how much you needed to back it off, so you can remember in the future. :)

For a GPU that is already completely stable at its default clocks, you can use the same process in the other direction: raise the GPU core clock to find the maximum clock before Heaven yields errors.

I did this procedure for both of the GTX 660 Ti GPUs in my system. I discovered that my eVGA GTX 660 Ti 3GB FTW was factory-overclocked too much: I had to downclock it 52 MHz for it to be completely stable in Heaven, and now it is completely stable in GPUGrid and also in iRacing! But my MSI GTX 660 Ti 3GB OC was factory-overclocked too little: I discovered that I could overclock it 39 MHz with no problems, so now it crunches GPUGrid a little faster.

It's just a hunch that the drivers are pushing the cards too hard, but seriously, use Heaven to see if you can get the clocks to a "completely stable 24/7" setting. Then, once you are sure it is completely stable, test against GPUGrid tasks. I'm hopeful that the procedure solves your problem.

Regards, Jacob |
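One way to check for the watchdog dumps mentioned above after a long Heaven run is a small script. This is only a sketch: the folder path comes from the post, while the function name `dumps_since` and the five-hour look-back window are illustrative assumptions, and reading that folder may require administrator rights.

```python
from datetime import datetime, timedelta
from pathlib import Path

# Folder named in the post above; only meaningful on Windows.
WATCHDOG_DIR = Path(r"C:\Windows\LiveKernelReports\WATCHDOG")


def dumps_since(start: datetime, folder: Path = WATCHDOG_DIR) -> list[Path]:
    """Return files in `folder` modified at or after `start`, oldest first.

    `dumps_since` is a hypothetical helper, not part of any tool in the thread.
    """
    if not folder.is_dir():
        return []  # folder absent (or not Windows): nothing to report
    recent = [
        p for p in folder.iterdir()
        if p.is_file() and datetime.fromtimestamp(p.stat().st_mtime) >= start
    ]
    return sorted(recent, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Assumed usage: run this right after a 5-hour stress test finishes.
    test_started = datetime.now() - timedelta(hours=5)
    for dump in dumps_since(test_started):
        print(dump.name)
```

An empty result after the run is consistent with "no watchdog dumps reported"; any file listed means the driver recovered from a hang at least once during the test window.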
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
I'm puking up the cuda60 errors again. One thing that stands out is that they crash immediately, with CPU time = 0. I've been using MSI Afterburner with auto fan control, and the highest the temp has gotten is 79. However, with these units crashing right away, the GPU isn't given the chance to get hot. Nor is cuda42 having this insta-crash problem. To get the temps below 70 with these fans I'm gonna need earplugs.

Last three with this error:

Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741819 (0xc0000005)
</message>
]]>

The one before that:

Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The extended attributes are inconsistent. (0xff) - exit code 255 (0xff)
</message>
]]>

(And the one before that was somehow killed when I rebooted into safe mode.) |
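For readers wondering how the two numbers in the first stderr message relate: the negative exit code is the same 32-bit value as the hex code, read as a signed integer. The sketch below shows the conversion. The helper names and the one-entry lookup table are illustrative assumptions; 0xC0000005 itself is the standard Windows access-violation status.

```python
def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a (possibly negative) 32-bit exit code as unsigned."""
    return exit_code & 0xFFFFFFFF


# Deliberately tiny table: only the one status value seen in the thread.
KNOWN_CODES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe(exit_code: int) -> str:
    """Format an exit code as hex plus a name, if the name is in the table."""
    code = to_unsigned32(exit_code)
    name = KNOWN_CODES.get(code, "unrecognised")
    return f"0x{code:08X} ({name})"


print(describe(-1073741819))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
print(describe(255))          # 0x000000FF (unrecognised)
```

An access violation means the process touched memory it was not allowed to. On its own that does not say whether the application, the driver, or the hardware is at fault.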
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
Have you tried the Heaven 4.0 suggestions I gave? Have you tried running with CPU projects suspended? |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
"Have you tried the Heaven 4.0 suggestions I gave?"

1) Yes, but so far I haven't been able to get through a 5-hour segment with the gaming I do at night, and I work during the day. It's only been a single day since you made that suggestion.

2) If you mean the CPU portion of this project, I don't run those. If you mean other projects like WCG, no. If this project can't work alongside other CPU projects I'll dump this one, or at least I will after it's moved past cuda42.

But let me ask you a question about Heaven. Will that tell me why they are crashing immediately? |
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
It will tell you immediately if your GPU is unstable. |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
"It will tell you immediately if your GPU is unstable."

Too bad it only allows single diagnostic runs. With an offset of only -26 MHz or less, it crashes immediately. I was doing diagnostics of 10 runs in a row, going up to -35, and thought I had found a sweet spot at -31 based on FPS and score. Then, at that speed, I was only able to run it for a few hours before the "computer hog" family complaints started, so I had to abort. It looks like cuda60 was still failing with 0 CPU time during the night, so I'll be at it again.

I do, however, have games that will now run with Lucid Virtu MVP that were crashing before. So at least that problem has been solved. |
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
If you don't click "benchmark", you can run it overnight. |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
I thought of that. Unfortunately, when the app stops responding it leaves me with a black screen, no crunching, and a machine just eating electricity. So I'll take the long weekend for this. Meanwhile I've dropped the card by -60, to stock reference specs. When I manage to pick up some cuda60 tasks, if those still error out then it shouldn't be the card. |
|
Joined: 28 Jul 12, Posts: 819, Credit: 1,591,285,971, RAC: 0
|
With some BOINC GPU projects (e.g., Einstein and maybe POEM), I have had problems with anti-viruses. I normally don't use any of them, since I have dedicated machines. But on my main PC, I have tried various AVs, and they all cause problems of different sorts eventually. I don't run GPUGrid on my main PC, so can't offer specific advice, but the AV exclusions usually don't do any good since they are for scanning, but don't necessarily preclude real-time protection. I think the exclusions in avast! also precluded real-time protection, but it caused some other problem that I don't recall at the moment (all on Win7 64-bit). |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
I'm currently using the free Avast. I like that and AVG; Avira had too many false positives. Well, the cuda60s are currently finishing and validating. The only change from when this problem first started is that I've underclocked the card to the reference specs. Last night I ran the FurMark Xtreme burn-in; with the fans at 100%, the temp topped out at 70. I think next time I buy a card I'll ignore MSI's overclock marketing crap. I'll see how things go through the long weekend. |
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
I'm glad it's starting to work for you! Even at low temps, a GPU will misbehave when its core clocks are too high. And it is possible that the new drivers push the shader clusters harder than before, making them more susceptible to problems than with the previous drivers. At least, that's what I've been told. That's why I continue to recommend getting it right with Heaven 4.0, and then also using custom fan curves to keep the card below its clock-limiting thermal threshold (70°C for Boost 1.0 GPUs like your GTX 670, 80°C for Boost 2.0 GPUs). Sorry to sound like a broken record. Good luck! |
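A custom fan curve of the kind recommended in this thread is essentially a piecewise-linear map from temperature to fan speed. The sketch below is illustrative only. The function `fan_percent` and the lower curve points are assumptions; only the "100% fan by 69°C" endpoint comes from the advice above. Real tools such as Precision-X or Afterburner configure this through their own interface.

```python
def fan_percent(temp_c: float, curve: list[tuple[float, float]]) -> float:
    """Linearly interpolate fan speed (%) from sorted (temp in C, fan %) points."""
    if temp_c <= curve[0][0]:
        return curve[0][1]  # below the first point: hold the minimum speed
    for (t0, f0), (t1, f1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)
    return curve[-1][1]  # above the last point: hold the maximum speed


# Hypothetical curve for a Boost 1.0 card: the fan hits 100% at 69 C,
# just under the 70 C threshold mentioned in the thread.
BOOST1_CURVE = [(30.0, 30.0), (50.0, 50.0), (69.0, 100.0)]

for temp in (25, 40, 60, 69, 75):
    print(temp, round(fan_percent(temp, BOOST1_CURVE), 1))
```

For a Boost 2.0 card, the same shape would apply with the last point moved to just under 80°C.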
|
Joined: 28 Jul 12, Posts: 819, Credit: 1,591,285,971, RAC: 0
|
The 340.52 drivers are working very well on my GTX 750 Ti's (Asus, not overclocked) under WinXP. I am now getting 520,000 RAC, whereas with the 337.88 drivers it was under 500,000 RAC for the two cards. And I no longer get "unstable machine" restarts. This may all be due to the work units themselves, but at least the new drivers are not hurting anything. The temps are also good, 55 to 58C with a 120 mm side fan in a 20C room running NOELIA_5MG. |
|
Joined: 28 Jul 12, Posts: 819, Credit: 1,591,285,971, RAC: 0
|
"The temps are also good, 55 to 58C with a 120 mm side fan in a 20C room running NOELIA_5MG."

I must have read the wrong machine. These cards are now running at 63 to 67C, which is still low enough on the NOELIA_5MG. |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
Well, I had a few good units and then everything went south with more unknown errors. Digging deeper, I found that MSI lists a different reference base clock than what nVidia shows for its reference, which is 915. So I've dropped my card by 105 MHz to match that reference. GPU-Z now shows:

Under the Graphics Card tab: GPU Clock 915 MHz, Boost 993 MHz
Under the Sensors tab: GPU Core Clock 1058.2 MHz (while crunching; down to 324.0 when snoozed)

AIDA64 is also showing GPU Clock 1058. Somehow this thing is auto-boosting itself and ignoring my manual settings.

-----

Just as a base point, this is what I get when I let the card run at its default settings for comparison: GPU Clock 1020, Boost 1098, Sensors 1175.8 (AIDA64 1175). |
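The two sets of readings above can be compared with a little arithmetic. The sketch below uses only the numbers quoted in the post; the `ClockReading` class and its field names are illustrative assumptions, and the interpretation in the closing note is a tentative reading, not a confirmed diagnosis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockReading:
    """One set of GPU-Z figures, all in MHz (field names are illustrative)."""
    base_mhz: float      # "GPU Clock" on the Graphics Card tab
    boost_mhz: float     # "Boost" on the Graphics Card tab
    observed_mhz: float  # "GPU Core Clock" on the Sensors tab under load

    @property
    def over_rated_boost(self) -> float:
        """How far the observed clock sits above the listed boost figure."""
        return round(self.observed_mhz - self.boost_mhz, 1)


default = ClockReading(1020, 1098, 1175.8)
with_offset = ClockReading(915, 993, 1058.2)

print(default.over_rated_boost)      # 77.8
print(with_offset.over_rated_boost)  # 65.2
print(round(default.observed_mhz - with_offset.observed_mhz, 1))  # 117.6
```

On these numbers the observed clock did fall, by about 117.6 MHz, after the 105 MHz reduction. So the offset appears to be applied, while the card still boosts above the listed "Boost" figure in both cases.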
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
Instead of guessing at the clock speeds, you could do the Heaven steps, to ensure stability... I tried really hard to make them simple. |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
"Instead of guessing at the clock speeds, you could do the Heaven steps, to ensure stability... I tried really hard to make them simple."

Hi Jacob. Please don't assume I failed to follow your steps. I did follow them. It was running stable at -60. I was still getting errors, which is why I dropped it down to where it is now. I can run that program all day, which makes me think it's not the GPU speed. So yes, you succeeded in making the steps simple. Unfortunately, your solution has failed.

I watched as this result crapped out in 6 seconds of elapsed time. This was the first unit that caused a pop-up error from Windows Error Reporting: "Acemd.841-60.exe has stopped working" |
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
Sorry. So, were you able to determine a "Max Boost Clock" speed that was completely stable for 5+ hours in Heaven? If it's stable in Heaven, but crashing in GPUGrid, then, I'm not sure what the issue could be. |
Misfit (Joined: 23 May 08, Posts: 33, Credit: 610,551,356, RAC: 0)
|
"Sorry. So, were you able to determine a "Max Boost Clock" speed that was completely stable for 5+ hours in Heaven?"

I am at a loss as well. Unfortunately, since my card is ignoring my settings and auto-boosting, it's impossible to give a factual number. From researching the issue, the card wants to boost up to a percentage of the TDP, which changes with the needs of the WU. Lowering the power allowed in Afterburner hasn't been effective against that.

The only OC I've cared about, researched and bought for, and worked out all the nitty-gritty BIOS bits for was my CPU. My 3.4 i5 (Intel turbo to 3.8) I have OC'd to a stable 4.5. When it comes to video cards I look at build quality, cooling solution and customer service. It came down to MSI or Gigabyte. My previous card, on my older system, was a Gigabyte, but for this new build I chose an MSI mobo, so I went with the matching card, which just happened to have its own OC built in.

Just for giggles I clocked the i5 down to 3.4 stock with the GPU still set at 915 stock, and cuda60 still died. I know it's not a speed/temp issue with either the GPU or the CPU. Using Heaven, the difference between unstable (being able to benchmark at least once) and fully stable was 3 FPS. I don't think my eyes will notice that. In fact, the only thing that caught my attention was the errors I've been having with cuda60. Had I not seen those, I wouldn't even know there was a problem.

Check this out. I'm not the only one having problems with cuda60 errors. Many of my cuda60s have also errored out for other crunchers. Given the application error I had yesterday, which caused the Windows Error Reporting service to pop up (and not the normal system tray pop-up you get if a driver crashes), I wonder if there may be something wrong with these units or the app. I would guess the app, but I think it's something the devs should at least take a look at. |
|
Joined: 11 Oct 08, Posts: 1127, Credit: 1,901,927,545, RAC: 0
|
If you adjust the "GPU CLOCK OFFSET" in Precision-X, that limits how high it will "auto-boost" while under load. Each 13 MHz decrement is a step. I had to decrease one of my GTX 660 Ti GPUs by 4 steps (-52 MHz) before Heaven was stable at maximum settings for 5+ hours.
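The step-down procedure described in this thread amounts to a simple linear search over clock offsets. The sketch below is a minimal illustration under stated assumptions: `find_stable_offset` and the `is_stable` callback are hypothetical names, and the callback stands in for the manual test (running Heaven for 5+ hours with no crashes or watchdog dumps), which this code does not automate.

```python
from typing import Callable, Optional

STEP_MHZ = 13  # step size quoted in the thread


def find_stable_offset(
    is_stable: Callable[[int], bool],
    max_steps: int = 10,
) -> Optional[int]:
    """Lower the clock offset one step at a time until `is_stable` passes.

    Returns the first offset (0, -13, -26, ...) that passes, or None if
    nothing within `max_steps` steps is stable.
    """
    for steps in range(max_steps + 1):
        offset = -steps * STEP_MHZ
        if is_stable(offset):
            return offset
    return None


# Simulated card that only becomes stable at -52 MHz or lower,
# mirroring the GTX 660 Ti example in the post above.
print(find_stable_offset(lambda offset: offset <= -52))  # -52
```

Because each real test takes hours, a linear search from the default clock downward keeps the card as close to its factory speed as possible while needing no more test runs than steps backed off.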
©2025 Universitat Pompeu Fabra