Message boards : Graphics cards (GPUs) : cuda driver error 719
Hi!
ID: 34584
Made URLs clickable.
ID: 34585
As this happened on your Windows machine, you can check whether the GPU clock has down-clocked (by half). I suspect that is what happened, as I have had these fatal CUDA driver errors as well. Tasks still finish with valid results, but take far longer to complete. Rebooting is the only way to get the GPU clock back to its normal operating speed.
ID: 34586
a1kabear, use a tool such as MSI Afterburner to control the GPU fan speeds and temperatures. Try to keep the temperature below 70C.
ID: 34587
Thanks for the suggestions, and sorry about the non-clickable links. I'm gonna try it all on Linux today and see how it does. If I still get the errors/poor performance I will go back to Windows and try to keep temps down, maybe underclock, maybe overvolt a bit.
ID: 34588
It sounds like you might have one of the Kamikaze Kards that wants to die an early death by letting the temp hit 80C before it ramps up the fan speed. I have one of those. I don't know if it's the fault of the BIOS on the video card, the driver, or some combination of the two, but other than flashing the BIOS, the only way to manage temps on those nasty cards is to configure COOLBITS, which allows manual fan control. I see you have a few Linux machines already, so you're probably familiar with COOLBITS, but I thought I would check and make sure.
ID: 34589
Yes, I tried Coolbits, but it seems to only activate on the first card. I read that I need to either connect another display to the second card, or set up a virtual display (complicated, maybe), to get fan controls for the second card. However, the fan was already running quite high (I am in Thailand; ambient temperature here is high), and pushing it up further only got me an extra 3C off, and it's noticeably louder.
ID: 34591
Yes I tried coolbits but it seems to only activate on the first card. I read I need to either connect another display to the second card or a virtual display (complicated maybe) to get the fan controls for the second card.

Hi: as was said, controlling the fan on more than one GPU in Linux using Coolbits requires that each GPU has an associated real or virtual screen. It's not complicated, just annoying. There is a site that explains how to connect the second (or further) virtual monitors, but I don't have the exact link. If you are interested, I can post the setup I used to activate a second monitor and control the fans of my two GPUs in Ubuntu.
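A minimal sketch of the xorg.conf side of this, assuming a generic two-GPU setup (the Identifier names and BusID below are placeholders, not an actual working config; find your card's real BusID with lspci):

```
# Hypothetical xorg.conf fragment for the second GPU.
# "Card1", "Screen1" and the BusID are placeholders; check yours with:
#   lspci | grep -i nvidia
Section "Device"
    Identifier "Card1"
    Driver     "nvidia"
    BusID      "PCI:2:0:0"
    Option     "Coolbits" "5"
    # Fake an attached monitor so the X screen (and fan control) initializes:
    Option     "ConnectedMonitor" "DFP-0"
EndSection

Section "Screen"
    Identifier "Screen1"
    Device     "Card1"
EndSection
```

The new Screen also has to be referenced from the ServerLayout section for the X server to bring it up.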
ID: 34592
As for controlling temps after you get COOLBITS configured, I am interested in Carlesa's solution. I know it involves gkrellm and lm-sensors (?) but I've never been able to figure it out, so I wrote my own GPU temp control app in Python, still in development; get it at https://github.com/Dagorath/gpu_d.

Yes I tried coolbits but it seems to only activate on the first card. I read I need to either connect another display to the second card or a virtual display (complicated maybe) to get the fan controls for the second card. However the fan was already running quite high (I am in Thailand, ambient temperature here is high) and pushing it up more only got me an extra 3c off and its noticeably louder.

Downclocking is one way to handle high ambient, but IMHO a better way is to duct the heat from the computer to the outdoors. In other words, get it out of the house/office ASAP, which means not letting it mix with the air in the room: collect it at the back of the computer and push/suck it outside immediately. It's not hard to do.

There are two solutions for getting the card to initialize without a monitor plugged in. I prefer the VGA dummy plug (blog.zorinaq.com?e=11), the hardware solution, over the software solution, because I always manage to somehow undo the software solution and then have to search for the link that explains how to fix what I broke. The hardware solution is permanent and safe if you do it properly. By properly I mean making sure the resistors can't slip out and can't accidentally touch ground. To do that, I cut the resistor leads short enough that they don't stick out too far, then bent them over and taped them down. When you bend them over, make sure the lead(s) inserted into the C1/C2/C3 holes don't touch the leads inserted into the ground hole or the metal surrounding the plug.
In the blog article one of the posters claims shorting C1/C2/C3 to ground works and doesn't cause any damage, but I would avoid shorts, as that is not the specification the card is designed to operate at. 75 ohms is the ideal resistance, but you may not find a 75 ohm resistor on hand; 68 ohm and 100 ohm resistors are common and close enough to the 75 ohm ideal.

The software solution is to use the nvidia-xconfig tool shipped with the driver:

sudo nvidia-xconfig --help
sudo nvidia-xconfig --advanced-help | less
sudo nvidia-xconfig --enable-all-gpus
sudo nvidia-xconfig --cool-bits=5

The last two commands do what you want; run them in the order shown. Some say use cool-bits=5 while others say cool-bits=4 (Coolbits is a bit mask: 4 enables fan control, 1 the old clock controls, so 5 = 4 + 1). Both work, but 5 supposedly unlocks the clocks as well. I say supposedly because it used to unlock the clocks on my GTX 570 with older drivers, but it doesn't work on 6xx cards with newer drivers. Or maybe there is now a second lock that must be bypassed, I dunno; I use 5 in case I come across a way around that second lock, though I think it's going to take a driver hack to do it. Still working on that.

I am also considering changing the BCLK on the motherboard to lower the speed of pci-express/cpu/memory which should also reduce the work on the cards helping them run cooler. This may be a lot easier for me than installing windows and flashing. However with a non-standard BCLK maybe it wont be stable, no idea.

"Whatever works," they say, but IMHO that's the unnecessarily complicated approach, and it defeats the purpose to some extent. Deal with the culprit directly. The culprit is ambient temperature. Deal with that first, and if that isn't enough, augment it with other strategies.
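A quick check on the resistor arithmetic, in case you only have other standard values on hand: two equal resistors in parallel halve the resistance, so a pair of 150 ohm resistors gives exactly the 75 ohm ideal.

```shell
# Parallel resistance: R = 1 / (1/R1 + 1/R2).
# Two 150-ohm resistors in parallel give exactly 75 ohms.
parallel() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%g\n", 1/(1/a + 1/b) }'; }

parallel 150 150   # prints 75
```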
I mean, if your car's brakes don't work and you don't like getting injured in smash-ups, you could install an elaborate system of rubber bumpers mounted on shock absorbers, plus more airbags, but the best solution is to just fix the damn brakes :-)

____________
BOINC <<--- credit whores, pedants, alien hunters
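For reference, the core of a temp control app like the one mentioned above is just a temperature-to-fan-speed mapping. A minimal shell sketch, with made-up thresholds (this is not gpu_d's actual curve, and the nvidia-settings attribute names in the comments vary by driver version):

```shell
#!/bin/sh
# Minimal fan-curve sketch: map a GPU temperature (C) to a fan duty cycle (%).
# Thresholds are illustrative guesses, not taken from gpu_d.
fan_for_temp() {
    t="$1"
    if   [ "$t" -ge 80 ]; then echo 100
    elif [ "$t" -ge 70 ]; then echo 85
    elif [ "$t" -ge 60 ]; then echo 65
    else echo 40
    fi
}

# With Coolbits enabled, a control loop would apply it roughly like this
# (attribute names depend on the driver version):
#   temp=$(nvidia-settings -q "[gpu:0]/GPUCoreTemp" -t)
#   nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
#                   -a "[fan:0]/GPUCurrentFanSpeed=$(fan_for_temp "$temp")"

fan_for_temp 75    # prints 85
```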
ID: 34593
Hello: gkrellm + lm-sensors only lets you monitor, not control: it shows the temperatures of the CPU and the GPUs (each independently) as well as the fan speeds, can set reminder alarms, and reports other interesting information.
ID: 34594
(I'm using a 780 Lightning.) The cooler of this card blows the hot air into the case, so if you have two such cards, they are going to heat each other (and the MB, CPU, etc.). You have to blow cool air into the case with case fans, or take the side panel off.

I am also considering changing the BCLK on the motherboard to lower the speed of pci-express/cpu/memory which should also reduce the work on the cards helping them run cooler. This may be a lot easier for me than installing windows and flashing. However with a non-standard BCLK maybe it wont be stable, no idea.

Do not lower (or raise) the PCIe frequency; it has a strict nominal working frequency (100 MHz), and changing it will make your GPUs unreliable. It's much better to lower the frequency of your GPUs themselves (by flashing their BIOS, or by using a 3rd-party utility).
ID: 34596