monitor suspend/resume bug in 295/296 drivers
Author | Message |
---|---|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0 |
The 266.58 are the last drivers that seem to be problem-free, no downclocking bug and obviously no sleep mode bug. 285.62 doesn't have problems. Radio Caroline, the world's most famous offshore pirate radio station. Great music since April 1964. Support Radio Caroline Team - Radio Caroline |
Joined: 3 Oct 10 Posts: 2 Credit: 34,005,977 RAC: 0 |
OK, so I went back step by step to 285.72 drivers. All CUDA tasks have performed without errors. I have not been able to test with GPUGRID, as I am waiting for a new WU. Matman |
Joined: 22 May 10 Posts: 20 Credit: 85,355,427 RAC: 0 |
Win 7 64-bit (SP1) Dual GTX 580s Driver: 295.73 Power Control Panel -> Turn off the display: Never Result: no errors |
Joined: 10 Jun 11 Posts: 6 Credit: 70,330,451 RAC: 0 |
There are many, many people out there with the 295.73 driver, and they must be causing thousands of errors on the GPUGRID projects. People in general will be reluctant to roll back drivers to earlier versions, because GPUGRID is not the primary reason to own or use a computer. GPUGRID is wasting valuable data at this moment, because of many thousands of errors. Therefore, shouldn't the GPUGRID team do something themselves, instead of asking every member to change drivers? |
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Do what? GPUGrid has to rely on the system to deal with these issues. Users that continuously fail tasks will stop getting tasks. If you ban users with specific drivers, unless the drivers universally fail, you will be banning users that complete tasks successfully too. FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
Joined: 10 Jun 11 Posts: 6 Credit: 70,330,451 RAC: 0 |
Do what? Has anyone investigated why the tasks fail with some drivers? On a global scale, Nvidia will not change their drivers to pander to a relatively small group. Therefore, shouldn't GPUGRID be looking at rewriting the code required to undertake the tasks under the newer drivers? |
Joined: 10 Jun 11 Posts: 6 Credit: 70,330,451 RAC: 0 |
Do what? UPDATE: I have just done a casual check on some of the top performers on GPUGRID, and note that the majority of them are experiencing multiple failures of tasks. Even some of those with 285.xx drivers. Maybe there is something else wrong here? I am also active with Seti@home, and, with one unrelated exception, have no failures on those tasks... |
Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 39 |
Do what? That is a question EVERY project is asking themselves. This is what I did with PrimeGrid's GeneferCUDA application. Ken had previously done something similar with PrimeGrid's other CUDA applications. If a CUDA API call returns CUDA_ERROR_NO_DEVICE, GeneferCUDA prints a warning to stderr saying that under Windows, using RDP or using the 295/296 Nvidia driver causes the GPU to not work. Stderr is visible in the BOINC task webpage, so there's a chance the user might read it. After printing the message, Genefer goes to sleep for 10 minutes. It's still active, and doesn't return to BOINC, but it's not doing anything. This is intentionally tying up the GPU, since no other BOINC task is going to be able to run on the GPU. After 10 minutes, it tries again. This continues until either the program can run successfully, or one hour elapses. After an hour, Genefer gives up, declares a computation error, and exits. This approach has two benefits. First, this error is transient and may go away while Genefer is still waiting, if either the RDP session is closed, or the monitor comes out of sleep mode. Second, in the more likely event that the problem doesn't go away, we're only failing one WU per hour instead of several per minute. This certainly doesn't solve the problem, but it does mitigate its effect on the project somewhat. |
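The wait-and-retry behaviour described in the post above can be sketched roughly as follows. This is an illustrative Python sketch, not GeneferCUDA's actual source: `run_with_retry`, `probe_gpu` and the injectable `sleep` parameter are hypothetical names, the warning text is paraphrased, and only the 10-minute wait and the one-hour limit come from the post.

```python
import sys
import time

RETRY_WAIT_S = 10 * 60   # wait 10 minutes between attempts (from the post)
GIVE_UP_S = 60 * 60      # give up after one hour of waiting (from the post)


def run_with_retry(probe_gpu, sleep=time.sleep,
                   wait_s=RETRY_WAIT_S, limit_s=GIVE_UP_S):
    """Return True once probe_gpu() succeeds, False after limit_s of waiting.

    probe_gpu is a hypothetical callable: it returns True when the GPU is
    usable and False when the CUDA call reported "no device".
    """
    waited = 0
    while True:
        if probe_gpu():
            return True
        if waited >= limit_s:
            # The time limit has passed: the caller should report a
            # computation error and exit.
            return False
        print("Warning: no CUDA device found (RDP session or 295/296 "
              "driver with a sleeping monitor?); retrying later.",
              file=sys.stderr)
        # Sleeping here keeps the task alive, so BOINC does not start
        # (and fail) another GPU task in the meantime.
        sleep(wait_s)
        waited += wait_s
```

Passing `sleep` as a parameter is only a convenience for trying the logic without actually waiting; a real application would simply sleep.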
Joined: 11 Jul 09 Posts: 1639 Credit: 10,053,468,649 RAC: 1,308,024 |
"On a global scale, Nvidia will not change their drivers to pander to a relatively small group." On the contrary, Nvidia told Einstein: "This bug is considered as release critical (show-stopper) for the next NVIDIA driver release that's due in 2-4 weeks. Thus a fix will be available by that time." We are only a 'relatively small group' if we hide away in our separate corners and try to sort out problems like this 'one project at a time'. There are times when collective action is necessary, and if 'BOINC Central' isn't proactively co-ordinating it, then projects which have adopted the BOINC platform should go and bang on their doors until they do. |
Joined: 13 Mar 12 Posts: 21 Credit: 8,773,573 RAC: 0 |
The latest version of the BOINC client software + manager, that being version 7.0.25 , has taken it upon themselves not to recognize the GPU anymore, if the user has one of those 'incompatible drivers'. I think that this represents a mistake, because now that I've programmed my own Windows 7 Pro, x64 computer never to switch off the monitor, I am handing in work units successfully again, and I do think it's unrealistic thinking from BOINC, that users will downgrade their drivers, for the sake of BOINC. I'm using driver version 296.10 successfully now. Actually, one reason I had for upgrading from my outdated drivers, was my concern that 'the old method' of implementing "PhysX", would have used a discrete "PPU" (Physics Processing Unit) on my graphics card, and I wanted to /make sure/ that since this approach has been abandoned by nVidia, in favor of using the "GPGPU" itself, my own graphics card should also be using the GPGPU. Especially since the instructions for downgrading, now tell us to remove ALL nVidia software from our computers, this has become a totally infeasible thing for me to do, with PhysX and "CUDA" SDKs all installed and working. I have to add something to the advice, for how to prevent the monitor from sleeping though. Well enough, one would set the general Power Settings, accessible through Screensaver preferences. But then it can happen that some other process tries to give the command anyway, to put the monitor to sleep, especially since /some of us/ have sundry programs installed. 
The stronger setting I would recommend would be (in addition to the standard setting): Start Menu -> type "Edit Group Policy" into the search field and hit Enter -> Computer Configuration -> Administrative Templates -> System -> Power Management -> Video And Display Settings -> Turn Off Display (Plugged In AND On Battery) --> Disabled. What this does, on Windows 7 Pro at least, is take away the privileges that processes we might not have kept track of would otherwise have to put the monitor to sleep. I think that by simply banning all up-to-date device drivers, the new version of BOINC client software will kill off one major source of contributed work for you. The main reason my own GC did crash at one point, was simply the fact that I had not researched the subject (in the forums), and it's not likely to happen to me again. Dirk |
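For the standard setting mentioned in the post above (the power-plan display timeout, not the Group Policy entry), Windows' `powercfg` tool offers a command-line route. Below is a minimal Python sketch, assuming it is run on Windows from an account allowed to change the active power plan. The helper names `display_never_off_commands` and `apply_display_never_off` are made up for illustration, and the dash-style option syntax is the form documented for Windows 7.

```python
import subprocess


def display_never_off_commands():
    """Build powercfg invocations that set the display timeout to 0 (never)."""
    return [
        ["powercfg", "-change", "-monitor-timeout-ac", "0"],  # plugged in
        ["powercfg", "-change", "-monitor-timeout-dc", "0"],  # on battery
    ]


def apply_display_never_off(run=subprocess.run):
    """Run each command; check=True raises if powercfg reports a failure."""
    for cmd in display_never_off_commands():
        run(cmd, check=True)
```

This only changes the power plan. It does not stop another program from putting the monitor to sleep, which is the case the Group Policy setting in the post is meant to cover.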
Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0 |
Sorry to double post, but it seems important: (not my words) You'll be happy to know that 301.24 fixes the sleeping monitor Bug, although I haven't tried it on PrimeGrid yet, only Einstein & Seti so far, Claggy (Some user on PG) First post: I did 7 Setiathome offline Benches last night and couldn't get it to fail (But I'm using a different monitor to when I could get it to fail with 295.xx drivers), Before I upgraded I grabbed some BRP4Cuda work and have done some of it this morning no problem, In a little while I'll downgrade to 295.73 and check I can get offline benches to fail on this monitor. Second: I downgraded my i7-2600K/GTX460/HD5770 host to 295.73, ran a setiathome offline bench, proved that the cuda apps do fail with this monitor, then upgraded back up to 301.24, EDIT: Decided to check for myself using 301.10, and after letting monitor sleep for awhile, I resumed and saw that GPU usage remained steady. Still have to wait for validation from wingman |
Joined: 13 Mar 12 Posts: 21 Credit: 8,773,573 RAC: 0 |
Doesn't it even seem to matter to you, that BOINC's new client and manager package, is overriding the individual projects' policies on the subject? Dirk |
Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0 |
It does, but since many people rely on auto update, or always get latest driver. This may become a moot point after awhile. From what I can tell, only people who know what they're doing didn't use those drivers anyway, and since BOINC blacklisted them it won't matter when NVIDIA releases WHQL. I mean, it prevents failed WU, and even Einstein quit allowing people to use those drivers (they blacklisted WU from going to hosts w/ those drivers anyway). Even if you fixed it yourself it wouldn't work. It should actually help projects in the long run, even if it's rude to the users. EDIT: by allowing BOINC to blacklist, it would have allowed me to use my 680 on Einstein, since they would know that even though mine's higher than their 290 limit (301), they would have been able to send me WU. Arrogant, kinda, yeah, but MANY WU are failing everywhere b/c of it. Especially here, I do believe |
Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 39 |
The latest version of the BOINC client software + manager, that being version 7.0.25 , has taken it upon themselves not to recognize the GPU anymore, if the user has one of those 'incompatible drivers'. I probably didn't dig deep enough, but where in the release notes does it say this? I couldn't find mention of this. Assuming it's true, I've got very mixed feelings about it. On the one hand, from the project side of things, this driver bug is a huge pain in the posterior. Thousands and thousands of errors, and I've got WU's over at PrimeGrid that are hitting the "too many tasks" limits because of this. From a user's perspective, it's not so nice -- but the user has the option of upgrading the driver to 301, downgrading the driver to 285, or reverting BOINC to 6.12.34 after clearing their work queue. All in all, the benefits probably outweigh the disadvantages. Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG. ![]() |
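As a rough illustration of the upgrade-or-downgrade choice above, a script could flag the driver series this thread names as problematic. A minimal Python sketch: `driver_is_affected` is a hypothetical helper, and treating exactly the 295.xx and 296.xx series as affected is an assumption based on the thread title. The thread does not establish the status of every other version.

```python
def driver_is_affected(version: str) -> bool:
    """True for 295.xx / 296.xx drivers, the series named in this thread.

    Assumption: only the major number matters, and other series
    (e.g. 285.xx or 301.xx) are treated as unaffected.
    """
    major = int(version.split(".")[0])
    return major in (295, 296)
```

For example, `driver_is_affected("295.73")` and `driver_is_affected("296.10")` are true, while `"285.62"` and `"301.24"` are not flagged.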
Joined: 13 Mar 12 Posts: 21 Credit: 8,773,573 RAC: 0 |
@Michael Goetz: It's possible that I misread the issues with the newer BOINC Manager and Client. What they wrote, is that when we install BOINC as a Service, OR in Protected Execution, GPU detection won't work anymore. http://boinc.berkeley.edu/wiki/Release_Notes#Protected_Application_Execution_.28Service.29_Installation.2C_GPU_detection_and_Windows_XP I was under the impression that 'as a service' is the opposite of 'in protected execution'. If in fact they are one and the same thing, then I got it wrong. In that case, BOINC installed 'in User Mode' will still recognize the GPUs (without problem)... If that's so, you might want to make the text just a tad more clear about it. How does it address the malfunctions? It does, but since many people rely on auto update, or always get latest driver. This may become a moot point after awhile. From what I can tell, only people who know what they're doing didn't use those drivers anyway, and since BOINC blacklisted them it won't matter when NVIDIA releases WHQL. In my opinion there are two errors here. 1) Updating your graphics driver and system software, is not a trivial task. When I asked Windows Update to do it first, Windows Update updated and left me with an improper install. I could no longer open my nVidia Control Panel from that. So I had to do a manual upgrade afterward, my icons were all displaced and so on... I don't think that users who simply have their computers on auto-pilot experience that. 2) The other people who chose the 296.10 driver, have other things to do with their computers, than BOINC Work Units. We only run BOINC on the side. I'm into game development, PhysX etc.. I'd say that ~BOINC is my screensaver~, but in fact mine is the 3D Text Screensaver, with BOINC running in the background. You can't convince me to reinstall, and then re-reinstall my graphics drivers. Dirk |
Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 39 |
@Michael Goetz: I'm pretty sure that's a feature that's been in BOINC for many years now, and certainly isn't driver specific. It definitely was in the 6.12.34 client, and possibly in all of the 6.x.x clients. Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG. ![]() |
Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0 |
I'm not trying to convince anyone of anything. My point is that if you are using the WHQL driver for game development and the like (I play games), then you would upgrade to the newest WHQL in order to have their (NVIDIA's) latest software. In this case you would be upgrading to the 300 series whenever the WHQL is released, if I'm not mistaken. This was my point: if you're using the latest currently, then why not upgrade when the newest is released? I personally use the NVIDIA website so I can do a clean install. When I made my comment I was merely saying that if BOINC is currently being run on the side, then I would ASSUME you would want the latest. This being 300, which is a good thing for everyone all around. |
Joined: 13 Mar 12 Posts: 21 Credit: 8,773,573 RAC: 0 |
My apology. I thought that I was being urged to upgrade to the beta driver, etc.. I can upgrade to the 300.xx driver as soon as it becomes WHQL, just because my current setup will continue to work for now. And I did just upgrade my client software to 7.0.25, as requested. Dirk |
Joined: 8 Mar 12 Posts: 411 Credit: 2,083,882,218 RAC: 0 |
Quite all right. I've been trying to spread the word to various sites, b/c as Michael had stated, it has been a HUGE problem from the project's standpoint. MANY MANY errors have been caused by this monitor sleep bug, and when I said, "only the people that know what they're doing don't use it anyways" I should have been more clear. I meant the BOINC ONLY crowd, but since this can be a rather small percentage on some sites, many users who aren't BOINC ONLY (they attach a project and leave it) never check results, so they just keep producing errors w/o knowing it. No need to go to beta if you already prevent the monitor from sleeping, but not everyone does this, and this was the problem. People who play games etc. in their spare time want/need the latest drivers in order for their system to function properly (whether it's WHQL or not). All in all, it's great news for the BOINC community as a whole, b/c now everyone's happy (or will be soon anyway, when WHQL is released). From both the project's side (valid WU), and the latest and greatest PhysX etc. As always Happy Crunching |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,053,468,649 RAC: 1,308,024 |
@Michael Goetz: Trying to eliminate some confusion here: 'Service mode' and 'Protected Application Execution' are the same thing. In Windows Vista and Windows 7 GPUs can NOT be used in Service/PAE mode - in any version of BOINC (it's an OS restriction). In Windows XP, GPUs CAN be used in Service/PAE mode up to and including BOINC v6.12.34 - but not in the new BOINC v7.0.25 |
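The rules in the post above can be written down as a small lookup. A Python sketch under stated assumptions: `gpu_usable_in_service_mode` is a hypothetical helper, versions are passed as tuples, and any BOINC version newer than 6.12.34 is treated like 7.0.25. The post only names those two versions, so the in-between cases are a guess.

```python
def gpu_usable_in_service_mode(windows: str, boinc_version: tuple) -> bool:
    """Can BOINC use the GPU when installed in Service/PAE mode?

    windows: "XP", "Vista" or "7" (the systems named in the post).
    boinc_version: e.g. (6, 12, 34) or (7, 0, 25).
    """
    if windows in ("Vista", "7"):
        # OS restriction: not possible in any BOINC version.
        return False
    if windows == "XP":
        # Works up to and including 6.12.34. Assumption: anything newer
        # behaves like 7.0.25 and does not detect the GPU.
        return boinc_version <= (6, 12, 34)
    raise ValueError(f"Windows version not covered by the post: {windows!r}")
```

A user-mode (non-service) installation is outside this check; per the thread, it still detects GPUs.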
©2025 Universitat Pompeu Fabra