Do I have to be made to click "OK" on failed task?
| Author | Message |
|---|---|
|
Joined: 5 Jan 09 Posts: 32 Credit: 1,412,042,305 RAC: 0
|
Hi, I don't switch over to the different machines I have running grid computing particularly often, and I just noticed an error message for a task that had failed: it had put up a little window that needed OK to be clicked. In BOINC Manager I could see the GPUGrid task had been running for 3 days, waiting for someone to click OK to say "it's failed, let's move on". As soon as I clicked OK, the task status shifted to computing error, and the next task started. Is there any way this can be avoided? I assume I just lost 3 days of GPU processing, and hopefully not the electricity too. Thanks, Far |
|
Joined: 15 Apr 10 Posts: 123 Credit: 1,004,473,861 RAC: 0
|
I had the same issue when I was using the 295 nvidia drivers. CUDA would crash when the monitor went to sleep and a new task was started. Either downgrade or upgrade the drivers. XtremeSystems.org - #1 Team in GPUGrid |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Far, I had the same issue recently. I expect this happened on one of your XP systems? Anyway, a restart is in order. Also, change the monitor to never turn itself off, and just turn it off manually. FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Recently I was creating quite a few errors here (my fault) and this never happened on 2 hosts. So it's definitely some special case on your side. I don't know what causes it, though. Trying a newer driver is a good idea. And maybe you recently installed some GPU programming tools which activated some debug mode, which causes this message? MrS Scanning for our furry friends since Jan 2002 |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
|
|
Joined: 5 Jan 09 Posts: 32 Credit: 1,412,042,305 RAC: 0
|
Thanks for all the suggestions, guys. I'm running the latest (non-beta) drivers, 306.81, on XP. There are no GPU programming tools installed. The monitor is already set to never sleep (as is everything else under the energy profile). I can't reboot the machine as it has other processes running which require a pw/uid logon, and I can't store that info in a file anywhere. It sounds like the simplest option is just for me to try to remember to connect to the machines and check up on them more often. I'm going to have a look at the eFMer BOINC app at some point when I can find time, in case I can spot from my phone that something is taking an unduly long amount of time and go check it out. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Use the auto-logon and if you want to run some app, put it in the startup folder using a run as batch file. I guess you could also use Zoltan's script and set up an administrator alert by email, or disable and re-enable the card/driver, but exactly how to get the alert going could be tricky and it's a fair bit of work. FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Instead of checking each machine manually you could look at your hosts on GPUGrid, under your account, and see when they last contacted the server. MrS Scanning for our furry friends since Jan 2002 |
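
For anyone who would rather automate the "has this task been sitting there for days?" check discussed above, here is a minimal sketch in Python. It is not from the thread and not Zoltan's script. It assumes the standard `boinccmd --get_tasks` command is available on the host, and that its output lists each task under a header like `1) -----------` with `name:`, `active_task_state:` and `fraction done:` fields; those field names are assumptions and are passed as parameters so they can be adjusted to whatever your client version prints. The function names `parse_tasks`, `stalled_tasks` and `snapshot` are made up for this example. The idea is to take two snapshots some time apart and flag any task that is reported as executing but has made no progress in between.

```python
import re
import subprocess


def parse_tasks(text):
    """Split boinccmd-style task output into one dict of fields per task."""
    tasks = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if re.match(r"^\d+\) -+$", line):
            # A line such as "1) -----------" starts a new task entry.
            current = {}
            tasks.append(current)
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key.strip()] = value.strip()
    return tasks


def stalled_tasks(before, after,
                  state_key="active_task_state",   # assumed field name
                  progress_key="fraction done"):   # assumed field name
    """Return names of tasks reported as executing whose progress did not advance."""
    earlier = {t.get("name"): t for t in parse_tasks(before)}
    stalled = []
    for task in parse_tasks(after):
        old = earlier.get(task.get("name"))
        if old is None:
            continue  # task was not present in the first snapshot
        if task.get(state_key) != "EXECUTING":
            continue  # only tasks that claim to be running are of interest
        if float(task.get(progress_key, 0)) <= float(old.get(progress_key, 0)):
            stalled.append(task["name"])
    return stalled


def snapshot():
    """Capture the current task list from the local BOINC client."""
    result = subprocess.run(["boinccmd", "--get_tasks"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

A possible way to use it: call `snapshot()`, wait long enough that a healthy GPUGrid task would have advanced (say an hour), call `snapshot()` again, and pass both strings to `stalled_tasks`. Anything it returns is worth a look, though whether a task stuck behind an error dialog really shows up as executing with frozen progress is something to confirm on your own machine.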