Long run Santi worse than Noelia

skgiven
Message 31824 - Posted: 5 Aug 2013, 7:22:51 UTC - in response to Message 31822.  

TJ, have you ever tried installing the drivers without the stuff you don't need; sound, 3D vision, updater?

TJ
Message 31825 - Posted: 5 Aug 2013, 8:06:48 UTC - in response to Message 31824.  

TJ, have you ever tried installing the drivers without the stuff you don't need; sound, 3D vision, updater?

Yes, I did, with no result. I also tried running GPUGRID with nothing crunching on the CPU, and it made no difference. Overnight two SRs from Santi finished okay, and yesterday one LR did, so it is going better. Seeing errors at my wingmen too gives me confidence that the card is not faulty, but I will swap it when it becomes less warm here.
But on this particular rig I had more errors with Santi than with Noelia; that's why I started this thread.
Greetings from TJ

ExtraTerrestrial Apes
Message 31830 - Posted: 5 Aug 2013, 18:56:17 UTC - in response to Message 31821.  

My AMD PC with a 770 and driver 320.49 had nil errors yet and did all types of tasks.
It looks to me that the 660 and Santi WU's need a special set up.

Since both are CC 3.0 Keplers, I'm pretty sure the difference lies somewhere other than the card models in general. If, on the other hand, you're talking about one specific card (one that is OC'ed almost too far, runs too warm, or whatever), that could well be the cause.

MrS
Scanning for our furry friends since Jan 2002

Retvari Zoltan
Message 31832 - Posted: 5 Aug 2013, 21:33:01 UTC
Last modified: 5 Aug 2013, 21:33:20 UTC

There is definitely something going on. :)
My main cruncher restarted twice while crunching 23x15-SANTI_RAP74wt-7-34-RND0953_0.
Later on, 13x10-SANTI_RAP74wt-8-34-RND8572_0 ran into an error, and the subsequent 84-NOELIA_005p_express-2-3-RND1913_0 also ran into an error (on 3 other hosts too). The following 9x0-SANTI_RAP74wt-6-34-RND5647_1 had a stuck progress indicator, so I aborted it, but the next WU showed the same symptom, so I restarted the host (that WU then finished fine).

My other host (with a new CPU and MB) had similar computation errors and two WTF-type SANTI LRs:

47x9-SANTI_RAP74wt-8-34-RND9533_0
Run time: 55,832s
CPU time: 39,996s
Approximate elapsed time for entire WU: 28,272s

30x6-SANTI_RAP74wt-11-34-RND0448_0
Run time: 55,403s
CPU time: 50,072s
Approximate elapsed time for entire WU: 28,634s

I've double-checked that the GPUs didn't downclock.

The three different runtimes should be nearly the same, like this:
47x19-SANTI_RAP74wt-6-34-RND8040_1
Run time: 27,474s
CPU time: 27,362s
Approximate elapsed time for entire WU: 27,483s

These two strange SANTI LRs were acting as if they were crunching: 95% GPU usage and a progress indicator showing some progress, but the running time was abnormally high. I thought my GPUs had downclocked, but they hadn't. So I restarted the host, expecting the workunits to fail, but to my surprise they didn't. Instead they started again from 0% progress while their elapsed time was ~9 hours, and they finished successfully with these strange run times.
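For anyone who wants to check their own results the same way, here is a minimal Python sketch of the comparison described in this post. It uses the three sets of numbers quoted above. The `TaskTimes` class, the function names, and the 5% tolerance are illustrative assumptions, not anything GPUGRID or BOINC provides, and "nearly the same" may only hold on hosts set up like the ones in this post.

```python
from dataclasses import dataclass


@dataclass
class TaskTimes:
    """The three times reported for one finished workunit, in seconds."""
    name: str
    run_time_s: float
    cpu_time_s: float
    approx_elapsed_s: float


def max_relative_spread(t: TaskTimes) -> float:
    """Return (largest - smallest) / smallest across the three times."""
    values = (t.run_time_s, t.cpu_time_s, t.approx_elapsed_s)
    return (max(values) - min(values)) / min(values)


def is_suspicious(t: TaskTimes, tolerance: float = 0.05) -> bool:
    """Flag a task whose three times differ by more than `tolerance`.

    The 5% default is an arbitrary assumption chosen for illustration.
    """
    return max_relative_spread(t) > tolerance


# Numbers copied from the post above.
tasks = [
    TaskTimes("47x9-SANTI_RAP74wt-8-34-RND9533_0", 55832, 39996, 28272),
    TaskTimes("30x6-SANTI_RAP74wt-11-34-RND0448_0", 55403, 50072, 28634),
    TaskTimes("47x19-SANTI_RAP74wt-6-34-RND8040_1", 27474, 27362, 27483),
]

for t in tasks:
    flag = "suspicious" if is_suspicious(t) else "ok"
    print(f"{t.name}: spread {max_relative_spread(t):.1%} -> {flag}")
```

With these inputs the first two tasks are flagged and the third is not, which matches the description above.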

TJ
Message 31838 - Posted: 6 Aug 2013, 8:57:34 UTC - in response to Message 31832.  
Last modified: 6 Aug 2013, 9:26:19 UTC

One LR Santi errored after 21000 seconds, but one finished okay. I also had a system crash overnight, though; the cause seems to be a Noelia. After a restart I got a Santi SR, and that failed after 1419 seconds. Thankfully I have a Noelia LR now! However, I also had a Noelia fail and crash the nVidia driver; when they fail, they do so within seconds.
If I am at my rigs and I see a Santi, I will abort it for the time being. They error far too often. I don't care about the credits, but I do care about the wasted energy. The GPU could be working on another project instead, to help science.
Greetings from TJ

Beyond
Message 31852 - Posted: 7 Aug 2013, 0:18:02 UTC - in response to Message 31750.  
Last modified: 7 Aug 2013, 0:18:28 UTC

My GTX 670, GTX 560, GTX 650 Ti and GTX 460/768MB GPUs are now at 84 SANTI_RAP74 WUs completed, with only 1 WU that errored after a bit over an hour. Again, I'm using 310.90 drivers in Win7-64. The only WUs that are a problem here are the NOELIA_RUN and the new 290... NOELIAs, but that's because they need > 1GB memory or they slow to a crawl. Personally I wish all the WUs were SANTIs.

Jim1348
Message 31853 - Posted: 7 Aug 2013, 1:58:15 UTC
Last modified: 7 Aug 2013, 2:18:14 UTC

My own recent experience is that Santis are fine if you don't overclock, but they are very particular. I recently had to reduce the clocks on my GTX 660s from 993 MHz (a factory overclock of just over 1 percent) to 980 MHz, the Nvidia-specified rate. I have also raised the core voltage a small amount for some additional headroom. None of that had been necessary for any of the other long runs. The Santis work fine until you get a hard one, and then they fail.

That is OK; I want the science to use my cards to the maximum extent, and I don't want them to slow down the science just to accommodate some card's overclock, but we need to be aware of it. I have not found that the driver version makes a difference; it may just be a slight difference in the difficulty of the work units themselves that matters. You can go for weeks at a time with no problems, and then they lower the boom on you.

John C MacAlister
Message 31855 - Posted: 7 Aug 2013, 3:14:36 UTC - in response to Message 31852.  

Hi, Beyond:

I too wish all WUs behaved as well as the SANTIs!! :)

John

TJ
Message 31859 - Posted: 7 Aug 2013, 8:49:30 UTC

I don't want Santis any more: again 4 failed in a row, both LR and SR, on the 660. I also have the 310.19 drivers; in fact I have tried all drivers from 3xx.xx up to the latest beta, with no OC. With Noelia and Nathan I get good results with every driver.
Thankfully my 660 has a Noelia now for the next 14 hours, so I can safely leave the PC alone.
My AMD with the GTX 770 and the latest official driver does all WUs okay, including Santi. So I will only buy 690 and higher cards. They are more expensive, but 30 or more errors after running 60% cost a lot of energy that was completely wasted.
Greetings from TJ

Beyond
Message 31865 - Posted: 7 Aug 2013, 13:50:11 UTC - in response to Message 31859.  

with GTX770 and the latest official driver does all WUs okay, including Santi. So I will only buy 690 and higher cards. They are more expensive, but 30 or more errors after running 60% cost a lot of energy that was completely wasted.

TJ, the SANTIs run better on low end cards than any other WUs. They even run on my GTX 460/768MB. There's something wrong either with your system setup or your GTX 660. Looking at your computer it reports:

NVIDIA GeForce GTX 660 (2048MB) driver: 311.6

The 311.6 driver is bad. When I tested it I had many errors. I returned to 310.90 and the errors disappeared. I think you need to use Driver Sweeper to totally remove the traces of all the drivers you've tried. Then install 310.90 and see what happens. If you swap the 660 into a known good system and it still fails, return it or send it to the manufacturer for repair.

chip
Message 31867 - Posted: 7 Aug 2013, 14:36:30 UTC

Strange times...

ct 7840.080000 nm 39x16-SANTI_RAP74wt-8-34-RND8382_1 et 29700.570862
ct 7998.639000 nm 27x0-SANTI_RAP74wt-8-34-RND7549_2 et 30814.709614
ct 10096.990000 nm 9x10-SANTI_RAP74wt-11-34-RND7306_0 et 39109.705767

ct 6607.001000 nm 041px45-NOELIA_KLEBE-0-3-RND7216_1 et 29152.602219
ct 9825.645000 nm 148nx13-NOELIA_KLEBE-0-3-RND0834_0 et 41916.915213
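
A rough Python sketch for reading lines like the ones above and putting the two times side by side. It assumes that `ct` is CPU time, `et` is elapsed time (both in seconds), and `nm` is the task name; that reading of the keys is an assumption here, and the function name `parse_times` is made up for illustration.

```python
# Lines copied from the post above.
LOG_LINES = """\
ct 7840.080000 nm 39x16-SANTI_RAP74wt-8-34-RND8382_1 et 29700.570862
ct 7998.639000 nm 27x0-SANTI_RAP74wt-8-34-RND7549_2 et 30814.709614
ct 10096.990000 nm 9x10-SANTI_RAP74wt-11-34-RND7306_0 et 39109.705767
ct 6607.001000 nm 041px45-NOELIA_KLEBE-0-3-RND7216_1 et 29152.602219
ct 9825.645000 nm 148nx13-NOELIA_KLEBE-0-3-RND0834_0 et 41916.915213
"""


def parse_times(line: str) -> tuple:
    """Split one 'key value key value ...' line into (name, ct, et).

    Assumes 'ct' = CPU seconds and 'et' = elapsed seconds.
    """
    tokens = line.split()
    # Pair up alternating keys and values: ct/nm/et -> their values.
    fields = dict(zip(tokens[0::2], tokens[1::2]))
    return fields["nm"], float(fields["ct"]), float(fields["et"])


rows = [parse_times(line) for line in LOG_LINES.splitlines() if line.strip()]

for name, ct, et in rows:
    # Ratio of CPU time to elapsed time for each task.
    print(f"{name}: ct={ct:.0f}s et={et:.0f}s ct/et={ct / et:.2f}")
```

This only prints the ratio per task; whether a given ratio is normal depends on the host and its settings, so it is a way to compare tasks on one machine rather than a pass/fail check.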

TJ
Message 31869 - Posted: 7 Aug 2013, 15:08:22 UTC - in response to Message 31865.  

You're right, Beyond. If I check the first page of BOINC I do indeed see 311.6; however, I am absolutely sure I downloaded and installed 310.90. I even saved the file just in case; here it is: 310.90-desktop-win8-win7-winvista-64bit-english-whql.
After a clean install (remove the nVidia software, reboot, CCleaner, reboot, install the new driver as a clean install, reboot, CCleaner, reboot, and then BOINC), it still seems to be the 311 one. I got it from the nVidia US site.
Greetings from TJ

Jim1348
Message 31875 - Posted: 7 Aug 2013, 15:59:59 UTC - in response to Message 31869.  

You're right, Beyond. If I check the first page of BOINC I do indeed see 311.6; however, I am absolutely sure I downloaded and installed 310.90. I even saved the file just in case; here it is: 310.90-desktop-win8-win7-winvista-64bit-english-whql.

It is because Windows automatic update gives you the new one as soon as you install 310.90. In fact, it might have even found it on your system from a previous download, and installed it without even having to download it again. (It is another feature from the geniuses at MS).

Try the following:
Control Panel/System/Advanced System Settings => Hardware Tab/Device Installation Settings
"No, let me choose what to do" => Never install driver software from Windows Update

However, to get rid of the ones already downloaded, I think you will need to uninstall them manually yourself. First, get rid of whatever you can from Control Panel/Programs and Features, rebooting as necessary. Then go into Device Manager and remove and delete the drivers from the computer. (I also run Driver Sweeper, or the newer version Driver Fusion at that point too.) Then, when you get back to the default Microsoft VGA drivers, you can start over again.
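
After starting over, it is worth confirming that the driver actually in use is the one you meant to install. The following is only a sketch: it assumes `nvidia-smi` is installed and reachable on the PATH, and the helper names are made up for illustration. Comparing versions as number pairs means "311.06" and "311.6" are treated as the same driver, which is a deliberate assumption.

```python
import subprocess
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '310.90' into a tuple like (310, 90)."""
    return tuple(int(part) for part in text.strip().split("."))


def same_driver(expected: str, reported: str) -> bool:
    """Compare two driver versions numerically rather than as raw text."""
    return parse_version(expected) == parse_version(reported)


def query_installed_driver() -> Optional[str]:
    """Ask nvidia-smi for the driver version; return None if that fails.

    Assumes nvidia-smi is on the PATH, which may not be true on every system.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (FileNotFoundError, subprocess.CalledProcessError,
            subprocess.TimeoutExpired):
        return None
    lines = result.stdout.strip().splitlines()
    # With several GPUs there is one line per GPU; the first is enough here.
    return lines[0].strip() if lines else None
```

For example, `same_driver("310.90", "311.6")` is false, which is the mismatch described earlier in this thread: 310.90 was installed, but 311.6 was reported.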


TJ
Message 31878 - Posted: 7 Aug 2013, 17:44:11 UTC - in response to Message 31875.  

Thanks Jim,

I normally set Windows not to update drivers, but I forgot to do that on this particular system. I will let the Noelias finish and then start working on this one.
But I see you have great results on your 660s with the 314 drivers, so the 310 drivers are not necessary.
Greetings from TJ

Jim1348
Message 31882 - Posted: 7 Aug 2013, 18:50:36 UTC - in response to Message 31878.  

I think the 314 drivers are fine, but it doesn't really matter in my case, since I use a dedicated PC for GPUGrid and don't need the card for display purposes. My only suggestion is if you do upgrade, do a clean uninstall with Driver Sweeper or Driver Fusion; it solves a lot of problems.

Beyond
Message 31885 - Posted: 7 Aug 2013, 19:38:46 UTC - in response to Message 31882.  

My only suggestion is if you do upgrade, do a clean uninstall with Driver Sweeper or Driver Fusion; it solves a lot of problems.

TJ, we keep saying this. Please do it.

TJ
Message 31895 - Posted: 8 Aug 2013, 13:23:23 UTC

Okay, the 310.90 drivers are in effect now, and the system was cleaned with Driver Sweeper as well. To test it, I am letting it run SRs for now, with no tasks on the CPU at first.
Greetings from TJ

©2026 Universitat Pompeu Fabra