GPU problem
| Author | Message |
|---|---|
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
"If you go over quota for the day, let me have your hostid." Got 3: 2 normal and 1 shortie. Running the shortie now, thanks. |
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
"who, me?" Maybe it was overclocked by the vendor. According to http://en.wikipedia.org/wiki/GeForce_8_Series the shader should be clocked at 1375 MHz. You could try to reduce it. GDF |
MJH · Joined: 12 Nov 07 · Posts: 696 · Credit: 27,266,655 · RAC: 0
|
Hi, According to the Asus website, at http://www.asus.co.nz/news_show.aspx?id=9578, your card is a factory-overclocked 8800GS. Whilst overclocking is often acceptable for games, the GPUGRID application works the card very hard, so it is quite possible that the card is becoming unstable. I suggest that you try reducing the clocks back to the standard settings shown in the Wikipedia article given by GDF. Before you can change the clock frequencies, you may need to add the following to the Screen or Device section of /etc/X11/xorg.conf: Option "Coolbits" "1" Then restart X. The nvidia-settings program will then have a panel called "clock settings". MJH |
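
MJH's Coolbits step can be checked before restarting X. The following is only a minimal sketch: `has_coolbits` is a hypothetical helper name (not part of nvidia-settings or X), and the default path is the /etc/X11/xorg.conf location given in the post. It only reads the file and never edits it.

```shell
# has_coolbits: hypothetical helper, not part of any NVIDIA or X.Org tool.
# Returns 0 if the config has an uncommented Option "Coolbits" "<number>" line,
# 1 if it does not, and 2 if the file cannot be read.
has_coolbits() {
    conf="${1:-/etc/X11/xorg.conf}"   # default path taken from MJH's post
    [ -r "$conf" ] || return 2        # missing or unreadable config file
    # Lines starting with '#' are comments in xorg.conf, so anchor at line start.
    grep -Eiq '^[[:space:]]*Option[[:space:]]+"Coolbits"[[:space:]]+"[0-9]+"' "$conf"
}
```

For example, `has_coolbits || echo 'Coolbits not set'` reports when the option still has to be added by hand to the Screen or Device section.
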
Stefan Ledwina · Joined: 16 Jul 07 · Posts: 464 · Credit: 298,573,998 · RAC: 0
|
If you meant me with the oc'd card - no they aren't overclocked... My 9800GTX is an EVGA 9800 GTX SC (super clocked) and is a little bit overclocked by default, but it was stable 24/7 over one week when I was running the Folding@home GPU Client under Windows - Haven't had one error on this card at FAH. And the other two cards do show the same errors with ps3grid and they are for sure not oc'd - not by me and not by the vendor... ----- Just had another error after nine hours, with a new error code- The WU was http://www.ps3grid.net/PS3GRID/result.php?resultid=38719 ,and the error was <core_client_version>6.3.5</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> </stderr_txt> ]]> pixelicious.at - my little photoblog |
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
Hi, yep, that would make sense. I must admit I didn't realise it was an overclocked card; I just bought it because it was a cheap 8800. As far as temps go, it's 16C at idle and a max of 28C while running WUs. "Before you can change the clock frequencies, you may need to add the following to the screen or device section of /etc/X11/xorg.conf:" Yep, done that, and up pops the extra panel, but I can only adjust the GPU (at a default of 600 MHz) and Memory (at a default of 900 MHz) settings. There's no access to the Shader settings. I'm willing to knock both GPU and Memory down if that will help. Any suggestions for what to set them to? I've also done an "nvclock -r" (reset), but it didn't seem to change the output from "nvclock -i -f". |
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
OK, I've gone for GPU @ 550 and Memory @ 850. The shader is still at 1700 though; anyone know how to adjust that? Edit: oops, nope, it's dropped down to 1566 MHz. I'll have a play around with the GPU and Mem settings. |
UBT - NaRyan · Joined: 16 Jul 08 · Posts: 68 · Credit: 1,242,980 · RAC: 0
|
My 8800GT is factory overclocked. It should be 600 MHz Core, 1500 MHz Shader and 1800 MHz Memory; however, it runs at 700 MHz Core, 1700 MHz Shader and 2000 MHz Memory. And it's the computer that has so far been 100% stable. I do, however, have the fan set to 100% on it. *EDIT* Oops, forgot it was quad systems having probs, not dual-core ones *ahem* |
Stefan Ledwina · Joined: 16 Jul 07 · Posts: 464 · Credit: 298,573,998 · RAC: 0
|
OK, that's your dual core without problems, but the quad core also has some WUs with errors... Any word from GDF or MJH on the fact that these errors seem to appear only on quad-core computers? pixelicious.at - my little photoblog |
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
"Ok, that's your Dualcore without problems, but the Quadcore has also some WUs with errors..." I don't see any reason why quad cores could create problems. GDF |
UBT - NaRyan · Joined: 16 Jul 08 · Posts: 68 · Credit: 1,242,980 · RAC: 0
|
"Ok, that's your Dualcore without problems, but the Quadcore has also some WUs with errors..." Thing is, looking at the Top Computers list, every single quad core has them :( We need someone with an AMD quad to join to see if that has the same probs too. |
Stefan Ledwina · Joined: 16 Jul 07 · Posts: 464 · Credit: 298,573,998 · RAC: 0
|
OK, let's look a little bit further at the top hosts list... Computers with computation errors: stefan@home (Intel Quad Core); stefan@home (Intel Quad Core); JG4KEZ(Koichi Soraku) (Intel Quad Core Xeon); sneakysaurus (Intel Quad Core); Anonymous user (Intel Quad Core); UBT-NaRyan (Intel Quad Core); stefan@home (Intel Quad Core); Anonymous user (Intel Dual Core!); [AF>Linux>Gentoo] elgrande71 (Intel Dual Core!). Computers without computation errors: UBT-NaRyan (AMD Dual Core). OK, those are all the GPU computers I could find (I excluded GDF's computer because, although it also had computation errors, they were dated before the official start of gpugrid)... It seems the errors are not only related to quad cores, but the only computer without errors is an AMD dual core... I don't know what it means, but at least all the Intel computers show these errors... pixelicious.at - my little photoblog |
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
"got 3, 2 normal & 1 shortie, running the shortie now" The shortie turned out not to be a shortie :( The 1st one crashed again, but that was before I underclocked the card. The current WU is now 12 hours in with an estimated 1.5 hours to go. |
UBT - NaRyan · Joined: 16 Jul 08 · Posts: 68 · Credit: 1,242,980 · RAC: 0
|
"got 3, 2 normal & 1 shortie, running the shortie now" The shorties have the name "xxxxxxx-FASTTEST-x-x-xxxxx", and for me one took about 47 seconds to complete for 1.99 credits :) Down with the Kredit Kops!!! |
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
"The shorties have the name "xxxxxxx-FASTTEST-x-x-xxxxx" and for me it took about 47 Seconds to complete for 1.99 credits :)" I didn't get any of them then :( |
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
I think I may have sorted my GPU problems. I run my own little BOINC stats database for my team, and mysqld gets a tad busy for about 10 minutes during updates. I'm also in the process of upgrading my machine from Fedora 7 to Fedora 9. To do that, I've borrowed a machine from work and moved the stats over to it while I do the upgrade. I'm waiting a couple of days before upgrading the main machine, until I'm sure everything is running OK on the temp machine. Since moving the stats, all GPU WUs have completed successfully. OK, it's only done 2 and a bit WUs, but it's never managed that before: I was lucky if 1 in 5 succeeded, and never 2 in a row. It's a bit early to claim 100% victory, but it's looking good, maybe, touch wood :D |
UBT - NaRyan · Joined: 16 Jul 08 · Posts: 68 · Credit: 1,242,980 · RAC: 0
|
I just got another error on my quad: Task ID 39247. The last one I had on the quad was 4 days ago, so I can't moan about it. And the AMD dual core is still plodding along as happy as can be :) Down with the Kredit Kops!!! |
Nightlord · Joined: 22 Jul 08 · Posts: 61 · Credit: 5,461,041 · RAC: 0
|
Hi guys! A couple of error types to report. I loaded up two boxes with 8800GTs (the first box ran an 8600GT as a test bed for a couple of days). One box (this one) seems a bit unstable, with lots of compute errors. I've reduced the clock using nvclock as discussed earlier in the thread, and we'll see what happens. On the other box, I have some missing libcudart.so errors. Was a fix found for the missing libcudart.so discussed earlier? This host seems to do that on every second WU: as soon as it tries to start the second WU after completing the first, it fails for missing libcudart. I've checked, and the file is present in the /projects/ps3grid.net folder, so I'm stumped really. Both boxes are running Ubuntu 8.04, with 8800GTs fed from dual-core E4300s, which have been fine on BOINC up till now. |
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
"Hi guys! A couple of error types to report." We have a workaround for the missing libcudart problem. It seems to happen in a strange way, which we cannot replicate on our Fedora box. The workaround is to install the Nvidia toolkit (from the same page as the driver) and set the following command in your .bashrc file: export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib This should not be needed in the future. I will keep you updated when we find a solution. gdf |
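
GDF's .bashrc workaround can be applied so that running it twice does not add the export line twice. This is a rough sketch under stated assumptions: `add_cuda_lib_path` is a hypothetical helper name, the defaults are the ~/.bashrc file and /usr/local/cuda/lib path from the post, and the toolkit is assumed to be installed already.

```shell
# add_cuda_lib_path: hypothetical helper that appends GDF's export line
# to a shell rc file only if that exact line is not already present.
add_cuda_lib_path() {
    rc_file="${1:-$HOME/.bashrc}"          # rc file named in the post
    lib_dir="${2:-/usr/local/cuda/lib}"    # toolkit library path from the post
    # \$ keeps ${LD_LIBRARY_PATH} literal so it is expanded at login, not now.
    line="export LD_LIBRARY_PATH=\${LD_LIBRARY_PATH}:${lib_dir}"
    if [ -f "$rc_file" ] && grep -qxF -- "$line" "$rc_file"; then
        echo "unchanged: $rc_file"
    else
        printf '%s\n' "$line" >> "$rc_file"
        echo "updated: $rc_file"
    fi
}
```

Calling `add_cuda_lib_path` with no arguments uses the defaults above; the new setting only takes effect in shells started afterwards, so BOINC would need to be restarted from such a shell.
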
Nightlord · Joined: 22 Jul 08 · Posts: 61 · Credit: 5,461,041 · RAC: 0
|
Thanks for your tip. I'll keep a watch on the boxes and patch if needed. I reduced the clocks slightly on the machine that appeared to have a stability problem, and it seems fine now. The other machine has not encountered a libcudart.so failure since Saturday evening. |
|
Joined: 12 Jul 07 · Posts: 100 · Credit: 21,848,502 · RAC: 0
|
Hi, I had a WU running for ~9 hours when the benchmarks kicked in. The WU resumed after the benchmarks and promptly failed with error 1. I know my card isn't the most stable, but this failure looks to have been caused by the benchmarks. If a benchmark is due within the estimated WU run period, is there any way the benchmark could be run before starting the WU? |