Error while computing

barts
Joined: 28 Aug 09
Posts: 12
Credit: 4,537,060
RAC: 0
Message 17639 - Posted: 16 Jun 2010, 15:59:50 UTC

Almost all my GPUGRID WUs fail after 5 seconds with "Computation Error".

Boinc 6.10.56
wxWidgets 2.8.10
Nvidia GTX275 driver 8.17.11.9745

Some WUs still compute correctly.

What can this be? I have not updated anything recently.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 17643 - Posted: 16 Jun 2010, 22:09:10 UTC - in response to Message 17639.  

At least you are completing the odd task, but your problem is at least 16 days old, going by your tasks. You seem to complete the tasks that actually run for any length of time, but most fail after a few seconds, so you could be sitting idle for long periods (after too many failures)!

Maybe a different driver will work. You could try the most recent one, or perhaps a much older driver such as 195.xx.

A few weeks back I had the same problem with my GTX260 on Win 7 x64 (same as you). In the end I gave up and put it into an XP system! It now works fine.

The problem may be related to the RAM size reported on Win7 systems versus the size expected by the app or BOINC. Yours is reported as:
NVIDIA GeForce GTX 275 (877MB) driver: 19745
- I'm guessing it actually has 896MB.

So I would suggest you try the 257.21 driver released in the last day or so. If that fails try an older driver.

NVidia

Good luck,

barts
Message 17658 - Posted: 17 Jun 2010, 18:36:54 UTC - in response to Message 17643.  

Gosh... I must have been on the NVidia site just a split second before this new driver was out... I updated the driver, but I am idle (reached the limit of 5 tasks per day), so I must wait a while to see if it makes a difference.

I will keep an eye on it for the coming days.

The best proof that I did not change a thing is that this problem started during my vacation. No automatic updates are carried out on my system, so I am pretty sure that it is not because of changes in my system.

Will see what happens with the new driver

barts
Message 17667 - Posted: 18 Jun 2010, 15:04:54 UTC - in response to Message 17658.  

Updated the driver.... BTW, I do not do any overclocking or the like...

These are the messages from the last WUs... the problem is still the same:

18/06/2010 07:13:56 GPUGRID Starting h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0
18/06/2010 07:13:56 GPUGRID Starting task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 using acemd2 version 605
18/06/2010 07:14:14 GPUGRID Computation for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 finished
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_1 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_2 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_3 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Starting h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0
18/06/2010 07:14:14 GPUGRID Starting task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 using acemd2 version 605
18/06/2010 07:14:15 GPUGRID Started upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_0
18/06/2010 07:14:15 GPUGRID Started upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_4
18/06/2010 07:14:16 GPUGRID Finished upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_0
18/06/2010 07:14:16 GPUGRID Finished upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_4
18/06/2010 07:14:16 GPUGRID Started upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_7
18/06/2010 07:14:17 GPUGRID Finished upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_7
18/06/2010 07:14:32 GPUGRID Computation for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 finished
18/06/2010 07:14:32 GPUGRID Output file h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_1 for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 absent
18/06/2010 07:14:32 GPUGRID Output file h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_2 for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 absent
18/06/2010 07:14:32 GPUGRID Output file h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_3 for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 absent
18/06/2010 07:14:33 GPUGRID Started upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_0
18/06/2010 07:14:33 GPUGRID Started upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_4
18/06/2010 07:14:34 GPUGRID Finished upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_0
18/06/2010 07:14:34 GPUGRID Finished upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_4
18/06/2010 07:14:34 GPUGRID Started upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_7
18/06/2010 07:14:35 GPUGRID Finished upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_7


Any suggestions? (Meanwhile I will try an older driver.)
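
The log excerpt above can be summarized mechanically. As a minimal sketch, assuming the event log keeps the "DD/MM/YYYY HH:MM:SS PROJECT message" layout shown in this post (other BOINC installs print dates differently, so the pattern would need adjusting), the following Python pairs each "Starting task" line with its "Computation for task ... finished" line and counts the "absent" output files. The name `summarize` and the regular expressions are illustrative assumptions, not part of BOINC.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line layout, taken from the excerpt in this post:
#   18/06/2010 07:13:56 GPUGRID <message text>
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (\S+) (.*)$")
START_RE = re.compile(r"^Starting task (\S+) using")
FINISH_RE = re.compile(r"^Computation for task (\S+) finished$")
ABSENT_RE = re.compile(r"^Output file \S+ for task (\S+) absent$")

SAMPLE = """\
18/06/2010 07:13:56 GPUGRID Starting h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0
18/06/2010 07:13:56 GPUGRID Starting task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 using acemd2 version 605
18/06/2010 07:14:14 GPUGRID Computation for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 finished
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_1 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_2 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_3 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
"""


def summarize(log_text):
    """Return {task: {"seconds": run time or None, "absent": missing-file count}}."""
    started = {}
    summary = defaultdict(lambda: {"seconds": None, "absent": 0})
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if line is None:
            continue  # skip lines that do not follow the assumed layout
        when = datetime.strptime(line.group(1), "%d/%m/%Y %H:%M:%S")
        message = line.group(3)

        start = START_RE.match(message)
        finish = FINISH_RE.match(message)
        absent = ABSENT_RE.match(message)
        if start:
            started[start.group(1)] = when
        elif finish and finish.group(1) in started:
            task = finish.group(1)
            summary[task]["seconds"] = (when - started[task]).total_seconds()
        elif absent:
            summary[absent.group(1)]["absent"] += 1
    return dict(summary)


if __name__ == "__main__":
    for task, info in summarize(SAMPLE).items():
        print(task, info)
```

For the first task in the excerpt this reports a run time of 18 seconds and three absent output files, which matches the pattern of a task that dies almost immediately after starting.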

barts
Message 17668 - Posted: 18 Jun 2010, 15:17:44 UTC - in response to Message 17667.  

as expected.... an older driver has the same result as before.

so it is likely not the driver, but something else...

suggestions still welcome

skgiven
Message 17672 - Posted: 19 Jun 2010, 0:58:11 UTC - in response to Message 17668.  

Use XP or Linux, if you can.

barts
Message 17680 - Posted: 21 Jun 2010, 6:07:17 UTC - in response to Message 17672.  

Not possible...

Why does GPUGRID not make a more stable application?

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 17681 - Posted: 21 Jun 2010, 6:26:00 UTC - in response to Message 17680.  

Hi Barts

Can you open Device Manager, right-click your card, select Properties and then Details? Is there an entry called "Install Error" anywhere in the pull-down list?



Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 17684 - Posted: 21 Jun 2010, 13:20:16 UTC - in response to Message 17681.  
Last modified: 21 Jun 2010, 13:24:28 UTC

I would encourage you to try Linux on a stick: http://www.gpugrid.net/forum_thread.php?id=2203

barts
Message 17687 - Posted: 22 Jun 2010, 20:53:54 UTC - in response to Message 17681.  
Last modified: 22 Jun 2010, 21:18:39 UTC

Hi, no install errors in that driver section; however, I do see multiple entries {3ab22e31-8264-4b4e-9af5-a8d2d8e33e62} with [1]..[17] and [25] behind them.

About Linux... uhm... I have nothing against Linux, although the support for Nvidia is 'difficult'...

So again... why is GPUGRID not making the application more stable? Mine was running OK, and without any driver updates it started going bad.

skgiven
Message 17697 - Posted: 25 Jun 2010, 14:42:42 UTC - in response to Message 17687.  
Last modified: 25 Jun 2010, 15:04:56 UTC

I'm getting runaway errors for TONI_CAPBIND on my quad GT240 system. The other tasks work fine. Each TONI_CAPBIND fails after about 20sec.
Vista Ult x64, driver 19621.

I have now stopped picking up new tasks; communication is deferred for 7h.
I cannot change the operating system, it gets used too much.
Did a system restart. One task (TONI_HERG) is due to complete in about 90min, so I will see if the restart made any difference.

skgiven
Message 17698 - Posted: 25 Jun 2010, 16:27:17 UTC - in response to Message 17697.  

When I manually reported the finished TONI_HERG work unit, Boinc picked up 2 new tasks :) Fortunately they are TONI_KID work units and both have made it to 1% (about 7min).

barts
Message 17707 - Posted: 26 Jun 2010, 20:37:49 UTC - in response to Message 17684.  

I would encourage GPUGRID to do a decent round of debugging on these tasks, which seem to be highly unstable, or to come out with a good and clear report of why these tasks fail. Then we could do something about it.... and GPUGRID would not have all those failed tasks.

robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 769,991,668
RAC: 0
Message 17715 - Posted: 27 Jun 2010, 20:04:48 UTC
Last modified: 27 Jun 2010, 20:25:06 UTC

GPUGRID has an option to participate in testing of new software versions that have passed server site testing but still need additional testing on a wider variety of computers. You may want to check if you have unknowingly set your account to participate.

Also, there is one BOINC CPU project you may want to avoid for now, since it does not seem fully compatible with GPUGRID: PrimeGrid.

A few comments on why one of my workunits failed:

6/27/2010 7:15:15 AM GPUGRID Computation for task D273r4-TONI_HERGunb1-59-100-RND6573_0 finished
6/27/2010 7:15:15 AM GPUGRID Output file D273r4-TONI_HERGunb1-59-100-RND6573_0_1 for task D273r4-TONI_HERGunb1-59-100-RND6573_0 absent
6/27/2010 7:15:15 AM GPUGRID Output file D273r4-TONI_HERGunb1-59-100-RND6573_0_2 for task D273r4-TONI_HERGunb1-59-100-RND6573_0 absent
6/27/2010 7:15:15 AM GPUGRID Output file D273r4-TONI_HERGunb1-59-100-RND6573_0_3 for task D273r4-TONI_HERGunb1-59-100-RND6573_0 absent

The failure happened just after I had enabled getting workunits from the PrimeGrid BOINC project and got several workunits with the completion time overestimated enough that two of them went into high-priority mode immediately. A third CPU core was already running a workunit from The Lattice Project in high-priority mode. I had set BOINC not to use the fourth CPU core. It looks like the GPUGRID workunit was simply not able to recover from having all the CPU cores BOINC could use in high-priority mode at once.

barts
Message 17739 - Posted: 28 Jun 2010, 22:04:33 UTC - in response to Message 17715.  

This has nothing to do with the number of CPU cores. I have AQUA running as well, taking all my cores, and a GPUGRID task can still run. ACEMD2: GPU molecular dynamics runs fine here..

It is really a very unstable TONI_* or one of the other new WUs.

Better that GPUGRID has a look at this instability before they send out more of those WUs.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 17741 - Posted: 28 Jun 2010, 23:28:17 UTC - in response to Message 17739.  

Hey barts ... I took a look at about 10 of your errored WUs, and what I noticed is that they are all different WU types and most of them have already been successfully completed by other computers (no multiple errors on different machines). Please double-check around a little before claiming there is an unstable WU type and throwing the blame blanket on GPUGrid.
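
The check described here, grouping errored tasks by WU type, can be sketched in a few lines. This is only an illustration under stated assumptions: the task names are the three failed ones quoted in this thread, the "type" is guessed to be the upper-case label after the first hyphen (for example TONI_CAPBIND), and `family` and `tally` are hypothetical helper names rather than anything provided by BOINC or GPUGRID.

```python
import re
from collections import Counter

# Assumption: the WU type is the upper-case LABEL_LABEL run that follows the
# first hyphen in a task name, e.g. "TONI_CAPBIND" in
# "h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0".
FAMILY_RE = re.compile(r"-([A-Z]+_[A-Z]+)")


def family(task_name):
    """Return the guessed WU type for a task name, or "unknown"."""
    match = FAMILY_RE.search(task_name)
    return match.group(1) if match else "unknown"


def tally(results):
    """Count outcomes per WU type from (task_name, succeeded) pairs."""
    counts = {}
    for name, succeeded in results:
        bucket = counts.setdefault(family(name), Counter())
        bucket["ok" if succeeded else "failed"] += 1
    return counts


# The three failed tasks whose logs are quoted in this thread.
RESULTS = [
    ("h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0", False),
    ("h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0", False),
    ("D273r4-TONI_HERGunb1-59-100-RND6573_0", False),
]

if __name__ == "__main__":
    for wu_type, outcome in tally(RESULTS).items():
        print(wu_type, dict(outcome))
```

If failures cluster in one type across many hosts, the batch is suspect; if they are spread across types on a single host, the host is the more likely cause.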

Might I suggest a clean install of the driver? Uninstall, boot to safe mode (F8), run Driver Sweeper to clean up any old remnants, boot again to safe mode and install the driver you want to use. Now reboot one more time and see how it goes.

Do you have your BOINC directories excluded from AV scanning? Both the data and Program directories.
Thanks - Steve

=Lupus=
Joined: 10 Nov 07
Posts: 10
Credit: 12,777,491
RAC: 0
Message 17742 - Posted: 28 Jun 2010, 23:30:40 UTC

I have observed that in the last few days some TONI_CAPBINDs failed on my machine... 2 cancelled by server (OK, that's not an error), 3 with exit code 98, one just finished OK... There seems to be a problem with them ^.^

BOINC_64_6.10.56 on Vista64,
"27.06.2010 19:00:33 NVIDIA GPU 0: GeForce GTX 260 (driver version 19107, CUDA version 2030, compute capability 1.3, 896MB, 582 GFLOPS peak)"
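
Startup lines like the one above carry the details that matter when comparing hosts (driver, memory, compute capability). A minimal sketch for pulling them apart follows, assuming the exact wording of the line quoted in this post. The function name `parse_gpu_line` is hypothetical, and reading the driver integer 19107 as version 191.07 is an assumption about how the number is encoded.

```python
import re

# Pattern written against the single startup line quoted in this post; other
# BOINC versions may word the line differently.
GPU_RE = re.compile(
    r"NVIDIA GPU (\d+): (.+?) \(driver version (\d+), CUDA version (\d+), "
    r"compute capability ([\d.]+), (\d+)MB, (\d+) GFLOPS peak\)"
)


def parse_gpu_line(line):
    """Return the GPU fields as a dict, or None if the line does not match."""
    match = GPU_RE.search(line)
    if match is None:
        return None
    raw_driver = match.group(3)
    return {
        "index": int(match.group(1)),
        "model": match.group(2),
        # Assumption: the last two digits are the minor version (19107 -> 191.07).
        "driver": f"{raw_driver[:-2]}.{raw_driver[-2:]}",
        "cuda_raw": match.group(4),
        "compute_capability": match.group(5),
        "memory_mb": int(match.group(6)),
        "gflops_peak": int(match.group(7)),
    }


LINE = (
    "27.06.2010 19:00:33 NVIDIA GPU 0: GeForce GTX 260 (driver version 19107, "
    "CUDA version 2030, compute capability 1.3, 896MB, 582 GFLOPS peak)"
)

if __name__ == "__main__":
    print(parse_gpu_line(LINE))
```

Comparing the `memory_mb` value across hosts is one way to look for the reported-versus-actual memory difference mentioned earlier in the thread (877MB reported against a guessed 896MB).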

skgiven
Message 17750 - Posted: 29 Jun 2010, 9:26:23 UTC - in response to Message 17742.  

Some of the work units must be different in some way that causes them to fail, usually after a few seconds. Some tasks just won’t run for me while others work fine. This is mostly the case on Vista and Win7, so it is operating system related, depends on your exact GPU, and in the recent past (last few months) definitely driver related too (I found some drivers work for some tasks, while other drivers fail all tasks). So it is just down to getting the correct driver for the tasks (if you can). Otherwise the only choice is to change operating system. XP and Linux seem to work the best.

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1958
Credit: 629,356
RAC: 0
Message 17751 - Posted: 29 Jun 2010, 9:32:58 UTC - in response to Message 17739.  

"This has nothing to do with the number of CPU cores. I have AQUA running as well, taking all my cores, and a GPUGRID task can still run. ACEMD2: GPU molecular dynamics runs fine here..

It is really a very unstable TONI_* or one of the other new WUs.

Better that GPUGRID has a look at this instability before they send out more of those WUs."



Barts,
these workunits seem to work just fine for us. Try the USB key (see the join link); this will allow you to run faster and leave your home system untouched.

gdf

barts
Message 17849 - Posted: 3 Jul 2010, 11:11:21 UTC - in response to Message 17751.  

It started running unstable while the PC was untouched (I was on holiday when this started to happen), so it can only be something in GPUGRID causing this. "Error while computing" as an error message does not give me any information, so maybe a GPUGRID member can investigate the real reason why the WUs fail. If it is in my system, I will know what to fix; if it is in GPUGRID, they can fix it.

I don't see the point of running another OS especially for GPUGRID. Many other projects (e.g. MilkyWay) like my GPU as well....

So please come up with some real reasons why these errors happen, not just "try another OS".