Error while computing
| Author | Message |
|---|---|
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
Almost all my GPUGRID WUs fail after 5 seconds with "Computation Error". BOINC 6.10.56, wxWidgets 2.8.10, Nvidia GTX 275, driver 8.17.11.9745. Some WUs still compute correctly. What can this be? I did not recently update |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
At least you are completing the odd task, but going by your tasks, your problem is at least 16 days old. Tasks seem to complete if they actually run for any length of time, but most fail after a few seconds, so you could be sitting idle for long periods (after too many failures)! Maybe a different driver will work. You could try the most recent one, or perhaps a much older 195.xx driver. A few weeks back I had the same problem with my GTX 260 on Win 7 x64 (same as you). In the end I gave up and put it into an XP system! It now works fine. The problem may be related to the RAM size reported on Win 7 systems versus the size expected by the app or BOINC. Yours is reported as "NVIDIA GeForce GTX 275 (877MB) driver: 19745"; I'm guessing it actually has 896MB. So I would suggest you try the 257.21 driver released in the last day or so. If that fails, try an older driver. Good luck, |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
Gosh... I must have been on the NVidia site a split second before this new driver came out... I updated the driver, but I am idle (reached the limit of 5 tasks per day), so I must wait a while to see if it makes a difference. I will keep an eye on it over the coming days. The best proof that I did not change a thing is that this problem started during my vacation. No automatic updates are carried out on my system, so I am pretty sure it is not because of changes in my system. We will see what happens with the new driver. |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
Updated the driver... BTW, I do not do any overclocking. These messages are from the last WUs; the problem is still the same:
18/06/2010 07:13:56 GPUGRID Starting h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0
18/06/2010 07:13:56 GPUGRID Starting task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 using acemd2 version 605
18/06/2010 07:14:14 GPUGRID Computation for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 finished
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_1 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_2 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_3 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
18/06/2010 07:14:14 GPUGRID Starting h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0
18/06/2010 07:14:14 GPUGRID Starting task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 using acemd2 version 605
18/06/2010 07:14:15 GPUGRID Started upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_0
18/06/2010 07:14:15 GPUGRID Started upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_4
18/06/2010 07:14:16 GPUGRID Finished upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_0
18/06/2010 07:14:16 GPUGRID Finished upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_4
18/06/2010 07:14:16 GPUGRID Started upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_7
18/06/2010 07:14:17 GPUGRID Finished upload of h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_7
18/06/2010 07:14:32 GPUGRID Computation for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 finished
18/06/2010 07:14:32 GPUGRID Output file h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_1 for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 absent
18/06/2010 07:14:32 GPUGRID Output file h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_2 for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 absent
18/06/2010 07:14:32 GPUGRID Output file h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_3 for task h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0 absent
18/06/2010 07:14:33 GPUGRID Started upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_0
18/06/2010 07:14:33 GPUGRID Started upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_4
18/06/2010 07:14:34 GPUGRID Finished upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_0
18/06/2010 07:14:34 GPUGRID Finished upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_4
18/06/2010 07:14:34 GPUGRID Started upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_7
18/06/2010 07:14:35 GPUGRID Finished upload of h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0_7
Any suggestions? (Meanwhile I will try an older driver.) |
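For readers who want to check their own message log for the same pattern (a task that "finishes" within seconds and then reports its output files as absent), here is a minimal Python sketch. It is not part of BOINC or GPUGRID; the function name `summarize` and the regular expressions are hypothetical and assume the exact line shape shown in the excerpt above (`DD/MM/YYYY HH:MM:SS GPUGRID <message>`). Logs that use another date format would need a different pattern. The reported interval is wall-clock time between the "Starting task" and "finished" messages, not GPU compute time.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpt in this thread:
#   "DD/MM/YYYY HH:MM:SS GPUGRID <message>"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) GPUGRID (.*)$")
START_RE = re.compile(r"^Starting task (\S+) using")
FINISH_RE = re.compile(r"^Computation for task (\S+) finished$")
ABSENT_RE = re.compile(r"^Output file \S+ for task (\S+) absent$")


def summarize(log_text):
    """Return {task: {"seconds": wall-clock run time or None, "absent": count}}."""
    started = {}
    summary = {}
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that do not follow the assumed shape
        stamp = datetime.strptime(m.group(1), "%d/%m/%Y %H:%M:%S")
        msg = m.group(2)
        if (s := START_RE.match(msg)):
            started[s.group(1)] = stamp
        elif (f := FINISH_RE.match(msg)):
            task = f.group(1)
            entry = summary.setdefault(task, {"seconds": None, "absent": 0})
            if task in started:
                entry["seconds"] = (stamp - started[task]).total_seconds()
        elif (a := ABSENT_RE.match(msg)):
            entry = summary.setdefault(a.group(1), {"seconds": None, "absent": 0})
            entry["absent"] += 1
    return summary


# Four lines copied from the excerpt above.
SAMPLE = """\
18/06/2010 07:13:56 GPUGRID Starting h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0
18/06/2010 07:13:56 GPUGRID Starting task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 using acemd2 version 605
18/06/2010 07:14:14 GPUGRID Computation for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 finished
18/06/2010 07:14:14 GPUGRID Output file h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0_1 for task h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0 absent
"""

if __name__ == "__main__":
    for task, info in summarize(SAMPLE).items():
        print(task, info)
```

A task with a very short run time and a non-zero `absent` count matches the failure described in this thread; the sketch only flags the pattern and says nothing about its cause.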
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
As expected, an older driver gives the same result as before, so it is likely not the driver but something else. Suggestions are still welcome. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Use XP or Linux, if you can. |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
Not possible... Why does GPUGRID not make a more stable application? |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
Hi Barts, can you look in Device Manager, right-click on your card, select Properties and then Details, and check whether the pull-down list contains an entry called "Install Error" anywhere? |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
I would encourage you to try Linux on a stick: http://www.gpugrid.net/forum_thread.php?id=2203 |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
Hi, there are no install errors in that driver section; however, I do see multiple entries {3ab22e31-8264-4b4e-9af5-a8d2d8e33e62} with [1]..[17] and [25] behind them. About Linux... uhm... I have nothing against Linux, although the support for Nvidia is 'difficult'. So again: why is GPUGRID not making the application more stable? Mine was running OK, and without any driver updates it started going bad. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I'm getting runaway errors for TONI_CAPBIND on my quad GT240 system. The other tasks work fine. Each TONI_CAPBIND fails after about 20 seconds. Vista Ultimate x64, driver 19621. I have now stopped picking up new tasks; communication is deferred for 7 hours. I cannot change the operating system, as it gets used too much. I did a system restart. One task (TONI_HERG) is due to complete in about 90 minutes, so I will see if the restart made any difference. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
When I manually reported the finished TONI_HERG work unit, Boinc picked up 2 new tasks :) Fortunately they are TONI_KID work units and both have made it to 1% (about 7min). |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
I would encourage GPUGRID to do a decent round of debugging on these tasks, which seem to be highly unstable, or to come out with a good, clear report on why these tasks fail. Then we can do something about it, and GPUGRID will not have all those failed tasks. |
robertmiles Joined: 16 Apr 09 Posts: 503 Credit: 769,991,668 RAC: 0
|
GPUGRID has an option to participate in testing of new software versions that have passed server-side testing but still need additional testing on a wider variety of computers. You may want to check whether you have unknowingly set your account to participate. Also, there is a BOINC CPU project you may want to avoid for now, since it does not seem fully compatible with GPUGRID: PrimeGrid. A few comments on why one of my workunits failed:
6/27/2010 7:15:15 AM GPUGRID Computation for task D273r4-TONI_HERGunb1-59-100-RND6573_0 finished
6/27/2010 7:15:15 AM GPUGRID Output file D273r4-TONI_HERGunb1-59-100-RND6573_0_1 for task D273r4-TONI_HERGunb1-59-100-RND6573_0 absent
6/27/2010 7:15:15 AM GPUGRID Output file D273r4-TONI_HERGunb1-59-100-RND6573_0_2 for task D273r4-TONI_HERGunb1-59-100-RND6573_0 absent
6/27/2010 7:15:15 AM GPUGRID Output file D273r4-TONI_HERGunb1-59-100-RND6573_0_3 for task D273r4-TONI_HERGunb1-59-100-RND6573_0 absent
The failure happened just after I had enabled getting workunits from the PrimeGrid BOINC project and received several workunits whose completion time was overestimated enough that two of them went into high-priority mode immediately. A third CPU core was already running a workunit from The Lattice Project in high-priority mode. I had set BOINC not to use the fourth CPU core. It looks like the workunit was simply unable to recover from having all the CPU cores BOINC could use in high-priority mode at once. |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
This has nothing to do with the number of CPU cores. I have AQUA running as well, taking all my cores, and a GPUGRID task can still run. "ACEMD2: GPU molecular dynamics" runs fine here. It is really a very unstable TONI_* or one of the other new WUs. GPUGRID had better look at this instability before they send out more of those WUs. |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
Hey barts ... I took a look at about 10 of your errored WUs, and what I noticed is that they are all different WU types and most of them have already been successfully completed by other computers (no multiple errors on different machines). Before claiming there is an unstable WU type, please double-check a little rather than just throwing the blame blanket on GPUGRID. Might I suggest a clean install of the driver? Uninstall, boot to safe mode (F8), run Driver Sweeper to clean up any old remnants, boot again to safe mode and install the driver you want to use. Now reboot one more time and see how it goes. Do you have your BOINC directories excluded from AV scanning? Both the data and program directories. Thanks - Steve |
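The check Steve describes (are the failures concentrated in one WU type or spread across many?) can be repeated on any list of failed task names. The Python sketch below is illustrative only. It assumes, based solely on the task names quoted in this thread, that the second dash-separated field (for example `TONI_CAPBINDsp2` in `h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0`) identifies the batch; `batch_label` and `count_by_batch` are hypothetical helper names, and GPUGRID's real naming scheme may differ.

```python
from collections import Counter


def batch_label(task_name):
    """Return the second dash-separated field of a task name.

    Assumption (from names quoted in this thread only): that field,
    e.g. "TONI_CAPBINDsp2", identifies the WU batch.
    """
    parts = task_name.split("-")
    return parts[1] if len(parts) > 1 else task_name


def count_by_batch(task_names):
    """Count failed tasks per assumed batch label."""
    return Counter(batch_label(name) for name in task_names)


if __name__ == "__main__":
    # Task names copied from the log excerpts posted in this thread.
    failed = [
        "h232f99r83-TONI_CAPBINDsp2-7-100-RND6332_0",
        "h232f99r116-TONI_CAPBINDsp2-4-100-RND6115_0",
        "D273r4-TONI_HERGunb1-59-100-RND6573_0",
    ]
    for label, count in count_by_batch(failed).most_common():
        print(label, count)
```

If one label dominates the counts, the WU type is a plausible suspect; an even spread across labels points more toward the host, as Steve suggests.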
|
Joined: 10 Nov 07 Posts: 10 Credit: 12,777,491 RAC: 0
|
I have observed that in the last few days some TONI_CAPBIND tasks failed on my machine: 2 were cancelled by the server (OK, that's not an error), 3 ended with exit code 98, and one just finished OK. There seems to be a problem with them ^.^ BOINC_64_6.10.56 on Vista64, "27.06.2010 19:00:33 NVIDIA GPU 0: GeForce GTX 260 (driver version 19107, CUDA version 2030, compute capability 1.3, 896MB, 582 GFLOPS peak)" |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Some of the work units must be different in some way that causes them to fail, usually after a few seconds. Some tasks just won’t run for me while others work fine. This is mostly the case on Vista and Win7, so it is operating system related, depends on your exact GPU, and in the recent past (last few months) definitely driver related too (I found some drivers work for some tasks, while other drivers fail all tasks). So it is just down to getting the correct driver for the tasks (if you can). Otherwise the only choice is to change operating system. XP and Linux seem to work the best. |
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
"This has nothing to do with the number of CPU cores. I have AQUA running as well taking all my cores and still a GPUGRID can run. ACEMD2: GPU molecular dynamics runs fine here.." Barts, these workunits seem to work just fine for us. Try the USB key (see the join link); this will allow you to run faster and leave your home system untouched. gdf |
|
Joined: 28 Aug 09 Posts: 12 Credit: 4,537,060 RAC: 0
|
If it starts running unstably while the PC is untouched (I was on holiday when this started to happen), then it can only be something in GPUGRID causing this. "Error while computing" as an error message does not give me any information, so maybe a GPUGRID member can investigate the real reason why the WUs fail. If it is in my system, I will know what I can fix; if it is in GPUGRID, they can fix it. I don't see the point of running another OS especially for GPUGRID. Many other projects (e.g. MilkyWay) like my GPU too. So please come up with some real reasons why these errors happen, not just "try another OS". |