GTS 250 65nm G92 Rev A2 - Successes and Failures

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 13836 - Posted: 8 Dec 2009, 18:08:31 UTC

Below is a list of tasks, from 1st Dec 09 to 7th Dec 09, that my GTS250 managed to complete successfully:

151-KASHIF_HIVPR_sub_so_ba2-73-100-RND7713_0
39-GIANNI_BIND_2-32-100-RND1522
38-IBUCH_2_reverse_TRYP_0911-11-40-RND1649
D160-TONI_HERGdof5-5-40-RND0496
467-GIANNI_BIND_166_119-32-100-RND4596
92-KASHIF_HIVPR_twomons_ba2-72-100-RND5413
p1515000-IBUCH_3_pYEEI_2011-13-20-RND1885
98-KASHIF_HIVPR_n1_for_1hhp_open_ba5-81-100-RND2003
70-GIANNI_BIND_166_119-43-100-RND0394
315-GIANNI_BIND_166_119-33-100-RND5680

Last week the same GTS 250 failed four tasks, all of them ...TONI_HERG... tasks.
TONI_HERG is bad for this GTS 250 card, so I will abort any that I get for it.

They failed after the following run times:
38565, 15028, 19544 and 3083 seconds.
That is a total of about 21h of lost crunching, or roughly 12.5% of last week's time.
The previous week it was twice that, 25% lost time, when I had more of these bad work units.
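
For anyone who wants to check the arithmetic on their own card, here is a minimal sketch of the lost-time calculation. It assumes the card would otherwise have been crunching for the full 7-day week; the function name `lost_time` is just an illustrative choice.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604800 s, assuming the GPU crunches 24/7


def lost_time(failed_run_times_s, period_s=SECONDS_PER_WEEK):
    """Return (seconds lost, hours lost, percent of the period lost)."""
    total_s = sum(failed_run_times_s)
    return total_s, total_s / 3600, 100 * total_s / period_s


# Run times of the four failed tasks quoted in this post.
total_s, hours, percent = lost_time([38565, 15028, 19544, 3083])
print(f"{total_s} s lost = {hours:.1f} h = {percent:.1f}% of the week")
# prints: 76220 s lost = 21.2 h = 12.6% of the week
```

That agrees with the rounded 21h and 12.5% figures quoted above.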

So things are looking up again for the old 65nm G92 Rev A2 card.

Using BOINC 6.10.18, driver 19539, CUDA 3000.

Message 13979 - Posted: 18 Dec 2009, 8:45:49 UTC - in response to Message 13836.  
Last modified: 18 Dec 2009, 8:46:21 UTC

Between the 10th Dec and 17th Dec my GTS250 had 5 errors and 9 successes.
19h 15min were lost (69206s), or 11.5% of the time. That is slightly better than the previous week (12.5%) and still much better than the week before (25%).

Since the 13th there has been only one failure, although it did fail after 9h30min!
I suspect that failure was a result of the task running while I was using the system, so I have made sure GPUGrid does not run when I am using it (which is not too often)!

All of the error messages contain the following line:
MDIO ERROR: cannot open file "restart.coor"

List of tasks undertaken (all ran the application Full-atom molecular dynamics v6.71 (cuda23)):

Task ID  Work unit  Sent (UTC)            Reported (UTC)        Status                   Run time (s)  CPU time (s)  Claimed credit  Granted credit
1633773  1024720    15 Dec 2009 23:57:30  16 Dec 2009 13:41:29  Completed and validated  47,063.52     4,500.40      3,977.21        5,369.23
1632027  1023349    15 Dec 2009 6:17:34   15 Dec 2009 23:57:30  Error while computing    34,324.59     1,311.34      4,428.01        ---
1629991  1022109    14 Dec 2009 15:42:01  15 Dec 2009 11:17:44  Completed and validated  52,633.66     3,102.44      4,503.74        6,080.05
1627586  1020487    14 Dec 2009 0:18:19   14 Dec 2009 20:42:11  Completed and validated  55,474.79     3,033.25      4,531.91        6,118.08
1625604  1007426    13 Dec 2009 11:55:40  14 Dec 2009 6:21:57   Completed and validated  52,461.03     2,915.99      4,503.74        6,080.05
1624544  1018750    12 Dec 2009 11:21:41  13 Dec 2009 11:53:58  Completed and validated  55,578.08     3,299.06      4,531.91        6,118.08
1624517  1018739    12 Dec 2009 10:56:14  12 Dec 2009 11:14:49  Error while computing    1,015.11      54.30         4,022.81        ---
1624470  1018708    12 Dec 2009 10:28:30  12 Dec 2009 10:34:45  Error while computing    265.39        18.05         4,428.01        ---
1622530  1013606    11 Dec 2009 20:43:48  12 Dec 2009 10:28:30  Error while computing    32,402.90     1,903.20      4,531.91        ---
1620740  1016195    11 Dec 2009 7:20:21   12 Dec 2009 5:01:01   Completed and validated  50,106.54     2,207.99      4,022.81        5,430.80
1620010  1015882    12 Dec 2009 10:34:45  12 Dec 2009 10:56:14  Error while computing    1,199.04      117.81        3,977.21        ---
1617985  1014436    10 Dec 2009 10:43:53  11 Dec 2009 16:01:44  Completed and validated  45,358.59     5,275.05      3,539.96        4,778.94
1616001  1013057    9 Dec 2009 23:10:00   10 Dec 2009 18:54:39  Completed and validated  56,684.09     3,287.19      4,531.91        6,118.08
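
If you would rather not add up the error rows by hand, a short script can tally them. This is only a sketch: it assumes rows pasted from the task list keep the order shown above (task ID, work unit, two timestamps, status, run time, CPU time, claimed credit, granted credit), and the names `ROW_RE` and `parse_row` are my own. The pattern keys on the status text and the four numeric columns after it, so the trailing application name can be present or absent.

```python
import re

# Rows copied from the task list in this post (application column left off).
ROWS = """\
1633773 1024720 15 Dec 2009 23:57:30 UTC 16 Dec 2009 13:41:29 UTC Completed and validated 47,063.52 4,500.40 3,977.21 5,369.23
1632027 1023349 15 Dec 2009 6:17:34 UTC 15 Dec 2009 23:57:30 UTC Error while computing 34,324.59 1,311.34 4,428.01 ---
1629991 1022109 14 Dec 2009 15:42:01 UTC 15 Dec 2009 11:17:44 UTC Completed and validated 52,633.66 3,102.44 4,503.74 6,080.05
1627586 1020487 14 Dec 2009 0:18:19 UTC 14 Dec 2009 20:42:11 UTC Completed and validated 55,474.79 3,033.25 4,531.91 6,118.08
1625604 1007426 13 Dec 2009 11:55:40 UTC 14 Dec 2009 6:21:57 UTC Completed and validated 52,461.03 2,915.99 4,503.74 6,080.05
1624544 1018750 12 Dec 2009 11:21:41 UTC 13 Dec 2009 11:53:58 UTC Completed and validated 55,578.08 3,299.06 4,531.91 6,118.08
1624517 1018739 12 Dec 2009 10:56:14 UTC 12 Dec 2009 11:14:49 UTC Error while computing 1,015.11 54.30 4,022.81 ---
1624470 1018708 12 Dec 2009 10:28:30 UTC 12 Dec 2009 10:34:45 UTC Error while computing 265.39 18.05 4,428.01 ---
1622530 1013606 11 Dec 2009 20:43:48 UTC 12 Dec 2009 10:28:30 UTC Error while computing 32,402.90 1,903.20 4,531.91 ---
1620740 1016195 11 Dec 2009 7:20:21 UTC 12 Dec 2009 5:01:01 UTC Completed and validated 50,106.54 2,207.99 4,022.81 5,430.80
1620010 1015882 12 Dec 2009 10:34:45 UTC 12 Dec 2009 10:56:14 UTC Error while computing 1,199.04 117.81 3,977.21 ---
1617985 1014436 10 Dec 2009 10:43:53 UTC 11 Dec 2009 16:01:44 UTC Completed and validated 45,358.59 5,275.05 3,539.96 4,778.94
1616001 1013057 9 Dec 2009 23:10:00 UTC 10 Dec 2009 18:54:39 UTC Completed and validated 56,684.09 3,287.19 4,531.91 6,118.08
""".splitlines()

# Two leading IDs, then anything up to the status, then four numeric columns.
ROW_RE = re.compile(
    r"^(?P<task>\d+)\s+(?P<wu>\d+)\s+.*?"
    r"(?P<status>Completed and validated|Error while computing)\s+"
    r"(?P<run>[\d,]+\.\d+)\s+(?P<cpu>[\d,]+\.\d+)\s+"
    r"(?P<claimed>[\d,]+\.\d+)\s+(?P<granted>---|[\d,]+\.\d+)"
)


def parse_row(line):
    """Turn one pasted row into a dict; raise if the layout is unexpected."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    return {
        "task": int(m["task"]),
        "workunit": int(m["wu"]),
        "status": m["status"],
        "run_s": float(m["run"].replace(",", "")),
    }


tasks = [parse_row(r) for r in ROWS]
errors = [t for t in tasks if t["status"] == "Error while computing"]
lost_s = sum(t["run_s"] for t in errors)
print(f"{len(errors)} errors, {lost_s:.2f} s lost ({lost_s / 3600:.1f} h)")
```

The run times of the error rows add up to just over 69,200 seconds, which is where the lost-time figure quoted earlier in this post comes from.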

Failure 1:

________________________________________

Name p270000-IBUCH_2_pYEEI_2011-5-20-RND2486_0
Workunit 1015882

Created 11 Dec 2009 1:24:50 UTC
Sent 12 Dec 2009 10:34:45 UTC
Received 12 Dec 2009 10:56:14 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279

Report deadline 17 Dec 2009 10:34:45 UTC
Run time 1199.038244
CPU time 117.812
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_overflow] failed in file 'fillcharges.cu' in line 97 : unknown error.

</stderr_txt>
]]>
Validate state Invalid
Claimed credit 3977.21064814815
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)

Failure 2:

________________________________________

Name 471-GIANNI_BIND_166_119-30-100-RND4009_1
Workunit 1013606

Created 11 Dec 2009 20:08:53 UTC
Sent 11 Dec 2009 20:43:48 UTC
Received 12 Dec 2009 10:28:30 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279

Report deadline 16 Dec 2009 20:43:48 UTC
Run time 32402.901728
CPU time 1903.197
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.

</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4531.90972222222
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)

Failure 3:

________________________________________

Name 88-KASHIF_HIVPR_n1_for_1hhp_open_ba4-78-100-RND1283_0
Workunit 1018708

Created 12 Dec 2009 9:52:07 UTC
Sent 12 Dec 2009 10:28:30 UTC
Received 12 Dec 2009 10:34:45 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279

Report deadline 17 Dec 2009 10:28:30 UTC
Run time 265.390701
CPU time 18.04932
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.

</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4428.01157407407
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)

Failure 4:

________________________________________

Name 34-KASHIF_HIVPR_sub_so_ba1-72-100-RND1262_0
Workunit 1018739

Created 12 Dec 2009 10:17:10 UTC
Sent 12 Dec 2009 10:56:14 UTC
Received 12 Dec 2009 11:14:49 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279

Report deadline 17 Dec 2009 10:56:14 UTC
Run time 1015.111446
CPU time 54.30395
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.

</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4022.81481481481
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)

Failure 5:

________________________________________

Name 89-KASHIF_HIVPR_n1_for_1hhp_open_ba4-78-100-RND7252_1
Workunit 1023349

Created 15 Dec 2009 5:43:05 UTC
Sent 15 Dec 2009 6:17:34 UTC
Received 15 Dec 2009 23:57:30 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279

Report deadline 20 Dec 2009 6:17:34 UTC
Run time 34324.593376
CPU time 1311.344
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.

</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4428.01157407407
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)
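
To compare failures quickly, the two signature lines can be pulled out of each stderr dump with a regular expression. A minimal sketch follows. It assumes the "Cuda error: Kernel [...] failed in file '...' in line N" wording seen in the five outputs above, and `summarize_stderr` is a hypothetical helper name, not part of BOINC or the GPUGrid application.

```python
import re

# Matches the kernel-failure line as it appears in the stderr dumps above.
KERNEL_RE = re.compile(
    r"Cuda error: Kernel \[(?P<kernel>\w+)\] failed in file "
    r"'(?P<file>[^']+)' in line (?P<line>\d+)"
)
MDIO_MARKER = 'MDIO ERROR: cannot open file "restart.coor"'


def summarize_stderr(stderr_txt):
    """Report whether the MDIO line is present and which kernel failed."""
    m = KERNEL_RE.search(stderr_txt)
    return {
        "mdio_restart_error": MDIO_MARKER in stderr_txt,
        "kernel": m["kernel"] if m else None,
        "file": m["file"] if m else None,
        "line": int(m["line"]) if m else None,
    }


# Shortened copy of the stderr text from Failure 1.
sample = """# Using CUDA device 0
# Device 0: "GeForce GTS 250"
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_overflow] failed in file 'fillcharges.cu' in line 97 : unknown error.
"""
info = summarize_stderr(sample)
print(info)
```

Run over the five failures in this post, it would show four of them stopping in PmeRealSpace_compute_forces (PmeRealSpace.cu, line 172) and one in pme_fill_charges_overflow (fillcharges.cu, line 97).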

Message 14065 - Posted: 27 Dec 2009, 19:42:19 UTC - in response to Message 13979.  
Last modified: 27 Dec 2009, 19:43:25 UTC

From the 18th Dec 2009 to the 24th my GTS250 successfully completed 7 tasks in a row and averaged over 7000 points per day, with tasks completing in between 46000 and 60000 seconds.
On the 24th there was a failure after 2 seconds, 143-IBUCH_reverse1fix_pYEEI_2312-0-40-RND2977, from a known bad batch of tasks, and then a TONI_HERG task failed after 14,135 seconds. Surprisingly, that task succeeded on a GeForce 9600 GT despite failing on 2 additional systems.

There were no failures from the 24th to the 27th.
So, since the 18th Dec (9 days ago) my GTS250 has only lost 14138 seconds.
(777600 - 14138) / 777600 sec = 98% successful GPU processing time!

A huge improvement.


Techs and scientists, thank you.