Author |
Message |
|
I got 2 tasks with errors today:
http://www.gpugrid.net/result.php?resultid=11080227
95-NOELIA_TRP188-0-1-RND5224_3
Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 760
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:01:00.0
# Device clock : 1071MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
10:42:52 (4752): called boinc_finish
</stderr_txt>
]]>
______________________________________________________________________________
http://www.gpugrid.net/result.php?resultid=11080200
26-NOELIA_TRP215-1-4-RND9753_0
Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 760
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:01:00.0
# Device clock : 1071MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
10:44:56 (6264): called boinc_finish
</stderr_txt>
]]> |
|
|
|
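For anyone comparing many failed results, the failure line can be pulled out of the stderr text automatically. The following is only a minimal sketch: it assumes the error line always has the shape "ERROR: file <name> line <number>: <message>" as seen in the logs above, and the names ERROR_RE and find_errors are made up for illustration, not part of BOINC or the GPUGRID application.

```python
import re

# Assumed line shape, taken from the stderr output quoted in this thread:
#   ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
ERROR_RE = re.compile(
    r"^ERROR: file (?P<file>\S+) line (?P<line>\d+): (?P<msg>.*)$",
    re.MULTILINE,
)


def find_errors(stderr_text):
    """Return (source file, line number, message) for each ERROR line found."""
    return [
        (m["file"], int(m["line"]), m["msg"].strip())
        for m in ERROR_RE.finditer(stderr_text)
    ]


sample = """# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
10:42:52 (4752): called boinc_finish
"""

for source_file, line_no, message in find_errors(sample):
    print(f"{source_file}:{line_no}: {message}")
```

Grouping failed tasks by the extracted (file, line, message) tuple would show quickly whether they all stop at the same point, as they appear to here.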
Likewise, the new NOELIA_TRP tasks fail instantly on my GTX 750 Ti (Win XP x86) with the same error; I have received three so far. Good to know it's not just my machine or a Maxwell issue.
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
01:51:38 (3128): called boinc_finish |
|
|
|
All of the NOELIA_TRP WUs I have received so far (four) have also failed within a few seconds. All give the same error:
Exit status -98 (0xffffffffffffff9e) Unknown error number
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
02:07:23 (11844): called boinc_finish
Machine (149863) - GTX680
WinXP Driver 335.28 v8.41 (cuda60)
http://www.gpugrid.net/result.php?resultid=11078372
http://www.gpugrid.net/result.php?resultid=11077061
Machine (158717) - GTX680
WinXP Driver 335.28 v8.41 (cuda60)
http://www.gpugrid.net/result.php?resultid=11078799
Machine (165832) - GTX780Ti
Win7 Driver 331.82 v8.41 (cuda42)
http://www.gpugrid.net/result.php?resultid=11070971 |
|
|
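The logs in this thread show the same exit code written two ways, 0xffffff9e and 0xffffffffffffff9e. Both appear to be the two's-complement form of -98, printed at 32 and 64 bits respectively. A short sketch to check the arithmetic (the helper names are made up for illustration):

```python
def as_unsigned_hex(code, bits):
    """Two's-complement hex representation of a signed exit code at a given width."""
    return hex(code & ((1 << bits) - 1))


def as_signed(value, bits):
    """Interpret an unsigned value of the given width as a signed integer."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


print(as_unsigned_hex(-98, 32))           # 0xffffff9e
print(as_unsigned_hex(-98, 64))           # 0xffffffffffffff9e
print(as_signed(0xFFFFFF9E, 32))          # -98
print(as_signed(0xFFFFFFFFFFFFFF9E, 64))  # -98
```

So the different hex strings do not indicate different errors; only the width used to print the code differs.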
TJ
|
I guess it's an error in the program. I had one too, and so did all my wing(wo)men.
It fails after just a few seconds, so no worries. We'll have to wait until Noelia makes a change.
____________
Greetings from TJ |
|
|
Matt
|
Same here - instant fail. It looks like this WU finally went through after the sixth send-out, however.
http://www.gpugrid.net/workunit.php?wuid=8215418 |
|
|
Matt
|
...and another
http://www.gpugrid.net/workunit.php?wuid=8215374
|
|
|
|
This problem seems to have been repeated with today's batch of NOELIA_SH2 tasks on the short queue.
Many errors on host 45218 |
|
|
eXaPower
|
NOELIA_SH2 work units have been failing instantly throughout today (Exit status -98 (0xffffffffffffff9e) Unknown error number; ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified).
The majority of wingman hosts have also failed NOELIA_SH2, prompting the "too many errors, may be bug" message. All of the units currently requested by my host show many wingman failures beforehand.
The failure is instant no matter what the first part of the NOELIA_SH2 work unit name is, for example argphex1, asvalx7, et al. |
|
|
|
Hello all,
I'm new to GPUGRID.net (obviously) but I've been using my NVidia card on other projects. From what I've been reading here, the NOELIA WUs have been failing at a fairly substantial rate for some crunchers. I have been having the same difficulties with the NOELIA units; they all fail instantly with exit code -98, ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified. I'm currently using an NVidia GTX650 Ti Boost with the latest driver (337) from NVidia, trying to crunch the short workunits for now. I'm currently 0 for 5 on GPU tasks. I thought I would post here since this appears to be the correct thread for this issue. Hopefully the project administrators can correct the issue so I can get back to crunching!
Thank you all for your time. Happy crunching! |
|
|
|
I've noticed that with these WUs where I have a failure, all the wingmen that fail are running Windows, as I am. When I see a WU with 5 failures and a success, the success is always on a Linux system. The failures are on Windows systems. Has anybody else seen this pattern? |
|
|
|
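One way to check a pattern like this is to tally outcomes per platform for a work unit. The sketch below is purely illustrative: the records are hypothetical and only mirror the "5 failures and a success" case described above, they are not taken from any real work unit page, and tally_by_platform is a made-up helper.

```python
from collections import Counter


def tally_by_platform(results):
    """Count (platform, outcome) pairs from an iterable of such tuples."""
    counts = Counter()
    for platform, outcome in results:
        counts[(platform, outcome)] += 1
    return counts


# Hypothetical records mirroring the scenario described in the post:
# five failed attempts on Windows hosts and one success on a Linux host.
results = [("Windows", "error")] * 5 + [("Linux", "success")]

for (platform, outcome), count in sorted(tally_by_platform(results).items()):
    print(f"{platform:8s} {outcome:8s} {count}")
```

A single work unit proves little on its own; the tally only becomes meaningful once it is fed results from many work units.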
After a long run without problems, I encountered these failures yesterday. :(
Task 11116386, work unit 8236608, computer 158482: sent 6 Jun 2014 0:36:38 UTC, reported 6 Jun 2014 0:43:02 UTC, Error while computing, run time 3.03 s, CPU time 0.14 s, credit ---, Short runs (2-3 hours on fastest card) v8.41 (cuda60)
Task 11114622, work unit 8236718, computer 158482: sent 6 Jun 2014 0:36:38 UTC, reported 6 Jun 2014 0:43:02 UTC, Error while computing, run time 4.40 s, CPU time 0.16 s, credit ---, Short runs (2-3 hours on fastest card) v8.41 (cuda60)
Task 11110452, work unit 8236955, computer 158482: sent 6 Jun 2014 0:39:13 UTC, reported 6 Jun 2014 0:43:02 UTC, Error while computing, run time 0.12 s, CPU time 0.12 s, credit ---, Short runs (2-3 hours on fastest card) v8.41 (cuda60)
Task 11110140, work unit 8236969, computer 158482: sent 6 Jun 2014 0:39:13 UTC, reported 6 Jun 2014 0:43:02 UTC, Error while computing, run time 3.34 s, CPU time 0.14 s, credit ---, Short runs (2-3 hours on fastest card) v8.41 (cuda60) |
|
|
nate
|
Hey guys,
We're looking into it. Unfortunately Noelia is away right now so we may have to cancel them for now. Thanks for catching it and pointing it out.
Nate |
|
|
|
I've noticed that with these WUs where I have a failure, all the wingmen that fail are running Windows, as I am. When I see a WU with 5 failures and a success, the success is always on a Linux system. The failures are on Windows systems. Has anybody else seen this pattern?
I just completed 3 Noelia short tasks on an Ubuntu 14.04 rig and they have validated. On two of the tasks, there were several previous failures on Windows boxes.
|
|
|
|
Hello!
Have you stopped sending out the short WUs?
I had the same problem last night: 19 short NOELIA WUs errored out after a few seconds.
Thank You
Kind Regards
Phil1966
|
|
|
ritterm
|
Work unit argargx8-NOELIA_SH2eq-0-1-RND4849 failed on a mix of Win7 and Linux hosts.
|
|
|
nate
|
Hello all.
The NOELIA_SH2eq workunits were all failing, so those have been cancelled.
There are some other groups that follow the naming pattern NOELIA_TRPXXX. Those continue to run because, from what I can tell, the ones that are being sent out now do not have a problem. The first post in this thread (http://www.gpugrid.net/forum_thread.php?id=3770&nowrap=true#36977) refers to a group that was cancelled, fixed and resent earlier in the week. However, if you continue to have problems with those, please let me know here.
Nate |
|
|
|
Hello: Six (6) units (NOELIA_SH2eq-0-1-RND2630_3), downloaded a few minutes ago (13:00 local time), all failed on Windows 8.1.
On Linux Ubuntu 14.04 the same type of short work units run without problems. |
|
|
|
Unfortunately, I'll have to wait for my daily quota to allow another fetch before I can test. |
|
|
|
Hello: Short tasks (NOELIA_SH2eq-0-1-RND2630_3) still fail on Windows.
|
|
|
Dingo
|
I am getting the same error today when I started the first task on my GTX 750 Ti:
lystrpx8-NOELIA_SH2eq-0-1-RND2046_1
Workunit 8238384
Created 7 Jun 2014 | 14:50:16 UTC
Sent 7 Jun 2014 | 16:19:10 UTC
Received 7 Jun 2014 | 16:26:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -98 (0xffffffffffffff9e) Unknown error number
Computer ID 170120
Report deadline 12 Jun 2014 | 16:19:10 UTC
Run time 2.00
CPU time 0.14
Validate state Invalid
Credit 0.00
Application version Short runs (2-3 hours on fastest card) v8.41 (cuda60)
Stderr output
<core_client_version>7.3.19</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 750 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 750 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 5.0
# PCI ID : 0000:01:00.0
# Device clock : 1137MHz
# Memory clock : 2700MHz
# Memory width : 128bit
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
12:27:22 (4164): called boinc_finish
</stderr_txt>
]]>
Looks like my buddy on this had a different error:
http://www.gpugrid.net/workunit.php?wuid=8238384 |
|
|
|
" ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
12:27:22 (4164): called boinc_finish "
Hello: This is the same error that I have been getting for a few days on Windows 8.1, but as I said, these same tasks work perfectly for me on Linux. |
|
|