Errors piling up, bad batch of NOELIA?

Message boards : Number crunching : Errors piling up, bad batch of NOELIA?

Khali

Joined: 13 Jan 14
Posts: 9
Credit: 390,232,404
RAC: 0
Level
Asp
Message 37808 - Posted: 4 Sep 2014, 16:35:27 UTC

I have had several NOELIA tasks (nine and counting) fail after running for only about 2 seconds.


<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 680] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 1 :
# Name : GeForce GTX 680
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1058MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r340_00 : 34052
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
12:16:00 (3092): called boinc_finish

</stderr_txt>
]]>
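
For anyone puzzled by the two forms of the exit code in the message above: -98 and 0xffffff9e are the same 32-bit value, read once as a signed and once as an unsigned number. A minimal Python sketch of that conversion follows. The helper name `to_signed32` is made up for this illustration and is not part of BOINC or the GPUGrid application.

```python
def to_signed32(value: int) -> int:
    """Interpret an integer as a signed 32-bit two's-complement value."""
    value &= 0xFFFFFFFF                # keep only the low 32 bits
    if value & 0x80000000:             # sign bit set: the value is negative
        return value - 0x100000000
    return value


# The hexadecimal form from the error message maps back to the decimal one.
print(to_signed32(0xFFFFFF9E))
```

This only explains the notation; what exit code -98 means to the application is not stated anywhere in this thread.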
ID: 37808 · Rating: 0
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 37809 - Posted: 4 Sep 2014, 17:07:01 UTC - in response to Message 37808.  
Last modified: 4 Sep 2014, 17:08:52 UTC

Have you just installed a WDDM update from Windows Update?

I did, and I have had multiple errors since then. I can't uninstall it despite trying to use restore, and the update is not listed in Uninstall Programs.

To add: I finally got one running.
ID: 37809 · Rating: 0
MrJo

Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 0
Level
Met
Message 37811 - Posted: 4 Sep 2014, 17:18:40 UTC

Same here: suddenly more than 30 errors. The only change to my system was updating the driver to the latest version, but I have the same issue on machines where nothing has changed for weeks.



<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 750 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0	:
#	Name		: GeForce GTX 750 Ti
#	ECC		: Disabled
#	Global mem	: 2048MB
#	Capability	: 5.0
#	PCI ID		: 0000:01:00.0
#	Device clock	: 1241MHz
#	Memory clock	: 2700MHz
#	Memory width	: 128bit
#	Driver version	: r340_00 : 34052
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified 
18:44:55 (2516): called boinc_finish

</stderr_txt>
]]>
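
The line that matters in the stderr dump above is the one starting with ERROR:. As a rough sketch, assuming the line always has the shape seen in this thread (`ERROR: file <name> line <number>: <message>`), it could be pulled out of a saved stderr text as follows. The function name `find_errors` and the regular expression are illustrative assumptions, not something the project provides.

```python
import re

# Assumed shape, taken only from the dumps in this thread:
#   ERROR: file <source file> line <number>: <message>
ERROR_RE = re.compile(
    r"^ERROR: file (?P<file>\S+) line (?P<line>\d+): (?P<msg>.*\S)[ \t]*$",
    re.MULTILINE,
)


def find_errors(stderr_text):
    """Return (source file, line number, message) for each ERROR line."""
    return [
        (m.group("file"), int(m.group("line")), m.group("msg"))
        for m in ERROR_RE.finditer(stderr_text)
    ]


sample = (
    "#\tDriver version\t: r340_00 : 34052\n"
    "ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified \n"
    "18:44:55 (2516): called boinc_finish\n"
)
print(find_errors(sample))
```

Comparing the extracted tuples across hosts is a quick way to confirm that everyone is hitting the same failure rather than several different ones.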

Regards, Josef

ID: 37811 · Rating: 0
Khali

Joined: 13 Jan 14
Posts: 9
Credit: 390,232,404
RAC: 0
Level
Asp
Message 37813 - Posted: 4 Sep 2014, 17:25:46 UTC - in response to Message 37809.  

Have you just done an update from Windows Update WDDM?

I did and have got multiple errors since then and can't uninstall despite trying to use restore. Update not mentioned in Uninstall Programs


To Add

Finally got one running.


No updates of any kind on my system. My GPU Grid account shows one task in progress, although there is nothing in my GPU Grid queue. I have done a project reset and rebooted the system, but I am still getting the same problem. Instead of wasting time and bandwidth downloading tasks that are going to fail, I have set the project to no new tasks until someone comes up with an answer.
ID: 37813 · Rating: 0
MrJo

Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 0
Level
Met
Message 37815 - Posted: 4 Sep 2014, 17:35:00 UTC

Current workaround: I switched my preferences to ACEMD short runs (2-3 hours on fastest card), and everything works fine again. It seems the problem is caused by the long runs.
Regards, Josef

ID: 37815 · Rating: 0
neilp62

Joined: 23 Nov 10
Posts: 14
Credit: 8,017,535,732
RAC: 0
Level
Tyr
Message 37817 - Posted: 4 Sep 2014, 17:47:40 UTC - in response to Message 37813.  

Same issue here: 16 errors and increasing. No changes to the platform, which is more than 50% through processing three NOELIA_tpam2 tasks without issue. NOELIA_TRP tasks fail within 3 seconds with the following error:
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
10:34:25 (8728): called boinc_finish

Switching to another project just to conserve internet bandwidth.
ID: 37817 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 351
Level
Trp
Message 37819 - Posted: 4 Sep 2014, 17:58:36 UTC
Last modified: 4 Sep 2014, 18:02:46 UTC

Just had two bad ones in quick succession:

260-NOELIA_TRP188-0-1-RND4118
364-NOELIA_TRP188-0-1-RND5307

Erroring on every computer they're sent to, so I'm pretty sure it's the task make-up, nothing local to our machines.

Make that three:

337-NOELIA_TRP188-0-1-RND9634
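
One way to check the "bad batch" idea is to group failed task names by batch. The sketch below assumes, based only on the three names listed above, that the batch is the second dash-separated field of the task name. `batch_of` is a hypothetical helper, not a BOINC or GPUGrid function.

```python
from collections import Counter


def batch_of(task_name):
    """Return the assumed batch field, e.g. 'NOELIA_TRP188'."""
    parts = task_name.split("-")
    # Fall back to the whole name if it does not follow the assumed pattern.
    return parts[1] if len(parts) > 1 else task_name


# The three failed tasks named in this post.
failed = [
    "260-NOELIA_TRP188-0-1-RND4118",
    "364-NOELIA_TRP188-0-1-RND5307",
    "337-NOELIA_TRP188-0-1-RND9634",
]

counts = Counter(batch_of(name) for name in failed)
print(counts.most_common())
```

If every failure lands in one batch while other batches keep completing, that supports the view that the problem lies in the task make-up rather than in any one machine.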
ID: 37819 · Rating: 0
klepel

Joined: 23 Dec 09
Posts: 189
Credit: 4,798,881,008
RAC: 0
Level
Arg
Message 37820 - Posted: 4 Sep 2014, 18:03:52 UTC
Last modified: 4 Sep 2014, 18:04:07 UTC

Me too!
ID: 37820 · Rating: 0
TJ

Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 37825 - Posted: 4 Sep 2014, 20:01:14 UTC

Indeed, the third type of Noelia's WUs is error-prone. I now have 47 of them, and my wing(wo)man has them too. This seems to be the error: file mdioload.cpp line 162: No CHARMM parameter file specified.

These are the xxx-NOELIA_TRPxxx WUs; the other two types run like a train.

We have to wait for Noelia to repair this.


Greetings from TJ
ID: 37825 · Rating: 0
Retvari Zoltan

Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Level
Trp
Message 37826 - Posted: 4 Sep 2014, 20:19:45 UTC
Last modified: 4 Sep 2014, 20:19:57 UTC

I'm having the same issue with NOELIA_TRP188 workunits.
All of them fail with the following error after 2 seconds:
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
ID: 37826 · Rating: 0
Bedrich Hajek

Joined: 28 Mar 09
Posts: 490
Credit: 11,731,645,728
RAC: 57
Level
Trp
Message 37828 - Posted: 4 Sep 2014, 21:33:51 UTC - in response to Message 37820.  

Please cancel these units already; they are all erroring out.

ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
14:24:19 (2920): called boinc_finish


ID: 37828 · Rating: 0
labrat42

Joined: 13 May 10
Posts: 7
Credit: 452,806,864
RAC: 0
Level
Gln
Message 37830 - Posted: 5 Sep 2014, 3:35:25 UTC

Have these errors been resolved?
Bill42
ID: 37830 · Rating: 0
neilp62

Joined: 23 Nov 10
Posts: 14
Credit: 8,017,535,732
RAC: 0
Level
Tyr
Message 37831 - Posted: 5 Sep 2014, 3:46:04 UTC - in response to Message 37830.  

Unclear if resolved. Within the last 30 minutes, I am no longer receiving NOELIA_TRP long run tasks. All three of my platforms have received NOELIA_tpam tasks, so I have switched back to the long runs for now.
ID: 37831 · Rating: 0
MrJo

Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 0
Level
Met
Message 37832 - Posted: 5 Sep 2014, 6:55:07 UTC - in response to Message 37831.  

...I am no longer receiving NOELIA_TRP long run tasks. All three of my platforms have received NOELIA_tpam tasks, so I have switched back to the long runs for now.

Same for me. Right now everything seems to be fine.

Regards, Josef

ID: 37832 · Rating: 0
biodoc

Joined: 26 Aug 08
Posts: 183
Credit: 10,085,929,375
RAC: 0
Level
Trp
Message 37834 - Posted: 5 Sep 2014, 10:39:19 UTC
Last modified: 5 Sep 2014, 10:49:13 UTC

Feel free to send the NOELIA_TRP WUs my way. They run fine on my 780Ti under Linux.

http://www.gpugrid.net/workunit.php?wuid=10045657

I've finished 2 of these WUs successfully.

PNY 780Ti reference card; no overclock

Nvidia driver version: 337.25
ID: 37834 · Rating: 0
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 37837 - Posted: 5 Sep 2014, 12:34:46 UTC - in response to Message 37826.  
Last modified: 5 Sep 2014, 12:37:03 UTC

GPUGrid Admins:

Retvari said it best:
I'm having the same issue with NOELIA_TRP188 workunits.
All of these run to the following error after 2 seconds:
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified


That is exactly what's happening for me. The NOELIA_TRP188 work units error out just like that, for Windows machines.

http://www.gpugrid.net/workunit.php?wuid=10045781
http://www.gpugrid.net/workunit.php?wuid=10045192
http://www.gpugrid.net/workunit.php?wuid=10045206

Then, BOINC immediately reports the error (per your setting), gets a network backoff for GPUGrid, then starts asking my backup projects for work. So, I get to work on backup projects until you can fix this.

Can you kindly please determine the problem, and cancel the work units that have problems, and then relay to us what happened?

Thanks,
Jacob
ID: 37837 · Rating: 0
RaymondFO*

Joined: 22 Nov 12
Posts: 72
Credit: 14,040,706,346
RAC: 0
Level
Trp
Message 37838 - Posted: 5 Sep 2014, 12:49:41 UTC - in response to Message 37832.  
Last modified: 5 Sep 2014, 13:03:57 UTC

...I am no longer receiving NOELIA_TRP long run tasks. All three of my platforms have received NOELIA_tpam tasks, so I have switched back to the long runs for now.

Me the same. Right now all seems to be fine.


I am still getting them, just not as frequently.


Feel free to send the NOELIA_TRP WUs my way. They run fine on my 780Ti under linux.

http://www.gpugrid.net/workunit.php?wuid=10045657

I've finished 2 of these WUs successfully.

PNY 780Ti reference card; no overclock

Nvidia driver version: 337.25


While you may think this to be true, my Linux computers have had their share of these errors, just not as frequently.


FWIW, going back to Einstein@home until this is resolved. Enough is enough.
ID: 37838 · Rating: 0
biodoc

Joined: 26 Aug 08
Posts: 183
Credit: 10,085,929,375
RAC: 0
Level
Trp
Message 37839 - Posted: 5 Sep 2014, 15:06:15 UTC - in response to Message 37838.  

While you may think this to be true, however my linux computers have had their share of these errors, just not as frequently.


Good point. I've only had 2 WUs so that's certainly not enough for me to conclude that these WUs are ok on my machine.
ID: 37839 · Rating: 0
MrJo

Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 0
Level
Met
Message 37847 - Posted: 5 Sep 2014, 18:42:45 UTC - in response to Message 37838.  

I am still getting them, just not as frequently.

You're right. I just checked one of my PCs. Two attempts failed before the third worked.

18:03 fail
18:16 fail
18:20 Ok


Regards, Josef

ID: 37847 · Rating: 0
klepel

Joined: 23 Dec 09
Posts: 189
Credit: 4,798,881,008
RAC: 0
Level
Arg
Message 37853 - Posted: 6 Sep 2014, 17:35:15 UTC

Any update? Can I change back to long runs? My computers run unsupervised over the weekend, so I would rather not pile up errors.
ID: 37853 · Rating: 0

©2025 Universitat Pompeu Fabra