NOELIA_SH2eq Short Work unit(s) Instantly Failing
| Author | Message |
|---|---|
|
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
|
NOELIA_SH2eq work units (argprox1, argasnx1, asnmetx3, argcysx3, alaphex6, argalax3, argvalx1, asaaspx6, asnserx6, argasnx2, argargx7, alailex7) are all failing with Code (98) and the statement: ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified. Wingmen are generating the same error line. |
Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0
|
Hi: Well, I've already completed a few of these short tasks without problems on Linux (Ubuntu 14.04). |
|
Joined: 25 Sep 13 Posts: 293 Credit: 1,897,601,978 RAC: 0
|
Thanks for sharing your experience. I've found a couple of Linux wingmen who completed NOELIA_SH2eq work units, yet the failing work units show "process exited with code 199 (0xc7, -57)" and "FATAL : Cuda driver error 35 in file 'swanlibnv2.cpp' in line 446". Do these errors mean the same thing on Windows and Linux, or are they unrelated to one another? All failing Windows wingman hosts generate Code (98) and ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified. |
|
Joined: 13 Apr 13 Posts: 61 Credit: 726,605,417 RAC: 0
|
eXaPower, we had the same error on the NOELIA_TRP WUs a week ago. See the message here: http://www.gpugrid.net/forum_thread.php?id=3770&nowrap=true#36979
I have received only two NOELIA_TRP WUs since then, and both finished. There has been no update on whether a change was made.
http://www.gpugrid.net/result.php?resultid=11091492
http://www.gpugrid.net/result.php?resultid=11094060
I am only running long WUs now, so I have not received any short ones. I thought the above information might be relevant since it is the same error with the same researcher.
Regards, Jeremy |
[VENETO] sabayonino Joined: 4 Apr 10 Posts: 50 Credit: 650,142,596 RAC: 0
|
Hi guys, many WUs have failed (NATHAN): http://www.gpugrid.net/results.php?hostid=174773
All of these WUs were crunched by a GTX 780 Ti.
<core_client_version>7.2.41</core_client_version>
<![CDATA[
<message>
process exited with code 199 (0xc7, -57)
</message>
<stderr_txt>
# SWAN Device 0 :
# Name : GeForce GTX 780 Ti
# ECC : Disabled
# Global mem : 3071MB
# Capability : 3.5
# PCI ID : 0000:01:00.0
# Device clock : 1071MHz
# Memory clock : 3600MHz
# Memory width : 384bit
SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1963.
# SWAN swan_assert -57
</stderr_txt>
]]>
The other GPU cards run fine (GTX 780, GTX 760, GTX 660 Ti, GTX 750 Ti).
All cards are running the 340.24 Nvidia drivers (for Linux).
All systems have the SWAN_SYNC=0 environment variable set (Gentoo Linux).
No overclocking (CPU or GPU).
All system hardware (except the GPUs) is the same (i7-4770, ASUS Z87-A).
ONLY the 780 Ti has this many failed WUs.
Other GPU projects run fine on the GTX 780 Ti.
PS: with a GTX 780 Ti and a GTX 760 in the same PC, only the 780 Ti WUs fail. |
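The three numbers in "process exited with code 199 (0xc7, -57)" appear to be one value shown three ways: decimal, hexadecimal, and the same byte read as a signed 8-bit integer, which also matches the "swan_assert -57" line in the stderr output. A minimal sketch of that arithmetic follows; the function name `exit_code_views` is made up for illustration and is not part of BOINC or the application.

```python
def exit_code_views(code: int) -> tuple[str, int]:
    """Return the hex form and the signed 8-bit reading of an exit status."""
    low_byte = code & 0xFF  # exit statuses are reported as a single byte here
    # Two's-complement reading: values 128..255 map to -128..-1.
    signed = low_byte - 256 if low_byte >= 128 else low_byte
    return hex(low_byte), signed


print(exit_code_views(199))  # ('0xc7', -57)
print(exit_code_views(98))   # ('0x62', 98)
```

So code 199 and swan_assert -57 are the same failure value, while code 98 from the Windows hosts is a different value that accompanies the "No CHARMM parameter file specified" message.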
[VENETO] sabayonino Joined: 4 Apr 10 Posts: 50 Credit: 650,142,596 RAC: 0
|
Uhm... I think the problem is high temperature: the PCB of the GTX 780 Ti was very, very hot. I moved the GTX 780 Ti and GTX 760 over to the other mainboard (the one with 2x 660) and added 2 fans. At the moment all WUs are crunching fine. |
[VENETO] sabayonino Joined: 4 Apr 10 Posts: 50 Credit: 650,142,596 RAC: 0
|
Nothing has changed; the WUs still get errors: http://www.gpugrid.net/results.php?hostid=174768 (the GPU was installed in the other mainboard).
GPU
Sun Jul 27 20:07:58 2014
+------------------------------------------------------+
| NVIDIA-SMI 340.24     Driver Version: 340.24         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780 Ti  Off  | 0000:01:00.0     N/A |                  N/A |
| 55%   79C    P0    N/A /  N/A |    570MiB /  3071MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 760     Off  | 0000:04:00.0     N/A |                  N/A |
| 62%   81C    P0    N/A /  N/A |    530MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
|    1            Not Supported                                               |
+-----------------------------------------------------------------------------+
CPU
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +65.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +60.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +65.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +61.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +61.0°C (high = +80.0°C, crit = +100.0°C)
|
[VENETO] sabayonino Joined: 4 Apr 10 Posts: 50 Credit: 650,142,596 RAC: 0
|
:O Playing with app_config, something changes:
<name>acemdshort</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>1</gpu_usage>
<cpu_usage>0.49</cpu_usage>
</gpu_versions>
With this configuration the GPU temperature is ~54°C.
If I increase CPU_USAGE to >=0.5 or 1 (keeping GPU_USAGE=1), the temperature rises above 70°C.
If I decrease CPU_USAGE to <0.5, the temperature is very low but the WU runs very, very slowly.
Playing with GPU_USAGE changes nothing.
CPU_USAGE=0.49 ** GPU_USAGE=1
Sun Jul 27 21:50:51 2014
+------------------------------------------------------+
| NVIDIA-SMI 340.24     Driver Version: 340.24         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780 Ti  Off  | 0000:01:00.0     N/A |                  N/A |
| 40%   55C    P0    N/A /  N/A |    576MiB /  3071MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 760     Off  | 0000:04:00.0     N/A |                  N/A |
| 50%   59C    P0    N/A /  N/A |    526MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
|    1            Not Supported                                               |
+-----------------------------------------------------------------------------+
:O Any ideas? There are no problems with the GTX 780. |
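The app_config fragment posted above shows only the inner elements. In a complete BOINC app_config.xml these would normally sit inside `<app_config>` and `<app>` wrappers; the post does not show them, so treat the wrappers here as an assumption based on the BOINC client configuration documentation. The sketch below builds such a file from the posted values so that cpu_usage can be varied when repeating the temperature test. The helper name `build_app_config` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int,
                     gpu_usage: float, cpu_usage: float) -> str:
    """Build app_config.xml text; the outer wrappers are assumed, not from the post."""
    root = ET.Element("app_config")          # assumed outer wrapper
    app = ET.SubElement(root, "app")         # assumed per-application wrapper
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):                # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Values taken from the post: acemdshort, 2 concurrent tasks, 1 GPU, 0.49 CPU.
xml_text = build_app_config("acemdshort", 2, 1, 0.49)
print(xml_text)
```

Generating the file this way keeps the tags balanced while only the cpu_usage value changes between runs.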
©2025 Universitat Pompeu Fabra