Short runs: trypsin_lig_1161_3-NOELIA_RL3_run-0-1- crashed
| Author | Message |
|---|---|
(retired account) Joined: 22 Dec 11 Posts: 38 Credit: 28,606,255 RAC: 0
|
Hello, just had a trypsin_lig_1161_3-NOELIA_RL3_run-0-1-RND4573_0 crashing immediately, here's the stderr output:
After that I got a 6x9-SANTI_MARwtdim-16-25- which is running fine so far. |
|
Joined: 15 Oct 11 Posts: 17 Credit: 81,085,378 RAC: 0
|
Same issue here.... x2 workunits
trypsin_lig_1316_4-NOELIA_RL3_run-0-1-RND5763_0
trypsin_lig_1097_4-NOELIA_RL3_run-0-1-RND4667_0

Stderr output

    <core_client_version>7.0.64</core_client_version>
    <![CDATA[
    <message>
    (unknown error) - exit code -98 (0xffffff9e)
    </message>
    <stderr_txt>
    # GPU [GeForce GTS 450] Platform [Windows] Rev [3203] VERSION [55]
    # SWAN Device 0 :
    # Name : GeForce GTS 450
    # ECC : Disabled
    # Global mem : 1024MB
    # Capability : 2.1
    # PCI ID : 0000:01:00.0
    # Device clock : 1760MHz
    # Memory clock : 1840MHz
    # Memory width : 128bit
    # Driver version : r331_54 : 33158
    ERROR: file pme.cpp line 85: PME NX too small
    19:11:34 (4304): called boinc_finish
    </stderr_txt>

|
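A side note on the exit code in the log above: `-98` and `0xffffff9e` are the same number, the second being the 32-bit two's-complement view of the first. A minimal Python sketch of the conversion follows; the helper names are made up for illustration and are not part of BOINC or the GPUGRID application.

```python
MASK_32 = 0xFFFFFFFF  # keep only the low 32 bits


def to_unsigned32_hex(code: int) -> str:
    """Show a (possibly negative) exit code the way the log prints it."""
    return f"0x{code & MASK_32:08x}"


def to_signed32(hex_text: str) -> int:
    """Turn the hex form back into a signed 32-bit integer."""
    value = int(hex_text, 16) & MASK_32
    # If the sign bit is set, the value represents a negative number.
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    print(to_unsigned32_hex(-98))     # 0xffffff9e
    print(to_signed32("0xffffff9e"))  # -98
```

So the hex value in brackets carries no extra information; it is just the same exit code printed as an unsigned 32-bit integer.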
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
The aforementioned batch of WUs contains a bug. All 13 WUs I received failed (after 2 or 3 seconds) on all my systems (Linux and Windows). Example:

Stderr output

    <core_client_version>7.2.23</core_client_version>
    <![CDATA[
    <message>
    (unknown error) - exit code -98 (0xffffff9e)
    </message>
    <stderr_txt>
    # GPU [GeForce GTX 660] Platform [Windows] Rev [3203] VERSION [55]
    # SWAN Device 2 :
    # Name : GeForce GTX 660
    # ECC : Disabled
    # Global mem : 2048MB
    # Capability : 3.0
    # PCI ID : 0000:02:00.0
    # Device clock : 1032MHz
    # Memory clock : 3004MHz
    # Memory width : 192bit
    # Driver version : r331_00 : 33140
    ERROR: file pme.cpp line 85: PME NX too small
    09:38:11 (6728): called boinc_finish
    </stderr_txt>
    ]]>

The same WU also failed on other systems,

| Task | Computer | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Credit | Application |
|---|---|---|---|---|---|---|---|---|
| 7415514 | 143807 | 31 Oct 2013 0:50:38 UTC | 31 Oct 2013 0:56:44 UTC | Error while computing | 2.43 | 0.14 | --- | Short runs (2-3 hours on fastest card) v8.14 (cuda42) |
| 7415906 | 152255 | 31 Oct 2013 2:16:17 UTC | 31 Oct 2013 2:22:13 UTC | Error while computing | 2.02 | 0.08 | --- | Short runs (2-3 hours on fastest card) v8.14 (cuda42) |
| 7416291 | 131405 | 31 Oct 2013 4:21:57 UTC | 31 Oct 2013 4:26:02 UTC | Error while computing | 1.62 | 0.11 | --- | Short runs (2-3 hours on fastest card) v8.14 (cuda42) |
| 7416888 | 127801 | 31 Oct 2013 6:33:24 UTC | 31 Oct 2013 6:42:57 UTC | Error while computing | 2.09 | 0.42 | --- | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |
| 7417449 | 139265 | 31 Oct 2013 8:00:12 UTC | 31 Oct 2013 9:43:26 UTC | Error while computing | 2.22 | 0.20 | --- | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |
| 7418231 | 139502 | 31 Oct 2013 11:44:17 UTC | 31 Oct 2013 11:50:05 UTC | Error while computing | 2.03 | 0.11 | --- | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |
| 7418699 | --- | --- | --- | Unsent | --- | --- | --- | |

http://www.gpugrid.net/result.php?resultid=7417225
http://www.gpugrid.net/result.php?resultid=7416175...

All: ERROR: file pme.cpp line 85: PME NX too small

Too many errors (may have bug)

FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
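When checking many failed results for the same cause, it can help to pull the `ERROR:` line out of each stderr text automatically. The following is a minimal sketch that assumes the error line always has the shape seen in this thread (`ERROR: file <name> line <n>: <message>`); the function name and the sample text are illustrative only, and other failure modes of the application may be formatted differently.

```python
import re

# Pattern based only on the error line quoted in this thread (an assumption).
ERROR_RE = re.compile(r"^ERROR: file (\S+) line (\d+): (.+)$")


def find_errors(stderr_text: str) -> list[tuple[str, int, str]]:
    """Return (source file, line number, message) for each ERROR line."""
    errors = []
    for raw_line in stderr_text.splitlines():
        match = ERROR_RE.match(raw_line.strip())
        if match:
            errors.append((match.group(1), int(match.group(2)), match.group(3)))
    return errors


SAMPLE = """# GPU [GeForce GTX 660] Platform [Windows] Rev [3203] VERSION [55]
# Driver version : r331_00 : 33140
ERROR: file pme.cpp line 85: PME NX too small
09:38:11 (6728): called boinc_finish"""

if __name__ == "__main__":
    for source_file, line_no, message in find_errors(SAMPLE):
        print(f"{source_file}:{line_no} -> {message}")
```

If every failed task of a batch reports the same file, line and message, as here, that points at the workunits rather than at an individual card or driver.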
dskagcommunity Joined: 28 Apr 11 Posts: 463 Credit: 958,266,958 RAC: 31
|
Oh, and I thought it could be my old card which doesn't want to run this batch O.o 30 errors :/ I didn't look into the stderr, but now I see they all fail on multiple machines. DSKAG Austria Research Team: http://www.research.dskag.at
|
|
Joined: 1 Sep 09 Posts: 2 Credit: 214,365,451 RAC: 0
|
Same error message for me as well!! |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
The latest effort seems to work well.

Linux system (GTX670 FOC and GTX770):

| Task name | Work unit | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Credit | Application |
|---|---|---|---|---|---|---|---|---|
| trypsin_lig_1298_4x1-NOELIA_RL3run-0-1-RND2976_0 | 4888061 | 1 Nov 2013 4:27:41 UTC | 1 Nov 2013 6:07:34 UTC | Completed and validated | 1,716.63 | 1,658.44 | 1,500.00 | Short runs (2-3 hours on fastest card) v8.00 (cuda42) |
| trypsin_lig_1089_2x1-NOELIA_RL3run-0-1-RND9336_0 | 4887950 | 31 Oct 2013 21:55:42 UTC | 1 Nov 2013 0:47:14 UTC | Completed and validated | 1,485.76 | 1,429.36 | 1,500.00 | Short runs (2-3 hours on fastest card) v8.00 (cuda55) |

Windows 7 (GTX770):

| Task name | Work unit | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Credit | Application |
|---|---|---|---|---|---|---|---|---|
| trypsin_lig_1706_1x1-NOELIA_RL3run-0-1-RND8824_0 | 4888310 | 1 Nov 2013 5:15:45 UTC | 1 Nov 2013 7:32:23 UTC | Completed and validated | 2,028.98 | 2,007.25 | 1,500.00 | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |

They seem to be much faster on Linux (about 35%, rather than the typical 11%). I expect they are small simulations. |
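The Linux versus Windows comparison above can be reproduced from the run times quoted in the post. This is only a rough sketch: the three tasks are different workunits, and the post does not say which Linux task ran on the GTX670 and which on the GTX770, so the percentages are indicative at best. The function name is made up for illustration.

```python
def percent_faster(slow_seconds: float, fast_seconds: float) -> float:
    """Throughput gain of the faster run over the slower one, in percent."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


# Run times in seconds, copied from the results quoted in the post.
WINDOWS_RUN = 2028.98
LINUX_RUNS = [1716.63, 1485.76]

if __name__ == "__main__":
    for linux_run in LINUX_RUNS:
        gain = percent_faster(WINDOWS_RUN, linux_run)
        print(f"Linux run of {linux_run:.2f} s: {gain:.1f}% faster than Windows")
```

Depending on which Linux task is paired with the Windows one, this gives roughly 18% or 37%, so the figure of about 35% presumably refers to the faster of the two Linux results.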
X1900AIW Joined: 12 Sep 08 Posts: 74 Credit: 23,566,124 RAC: 0
|
|
|
Joined: 23 May 09 Posts: 121 Credit: 400,300,664 RAC: 14
|
I added my new GT630 a few days ago. Many, many bad results, always after just a few seconds.

Host: http://www.gpugrid.net/results.php?userid=25200

Error -52, SWAN : FATAL Unable to load module .mshake_kernel.cu. (702)

Error -98, ERROR: file pme.cpp line 85: PME NX too small

One fault caused the system to crash. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Werkstatt, other crunchers, mods and even scientists get 'No access' on 'your' link; you need to link to the individual system instead (it's a BOINC server/site template security thing):

http://www.gpugrid.net/results.php?hostid=161299&offset=0&show_names=1&state=0&appid=

Fortunately you haven't hidden your systems.

You had the unfortunate experience of encountering a bad batch of WUs. These failed on everyone's cards. trypsin_lig_1298_1-NOELIA_RL3_run-0-1-RND8370_6 had already failed 6 times before being sent to you. As they failed in about 2 or 3 seconds, numerous tasks would have failed for everyone before anyone noticed.

You seem to be having success now,

| Task name | Work unit | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Credit | Application |
|---|---|---|---|---|---|---|---|---|
| trypsin_lig_9_4x1-NOELIA_RC3run-0-1-RND4317_0 | 4889572 | 2 Nov 2013 0:13:18 UTC | 2 Nov 2013 6:28:49 UTC | Completed and validated | 22,149.52 | 22,125.89 | 1,500.00 | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |
| trypsin_lig_1355_3x1-NOELIA_RC3run-0-1-RND4109_0 | 4889060 | 1 Nov 2013 17:10:25 UTC | 2 Nov 2013 0:19:32 UTC | Completed and validated | 22,070.07 | 22,049.27 | 1,500.00 | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |
| trypsin_lig_458_2x1-NOELIA_RL3run-0-1-RND5559_0 | 4888617 | 1 Nov 2013 15:50:44 UTC | 1 Nov 2013 17:11:01 UTC | Completed and validated | 4,769.82 | 4,769.82 | 1,500.00 | Short runs (2-3 hours on fastest card) v8.14 (cuda55) |

but that credit! Either it's a new batch with more complicated molecules (8.827 ms vs 1.9 ms per step) and poor credit, or your GPU downclocked, or your system was busy doing other things.

Noelia, if it's a new batch, perhaps you could up the credits for the next batch? |
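To put the "but that credit!" remark into numbers, the credit rate of each task can be worked out from the run time and the fixed 1,500 credits quoted above. A small sketch, using only the figures from the post; the function name is made up for illustration, and the rates say nothing about why the long tasks were slow (bigger simulation, downclocked GPU or a busy system).

```python
def credits_per_hour(credit: float, run_seconds: float) -> float:
    """Credit earned per hour of run time."""
    return credit / run_seconds * 3600.0


# (credit, run time in seconds) for the three tasks quoted in the post.
TASKS = {
    "trypsin_lig_9_4x1-NOELIA_RC3run-0-1-RND4317_0": (1500.00, 22149.52),
    "trypsin_lig_1355_3x1-NOELIA_RC3run-0-1-RND4109_0": (1500.00, 22070.07),
    "trypsin_lig_458_2x1-NOELIA_RL3run-0-1-RND5559_0": (1500.00, 4769.82),
}

if __name__ == "__main__":
    for name, (credit, run_seconds) in TASKS.items():
        rate = credits_per_hour(credit, run_seconds)
        print(f"{name}: {rate:.1f} credits/hour")
```

The two RC3 tasks come out at roughly 244 credits per hour against roughly 1,132 for the RL3 task on the same host, which is the gap being pointed out.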
|
Joined: 23 May 09 Posts: 121 Credit: 400,300,664 RAC: 14
|
Hi skgiven, thanks for the reply. WUs failing right after they start are not really a problem for me, as I have an internet flat rate. I just wanted to keep the admins informed that there is a problem somewhere; this was maybe prompted by the discussion at Einstein about a fault in CUDA 5 that they ran into, and they still use cuda32. I want to test my new card with different projects. It's a 'Kepler' card, driven by the GK208 chip; it adds < 19W to the power budget and it's a passively cooled, single-slot, slim-size card. Cheers, Alexander |
dskagcommunity Joined: 28 Apr 11 Posts: 463 Credit: 958,266,958 RAC: 31
|
Wow, I haven't seen credits this low before, that's harsh O.o DSKAG Austria Research Team: http://www.research.dskag.at
|
|
Joined: 23 May 09 Posts: 121 Credit: 400,300,664 RAC: 14
|
last one was much better ... :)) |
©2025 Universitat Pompeu Fabra