Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
| Author | Message |
|---|---|
|
Joined: 5 Dec 12 Posts: 84 Credit: 1,663,883,415 RAC: 0
|
3090 FE. Driver Date Aug. 27th, 2021
----------------------------------
Name: e1s247_I282-ADRIA_AdB_KIXCMYB_HIP-1-2-RND4280_1
Workunit: 27079548
Created: 26 Sep 2021 | 7:33:50 UTC
Sent: 26 Sep 2021 | 7:33:56 UTC
Received: 26 Sep 2021 | 7:36:06 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 195 (0xc3) EXIT_CHILD_FAILED
Computer ID: 140554
Report deadline: 1 Oct 2021 | 7:33:56 UTC
Run time: 10.12
CPU time: 0.00
Validate state: Invalid
Credit: 0.00
Application version: New version of ACEMD v2.18 (cuda101)
Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message> (unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
00:34:25 (21456): wrapper (7.9.26016): starting
00:34:25 (21456): wrapper: running bin/acemd3.exe (--boinc --device 0)
ACEMD failed: Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
00:34:28 (21456): bin/acemd3.exe exited; CPU time 0.000000
00:34:28 (21456): app exit status: 0x1
00:34:28 (21456): called boinc_finish(195)
0 bytes in 0 Free Blocks.
268 bytes in 4 Normal Blocks.
1144 bytes in 1 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 0 bytes.
Total allocations: 190200 bytes.
Dumping objects ->
{323252} normal block at 0x0000018D079E9B30, 126 bytes long.
Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65
..\api\boinc_api.cpp(309) : {323249} normal block at 0x0000018D079A6B10, 8 bytes long.
Data: < • > 00 00 95 07 8D 01 00 00
{322607} normal block at 0x0000018D079E9A70, 126 bytes long.
Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65
{321996} normal block at 0x0000018D079A6E80, 8 bytes long.
Data: <ÀÄž > C0 C4 9E 07 8D 01 00 00
..\zip\boinc_zip.cpp(122) : {147} normal block at 0x0000018D079ADD40, 260 bytes long.
Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
{134} normal block at 0x0000018D079A7290, 16 bytes long.
Data: <p«š > 70 AB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{133} normal block at 0x0000018D079AAB70, 40 bytes long.
Data: < rš conda-pa> 90 72 9A 07 8D 01 00 00 63 6F 6E 64 61 2D 70 61
{126} normal block at 0x0000018D079AA9B0, 48 bytes long.
Data: <--boinc --device> 2D 2D 62 6F 69 6E 63 20 2D 2D 64 65 76 69 63 65
{125} normal block at 0x0000018D079A6930, 16 bytes long.
Data: <8ìš > 38 EC 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{124} normal block at 0x0000018D079A7330, 16 bytes long.
Data: < ìš > 10 EC 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{123} normal block at 0x0000018D079A76A0, 16 bytes long.
Data: <èëš > E8 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{122} normal block at 0x0000018D079A68E0, 16 bytes long.
Data: <Àëš > C0 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{121} normal block at 0x0000018D079A6F20, 16 bytes long.
Data: < ëš > 98 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{120} normal block at 0x0000018D079A7600, 16 bytes long.
Data: <pëš > 70 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{119} normal block at 0x0000018D079A6F70, 16 bytes long.
Data: <Pëš > 50 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{118} normal block at 0x0000018D079A6FC0, 16 bytes long.
Data: <(ëš > 28 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{117} normal block at 0x0000018D079A6840, 16 bytes long.
Data: < ëš > 00 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{116} normal block at 0x0000018D079AEB00, 496 bytes long.
Data: <@hš bin/acem> 40 68 9A 07 8D 01 00 00 62 69 6E 2F 61 63 65 6D
{66} normal block at 0x0000018D079A6890, 16 bytes long.
Data: < êfè÷ > 80 EA 66 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{65} normal block at 0x0000018D079A75B0, 16 bytes long.
Data: <@éfè÷ > 40 E9 66 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{64} normal block at 0x0000018D079A6D40, 16 bytes long.
Data: <øWcè÷ > F8 57 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{63} normal block at 0x0000018D079A7560, 16 bytes long.
Data: <ØWcè÷ > D8 57 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{62} normal block at 0x0000018D079A6B60, 16 bytes long.
Data: <P cè÷ > 50 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{61} normal block at 0x0000018D079A6A20, 16 bytes long.
Data: <0 cè÷ > 30 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{60} normal block at 0x0000018D079A71F0, 16 bytes long.
Data: <à cè÷ > E0 02 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{59} normal block at 0x0000018D079A7740, 16 bytes long.
Data: < cè÷ > 10 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{58} normal block at 0x0000018D079A6AC0, 16 bytes long.
Data: <p cè÷ > 70 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{57} normal block at 0x0000018D079A6C00, 16 bytes long.
Data: < Àaè÷ > 18 C0 61 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
Object dump complete.
</stderr_txt>
]]> |
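Most of a stderr dump like the one above is heap-dump noise; the two useful items are the `ACEMD failed:` reason and the code passed to `boinc_finish`. A minimal sketch of pulling those out, assuming the log text has been saved to a string. The function name `summarize_stderr` and the shortened `sample` log are illustrative only and are not part of BOINC or ACEMD.

```python
import re
from typing import Optional, Tuple


def summarize_stderr(stderr_txt: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (failure reason, boinc_finish code) from a wrapper stderr log.

    Either element is None when the corresponding line is absent.
    """
    # The reason may follow "ACEMD failed:" on the same line or on the next
    # one, so \s* is allowed to skip a line break before capturing the text.
    failure = re.search(r"ACEMD failed:\s*(.+)", stderr_txt)
    finish = re.search(r"called boinc_finish\((\d+)\)", stderr_txt)
    return (
        failure.group(1).strip() if failure else None,
        int(finish.group(1)) if finish else None,
    )


# Shortened copy of the log lines quoted in the post above.
sample = """\
00:34:25 (21456): wrapper: running bin/acemd3.exe (--boinc --device 0)
ACEMD failed: Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
00:34:28 (21456): app exit status: 0x1
00:34:28 (21456): called boinc_finish(195)
"""

reason, code = summarize_stderr(sample)
print(reason)
print(code)
```

Searching for the reason line first makes it easier to match a failed task against existing forum threads than searching for the generic exit code 195.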
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Known issue. The CUDA101 app will fail on Ampere cards. See this thread: https://www.gpugrid.net/forum_thread.php?id=5246 |
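For context, a minimal sketch of why a CUDA 10.1 build can hit this error on an Ampere card. It assumes the application asks the runtime compiler (NVRTC) to target the GPU's own compute capability, which is an assumption about ACEMD and not something stated in this thread. The table lists the newest compute capability that each toolkit's NVRTC can target; `MAX_ARCH_BY_TOOLKIT`, `arch_flag` and `nvrtc_accepts` are hypothetical names used only for illustration.

```python
# Newest compute capability each CUDA toolkit's runtime compiler (NVRTC)
# can target. Only the toolkits relevant to this thread are listed.
MAX_ARCH_BY_TOOLKIT = {
    (10, 1): (7, 5),  # Turing, e.g. GTX 1660
    (11, 0): (8, 0),  # Ampere GA100
    (11, 2): (8, 6),  # Ampere GA10x, e.g. RTX 3090
}


def arch_flag(compute_capability):
    """Build the NVRTC architecture option for a (major, minor) capability."""
    major, minor = compute_capability
    return f"--gpu-architecture=compute_{major}{minor}"


def nvrtc_accepts(toolkit, compute_capability):
    """True if the given toolkit's NVRTC knows the requested architecture."""
    return compute_capability <= MAX_ARCH_BY_TOOLKIT[toolkit]


rtx_3090 = (8, 6)
for toolkit in [(10, 1), (11, 2)]:
    print(toolkit, arch_flag(rtx_3090), nvrtc_accepts(toolkit, rtx_3090))
```

Under that assumption, the 10.1 compiler is handed an architecture value newer than anything it knows, which would match the "invalid value for --gpu-architecture" message.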
|
Joined: 5 Dec 12 Posts: 84 Credit: 1,663,883,415 RAC: 0
|
Maybe I'm missing some context, but the link shows that the issue had been fixed and does not mention my error code specifically. This host has run several successful tasks since then, but perhaps they were for another GPUGrid application. I'm surprised there are still known issues with Ampere cards. While getting mine was a struggle, the architecture has been out for over a year at this point. |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
The thread does in fact mention the exact error message from this thread's title in its latest posts. https://www.gpugrid.net/forum_thread.php?id=5246&nowrap=true#57363 ACEMD failed: The CUDA1121 application runs fine on Ampere cards. Tasks fail only when the scheduler sends one assigned to the CUDA101 application. The issue is that the driver level does not match the CUDA101 application. The simplest solution is to remove the CUDA101 app from the scheduler and force all hosts to use the CUDA1121 application, which requires drivers at CUDA 11.2 level or newer. |
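The selection rule suggested above can be sketched as follows. This is not GPUGRID's actual scheduler code; `choose_plan_class` and its constants are hypothetical, and the sketch implements a slightly softer variant of the proposal: cuda101 is kept for older drivers but is never sent to a card newer than Turing. `driver_cuda` stands for the CUDA version the installed driver supports, as a `(major, minor)` pair.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]

# Newest architecture a CUDA 10.1 build can compile for (Turing).
CUDA101_MAX_COMPUTE_CAPABILITY: Version = (7, 5)
CUDA101_MIN_DRIVER_CUDA: Version = (10, 1)
CUDA1121_MIN_DRIVER_CUDA: Version = (11, 2)


def choose_plan_class(compute_capability: Version,
                      driver_cuda: Version) -> Optional[str]:
    """Pick an application plan class for a host, or None if none fits."""
    # Prefer the newer build whenever the driver is recent enough.
    if driver_cuda >= CUDA1121_MIN_DRIVER_CUDA:
        return "cuda1121"
    # Fall back to the 10.1 build only for cards it can actually target.
    if (driver_cuda >= CUDA101_MIN_DRIVER_CUDA
            and compute_capability <= CUDA101_MAX_COMPUTE_CAPABILITY):
        return "cuda101"
    return None


# An RTX 3090 (8.6) on a driver supporting CUDA 11.4.
print(choose_plan_class((8, 6), (11, 4)))
# A Turing card (7.5) on an older driver supporting CUDA 10.2.
print(choose_plan_class((7, 5), (10, 2)))
```

With a rule like this, an Ampere host on an outdated driver gets no work at all, which is arguably clearer to the user than a task that is guaranteed to fail.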
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
We have now changed the scheduler, let's see if now it's better. gdf |
PDW Joined: 7 Mar 14 Posts: 18 Credit: 6,575,125,525 RAC: 2
|
"We have now changed the scheduler, let's see if now it's better."

Is this a result of the scheduler changes or something else? The result http://gpugrid.net/result.php?resultid=32646962 failed (see below) to launch CUDA which isn't surprising as the host doesn't show a GPU.
Host: http://gpugrid.net/show_host_detail.php?hostid=514156
New version of ACEMD v2.18 (cuda1121)
Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message> process got signal 67</message>
<stderr_txt>
14:40:06 (57305): wrapper (7.7.26016): starting
14:40:06 (57305): wrapper (7.7.26016): starting
14:40:06 (57305): wrapper: running /bin/tar (xf conda-pack.tar.bz2)
14:42:47 (57305): /bin/tar exited; CPU time 127.344146
14:42:47 (57305): wrapper: running bin/acemd3 (--boinc --device 0)
ACEMD failed: Error invoking kernel: CUDA_ERROR_LAUNCH_FAILED (719)
19:16:23 (57305): bin/acemd3 exited; CPU time 6047.267986
19:16:23 (57305): app exit status: 0x1
19:16:23 (57305): called boinc_finish(195)
</stderr_txt>
]]> |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 428
|
I'm seeing my Linux machines receive the cuda1121 plan class more consistently, but my Windows machines receive cuda101 - I don't think I've ever seen cuda1121 under Windows. Cards are from the same range (GTX 1660), and drivers are up-to-date - Linux 470.63, Windows 472.12 |
©2025 Universitat Pompeu Fabra