Python apps for GPU hosts errors

Erich56

Send message
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwat
Message 59888 - Posted: 7 Feb 2023, 11:14:53 UTC - in response to Message 59884.  
Last modified: 7 Feb 2023, 11:16:55 UTC

There must be some way to run the command in a Windows terminal with elevated rights.

The application is a user level application that Nvidia provides in all distributions.

https://www.minitool.com/news/elevated-command-prompt.html

Thanks, Keith, for providing the link above.

As explained in this posting: https://www.gpugrid.net/forum_thread.php?id=5233&nowrap=true#59832

I tried running the tool with admin rights, but it still did not work.
Maybe something is wrong with my installation, or nvidia-smi is defective, or something else ...

BTW: right now, the 4 Pythons running concurrently on the Quadro P5000 are using exactly 12,000 MB (12 GB) of VRAM.
So VRAM usage really does seem to vary quite a lot.
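For anyone wanting to cross-check these readings, nvidia-smi can report per-GPU memory use in a script-friendly form. A minimal sketch (the query flags are standard nvidia-smi options; the parsing helper and sample output are my own, not from the project):

```python
import subprocess

def parse_vram_used(smi_output: str) -> list[int]:
    """Parse the output of
    'nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits'
    into a list of per-GPU VRAM figures in MiB (one line per GPU)."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]

def current_vram_used() -> list[int]:
    """Run nvidia-smi and return VRAM used per GPU in MiB.
    Requires a working NVIDIA driver, elevated or not."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vram_used(out)

# Example with captured output (hypothetical single-GPU host):
sample = "12000\n"
print(parse_vram_used(sample))  # [12000]
```

Logging this once a minute would show whether the low readings are real or a monitoring glitch.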
ID: 59888
Ian&Steve C.

Send message
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Level
Trp
Scientific publications
wat
Message 59889 - Posted: 7 Feb 2023, 13:52:02 UTC - in response to Message 59888.  

But only you seem to be reporting exceptionally low VRAM use at times, which points to something specific to your system: either incorrect readings or something not working the way you think. No one else reports this level of variance. Mine have been pretty consistent (some use ~3 GB and some use ~4 GB, nothing else), and that seems to align with what others are reporting as well.
ID: 59889
KAMasud

Send message
Joined: 27 Jul 11
Posts: 138
Credit: 539,953,398
RAC: 0
Level
Lys
Scientific publications
watwat
Message 59890 - Posted: 7 Feb 2023, 16:50:58 UTC

It seems 'abou' has tweaked these WUs. They are even using less RAM and VRAM, but you will have to ask 'abou'. Otherwise, it is all conjecture and physical monitoring.
ID: 59890
Pop Piasa
Avatar

Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 59892 - Posted: 8 Feb 2023, 1:56:45 UTC - in response to Message 59890.  

It seems 'abou' has tweaked these WUs. They are even using less RAM and VRAM, but you will have to ask 'abou'. Otherwise, it is all conjecture and physical monitoring.


Very true.
As this is an ongoing project in a new and developing science I think we volunteer research assistants can help some by "tuning" our hosts to best take advantage of what changes we see happening in the tasks. Abouh has been good about communicating and informing us of developments over on the news thread. That information has been enhanced by us communicating among ourselves what we've seen happening on our hosts, Linux or Windows. "Stay tuned" seems appropriate here.

(IMHO)
This is not a project which you can just set your host to compute and then ignore (F@H). This is a fun challenge and learning experience, at least for me.


ID: 59892
Erich56

Send message
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwat
Message 59893 - Posted: 9 Feb 2023, 16:40:57 UTC - in response to Message 59892.  

...(IMHO)
This is not a project which you can just set your host to compute and then ignore (F@H). This is a fun challenge and learning experience, at least for me.

yes, how right you are :-)
not at all "set and forget"
ID: 59893
Pop Piasa
Avatar

Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 59894 - Posted: 9 Feb 2023, 22:44:06 UTC

Speaking of GPU tuning, does anybody know what the difference is between NVIDIA's 'gaming' and 'studio' drivers and if one is better suited for this sort of duty than the other?
ID: 59894
KAMasud

Send message
Joined: 27 Jul 11
Posts: 138
Credit: 539,953,398
RAC: 0
Level
Lys
Scientific publications
watwat
Message 59895 - Posted: 10 Feb 2023, 1:39:13 UTC - in response to Message 59892.  

It seems 'abou' has tweaked these WUs. They are even using less RAM and VRAM, but you will have to ask 'abou'. Otherwise, it is all conjecture and physical monitoring.


Very true.
As this is an ongoing project in a new and developing science I think we volunteer research assistants can help some by "tuning" our hosts to best take advantage of what changes we see happening in the tasks. Abouh has been good about communicating and informing us of developments over on the news thread. That information has been enhanced by us communicating among ourselves what we've seen happening on our hosts, Linux or Windows. "Stay tuned" seems appropriate here.

(IMHO)
This is not a project which you can just set your host to compute and then ignore (F@H). This is a fun challenge and learning experience, at least for me.



________________

Correct, and Abou has been very good at interacting with us and solving problems.
There are, however, some WUs that are chewing up my 16 GB of RAM, not VRAM. VRAM usage seems quite reasonable.

ID: 59895
Ian&Steve C.

Send message
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Level
Trp
Scientific publications
wat
Message 59896 - Posted: 10 Feb 2023, 12:25:34 UTC - in response to Message 59895.  

The tasks use about 10 GB of system RAM each; you should account for this.
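That per-task figure makes the sizing arithmetic simple. A rough sketch (the ~10 GB per-task number is from this thread; the 2 GB OS reserve is my own assumption, not a project recommendation):

```python
def max_concurrent_tasks(total_ram_gb: float,
                         per_task_gb: float = 10.0,
                         os_reserve_gb: float = 2.0) -> int:
    """How many Python tasks fit in system RAM, leaving headroom for the OS.
    Integer floor division: partial tasks don't count."""
    usable = total_ram_gb - os_reserve_gb
    return max(0, int(usable // per_task_gb))

print(max_concurrent_tasks(16))  # 1 -> a 16 GB host fits only one task
print(max_concurrent_tasks(64))  # 6
```

Which matches what KAMasud is seeing: a 16 GB host has room for exactly one of these WUs plus the system.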
ID: 59896
Pop Piasa
Avatar

Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 59897 - Posted: 10 Feb 2023, 15:49:23 UTC - in response to Message 59895.  

There are however some WUs that are chewing up my 16 GB of RAM


I have noticed that early in the run my Windows hosts will show a brief peak of RAM usage around 15.6 GB or so. I think it might be during the unpacking and expansion phase, but don't take my guess as fact.

I had to give the BOINC manager access to 95% of the available RAM to get through this part and go on to the ~60 GB commit-charge phase where the factories are running. My swap file is user-set at 55 GB for now. During the factory phase it drops to the same ~10 GB Ian reported for Linux.

Incidentally, my observations are from watching the Afterburner hardware monitor.
ID: 59897
KAMasud

Send message
Joined: 27 Jul 11
Posts: 138
Credit: 539,953,398
RAC: 0
Level
Lys
Scientific publications
watwat
Message 59898 - Posted: 10 Feb 2023, 16:28:15 UTC

10 GB for the WU, plus a few GB for the system; Pop's 15+ GB figure is correct. I shut down everything else to get through this phase, but not all WUs are doing this.
ID: 59898
Pop Piasa
Avatar

Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 59911 - Posted: 12 Feb 2023, 23:08:26 UTC - in response to Message 59897.  

I have noticed that early in the run my Windows hosts will show a brief peak of RAM usage around 15.6 GB or so.


The last 48 hours of crunching Pythons have only used ~12 GB RAM max. The spike in Windows memory usage hasn't appeared on any of my hosts. Good work, abouh.
ID: 59911
Profile Carlesa25
Avatar

Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 60080 - Posted: 14 Mar 2023, 18:05:12 UTC
Last modified: 14 Mar 2023, 18:05:36 UTC

Hello: My tasks are all failing after 3 or 4 minutes of execution, on both Windows 10 Pro and Ubuntu 22.04, with an AMD 3500 CPU, a GTX 780 Ti GPU, and 16 GB RAM. Any information? Thank you.
http://stats.free-dc.org/cpidtagb.php?cpid=b4bdc04dfe39b1028b9c5d6fef3082b8&theme=9&cols=1
ID: 60080
Keith Myers
Avatar

Send message
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Level
Tyr
Scientific publications
watwatwatwatwat
Message 60081 - Posted: 14 Mar 2023, 18:11:31 UTC - in response to Message 60080.  
Last modified: 14 Mar 2023, 18:13:56 UTC

If you look at your failed tasks result outputs, the explanation is self-evident.

[W NNPACK.cpp:79] Could not initialize NNPACK! Reason: Unsupported hardware.
/var/lib/boinc-client/slots/0/lib/python3.7/site-packages/torch/cuda/__init__.py:120: UserWarning:
Found GPU%d %s which is of cuda capability %d.%d.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability supported by this library is %d.%d.


Not a high enough CUDA compute capability, and not even a high enough driver version.

The application name tells you the minimum CUDA level.

Python apps for GPU hosts v4.03 (cuda1131)

Best to use these GPUs on other projects with lower requirements.
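The rejection PyTorch prints is just a tuple comparison against the build's minimum compute capability. A sketch of that logic (the (3, 7) floor is an assumption based on typical PyTorch CUDA 11 builds, not something stated in the task output; on a live system `torch.cuda.get_device_capability()` returns the real tuple):

```python
MIN_CAPABILITY = (3, 7)  # assumed minimum for this PyTorch build

def gpu_supported(capability: tuple[int, int],
                  minimum: tuple[int, int] = MIN_CAPABILITY) -> bool:
    """Python's tuple ordering mirrors the check PyTorch performs:
    compare major version first, then minor."""
    return capability >= minimum

print(gpu_supported((3, 5)))  # False -> GTX 780 Ti (Kepler, sm_35) is rejected
print(gpu_supported((6, 1)))  # True  -> GTX 1080 Ti (Pascal, sm_61) passes
```

So the GTX 780 Ti fails on capability alone, before the driver version even matters.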
ID: 60081
Profile Carlesa25
Avatar

Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 60082 - Posted: 14 Mar 2023, 18:19:20 UTC - in response to Message 60081.  

If you look at your failed tasks result outputs, the explanation is self-evident.

[W NNPACK.cpp:79] Could not initialize NNPACK! Reason: Unsupported hardware.
/var/lib/boinc-client/slots/0/lib/python3.7/site-packages/torch/cuda/__init__.py:120: UserWarning:
Found GPU%d %s which is of cuda capability %d.%d.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability supported by this library is %d.%d.


Not a high enough CUDA compute capability, and not even a high enough driver version.

The application name tells you the minimum CUDA level.

Python apps for GPU hosts v4.03 (cuda1131)

Best to use these GPUs on other projects with lower requirements.


Hello: Thank you for your prompt response. Too bad, because I had contributed a lot to this project before; we'll see later whether I can change the GPU.
ID: 60082
Profile Carlesa25
Avatar

Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 60088 - Posted: 15 Mar 2023, 10:55:14 UTC - in response to Message 60081.  

If you look at your failed tasks result outputs, the explanation is self-evident.

[W NNPACK.cpp:79] Could not initialize NNPACK! Reason: Unsupported hardware.
/var/lib/boinc-client/slots/0/lib/python3.7/site-packages/torch/cuda/__init__.py:120: UserWarning:
Found GPU%d %s which is of cuda capability %d.%d.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability supported by this library is %d.%d.


Not a high enough CUDA compute capability, and not even a high enough driver version.

The application name tells you the minimum CUDA level.

Python apps for GPU hosts v4.03 (cuda1131)

Best to use these GPUs on other projects with lower requirements.



Hello: What would be the minimum NVIDIA card needed to run this project? Thanks.



ID: 60088
Ian&Steve C.

Send message
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Level
Trp
Scientific publications
wat
Message 60089 - Posted: 15 Mar 2023, 11:25:36 UTC - in response to Message 60088.  

I think Maxwell-based cards: GTX 900 series and newer.
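A rough map from GeForce generation to CUDA compute capability, against the assumed floor from the error message above (these are general NVIDIA figures, not project documentation; the (3, 7) minimum is my assumption for this PyTorch build):

```python
# CUDA compute capability by GeForce generation (representative values)
CAPABILITY = {
    "GTX 700 (Kepler)":  (3, 5),
    "GTX 900 (Maxwell)": (5, 2),
    "GTX 10 (Pascal)":   (6, 1),
    "RTX 20 (Turing)":   (7, 5),
    "RTX 30 (Ampere)":   (8, 6),
}

MIN_CAPABILITY = (3, 7)  # assumed floor for this app's PyTorch build

for series, cap in CAPABILITY.items():
    verdict = "supported" if cap >= MIN_CAPABILITY else "too old"
    print(f"{series}: sm_{cap[0]}{cap[1]} -> {verdict}")
```

By this table the Kepler GTX 700 series is the only generation listed that falls below the floor, consistent with "GTX 900 series and newer".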
ID: 60089
Ian&Steve C.

Send message
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 6,423
Level
Trp
Scientific publications
wat
Message 60090 - Posted: 15 Mar 2023, 11:26:10 UTC - in response to Message 60088.  
Last modified: 15 Mar 2023, 11:26:30 UTC

..
ID: 60090
Profile Carlesa25
Avatar

Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 60092 - Posted: 15 Mar 2023, 11:40:47 UTC - in response to Message 60089.  

I think Maxwell based cards. GTX 900 series and newer.


Hello: Thanks. I have a GTX 1080 Ti in sight, so that would work.

ID: 60092

©2025 Universitat Pompeu Fabra