Problem with Boinc device vs Nvidia X Server gpu allocation
| Author | Message |
|---|---|
|
Joined: 1 Mar 10 Posts: 147 Credit: 1,077,535,540 RAC: 0
|
Hi everybody! My motherboard is an ASUS X99-E WS with a Core i7-5820K at its 3.3 GHz stock clock and 16 GB of DDR4. I am running Lubuntu 15.04 with two strictly identical GPUs (Gigabyte GTX Titan Black) in auto mode, with the latest manually installed drivers. Whichever GPU I tell BOINC to ignore in cc_config.xml, GPU-0 is always the one that gets used. The work unit does run to completion. My cc_config.xml:
<cc_config>
  <options>
    <report_results_immediately>1</report_results_immediately>
    <use_all_gpus>1</use_all_gpus>
    <ignore_nvidia_dev>0</ignore_nvidia_dev>
  </options>
  <log_flags>
    <coproc_debug>1</coproc_debug>
    <task>1</task>
    <file_xfer>1</file_xfer>
    <sched_ops>1</sched_ops>
  </log_flags>
</cc_config>
lun. 27 juil. 2015 20:45:31 CEST | | log flags: file_xfer, sched_ops, task, coproc_debug
lun. 27 juil. 2015 20:45:31 CEST | | Libraries: libcurl/7.38.0 OpenSSL/1.0.1f zlib/1.2.8 libidn/1.28 librtmp/2.3
lun. 27 juil. 2015 20:45:31 CEST | | Data directory: /var/lib/boinc-client
lun. 27 juil. 2015 20:45:31 CEST | | [coproc] launching child process at /usr/bin/boinc
lun. 27 juil. 2015 20:45:31 CEST | | [coproc] relative to directory /
lun. 27 juil. 2015 20:45:31 CEST | | [coproc] with data directory /var/lib/boinc-client
lun. 27 juil. 2015 20:45:31 CEST | | CUDA: NVIDIA GPU 0 (ignored by config): GeForce GTX TITAN Black (driver version 352.21, CUDA version 7.5, compute capability 3.5, 4096MB, 4009MB available, 6396 GFLOPS peak)
lun. 27 juil. 2015 20:45:31 CEST | | CUDA: NVIDIA GPU 1: GeForce GTX TITAN Black (driver version 352.21, CUDA version 7.5, compute capability 3.5, 4096MB, 4009MB available, 6396 GFLOPS peak)
lun. 27 juil. 2015 20:45:31 CEST | | OpenCL: NVIDIA GPU 0 (ignored by config): GeForce GTX TITAN Black (driver version 352.21, device version OpenCL 1.2 CUDA, 6144MB, 4009MB available, 6396 GFLOPS peak)
lun. 27 juil. 2015 20:45:31 CEST | | OpenCL: NVIDIA GPU 1: GeForce GTX TITAN Black (driver version 352.21, device version OpenCL 1.2 CUDA, 6143MB, 4009MB available, 6396 GFLOPS peak)
lun. 27 juil. 2015 20:45:31 CEST | | NVIDIA library reports 2 GPUs
lun. 27 juil. 2015 20:45:31 CEST | | No ATI library found
lun. 27 juil. 2015 20:45:31 CEST | | Host name: odysseusV
lun. 27 juil. 2015 20:45:31 CEST | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz [Family 6 Model 63 Stepping 2]
lun. 27 juil. 2015 20:45:31 CEST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
lun. 27 juil. 2015 20:45:31 CEST | | OS: Linux: 3.19.0-23-generic
lun. 27 juil. 2015 20:45:31 CEST | | Memory: 15.58 GB physical, 31.98 GB virtual
lun. 27 juil. 2015 20:45:31 CEST | | Disk: 203.13 GB total, 166.69 GB free
lun. 27 juil. 2015 20:45:31 CEST | | Local time is UTC +2 hours
lun. 27 juil. 2015 20:45:31 CEST | Milkyway@Home | Found app_config.xml
lun. 27 juil. 2015 20:45:31 CEST | | Config: report completed tasks immediately
lun. 27 juil. 2015 20:45:31 CEST | | Config: ignoring NVIDIA GPU 0
lun. 27 juil. 2015 20:45:31 CEST | | Config: GUI RPCs allowed from:
lun. 27 juil. 2015 20:45:31 CEST | Milkyway@Home | URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 624246; resource share 100
lun. 27 juil. 2015 20:45:31 CEST | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 3343766; resource share 100
lun. 27 juil. 2015 20:45:31 CEST | GPUGRID | URL http://www.gpugrid.net/; Computer ID 226017; resource share 100
lun. 27 juil. 2015 20:45:31 CEST | World Community Grid | General prefs: from World Community Grid (last modified 24-Feb-2015 22:06:56)
lun. 27 juil. 2015 20:45:31 CEST | World Community Grid | Computer location: home
lun. 27 juil. 2015 20:45:31 CEST | | General prefs: using separate prefs for home
lun. 27 juil. 2015 20:45:31 CEST | | Reading preferences override file
lun. 27 juil. 2015 20:45:31 CEST | | Preferences:
lun. 27 juil. 2015 20:45:31 CEST | | max memory usage when active: 11962.05MB
lun. 27 juil. 2015 20:45:31 CEST | | max memory usage when idle: 11962.05MB
lun. 27 juil. 2015 20:45:31 CEST | | max disk usage: 162.50GB
lun. 27 juil. 2015 20:45:31 CEST | | (to change preferences, visit a project web site or select Preferences in the Manager)
lun. 27 juil. 2015 20:45:31 CEST | | gui_rpc_auth.cfg is empty - no GUI RPC password protection
lun. 27 juil. 2015 20:45:31 CEST | | Not using a proxy
lun. 27 juil. 2015 20:45:32 CEST | GPUGRID | [coproc] Assigning NVIDIA instance 0 to e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:46:32 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:46:32 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:47:33 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:47:33 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:48:33 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:48:33 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:49:33 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:49:33 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:50:34 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:50:34 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:51:34 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:51:34 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:51:53 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:51:53 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:52:54 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:52:54 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:53:54 CEST | GPUGRID | [coproc] NVIDIA instance 0; 1.000000 pending for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
lun. 27 juil. 2015 20:53:54 CEST | GPUGRID | [coproc] NVIDIA instance 0: confirming 1.000000 instance for e1s2_1-GERARD_FXCXCL12_LIG_501831-0-1-RND4749_0
boinc 4626 4590 13 20:45 ? 00:01:32 ../../projects/www.gpugrid.net/acemd.846-65.bin --device 1
Lubuntu 16.04.1 LTS x64 |
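The process line in the post above carries the device number that was handed to the science application. A minimal sketch for pulling that number out of such a line, in case it helps when comparing many tasks; the function name `device_from_cmdline` is hypothetical, and it assumes the flag always appears as `--device N`, which is only known from this one line.

```python
import re
from typing import Optional


def device_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the integer that follows --device, or None if the flag is absent."""
    match = re.search(r"--device\s+(\d+)", cmdline)
    return int(match.group(1)) if match else None


# The process line quoted in the post, copied as-is.
ps_line = (
    "boinc 4626 4590 13 20:45 ? 00:01:32 "
    "../../projects/www.gpugrid.net/acemd.846-65.bin --device 1"
)

print(device_from_cmdline(ps_line))  # prints 1
```

Here the result is 1, which is the number BOINC itself uses for the card; it says nothing about how another tool numbers the same card.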
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 51
|
Try using this line, because you have 2 GPUs: <use_all_gpus>2</use_all_gpus> |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
Quote: "Try using this line, because you have 2 gpus."
That's wrong. The "use_all_gpus" option is a boolean, so its value can only be 0 or 1. See the client configuration page in the BOINC wiki. |
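To make the point about the boolean concrete, here is a small sketch that writes the two GPU options from this thread into a cc_config document. The helper name `build_cc_config` and its parameters are hypothetical; only the element names `cc_config`, `options`, `use_all_gpus` and `ignore_nvidia_dev` come from the configuration posted above, and the sketch assumes a single ignored device.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_cc_config(use_all_gpus: bool, ignore_nvidia_dev: Optional[int] = None) -> str:
    """Build a cc_config document as a string (hypothetical helper)."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # use_all_gpus is a flag, so it is written as 0 or 1, never as a GPU count.
    ET.SubElement(options, "use_all_gpus").text = "1" if use_all_gpus else "0"
    if ignore_nvidia_dev is not None:
        if ignore_nvidia_dev < 0:
            raise ValueError("device number must not be negative")
        # The value is a device number in BOINC's own numbering.
        ET.SubElement(options, "ignore_nvidia_dev").text = str(ignore_nvidia_dev)
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(use_all_gpus=True, ignore_nvidia_dev=0))
```

Taking a Python `bool` for the flag keeps a value like 2 from ever reaching the file.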
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
Your BOINC log shows that it's ignoring GPU 0, as set in cc_config:
lun. 27 juil. 2015 20:45:31 CEST | | CUDA: NVIDIA GPU 0 (ignored by config): GeForce GTX TITAN Black (driver version 352.21, CUDA version 7.5, compute capability 3.5, 4096MB, 4009MB available, 6396 GFLOPS peak)
The process line in your post, which ends in "--device 1", also confirms that this task was started on GPU 1.
So perhaps the NVIDIA X Server has a different idea about the GPU numbering than the BOINC manager. I'm not a Linux expert, so I'm just guessing, but you should try to disable the other GPU in cc_config (e.g. "<ignore_nvidia_dev>1</ignore_nvidia_dev>"), and then check NVIDIA X Server again. |
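The "ignored by config" marker can be read out of an event log mechanically. The following is a rough sketch, assuming the CUDA detection lines always have the shape seen in the log posted in this thread; the function name `cuda_gpus` and the regular expression are illustrative, not part of BOINC.

```python
import re
from typing import List, Tuple

# Matches e.g. "CUDA: NVIDIA GPU 0 (ignored by config): GeForce GTX TITAN Black (driver ..."
GPU_LINE = re.compile(
    r"CUDA: NVIDIA GPU (\d+)( \(ignored by config\))?: ([^(]+?) \("
)


def cuda_gpus(log_text: str) -> List[Tuple[int, bool, str]]:
    """Return (BOINC device number, ignored?, model name) for each CUDA line."""
    return [
        (int(num), bool(ignored), name)
        for num, ignored, name in GPU_LINE.findall(log_text)
    ]


# Two lines copied from the log in this thread (tails shortened after "driver version").
sample = (
    "lun. 27 juil. 2015 20:45:31 CEST | | CUDA: NVIDIA GPU 0 (ignored by config): "
    "GeForce GTX TITAN Black (driver version 352.21)\n"
    "lun. 27 juil. 2015 20:45:31 CEST | | CUDA: NVIDIA GPU 1: "
    "GeForce GTX TITAN Black (driver version 352.21)\n"
)

for number, ignored, name in cuda_gpus(sample):
    print(number, "ignored" if ignored else "in use", name)
```

The numbers reported this way are BOINC's own, so they only show that BOINC is consistent with itself, not that other tools agree with it.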
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 51
|
You're right on that. It can be any number, and it works the same. My mistake! |
|
Joined: 1 Mar 10 Posts: 147 Credit: 1,077,535,540 RAC: 0
|
Quote: "So perhaps the NVIDIA X Server has a different idea about the GPU numbering than the BOINC manager. I'm not a Linux expert, so I'm just guessing, but you should try to disable the other GPU in cc_config (e.g. <ignore_nvidia_dev>1</ignore_nvidia_dev>), and then check NVIDIA X Server again."
Thanks for your answers ;-)
The NVIDIA driver enumerates GPUs in the order in which they are found on the PCI bus, that is:
PCIE16_1 -> GPU-0: BOINC device 0, the one that is CRUNCHING and should be ignored according to the config
PCIE16_2 -> not used
PCIE16_3 -> GPU-1: BOINC device 1, the one that should be crunching and does NOTHING!
If I ignore device 1, BOINC says it is using device 0. That's fine: as in the first case, BOINC is consistent with itself. BUT NVIDIA X Server shows that it is GPU-1 that is CRUNCHING! So to REALLY ignore the first GPU/device (0) I must ignore number 1, and vice versa!
Lubuntu 16.04.1 LTS x64 |
|
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 127
|
jihal, I am also running an ASUS board with Ubuntu. It has an NVIDIA 770 and an NVIDIA 970. When I look at the NVIDIA X Server Settings software, it says the 770 is GPU number 0 and the 970 is GPU number 1. However, when I look in the BOINC event log, it says that the 770 is GPU number 1 and the 970 is GPU number 0. The GPU numbers are reversed in BOINC. The same reversal could be happening to you. Luckily for me, my GPUs are different models, so it is easy to spot the reversal. Hope that helps. |
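The reversal described in this post can be written down as a small lookup: given the number a card has in NVIDIA X Server Settings, find the number BOINC gives the same card by matching model names. This is only a sketch under the assumption that every card has a distinct name, as with a 770 and a 970; the function name `boinc_index_for` and both dictionaries are hypothetical, filled in by hand from what each tool displays.

```python
from typing import Dict


def boinc_index_for(
    xserver_index: int,
    xserver_names: Dict[int, str],
    boinc_names: Dict[int, str],
) -> int:
    """Translate an X Server GPU number into BOINC's number via the model name."""
    name = xserver_names[xserver_index]
    matches = [idx for idx, other in boinc_names.items() if other == name]
    if len(matches) != 1:
        # Identical cards cannot be told apart by name alone.
        raise ValueError(f"cannot identify {name!r} uniquely by name")
    return matches[0]


# Numbering as reported in the post above.
xserver = {0: "GTX 770", 1: "GTX 970"}
boinc = {0: "GTX 970", 1: "GTX 770"}

# To idle the 770 (X Server GPU 0), this is the number to put in ignore_nvidia_dev.
print(boinc_index_for(0, xserver, boinc))  # prints 1
```

With two identical cards, as in the original question, the lookup raises an error on purpose, because the names carry no information about which card is which.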
|
Joined: 1 Mar 10 Posts: 147 Credit: 1,077,535,540 RAC: 0
|
Hi captainjack! It would be very interesting if some other crunchers could confirm this. I can't see which of your computers is the one concerned. Could you please give us the model of your motherboard, your Linux OS and your BOINC version? Regards
Lubuntu 16.04.1 LTS x64 |
|
Joined: 9 May 13 Posts: 171 Credit: 4,594,296,466 RAC: 127
|
jihal, my motherboard is an ASUS P9X79 LE, running Ubuntu 15.04 64-bit, BOINC 7.2.42 (manually installed), and NVIDIA drivers 346.47 (manually installed). The computer shows up twice on the list (I tried to combine the two entries but it wouldn't let me). It is the one that shows as having 2 GTX 970s and running Linux. Let me know if you need more information. |
©2025 Universitat Pompeu Fabra