I’m working on some automated benchmarks in which we do headless (i.e. -b) Cycles rendering on a number of different compute nodes. One of the nodes has two GPUs (Tesla K40m) with working NVIDIA drivers, CUDA 10.2 installed, etc. The Blender 2.82.7 binary I’m using was manually compiled. To check detection of the CUDA devices I ran Blender in a VNC session on the compute node under VirtualGL. There it shows both CUDA devices in the preferences, and both are selected for use.
The behaviour I’m seeing is that when rendering headless with blender -y -b file.blend -f 1 the GPUs are not being used: nvidia-smi shows no GPU activity on the node while all CPU cores are maxed out. The test scene is correctly set to use Cycles on the GPU device.
When looking at the output of --debug-cycles I see the CUDA devices are being picked up and enabled:
Blender 2.82 (sub 7)
Read prefs: /home/paulm/.config/blender/2.82/config/userpref.blend
found bundled python: /sw/arch/RedHatEnterpriseServer7/EB_production/2019/software/Blender/2.82-foss-2018b-Python-3.7.5-nvidia/share/blender/2.82/python
I0312 11:43:21.696301 17782 util_debug.cpp:51] Disabling avx2 instruction set.
I0312 11:43:21.696409 17782 blender_python.cpp:184] Debug flags initialized to:
CPU flags:
AVX2 : False
AVX : True
SSE4.1 : True
SSE3 : True
SSE2 : True
BVH layout : BVH8
Split : False
CUDA flags:
Adaptive Compile : False
OptiX flags:
CUDA streams : 1
OpenCL flags:
Device type : ALL
Debug : False
Memory limit : 0
Read blend: /home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/classroom.2.82-gpu.blend
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/lamps/lamps.blend', '//assets/lamps/lamps.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/chairs/chairs.blend', '//assets/chairs/chairs.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/coatStand/coatStand.blend', '//assets/coatStand/coatStand.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/desks/desks.blend', '//assets/desks/desks.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/dustBin/dustBin.blend', '//assets/dustBin/dustBin.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/radiator/radiator.blend', '//assets/radiator/radiator.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/suitcase/suitcase.blend', '//assets/suitcase/suitcase.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/wallClock/wallClock.blend', '//assets/wallClock/wallClock.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/wastes/wastes.blend', '//assets/wastes/wastes.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/books/books.blend', '//assets/books/books.blend', parent '<direct>'
Info: Read library: '/home/paulm/reframe/reframe-surfsara/production_tests/applications/blender/src/classroom/assets/officeSupplies/officeSupplies.blend', '//assets/officeSupplies/officeSupplies.blend', parent '<direct>'
I0312 11:43:22.269744 17782 util_debug.cpp:51] Disabling avx2 instruction set.
I0312 11:43:22.272995 17782 device_cuda.cpp:2582] CUEW initialization succeeded
I0312 11:43:22.273051 17782 device_cuda.cpp:2584] Found precompiled kernels
I0312 11:43:22.322508 17782 device_cuda.cpp:2708] Device has compute preemption or is not used for display.
I0312 11:43:22.322548 17782 device_cuda.cpp:2711] Added device "Tesla K40m" with id "CUDA_Tesla K40m_0000:02:00".
I0312 11:43:22.322613 17782 device_cuda.cpp:2708] Device has compute preemption or is not used for display.
I0312 11:43:22.322624 17782 device_cuda.cpp:2711] Added device "Tesla K40m" with id "CUDA_Tesla K40m_0000:82:00".
I0312 11:43:22.323077 17782 util_task.cpp:329] Creating pool of 16 threads.
I0312 11:43:22.323216 17782 util_task.cpp:241] Detected 16 processors in active group.
I0312 11:43:22.323228 17782 util_task.cpp:251] Not setting thread group affinity.
I0312 11:43:22.323711 17782 device_cpu.cpp:126] Will be using default kernels.
The only way I’ve found to get GPU rendering is to manually update the GPU list with a Python expression:
blender -y --python-expr "import bpy; bpy.context.preferences.addons['cycles'].preferences.get_devices()" -b file.blend -f 1
When using the extra Python snippet the GPUs are used for rendering and the CPU cores stay mostly idle, as expected.
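For reference, the one-liner above can be expanded into a small script that also switches the device on explicitly. This is only a minimal sketch, assuming the 2.8x Cycles preferences API (compute_device_type, get_devices() and a devices collection whose items have name, type and use). The helper name enable_cuda_devices and the file name enable_gpu.py are made up for illustration, and forcing every scene to 'GPU' is my own addition, not something the one-liner does.

```python
def enable_cuda_devices(cycles_prefs):
    """Refresh the Cycles device list and enable only the CUDA devices.

    `cycles_prefs` is expected to look like the Cycles add-on preferences
    object (assumption: 2.8x API). Returns the names of the enabled devices.
    """
    cycles_prefs.compute_device_type = "CUDA"
    # Same call as in the one-liner: repopulates the device list.
    cycles_prefs.get_devices()

    enabled = []
    for device in cycles_prefs.devices:
        # Enable CUDA devices, disable everything else (e.g. the CPU entry).
        device.use = (device.type == "CUDA")
        if device.use:
            enabled.append(device.name)
    return enabled


try:
    import bpy  # only available when run inside Blender
except ImportError:
    bpy = None

if bpy is not None:
    prefs = bpy.context.preferences.addons["cycles"].preferences
    print("Enabled CUDA devices:", enable_cuda_devices(prefs))
    # Assumption: also force every loaded scene to render on the GPU.
    for scene in bpy.data.scenes:
        scene.cycles.device = "GPU"
```

Saved as enable_gpu.py this would be run as blender -y -b file.blend --python enable_gpu.py -f 1. Blender processes command-line arguments in order, so the script has to come after the .blend file (so the scenes are loaded) and before -f.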
So is the get_devices() call really needed these days to get correct detection of GPUs when using headless rendering?