I have the same problem when I try to render with Cycles. Can you confirm that neither engine renders headless?
Hello all, I had the same problem rendering with EEVEE on Amazon AWS (or any other cluster). Regular X forwarding doesn’t work for utilizing remote GPUs, so I searched and found out about VirtualGL. Now I can render my scenes with EEVEE on AWS. Since many people have similar problems, I wrote a guide on my website explaining how I did it.
https://yigityakupoglu.home.blog/
I guess you can solve the problem with similar software such as TurboVNC, TigerVNC, etc… I am planning to test the same with TurboVNC and write a similar guide in the future. My email address is in the About section on my website. I am happy to be able to give back to the community, thanks
If you’re using NVIDIA GPUs then you should simply be able to fake an attached monitor in the xorg.conf file by using something like
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "UseDisplayDevice" "none"
Option "ConnectedMonitor" "DFP-0"
Option "CustomEDID" "DFP-0: /etc/X11/dell-3008wfp.bin"
EndSection
Here the file /etc/X11/dell-3008wfp.bin needs to contain a dump of a real monitor’s EDID information. We use this on our GPU nodes to make them think a monitor is attached, so no hardware plugs are needed.
I don’t understand why you need VirtualGL to render on a remote node. The important things to keep in mind are:
1. Do NOT use X forwarding (e.g. don’t use ssh -X ..), as that forwards the GPU rendering commands to your local machine and does the rendering there; that is what X forwarding means. Also, X forwarding doesn’t support many of the modern OpenGL extensions that Blender needs.
2. Make sure the remote machine has a working X server and the right OpenGL drivers. The easiest way to verify both is to run
DISPLAY=:0.0 glxinfo
on the remote node. It should return a lot of info, including some lines like
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
...
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla K40m/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 418.39
This tells you that the OpenGL implementation is provided by the NVIDIA drivers and that a GPU (a Tesla K40m in the example above) is used for rendering (the direct rendering: Yes part is crucial here). The output also tells you that OpenGL 4.6 is available, which is new enough for Blender 2.8 (which needs OpenGL 3.3 or higher).
If you get lines containing text like Mesa or llvmpipe then the node (in the current configuration) does not provide GPU-based rendering, but software-based OpenGL rendering. In this case the OpenGL drivers might not be correctly installed, the X server might not be configured correctly or the node might not have a GPU at all. It could also mean that the Blender executable is linking to Mesa instead of the NVIDIA-based OpenGL library (but GLVND should help these days).
If you get a message like Unable to open a display then either the X server isn’t running, the appropriate DISPLAY value isn’t set or there’s a permissions error accessing the X server.
3. On Linux with an NVIDIA GPU the only way to get hardware-accelerated OpenGL rendering is currently by going through the X server with GLX. (The exception is EGL, which can provide hardware-accelerated OpenGL rendering without GLX, but Blender doesn’t support EGL, nor the OpenGL ES version it provides.)
So if there’s no X server running on the Linux server node, Blender will not be able to use hardware-accelerated OpenGL.
And VirtualGL and TurboVNC won’t make a difference in this respect. In fact, TurboVNC merely provides an alternative X server that holds the remote desktop, while VirtualGL intercepts certain GLX and OpenGL calls to divert the rendering from the TurboVNC X server to the real X server (which has access to the GPU, as noted above).
4. EEVEE and Cycles differ in the way they use and need a GPU. EEVEE always uses OpenGL and can’t work without it. But Cycles’ GPU rendering mode is based on CUDA and doesn’t need OpenGL. Therefore, GPU-based Cycles rendering (through CUDA) can be used without having an X server running.
5. Blender’s -b (or --background) option also makes a difference here. If you don’t use the -b option then Blender will start normally and will try to initialize the GUI, which needs OpenGL and X.
But if you do use the -b option then the GUI part isn’t started. However, as mentioned in the previous point, an EEVEE render always needs OpenGL, regardless of whether -b was used. But a Cycles render can work without OpenGL and X when -b is used, even when doing GPU-based rendering.
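The rules about engines and the -b option can be condensed into a small decision helper. This is only a sketch of what is described above: the function name is made up for illustration, the engine strings are the identifiers passed to Blender's -E option, and treating every engine other than Cycles as needing OpenGL is a conservative assumption.

```python
def needs_x_server(engine: str, background: bool) -> bool:
    """Return True if a render needs a running X server with OpenGL.

    Sketch of the rules described in the post; not an official Blender API.
    """
    if not background:
        # Without -b Blender initializes its GUI, which needs OpenGL and X.
        return True
    # With -b, only Cycles (CPU or CUDA) is described as working without
    # OpenGL. Any other engine is assumed to need it (assumption).
    return engine.upper() != "CYCLES"
```

For example, needs_x_server("BLENDER_EEVEE", True) is True, while needs_x_server("CYCLES", True) is False.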
This might be quite a bit of (technical) detail, but I hope this is useful for future reference as it seems to be misunderstood quite often.
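The glxinfo check can also be automated. The following is a minimal sketch, not a definitive test: the function name and the list of software renderer names are assumptions, and it only looks at the "OpenGL renderer string" and "direct rendering" lines. On an NVIDIA node any Mesa string is still a warning sign, even if this helper does not flag it.

```python
# Renderer names that indicate software OpenGL (assumed list, not exhaustive).
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "software rasterizer")


def classify_glxinfo(output: str) -> str:
    """Classify glxinfo text as 'hardware', 'software' or 'unknown'."""
    fields = {}
    for line in output.splitlines():
        # Split each "key: value" line at the first colon.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()

    renderer = fields.get("opengl renderer string", "").lower()
    if not renderer:
        # E.g. the display could not be opened, so no renderer was reported.
        return "unknown"
    if any(name in renderer for name in SOFTWARE_RENDERERS):
        return "software"
    if fields.get("direct rendering", "").lower().startswith("yes"):
        return "hardware"
    return "unknown"
```

You could feed it the text captured from running glxinfo with DISPLAY set, for example via subprocess.run(["glxinfo"], capture_output=True, text=True, env=...).stdout.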
And to come back to the original issue: rendering an image with EEVEE using the -b background option works for me when running it on a Linux node with X and OpenGL correctly installed. I.e. this produces the correct output picture for me:
DISPLAY=:1.0 ~/software/blender-2.80-linux-glibc217-x86_64/blender -b eevee.blend -o //doh -f 1
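If you launch such renders from a script, the same command can be assembled in Python. This is a hedged sketch: the function name and the default display are assumptions, and it only builds the argument list and environment, leaving the actual launch to you. The -E and -o options are placed before -f because Blender processes its arguments in order.

```python
import os
from typing import Dict, List, Tuple


def eevee_render_command(
    blender: str, blend_file: str, output: str, frame: int, display: str = ":1"
) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and environment for a background EEVEE render."""
    argv = [
        blender,
        "-b", blend_file,        # background mode, no GUI
        "-E", "BLENDER_EEVEE",   # select the render engine
        "-o", output,            # output path, must precede -f
        "-f", str(frame),        # render this frame
    ]
    # EEVEE needs OpenGL, so point Blender at the running X server.
    env = dict(os.environ, DISPLAY=display)
    return argv, env
```

Usage would be something like argv, env = eevee_render_command("blender", "eevee.blend", "//doh", 1, display=":1.0") followed by subprocess.run(argv, env=env, check=True).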
By the way, a situation where VirtualGL + TurboVNC are really nice is when you want to have remote desktop on a GPU node for running Blender in, including being able to use the GUI and GPU-based rendering. We use that all the time and it works great. But VirtualGL + TurboVNC should not be needed to simply do batch rendering from the command-line.
Edit: edits, added points 3, 4 and 5
By the way, what indications do you have that EEVEE doesn’t support headless rendering? And what is “headless” precisely in this case? If it means “without a monitor attached” then that should be solvable (see my reply on faking monitors above).
Thanks a lot for the detailed answer.
Until now my problem was that I didn’t know I should run the application with DISPLAY=:0.0.
I went over your steps and can confirm that this works, so I was wrong: you don’t have to use VirtualGL.
I will correct the page and convert it to using VGL for GUI Blender.
Thanks
Great, happy to be of help!
Thank you so much for the detailed explanation.
I’m trying to build a Docker image based on Ubuntu 16.04 with Blender and an NVIDIA GPU, and run it through the docker run command to render a .blend file. Cycles works fine, but I have no clue how to set up Ubuntu for EEVEE.
Could you please write detailed instructions on how to set up the X server and display? Thanks again.
I managed to start the X server and render with EEVEE via the Blender Python console, but with this error output:
Received X11 Error:
error code: 178
request code: 154
minor code: 34
error text: GLXBadFBConfig
and the output of glxinfo has nothing about NVIDIA, but all GPUs are recognized in Blender:
bpy.context.preferences.addons['cycles'].preferences.get_devices()
([<bpy_struct, CyclesDeviceSettings("GeForce GTX 1080 Ti")>, <bpy_struct, CyclesDeviceSettings("GeForce GTX 1080 Ti")>, <bpy_struct, CyclesDeviceSettings("GeForce GTX 1080 Ti")>, <bpy_struct, CyclesDeviceSettings("GeForce GTX 1080 Ti")>, <bpy_struct, CyclesDeviceSettings("Intel Xeon Silver 4114 CPU @ 2.20GHz")>], [<bpy_struct, CyclesDeviceSettings("Intel Xeon Silver 4114 CPU @ 2.20GHz")>])
I googled a bit; the problem could be the NVIDIA GLX settings for OpenGL, since all the output from glxinfo is about Mesa.
I’m working on it and would still appreciate your instructions. : )
I wrote up detailed steps for step 2 of PaulMelis’ answer. They may solve your problem and help others:
1) DISPLAY=:0.0 glxinfo may return “unable to open a display” if you haven’t set up anything yet.
2) Make sure the glxinfo command is available. If not, install it with: sudo apt-get install mesa-utils
3) Make sure Xorg is installed: sudo apt-get install xorg
4) I am assuming the NVIDIA drivers are set up properly. Use
nvidia-xconfig --query-gpu-info
to see the available GPUs and note their bus IDs.
5) Now we want to generate a virtual X screen. Run:
sudo nvidia-xconfig --busid=PCI:0:6:0 --use-display-device=none --virtual=1280x1024
Replace PCI:0:6:0 with one of your bus IDs. If there is more than one GPU, any of their bus IDs will do. This creates the xorg.conf file, which the X server reads to configure itself.
6) Run
sudo Xorg :1
to start the X server. The “:1” here is the display number of the new X server.
7) Open a new shell. The DISPLAY environment variable may not be set there, so set it:
export DISPLAY=:1
8) Now, when you run glxinfo | grep -e render -e NVIDIA, you should get the good output that PaulMelis shows in step 2. You can follow the rest of his answer.
9) After you are done, Xorg doesn’t close with Ctrl+C. To kill it, run nvidia-smi to get the PID of Xorg, then run kill -9 PID#
I tried this on a new Ubuntu 18.04 instance on AWS and it worked. Hope this helps.
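For anyone who wants to script steps 4 to 6, here is a rough sketch. The helper names are hypothetical, and the exact layout of the nvidia-xconfig --query-gpu-info output is not assumed: the code simply searches the text for anything shaped like PCI:0:6:0. It only builds the command lists; running them (with sudo) is left to the caller.

```python
import re
from typing import List


def find_bus_ids(query_output: str) -> List[str]:
    """Extract bus IDs of the form PCI:<n>:<n>:<n> from the query output."""
    return re.findall(r"PCI:\d+:\d+:\d+", query_output)


def virtual_screen_commands(
    bus_id: str, display: str = ":1", size: str = "1280x1024"
) -> List[List[str]]:
    """Return the commands from steps 5 and 6 as argument lists."""
    return [
        # Step 5: write an xorg.conf with a virtual screen on the chosen GPU.
        ["sudo", "nvidia-xconfig", f"--busid={bus_id}",
         "--use-display-device=none", f"--virtual={size}"],
        # Step 6: start the X server on the given display number.
        ["sudo", "Xorg", display],
    ]
```

Each returned list can be passed to subprocess.run (or subprocess.Popen for Xorg, since it keeps running in the foreground).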
thanks so much bro!!!
Thank you very much yyakupog (and PaulMelis)! With your help, I am able to ssh into a machine without X forwarding, and run this command:
$ blender -b -E BLENDER_EEVEE -f 1
Blender 2.82 (sub 7) (hash 375c7dc4caf4 built 2020-03-12 05:30:40)
found bundled python: /home/daniel/blender-2.82a-linux64/2.82/python
Fra:1 Mem:88.40M (0.00M, Peak 88.68M) | Time:00:01.62 | Syncing Cube
Fra:1 Mem:88.72M (0.00M, Peak 89.01M) | Time:00:01.92 | Syncing Light
Fra:1 Mem:88.72M (0.00M, Peak 89.01M) | Time:00:01.92 | Syncing Camera
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:01.96 | Rendering 1 / 64 samples
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:02.09 | Rendering 26 / 64 samples
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:02.11 | Rendering 51 / 64 samples
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:02.12 | Rendering 64 / 64 samples
Fra:1 Mem:48.47M (0.00M, Peak 89.01M) | Time:00:02.16 | Sce: Scene Ve:0 Fa:0 La:0
Saved: '/tmp/0001.png'
Time: 00:02.47 (Saving: 00:00.31)
Blender quit
and it will correctly generate the image! I used Blender 2.82a for this.
For the most part I was able to follow your instructions, except I needed to do one thing. I was getting this error message after running sudo Xorg :1
$ sudo Xorg :1
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
(EE)
Fatal server error:
(EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE)
(EE)
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
(EE) Please also check the log file at "/var/log/Xorg.1.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
The reason is that I already had Xorg running, as shown here:
$ ps -C Xorg
PID TTY TIME CMD
1313 tty1 00:02:22 Xorg
2397 tty2 05:59:02 Xorg
The solution is to simply call sudo kill -9 [pid] on the X server that is already running, and then run sudo Xorg :1 again.
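If you need to find those PIDs from a script, the ps -C Xorg listing shown above is easy to parse. This is a small sketch with a made-up function name; it assumes the default ps column layout, with the PID first and the command name last.

```python
from typing import List


def xorg_pids(ps_output: str) -> List[int]:
    """Return the PIDs of Xorg processes listed in `ps -C Xorg` output."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split()
        # Skip the header row; keep rows whose first column is a number
        # and whose last column is the Xorg command name.
        if len(parts) >= 4 and parts[0].isdigit() and parts[-1] == "Xorg":
            pids.append(int(parts[0]))
    return pids
```

The text to parse could come from subprocess.run(["ps", "-C", "Xorg"], capture_output=True, text=True).stdout.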
Hi yyakupog, I posted an answer on the Blender Stack Exchange which combines your answer with Paul’s answer above. However, I am wondering if you have advice on how to start up an X server if the machine does not already have one? Do you remember the steps you performed?
Hello sdafasdf,
Thanks for posting the combined answer, I believe lots of people will benefit from that.
Let me clarify some of your concerns:
1) Ubuntu 18.04 comes with the GNOME Display Manager (GDM3) installed. Instead of GDM3, your system might have LightDM or KDM (or another display manager that I’m not aware of). During system startup, the display manager starts an X server in the background. This is why you got the error when you ran ‘sudo Xorg :1’: an X server was already running. On Ubuntu 18.04, you can list all services with (older versions may differ)
$ systemctl list-units --all --type=service
You will probably see gdm.service active. Or, if you know the service name, use:
$ systemctl status gdm.service
2) I actually wrote my previous answer for the case where you do not have a display manager. Step 3 installs the Xorg server and step 6 starts it.
3) If you have a display manager that is already running, for example gdm, the cleaner way is to stop the gdm service instead of killing the X process. You can stop the gdm service with:
$ service gdm stop
and start it again with
$ service gdm start
or, instead of starting gdm, you can just run sudo Xorg :1, which only starts the X server; I believe that is sufficient for your purposes.
Note: I personally do not start gdm. Please let me know if that works; there might be issues with the gdm configuration.
Hope this helps
I am so thankful I found this thread. I hadn’t realised that BLENDER_EEVEE isn’t fully supported for headless rendering. My render script was working fine when executed locally or over VNC, but the moment I tried running the script remotely via ssh it failed with the error below.
Thanks for the Linux solutions. Can anyone point me in the direction of a potential Windows solution? I want to be able to utilise my family’s PCs when they are in light or no use.
ssh was looking great, as I didn’t have to disturb them.
mikey@GX502 C:\Users\Mikey>\\RASPBERRYPI\Personal\Blender_Test\local_multi_render_2.82.py
mikey@GX502 C:\Users\Mikey>Blender 2.82 (sub 7) (hash 375c7dc4caf4 built 2020-03-12 15:41:08)
Read prefs: C:\Users\Mikey\AppData\Roaming\Blender Foundation\Blender\2.82\config\userpref.blend
found bundled python: C:\Program Files\Blender Foundation\Blender 2.82\2.82\python
mikey@GX502 C:\Users\Mikey>Read blend: \\RASPBERRYPI\\Personal\\Blender_Test\\cube.blend
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Win32 Error# (127): The specified procedure could not be found.
Error : EXCEPTION_ACCESS_VIOLATION
Address : 0x0000000000000000
@brecht Is it still impossible to do rendering with EEVEE in the background, without an open window? Any updates on this topic in general (i.e. should we expect this to be supported in the future)?
If full background rendering is not supported for EEVEE at the moment, is there a workaround for background rendering with EEVEE using a virtual window? Would rendering speed be affected by background rendering with a virtual window?
There’s no concrete plan for when headless rendering will be added.
This topic has posts explaining workarounds.