is this working for you?
I can run eevee via xvfb on an aws gpu instance, but yeah, slow: 50sec while it should be 2sec
I managed to get eevee running and rendering fast on a debian Google Cloud Instance with a GPU (P100), so for AWS I guess should be possible as well
First installed Nvidia drivers, then started Xorg and it worked.
Just make sure glxinfo doesn’t give any errors (it checks that OpenGL can work properly).
Just wondering, how related is this question / issue to the one of rendering through ssh connections? I am having some difficulty rendering through ssh connections, even though I have -X forwarding which should mean that I have a display on my remote machine.
I have a minimal working example provided in this StackExchange question. I don’t know if it makes sense to have essentially a duplicate post on devtalk but I’m happy to investigate more at this and to discuss here (or maybe in a new question?) if that makes more sense.
Why use X forwarding for background rendering? If all the rendering happens on the remote GPU, nothing needs to be forwarded to the local machine. If rendering happens on the local GPU, there is no point in involving a remote machine.
Thanks for the reply. I suppose I might not be understanding the purpose of X forwarding. The goal here is to get rendering done on that remote GPU. I guess my question can be simplified as follows:
Given this four line python script:
#!/usr/bin/env blender --python
import bpy
bpy.ops.render.render()
bpy.data.images['Render Result'].save_render(filepath='example.png')
How may I successfully run this over an ssh connection to a headless research server that has state of the art NVIDIA GPUs and recent OpenGL versions? By “running this” I mean running the command:
blender --background --python example-script.py
example-script.py contains just those four lines above.
Hopefully this makes the question clear! Please let me know if I can clarify and/or provide more information (or if it would be better to start a new question).
EDIT: fixed a typo in the script call, it should say --python, not --render as I had earlier.
I don’t know the steps to do it, just that X forwarding should not be needed since all computation should happen on the remote machine.
I would try what @vadimlobanov says above: install and run graphics drivers and Xorg on the remote machine, then see what glxinfo gives and go from there. There is no support yet for headless rendering in Eevee, so it may be necessary to do some configuration that makes it seem like there is a display even if there is no physical display.
I have the same problem if I try to render with cycles… can you confirm it’s both the engines not rendering headless?
Hello all, I had the same problem of rendering with EEVEE on amazon aws (or any other cluster). Using regular X forwarding doesn’t work for utilizing remote GPUs. So I searched and found about VirtualGL. Now I can render my scenes with EEVEE on AWS. Since most of the people have similar problems, I wrote down a guide on how I did it with some explanations on my website.
I guess you can solve the problem with similar software such as TurboVNC, TigerVNC, etc… I am planning to test the same with TurboVNC and write a similar guide in the future. My email address is in the About section on my website. I am happy to be able to give back to the community, thanks
If you’re using NVIDIA GPUs then you should simply be able to fake an attached monitor in the xorg.conf file by using something like
Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "UseDisplayDevice" "none"
    Option "ConnectedMonitor" "DFP-0"
    Option "CustomEDID" "DFP-0: /etc/X11/dell-3008wfp.bin"
EndSection
Where the file /etc/X11/dell-3008wfp.bin needs to contain a dump of a real monitor’s EDID information. We use this on our GPU nodes to make them think there is a monitor there, no need for hardware plugs.
I don’t understand why you need VirtualGL to render on a remote node. The important things to keep in mind are:
Do NOT use X forwarding (e.g. don’t use ssh -X ..) as that will actually forward the GPU rendering commands to your local machine and do the rendering there, as that is what X forwarding means. Also, X forwarding doesn’t support many of the modern OpenGL extensions that are needed for Blender.
Make sure the remote machine has a working X server and the right OpenGL drivers. The easiest way to verify both of these is to run DISPLAY=:0.0 glxinfo on the remote node. It should return a lot of info, including some lines like
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
...
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla K40m/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 418.39
This will tell you that the OpenGL implementation is provided by the NVIDIA drivers, while using a GPU (a Tesla K40m in the example above) for rendering (the direct rendering: Yes part is crucial here). The output also tells you that OpenGL 4.6 is available, which is new enough for Blender 2.8 (which needs OpenGL 3.3 or higher).
If you get lines containing text like llvmpipe then the node (in the current configuration) does not provide GPU-based rendering, but software-based OpenGL rendering. In this case the OpenGL drivers might not be correctly installed, the X server might not be configured correctly or the node might not have a GPU at all. It could also mean that the Blender executable is linking to Mesa, instead of the NVIDIA-based OpenGL library (but GLVND should help these days).
If you get a message like Unable to open a display then either the X server isn’t running, the appropriate DISPLAY value isn’t set or there’s a permissions error accessing the X server.
On Linux with an NVIDIA GPU the only way to get hardware-accelerated OpenGL rendering is currently by going through the X server with GLX. (The exception to this is to use EGL, which can be used to get hardware-accelerated OpenGL rendering without GLX, but Blender doesn’t support EGL, nor the OpenGL ES version it provides).
So if there’s no X server running on the Linux server node Blender will not be able to use hardware-accelerated OpenGL.
And VirtualGL and TurboVNC won’t make a difference in this respect. In fact, TurboVNC merely provides an alternative X server that holds the remote desktop, while VirtualGL intercepts certain GLX and OpenGL calls to divert the rendering from the TurboVNC X server to the real X server (which has access to the GPU, as noted above).
EEVEE and Cycles differ in the way they use and need a GPU. EEVEE always uses OpenGL and can’t work without it. But Cycles’ GPU rendering mode is based on CUDA and doesn’t need OpenGL. Therefore, GPU-based Cycles rendering (through CUDA) can be used without having an X server running.
The -b (--background) option also makes a difference here. If you don’t use the -b option then Blender will start normally and will try to initialize the GUI, which needs OpenGL and X.
But if you do use the -b option then the GUI part isn’t started. However, as mentioned in the previous point, an EEVEE render always needs OpenGL, regardless of whether -b was used. But a Cycles render can work without OpenGL and X when -b is used, even when doing GPU-based rendering.
This might be quite a bit of (technical) detail, but I hope this is useful for future reference as it seems to be misunderstood quite often.
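As a rough illustration of the glxinfo checks described above, here is a minimal Python sketch that classifies captured glxinfo output. The function names and the category labels are my own (hypothetical), and the string matching is only a heuristic based on the lines quoted in this post, not an exhaustive parser.

```python
import re


def classify_glxinfo(output):
    """Heuristically classify glxinfo output (labels are illustrative only)."""
    text = output.lower()
    if "unable to open" in text and "display" in text:
        # X server not running, DISPLAY not set, or a permissions problem
        return "no-display"
    if "llvmpipe" in text:
        # Mesa software rasterizer: OpenGL works, but not on the GPU
        return "software-rendering"
    if "direct rendering: yes" in text and "nvidia" in text:
        return "nvidia-gpu"
    return "unknown"


def core_profile_version(output):
    """Return (major, minor) of the OpenGL core profile, or None if absent."""
    match = re.search(r"OpenGL core profile version string:\s*(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample lines taken from the glxinfo output quoted above.
GLXINFO_SAMPLE = """\
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla K40m/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 418.39
"""

if __name__ == "__main__":
    print(classify_glxinfo(GLXINFO_SAMPLE))
    version = core_profile_version(GLXINFO_SAMPLE)
    # Blender 2.8 needs OpenGL 3.3 or higher
    print(version is not None and version >= (3, 3))
```

In practice the output would come from running glxinfo with DISPLAY set on the remote node; the sample string here just keeps the sketch self-contained.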
And to come back to the original issue: rendering an image with EEVEE using the -b background option works for me when running it on a Linux node with X and OpenGL correctly installed. I.e. this produces the correct output picture for me:
DISPLAY=:1.0 ~/software/blender-2.80-linux-glibc217-x86_64/blender -b eevee.blend -o //doh -f 1
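For scripting batch renders, the command above could be wrapped roughly as follows. This is only a sketch: the helper names are hypothetical, the rule in needs_x_server just restates the points about EEVEE, Cycles and -b made earlier (it is not a Blender API), and the DISPLAY value has to match whatever X server is actually running on the node.

```python
import os
import subprocess


def needs_x_server(engine, background):
    """Rule of thumb from this thread; only covers BLENDER_EEVEE and CYCLES."""
    if engine not in ("BLENDER_EEVEE", "CYCLES"):
        raise ValueError("engine not covered by this sketch: %r" % engine)
    if not background:
        return True  # without -b the GUI starts, which needs OpenGL and X
    return engine == "BLENDER_EEVEE"  # EEVEE always renders through OpenGL


def build_render_command(blender, blend_file, output_prefix, frame=1):
    """Build the same argument list as the command line shown above."""
    return [blender, "-b", blend_file, "-o", output_prefix, "-f", str(frame)]


def render_env(display):
    """Copy the current environment and point DISPLAY at the X server."""
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env


def render(blender, blend_file, output_prefix, frame=1, display=":1.0"):
    """Run the render; raises CalledProcessError if Blender exits non-zero."""
    command = build_render_command(blender, blend_file, output_prefix, frame)
    return subprocess.run(command, env=render_env(display), check=True)


if __name__ == "__main__":
    # Only prints the command; call render(...) on a node with Blender installed.
    print(build_render_command("blender", "eevee.blend", "//doh"))
```

Setting DISPLAY per process, instead of exporting it globally, keeps the choice of X server explicit when several are running on one node.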
By the way, a situation where VirtualGL + TurboVNC are really nice is when you want to have remote desktop on a GPU node for running Blender in, including being able to use the GUI and GPU-based rendering. We use that all the time and it works great. But VirtualGL + TurboVNC should not be needed to simply do batch rendering from the command-line.
Edit: edits, added points 3, 4 and 5
By the way, what indications do you have that EEVEE doesn’t support headless rendering? And what is “headless” precisely in this case? If it means “without a monitor attached” then that should be solvable (see my reply on faking monitors above).
Thanks a lot for the detailed answer.
Until now my problem was that I didn’t know I should run the application with DISPLAY=:0.0.
I went over your steps and can confirm that this works, so I was wrong, you don’t have to use Virtual GL.
I will correct the page and convert it to using VGL for GUI Blender.
Great, happy to be of help!
Thank you so much for the detailed explanation.
I’m trying to build a Docker image based on Ubuntu 16.04 with Blender and an NVIDIA GPU, and run it through the docker run command to render a .blend file. Cycles works fine, but I have no clue how to set up Ubuntu for EEVEE.
Could you please write instructions on how to set up the X server and display in detail? Thanks again.
I managed to start the X server and render with EEVEE via the Blender Python console, but with this error output:
Received X11 Error:
error code: 178
request code: 154
minor code: 34
error text: GLXBadFBConfig
Also, the output of glxinfo has nothing about NVIDIA, but all GPUs are recognized in Blender, like:
([<bpy_struct, CyclesDeviceSettings(“GeForce GTX 1080 Ti”)>, <bpy_struct, CyclesDeviceSettings(“GeForce GTX 1080 Ti”)>, <bpy_struct, CyclesDeviceSettings(“GeForce GTX 1080 Ti”)>, <bpy_struct, CyclesDeviceSettings(“GeForce GTX 1080 Ti”)>, <bpy_struct, CyclesDeviceSettings(“Intel Xeon Silver 4114 CPU @ 2.20GHz”)>], [<bpy_struct, CyclesDeviceSettings(“Intel Xeon Silver 4114 CPU @ 2.20GHz”)>])
I googled a bit; the problem could be the NVIDIA GLX settings for OpenGL, since all the output from glxinfo is about Mesa.
I’m working on it and would still appreciate your instructions. : )
I wrote the detailed steps of PaulMelis’ answer, step 2. May solve your problem and help others:
1)DISPLAY=:0.0 glxinfo may return “unable to open a display”, if you haven’t setup anything.
2)Make sure glxinfo command is there, if not download by: sudo apt-get install mesa-utils
3)Make sure xorg is there: sudo apt-get install xorg
4)I am assuming nvidia drivers are setup properly. Use
to see available gpu’s and note their bus id.
5)Now we want to generate a virtual x screen. Do:
sudo nvidia-xconfig --busid=PCI:0:6:0 --use-display-device=none --virtual=1280x1024 .
Replace PCI:0:6:0 with one of the bus id’s you have. Choose any of the GPU’s bus ID if more than one gpu exists. This creates the xorg.conf file. Xserver will read this file and setup the xorg server accordingly.
6)Run sudo Xorg :1 to start the Xserver. “:1” here refers to that virtual screen’s number.
7)Open a new shell. Now we need a new DISPLAY environment variable, but it may not be setup. Do:
8)Now, when you do glxinfo | grep -e render -e NVIDIA, you can get the good output that PaulMelis has in step 2. You can follow rest of his answer.
9)After you are done, Xorg doesn’t close with Ctrl+C. The way to kill it is run nvidia-smi to get the PID# of the Xorg, then run kill -9 PID#
I tried this in a new Ubuntu 18.04 instance on AWS and it worked, hope this helps.
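One detail that can trip people up in step 5: Xorg expects the bus ID in decimal (PCI:bus:device:function), while nvidia-smi prints it in hexadecimal (for example 00000000:00:1E.0). The sketch below converts between the two and builds the nvidia-xconfig call from step 5. It assumes the ID was read from nvidia-smi; the helper names are hypothetical, and if the ID is already shown in PCI:x:y:z form no conversion is needed.

```python
import re

# Matches "domain:bus:device.function" or "bus:device.function", all in hex.
_BUS_ID = re.compile(
    r"(?:[0-9A-Fa-f]+:)?([0-9A-Fa-f]{2}):([0-9A-Fa-f]{2})\.([0-9A-Fa-f])"
)


def xorg_bus_id(nvidia_smi_bus_id):
    """Convert a hex bus ID such as '00000000:00:1E.0' to 'PCI:0:30:0'."""
    match = _BUS_ID.fullmatch(nvidia_smi_bus_id.strip())
    if match is None:
        raise ValueError("unrecognized bus id: %r" % nvidia_smi_bus_id)
    bus, device, function = (int(part, 16) for part in match.groups())
    return "PCI:%d:%d:%d" % (bus, device, function)


def nvidia_xconfig_command(bus_id, virtual="1280x1024"):
    """Argument list for the nvidia-xconfig call from step 5."""
    return [
        "sudo",
        "nvidia-xconfig",
        "--busid=%s" % bus_id,
        "--use-display-device=none",
        "--virtual=%s" % virtual,
    ]


if __name__ == "__main__":
    bus_id = xorg_bus_id("00000000:00:1E.0")
    print(bus_id)
    print(" ".join(nvidia_xconfig_command(bus_id)))
```

The PCI domain is ignored here, which is an assumption that holds for single-domain machines only.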
thanks so much bro!!!
Thank you very much yyakupog (and PaulMelis)! With your help, I am able to ssh into a machine without X forwarding, and run this command:
$ blender -b -E BLENDER_EEVEE -f 1
Blender 2.82 (sub 7) (hash 375c7dc4caf4 built 2020-03-12 05:30:40)
found bundled python: /home/daniel/blender-2.82a-linux64/2.82/python
Fra:1 Mem:88.40M (0.00M, Peak 88.68M) | Time:00:01.62 | Syncing Cube
Fra:1 Mem:88.72M (0.00M, Peak 89.01M) | Time:00:01.92 | Syncing Light
Fra:1 Mem:88.72M (0.00M, Peak 89.01M) | Time:00:01.92 | Syncing Camera
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:01.96 | Rendering 1 / 64 samples
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:02.09 | Rendering 26 / 64 samples
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:02.11 | Rendering 51 / 64 samples
Fra:1 Mem:88.73M (0.00M, Peak 89.01M) | Time:00:02.12 | Rendering 64 / 64 samples
Fra:1 Mem:48.47M (0.00M, Peak 89.01M) | Time:00:02.16 | Sce: Scene Ve:0 Fa:0 La:0
Saved: '/tmp/0001.png'
Time: 00:02.47 (Saving: 00:00.31)
Blender quit
and it will correctly generate the image! I used Blender 2.82a for this.
For the most part I was able to follow your instructions, except I needed to do one thing. I was getting this error message after running
sudo Xorg :1
$ sudo Xorg :1
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
(EE) Fatal server error:
(EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE)
(EE) Please consult the The X.Org Foundation support at http://wiki.x.org for help.
(EE) Please also check the log file at "/var/log/Xorg.1.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
The reason is that I already had Xorg running, as shown here:
$ ps -C Xorg
PID TTY TIME CMD
1313 tty1 00:02:22 Xorg
2397 tty2 05:59:02 Xorg
The solution is to simply call sudo kill -9 [pid] for the running Xorg process and then run the command again.
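To find the PIDs without reading the table by eye, the ps -C Xorg output can be parsed with a few lines of Python. This is a small sketch with a hypothetical helper name; it assumes the default ps column layout shown above (PID first, command name last).

```python
def xorg_pids(ps_output):
    """Extract the PIDs of Xorg processes from `ps -C Xorg` output."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not an Xorg entry.
        if len(fields) >= 4 and fields[0].isdigit() and fields[-1] == "Xorg":
            pids.append(int(fields[0]))
    return pids


# Sample taken from the ps output quoted above.
PS_SAMPLE = """\
  PID TTY          TIME CMD
 1313 tty1     00:02:22 Xorg
 2397 tty2     05:59:02 Xorg
"""

if __name__ == "__main__":
    print(xorg_pids(PS_SAMPLE))
```

Each PID returned this way can then be passed to sudo kill as described above.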
Hi yyakupog, I posted an answer on the Blender Stack Exchange which combines your answer with Paul’s answer above. However, I am wondering if you have advice on how to start up an X server if the machine does not already have one? Do you remember the steps you performed?
Thanks for posting the combined answer, I believe lots of people will benefit from that.
Let me clarify some of your concern:
1)Ubuntu 18.04 comes with the GNOME Display Manager (GDM3) installed. Instead of GDM3, your system might have LightDM or KDM (or another display manager that I’m not aware of). During system startup, the display manager starts the Xserver in the background. This is why you got the error when you ran ‘sudo Xorg :1’: the Xserver was already running. On Ubuntu 18.04, you can check all the running services with (older versions may differ)
$ systemctl list-units --all --type=service
You will probably see gdm.service active. Or if you know the service name use:
$ systemctl status gdm.service
2)I actually wrote my previous answer for the case when you do not have a display manager. Step 3 installs the xorg server and step 6 starts it.
3)If you have a display manager that is already running, for example gdm, the cleaner way is to stop the gdm service instead of killing the X process. You can stop the gdm service with:
$ service gdm stop
and start it again with
$ service gdm start
or, instead of starting gdm, you can just run sudo Xorg :1, which starts only the Xserver; I believe that is sufficient for your purposes.
Note: I personally do not start gdm. Please let me know if that works, there might be issues with configuration of gdm.
Hope this helps
I am so thankful I found this thread. I hadn’t realised that BLENDER_EEVEE isn’t fully supported for headless rendering. My render script was working fine when executed locally or over vnc, but the moment I tried running the script remotely via ssh it would fail with the error below.
Thanks for the Linux solutions. Can anyone point me in the direction of a potential Windows solution, as I want to be able to utilise my family’s PCs when they are in light / no use?
ssh was looking great as I didn’t have to disturb them.
mikey@GX502 C:\Users\Mikey>\\RASPBERRYPI\Personal\Blender_Test\local_multi_render_2.82.py
mikey@GX502 C:\Users\Mikey>Blender 2.82 (sub 7) (hash 375c7dc4caf4 built 2020-03-12 15:41:08)
Read prefs: C:\Users\Mikey\AppData\Roaming\Blender Foundation\Blender\2.82\config\userpref.blend
found bundled python: C:\Program Files\Blender Foundation\Blender 2.82\2.82\python
mikey@GX502 C:\Users\Mikey>Read blend: \\RASPBERRYPI\\Personal\\Blender_Test\\cube.blend
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Warning! Using result of ChoosePixelFormat.
Win32 Error# (127): The specified procedure could not be found.
Error : EXCEPTION_ACCESS_VIOLATION
Address : 0x0000000000000000