Alternative rendering backends

brecht · June 8, 2018, 10:16am

Part of the work is indeed refactoring, replacing GLuint with uint32_t or an abstract pointer, always allocating textures through GPUTexture, and similar changes to hide OpenGL data types and functions. Further I think more Blender specific code in bf_gpu should be moved into bf_draw, and intern_gawain merged into bf_gpu to form the more generic abstraction library.

We could agree on these refactoring steps without worrying too much about how it all maps to Vulkan, there’s already plenty of work there that can be done incrementally, to isolate the OpenGL stuff to one module that is as small as possible.

But there is also a big design change, which is that Vulkan / Metal / DX12 replace global state by objects and support multithreading. If we emulate the gl* calls or adopt ANGLE we will not get the performance benefits.

For example to remove glEnable(GL_BLEND), GPUShader could store the blend mode and then GPU_shader_bind() could set it. This way GPUShader could map to a Vulkan pipeline. I don’t know if that’s exactly the right solution, we need a design for how the existing GPU/GWN/DRW abstractions map to Vulkan.
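To make the idea above concrete, here is a minimal sketch of a shader object that stores its own blend mode and applies it at bind time. All names (ExampleShader, ExampleBackendState, example_shader_bind) are hypothetical and are not Blender's GPUShader API; the backend is reduced to a struct so the sketch compiles without OpenGL or Vulkan headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical blend modes; illustrative names, not Blender's actual API. */
typedef enum ExampleBlendMode {
  EXAMPLE_BLEND_NONE = 0,
  EXAMPLE_BLEND_ALPHA,
  EXAMPLE_BLEND_ADDITIVE
} ExampleBlendMode;

/* A shader object that carries the blend state it must be drawn with. */
typedef struct ExampleShader {
  const char *name;
  ExampleBlendMode blend;
} ExampleShader;

/* Stand-in for the state currently applied by the backend. */
typedef struct ExampleBackendState {
  ExampleBlendMode blend;
  int blend_changes; /* Counts real state changes, for illustration only. */
} ExampleBackendState;

/* Bind the shader and apply its stored blend mode in one place. */
void example_shader_bind(ExampleBackendState *state, const ExampleShader *shader)
{
  assert(state != NULL && shader != NULL);
  if (state->blend != shader->blend) {
    /* OpenGL backend: glEnable/glDisable(GL_BLEND) plus glBlendFunc would go here.
     * Vulkan backend: select the pipeline that was built with this blend mode. */
    state->blend = shader->blend;
    state->blend_changes++;
  }
}
```

Because the blend mode travels with the shader, calling code never touches global blend state, which is the property a pipeline-based backend needs.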

Clément probably has ideas about this.

LazyDodo · June 8, 2018, 2:56pm

Alright, seems like we’re kinda on the same path. How do you feel about this plan of action?

Stage 1: opengl cleanup

Add a CMake option WITH_OPENGL (default: on) so that glew.h is only included when it’s defined. We force the define for bf_gpu/intern_gawain/maybe bf_draw. Why? So that when this flag is off, everything else that uses an OpenGL function or type gives a neat compiler error (this is actually how I did my quick survey).
Change the public-facing API of intern_gawain to not use any OpenGL types (it’s screaming for a refactor, but this is not the time).
Refactor all OpenGL out of the rest of the codebase by wrapping the calls into bf_gpu calls, i.e. glEnable(GL_SMOOTH); becomes GPU_enable(GPU_SMOOTH);

(at this point blender should build with WITH_OPENGL=Off)
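As a rough illustration of the wrapping step, the sketch below shows a backend-neutral enum with the OpenGL translation kept private to one module. The names (ExampleCap, example_gpu_enable and so on) are placeholders, since GPU_enable() is only a proposal in this thread. The hex values are the standard values of the named GL defines, written out so the sketch compiles without glew.h, and the actual glEnable/glDisable calls are left as comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical backend-neutral capabilities. GPU_enable() is only a proposal in
 * the thread, so every name here is a placeholder. */
typedef enum ExampleCap {
  EXAMPLE_CAP_BLEND = 0,
  EXAMPLE_CAP_DEPTH_TEST,
  EXAMPLE_CAP_SCISSOR_TEST,
  EXAMPLE_CAP_COUNT
} ExampleCap;

/* Tracks what was requested, so the sketch can be checked without a GL context. */
static bool example_cap_enabled[EXAMPLE_CAP_COUNT];

/* Private to the GPU module: translate to OpenGL enum values. */
unsigned int example_cap_to_gl(ExampleCap cap)
{
  static const unsigned int gl_values[EXAMPLE_CAP_COUNT] = {
      0x0BE2u, /* GL_BLEND */
      0x0B71u, /* GL_DEPTH_TEST */
      0x0C11u  /* GL_SCISSOR_TEST */
  };
  assert((unsigned int)cap < (unsigned int)EXAMPLE_CAP_COUNT);
  return gl_values[cap];
}

/* Public wrapper: callers never see a GLenum. */
void example_gpu_enable(ExampleCap cap)
{
  unsigned int gl_cap = example_cap_to_gl(cap);
  (void)gl_cap; /* An OpenGL backend would call glEnable(gl_cap) here. */
  example_cap_enabled[cap] = true;
}

void example_gpu_disable(ExampleCap cap)
{
  unsigned int gl_cap = example_cap_to_gl(cap);
  (void)gl_cap; /* An OpenGL backend would call glDisable(gl_cap) here. */
  example_cap_enabled[cap] = false;
}

bool example_gpu_is_enabled(ExampleCap cap)
{
  assert((unsigned int)cap < (unsigned int)EXAMPLE_CAP_COUNT);
  return example_cap_enabled[cap];
}
```

With this shape, only the file containing example_cap_to_gl would need the OpenGL headers, so building everything else with WITH_OPENGL off would flag any remaining direct GL usage.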

refactor intern_gawain , bf_gpu and bf_draw.

gawain’s types are screaming for a refactor. Generally I like libraries to hide their internals using opaque types, i.e. having typedef struct Gwn_Batch Gwn_Batch; in the public headers, and a private header with the actual definition. This comes with one disadvantage: you can no longer put these structures on the stack (the size is unknown), so you have to provide a function to create them and be sure to free them later. The main advantage is that just because I want to store a dx/vulkan/metal type for internal housekeeping doesn’t mean the rest of the code has to pollute its namespace by also depending on the dx/vulkan/metal headers. So: faster build times, fewer collisions, less temptation for devs to go ‘ohh I’ll just #ifdef a weird vulkan exception hack here in bf_sculpt, it’ll be alright’, and client code no longer has access to internal fields.
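The opaque-type pattern described above could look roughly like this. It is a single-file sketch: the two commented sections stand in for the public and private headers, and ExampleBatch with its functions is a hypothetical stand-in, not Gawain's actual Gwn_Batch API.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Public header: what client code would see ---- */

typedef struct ExampleBatch ExampleBatch; /* Opaque: size and fields are hidden. */

ExampleBatch *example_batch_create(unsigned int vertex_count);
unsigned int example_batch_vertex_count(const ExampleBatch *batch);
void example_batch_free(ExampleBatch *batch);

/* ---- Private header/implementation: only the GPU module sees this ---- */

struct ExampleBatch {
  unsigned int vertex_count;
  void *backend_handle; /* Could hold a GL/Vulkan/Metal object without exposing its header. */
};

/* Clients cannot allocate the struct themselves, so creation goes through here.
 * Returns NULL if the allocation fails. */
ExampleBatch *example_batch_create(unsigned int vertex_count)
{
  ExampleBatch *batch = calloc(1, sizeof(*batch));
  if (batch != NULL) {
    batch->vertex_count = vertex_count;
  }
  return batch;
}

/* Accessor replacing direct field access from client code. */
unsigned int example_batch_vertex_count(const ExampleBatch *batch)
{
  assert(batch != NULL);
  return batch->vertex_count;
}

void example_batch_free(ExampleBatch *batch)
{
  free(batch); /* free(NULL) is a no-op, so NULL is accepted. */
}
```

The trade-off mentioned in the post shows up directly: every batch now costs a heap allocation and needs a matching free, and code that currently reads fields directly would have to go through accessors.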

stage 2: backend refactor

TBD.

stage 3: implement alternative backends.

TBD.

Doing the cleanup like this means that most of the cleanup (steps 1-3) can be done incrementally in master while 2.8 plows forward without disrupting anything. This is going to be a lot of work, and doing it in a branch while keeping up with the light speed 2.8 is currently going at is not going to be fun for anyone.

fclem · June 11, 2018, 10:33am

I was thinking the same thing.

Stage 1 seems fine to me and somewhat doable in the near future. But I have concerns about doing it in master instead of 2.8, since the drawing code has diverged drastically. Also, since we are still in heavy development, I would wait until 2.8 stabilizes.

refactor intern_gawain , bf_gpu and bf_draw.

This means quite some work. The draw manager/cache relies heavily on gawain’s structures not being opaque. Also, it was not really designed with Vulkan in mind, but it shouldn’t be hard to refactor.

Merging Gawain into the GPU module is a matter of licence if I recall correctly.
Also note that OpenGL ES is a big NO for Eevee (or even the workbench engine).

Off topic: Another issue for MoltenVK adoption is the lack of support for geometry shaders, which we use in some places. We try to avoid them, but sometimes they give better performance on Intel GPUs or there is no other alternative. We could do the same things with compute shaders (maybe even not as efficiently), but that really means duplicating the codebase to support everyone.

Only supporting Vulkan hardware is more or less the same as only supporting post-2012 hardware (2015 for Intel): https://en.wikipedia.org/wiki/Vulkan_(API).

brecht · June 11, 2018, 1:01pm

Refactoring in master is indeed not useful, it should happen in blender2.8 since the code is so different now.

I talked a bit further with @fclem about this, and we think it’s ok to start refactoring now: adding WITH_OPENGL, not using OpenGL types in the APIs, and wrapping OpenGL calls. Making the gawain types opaque should wait until things stabilize more though.

Also, I suggest running the Eevee and OpenGL draw regression tests for this refactoring; this should make it relatively safe to commit changes.

LazyDodo · June 11, 2018, 1:02pm

yeah bad phrasing on my part, i meant we’re able to do it in the 2.8 branch without disrupting it too much, cause keeping up with such sweeping changes in a branch while 2.8 plows ahead isn’t going to be fun.

LazyDodo · June 11, 2018, 1:05pm

Alright, I’m gonna send diffs for 1.1 and 1.2, and a few of 1.3, but once we get the hang of 1.3 it’s probably best to just commit directly (given all tests pass).

LazyDodo · June 17, 2018, 3:44pm

Making an inventory of methods that are going to be needed in bf_gpu, is there a preference here?

void GPU_clip_distance(bool enable);

vs

void GPU_clip_distance_enable(void);
void GPU_clip_distance_disable(void);

kilon · June 18, 2018, 8:45am

I would like to lend a helping hand. It would be nice if we had a task on the developer website that outlines the todos, goals and guidelines, so it gives us an idea of how we can help out. It does not have to be long or specific, just something that gives at least a general idea.

Or maybe there is already such a task and someone can link it here.

brecht · June 18, 2018, 6:33pm

I’d go for the first one.

LazyDodo · June 30, 2018, 5:31pm

Doing some more invasive changes and uhhh, the opengl_draw tests are nothing short of awful.

They don’t run in the background. Windows pop up all over the place, so while this is running I cannot do anything else on the machine, not even answer an email, because at any time a window could pop up and steal the focus.
They’re slow: they take about 10-15 minutes to run, and that is if you’re paying attention, because of issue 3.
About half the tests give a ‘Some changes have not been saved. Do you really want to quit?’ dialog you have to confirm/deny.
Running BLENDER_TEST_UPDATE=1 ctest -R opengl_draw to generate the references, immediately followed by ctest -R opengl_draw, gives failures. Somehow, if you click away from the annoying window that pops up, you get more failures.

As you can see in the screenshots, a bunch of the menu/status bars are missing, so that’s not getting tested…
Even if I take out the dialog from issue 3 in ghost, I still can’t walk away, because a bunch of tests crash and pop up the ‘do you wish to debug’ dialog (opengl_draw_tracking_test is really bad in this regard). So I HAVE to sit there and babysit them, and I can’t do anything else while they run…
Not sure what determines the window size, but sometimes the size changes, so the resolution of the screenshot changes: instant fail on all of them…

I love the regular unit tests, but the opengl_draw tests are plain awful, bordering on unusable.

brecht · July 1, 2018, 11:42am

Ok, it seems some things got broken or never fully worked on Windows, I’ll look into fixing it.

Even if you don’t run all the tests, if you have just one folder with a variety of .blend files it can help to spot issues early.

LazyDodo · July 1, 2018, 3:42pm

Given most of the changes rely on changing calls from GL_* defines to GPU_* enums, I threw in some asserts to validate the parameters and ran the tests in debug mode. I still found a few issues I had overlooked, without having to rely on visual comparison.
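A minimal sketch of that kind of parameter check, assuming the new enum uses small consecutive values (the names ExampleState and example_state_set are hypothetical). A leftover GL_* define such as GL_BLEND, whose standard value is 0x0BE2, then falls far outside the valid range and trips the assert in a debug build.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical GPU_*-style enum; small consecutive values on purpose. */
typedef enum ExampleState {
  EXAMPLE_STATE_BLEND = 0,
  EXAMPLE_STATE_DEPTH_TEST,
  EXAMPLE_STATE_COUNT
} ExampleState;

static bool example_state_values[EXAMPLE_STATE_COUNT];

/* True only for values that belong to the enum. */
bool example_state_is_valid(int value)
{
  return value >= 0 && value < (int)EXAMPLE_STATE_COUNT;
}

void example_state_set(ExampleState state, bool enable)
{
  /* Active in debug builds; compiled out when NDEBUG is defined. */
  assert(example_state_is_valid((int)state));
  example_state_values[state] = enable;
}

bool example_state_get(ExampleState state)
{
  assert(example_state_is_valid((int)state));
  return example_state_values[state];
}
```

This only catches mistakes at run time on code paths the tests actually exercise, which matches the post's point that it complements visual comparison and does not replace it.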

I also noticed that some tests are run incorrectly: render_passes.blend just takes a screenshot of the node setup, when what is supposed to happen is a render, followed by validating that the node previews all have an image.

Also the startup time for each test is rather significant. I’m kinda leaning towards starting blender once, scripting a bunch of stuff (maybe record mouse+keyboard information from ghost and play it back in a replay session?) and taking screenshots along the way, but I haven’t really taken the time to see if the supporting infrastructure is ready for such a thing yet. It’s one of those ‘maybe one day when I have time’ projects.

Psy-Fidelity · December 8, 2018, 8:26pm

Hi

I had some time a few months back to convert some old hobby project to support cross API rendering between OpenGL and Vulkan. So here are my 10 cents, in case they are useful:

Vulkan is centered around the concept of pipelines. A pipeline encapsulates a lot of state that blender currently sets dynamically, relying on OpenGL to handle the internal setup of the GPU. This state includes: shaders, blend factors/enables, framebuffer reads/writes, data format of vertex attribute inputs, format of descriptors (this refers to any data that is passed to shaders in uniform/storage buffer or texture form), samplers. This means that you need a different pipeline object for every permutation of those states. Even shaders used with different vertex formats or blend states need a separate pipeline. Some states of the pipeline however, such as the viewport, can be set dynamically.
For blender you need to investigate what render state the built-in shaders are used with and create the appropriate pipelines up front (similarly to how built-in shaders are currently stored). Alternatively, there can be a hashmap cache, with the hash calculated from the render state. It might make sense to implement both these solutions - for example dynamic material shaders might benefit from a cache.
Much trickier is the fact that pipelines need renderpass information during creation. Renderpasses in Vulkan describe the layout of your frame. This simply means that any kind of dynamic reconfiguration of renderpasses would require pipelines in that renderpass to be recreated.
Renderpasses themselves are tricky. The closest concept in OpenGL are framebuffer objects, but renderpasses require you to create a “template” of how your frame will be structured up front, as opposed to dynamic configuration that OpenGL allows. In some cases, that structure is known up-front, but in others it may not be - for instance in cases where blender overlay passes are changed dynamically.
Vulkan requires you to specify up front how many resources (textures/buffers) to allocate per pipeline per frame. That is because resource pools have a maximum capacity. Again, there has to be a way to track this knowledge, which blender code does not have yet. Alternatively creating many pools dynamically might work.
Vulkan’s descriptors need different GLSL syntax so shader source is different between OpenGL and Vulkan. You might try to hide those behind defines but it’s not trivial. Keeping shader source usable for both Vulkan and OpenGL can be a major PITA. You might switch to another shading language but the point is: The shading source will need changes.
Vulkan draws with command buffers, where commands are buffered and later executed. Basically all data preparation (and streaming to GPU!) has to be done up front. Again, blender has no notion of the command buffer and OpenGL commands are sprinkled all over the place.
Vulkan requires manual synchronization between CPU and GPU. This means the implementation has to take care of window buffer swapping, through what is called a swapchain in Vulkan jargon. It also has to manage resource lifetime: you should not delete or overwrite resources that might still be processed by the GPU, which means that either every resource needs to track the last frame it was used in, or the frame itself should track a list of resources bound to it. The components required for this are not unlike thread synchronization primitives.
Vulkan requires manual memory allocation, but that is the least of your worries right now. The component can be seamlessly hooked up to any GPU memory requests (such as creation of GPU_Buffer or GPU_Texture). There is even an open source AMD library that can handle that for you.
Vulkan resources can change layout when their use case changes (for example a buffer is used as a copy operation destination, then as source for shader operation) - this is called layout change/memory barrier. There is something similar in OpenGL but managing it is not usually required of the programmer. In Vulkan it is.
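The hashmap cache mentioned in the points above could be sketched as follows. This is a toy under stated assumptions: ExampleRenderState is a hypothetical and heavily reduced stand-in for the state baked into a pipeline, an integer stands in for a VkPipeline handle, and the fixed-size table uses linear probing and never grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the state that is baked into a pipeline. */
typedef struct ExampleRenderState {
  uint32_t shader_id;
  uint32_t vertex_format_id;
  uint32_t blend_mode;
  uint32_t render_pass_id;
} ExampleRenderState;

typedef struct ExamplePipelineEntry {
  ExampleRenderState key;
  int pipeline; /* Stand-in for a VkPipeline handle. */
  int used;
} ExamplePipelineEntry;

#define EXAMPLE_CACHE_SIZE 64u /* Power of two; the sketch never grows the table. */

typedef struct ExamplePipelineCache {
  ExamplePipelineEntry entries[EXAMPLE_CACHE_SIZE];
  int created; /* Number of pipelines actually built. */
} ExamplePipelineCache;

/* FNV-1a, fed field by field so struct padding is never hashed. */
static uint32_t example_hash_u32(uint32_t hash, uint32_t value)
{
  for (int i = 0; i < 4; i++) {
    hash ^= (value >> (8 * i)) & 0xFFu;
    hash *= 16777619u;
  }
  return hash;
}

static uint32_t example_state_hash(const ExampleRenderState *s)
{
  uint32_t hash = 2166136261u;
  hash = example_hash_u32(hash, s->shader_id);
  hash = example_hash_u32(hash, s->vertex_format_id);
  hash = example_hash_u32(hash, s->blend_mode);
  hash = example_hash_u32(hash, s->render_pass_id);
  return hash;
}

static int example_state_equal(const ExampleRenderState *a, const ExampleRenderState *b)
{
  return a->shader_id == b->shader_id && a->vertex_format_id == b->vertex_format_id &&
         a->blend_mode == b->blend_mode && a->render_pass_id == b->render_pass_id;
}

/* Return the cached pipeline for this state, creating it on first use.
 * Returns -1 if the table is full. */
int example_pipeline_get(ExamplePipelineCache *cache, const ExampleRenderState *state)
{
  assert(cache != NULL && state != NULL);
  uint32_t slot = example_state_hash(state) & (EXAMPLE_CACHE_SIZE - 1u);
  for (uint32_t probe = 0; probe < EXAMPLE_CACHE_SIZE; probe++) {
    ExamplePipelineEntry *entry = &cache->entries[(slot + probe) & (EXAMPLE_CACHE_SIZE - 1u)];
    if (!entry->used) {
      /* Cache miss: a Vulkan backend would call vkCreateGraphicsPipelines here. */
      entry->key = *state;
      entry->pipeline = cache->created++;
      entry->used = 1;
      return entry->pipeline;
    }
    if (example_state_equal(&entry->key, state)) {
      return entry->pipeline;
    }
  }
  return -1;
}
```

Note that render_pass_id is part of the key, which reflects the point above that reconfiguring render passes forces new pipelines to be created.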

As you can see there is a lot of non-trivial refactoring to do if you want proper Vulkan support.
You can always regenerate state on demand and create a wrapper system that will act like OpenGL for you, but in my opinion this defeats the purpose of using something like Vulkan in the first place. The extra complexity is just not worth it. Unless there is a good open source library that could handle this for blender? I’m not aware of one though.

So my proposed checklist for you, should you even want to go to such troubles, which I honestly can’t tell if it would be a good idea, is:

Write a small working Vulkan application to get a feel for the API. This is very important, as the API is very hard. I get the feeling from the comments that no one has actually done that yet.
Find shader/render state/renderpass permutations and create any immutable pipelines up front. Dynamic pipelines may need to be handled differently. Group and optimize your draw calls around pipeline changes.
Lay down your frame structure and create renderpasses as appropriate. The draw engine code maps quite well to this I think, but the trouble with Vulkan is that all drawing (even UI) has to go through renderpasses.
Create a command buffer abstraction
Create swapchains for GHOST windows - it could be part of a VulkanContext class.
You should somehow track and either garbage collect or recycle frame resources such as descriptors used during each frame.
You’ll need to integrate glslang - the GLSL compiler for Vulkan - in blender.
You need to add descriptor information to uniform/sampler declarations in shader source.
You’ll need a system to handle data streaming (textures/buffers) to the GPU.
Your render abstraction will need to manage layout changes for your resources. That means it needs enough information on how various resources will be used during the frame.
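For the resource tracking item in the checklist above, one common shape is a deferred-free queue keyed by frame number. The sketch below is only an assumption about how that could look: the names are hypothetical, an integer stands in for a real handle, and completed_frame is passed in directly where a real backend would derive it from a fence or similar synchronization primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_PENDING 128

typedef struct ExamplePendingFree {
  int resource;             /* Stand-in for a buffer/texture/descriptor handle. */
  uint64_t last_used_frame; /* Last frame whose commands reference the resource. */
} ExamplePendingFree;

typedef struct ExampleFreeQueue {
  ExamplePendingFree items[EXAMPLE_MAX_PENDING];
  int count;
} ExampleFreeQueue;

/* Request deletion; the resource stays alive until the GPU is done with it.
 * Returns 1 on success, 0 if the queue is full. */
int example_free_later(ExampleFreeQueue *queue, int resource, uint64_t last_used_frame)
{
  assert(queue != NULL);
  if (queue->count >= EXAMPLE_MAX_PENDING) {
    return 0;
  }
  queue->items[queue->count].resource = resource;
  queue->items[queue->count].last_used_frame = last_used_frame;
  queue->count++;
  return 1;
}

/* Destroy everything last used in a frame the GPU has finished.
 * Returns the number of resources released. */
int example_collect(ExampleFreeQueue *queue, uint64_t completed_frame)
{
  assert(queue != NULL);
  int kept = 0;
  int freed = 0;
  for (int i = 0; i < queue->count; i++) {
    if (queue->items[i].last_used_frame <= completed_frame) {
      freed++; /* Real code would destroy the underlying GPU object here. */
    }
    else {
      queue->items[kept++] = queue->items[i]; /* Still in flight: keep it queued. */
    }
  }
  queue->count = kept;
  return freed;
}
```

Calling example_collect once per frame, after checking which frame the GPU has finished, keeps deletion off the hot path and avoids freeing anything still in flight.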

kvark · March 13, 2019, 2:00am

Looks like the issues of transition are already discussed in great detail here. I agree that the best path from GL to Vulkan would be first refactoring the GL implementation to operate with Vulkan-like concepts, step by step. As in - bringing internal abstractions for command buffers, render passes, and pipelines, and making them run on GL before moving on.
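A very small sketch of what "command buffers running on GL first" might mean in practice: drawing code records commands into a buffer, and a backend replays them at submit time. Everything here is hypothetical (ExampleCmdBuffer, ExampleBackend, the two command types), and the included backend only counts what it receives so the behaviour can be checked without a GPU.

```c
#include <assert.h>
#include <stddef.h>

typedef enum ExampleCmdType {
  EXAMPLE_CMD_BIND_PIPELINE,
  EXAMPLE_CMD_DRAW
} ExampleCmdType;

typedef struct ExampleCmd {
  ExampleCmdType type;
  int arg; /* Pipeline id or vertex count, depending on type. */
} ExampleCmd;

#define EXAMPLE_MAX_CMDS 256

typedef struct ExampleCmdBuffer {
  ExampleCmd cmds[EXAMPLE_MAX_CMDS];
  int count;
} ExampleCmdBuffer;

/* Backend callbacks: an OpenGL backend would issue gl* calls immediately,
 * a Vulkan backend would record into a VkCommandBuffer instead. */
typedef struct ExampleBackend {
  void (*bind_pipeline)(void *user, int pipeline);
  void (*draw)(void *user, int vertex_count);
  void *user;
} ExampleBackend;

/* Record a command for later execution. Returns 1 on success, 0 if full. */
int example_cmd_record(ExampleCmdBuffer *cb, ExampleCmdType type, int arg)
{
  assert(cb != NULL);
  if (cb->count >= EXAMPLE_MAX_CMDS) {
    return 0;
  }
  cb->cmds[cb->count].type = type;
  cb->cmds[cb->count].arg = arg;
  cb->count++;
  return 1;
}

/* Replay all recorded commands on the backend, then reset the buffer. */
void example_cmd_submit(ExampleCmdBuffer *cb, const ExampleBackend *backend)
{
  assert(cb != NULL && backend != NULL);
  for (int i = 0; i < cb->count; i++) {
    switch (cb->cmds[i].type) {
      case EXAMPLE_CMD_BIND_PIPELINE:
        backend->bind_pipeline(backend->user, cb->cmds[i].arg);
        break;
      case EXAMPLE_CMD_DRAW:
        backend->draw(backend->user, cb->cmds[i].arg);
        break;
    }
  }
  cb->count = 0;
}

/* Counting backend, for illustration only. */
typedef struct ExampleStats {
  int last_pipeline;
  int vertices;
} ExampleStats;

void example_stats_bind(void *user, int pipeline)
{
  ((ExampleStats *)user)->last_pipeline = pipeline;
}

void example_stats_draw(void *user, int vertex_count)
{
  ((ExampleStats *)user)->vertices += vertex_count;
}
```

Nothing reaches the backend until example_cmd_submit is called, which is the behavioural change existing immediate-mode drawing code would have to adapt to.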

One thing that wasn’t mentioned is pipeline layouts (VkPipelineLayout). You’d want to figure those out up front, so that switching from one pipeline to another involves minimal descriptor set rebinding. Of course, you’d also want to have the shaders written to match the layout (via explicit binding locations), as opposed to introspecting the shaders to figure out the layout at run time.

Finally, you may consider dropping GL backend entirely upon adding Vulkan support, as an option to ease maintenance. Vulkan Portability initiative can provide you access to platforms where Vulkan isn’t natively available. For example, gfx-portability runs on Metal, DX12, DX11, and OpenGL/WebGL (to a lesser extent). Of course there are issues, but we are committed to address them in case Blender discovers them.

newin · July 28, 2019, 8:59pm

Hi folks!

I have some experience with Vulkan and I’m hyped to work on some experiments to make Vulkan in blender a reality.
For me it has some advantages I’d love to see, like compatibility with tools such as Radeon GPU Profiler (so I will be able to make patches about GPU performance in the future), or easily supporting new APIs through a driver-like compatibility layer such as MoltenVK (if the licence is compatible?).

My question is: has some work been done already? I didn’t find any branch or anything, but I’m very new here (my first post actually), so if the branch has a special name or if I just missed it, I’m sorry.

LazyDodo · July 28, 2019, 9:09pm

I did some cleanup trying to drive most of our direct opengl usage into a minimal number of libraries, but never finished it (it kept interfering with actual work being done) beyond that i’m not aware of any work in this area.

newin · July 28, 2019, 9:26pm

Okay, then I think I’m gonna start this locally and see how it goes. I don’t think it will be necessary to officially start a branch if I’m working alone and nothing is gonna work for a very long time.
Unless anyone wants to follow along, in which case we’d need something online.

Zingam · September 19, 2019, 5:48am

I’d rather have HLSL

newin · September 19, 2019, 11:27am

I’m not sure why: the whole (Eevee) pipeline is in GLSL, and supporting SPIR-V only takes some minimal changes (such as the #version at the beginning and some explicit binding layouts). If we choose HLSL over GLSL we would need either some heavy macro processing to make each shader file support both, or to write shaders in a whole other language and cross-compile them into GLSL and HLSL, or to maintain a completely different set of files. All of those options are insane compared to the minimal changes required to make a shader compatible with glslang and SPIR-V. Especially when we consider the actual differences: you gain a few minor things like saturate (which is rewritten in blender shader code anyway) and lose a few minor things like syntactic sugar for matrix multiplication. The rest is a matter of formality, like texture() versus texture.Sample() or vec3 versus float3 (and a far less easy to read layout configuration IMO).
And last but not least: it’s gonna add a dependency on another project, despite glslang/glslangValidator already shipping with the Vulkan SDK.
I don’t see any actual advantage to using HLSL here. Maybe on some other project that has to support DX11/DX12, or that was written for DX in the first place, or that even has to support other shading languages (such as Cg or PSSL), but here it’s a layer of complexity for no reason that I know of.

Is there a reason why you want HLSL so much ?

newin · September 19, 2019, 9:10pm

That may sound silly, but I think we should try to implement Vulkan as a replacement for OpenCL first, as it just runs a dispatch for a specific piece of work instead of a whole render pipeline architecture like Eevee. It could be much easier to implement, much of the bulk integration work would be done, and the compute shader pipeline is a fair bit easier to put in place (no raster config, no IA, simpler command buffers, etc.).
It’s not unheard of: there is a project to make OpenCL converge with Vulkan, with some plans for interoperability between both. (I also found some people saying there is a perf gain to that, but I’m not convinced: https://community.khronos.org/t/opencl-vs-vulkan-compute/7132/2 )

Additionally, if we end up having a single API that runs both compute and rendering, it could be quite awesome for both programmers and end users (there is nothing additional to enable or install as with OpenCL, and no choosing between CUDA and OpenCL): we would just have to test the availability of a compute queue family and display “GPU” in the Cycles dropdown accordingly.

And last but not least: if we choose to enable MoltenVK on macOS, it could bring back proper GPU compute on Mac (OpenCL being broken and deprecated).

Doing this feels like a win-win while being a smoother starting point. If anyone agrees, I may try to mirror the OpenCL implementation this weekend and see what happens.

Edit: additional point: we could even go as far as merging some compositor effects so they can run as compute shaders (and async compute) in the Eevee pipeline, with some performance vs quality settings (so you could choose to have a bokeh DOF of a few ms in Cycles or about a second in Eevee if you want to), and bridge the gap between Eevee and Cycles even more.
