EEVEE shader compilation process information needed

3di · April 14, 2021, 12:08pm

I haven’t yet tried to figure out the EEVEE shader compilation process, but before I waste time (particularly as I’m not proficient in C), could anyone tell me if it would be possible to calculate groups only once, so that when connecting something to one of the group’s outer sockets, the only recalculation that takes place is appending the new node’s code to the previous state?

I’m guessing it should be possible to avoid recalculating the group, because it’s just changing the source of data flowing into the group from a group slider, to a node output (the type of data would remain the same)

I’m thinking calculate the group once per Blender session, unless changes are made to the group internally, and then any material which uses that group, would use that initially calculated data as well. At the moment, if I have 10 materials all using the same group, but each material has different nodes connected to the group, then the group is recalculating 10 times (taking around 5 mins)

brecht · April 14, 2021, 1:23pm

No, it’s not practical.

3di · April 14, 2021, 1:32pm

Darn it. Ok, thanks.

brecht · April 14, 2021, 1:32pm

The way game engines solve this is with a concept of shaders and shader instances, where for every shader you define a shader graph and are able to vary some specified parameters and image textures for each instance. But each instance of a particular shader should use the same number of image textures, and be quite similar. One shader that has instances of type glass, skin, metal, … for example would not perform well; those would need to be separate shaders.

Something like that is possible to implement (though it has significant design implications). What’s not practical is to do it at the level of node groups within a larger shader network.
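The shader / shader instance split described above can be sketched roughly as follows. This is only an illustration of the concept, not how Blender or any particular game engine implements it; `compile_shader`, `CompiledShader`, `ShaderInstance` and `compile_log` are all hypothetical names, and the "compilation" is a stand-in for the expensive driver step.

```python
from dataclasses import dataclass, field

# Records every (expensive) compile so it is easy to see how often it happens.
compile_log = []


@dataclass(frozen=True)
class CompiledShader:
    """Stand-in for a compiled GPU program (hypothetical)."""
    source: str
    param_names: tuple


def compile_shader(source, param_names):
    """Pretend to compile a whole shader graph once."""
    compile_log.append(source)
    return CompiledShader(source, tuple(param_names))


@dataclass
class ShaderInstance:
    """Shares one compiled shader and only varies the exposed parameters."""
    shader: CompiledShader
    params: dict = field(default_factory=dict)

    def set_param(self, name, value):
        # Only parameters declared by the shader may vary per instance.
        if name not in self.shader.param_names:
            raise KeyError(f"{name!r} is not an exposed parameter")
        self.params[name] = value


def make_instances():
    """Build two instances that reuse a single compiled shader."""
    shader = compile_shader("/* full shader graph code */",
                            ["base_color", "roughness"])
    red = ShaderInstance(shader)
    red.set_param("base_color", (1.0, 0.0, 0.0))
    blue = ShaderInstance(shader)
    blue.set_param("base_color", (0.0, 0.0, 1.0))
    return red, blue
```

Both instances point at the same compiled object, so changing a parameter never triggers a recompile. Changing the graph itself (for example connecting a new node) would still mean compiling a new shader.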

3di · April 14, 2021, 1:48pm

Thanks. So when you say shader, do you mean material, or the individual bsdf nodes contained within the material?

brecht · April 14, 2021, 1:50pm

What I mean here is shader = the entire shader node graph of a material. Reusing things at the level of individual nodes is what is impractical.

3di · April 14, 2021, 2:00pm

Thanks, and it’s not possible to associate each part of a shader graph with the node it represents, and then during calculation, when the function responsible for updating the graph gets to that point, if the node/group is unchanged, it just inserts the unchanged portion of the graph from the last time it was calculated?

brecht · April 14, 2021, 2:04pm

OpenGL/Vulkan graphics drivers have no good mechanism for such partial updates to shader graphs.

3di · April 14, 2021, 2:05pm

Ah OK, so really it would need to be implemented in the OpenGL/Vulkan libraries.

Kenzie · April 14, 2021, 2:07pm

Best you can do in Vulkan is pipeline inheritance and a pipeline cache, to give the graphics drivers a head start on pipelines. Since Vulkan doesn’t actually care about or know about GLSL/HLSL/etc., you can do whatever you want with a shader compiler as long as it generates valid SPIR-V bytecode for Vulkan to digest. Blender uses OpenGL, however, and I am not so sure about what you can do with GL. I think there is limited SPIR-V support in some extensions, but it is probably not worth it.

3di · April 16, 2021, 7:09pm

I have noticed that if I create a new material, and add a group to it which has been used in another material, then it doesn’t take 26 seconds to calculate the shader graph (as it does when connecting a new node to one of the group’s external sockets); it’s immediate. How can this be possible if the shader graph represents a full material and partial updates aren’t possible? It seems to me it must be re-using the part of the shader graph from the other material in which the group was initially compiled. If that is the case, then I don’t get why the same can’t happen when adding a new node outside of a group.

I’m not doubting anyone by the way, I’m just trying to get a better understanding.

Kenzie · April 16, 2021, 7:57pm

I have a hunch…

Assuming the issue isn’t a bottleneck in Blender: OpenGL drivers tend to do a pretty bloated, opaque series of optimizations behind the scenes. Because the API’s global state machine has to be adapted to how modern GPUs actually work, nearly every draw call with a state change requires its own PSO (pipeline state object) that contains the machine code for your GPU to actually draw those triangles. Simply put, the OpenGL driver is nearly always compiling low-level shaders. Vulkan/DX12 don’t do this. Instead they (A) let you compile the human-readable shader to bytecode using your tools of choice, offline or in real time, which the driver then translates (rather than the driver compiling the source code), and (B) put the application in charge of creating and managing PSOs, on the assumption that it can do better, or do it at a more opportune time, and save them for multiple runs (which has been the cause of some controversy).

My theory (if the bottleneck isn’t Blender) is that when you create the node group, Blender adds the respective GLSL snippet to the final GLSL code for the fragment shader, and the driver (through some optimization likely made years ago) recognizes a similarity in the cache and is able to skip some of the work.

This is just my best guess.

3di · April 16, 2021, 8:09pm

So you have a material, which as Brecht explains is a shader graph, and you add a complicated group node which usually takes 26 seconds to compile, and you think OpenGL is recognising the group’s GLSL as similar to what it has in cache, so it uses that. So why doesn’t it recognise the same snippet when connecting a node to the group’s external socket, and just add the new node’s GLSL to the group’s GLSL it has in cache?

Would be great to solve it, because it’ll allow for much more compatibility between Cycles and EEVEE materials, without having huge waiting times for shader compilation.

3di · April 16, 2021, 8:11pm

Could OpenGL’s failure to see the similarity be caused by Blender presenting the data to it differently when connecting nodes to external sockets, vs adding a group to materials?

Kenzie · April 16, 2021, 8:17pm

It could be. The thing about OpenGL is that the drivers have grown so complex over 25 years, and these kinds of optimizations are up to AMD/NVIDIA/Intel to implement, so there often isn’t a whole lot you can do about it.

But I would benchmark it if you really wanted to see if the problem lies on their end. My theory is just that. It could very well just be a Blender-side bottleneck that is going unnoticed. Those 26 seconds are far too long and could be a bug in the driver you have installed. Who knows!!!? It’s just part of the “fun” of graphics programming.

3di · April 16, 2021, 8:18pm

Perhaps it can only see the similarity if the first material only has the group node and nothing else, so when the new material has a group added to it, it’s basically the same material with a different name. I’ll test now by adding nodes to the first material, creating a second material that uses the group alone, and see if it has to recompile for 26 seconds.

Kenzie · April 16, 2021, 8:21pm

The similarity is likely in something lower level than what you can see in the node graph. If you want to have a better look, hook up a printf() or something to the point at which Blender generates the actual GLSL code. If the driver has even more layers of fun in what it’s doing, even that might not be low level enough. You might not be able to do anything about it though.
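Once the generated GLSL for two materials has been dumped somehow, comparing the dumps is straightforward. A minimal sketch in Python, assuming the two sources are already available as strings (how to get them out of Blender is not shown here; `glsl_diff`, `same_code` and the file labels are hypothetical):

```python
import difflib


def glsl_diff(old_source, new_source):
    """Return unified-diff lines between two dumped shader sources."""
    return list(difflib.unified_diff(
        old_source.splitlines(),
        new_source.splitlines(),
        fromfile="material_a.glsl",  # label only, no file is read
        tofile="material_b.glsl",    # label only, no file is read
        lineterm="",
    ))


def same_code(old_source, new_source):
    """True when the two dumps are line-for-line identical."""
    return not glsl_diff(old_source, new_source)
```

An empty diff means both materials produced the same source text. Any non-empty diff shows exactly which generated lines changed after an edit to the node graph.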

3di · April 16, 2021, 8:29pm

Just tested my theory: adding the group to a new material when the only other material in the scene has the group + nodes plugged into it resulted in the new material compiling, but in around 2 seconds vs 26 seconds. So OpenGL must be able to recognise partial similarities and use cached GLSL rather than recalculating.

It’s a bit above my skill level to fix, but hopefully the thread will give someone food for thought

Kenzie · April 16, 2021, 8:32pm

Doesn’t seem like much to fix, just the OpenGL driver doing its job and speeding up recompiling of stuff. If it becomes too much of a problem, or you are working with the same files over and over again, ~~I believe Steam has a layer and options to save and load the driver’s GL shader cache for an app for faster re-use. It might be worth a shot.~~

*Edit: I was incorrect. The option Steam provides doesn’t force saving of the shader cache to disk; it appears to try to download a pre-compiled cache of shaders for your system from some database. This is mainly useful for getting past initial load times in video games, but isn’t particularly useful here in Blender.

brecht · April 17, 2021, 11:27am

Both Blender and (most) OpenGL drivers do some amount of caching. If the shader graph compiles to the exact same GLSL shader code as another material, that can speed things up.

On the OpenGL driver level, that could even be a material that you created a few days ago in another .blend file, since the cache is saved to disk.

Caching is not based on partial similarities; it must be an exact match. However, slightly different shader graphs may compile to the exact same code.
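To illustrate the exact-match behaviour, here is a minimal sketch of a cache keyed on a hash of the full generated source. It is not Blender’s or any driver’s actual implementation; keying on SHA-256 of the source text is an assumption made for the example, and `ExactMatchShaderCache`, `generate_source` and the GLSL strings are hypothetical.

```python
import hashlib


class ExactMatchShaderCache:
    """Toy cache: a hit requires byte-for-byte identical source."""

    def __init__(self):
        self._binaries = {}
        self.compiles = 0  # number of (pretend) expensive compiles

    def get(self, glsl_source):
        key = hashlib.sha256(glsl_source.encode("utf-8")).hexdigest()
        if key not in self._binaries:
            # Cache miss: the whole shader is compiled again.
            self.compiles += 1
            self._binaries[key] = f"binary:{key[:8]}"  # placeholder result
        return self._binaries[key]


# Hypothetical code emitted for the node group; identical in every material.
GROUP_CODE = "vec3 group_out = expensive_group(group_in);\n"


def generate_source(input_expr):
    """Pretend code generation: the group's input is the only thing varying."""
    return f"vec3 group_in = {input_expr};\n" + GROUP_CODE
```

With this sketch, two materials that both feed the group from `"slider_value"` generate identical text, so the second lookup is a hit. Feeding the group from a node output instead, say `"noise_texture(uv)"`, changes the text and therefore the key, and the whole shader compiles again even though the group’s own lines are unchanged.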
