2025-01-14 Shape Keys Performance

The issue is not with the idea itself (this is an agreed mid- to long-term goal), but with the scope of such a project vs. the available resources this year to work on it.

Refactoring a (very!) old data-block type with its (painful!) baggage of legacy code accumulated over decades is not a small task. Converting any data-block to something completely different (not an ID anymore, but in this case an attribute) is also a fairly heavy task. Combining both would be a major project, requiring months of dev-time to be completed - in the best-case scenario. It involves changes to the animation system, massive parts of the UI/UX codebase (editors, edit modes and operators), ID management, complex versioning… while ensuring total feature parity!

And the new attribute system currently has limitations (like the lack of namespaces) that would also need to be addressed for it to be usable for something like shapekeys (or vertex groups, which are another candidate for attributes).

17 Likes

Together with Nathan, I’ve been testing a prototype (‘Spring’) shapekey based rig and trying to benchmark it by:

  • duplicating the rig (locally), each duplicate having its own animation action: after 20 duplicates (4000 shapekeys in total) the viewport on my machine would still run it above 24 fps.
  • regarding data size, 1 facial rig with bones and 200 (driven) shapekeys can be saved to a 31 MB blendfile.
  • manipulating shapekeys for animation in the viewport stays very responsive.

Nathan concluded that performance and filesize are not an issue for low poly count facial rigging and animation based on the testing we did together.
After talking to @mont29, the benchmark will be the character from Charge (Einar), which has a relatively high polycount, set up with a shapekey system that resembles a production-ready facial rig. Found here. However, it won’t include other factors like body rig, hair, extra deformers etc., in order to isolate performance measurements for shapekeys.
This benchmark would give us clarity on whether we might be trying to solve a problem that doesn’t exist, in which case the developers’ precious time is better spent on other matters that need their brains.

I’m also going to make a design plan for the character facial rigging workflow (in consultation with Nathan and Met) that hopefully gives a clear view of what improvements are required to accommodate it regarding shapekey management, data optimization and additional tooling.

8 Likes

There is one big limitation of the current system that has been a thorn in the side of riggers for a long time: the blend weights are stored on the shapekey datablock itself. This makes it impossible to share a mesh with corrective shapekeys between characters, as it is impossible to set different blend weights for different users of the mesh.

If the data model for shapekeys is going to change, it would be great if it were to change in a way that solves this problem. The solution doesn’t necessarily have to be implemented as part of this change, but please keep it in mind when designing a new data model.
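To make the request concrete, here is a minimal sketch of what "weights per user instead of per shapekey datablock" could look like. Everything in it (`SharedShapeKeys`, `MeshUser`, `evaluate`) is a hypothetical name made up for illustration and is not Blender's data model or API; it also assumes every key is relative to the basis.

```python
from dataclasses import dataclass, field

Vec3 = tuple[float, float, float]


@dataclass
class SharedShapeKeys:
    """Hypothetical shared mesh data: shape positions only, no blend weights."""
    basis: list[Vec3]
    keys: dict[str, list[Vec3]]


@dataclass
class MeshUser:
    """Hypothetical per-object data: points at shared shapes, owns its weights."""
    shapes: SharedShapeKeys
    weights: dict[str, float] = field(default_factory=dict)

    def evaluate(self) -> list[Vec3]:
        # Start from the basis and add each key's weighted offset.
        result = [list(co) for co in self.shapes.basis]
        for name, weight in self.weights.items():
            key = self.shapes.keys[name]
            for i, (b, k) in enumerate(zip(self.shapes.basis, key)):
                for axis in range(3):
                    result[i][axis] += weight * (k[axis] - b[axis])
        return [tuple(co) for co in result]


# One set of shapes, shared by two characters with different weights.
shapes = SharedShapeKeys(
    basis=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    keys={"smile": [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]},
)
hero = MeshUser(shapes, {"smile": 1.0})
extra = MeshUser(shapes, {"smile": 0.25})
```

Because the weights live on the user rather than on the shared shape data, `hero` and `extra` deform the same mesh differently without duplicating any shape.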

20 Likes

Not related to the performance of shape keys but rather to shape key “management”… There is a free (and optionally paid) add-on which has the structure and a few other features (such as batch tools) that the shape keys panel needs by default, to tidily organize into folders the hundreds of shapes that a character model can have. You can download the code and use it to test a built-in prototype for Blender (maybe kindly ask the add-on dev for permission, but imo that is not necessary since this is already open source code anyway).

Blender Addon : Lazy Shapekeys

[Lazy Shapekeys] Add-on for Shape Keys to Divide Folders, Force Transfer, and Individually Separate Objects [Blender Add-on] – Oblivion Summary

3 Likes

For benchmarking shapekey performance, I’ve slapped together a basic shapekey based rig with some basic deformations driven by bones for the mouth, lips, mouth corners, nostrils, cheeks and brows. The face mesh (extracted from the character from the open movie ‘Charge’) contains over 63,000 vertices, which is quite overkill for any facial rigging mesh (the movie base mesh has around 4000 verts). (EDIT: Apparently not for AAA VFX movies, according to dan2) For performance’s sake, I’ve simply applied 2 subdiv modifiers and built all the shapes at that same resolution. The face mesh has over 100 shapekeys, each of them containing the same number (63,618) of verts. All of the shapekeys are driven by the position of the control bones. Normally a feature film facial rig would contain twice as many shapekeys, but never this many verts. I’m running this test on my laptop with medium specs (2019):

Operating system: Windows-10-10.0.19045-SP0 64 Bits
Graphics card: NVIDIA GeForce GTX 1660 Ti with Max-Q Design/PCIe/SSE2 NVIDIA Corporation 4.6.0 NVIDIA 512.78 Opengl Backend
16GB RAM
Processor: AMD Ryzen 7 4600HS
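For a sense of scale, the shapekey data in this test can be estimated with some quick arithmetic. This is a rough sketch that assumes each shapekey stores one 3-component 32-bit float position per vertex and ignores any file or memory overhead; the key count of 100 is the lower bound from "over 100 shapekeys", and `shapekey_bytes` is just an illustrative helper.

```python
BYTES_PER_FLOAT32 = 4
COMPONENTS_PER_VERTEX = 3  # assumed: x, y, z position per vertex


def shapekey_bytes(vertex_count, key_count):
    """Estimated raw position data for key_count shapekeys, in bytes."""
    return vertex_count * COMPONENTS_PER_VERTEX * BYTES_PER_FLOAT32 * key_count


per_key = shapekey_bytes(63_618, 1)
total = shapekey_bytes(63_618, 100)
print(f"{per_key / 1024:.0f} KiB per key, {total / 1024**2:.1f} MiB for 100 keys")
```

Under those assumptions the raw shape data alone is in the tens of megabytes, all of which has to be read whenever every key contributes to the result.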

For the test I’ve been keying all the bones, while setting playback to 60fps (as seen in the HUD of the 3D viewport). I’m linking the rig file into an empty scene.

Click here to watch the recording:
Einar_shapekey_benchmark_test

As seen in the video, when any shapekey has been manipulated, the framerate drops down from 60fps to around 40fps.

In my opinion, shapekey performance for facial rigging is not a bottleneck at the moment. As seen in the video, with 1 rig the framerate doesn’t drop below 40fps on a medium laptop. With 4 duplicates it drops to 18fps, which is still not bad. Of course we need to take into account that this is not a full character rig, but simply an isolated test solely focused on shapekey performance. If we want to spend time and resources on improving the workflow using shapekeys, there’s much more to be gained from improving shapekey management and from the majority of the suggestions above. Hope this helps, and let me know if I need to include the blendfile somewhere.

EDIT: you can download the blendfile here
The file itself contains all the external shapekey meshes, so it’s recommended to only link the CH-Einar_face collection into a fresh blendfile in order to do animation tests.

9 Likes

Which, let’s face it, given the fairly ‘average’ PC specs (with a somewhat overkill vertex count), and I assume OBS running at the same time to record it, combined with the fact that we largely animate/preview at 24fps, then yeah, I really don’t see that as a performance issue (which I didn’t expect it to be).

I still think it would be nice to be able to download the animated/keyed file so others can hit play and add more data points.

Outside of some really old hardware, if someone else doesn’t get the same or better, then that could maybe start to point to a specific bottleneck/bug in a rarely used area of the code.

6 Likes

@Rikstopher I took your file, animated all bones and ran it through a profiler.

single core

multi core

So judging by this, most time is actually spent drawing the frame?
You can’t see it in the screenshot because the callstack is so tall, but a significant portion of the drawing is spent in blender::bke::mesh::normals_calc_verts

But that is of course a debug build, let’s get some numbers from a release build by using SCOPED_TIMER_AVERAGED.

CPU: AMD Ryzen 9 9900X 12-Core Processor

- Average
DRW_draw_view 1.95ms
normals_calc_verts 0.54ms
BKE_key_evaluate_object_ex 2.47ms

So in a release build the story shifts a bit. 2.5ms for BKE_key_evaluate_object_ex is still not bad, but there is room for improvement. As far as I can tell the loops in static void key_evaluate_relative are not threaded, which might be an easy win. But looking at the code, it could use a face lift in general.
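To illustrate why threading such a loop looks like an easy win: each output vertex depends only on its own entries in the basis and the keys, so the vertex range can be cut into independent chunks. The sketch below is not Blender's `key_evaluate_relative`; it is a simplified stand-in in which every key is treated as relative to the basis, and `evaluate_range` and `evaluate_chunked` are hypothetical names. Python threads will not actually speed this up; the point is only the partitioning.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_range(basis, keys, weights, start, stop):
    """Blend vertices [start, stop): basis + sum(weight * (key - basis))."""
    out = []
    for v in range(start, stop):
        co = list(basis[v])
        for key, w in zip(keys, weights):
            if w != 0.0:  # keys with zero weight contribute nothing
                for axis in range(3):
                    co[axis] += w * (key[v][axis] - basis[v][axis])
        out.append(tuple(co))
    return out


def evaluate_chunked(basis, keys, weights, chunk_size=2, workers=4):
    """Same result as evaluate_range over all vertices, one task per chunk."""
    n = len(basis)
    starts = range(0, n, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(
            lambda s: evaluate_range(basis, keys, weights, s, min(s + chunk_size, n)),
            starts,
        )
        # map() yields chunks in submission order, so vertices stay ordered.
        return [co for chunk in chunks for co in chunk]


basis = [(float(i), 0.0, 0.0) for i in range(5)]
keys = [
    [(float(i), 1.0, 0.0) for i in range(5)],
    [(float(i), 0.0, 2.0) for i in range(5)],
]
weights = [0.5, 0.25]

serial = evaluate_range(basis, keys, weights, 0, len(basis))
chunked = evaluate_chunked(basis, keys, weights)
```

Since no chunk writes to another chunk's vertices, no locking is needed, which is what makes this kind of loop a typical candidate for a parallel-for.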

Out of curiosity I ran that with Blender restricted to 4 threads to simulate a less powerful CPU.

- Average
DRW_draw_view 2.82ms
normals_calc_verts 1.43ms
BKE_key_evaluate_object_ex 2.15ms

Oddly enough, BKE_key_evaluate_object_ex is faster with fewer threads. I ran that a few times just to sanity check: always the same result, within margin of error.
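To put the averages above in perspective, they can be compared against the time available per frame at a given playback rate. This is plain arithmetic on the release-build numbers quoted above (measured on the Ryzen 9 9900X, not on the laptop from the video); `frame_budget_ms` and `budget_share` are illustrative helper names.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given playback rate."""
    return 1000.0 / fps


def budget_share(cost_ms, fps):
    """Fraction of one frame's budget consumed by a cost in milliseconds."""
    return cost_ms / frame_budget_ms(fps)


# Averages from the unrestricted release-build run above, in milliseconds.
timings_ms = {
    "DRW_draw_view": 1.95,
    "normals_calc_verts": 0.54,
    "BKE_key_evaluate_object_ex": 2.47,
}

for fps in (24, 60):
    for name, cost in timings_ms.items():
        print(f"{fps} fps: {name} uses {budget_share(cost, fps):.1%} of the frame")
```

At 24 fps the budget is roughly 41.7 ms per frame, so on that machine shapekey evaluation takes only a small slice of it.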

5 Likes

You should compile with CMAKE_BUILD_TYPE=RelWithDebInfo to get a release build with symbols. As you pointed out, the flamegraphs aren’t really meaningful otherwise :smiley:

3 Likes

So I did just that and quickly added my own silly/pointless animation loop that moved around all the control bones.

Long story short, on viewport playback I hit 100FPS (it is possible that this is being synced to my monitor, which is also set to 100 Hz).

As you can see, this is on a Ryzen 9700X, with largely a single core doing all the work (but it’s long been known that some parts of Blender are very much single-core limited; possibly that is just the way it has to be).
My GPU (an RTX 3080 Ti) is barely breaking a sweat, but then this is largely vertex/CPU calculation, so that’s nothing surprising.

Could some code updates etc. be done to improve performance? Very likely. Do I need it, or would I even notice if it was faster? Nope.

4 Likes

I had a follow-up meeting with Rik today which went over the management features a bit more: 2025-02-13 Shape Keys Management

2 Likes

Unfortunately that is not correct, at least in feature VFX it isn’t. Most head meshes I worked with at major studios were 60-120k faces (just the head, like Hulk here for example at the beginning), and hero ones had shape key counts up to 5-6000. That is with correctives and inbetweens.

5 Likes

Wow, that’s insane! I stand corrected :D. I’ve been looking at rigs from Spider-Man: Far From Home at Sony, but for animation these were quite optimized, though they probably had the same res when rendered. VFX is indeed a different beast (no pun intended).