Exploration of SSE optimizing some bf_blenlib math routines

jacqueslucke · December 27, 2019, 11:10am

Personally, I think padding 3d vectors with an additional fourth float is not a good idea in most cases. My main reason is that it encourages people to do a “wrong” kind of vectorization. By wrong I mean suboptimal and complex.

Here is a simplified view of why I think so (I originally wrote quite a bit more, but could not finish that yet).

To optimize performance, developers should focus on interleaving the processing of many elements, instead of trying to optimize the latency of processing a single element.

I think that padding 3d vectors with a fourth float encourages developers to do the opposite. It makes it look like a single vector can now be processed much faster, when in reality it usually can’t. That is because most non-trivial algorithms do different operations on the x, y and z coordinates, so the four lanes of one register rarely end up doing the same work. A better approach is to always process multiple vectors at once. Also see this guide on optimizing normalization of many vectors.

Furthermore, in my opinion, functions that process 3d vectors but require 4d vectors as input (maybe even without telling the caller) have a bad contract and should be avoided.
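As a small illustration of the contract point, compare a function that only promises to read three floats with one that states its padding requirement in the signature. The names `Float3Padded`, `dot_v3` and `dot_v3_padded` are hypothetical and only serve this sketch.

```c
/* Hypothetical type: the fourth float is visible to every caller. */
typedef struct Float3Padded {
  float x, y, z, pad;
} Float3Padded;

/* Honest contract: reads exactly three floats from each argument.
 * An implementation that secretly loaded 16 bytes from these pointers
 * would read past the end of a plain float[3]. */
static float dot_v3(const float a[3], const float b[3])
{
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* If padding is required, say so in the type. An implementation is then
 * free to load all 16 bytes, because the caller had to provide them. */
static float dot_v3_padded(const Float3Padded *a, const Float3Padded *b)
{
  return a->x * b->x + a->y * b->y + a->z * b->z;
}
```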
