I have committed C++ containers such as BLI::Vector and BLI::Set to the master branch over the last couple of months (after discussion with @brecht and @sergey). This has confused other developers for various reasons. To be fair, the plan for those data structures was never discussed or communicated publicly.
We can use this thread to discuss whether we want custom/native C++ containers in Blender. If yes, we also have to decide which ones and where they should be used. I’ll start the discussion by providing a couple of general aspects that need to be considered and my current opinion.
Advantages of a native container
- We can add all the methods we want to make them comfortable to use for the problems we have to solve.
- We can implement (optional) features such as small object optimization.
- We have greater control over performance when necessary.
- We use the same container implementation on all platforms.
Disadvantages of a native container
- We have to design and implement them ourselves.
- We have to ensure the correctness ourselves.
- Custom containers easily need a couple thousand lines of code that we have to maintain ourselves.
- We have to document the containers ourselves.
- Developers used to the standard library cannot apply this knowledge directly to our custom containers. So there is more learning overhead.
Alternatives for usage policies of native containers
- Use standard library containers by default. Only use a native container when there is a good reason. That reason should be explained in a comment.
- Use native containers by default. Only use a container from the standard library when there is a good reason. For example, you might have to interact with a third party library that expects standard library containers.
- Decide on whether standard library or native containers should be used on a per module basis.
Alternatives for naming conventions used by native containers
- Follow the type and method naming conventions of the standard library whenever possible.
- Follow the method naming conventions of the standard library, but use different type names (e.g. starting with an upper case).
- Follow the naming conventions of the standard library only when the behavior of a method matches that of the standard library exactly.
- Follow naming conventions of some other library, possibly from another language like Python.
- Follow naming conventions of C libraries that already exist in Blender.
- Don’t follow any particular naming conventions.
Advantages of wrapping the standard library
- It’s easier to make it a drop-in replacement for standard library containers.
- We can still add methods to make them more comfortable to use.
- Some features such as small object optimization can be added to some degree.
- Requires less code than implementing the containers entirely ourselves.
- It’s easier to write containers that are correct.
Disadvantages of wrapping the standard library
- No control over how data is laid out in memory.
- Uses different container implementations for different platforms.
What follows is my opinion on these topics. It might change over the course of the discussion.
- We can add all the methods we want to make them comfortable to use for the problems we have to solve.
- We can implement (optional) features such as small object optimization.
- We have greater control over performance when necessary.
Those aspects are quite important to me when I work with C++. Maybe those are not as important to others.
We use the same container implementation on all platforms.
This is not as important as the others, but it might be nice when we want to compare performance between different platforms.
We have to design and implement them ourselves.
I’ve done a good part of that last year already. My design decisions are open for discussion of course. We are in a state where things can still change relatively easily.
We have to ensure the correctness ourselves.
The containers already have fairly good unit test coverage. Furthermore, I’ve been using them for a couple of months now and I’m not aware of any bugs. While there are probably still some hidden bugs, I think we will find them relatively quickly once we decide to use the containers more in Blender.
Custom containers easily need a couple thousand lines of code that we have to maintain ourselves.
I think the maintenance cost of native containers is relatively low compared to most other code in Blender. That is because they do not depend on code that changes often.
We have to document the containers ourselves.
Individual methods often have comments already. High level documentation about the inner workings of the containers and when/how they should be used is lacking in many cases. It will be my responsibility to provide that documentation once we’ve come to conclusions in this thread.
Developers used to the standard library cannot apply this knowledge directly to our custom containers. So there is more learning overhead.
This is probably the biggest hurdle. It is yet another set of APIs that has to be learned by potentially every Blender C++ developer. Depending on our naming convention choice, this might be harder or easier. My opinion is probably quite biased here, because I know the API already. Personally, I think that learning something once and then benefiting from that in the long run is almost always worth the effort.
Alternatives for usage policies of native containers
When we have core native containers such as BLI::Vector, I think they should be used everywhere by default, at least in new code. The corresponding standard library containers might be necessary in some cases, but those use cases should be explained in comments.
Alternatives for naming conventions used by native containers
I don’t think we should follow the standard library naming conventions closely. While it is practical to have a drop-in replacement during the code transition period, I don’t think that alone is important enough. Furthermore, using the same method names as the standard library can be dangerous when a method does not do exactly what the standard describes.
I like the idea of loosely following the API names of Python lists, sets and dicts with names such as append, add and remove. Maybe this is because I know the Python API better. However, that might be the case for many existing Blender developers.
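To make the naming idea a bit more concrete, here is a minimal sketch of what Python-like method names could look like in C++. The classes ExampleVector and ExampleSet are hypothetical and are not the actual BLI containers; they wrap standard library containers only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical illustration only: these are NOT the real BLI classes.
// They just show Python-like method names on top of standard containers.
template<typename T> class ExampleVector {
 private:
  std::vector<T> data_;

 public:
  // Named after Python's list.append(x).
  void append(const T &value)
  {
    data_.push_back(value);
  }

  std::size_t size() const
  {
    return data_.size();
  }

  const T &operator[](std::size_t index) const
  {
    return data_[index];
  }
};

template<typename T> class ExampleSet {
 private:
  std::unordered_set<T> data_;

 public:
  // Named after Python's set.add(x). Returns true when the value is new.
  bool add(const T &value)
  {
    return data_.insert(value).second;
  }

  // Named after Python's set.remove(x), but only loosely: Python raises
  // KeyError for a missing key, this sketch returns false instead.
  bool remove(const T &value)
  {
    return data_.erase(value) > 0;
  }

  bool contains(const T &value) const
  {
    return data_.find(value) != data_.end();
  }
};
```

Note that remove deliberately behaves differently from Python here, which is why I would call this "loosely following" the names rather than copying the API.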
Wrapping standard library containers
Wrapping a simple data structure like std::vector
might make sense. Small object storage could be added, but with limitations. I’m not sure if proper copy and move semantics of a std::vector
can be implemented using e.g. this Stack Allocator. I think having a small buffer inside the vector by default that works well with moving/copying is useful. We can also make sure that this buffer is not larger than some constant by default using some compile time programming.
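As a rough sketch of the inline buffer idea (this is not the BLI::Vector implementation), the following hypothetical SmallVector stores a few elements inside the object and only allocates on the heap when it grows beyond that. The names SmallVector, kMaxInlineBytes and default_inline_capacity are made up for this example, the 64 byte limit is an arbitrary assumption, and the sketch only supports trivially copyable types and omits assignment to stay short. It assumes C++14 or later.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical limit for the default inline buffer size in bytes.
constexpr std::size_t kMaxInlineBytes = 64;

// Chooses the default inline capacity at compile time, so that the inline
// elements never take more than kMaxInlineBytes (0 elements for huge types).
template<typename T> constexpr std::size_t default_inline_capacity()
{
  return std::min<std::size_t>(4, kMaxInlineBytes / sizeof(T));
}

template<typename T, std::size_t N = default_inline_capacity<T>()> class SmallVector {
  static_assert(std::is_trivially_copyable<T>::value,
                "this sketch only handles trivially copyable types");

 private:
  std::array<T, N> inline_buffer_;
  T *data_;
  std::size_t size_ = 0;
  std::size_t capacity_ = N;

 public:
  SmallVector() : data_(inline_buffer_.data()) {}

  // Copying always duplicates the elements.
  SmallVector(const SmallVector &other) : SmallVector()
  {
    reserve(other.size_);
    std::copy_n(other.data_, other.size_, data_);
    size_ = other.size_;
  }

  // Moving steals the heap allocation if there is one. Elements stored in
  // the inline buffer have to be copied, because that buffer cannot be stolen.
  SmallVector(SmallVector &&other) noexcept : SmallVector()
  {
    if (other.is_inline()) {
      std::copy_n(other.data_, other.size_, data_);
    }
    else {
      data_ = other.data_;
      capacity_ = other.capacity_;
      other.data_ = other.inline_buffer_.data();
      other.capacity_ = N;
    }
    size_ = other.size_;
    other.size_ = 0;
  }

  // Assignment is left out of this sketch.
  SmallVector &operator=(const SmallVector &) = delete;

  ~SmallVector()
  {
    if (!is_inline()) {
      delete[] data_;
    }
  }

  void reserve(std::size_t min_capacity)
  {
    if (min_capacity <= capacity_) {
      return;
    }
    const std::size_t new_capacity = std::max(min_capacity, capacity_ * 2);
    T *new_data = new T[new_capacity];
    std::copy_n(data_, size_, new_data);
    if (!is_inline()) {
      delete[] data_;
    }
    data_ = new_data;
    capacity_ = new_capacity;
  }

  void append(const T &value)
  {
    // Copy first, in case `value` refers to an element of this vector.
    const T copy = value;
    reserve(size_ + 1);
    data_[size_++] = copy;
  }

  bool is_inline() const { return data_ == inline_buffer_.data(); }
  std::size_t size() const { return size_; }
  const T &operator[](std::size_t index) const { return data_[index]; }
};
```

The move constructor is the part that is hard to get from a plain allocator-based approach: whether the elements can be stolen depends on where they currently live.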
Wrapping more complex data structures like std::unordered_set does not really provide a big benefit. We would have to change its memory layout fundamentally, in ways that are not compatible with the requirements for std::unordered_set.
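One example of such a requirement, and this is my reading of what matters here rather than a complete list, is that pointers and references to elements of a std::unordered_set stay valid when the set rehashes. A layout that stores elements directly in one flat array generally has to move them when it grows, so it cannot offer that. The hypothetical helper below just demonstrates the guarantee.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical helper for illustration. Checks that the address of an
// element stays the same while the set grows and rehashes many times.
bool element_address_survives_rehash()
{
  std::unordered_set<int> set;
  const int *address_before = &*set.insert(42).first;

  // Inserting many other values forces the set to rehash repeatedly.
  for (int i = 0; i < 10000; i++) {
    set.insert(i + 100);
  }

  const int *address_after = &*set.find(42);
  return address_before == address_after;
}
```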
While it might be useful to wrap some really simple data types to add some additional methods, I don’t think that is a good approach for native containers.
Below are some examples of how the containers that are currently in blenlib can be used. The first two patches also show a nice measurable speedup.
- D7506: Use BLI::Set instead of std::unordered_set in depsgraph module.
- D7509: Use BLI::Map instead of GHash for operations_map.
- D7512: Use BLI::Map for RNANodeQuery.id_data_map_.
- D7521: Use BLI::Map in RootPChanMap.
- D7556: Use BLI::Vector for Relations.
@LazyDodo @ideasman42 @julianeisel @mont29 @JeroenBakker @sybren