How to deal with enum incompatibilities between different compilers?

jacqueslucke · December 7, 2020, 7:13pm

Enums are used in many places in Blender and ideally we would use them in even more places (replacing integers).
Unfortunately, enums have a problem that is not very obvious at first: Different compilers are allowed to use different underlying integer types for the same enum.
For more details, read this post by @LazyDodo: https://developer.blender.org/D9736#242235.

This has two main consequences:

1. One cannot safely serialize and deserialize structs that contain enums (with a simple memcpy), because the struct layout might not be the same. Therefore, we cannot use enums in DNA.
2. A function with an enum in its signature that is defined in C but called from C++ (or vice versa) might fail at run-time. This happens when the C and C++ compilers use different underlying integer types for the same enum.
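To make the first consequence concrete, here is a minimal sketch in C. All names (`eExampleMode`, `ExampleWithEnum`, `ExampleWithInt`) are hypothetical and not taken from Blender. It contrasts a struct that stores the enum directly, so the member size follows whatever the compiler picks, with one that stores the value in a fixed-width integer and converts at the boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum, used only for illustration. */
typedef enum eExampleMode {
  EXAMPLE_MODE_A = 0,
  EXAMPLE_MODE_B = 1,
} eExampleMode;

/* The size of `mode` is whatever the compiler chose for the enum. */
typedef struct ExampleWithEnum {
  eExampleMode mode;
  char flag;
} ExampleWithEnum;

/* The size of `mode` is fixed at 4 bytes, independent of the enum. */
typedef struct ExampleWithInt {
  int32_t mode; /* Holds an eExampleMode value. */
  char flag;
} ExampleWithInt;

/* Size the current compiler picked for the enum (implementation-defined). */
size_t example_enum_size(void)
{
  return sizeof(eExampleMode);
}

/* Convert between the stored integer and the enum at the API boundary. */
eExampleMode example_mode_get(const ExampleWithInt *data)
{
  return (eExampleMode)data->mode;
}

void example_mode_set(ExampleWithInt *data, eExampleMode mode)
{
  data->mode = (int32_t)mode;
}
```

Only the second struct has a member size that can be written down once and relied on when the bytes are read back by a build from another compiler.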

Given that more C code is converted to C++ code in Blender, the second problem could mean that we can’t use enums in a significant number of function signatures.
That is because we have a lot of functions that are defined in C and called in C++.

The goal of this thread is to decide whether we can safely ignore that second problem in the context of Blender and to document our decision.

Here are the two options; we should pick one of them:

1. Work under the assumption that all pairs of compilers might use different integer types for enums (as is allowed by the C and C++ standards). This means that we can rarely use enums at all in interfaces that are used by both C and C++ code.
2. Assume that for the C and C++ compiler combinations that are actually used for Blender, the selected integer types for an enum are always the same (so e.g. gcc and g++ are compatible, but not necessarily gcc and msvc). This allows us to use enums in pretty much all function signatures in Blender.
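If the second option is chosen, the assumption could be written down as a compile-time check in a header shared by C and C++ code. The following is only a sketch: the macro `EXAMPLE_STATIC_ASSERT`, the enum `eExampleAxis` and the function `example_axis_letter` are hypothetical, and the expected size (that of `int`) is an assumption about the compilers in use, not something the standards guarantee.

```c
/* Shared header part: compiled by both the C and the C++ compiler. */
#ifdef __cplusplus
#  define EXAMPLE_STATIC_ASSERT(cond, msg) static_assert(cond, msg)
extern "C" {
#else
#  define EXAMPLE_STATIC_ASSERT(cond, msg) _Static_assert(cond, msg)
#endif

/* Hypothetical enum that appears in a C/C++ interface. */
typedef enum eExampleAxis {
  EXAMPLE_AXIS_X = 0,
  EXAMPLE_AXIS_Y = 1,
  EXAMPLE_AXIS_Z = 2,
} eExampleAxis;

/* Each compiler that includes the header checks the same assumption,
 * so a build fails if one of them picks a different size. */
EXAMPLE_STATIC_ASSERT(sizeof(eExampleAxis) == sizeof(int),
                      "eExampleAxis is assumed to be int-sized in C and C++");

char example_axis_letter(eExampleAxis axis);

#ifdef __cplusplus
}
#endif

/* Implementation part: would normally live in a .c file. */
char example_axis_letter(eExampleAxis axis)
{
  switch (axis) {
    case EXAMPLE_AXIS_X:
      return 'X';
    case EXAMPLE_AXIS_Y:
      return 'Y';
    case EXAMPLE_AXIS_Z:
      return 'Z';
  }
  return '?';
}
```

The check only covers size, not signedness, so it documents the assumption rather than proving full compatibility.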

In either case, enums must not be used in DNA structs. Allowing that would force us to make greater assumptions about compatibility between compilers.

My opinion is that we can choose the second option, given that we haven’t run into related problems in practice yet (afaik).

I’d still like to hear the opinion of a couple more core developers, just to make this decision a bit more official.

LazyDodo · December 7, 2020, 7:38pm

I see C and C++ using different sizes within a single compiler as a mostly theoretical problem. The more dangerous solution was the one with the macro where we specified the size for C++ manually; as long as we don’t do that, I doubt it’ll be a problem.

As for DNA, enums cannot be supported as is, but I don’t see why we couldn’t extend DNA to carry a bit of extra metadata on the sizes of enum fields (sure, it would break backwards compatibility, but then again so would the mere act of introducing enums into DNA).

If we are willing to live with a tiny bit of syntactic discomfort, combining an enum with a uint64_t (to accommodate the maximum size) into a single union may be a way out of this conundrum.
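One possible reading of the union idea, as a rough sketch with hypothetical names (`eExampleShading`, `ExampleShadingField`, `ExampleSettings`): the `uint64_t` member fixes the size of the field, and this sketch also routes all reads and writes through it so the stored bytes do not depend on the enum's size. The enum member is then mainly there to document the type.

```c
#include <stdint.h>

/* Hypothetical enum, used only for illustration. */
typedef enum eExampleShading {
  EXAMPLE_SHADING_FLAT = 0,
  EXAMPLE_SHADING_SMOOTH = 1,
} eExampleShading;

/* The uint64_t member pins the field to 8 bytes regardless of the
 * integer type the compiler selects for the enum. */
typedef union ExampleShadingField {
  eExampleShading value; /* Documents the intended type. */
  uint64_t storage;      /* What actually gets read and written here. */
} ExampleShadingField;

typedef struct ExampleSettings {
  ExampleShadingField shading;
  int32_t samples;
  int32_t _pad;
} ExampleSettings;

/* Write through the fixed-size member so all 8 bytes are defined. */
void example_shading_set(ExampleSettings *settings, eExampleShading shading)
{
  settings->shading.storage = (uint64_t)shading;
}

eExampleShading example_shading_get(const ExampleSettings *settings)
{
  return (eExampleShading)settings->shading.storage;
}
```

The syntactic discomfort shows up in the accessors: callers no longer assign the enum member directly.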

brecht · December 8, 2020, 2:51pm

Option 2 seems reasonable, using different compilers for Blender’s C and C++ code probably has a bunch of other issues, and there’s just no point.

If we ever add support for enums in DNA, I guess that would be done by the kind of refactor Sergey has been arguing for, where we would use offsetof and sizeof in generated code to determine member offsets and sizes and remove the need for padding variables. Enums would then just be treated as integers of the corresponding size.
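A rough sketch of what such generated code could look like is given below. It is not the actual DNA design: the struct `ExampleObject`, the descriptor `ExampleMemberInfo` and the macro `EXAMPLE_MEMBER` are hypothetical. The point is that `offsetof` and `sizeof` let the compiler report each member's layout, so an enum member is recorded simply as an integer of whatever size it has in this build.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum and struct, used only for illustration. */
typedef enum eExampleKind {
  EXAMPLE_KIND_MESH = 0,
  EXAMPLE_KIND_CURVE = 1,
} eExampleKind;

typedef struct ExampleObject {
  char name[16];
  eExampleKind kind;
  float scale;
} ExampleObject;

/* One entry of the layout table a generator could emit per member. */
typedef struct ExampleMemberInfo {
  const char *name;
  size_t offset;
  size_t size;
} ExampleMemberInfo;

/* sizeof is applied to an unevaluated expression, so no object is needed. */
#define EXAMPLE_MEMBER(struct_type, member) \
  {#member, offsetof(struct_type, member), sizeof(((struct_type *)0)->member)}

static const ExampleMemberInfo example_object_members[] = {
    EXAMPLE_MEMBER(ExampleObject, name),
    EXAMPLE_MEMBER(ExampleObject, kind),
    EXAMPLE_MEMBER(ExampleObject, scale),
};

/* Look up a member by name; returns NULL when it does not exist. */
const ExampleMemberInfo *example_find_member(const char *name)
{
  const size_t count = sizeof(example_object_members) / sizeof(example_object_members[0]);
  for (size_t i = 0; i < count; i++) {
    if (strcmp(example_object_members[i].name, name) == 0) {
      return &example_object_members[i];
    }
  }
  return NULL;
}
```

A reader of a file written by another build could compare its own table with the stored one and convert members whose size differs.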

JeroenBakker · December 8, 2020, 3:22pm

+1 for Option 2.

Enums reduce the amount of time needed to read code, and debuggers display them very nicely. I would not change the size to uint64_t unless really needed, as the side effect is too many unused bits in a cache line. In performance-sensitive areas, enums can be packed with __attribute__((__packed__)) so they fit better in a cache line. But that should probably only be done with enums that stay inside a single compilation unit, and to my knowledge it is currently not done in the code base.
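For reference, packing an enum with GCC or Clang can be sketched as follows. The macro `EXAMPLE_PACKED_ENUM` and both enums are hypothetical; on compilers without the attribute the macro expands to nothing and the enum keeps its default size.

```c
#include <stddef.h>

/* The attribute is a GCC/Clang extension, so it is hidden behind a macro. */
#if defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_PACKED_ENUM __attribute__((__packed__))
#else
#  define EXAMPLE_PACKED_ENUM
#endif

/* Packed: the compiler may use the smallest integer type that fits. */
typedef enum EXAMPLE_PACKED_ENUM eExampleSmall {
  EXAMPLE_SMALL_A = 0,
  EXAMPLE_SMALL_B = 1,
  EXAMPLE_SMALL_C = 2,
} eExampleSmall;

/* Same enumerators without the attribute, for comparison. */
typedef enum eExampleRegular {
  EXAMPLE_REGULAR_A = 0,
  EXAMPLE_REGULAR_B = 1,
  EXAMPLE_REGULAR_C = 2,
} eExampleRegular;

size_t example_small_size(void)
{
  return sizeof(eExampleSmall);
}

size_t example_regular_size(void)
{
  return sizeof(eExampleRegular);
}
```

Because the packed size differs from the default one, passing such an enum across a boundary where the attribute is not seen is exactly the kind of mismatch this thread is about, which supports keeping it inside one compilation unit.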

LazyDodo · December 8, 2020, 4:23pm

Just to clarify: the union was just an idea for DNA structs to sidestep the unknown size; everywhere else we’d just use the raw enum type.

LazyDodo · December 10, 2020, 7:42pm

A point of worry I have more and more lately is constructs like this.

Callback definition:

typedef void (*update_render_passes_cb_t)(void *userdata,
                                          struct Scene *scene,
                                          struct ViewLayer *view_layer,
                                          const char *name,
                                          int channels,
                                          const char *chanid,
                                          eNodeSocketDatatype type);

Actual callback:

static void bke_view_layer_verify_aov_cb(void *userdata,
                                         Scene *UNUSED(scene),
                                         ViewLayer *UNUSED(view_layer),
                                         const char *name,
                                         int UNUSED(channels),
                                         const char *UNUSED(chanid),
                                         int UNUSED(type)) /* <-- int, not eNodeSocketDatatype: unused or not, DANGER */

Similar constructs can be found where the header and the actual implementation disagree on the type (usually enum vs. int).

Are we just really enjoying pointing a loaded gun at our feet or what is going on here?

MSVC warns about this. I’m unsure which warning needs to be enabled for clang/gcc, but I HIGHLY recommend we find that out sooner rather than later.
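Independent of warning flags, one way to let a C compiler check a callback against its typedef is to forward-declare the callback through a function typedef. This is only a sketch with hypothetical names (`example_pass_cb_fn`, `example_count_rgba_cb`, `example_register_passes`). Since the declaration and the definition must then have compatible types, a parameter written as `int` where the typedef says enum is rejected as conflicting types on any compiler that does not treat that enum as compatible with `int`.

```c
/* Hypothetical enum standing in for a socket data type. */
typedef enum eExampleSocketType {
  EXAMPLE_SOCK_FLOAT = 0,
  EXAMPLE_SOCK_RGBA = 1,
} eExampleSocketType;

/* Function type (not a pointer) that describes the callback signature. */
typedef void example_pass_cb_fn(void *userdata,
                                const char *name,
                                int channels,
                                eExampleSocketType type);

/* Pointer type used wherever a callback is passed around. */
typedef example_pass_cb_fn *example_pass_cb_t;

/* Forward declaration through the typedef: the definition below has to
 * agree with it, parameter types included. */
example_pass_cb_fn example_count_rgba_cb;

/* Counts RGBA passes; userdata is expected to point to an int counter. */
void example_count_rgba_cb(void *userdata,
                           const char *name,
                           int channels,
                           eExampleSocketType type)
{
  (void)name;
  (void)channels;
  if (type == EXAMPLE_SOCK_RGBA) {
    (*(int *)userdata)++;
  }
}

/* Calls the callback once per (made-up) render pass. */
void example_register_passes(example_pass_cb_t callback, void *userdata)
{
  callback(userdata, "Combined", 4, EXAMPLE_SOCK_RGBA);
  callback(userdata, "Depth", 1, EXAMPLE_SOCK_FLOAT);
}
```

This does not replace finding the right warning, since it only helps for callbacks that are declared this way.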
