reviewed and submitted patches from Jason Wenger: sped-up Subdivide; Grid Fill (took quite a few iterations)
bug fixes: Extrude sometimes didn’t extrude along the normal
up next: plan to work on the UV sync project in March (i.e., now)
Howard
The Manifold Boolean solver is now usually quite a lot faster than the float solver and is about ready to be reviewed. Still need to work with the platform teams to make the Manifold library a proper dependency.
up next: plan to work on improved Bevel / Bevel Geometry Node
Discussion Topics
Nika raised the question of PRs related to x-ray selection. Since it cuts across concerns of mesh selection, object selection, and selection in other modules (especially animation), would the Modeling Module be an appropriate place to look at these?
Campbell: no principled objection, but we need to look at the design. Fundamental question when selecting in x-ray mode: do you need to hit a button to select back faces, or not? What is the default behavior? Daniel, Nika, and Jonathan all like it the way it currently is in Blender, but having a toggle seems doable.
Need some coordination with the animation module: behavior needs to align across Object mode, animation, and mesh element selection.
Daniel: the root problem is people coming from other software who are used to other ways of working. It can be good to have an option, but we also don’t want to overcomplicate things too much.
Campbell: WYSIWYG is good. Do we need a Design Task for this?
Daniel demonstrates: some users want select through even when what is visible is not see-through. Jonathan: is there some improvement in x-ray display that would help?
Nika: has seen something where, if you start selecting with x-ray on, the view becomes see-through while selecting and then turns off again when you stop selecting.
Campbell: Object selection is already see-through. The advantage of having an option would be to make it optional in Object mode.
Conclusion: we should look at these PRs and try to figure out a good overall design that is consistent across modes.
Howard raised the issue of determinism/repeatability of mesh operations (thinking specifically of Boolean) across separate runs and separate machines. Some amount of this is necessary for the way we currently do regression tests. But the Manifold library is not deterministic/repeatable when parallelism is configured (as we do) and the problem is big enough that parallelism is actually used. Regression tests would still likely be OK, but do users need determinism even on large meshes? The worries are: (1) could this cause flickering in animation? (2) might Geometry Nodes users (improperly) depend on the indices of various mesh elements of the boolean output being stable?
Daniel asks: are the other boolean solvers deterministic? Howard: yes. Daniel’s opinion is that in maybe 80% of cases, a boolean is one-and-done (baked into something), so determinism is probably not a big issue. For the other cases it would be nice to have, but is it worth the extra developer time to make it so?
[Howard, post meeting, asked the Manifold developers if it is true their code is not deterministic when parallelism is used, and they confirmed that it is. They had had an option to be deterministic a while ago, but it was not much used because it was quite a performance hit, so they removed it a couple of months ago. For non-parallel runs, they expect the results to be the same from run to run and machine to machine, but “… floating point is tricky. Things like FMA can have higher intermediate precision and produce different results on x86-64 compared with arm, for example.”, so they weren’t going to absolutely guarantee that.]
A random look at a bunch of existing mesh-related code (not even going into Booleans) makes me believe that already today a whole lot is not 100% the same across machines. For example, the normals calculations use some trigonometric functions (cos, sin, atan, …) – these are “close”, but not 100% the same between compilers, OSes, and CPUs.
The only things that are guaranteed to be 100% the same everywhere are the basic arithmetic operations (addition, subtraction, multiplication, division) and the square root. As soon as you have any trigonometry, or things like “inverse square root” or “reciprocal”, you get close-but-not-the-same results between machines. Compilers that willingly turn series of multiplications/additions into FMA operations (where the underlying target CPU has that operation and the compiler is allowed to do so) also cause this issue, as noticed above.
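A minimal illustration of why this kind of reordering matters (it sidesteps the FMA hardware case, but shows the underlying issue): floating-point addition is not associative, so any reassociation of an expression by a compiler, or any change in the order a parallel reduction combines partial sums, can change the low-order bits of the result.

```python
# Float addition is not associative: regrouping the same three values
# changes the last bits of the result. Compiler reassociation, FMA
# contraction, and parallel reduction order all trigger this effect.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left, right, left == right)
```

The values differ only in the last unit of precision, which is exactly the kind of "close but not the same" difference described above.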
Non-determinism on the same machine due to threading is a shame though. Is the overhead really that large, or did they simply not investigate the problem seriously enough?
While this certainly puts a damper on things, I’d also be curious to see just how the results could vary? Being able to achieve the same results every time inside geometry nodes is pretty darn important, not to mention people being able to follow tutorials and getting the same results.
If things go well on review I’d be very keen to test it.
That sounds very ill-advised; I know I wouldn’t think of relying on this. At worst, I can see myself relying on an attribute captured before the boolean (such as the original point index of a geometry union-ed with another), but this assumes the new boolean maintains/propagates attributes sensibly. Does it?
We try as much as possible to propagate attributes “sensibly”, in all of the boolean solvers, including the new one. So, e.g., attributes of a face on an input mesh will propagate to any remaining fragment of that face in the output (but there’s an ambiguity when there are coplanar faces in the input and output has only one fragment). Similarly for vertex, corner, and edge attributes (though I admit that in some cases, an internal remaining fragment of an edge might not get proper propagation right now). Corner attributes (like UVs) for new corners in the output are interpolated in the UV space of the input face that that new corner is in.
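To make the corner-attribute case concrete, here is a generic sketch (not Blender's actual implementation) of interpolating an attribute such as a UV for a new corner created inside an input triangle: compute the point's barycentric coordinates in the input face, then blend the corner values with those weights.

```python
# Sketch: interpolate a corner attribute (e.g. a UV) for a new corner
# created inside an input triangle, via barycentric coordinates.
# Illustrative only; not Blender's actual propagation code.

def barycentric(p, a, b, c):
    """Barycentric coordinates (u, v, w) of 2D point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    u = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    v = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return u, v, 1.0 - u - v

def interp_uv(p, tri_pos, tri_uv):
    """Blend the three corner UVs with the barycentric weights of p."""
    u, v, w = barycentric(p, *tri_pos)
    return tuple(u * t0 + v * t1 + w * t2 for t0, t1, t2 in zip(*tri_uv))

tri_pos = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
tri_uv = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
uv = interp_uv((1 / 3, 1 / 3), tri_pos, tri_uv)  # centroid -> (1/3, 1/3)
print(uv)
```

The new corner's UV lands where you would expect it in the input face's UV space; real code must also pick which input face each new corner belongs to, which is where the coplanar-face ambiguity mentioned above comes in.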
Likely ordering. Pretty much the same way as if I give you a watermelon, an orange, and an egg to peel: you’d output melon, orange, egg, in that order. Now if I round up two more people and have the three of you work on these same items together, I’ll likely get the items in egg, orange, melon order. All will still be peeled properly, but the order is different. Depending a bit on the egg and the orange, the orange person could perhaps beat the egg and you’d get the orange first.
Which is a bit problematic for things like unit tests that like deterministic results. After peeling, the test goes: “first there is the melon, and it should weigh about xx grams”; if it gets an egg first, it’ll report “this melon was not peeled properly! the results aren’t what I am expecting → FAIL!”
Same thing here: the boolean code likely still gets to more or less the same visual result, but there’s no guarantee that the first poly it outputs will always be the same one at the same coordinates. The one that came out first could be poly 3 or 4 or 193 on future runs.
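The peeling analogy maps directly onto thread pools: if results are collected as workers finish rather than in submission order, the output order depends on timing. A small sketch (the names and sleep times are made up to stand in for unequal amounts of work):

```python
# Worker threads finish in an order determined by timing, not by
# submission order, so collecting results as they complete jumbles
# the output order from run to run.
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def peel(item, seconds):
    time.sleep(seconds)  # stand-in for unequal amounts of work
    return item

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(peel, name, t)
               for name, t in [("melon", 0.3), ("orange", 0.2), ("egg", 0.1)]]
    # as_completed yields in completion order (likely egg, orange, melon),
    # not in the melon, orange, egg submission order.
    completed = [f.result() for f in as_completed(futures)]

print(completed)
```

The set of results is stable; only their order is not, which is exactly the index-jumbling problem for a unit test that asserts "the first element is the melon".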
Basically, there is no guarantee that the order of the indices makes sense relative to the original mesh, correct?
Like this:
If so, then yeah, there would be no guarantee of attributes getting propagated properly, I guess.
I’m jumping ahead, assuming what I just said above is correct, but I did have a look in Houdini just now. The indices after a boolean get jumbled as well, so it might be safe to assume their code is also not deterministic.
EDIT: Houdini’s boolean node is in fact deterministic, producing the same index order provided the settings are always the same.
That the indices are jumbled after a boolean isn’t the main problem. The problem is if the jumbling is different each time you perform the same boolean. I.e. you open a blend file with an unapplied boolean on Monday and get a different result than when you open the same file on Tuesday. Or you have a static, non-animated boolean and move from frame 1 to frame 2, etc.
Even if the vert/edge/face counts are exactly the same, is it possible for an edge to be “different” depending on which thread is working in the surrounding or adjacent area? E.g. a cube being subtracted from another cube but yielding different topology, even though the stats, and even the index of that edge(!), are the same:
It sounds like the unit tests will be fine only because they’re going to be small enough that the threading in manifold would not activate.
It’s unclear to me if it’s just the indices that would be different, or if it might make different decisions that lead to a different geometry beyond that. But even if just the indices are different, that would cause issues with:
Thanks for listing those, Brecht. Some things I hadn’t thought of. Of course, one could use one of the other solvers if determinism is needed but this is not ideal.
I am not sure of most of the answer to “if it’s just the indices that would be different, or if it might make different decisions that lead to a different geometry beyond that.” I imagine that most of the time, if you looked at renders of a still of the result, you would not be able to tell the difference between the different output possibilities. However one thing I can say is that: post Manifold running, I do a pass to remove triangulation edges from input ngons that had to be triangulated to make the input to Manifold. Not all triangulation edges can be removed, sometimes, and if that is the case, the ones that remain, while deterministic in my process, depend on the index order of the Manifold output. So if the original face is non-planar, the silhouette of the mesh object may look different.
We removed the deterministic flag from Manifold because we weren’t confident it truly was deterministic anyway, as it had become too difficult to fully maintain and test the divergent code paths. Attribute propagation and geometric shape are solid - it is merely the indexing, and occasionally the topology, that can change. Mostly this changes between compilers and Manifold releases. Even so, the results tend to be quite stable given our use of stable sorting; it’s just that we can’t guarantee it for all the edge cases.
For our unit tests, we check things like volume, surface area, and whether the propagated attributes interpolate the originals as expected. We don’t test index order, or make functions that are sensitive to index order, since simply importing and exporting a mesh in different programs can often resort the verts/tris.
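Quantities like volume and surface area make good test targets precisely because they are invariant under any permutation of the element lists. A self-contained sketch of such order-insensitive checks (this is illustrative, not Manifold's test code), using a unit right tetrahedron:

```python
# Order-insensitive mesh checks: signed volume and surface area do not
# change when the triangle list is permuted, so they are robust test
# quantities where index order is not. Illustrative sketch only.
import math
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def volume(tris):
    # Divergence theorem: sum of signed tetra volumes against the origin,
    # valid for any closed mesh with consistent outward winding.
    return sum(dot(a, cross(b, c)) for a, b, c in tris) / 6.0

def area(tris):
    # Half the cross-product norm per triangle.
    return sum(math.sqrt(dot(n, n)) / 2.0
               for n in (cross(sub(b, a), sub(c, a)) for a, b, c in tris))

# Unit right tetrahedron, triangles wound outward.
o, x, y, z = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
tris = [(x, y, z), (o, y, x), (o, x, z), (o, z, y)]

shuffled = tris[:]
random.shuffle(shuffled)
print(volume(tris), area(tris))  # ~0.1667 and ~2.3660
```

Volume and area come out identical for `tris` and `shuffled`, whereas any check on "triangle 0" would be meaningless after resorting.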
I’d be very interested to see an example of a workflow that fails using Manifold. In general, Booleans are pretty expensive ops anyway, so if you really want to have determinism, the right answer is to write out the result to a file or cache instead of redoing it. If anything is changing about the input to the Boolean, then there’s no reliable way to relate those result topologies to each other anyway, even with determinism.
It is pretty much impossible to get deterministic results across machines due to the use of things like trigonometric functions that are not covered by the IEEE standard. In Manifold, we don’t do trigonometry if the user passes us a matrix, so technically it is possible to get deterministic results. However, realistically, mesh operations will involve trigonometry when you do things like rotation, so the end result for the user is still not deterministic (across machines) even when Manifold does not introduce additional non-determinism. Technically, users can’t follow tutorials and get the same result because of this, unless they have the same machine or are lucky.
Non-determinism on the same machine: there are two reasons here: it prohibits some optimization, and we did not try hard to optimize for it. The second reason is simple: people don’t care about this enough, so we don’t prioritize the issue. Now let me explain the first reason in a bit more detail:
Apart from optimizing individual boolean operations, Manifold also tries to optimize a group of boolean operations in the form of an expression DAG: we apply simple transformations on the DAG and schedule operations to fully utilize all the CPU cores when possible. Consider the classical matrix-chain multiplication problem with N matrices: one can use dynamic programming to group matrices together and reduce the number of arithmetic operations. The idea is similar in Manifold: we try to group the boolean operations that take the smallest meshes first. The problem is that we cannot predict the mesh size after an operation (the worst-case size is at least quadratic, though that is rare), so we have to wait for an operation to complete before starting a new one. Because inexact mesh boolean isn’t commutative and associative, different reordering decisions can produce different results.
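The "smallest operands first" idea can be sketched with a size heap, much like Huffman merging; here a stand-in union just adds operand sizes (the real scheduler cannot know output sizes in advance, which is exactly where the racy, greedy behavior comes from). This is an illustration of the idea, not Manifold's scheduler.

```python
# Greedy "union the two smallest meshes first" scheduling, sketched with
# a min-heap. A stand-in union just adds sizes; real output sizes are
# unknown until each operation finishes, making the real schedule racy.
import heapq

def schedule_unions(sizes):
    """Return the (a, b) merge order chosen greedily, plus the final size."""
    heap = list(sizes)
    heapq.heapify(heap)
    merges = []
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        merges.append((a, b))
        heapq.heappush(heap, a + b)  # crude stand-in for the result size
    return merges, heap[0]

merges, total = schedule_unions([1, 2, 3, 10])
print(merges)  # [(1, 2), (3, 3), (6, 10)]
```

In the threaded version, whichever operation happens to finish first feeds the heap first, so the merge order (and hence the inexact result) can differ from run to run.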
Is the current scheme optimal? No: some optimal reorderings can be missed due to timing, and our code is just a greedy algorithm. It would also be possible to estimate the mesh size information and plan the reordering instead of relying on this greedy, racy approach, but that requires a bit of experimentation and probably more data to test on.
And if Blender doesn’t care about optimizing multiple boolean operations, making things deterministic on the same machine will not have too much overhead. The issue is testing: we mainly use GitHub Actions to run our CI, but the CI doesn’t provide enough concurrency (2 threads?) to trigger non-deterministic behavior (say we managed to remove most of the non-determinism; there may still be something left).
The issue is that caching is extra work and error-prone. It’s easy to forget to add a cache, or not know one is needed, or things may go out of sync. Running the boolean operation repeatedly might be slower than it could be in CPU cycles, but managing caches and dealing with related issues is often much more time-consuming for users.
For example, say you have some static geometry created with booleans, and you distribute points on it for scattering grass, rocks, etc. If the distribution changes every time you open that blend file, or when you edit the UV map or paint some attribute, that’s a problem. If you send it to a render farm and 10% of the frames randomly have different distributions, a user might spend a lot of time figuring that out.
At the moment, no, I don’t use Manifold other than in the mode of doing a single “batch boolean” – which may involve more than 2 meshes, but is just one boolean operation, so I don’t think that is what you are talking about, is it?
After the initial release, I am interested in figuring out an elegant way to get the advantage of Manifold’s optimizations for multi-boolean operations inside Geometry Nodes in Blender. This is not going to be particularly easy, and I have another, higher-priority project that I will be working on first.
The across-machine repeatability requirement is somewhat less important, since my guess is that users tend to use machines of the same architecture throughout their processing (even when using render farms), though this is by no means universal. The point about rotations is true, but in many cases no rotations are used (other than 90-degree multiples, which can be handled specially).
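The reason 90-degree-multiple rotations can be special-cased: their matrix entries are exactly 0 and ±1, while the general path goes through cos/sin, which are only approximately correct and can vary across platforms. A small demonstration (the `rot90` helper is just an illustration, not anyone's actual API):

```python
# General trig path vs. an exact special case for quarter-turn rotations.
import math

# cos(pi/2) is not exactly 0 in floating point (~6.1e-17), and trig
# results are not pinned down bit-for-bit by IEEE 754 across platforms.
c = math.cos(math.pi / 2)
print(c)

# Special-cased path: an exact 2D rotation by 90 degrees needs no trig
# at all, so it is bit-identical everywhere.
def rot90(p):
    x, y = p
    return (-y, x)

print(rot90((1, 0)))  # (0, 1), exactly
```

Transforms built only from such exact matrices (and translations/scales representable in floats) keep Manifold's "pass us a matrix" path fully deterministic.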
As far as Manifold and non-determinism within Blender goes, I understand and can sympathize with the developers’ decision not to add a “deterministic” flag back. That leaves me with these options:
Tell users to use another solver (the Exact one or the Float one) if they need deterministic results.
Add an option to the Boolean, when the solver is Manifold, to use no parallelism. While this won’t guarantee repeatability, especially across machines, I think in many practical cases it will work. Just slower.
Give up on the idea of using the Manifold solver in Blender. Not the choice I would make, but it could be that the overall Blender core developers might choose this option.
As we said in the module meeting, I think just marking this as a known issue is fine. The other two solvers also have issues, and users just know they can’t use them in certain workflows. It’s the nature of solvers, and as long as there is a choice I don’t see a problem. If a user decides to use Manifold in procedural animations, the answer should just be to use Float instead.
Manifold is mainly used for sculpting workflows, and that’s going to be the primary use case in Blender as well. Slight topology changes and determinism aren’t a concern for sculpting workflows that use trim tools. Slowing down Manifold just to support edge-case workflows will make it worse for its primary workflow, and that’s not useful for anyone. I would just mention the limitation in the tooltip.
Batch boolean is still multiple boolean operations, and reordering will affect this.
Feel free to discuss later when you work on this, I am happy to help.
Yeah, I mentioned that mainly to set expectations. For me, I am open to making things deterministic (run-to-run deterministic), but that will not be trivial to handle. Optimizing multiple boolean operations will probably be hard, but optimizing a single operation is probably much simpler, with acceptable overhead.
Sometimes trade-offs have to be made, and maybe this is one of those cases, but to say it’s not a problem seems wrong to me. We should strive to eliminate these kinds of oddities and pitfalls that users have to learn about, to make Blender easier to learn and use. Especially ones that can cause issues later on, where tracking down the cause takes work. Even experienced users can easily forget this kind of thing, or not notice when using an existing asset.
Thanks for letting me know. Somehow I figured they’d all be done together (some Boolean algorithms, like Mesh Arrangements, do that).
I had determinism in mind as a requirement when I implemented the Mesh Arrangements exact solver. It mainly meant using algorithms that processed things in an order related to some input indices and, when dividing work into parallel batches, taking care as to which algorithms were used. I think I’ve seen some code in Manifold where there are races between parallel threads to see which gets to modify an atomic variable first; those would seem to be more challenging to make deterministic. Maybe algorithms using some kind of deterministic parallel reduce could be used in such cases? Or a more clever division of work into batches that are truly unconnected to each other? But I wouldn’t want you to have to contort your algorithms into a decidedly less efficient form just to deal with this issue. As nickberckley said above, the sculpting use case is super important, and determinism for that use case is unimportant compared to utmost speed.
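One common shape for a deterministic parallel reduce, sketched here as an assumption about how it could look rather than a proposal for Manifold's code: split the work into fixed chunks, reduce each chunk independently in parallel, then combine the partial results in fixed chunk order instead of completion order.

```python
# Deterministic parallel reduction: fixed chunking plus a fixed combine
# order makes the result independent of which thread finishes first.
from concurrent.futures import ThreadPoolExecutor

def deterministic_parallel_sum(values, chunk_size=4):
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor() as pool:
        # Executor.map yields results in input order, regardless of the
        # order in which the worker threads actually finish.
        partials = list(pool.map(sum, chunks))
    return sum(partials)  # combined in chunk order, not completion order

data = [0.1 * i for i in range(100)]
r1 = deterministic_parallel_sum(data)
r2 = deterministic_parallel_sum(data)
print(r1 == r2)  # bit-identical run to run on the same machine
```

The cost is that the combine step is serialized in a fixed order (a tree-shaped combine with a fixed topology works too); first-writer-wins atomics, by contrast, bake the race outcome into the result.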