Missing BVH8

Could we have BVH8 back? My workloads are running roughly 6 times slower with BVH_LAYOUT_EMBREE compared to BVH_LAYOUT_BVH8.

I tried to revert b2f6addc, but so far I haven’t managed to make BVH8 work :frowning:

We’re unlikely to add BVH8 back; we don’t want to maintain two CPU-optimized BVH implementations. It’s quite surprising that it would be 6x slower; even a BVH2 would not be that much slower than BVH8.

You could investigate why exactly it happens, since I can’t think of any good reason for this that couldn’t be fixed.

Maybe an example .blend where the problem happens could be useful, because it sounds like some kind of bug or limitation.

Something is fishy with my setup. After moving back to the old version, the performance still sucks. I’ll try to figure it out and then possibly share the cause of the messup. Sorry for the possible noise.

The reason for the slowdown I experienced was not the change in BVH.
It seems that the previous version of Cycles standalone was built with most options disabled, while the current one has them enabled by default.

After disabling all the options I don’t need, I got back the performance. (Plus an extra 30%)

The probable reason for this is that the extra code injected by the options messes with cache locality of the code. As ray-tracing is strongly memory-bound, this can cause the slowdown.

Can you be more specific on what options are responsible for the regression?

The only build option that I can imagine giving that much of a slowdown would be a debug / unoptimized build rather than a release build. And any improved performance is likely due to Embree.

This brought back the performance:
set(WITH_CYCLES_OSL OFF CACHE BOOL "disabling OSL")
set(WITH_CYCLES_OPENSUBDIV OFF CACHE BOOL "disabling opensubdiv")
set(WITH_CYCLES_OPENVDB OFF CACHE BOOL "disabling openvdb")
set(WITH_CYCLES_EMBREE OFF CACHE BOOL "disabling embree")
set(WITH_CYCLES_OPENIMAGEDENOISE OFF CACHE BOOL "disabling openimagedenoise")

Actually (at least on Linux), RelWithDebInfo builds are faster than Release builds. In my experience, this is common with memory-bound code.

Do you have an example file that shows the issue? I could see any of these libs slowing down a render if they were configured to be used in the scene; however, merely having them enabled in the build should not make much of a difference (or any, really).

I am using Cycles standalone on dynamically generated scenes, so mostly no. I’ll try to come up with a repro case if my hands are not too full.
I guess the reason for the slowdown is the same as the reason Release builds (with -O3) are slower than RelWithDebInfo builds (with -O2): the increased text size causes more cache/TLB misses. I can profile this more easily than I can provide a repro for the build options.
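
One way to check the -O3 versus -O2 hypothesis separately from the debug info is to build Release at -O2 and compare render times and binary text size against a stock Release build. The snippet below is only a sketch of that experiment, not something tested against the Cycles build: it assumes GCC or Clang, where CMake's default flags are `-O3 -DNDEBUG` for Release and `-O2 -g -DNDEBUG` for RelWithDebInfo, and it assumes the project does not override these variables elsewhere.

```cmake
# Experiment only: make Release differ from RelWithDebInfo by nothing but -g.
# Assumes GCC/Clang; CMake's stock Release flags there are "-O3 -DNDEBUG".
# FORCE overwrites any value already stored in CMakeCache.txt.
set(CMAKE_C_FLAGS_RELEASE "-O2 -DNDEBUG" CACHE STRING "Release C flags at -O2 (experiment)" FORCE)
set(CMAKE_CXX_FLAGS_RELEASE "-O2 -DNDEBUG" CACHE STRING "Release C++ flags at -O2 (experiment)" FORCE)
```

If the -O2 Release build matches RelWithDebInfo in speed, the optimization level rather than the debug info would be the likely cause.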