Investigating memory leaks

I noticed quite substantial memory leaks with certain .blend files, workflows (multires + shader displace) and rendering methods (hybrid CPU+GPU). The scenes in which this is happening are quite big in size.

Before submitting these cases to the bug tracker, is there anything more a user can do to narrow down the problem, beyond stripping unnecessary data from the .blend file?

At least ensure it’s reproducible with --factory-startup and that it’s really a leak. If you get to the point where you think it’s leaking, and you start a new scene in the same Blender instance, does the memory usage go back down? If you reload the file again and continue working, does the memory usage continue to go up?
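One way to make that check less subjective is to note Blender's memory usage at the same point in each reload cycle and see whether the baseline keeps climbing. A minimal sketch, assuming the readings are collected by hand or by a monitoring script. The function name, the sample values, and the 5% tolerance are illustrative and not anything Blender provides:

```python
def baseline_keeps_growing(samples_mib, tolerance=0.05):
    """Return True if every reading exceeds the previous one by more than
    `tolerance` (a fraction), i.e. memory never settles back down.

    `samples_mib` holds one reading per reload cycle, each taken at the same
    point in the workflow (for example right after the file has loaded).
    """
    if len(samples_mib) < 3:
        raise ValueError("need at least three readings to see a trend")
    return all(
        later > earlier * (1.0 + tolerance)
        for earlier, later in zip(samples_mib, samples_mib[1:])
    )


# Hypothetical readings in MiB, taken after each of four reloads.
leaking = [2100, 2900, 3750, 4600]
stable = [2100, 2150, 2120, 2140]

print(baseline_keeps_growing(leaking))
print(baseline_keeps_growing(stable))
```

A steadily rising baseline across reloads suggests a leak, while readings that hover around the same value suggest the memory is just in use.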

Beyond that, the smallest possible file is desired. Multiple devs typically look at the issue so large repros are really a pain to download all the time for each person.

Lastly, do you mean the “adaptive subdivision” form of shader displacement? That has some… quirks that can lead to very, very high memory usage if the viewport and render levels on the modifier aren’t set to 0.

Thanks @deadpin.

In that instance - no. Even after closing the Blender instance the memory is not freed.

Yes.

I understand. The thing is, if I remove the biggest meshes and objects from the file, the leaks are gone, or they are so small I can’t notice them. I’ll need to write down and monitor RAM usage more closely in the next test on a smaller file.

I don’t use adaptive subdivision, no.
The example case I was talking about is a large mesh (~1.5M polygons) with 3 levels of Multires on top. On top of that there is a shader with quite complicated procedural displacement (made in the shader editor).

When I render this mesh with CPU only, memory usage is around 100GB, which is expected judging from my previous scenes. But when the rendering is done with CPU+GPU, the memory usage goes much higher and the system starts swapping (I have 128GB of RAM + 128GB of swap). After hybrid rendering is done, large parts of the RAM+swap are not freed, usually around 50-60GB. Swap is eventually freed after some time, but not RAM. I need to restart the machine to free the RAM after such a render.
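Since the memory reportedly stays occupied even after Blender exits, a system-wide figure may be more telling than Blender's own counter. On Linux, something like the sketch below could log MemAvailable and SwapFree from /proc/meminfo before, during, and after a render. The helper names and the sample text are illustrative, and it assumes a kernel that exposes those two fields:

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in KiB; the unit suffix is ignored.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_sample(info, label):
    """One log line: timestamp, label, available RAM and free swap in GiB."""
    gib = 1024 * 1024  # KiB per GiB
    return "{} {:<14} MemAvailable={:.1f}GiB SwapFree={:.1f}GiB".format(
        time.strftime("%H:%M:%S"),
        label,
        info.get("MemAvailable", 0) / gib,
        info.get("SwapFree", 0) / gib,
    )


def sample(label, path="/proc/meminfo"):
    """Read the live values (Linux only) and return a formatted log line."""
    with open(path) as handle:
        return format_sample(parse_meminfo(handle.read()), label)


# Made-up snippet in the /proc/meminfo format, used instead of the real
# file so the parsing can be checked on any system.
example = (
    "MemTotal:       131072000 kB\n"
    "MemAvailable:    62914560 kB\n"
    "SwapFree:       134217728 kB\n"
)
print(format_sample(parse_meminfo(example), "after render"))
```

Calling `sample("before render")`, `sample("after render")`, and `sample("blender closed")` at the matching moments gives a short log that can be pasted into a bug report.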

Other examples are also with big meshes + displacement + hybrid or GPU only rendering.
So far I stopped using hybrid rendering for the problematic scenes and render with CPU only.
I’m using AMD GPUs, so maybe this is a driver-only or OpenCL-only thing? Also, I previously had deadlock problems with Blender that were related to AMD GPU drivers.

I was hoping for some form of automated tool that can generate relevant information for developers in case of memory issues like crashreport does for crashes.

That seems impossible unless the memory leak is in the video driver maybe or some other non-blender process? If that’s the case, then there’s probably no blender tool or logging that’s going to be able to spot it either.

I don’t think anyone on the triage team is going to be able to even attempt a repro at this size… If, as you say, the memory still grows when you just reload the file in the same Blender instance and continue using it, then using a smaller object should be possible. It would just require a few reloads before you start seeing it? Maybe something along those lines will help.

I thought that too.

That is a good idea. Will try to strip the scene and reload after various scenarios.
Thanks for the help.

There’s a --debug-memory flag. https://docs.blender.org/manual/en/latest/advanced/command_line/arguments.html#debug-options
A debug build on buildbot is on the roadmap. On Linux, and with some effort on macOS, it may have leak sanitizer enabled.
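To keep the debug output for a bug report, the flags mentioned in this thread can be combined and the console output redirected to a file. A rough sketch, assuming a `blender` executable is reachable at the given path. The helper names and the log file name are made up for illustration:

```python
import shlex
import subprocess


def build_command(blender="blender", blend_file=None, debug_gpu=False):
    """Assemble a Blender command line for a clean, memory-debug session."""
    command = [blender, "--factory-startup", "--debug-memory"]
    if debug_gpu:
        command.append("--debug-gpu")
    if blend_file is not None:
        # Options come first so they apply before the file is loaded.
        command.append(blend_file)
    return command


def run_and_log(command, log_path="blender-debug.log"):
    """Run Blender and write stdout and stderr to one log file."""
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Only print the command here; call run_and_log(...) to actually launch Blender.
print(shlex.join(build_command(blender="./blender", debug_gpu=True)))
```

Passing the result of `build_command(...)` to `run_and_log(...)` starts Blender and leaves everything it printed in the log file once it exits.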

When running Blender with ./blender --factory-startup --debug-memory --debug-gpu
I got:

LLVM triggered Diagnostic Handler: Illegal instruction detected: VOP* instruction violates constant bus restriction
renamable $vgpr4 = V_CNDMASK_B32_e32 32768, killed $vgpr14, implicit killed $vcc, implicit $exec
LLVM failed to compile shader
radeonsi: can't compile a main shader part
LLVM triggered Diagnostic Handler: Illegal instruction detected: VOP* instruction violates constant bus restriction
renamable $vgpr4 = V_CNDMASK_B32_e32 32768, killed $vgpr4, implicit killed $vcc, implicit $exec
LLVM failed to compile shader
radeonsi: can't compile a main shader part

That happens without opening any scene. Looks like a Radeon-related issue.

OK, after some digging, it looks like an LLVM bug that was fixed in LLVM 12.
I’m on 11, so I will wait until 12 hits my repo.
https://developer.blender.org/T83488