Do we really need to use reference counting for data blocks?

A couple of weeks ago I had this thought, but I’m not sure if it is stupid/naive or if it could actually help reduce complexity a lot. Maybe @brecht or @mont29 can tell me which it is.

Afaik, our main use of reference counting is to decide whether a data block should be stored in a file. There are a couple of other uses, but I could not find many.

We don’t use the reference counts to free memory as soon as it is no longer referenced. We don’t do incremental freeing either; we still have long pauses while “freeing”. Furthermore, reference counting is hard in Blender, because some references have to be counted, some don’t, and there are many special cases such as fake users.

It seems we are doing reference counting without getting any of the benefits it should provide, while still getting the disadvantage of not being able to handle reference cycles well.

I do not have a full proposal, I just wanted to share this general idea and get some feedback.

A better solution might be to stop using reference counting for data blocks and use something like garbage collection instead. Thanks to @mont29, we already have an iterator that can go over all data block references. It seems it should be easy to write a function that tags all data blocks that should be kept when writing .blend files.

My hope is that this can reduce a lot of complexity with ID management in Blender.

We are not taking much advantage of reference counting right now. There are a few places that check if the count is 0 or 1 to avoid work. However, we do have to significantly improve performance when working with many datablocks, and looping over all datablocks to delete one datablock is too slow.

I think the solution for that is to effectively make all datablock references “weak pointers”, so that references are stored in both directions. We don’t necessarily need to store a user count, but updating such weak pointers would happen in all the same places where we currently increase and decrease reference counts.

I think the time of datablock memory freeing has to be predictable, and garbage collection is not ideal for that.

I do think the current logic for when to reference count is too complex. I think this can all be abstracted away behind a function that sets and clears pointers to other datablocks, which should be able to tell if reference counts / weak pointers are needed based on flags on the datablocks. Right now add / copy functions have to pass along flags and decide if reference counting is needed, which I think is wrong. It should be clear if a datablock is part of the main database and reference counting, or if it’s owned by something else without reference counting.


I was not proposing to do the freeing at random points in time. The freeing would still happen at the same moments it happens now, just without the user counts. I was calling it “something like garbage collection” because, for that to work, we’d also have to start at some data block and check what is reachable from there.

Anyway, thanks for the links. Caching the relationships would make things more efficient indeed. That makes the reference counts implicit as well.