# Node Tree data structure

Currently node trees are stored as a linked list of nodes and a linked list of links. That leads to some really inefficient code if you want to walk over that tree. Take a look at this `nodeChainIter` function for example. It takes `O(n^2)` time and `O(k)` stack memory where `n` is the amount of nodes and `k` is the length of the chain (potentially equal to `n`). While the memory needs can be reduced by using an iterative instead of recursive function, the running time complexity remains quadratic if you don’t use any auxiliary data structure.

Being able to quickly find linked nodes is essential for more sophisticated tree walking algorithms. I see two ways to achieve this:

1. Every time an operator needs this data structure it builds it based on the current `bNodeTree` struct. This is essentially what is done in Animation Nodes. Every time the node tree changes, this structure is rebuild from scratch because working on the original node tree data is so slow (note: the Python API does not really make it apparent that the `socket.links` or `socket.node` are very inefficient. That hit me hart some years ago.)
2. Refactor the `bNodeTree` struct to use a different data structure, that supports the operations we need out of the box. Also it would be good to get rid of these linked lists in many places but that is a bigger topic I guess… This should be the preferred solution in my opinion but I don’t know if this is an acceptable change in the Blender source code.

I think we have to go one of these ways sooner or later. The question is which one?

When that is decided, I should be able to do the refactoring myself.

3 Likes

As we move towards “everything nodes” this would be a good thing to look at. Refactoring has larger implications though. What operations and aux data structs would be needed for more performant node chain iteration logic?

Most importantly fast lookup of connected nodes/sockets is necessary. So for example if you have one node, it should take `O(k)` time to find all the nodes that come directly after it, where `k` is the amount if nodes that come directly after it. Currently the time complexity of this operation is `O(n)` where `n` is the total number of links. (It gets even worse when you want to resolve reroute nodes in the same run)