Remove support for saving MemFile undo steps as .blend files [Proposal]

This document proposes removing the ability to dump MemFile undo steps as .blend files, which is currently done for auto-save. This makes auto-save slower, but allows for more optimization opportunities in the undo system.

Background

The undo system in Blender uses the same serialization code that is used for writing .blend files. Essentially, it currently creates an in-memory .blend file that is restored whenever the user wants to undo. On top of that simple approach, there are a few optimizations that reduce duplication between undo steps and avoid full depsgraph evaluations after undo.

This approach generally works well and is not subject to debate here. However, there is one additional aspect which was originally a good idea but is now causing some problems and limitations. Blender has auto-save which by default saves the current state every 2 minutes. This should be fast, because if it’s not, it causes annoying (short) freezes of the UI.

To make it fast, Blender currently just dumps the last MemFile undo step to disk as a .blend file. This way, the auto-save code does not have to serialize all Blender data again, which results in better auto-save performance.

Downsides of Current Approach

Unfortunately, this approach to auto-save also has some downsides, which is why I propose removing support for saving undo steps as .blend files. The most prominent downsides are:

  • Not every undo step is a MemFile undo step. For example, when in mesh edit mode, a different kind of undo step is used. This also means that auto-save currently essentially ignores all changes done while in these modes, making it an unreliable tool.
  • It makes it much harder to optimize undo because every undo step also has to be able to become a valid .blend file with little overhead.

Possible Undo Optimizations

There are a few concrete things we already do or want to do to improve undo performance which would benefit from the proposed change:

  • Skip forward compatibility conversions. Some features in Blender have been implemented with explicit support for forward compatibility. That usually means that Blender stores data in an older format in .blend files so that older versions can read them. At run-time the new storage format is used. This conversion step makes saving and reading files slower. When creating a MemFile undo step, the conversion should not be necessary because only the current Blender session has to be able to read the data again. Therefore, skipping the conversion for undo is reasonable. Unfortunately, when saving such undo steps as .blend files, the resulting file is quite different from what you’d get when saving normally, which is unexpected and has led to bugs in the past already.
  • Skip writing some data to the MemFile, e.g. by using implicit sharing. This can make creating undo steps and undoing itself much faster. There is other data that doesn’t have to be written to undo steps, for example the SDNA. It’s likely that we’ll see more such optimization opportunities. A minimal sketch of the sharing idea follows below.
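
To make the implicit-sharing bullet more concrete, here is a minimal sketch. It is not Blender’s actual API; names like SharedArray and MemFileChunk are hypothetical. The point is that an undo step can hold a reference-counted pointer to unchanged data instead of copying the bytes, but then the step is no longer a flat byte stream that can simply be dumped to disk as a .blend file.

#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

/* Hypothetical reference-counted buffer, standing in for implicitly shared data. */
struct SharedArray {
  std::shared_ptr<const std::vector<float>> data;
};

/* A chunk in a MemFile undo step: either an owned copy of serialized bytes,
 * or just a reference to shared data that is not copied at all. */
struct MemFileChunk {
  std::vector<std::byte> owned_bytes; /* Serialized copy (old behavior). */
  SharedArray shared;                 /* Shared reference (new behavior). */
};

/* Old approach: copy the whole array into the undo step, even if it did not change. */
MemFileChunk encode_by_copy(const std::vector<float> &array)
{
  MemFileChunk chunk;
  chunk.owned_bytes.resize(array.size() * sizeof(float));
  std::memcpy(chunk.owned_bytes.data(), array.data(), chunk.owned_bytes.size());
  return chunk;
}

/* New approach: the undo step only adds a user to the shared data. Creating the
 * step is O(1) for this array, but the step is no longer a flat byte stream that
 * could be dumped to disk as a .blend file without extra work. */
MemFileChunk encode_by_sharing(const SharedArray &array)
{
  MemFileChunk chunk;
  chunk.shared = array; /* Copying the shared_ptr bumps the reference count. */
  return chunk;
}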

What should auto-save do instead?

Instead of using the MemFile for auto-save, we would use the normal saving code. This is generally slower than the current approach. From testing, it seems that the auto-save time triples in the worst case.

This sounds quite bad, so there is definitely a trade-off between faster auto-save and faster undo. Fortunately, there are also a few things we can do to improve the auto-save situation again.

How to make auto-save less annoying?

The longer auto-save takes, the more annoying the unexpected freezes of the UI become. There are quite a few things that were already done or can still be done to improve the situation:

  • The most obvious approach is to just make normal saving faster. The design of .blend files allows writing them fairly efficiently. If we get normal file-saving to be bottlenecked only by disk-write-speed, it should have close to the same performance as the old auto-save. To this end, I started experimenting with serializing ID data-blocks in parallel to achieve higher throughput (see the sketch after this list).
  • We could skip some things we do for normal file saving like calling BKE_lib_override_library_main_operations_create which is essentially just a fail-safe. Skipping it should still generate valid files.
  • Restart auto-save timer when saving manually. This makes it so that people who have a habit of saving all the time anyway never get a frozen UI due to auto-save.
  • Improve perceived auto-save performance by trying to find a better moment in time for auto-save where it is less intrusive. There are a multitude of different heuristics that could be used to improve the user experience.
  • Notify the user before auto-save happens. For example, the status bar could show a count-down starting 5 seconds before the auto-save. The user could cancel it by saving manually. This way, the auto-save freeze is less unexpected, and it even teaches the user to get into the habit of saving regularly manually.
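
Regarding the parallel-serialization experiment mentioned in the first bullet point, here is a minimal sketch of the general idea; serialize_id, IDBuffer and write_blend_parallel are hypothetical stand-ins, not Blender’s actual writefile code. Data-blocks are serialized into independent buffers on multiple threads, and only the final disk write is serial, so the file layout stays deterministic.

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <thread>
#include <vector>

struct ID { int placeholder; };          /* Stand-in for Blender's data-block type. */
using IDBuffer = std::vector<std::byte>; /* Serialized bytes for one data-block. */

/* Placeholder serializer; the real code would write the ID's DNA-described data. */
static IDBuffer serialize_id(const ID & /*id*/)
{
  return IDBuffer{};
}

/* Serialize all data-blocks in parallel, then write the buffers in order so the
 * resulting file stays deterministic. Only the final disk write is serial. */
static void write_blend_parallel(const std::vector<const ID *> &ids, std::ofstream &file)
{
  std::vector<IDBuffer> buffers(ids.size());
  const size_t num_threads = std::max<size_t>(1, std::thread::hardware_concurrency());
  std::vector<std::thread> threads;

  for (size_t t = 0; t < num_threads; t++) {
    threads.emplace_back([&, t]() {
      for (size_t i = t; i < ids.size(); i += num_threads) {
        buffers[i] = serialize_id(*ids[i]);
      }
    });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
  for (const IDBuffer &buffer : buffers) {
    file.write(reinterpret_cast<const char *>(buffer.data()),
               static_cast<std::streamsize>(buffer.size()));
  }
}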

More suggestions are welcome.

Can we make the undo optimizations work without the proposed change?

In theory, it is possible to skip implicitly shared data and forward compatibility conversions when writing the MemFile. When saving it as a .blend file, we would then still have to do the forward compatibility conversions and insert the shared data in the middle of the MemFile. This still wouldn’t solve the missing data when e.g. in mesh edit mode, but it could solve some problems.

While this works in theory, in practice it’s quite tricky. So tricky, in fact, that we likely wouldn’t implement some undo optimizations because it’s not worth the complexity. Another issue is that this code path would be rarely tested in practice. Only when it really matters, i.e. when the user has lost work due to a crash, are the auto-save files read back into Blender. I’d feel better knowing that auto-save is just a normal save. Then, as long as normal save works, auto-save also results in a valid file.

Conclusion

In the end it is a trade-off. Personally, I’ve come to the conclusion that undo performance matters more than auto-save performance, simply because it happens more often and affects more people. The extra work we currently do for every undo step allows auto-save to be faster but it comes at a cost that I think is not worth it.

16 Likes

I highly agree with this. In every piece of software, I proceed as if autosave doesn’t even exist, and hit Ctrl-S on a regular basis. I don’t trust timers to keep me from losing hours of work.

7 Likes

@jacqueslucke So I read a bit into this after @ZedDB brought this up to me.
As long as auto-save performance does not regress much, I think this is OK. Improving undo performance is a bigger gain. But on heavy sculpting and production files the slower auto-save could become very disruptive. So I’d appreciate it a lot if the saving performance can actually be significantly improved soon after.

Making it more obvious when and if Blender is auto-saving is a high priority then. Having Blender regularly freeze for no discernible reason is a big frustration.
But a visible timer is too distracting, especially when auto-saving every 2-3 min.

Finding the right moment to auto-save is also a good point. Maybe after navigating is a good moment since that’s usually a short pause for reorienting the user. Or generally if there’s a significant pause of the user not giving input.
It shouldn’t happen in between strokes, or while using tools and operators.

@ZedDB also mentioned that (if possible) it would be a great addition to still allow navigation while auto-saving.

7 Likes

Something to at least consider, if we move to using the normal saving code, is always enabling Compression during auto-save, to see whether reducing the amount of data written to disk makes a decent difference in total time spent.

A quick measure of the regular save code path shows 435ms to save out a complex 1.3 GB file, but only 340ms to save out the same file with Compression enabled, at 290 MB. There are probably a few cross-over points where compression makes things better and/or worse; more files, and platforms, would have to be tested. This doesn’t make up the 3x difference, but it could be a free improvement after larger changes like parallel serialization. [EDIT]: Yeah, some other files show a regression with Compression, from 180ms to 220ms (original file size was 600 MB).

We haven’t updated zstd in a while, but there have been some small improvements since then as well, notably for Apple Silicon, so measurements there would be most interesting.
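
For reference, here is a minimal sketch of what enabling compression for auto-save boils down to, using the plain zstd C API (ZSTD_compressBound, ZSTD_compress, ZSTD_isError); the function name and the fallback behavior are just assumptions for illustration, not Blender’s actual code.

#include <vector>
#include <zstd.h>

/* Compress a serialized buffer with zstd before writing it to disk.
 * Returns the compressed bytes, or an empty vector on error. */
std::vector<char> compress_for_autosave(const std::vector<char> &raw, int level = 3)
{
  std::vector<char> compressed(ZSTD_compressBound(raw.size()));
  const size_t compressed_size = ZSTD_compress(
      compressed.data(), compressed.size(), raw.data(), raw.size(), level);
  if (ZSTD_isError(compressed_size)) {
    return {};
  }
  compressed.resize(compressed_size);
  return compressed;
}

Level 3 is zstd’s default; whether a faster (or even negative) level is a better fit for auto-save would need the kind of measurements mentioned above.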

Can you briefly explain why auto save and implicit sharing are incompatible? It’s not obvious to me from reading the pull request.

I would think forward compatibility is not a concern for auto-save, but maybe the special case of reading such a file without forward compatibility would be too easy to break.

The reason I’m asking is that, in theory, saving undo steps to disk is a nice way of supporting a longer undo history without running out of memory. At least there were ideas like this at some point. And this seems to be moving away from having undo steps be compatible with writing to disk.

2 Likes

I am also not convinced that moving away from keeping global undo and blendfile writing at least somewhat compatible is the way to go. And I still do not understand why it seems to be so complicated to have some sort of two-step file-write process, with the data behind the shared pointers simply being ‘blendfile-converted’ when the auto-save file is written, instead of when the undo step is created.

As for forward compatibility, IIRC when the whole Mesh data was being converted to attributes, undo code already was using a specific file-write process, which was skipping the conversion back to the old Mesh format (and actually created issues with autosave at some point if my memory serves well :wink: ).

At the very least, if it is decided to split ways, imho the improvement to blendfile writing speed should be done and validated before changing the undo code. Breaking something that works and saying ‘we’ll fix it later’ is never a good thing, and almost always a fairly painful experience for both users and devs.

But on heavy sculpting and production files the slower auto-save could become very disruptive. So I’d appreciate it a lot if the saving performance can actually be significantly improved soon after.

I do wonder, does auto-save while sculpting actually do anything useful for you? I was under the impression that just like in mesh edit mode, auto-save wouldn’t save any changes you made since entering sculpt mode currently.

Finding the right moment to auto-save is also a good point. Maybe after navigating is a good moment since that’s usually a short pause for reorienting the user. Or generally if there’s a significant pause of the user not giving input.

Might be interesting to look at a video of you working, then checking where the auto-save freezes happened and then trying to find a better moment in the video where the auto-save could have happened. Based on that, we can try to find a good heuristic.

Also mentioned that (if possible) it would be a great addition to still allow navigation while auto-saving.

Allowing navigation while saving is probably quite difficult, because navigation itself changes what is saved (at least the viewport transform).

Something to at least consider, if we move to using the normal saving code, is always enabling Compression during auto-save, to see whether reducing the amount of data written to disk makes a decent difference in total time spent.

Could be considered. I’d assume that whether this actually helps highly depends on your scene and hardware configuration. It helps more when the disk has slower write-speed so that the overhead of doing the compression is worth it.

Can you briefly explain why auto save and implicit sharing are incompatible? It’s not obvious to me from reading the pull request.

As mentioned, I don’t think it’s fundamentally incompatible. Supporting it would require some more changes like better decoupling of the serialization of data-blocks and writing full .blend files. This is a bit similar to what I tried when experimenting with multi-threaded serialization. Some additional notes:

  • Serializing implicitly shared data for auto-save is not just copying a single memory array. The shared data may require many BHeads. This is for example the case for vertex groups. Implicit sharing can also make sense at even higher levels, like entire grease pencil layers or a list of FCurves.
  • Serializing this data just for auto-save also makes auto-save slower, and it will become slower the more we embrace implicit sharing, up to the point where it has approximately the same performance as normal file saving.
  • I’m not sure if all the shared data should be interleaved with the non-shared data when writing the MemFile to disk, or whether the shared data can also come at the end of each ID. Either way, there is some overhead involved in writing these smaller memory chunks to disk that are currently part of a bigger memory chunk. This might become important in production files with many data-blocks. The disk-write-throughput is often significantly worse when writing smaller buffers, which matters because we also use implicit sharing for relatively small arrays. A small sketch of one way to reduce this overhead follows below.
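
Regarding the small-buffer overhead in the last point, here is a minimal sketch of one possible mitigation; ChunkWriter is a hypothetical helper, not Blender’s actual writefile code. Small shared-data chunks are accumulated in a staging buffer and flushed to disk in larger writes.

#include <cstddef>
#include <fstream>
#include <vector>

/* Accumulates many small chunks and flushes them to disk in large writes,
 * so throughput is not dominated by per-write overhead. */
class ChunkWriter {
 public:
  ChunkWriter(std::ofstream &file, size_t flush_threshold = 4 * 1024 * 1024)
      : file_(file), flush_threshold_(flush_threshold)
  {
  }

  /* Append one small chunk; only hit the disk once enough data has accumulated. */
  void write_chunk(const void *data, size_t size)
  {
    const char *bytes = static_cast<const char *>(data);
    buffer_.insert(buffer_.end(), bytes, bytes + size);
    if (buffer_.size() >= flush_threshold_) {
      this->flush();
    }
  }

  void flush()
  {
    file_.write(buffer_.data(), static_cast<std::streamsize>(buffer_.size()));
    buffer_.clear();
  }

 private:
  std::ofstream &file_;
  size_t flush_threshold_;
  std::vector<char> buffer_;
};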

Overall, every time I started implementing this, it felt like it’s not worth the complexity and that there are better ways to achieve the same goals.

I would think forward compatibility is not a concern for auto-save, but maybe the special case of reading such a file without forward compatibility would be too easy to break.

I can only say that every time we did something like this, we ended up with bug reports after a while. Obviously, it should just be tested more explicitly, but it’s still easy to break. Versioning code also becomes more complex. That’s because converting the old data to the new data usually happens at the end of all versioning code, so all versioning code added since the new data representation was introduced has to be able to handle both the old and new format.
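
To illustrate why this makes versioning code more complex, here is a hypothetical sketch; the struct layout and the versioning function are made up for illustration and are not Blender’s actual code. Because the old-to-new conversion normally runs at the very end of versioning, a versioning step written after the new format landed would normally only ever see the old layout; if undo or auto-save files skip the forward-compatibility conversion, it has to handle both.

/* Hypothetical runtime mesh with both the old and the new storage. */
struct Mesh {
  float (*legacy_positions)[3] = nullptr;   /* Old format, written for forward compatibility. */
  float (*position_attribute)[3] = nullptr; /* New format, used at runtime. */
  int verts_num = 0;
};

/* A versioning step added after the new format landed. Normally it only ever
 * sees legacy_positions in files read from disk. If undo or auto-save files
 * skip the forward-compatibility conversion, the step must handle both layouts. */
void version_scale_mesh(Mesh &mesh, float scale)
{
  float(*positions)[3] = mesh.position_attribute ? mesh.position_attribute :
                                                   mesh.legacy_positions;
  for (int i = 0; i < mesh.verts_num; i++) {
    positions[i][0] *= scale;
    positions[i][1] *= scale;
    positions[i][2] *= scale;
  }
}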

The reason I’m asking is that, in theory, saving undo steps to disk is a nice way of supporting a longer undo history without running out of memory. At least there were ideas like this at some point. And this seems to be moving away from having undo steps be compatible with writing to disk.

Interesting idea, I hadn’t considered that before. When reading this at first I thought you were right, but now I’m starting to change my mind.

  • Removing support for saving MemFile undo steps to .blend files does not feel much more incompatible with this idea than not supporting saving other kinds of undo steps as .blend files. It would be quite confusing if the fact that there are different types of undo steps leaks even more into user-land. We could support saving all undo step types to disk, but that wouldn’t result in .blend files.
  • Not sure who the target audience for such a feature is. Seems tricky to use this in production files. Every time an undo step is created, we’d have to write a large .blend file (after the initial period where everything fits in RAM). For smaller files, one could also just do a full .blend serialization for every undo step including everything as that would be fast enough.
  • Requiring the undo steps to be written as normal .blend files also means that we’d lose implicit sharing at that point. I’d think that we would want to keep the implicit sharing between undo steps working even when the undo steps are on disk. Personally, I just wouldn’t call these files .blend files anymore, because to me that implies that they are just like all other .blend files.

I am not sure if this is a short-term or long-term proposal to diverge the MemFile and .blend file. Locally I can see why it might be on the table. For the longer term not so much.

To me it seems a bit strange to consider that it is important to preserve all kinds of implicitly shared data as such for undo, but not for regular save. Save-and-reload should not make the file use much more memory because some implicit sharing was lost. So to me it is more like we do need to support proper (de)serialization of implicit sharing to regular .blend files anyway.

But it also depends on exactly which data is shared. If the undo step references data from the current state, and not data within the undo step, then I do not really see why it can’t be written as a regular .blend file.

Yes, it would be nice to support implicit sharing within a .blend file. It seems doable, even if it’s not entirely straightforward. For the case mentioned by Brecht where we’d want to store multiple undo steps on disk, sharing data between different undo steps seems more important. And this seems much harder to do with .blend files. At least it would stretch the idea of what a .blend file is a bit.

I’d like to emphasize this:

As it works right now, auto-save will ignore any changes (undo steps) made in the following modes:

  • Edit Armature, Curve/Curves, Font, Lattice, MetaBall, Mesh
  • Paint Curve, Image
  • Sculpting
  • Particle Edit
  • Text Editing

This is because the code scans the undo stack backwards for any MemFile step and, if it finds one, saves it. Otherwise, it will flush changes from editors (e.g. mesh editing, sculpting) and do a normal save.

The problem is: this last case of flushing the editors never happens (correct me if I’m wrong).
When Blender starts up, it automatically creates an undo step called “Original” (which is a MemFile!). So if the user e.g. loads their sculpting file and starts sculpting, none of their changes are ever saved during auto-save.
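
A rough sketch of the logic described above, with hypothetical names rather than the actual window-manager auto-save code: because the “Original” MemFile step is pushed at startup, the backwards scan always finds a MemFile step, so the fallback branch that flushes editors and saves normally is effectively unreachable.

#include <vector>

enum class UndoStepType { MemFile, MeshEdit, Sculpt /* ... */ };

struct UndoStep {
  UndoStepType type;
  /* ... */
};

/* Hypothetical sketch of the current auto-save decision. */
void autosave(const std::vector<UndoStep> &undo_stack)
{
  /* Scan backwards for the most recent MemFile step. */
  for (auto it = undo_stack.rbegin(); it != undo_stack.rend(); ++it) {
    if (it->type == UndoStepType::MemFile) {
      /* Dump this step to disk. Changes made in e.g. edit or sculpt mode after
       * this step are not included. */
      // write_memfile_to_disk(*it);  /* hypothetical */
      return;
    }
  }
  /* Fallback: flush editor data and do a normal save. In practice this is never
   * reached, because the "Original" step pushed at startup is a MemFile step. */
  // flush_editors_and_save_normally();  /* hypothetical */
}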

To summarize: “Auto-save” doesn’t auto save changes made in any of the modes listed above. The user has to either manually save, or switch to e.g. object mode to manually flush editor changes.

IMHO, it’s a stretch to call this “auto-save” right now.

2 Likes

Thanks for the explanation, I can see how implicit sharing makes undo saving difficult. It’s unfortunate because auto save could even be made to work asynchronously if it only needs the undo stack.

What I would suggest to do is:

  • Change auto save to not do anything in these modes. But do run after exiting the modes and other places that flush changes, if auto save time has passed. This way there is no big regression even if the behavior remains suboptimal.
  • Merge the implicit sharing undo patch, since this is an important optimization that should not get blocked for much longer.
  • Plan some time to improve auto save performance in a more fundamental way.

In some way I still think it would be good if auto save could work based on the undo stack, and even work asynchronously in a background thread. To me auto save does not necessarily have to write a general .blend file exactly, if that gets in the way.

I also think that ideally everything should be stored in the memfile undo stack, rather than having a single stack but still separate storage that continues to cause problems. This is what I hoped the unified undo stack in 2.8 would be, but it didn’t go that far.

6 Likes

That would be ideal, yes. It’d be interesting to get a sense for how much slower e.g. edit mesh undo would be with memfile + implicit sharing. Edit mesh undo already does a conversion from BMesh to Mesh anyway.

That sounds awesome to me, as then we would not lock up the interface while saving. It could also solve the “changes won’t be saved in sculpt mode without toggling edit modes” issue.

2 Likes

Sounds like a good plan to me.

Good point about asynchronous auto-save, I also hadn’t considered that yet. Support for writing the auto-save file asynchronously also alleviates some of the arguments I made before, which came down to: the more we embrace implicit sharing, the slower auto-save becomes. That’s because the time it takes to auto-save doesn’t really matter much anymore.

We could potentially even implement this in a way that does all the forward-compatibility conversions when auto-saving to avoid having different kinds of .blend files. Or we tag auto-save files in some specific way so that they don’t run versioning code and can only be opened by the Blender executable that created them.

“How much slower is memfile undo (with implicit sharing) compared to mesh edit mode undo?”

Because I was curious, I did the test (this is getting a tad off-topic, but I thought it would still be interesting to share). You can check out the code here. I basically combined #106903: Undo: support implicit-sharing in memfile undo step and #110022: Undo: Always fallback to memfile undo, as well as making some small changes to allow memfile undo to be used in mesh edit mode.

So here are the numbers:

Mesh: ~1M vertices, ~2M triangles

Edit Mesh Undo:  undosys_step_encode: Average:  ~90ms (but sometimes spiking to 2sec, don't know why...)
Edit Mesh Undo:  undosys_step_decode: Average: ~600ms

Global Undo:     undosys_step_encode: Average:  ~92ms
Global Undo:     undosys_step_decode: Average: ~621ms

Mesh conversion:
BMesh to Mesh:                        Average:  ~90ms (99%)
Mesh to BMesh:                        Average: ~476ms (76%)

So the short answer is “not much slower”. But I didn’t test this with a lot of files, so who knows. Interesting nonetheless.

1 Like

About auto-save, other software often just has a reminder in the UI, so it’s not done in the background.