Sadly, I missed the workshop because I was on vacation.
I do think that most of the things that were discussed sound great.
However, there are a few head-scratchers that I would like to discuss:
Cache
If the whole pixel processing stack is faster (via SIMD/GPU, and evaluated at preview resolution), and video decoding has no stalls (via ffmpeg caching, GPU decoding etc.), then maybe “final frame” caches are not needed at all?
We should absolutely make everything as fast as possible! 
But I don’t think we can get rid of final frame caches for a few reasons:
- Even if we make decoding and compositing really fast, people will be able to bring it to a crawl with enough stacked videos or enough high definition video (for example, 3 4K 60fps HDR strips on top of each other with filters applied on top).
- Caching helps with slow network drives and can also help when running on underpowered computers.
- It avoids needlessly wasting compute resources: with a cache we don’t need to re-evaluate the same frames all the time (rough sketch below).
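To make that last point concrete, here is a minimal sketch, in plain Python and with hypothetical names (`FinalFrameCache`, `composite_fn`, `state_hash`), not actual Blender code: final frames are keyed by their frame index plus a hash of everything that influenced them, so an unchanged frame is only ever composited once.

```python
from collections import OrderedDict


class FinalFrameCache:
    """LRU cache of fully composited frames (sketch only, not real Blender code)."""

    def __init__(self, capacity=512):
        self.capacity = capacity
        self.frames = OrderedDict()  # (frame_index, state_hash) -> rendered image

    def get(self, frame_index, state_hash):
        key = (frame_index, state_hash)
        if key in self.frames:
            self.frames.move_to_end(key)  # mark as recently used
            return self.frames[key]
        return None

    def put(self, frame_index, state_hash, image):
        self.frames[(frame_index, state_hash)] = image
        if len(self.frames) > self.capacity:
            self.frames.popitem(last=False)  # evict the least recently used frame


def frame_at(cache, frame_index, state_hash, composite_fn):
    """Return the final frame, running the expensive composite only on a cache miss."""
    image = cache.get(frame_index, state_hash)
    if image is None:
        image = composite_fn(frame_index)  # decode + stack + effects, the slow part
        cache.put(frame_index, state_hash, image)
    return image
```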
By the same thought, maybe “prefetch” should be only concerned with prefetching/preloading source media files, not rendering the whole frame?
I think we still need prefetch to render the whole frame, as sometimes the actual composition of the final image is what is slow, not the reading/decoding of the source media files themselves.
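As a rough illustration (again just a sketch reusing the hypothetical cache above, not real API), a prefetch worker that composites whole frames ahead of the playhead turns playback into a simple cache lookup even when the composite itself, rather than the decoding, is the slow part:

```python
import threading
import time


def start_prefetch(cache, composite_fn, get_playhead, state_hash, lookahead=64):
    """Keep `lookahead` frames past the playhead composited and cached (sketch)."""
    stop_event = threading.Event()

    def worker():
        while not stop_event.is_set():
            playhead = get_playhead()
            missing = [f for f in range(playhead, playhead + lookahead)
                       if cache.get(f, state_hash) is None]
            if not missing:
                time.sleep(0.01)  # everything ahead is already cached, idle briefly
                continue
            # Composite the whole frame: even if decoding the sources is fast,
            # the composite itself can be what playback would otherwise stall on.
            cache.put(missing[0], state_hash, composite_fn(missing[0]))

    threading.Thread(target=worker, daemon=True).start()
    return stop_event  # call .set() on this to stop prefetching
```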
Similar to trying to pre-render scene strips: if they are fast enough, there feels little point in pre-rendering them. If they are slow, then trying to pre-render them in the background will slow everything down anyway.
I don’t really understand what the argument is here.
“It is slow to render, so don’t bother pre-rendering it?”
To me it is a bit like saying “Cycles is slow to render, so don’t bother pre-rendering it to a video file. Even if viewing the resulting video file will be fast, the creation of it is slow, so just don’t do it.”
This is the absolute best case scenario for pre-fetching and caching!
You can’t play back in real time, so instead you pre-render it (slowly) so that you can then watch it in real time later.
The whole point of pre-fetching and caching is that you spend time up front to get good performance, instead of doing everything in real time with horrible performance.
There might still be a need for a very small/simple “only during one frame” cache of intermediate steps. Imagine being on one frame, and dragging some slider that controls parameters of an effect; might want to cache everything below the effect for more interactive tweaking.
This is exactly what our “preprocessed” and “composite” caches try to solve. So it feels a bit weird that we want to remove them if we want features like this?
I do agree that we could probably limit these caches to just one frame though.
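For what that could look like, here is a tiny sketch of the “only during one frame” idea (hypothetical names like `lower_stack_fn` and `effect_fn`, not our actual cache code): everything below the effect being tweaked is composited once per frame, and only the tweaked effect re-runs while a slider is dragged.

```python
class SingleFrameIntermediateCache:
    """Cache the composite of everything below the effect being tweaked (sketch)."""

    def __init__(self):
        self.frame_index = None
        self.below_effect = None  # composited result of all strips below the effect

    def result(self, frame_index, lower_stack_fn, effect_fn, effect_params):
        if self.frame_index != frame_index or self.below_effect is None:
            # Only re-composite the lower strips when the frame changes.
            self.below_effect = lower_stack_fn(frame_index)
            self.frame_index = frame_index
        # Re-run just the effect being tweaked; cheap enough per slider update.
        return effect_fn(self.below_effect, effect_params)
```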
Strongly typed channels
Flexibility of “anything can go into a channel” is not often used in actual production
We talked about this already in a Google Doc that Aras created. I’m a bit sad that it doesn’t seem like any of the feedback John Kiril Swenson and I posted there made it into the meeting about this topic. Why was this the case, or was it simply ignored?
I’ll post my and John’s comments here for clarity:
Mine:
I’ve talked to multiple users, and most are either indifferent or think it is a great feature that you can mix strip types and not have strongly typed channels.
The reason users think it is a great feature is that you can easily group strips together, so it is easy to work on a “focused blob”. If channels could only contain certain types, then the end user would be forced into a certain way of organizing their timeline.
In Blender they can instead organize in ways that make sense to them.
(And of course this ongoing organization effort does sometimes result in channels used only for audio or video, etc.)
John:
After Effects works like this (with untyped channels) and it’s very successful software. I think the A/V split paradigm, where some channels are dedicated to audio and some to video and so on, only really accommodates the very specific (albeit common) use case of having to edit simple 2-channel footage.
We could think about introducing features to help users out in these cases: for example, while you’re dragging strips, scrolling up could increase the distance between your audio and video strips, so that if you want to stack video on video and audio on audio you still can. Or maybe a setting to switch between “channel display types”.
But it’s very flexible to allow any channel to have any kind of strip. Confusion can be completely avoided if the UI makes it obvious which strip is which type (e.g. audio strips have a line down the middle displaying their waveform).