Asset Linking [Proposal]

I honestly don’t think that at all. For all data-blocks where you have control over the name, you can do exactly the same as you are doing now, so maybe this proposal does not affect you at all. The case where it would affect you is if you want to use external asset libraries, including the one that is shipped with Blender. Currently, by using append and reuse, copies of those data-blocks are introduced into various files, which wouldn’t be the case anymore with the proposed system, solving your problem without you having to do anything extra.

Uniquely identifying data-blocks just based on their name might work within a tightly controlled setting like yours, but I have my doubts that this can work more generally, given the different kinds of users Blender has. It would be good if one could name their data-blocks however one wants without worrying that some asset library has a data-block with the same name.

I think it’s good to find all the places where Blender creates .001 duplicates where it shouldn’t and to find solutions for those cases. From what I’ve seen in Blender Studio production files, the main cause of unintentional data copies is the use of append (which semantically creates a copy) instead of linking. This proposal tries to provide a way to use linking where one previously had to append, avoiding the problem at the source without the downsides that linking has in many cases.

As you’ve seen, even among core developers there are still quite different ideas for how this should work. This proposal is just one that I’ve been thinking about this week; we’ve discussed others in the past and will probably discuss yet others in the future. It’s good to get this feedback though.

4 Likes

It seems to me that deferring the check to the user is a lot more busywork. It’s also error-prone, unless the user in question always stays on top of the ins and outs of the database in a given production. The moment they don’t, that’s when a datablock gets deduplicated by mistake and data is lost. Not lost forever, sure, but it’s lost track of, and might not be noticed for some time. I think without namespaces that kind of confusion is bound to happen at some point. From what I understand of the implications, I think I’d prefer some kind of automation. As for which system would handle that best, I can’t bring anything of value to that discussion.

Does that ever actually happen?

1 Like

This is exactly why it doesn’t sound like a “fixable” problem to me: how can you make sure that whatever system is implemented will successfully cover any and all possible use cases, given all the different kinds of users? I still believe the real problem is not having control over what Blender does when several datablocks have the same name; it doesn’t matter if it’s in a controlled production environment or when using external assets and libraries. I mean, there could be an automatic option for people who don’t want to think about that and prefer Blender to handle duplicates automatically, and there will be many cases where the automatic solution is more than enough; but still, having control over that is really needed.

We agree on that, but the first and easiest solution before anything else is giving the user the option to control what happens, before it happens. In my opinion Blender shouldn’t try to be smart enough to guess what to do all the time; it should be dumb enough to ask what to do when there’s no clear solution. And since there are so many kinds of users and uses for Blender, there’s no way to ensure the program will always automatically know how to handle things.

Append should be as easy as it sounds: bring the asset into the file as local data and keep everything editable, and IF there are datablocks with the same name, just ask the user what to do. It’s that simple.

2 Likes

I don’t believe there’s going to be an easy solution that covers all cases. Nor do I believe that nothing should be done until such a solution is conceived.

For me, I’m far more concerned about duplicating my own assets than about downloading something off a random Internet site and hoping that Blender can figure out that these are not my assets with the same name.

Perhaps on import/append there might be some merit to Blender identifying questionable duplicates, giving the user a list, and asking them to confirm which of these to replace and which to carry through as a copy of the secondary version.

2 Likes

I agree with Jacques that comparing just based on names is not going to work well. It’s very easy to run into naming conflicts, and when it happens, how do you then decide which ones in a list of potentially dozens are really the same?

Especially if you’re working in a team, or with a blend file created weeks or months ago, how do you check if it’s safe to deduplicate? Open both blend files and tediously compare e.g. all the nodes in a node group? And if you get it wrong, you might find out hours later that your geometry nodes group isn’t quite working as it should.

My concern is not that the link will break, but that deduplication will stop working properly when files get renamed, reorganized, or transferred between computers. Or that it incorrectly deduplicates something when it shouldn’t.

I wasn’t really imagining solving this case, only the append and make-local cases, where performance is not so much of a concern. Maybe we need to do better at hiding indirectly linked datablocks in the UI, or have project-level functionality to put such duplicate datablocks in a blend file inside the project that it then links to. Or to deduplicate things inside a project by looking across blend files.

If we did want to solve this without linking and make loading fast, a possibility would be to store a hash of the datablock contents that is updated whenever an edit is made and the blend file gets saved. Though I’m not sure this is really something we should be doing.
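For illustration only, here is a minimal sketch of how that idea could be prototyped as a Python add-on; the real thing would presumably live in Blender’s core, and serialize_for_hash() is a hypothetical placeholder for whatever canonical serialization of a datablock’s contents would be chosen.

```python
import hashlib
import bpy
from bpy.app.handlers import persistent

def serialize_for_hash(datablock):
    # Hypothetical: a stable byte representation of the datablock's contents,
    # excluding volatile runtime data. repr() is only a stand-in here.
    return repr(datablock).encode()

@persistent
def update_content_hashes(*args):
    # Runs just before every save; the hash is stored as a custom property,
    # so it is written into the .blend file next to the datablock itself.
    for mat in bpy.data.materials:  # materials as an example; other IDs alike
        mat["content_hash"] = hashlib.sha256(serialize_for_hash(mat)).hexdigest()

bpy.app.handlers.save_pre.append(update_content_hashes)
```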

2 Likes

I still believe this should be up to the user, not an automatic thing. But one thing that would make the sorting and decision easier would be if Blender showed a list with the date of the latest change next to each datablock. The most recent might not be the one you want/need but that would be very helpful.

When working in a team (unless it’s a very chaotic process), it’s probable that any change made to an asset also means the asset is saved as a new version, so that’s not really an issue.

1 Like

I have several different thoughts on this approach, which are not necessarily perfectly sorted at the moment, so sorry if my feedback is a bit incoherent and not very concise.

First off: I agree with a lot of the assumptions that are being made about why a system like this is necessary. Duplication of identical datablocks is an existing problem that, in the best case, just adds clutter to the file and, in the worst case, creates immense confusion with untraceable versions of datablocks that have almost the same name and for which de-duplicating can mean loss of data.

Overall this proposal has a lot of aspects that I do like, but also a lot that I don’t.
I do like the approach of keeping assets as library data around in the file and only making them truly local when they need to be edited locally.
But I do want to say that I feel this approach adds a lot of complexity without replacing much of the existing complexity.


I definitely agree that simply using a data-block’s name and type for de-duplication is not a sufficient solution. And while giving the user control over the decision-making process when it comes to de-duplication of data-blocks sounds nice on paper, in practice 90% of cases can undoubtedly be handled automatically and the whole point here is to reduce management overhead.
This point is best illustrated with the issue that arises immediately when we nest assets that will be shipped with Blender.

Example:

  1. Asset A references Asset B
  2. Use B
  3. Use A

This example should not prompt any input from the user to ensure de-duplication of asset B, provided B has not been changed by the user between steps 2 and 3.


What exactly is the point of linking under the immutability condition? Essentially, when the file is not allowed to significantly change anymore, why should I keep the link to the data in a separate file? Only to share the data storage on disk between different files, I assume. But that’s not the only reason why I would usually link data.
When I link a data-block within a project, I expect every update to it (/version) to propagate everywhere it is being used. ‘Regular linking’ with receiving all updates should not stand in conflict with the approach of append+re-use/link+embed. With this proposal I understand the intended use to be dictated more from the asset side, rather than being chosen when the asset is used.


I’m having quite some trouble with the approach of solving asset versioning via the file name. Firstly, I don’t think it’s good to put this many constraints on how file management in a project should work. Of course, some restrictions are necessary, but enforcing versioning like this is not a good solution imo, since this is something that should just work quite generically and leave these kinds of decisions to the user. More critically, I feel that the value of the information the version number gives is heavily diluted in cases with several assets per file. If we keep track of user-set version numbers (which does make sense imo), I think we need to do this on a data-block level.
(I haven’t fully thought through what this means for a hierarchy of versioned assets)


One problem I have with the proposal is also that the ‘publishing’ process is not defined. I assume this is intended to depend on the use-case and on the pipeline the assets are used in, and to be fully in the hands of the user. But since it is such an integral part of how users are supposed to interact with the asset system, this seems to out-source a large part of how it works to something done fully outside the control of Blender, requiring a lot of additional knowledge that is not inherently taught by the system.


I also do think that we need a way for users to explicitly version assets. Simply recognising that two data-blocks are different does not provide enough context to make an informed decision about de-duplication, which might still make sense to do regardless.
This point is not necessary for the actual issue presented here, so it could also be an additional mechanism that happens on top.


I agree with all the points that Brecht is making here.

I’m personally not too concerned with this aspect, tbh. This is already how it works for regular linking of files, and I think it’s not unreasonable for asset libraries to require a consistent structure for this kind of de-duplication to work properly; there can be operators to fix broken references just as well.
(Transferring files between computers should work though, I agree)


My preliminary conclusion

Generally, I do think isolating non-editable, local (in the sense of file storage) asset data-blocks into name-spaces per library is the right approach, so the link+embed concept makes sense to me.
I do think the unique identification of the asset is not something the user should need to worry about in most cases. So (besides hashing the data-block), I think using [an identifier of the library + relative filepath of the asset file + asset name + asset version] would be a more suitable solution than imposing restrictions on the file name and file management process.
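As a rough sketch, such a composite key could look like this (all field names are illustrative, not existing Blender API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssetKey:
    library_id: str      # identifier of the asset library
    relative_path: str   # filepath of the asset file, relative to the library root
    asset_name: str      # datablock name of the asset
    asset_version: int   # explicit, user-set version number

# Two references denote the same asset (and may be de-duplicated) exactly
# when their keys compare equal:
a = AssetKey("essentials", "materials/metals.blend", "BrushedSteel", 3)
b = AssetKey("essentials", "materials/metals.blend", "BrushedSteel", 3)
assert a == b  # safe to re-use the already-embedded copy
```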
Regardless of automatic hashing of assets (which would be great imo), I do think we need asset-level explicit versioning and the ability for users to upgrade to new versions of an asset when available, at least as an explicit action. Managing the version of an asset in the data-block means the publishing step becomes managed from within Blender as well. Old versions could also be optionally (or by default) kept around when creating a new version.

6 Likes

Initial Feedback

  1. Adds yet another layer to the linking/appending workflow, which further complicates the data-management user experience. It is already complicated with the “marking as asset” layer on top of the base layer of linking and appending.
  2. Doesn’t polish conflict checking when linking or storing data, so the former system remains the same and a new system is added to solve what the former systems (which will still exist) require.
  3. Having a user opt in to an upgrade per asset is a tall order, especially once you get to thousands of assets. It also throws a big spanner into any collaborative pipeline.
  4. “Casting” or saving data to libraries from the source file, as a UX, still isn’t exactly solved (if at all) in the proposal.

How data is packed or parsed, and how it is overridden and stored, without adding another layer of usability confusion and know-how for what is ultimately the same thing (data transfer and passing from a global browser), should be the key here: under-the-hood performance and handling.

But as a UX, less is more:

  1. Link, Append from Browser.
  2. Save to Browser (cast to library)
  3. Ship Browser

A user shouldn’t worry about more.

Right now:

  1. Link, Append from browser, all good - fits standards
  2. Save to browser
    2.a First by marking asset
    2.b Then save the blend to the library
    2.c Then reference the library from preferences
  3. Ship Browser
    3.a Pack library blend files into a zip
    3.b Download, unpack zip, then manually reference library

Adding another UX to manage the 3 step workflow of a browser… is not the solution. A lot of the UX from the current browser could be optimized by automating some of the steps for the user under the hood.

If you add the “embed” option as a UX, I guess you should update the “Pack” nomenclature and UX so they are one and the same: a “packed” asset and a “linked” or “appended” asset/data.

Proposed UX:

  1. When appending an asset, “unmark” it automatically, so only data is loaded into the new blend file, not an “asset”. This will remove duplicates from the UX by design, so the source files are the only asset files.
  2. When marking as an asset, add the UX or an operator to “cast to file” (same context menu if necessary) so that you don’t need to work directly in the library’s source-file structure, and a user can “fling” or “save” the marked asset to any designated library anywhere on the system. Save to the selected asset in the browser (overwrite), or save to a new asset file in the library (isolate the data block, cast it to a new blend, save it in the library location).
    2.a Right click on data
    2.b Mark as asset
    2.c Then “Cast to Library” operator
    2.d Select library from a prompt, confirm
    2.e Have an opt-in to “Save to Selected” asset in the browser in the prompt
  3. When “Cast to Library” on a selected asset in browser is used, allow an opt-in versioning prompt: “You are overwriting X file, are you sure?”
  4. Essentials should have linking toggles built in from the get-go (not only append), so if Blender updates, linked assets would update when a user migrates Blender versions. Allow a user to opt out/in with append to break the links. I was not sure why Essentials don’t have “linked” built in; this design question wouldn’t exist if that UX were a fact from the get-go. If linked data was used, then resync overrides and you get your version update (if library overrides were used by the user). If library overrides weren’t used, they get the update automatically, hands-free, when loading files that reference the Essentials libraries as linked data. The quick solution here is also: store Essentials in the USER relative path, and if not there, in the SYSTEM relative path. Libraries loaded and referenced from here would work by using the API that already exists, no need for a new API; if you link from Essentials, it would set the linked data to those paths dynamically (see the sketch after this list).
  5. Allow editing of asset categories outside the source files (drag and drop to organize from anywhere without needing to be in source files)
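Regarding point 4: a minimal sketch of resolving such a path with the existing Python API; the datafiles/assets sub-path is an assumption about where bundled asset files would live.

```python
import os
import bpy

def essentials_base_path():
    # Prefer the per-user location; fall back to the installation-wide one.
    # "datafiles/assets" is an assumed sub-path for bundled asset files.
    for kind in ('USER', 'SYSTEM'):
        base = os.path.join(bpy.utils.resource_path(kind), "datafiles", "assets")
        if os.path.isdir(base):
            return base
    return None

print(essentials_base_path())
```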

Polish the UX, don’t add more data management layers for the end user…

Having built a shipped library with the linking and appending system, through an addon that registers the library itself, I don’t think re-inventing the data-management wheel is good here. Alternatively, polishing the user experience first and foremost with what’s there is what I would consider better resource management and, ultimately, a better end-user experience.

EDIT: Though in all fairness, if this is a proposal to make the Essentials use the USER or SYSTEM relative paths and pave ways to help “cast” assets to libraries, then I guess the technicals for that would be great! I know a post-operator asset append/link that sets the path of linked assets to USER or SYSTEM by the latest Blender version is a bit of a curious backend challenge, for sure.

2 Likes

This entire improvement of asset handling and usage will have to involve many different aspects of development. I don’t want to go too far from the topic discussed in this proposal, but I guess the following is still relevant:

As far as I understand, instead of comparing the actual node changes, version detection is based on the version number in the unique asset file name.
It’s not uncommon for Blender to keep changing the names of many nodes for various reasons. If those changes are done in code, they can be versioned properly, but if they are done in the asset, I am not sure it’s realistic to version them well without any datablock identifier.
If a datablock identifier is introduced for some reason, does it make the unique-file-name method redundant and useless?

Everyone talking about using file names as unique, comprehensive identifiers has clearly never actually worked on a cross-team, multi-file project. I have enough "final_output_logo (for print only) version 2 (revised).jpg"s on my work computer to know better. Ask me the number of times I’ve had to check “function.js” or “action_revised.php” into a codebase, and I’ll talk for as long as you have patience :wink:

10 Likes

Apologies if this is stupid or if another comment suggested something materially the same using different language (I’m feeling rather thick at the moment), but I take it that an internal name wouldn’t work? I name my new asset xyzzy, Blender internally assigns some random characters to it (and maybe a version), making it xyzzy_foo.v01. Every time I use my asset xyzzy, Blender reads the internal name and knows which asset is being referred to. If I use a different asset called xyzzy, it reads e.g. xyzzy_bar.v02, which is clearly a different data block.
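A minimal sketch of what that could look like, with all names illustrative:

```python
import uuid

def make_internal_name(user_name: str, version: int = 1) -> str:
    # e.g. "xyzzy" -> "xyzzy_3f2a9c1e.v01"; the user-visible name stays
    # "xyzzy", while the hidden identifier never collides.
    return f"{user_name}_{uuid.uuid4().hex[:8]}.v{version:02d}"

print(make_internal_name("xyzzy"))
```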

While enforcing strict naming conventions makes sense when multiple people are working on a codebase, for end users it rarely goes well. Different companies might have their own house rules, or someone might just really want to name their asset “don’t use”, for reasons known only to them. An asset naming best practice guide might be the closest you can get on that one.

3 Likes

Hello. What about a version-control system similar to git, with the exception that merging is prohibited? Since asset structs are stored as binary data (unless I’m mistaken), it would really be impossible to gracefully resolve merge conflicts. So you would have a reliable trace of all the different modifications, with authors, dates, and commit messages.

It would support branches (only branching outward, since you can’t merge changes). Each asset’s metadata would store a list of commit hashes, commit messages, author names, and modification dates, so it doesn’t blow up the file size, and the full revision history would be stored in an external folder, similar to the git system.

You could share only the asset file; you wouldn’t be able to revert the asset’s history, but you could continue modifying it through a new branch.

So when you append an asset and its last commit hash matches an existing asset’s last commit hash, you can reuse the data. Otherwise, if the incoming last commit hash is found in an existing asset’s history, ask if you want to overwrite the data; and if the last commit hash of an existing asset is found in the incoming commit history, ask if you want to reuse the data.
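A minimal sketch of that decision rule, assuming each asset stores its commit hashes oldest to newest (the “further ahead” interpretation is one possible reading):

```python
def dedup_decision(incoming, existing):
    """incoming / existing: non-empty lists of commit hashes, oldest to newest."""
    if incoming[-1] == existing[-1]:
        return "reuse existing data"   # identical latest commit
    if incoming[-1] in existing:
        return "ask: overwrite?"       # existing asset is further ahead
    if existing[-1] in incoming:
        return "ask: reuse?"           # incoming asset is further ahead
    return "keep both"                 # histories diverged; merging unsupported

# e.g. the existing copy has one commit more than the appended one:
print(dedup_decision(["a1", "b2"], ["a1", "b2", "c3"]))  # -> "ask: overwrite?"
```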

That way you can rename your asset however you want, what matters is its commit history. So what happens if two people modify the same asset and want to merge the changes ? Good question :slight_smile:

Just my 2 cents since I don’t think I have seen something similar to that in this thread.

Cheers

3 Likes

+1 for this proposal; it doesn’t add anything mind-blowing, but it alleviates at least one problem.

5 cents on functional deduplication:

  • definitely not a good thing on “.blend” open if it takes a lot of time

  • any popups asking for something on “.blend” open are actually very confusing, especially when this happens with a file provided by someone else (downloaded from the internet). Users literally have no idea what the right answer is most of the time, especially when they have no understanding of the internal structure of the scene. The possibility of such questions should be something enabled explicitly in the options

  • functional deduplication is not a clearly defined thing. Some of our internal addons rely on the node title or on the selection in an unconnected socket in a material nodegroup (it is bad practice, but it is just easier to implement that way for addon smart-behaviour tinting), which is clearly a non-functional difference in the common sense, but quite functional in reality

So here is a suggestion: make deduplication a separate op, not something “default”. Just like many people have scripts to squash “.001”/“.002” materials into a single one, there could be an op that calculates “functional hashes” of everything and shows the user “this seems to be the same as this; what do you wish to replace?”. These calculations can be heavy and lengthy, and can even be tuned for the extent to which “functional equality” should be considered. And a scary UI with replacement charts is also OK in this case, since it is a separate op initiated by the user.
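To make that concrete, a minimal sketch of such an op’s core for node groups, assuming bpy (run inside Blender); what counts as “functionally equal” is exactly the tunable part, e.g. whether node labels participate in the hash:

```python
import hashlib
from collections import defaultdict
import bpy

def functional_hash(node_group, include_labels=False):
    h = hashlib.sha256()
    # Node names only order the traversal and identify link endpoints;
    # labels are excluded unless an addon's behaviour depends on them.
    for node in sorted(node_group.nodes, key=lambda n: n.name):
        h.update(node.bl_idname.encode())
        if include_labels:
            h.update(node.label.encode())
        for socket in node.inputs:
            if not socket.is_linked and hasattr(socket, "default_value"):
                value = socket.default_value
                try:
                    value = tuple(value)  # vector/color sockets
                except TypeError:
                    pass                  # scalar sockets
                h.update(repr(value).encode())
    for link in sorted(node_group.links,
                       key=lambda l: (l.from_node.name, l.to_node.name)):
        h.update(repr((link.from_node.name, link.from_socket.identifier,
                       link.to_node.name, link.to_socket.identifier)).encode())
    return h.hexdigest()

# Bucket node groups by hash; only buckets with more than one member would
# be shown to the user as replacement candidates.
buckets = defaultdict(list)
for ng in bpy.data.node_groups:
    buckets[functional_hash(ng)].append(ng.name)
for names in buckets.values():
    if len(names) > 1:
        print("Possible duplicates:", names)
```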

Deduplication is actually an unclear action that should be performed in a controlled manner by a user who understands the scene to some extent. There is no way it can work well in all cases for all users, imho.

As far as deduplication goes, it should never happen on blend file open, at least as far as my current understanding of the ‘manual merge conflict resolution’ proposal goes. Rather, any confirmation for how to deduplicate items should happen only when appending (reuse data) an asset into your scene file.

Much of how this proposal works seems to closely follow how merge conflicts are resolved in software development. One confounding factor here is that there isn’t any kind of universally applicable way to visually compare the two conflicting data blocks in Blender, so users will pretty much have to fly blind if they aren’t already familiar with both files. Additionally, it seems like there isn’t really any existing equivalence heuristic when it comes to data blocks in Blender, so that might reduce the efficacy of automatic solutions.

On the asset creation side of things, I think it may be beneficial to improve the workflow to ‘publish’ an object to a library or new blend file.
Right now, there doesn’t seem to be a good way to save an object, material or node group to a new file. This functionality would be useful for multiple use cases in the asset creation pipeline.

  • New assets may have originally come from a larger project file, but you don’t actually want to put the entire scene into the library, just the reusable assets.
  • Creators may also want to work on an asset locally for a while before ‘publishing’ a revision to the library. This could come as either a separate, incremented .blend file, or as saving over the previous version.

On the asset consumption side of things, I think there are users that would vastly prefer both automatically and manually updating workflows. To this end, I propose separating the ability to ‘cache’ assets in the production file and the behavior of automatically updating the asset if there is a new revision. Dirty cache detection could be accomplished by something as simple as the last modified timestamp on the asset’s source file, or something more complicated that’s less prone to false positives.
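The simple timestamp variant could be as small as this sketch (names illustrative):

```python
import os

def cache_is_dirty(source_path: str, cached_mtime: float) -> bool:
    # Prone to false positives (a re-save without changes bumps the mtime)
    # and false negatives (clock skew after file transfers), which is why a
    # content hash is the more robust, if more complicated, alternative.
    return os.path.getmtime(source_path) != cached_mtime
```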

It’s also important to me that there’s a linking option that is relative to the asset library (or something similar), rather than to the blend file or system root. Having production machines across multiple operating systems (e.g. Windows desktop, Mac laptop) makes linked data very prone to error.

2 Likes