GSoC 2024: File Input Nodes

File Input Node

Name: Divyam Maru

Contact:

  • I am active on [email protected] .
  • You can reach out to me on my WhatsApp contact no.: +919909977177
  • Alternatively, one can contact me on blender.chat, where I go by @Divyam-Maru

Synopsis

The main goal of the project is to create one or more nodes that dynamically load 3D files, such as STL, OBJ, Alembic, CSV, and 3D. Such a node would take a file as input and output a Geometry-type object.

Benefits

Geometry nodes, with their versatility, form a major part of the 3D design functionality in Blender. They would feel even more complete with a 3D file importing node, which would remove the need to import an object into the .blend file before the geometry node system can use it. This would reduce the size of the .blend file saved on disk. It might also allow more robust 3D assets to be used in content creation than is possible today.

Deliverables

  • At least one new node type that dynamically imports files to use in Blender. This node will have the option to load up a 3D file and output a Geometry Data Type.
    • For STL, Alembic and OBJ files, the output is straightforward to determine.
    • For CSV files, one can visualise the output as the mesh representation shown in the spreadsheet. If the CSV file is invalid, i.e. it does not have 3 columns, the node will report an error.
  • Usage of existing parsing code to introduce a new datatype: File. Image will become a special case of this datatype. Also Python API integration for the same datatype.
  • User Documentation
  • Potential Tasks
    • Developer Documentation for the new datatype if implemented
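
The CSV rule in the deliverables (three columns, otherwise an error) could be sketched roughly as follows. This is plain Python for illustration only, not Blender code: `parse_points` and `CsvGeometryError` are hypothetical names, and the sketch assumes one vertex per row with no header row.

```python
import csv
import io


class CsvGeometryError(ValueError):
    """Hypothetical error the node would surface for an invalid CSV file."""


def parse_points(text):
    """Parse CSV text into a list of (x, y, z) tuples.

    Raises CsvGeometryError if a row does not have exactly three
    numeric columns. Blank lines are skipped.
    """
    points = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # csv.reader yields an empty list for a blank line
        if len(row) != 3:
            raise CsvGeometryError(
                f"row {row_number}: expected 3 columns, got {len(row)}"
            )
        try:
            points.append(tuple(float(value) for value in row))
        except ValueError as exc:
            raise CsvGeometryError(f"row {row_number}: {exc}") from exc
    return points
```

Raising a dedicated error type keeps the "invalid file" case separate from other failures, so the node could show it as an in-node message.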

Project Details

Introduction of a new Datatype: File

  • Data Structure of File

    The File datatype essentially captures everything that can be known programmatically about the contents and nature of a file. It will store the following attributes:

    • Filename: std::string
    • Path: std::string
    • Size: float
    • Type: struct:-
      • 3D Object
      • Python Script
      • CSV-like Files (Excel, CSV, etc)
  • Usage of the File Data Type

    The File datatype is primarily going to be used for the File Input Node. Using the information entered in a popup box like the one used for image selection, one can set the data of a File and then use it in the node, or for any other future purpose.

    I will also work on the design of a Python API to handle the File type and make it easy to integrate into future add-ons.
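
The attribute list above could be pictured as follows. This is a minimal Python sketch of the shape of the data, not the proposed C++ data block: `FileInfo`, `FileType`, `from_path`, and the extension table are all hypothetical, and the size is assumed to be in bytes.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class FileType(Enum):
    OBJECT_3D = "3D Object"
    PYTHON_SCRIPT = "Python Script"
    CSV_LIKE = "CSV-like"


# Hypothetical mapping from extension to type; a real table would be longer.
_EXTENSION_TYPES = {
    ".stl": FileType.OBJECT_3D,
    ".obj": FileType.OBJECT_3D,
    ".abc": FileType.OBJECT_3D,
    ".py": FileType.PYTHON_SCRIPT,
    ".csv": FileType.CSV_LIKE,
}


@dataclass(frozen=True)
class FileInfo:
    filename: str
    path: str
    size: float  # float as in the proposal; assumed here to be bytes
    type: FileType

    @classmethod
    def from_path(cls, path):
        """Build a FileInfo from a file that exists on disk."""
        p = Path(path)
        file_type = _EXTENSION_TYPES.get(p.suffix.lower())
        if file_type is None:
            raise ValueError(f"unsupported file extension: {p.suffix!r}")
        return cls(
            filename=p.name,
            path=str(p),
            size=float(p.stat().st_size),
            type=file_type,
        )
```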

File Input Node

  • Basic Appearance and Function:
In the case of the basic node, the appearance and functionality should be similar to the Image Texture node. The basic layout would be similar to the one shown below (edited from the Image Texture shader node, since I couldn’t code it yet). It will also have several other options that make it easier to customise the import. For example, users will be able to customise the position, rotation, scale, material import, etc. of the 3D object being imported.

[Image: mockup of the File Input node, edited from the Image Texture shader node]

  • Function for CSV

    For CSV or similar spreadsheet-type files, the node will show the rows and vertex coordinates, truncated when the number of rows is very large.

    • Additional Tasks

      • A 3D viewer to view the imported file, similar to the rendered mode.

      • Should time constraints permit, I would also like to work on proper developer documentation of the features I introduce, as well as deliver some case studies of their usage. Along with that, I will also try to contribute to other areas of development and the community as instructed by my mentor.

I’m not sure I properly understand the implementation details. Do you plan to add a new type of Data Block to Blender to maintain the file info (file path, file size, format, data cache, …) and a new node for Geometry Nodes to access this file?

Yes, that is what I intend to do.

What about the Python API for this Data Block: how will users be able to operate on this data and see it in the UI outside of geometry nodes? There are also questions about linking and overriding, plus the asset system.

Cool that you’re trying to pitch in!

  1. About the File type. I’ve collaborated on a project with a lot of automation with another artist using another procedural tool. There we actually determined which files to load inside of the nodegraph. So for the kinds of pipelines I work on, a File Input node with a path input pin would be far superior. And then probably an Enum output pin reporting whether A) the file wasn’t found, B) the file couldn’t be read successfully, or C) the file was read/parsed successfully.

To give a concrete example, we “baked” out all expression and corrective shapekeys for characters to .obj files so the different artists could use any tool of their choice to make modifications/corrections. Then the export procedure would use the character’s name (stored as an attribute) to find the right sub-folder where the .obj files are, then loop through all supported shapekey names, try to load each one, and store it as an internal shapekey again.
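
A rough sketch of that workflow, with the three-way status described above, might look like this. It is plain Python for illustration, not a node implementation: `LoadStatus`, `load_obj_vertices`, and `load_shape_keys` are hypothetical names, the folder layout is assumed, and the reader only handles `v x y z` lines of an .obj file.

```python
from enum import Enum
from pathlib import Path


class LoadStatus(Enum):
    NOT_FOUND = "not found"
    READ_FAILED = "read failed"
    OK = "ok"


def load_obj_vertices(path):
    """Return (status, vertices) for the .obj file at `path`.

    Only 'v x y z' lines are read; everything else is ignored.
    """
    p = Path(path)
    if not p.is_file():
        return LoadStatus.NOT_FOUND, []
    vertices = []
    try:
        for line in p.read_text().splitlines():
            parts = line.split()
            if parts and parts[0] == "v":
                if len(parts) < 4:
                    return LoadStatus.READ_FAILED, []
                vertices.append(tuple(float(c) for c in parts[1:4]))
    except (OSError, ValueError):
        # ValueError also covers undecodable bytes and non-numeric coordinates.
        return LoadStatus.READ_FAILED, []
    return LoadStatus.OK, vertices


def load_shape_keys(root, character, shape_key_names):
    """Try to load <root>/<character>/<name>.obj for every shape key name."""
    folder = Path(root) / character
    return {
        name: load_obj_vertices(folder / f"{name}.obj")
        for name in shape_key_names
    }
```

With a path pin and a status output, a missing shape key file becomes a value the graph can branch on rather than a hard failure.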

Apart from not enabling the workflow I just described, I don’t see other benefits of making the File a datatype over a plain path input at all?

  2. Another design question I’d like to raise is whether it’s really the best idea to have a generic “file input” node for all datatypes that just finds the right importer, instead of having a node per type. I see how it’s a nice ease of use, but as somebody building and maintaining pipelines I’d really prefer to specify that a procedure is expecting to find an .obj and break if it isn’t.
    Also, what about different versions of importers? When the .obj importer was rewritten in C++, it coexisted with the old Python version for one or more releases to really make sure there were no regressions. A “File Input” that automatically picks how to import a file would need a whole other system to communicate/determine what the node is actually doing (and across versions), which separate nodes for each type of file would inherently solve.

  3. When would you rerun the importer and how would you communicate that? I’d assume there would have to be some sort of cache so that the file wouldn’t need a fresh import every time the nodes run again. If the file the node is pointing to is changed/updated, should that automatically reimport and force the nodes to run again? Or should it only reimport when/if the nodes are updated again? The latter would scale a lot better if I have a procedure that is loading tens of files which I’m batch exporting from another application/Blender. I wouldn’t want Blender to just start re-evaluating everything after every. single. file. is. written. But it wouldn’t be as snappy/simple.
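
The second option (reimport only when the nodes are evaluated again and the file has changed) could be sketched as a small cache keyed by path. This is an illustration in plain Python under stated assumptions: `ImportCache` is a hypothetical name, and a change is detected by comparing modification time and size rather than by watching the file system.

```python
import os


class ImportCache:
    """Caches importer results per path, invalidated by mtime and size."""

    def __init__(self, importer):
        self._importer = importer  # callable: path -> imported data
        self._entries = {}  # path -> (signature, result)

    def get(self, path):
        """Return the cached result, reimporting only if the file changed."""
        stat = os.stat(path)
        signature = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == signature:
            return cached[1]
        result = self._importer(path)
        self._entries[path] = (signature, result)
        return result
```

Because the check happens only inside `get`, a batch export that writes tens of files triggers no work until the node tree is next evaluated.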

I would be against giving procedural objects access to arbitrary folders on the user’s disk. So this definitely shouldn’t use a string from an input socket.

I see that safety concerns could be a reason to lock it down. But then I don’t see how it’s really protecting anything when any simple Python script could do the same level of damage. The Blender policy on security already mentions that the user is expected to trust/verify the .blend files they’re opening:
https://developer.blender.org/docs/handbook/new_developers/faq/#how-does-blender-deal-with-security
Though maybe the “trusted source” popup could also be triggered as soon as any nodes that read/write files are enabled to run?

Python scripts are not run when a file is opened.

but… they are? If you put a script in a file, register it, and open the file, the script runs

I don’t want to have an option in the settings to disable auto-run of modifiers with geometry nodes.

Okay, if we actually want to talk about supporting a string input socket for file paths:

  1. File path syntax needs to be supported in geometry nodes.
  2. Disk names on the current PC need to be exposed through some input node.
  3. Operations on file paths also require certain platform-dependent options.
  4. Do you think it is a good idea to use a Repeat Zone to traverse all the arbitrary names in a folder to open the multiple files a user wants (I know users will do that)? Then we also have to expose the list of files in a folder…
  5. I don’t want to compare the safety of Python (with access to raw data pointers and DLLs) and geometry nodes (where the only expected way to cause a crash is running out of memory).
  6. Geometry nodes are heavily multithreaded. Even different objects can be evaluated in parallel. I don’t think it is okay to perform file access operations on arbitrary paths in this context (at least for performance reasons).

Won’t that complicate the initial requirements unnecessarily? For an initial version, all the user would need to start building an automated pipeline for batch processing files is for the file input NOT to be wrapped in a library type and to be exposed as a string (path) pin instead.
The rest of the holes (your points 1-4) can be patched in with some Python scripts, but the core heavy duty processing/caching/not-putting-file-contents-in-main-data is handled by the File Input node(s). Of course processing a list would be nice. But array/list processing is not in Geometry Nodes yet and it’s not necessary to make the extended features a part of the initial requirements?

Yep, that did.
So let’s stop this thread and come back to the topic of the file data block.

We do not need a warning for a file input node. Similar file loading is already done for blend, image, sound, alembic and USD files without any warning. As long as it just reads the file it’s fine.

It is writing or deleting files and network access that are a risk, but that’s not relevant here.

File import nodes are quite a large topic and deceptively nuanced. Glad to see you’re jumping in!

The initial design should have considerations for some of the following though.

  • Do you expect only Mesh data to be imported? Or will you be importing other object types like materials, curves, point clouds, armatures, lights, cameras, empties etc. Some importers do not have options to select in such fine-grained ways. I notice that the more complicated, modern, formats like glTF, FBX, and USD are not mentioned but the design should consider what it would mean to operate against those formats. This leads to the next question.
  • How are Importer settings surfaced in the UI? There are a myriad of important options that may need to be set to import the right set of objects, in the right orientation, with the right properties associated etc.
  • How are errors that occurred during Import surfaced to the user?
  • For formats that allow animations, what frame do you import? All of them? One of them? Which one?
  • Python addons can add their own Import operators and formats. This is problematic in 2 different ways. 1) We would have to figure out how calling into Python from the node graph will work. And 2) having 1 node support any Importer (dynamically) may be preferred over having 1 node for each format. Having to implement a custom node for each custom format would mean that the addon author would also have to add another Node to Blender itself (in C++; and we ship only 3 times a year) for it to be usable. If that’s the case, so be it, but it should be extremely easy to do so then.

Again you don’t have to handle all of this yourself! But the design should call out what aspects still need to be considered, perhaps in the future, so that we don’t design ourselves into a corner and make it difficult to extend or add new formats later.
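
The "1 node supports any Importer" option can be pictured as a registry that maps file extensions to importer callables. The sketch below is plain Python and only illustrates the dispatch idea; it is not Blender's importer API and says nothing about how a node graph would call into Python. `register_importer` and `import_file` are hypothetical names.

```python
from pathlib import Path

# Hypothetical registry: lowercase extension -> callable(path) -> data.
_IMPORTERS = {}


def register_importer(extension, importer):
    """Register `importer` for files ending in `extension` (e.g. '.obj')."""
    _IMPORTERS[extension.lower()] = importer


def import_file(path):
    """Dispatch to the importer registered for the file's extension."""
    extension = Path(path).suffix.lower()
    try:
        importer = _IMPORTERS[extension]
    except KeyError:
        raise ValueError(f"no importer registered for {extension!r}") from None
    return importer(path)
```

A registry like this keeps adding a format cheap, at the cost of making it less explicit which importer (and which version of it) a given node actually ran.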

There is no way to run Python from geometry nodes, at least for now.

What about Python api for this Data Block,

I’ll add that one as well, thank you for pointing it out.

how use(r?) will be able to operate by this idea, see them in UI outside of geometry nodes?

We can show a viewer for them similar to the way the viewer works for the 3D renderer, with a few tweaks. But that also depends on whether we want the user to be able to view it, so I’d say it belongs in the Additional Tasks.

Linking, Overriding, Asset System

I think the answer to that would be in-node options like those of the Image Texture node in shaders, which would control how much information from the asset is carried over to the node.

Some importers do not have options to select in such fine-grained ways. I notice that the more complicated, modern, formats like glTF, FBX, and USD are not mentioned but the design should consider what it would mean to operate against those formats.

I believe that the in-node options that I’m thinking of might work in the way you suggest it should.

How are Importer settings surfaced in the UI? There are a myriad of important options that may need to be set to import the right set of objects, in the right orientation, with the right properties associated etc.

I think the same would answer that, although it would require a more robust UI system.

How are errors that occurred during Import surfaced to the user?

About this, I can think of two ways to make it work: a simple but terse one, or a more complicated but verbose one. The first is an in-node error that just indicates that an error was thrown; the second is a dialog box that gives the user information about the error. Both seem like good ideas for their respective use cases, and once I can implement both, I’d let the community decide which one is better.

For formats that allow animations, what frame do you import? All of them? One of them? Which one?

Honestly speaking, I think I’ll keep animations out of this feature for now, and either add them later on or put them under additional tasks. Off the top of my head, the node could add the imported animation frames to the animation in the main .blend file itself and sync to the first frame or a specific frame. However, I can see that problem becoming complicated quickly and would like to address it later.

Python addons can add their own Import operators and formats. This is problematic in 2 different ways. 1) We would have to figure out how calling into Python from the node graph will work. And 2) having 1 node support any Importer (dynamically) may be preferred over having 1 node for each format. Having to implement a custom node for each custom format would mean that the addon author would also have to add another Node to Blender itself (in C++; and we ship only 3 times a year) for it to be usable. If that’s the case, so be it, but it should be extremely easy to do so then.

One node per type is not the final idea, but I put it in the draft to remind myself that it would be simpler to first code different nodes for different types and then put the logic in one node with a case for each. I really don’t know the answer to the first question and am currently researching how to address it.