GSoC 2024: Geometry Nodes: File Import Nodes

Hi everyone,

My name is Devashish Lal (I also go by CodeBlaze) and I’ll be working on File Import Nodes for Geometry Nodes this summer. I am currently pursuing my master’s, and my research focuses on 3D Computer Vision, Neural Field Rendering and Gaussian Splatting. I have been using Blender on and off since Blender 2.6, and for me it’s a relaxing hobby. I love hard surface and environmental design. When Geometry Nodes came out, it enabled me as a programmer to create fun things, and I see its potential use in my field of research too, but that requires importing point cloud data. When I saw this project on the proposal page I knew I had to try and submit a proposal.

I started by exploring the code base around the existing bf_io_* importer projects and documented what changes would be required to integrate them with Geometry Nodes.

Next I built a proof of concept with a simple STL file import node; the changes can be found here. With this I had enough information to formulate my proposal, at least for simple 3D data formats like OBJ and STL.

For CSV import, since there is no existing importer project, a new bf_io_csv project would need to be created. After a discussion with @jacqueslucke, I formalised how CSV data would be handled in Geometry Nodes. You can read up on that in my final proposal here.

I aim to implement file import nodes for 3D data formats (OBJ, STL, etc.), CSV and point clouds (PLY) over the coming summer. Maybe after I am done with GSoC I’ll try to create a file import node along with a geometry node system for Gaussian splatting and use it in my research work.

Looking forward to sharing these features with everyone.

Feedback Thread


Week 1

For this week:

  • I have raised a pull request for the STL import node (for now it takes a string path as input; this will be changed to a file selector)
  • Went over the error reporting for geometry nodes and handled multiple error cases in the STL importer
  • Refactored the STL importer so that no code is duplicated
  • Python-side changes for the add menu layout (discussed where the STL import node should go)

At the beginning of the week Hans and I discussed that delivering the STL node first would be the most straightforward thing, as it’s the simplest format. I still have some review comments to go through for the pull request, and then I will most probably start on the file selector dialog, followed by the refactoring of the OBJ importer.


Week 2

I merged the STL import node to main and it is available behind the experimental feature flag “New File Import Nodes”. I am excited to get this in early and collect feedback.

This week I will be working on some usability improvements, i.e., adding a proper file selector.

While adding the experimental feature flag I also learnt about the DNA and RNA system.


Week 3

This week was pretty hectic. I tried my best to wrap my head around RNA, but I still feel like I have a ways to go. Anyway, I raised a PR which adds a file selector to the Import STL node.

As a side effect, PROP_FILEPATH can now be added as a socket subtype to any string socket, which means any string socket can be turned into a file selector with a single line of code. This could be helpful in other places.

Getting the file selector out of the way early makes it easy to integrate into the upcoming import nodes.

As always, these changes will be available behind the experimental feature flag “New File Import Nodes”. I am looking for feedback.


Week 4

This week was a bit slow for me, but I have started looking into caching while also doing the OBJ refactoring on the side. Getting back to caching: this is going to be built completely from the ground up. While my dream is to create a completely generic cache for geometry nodes, after discussing with @HooglyBoogly we agreed to keep the scope simple and just focus on caching for file imports for now.

I started looking at some existing caching mechanisms like the Shader cache and will try to come up with a more detailed proposal for caching next week.


Week 5

I have been all over the place this week trying to come up with a caching mechanism. In the meantime I decided to knock out the OBJ import node, which seemed simple. Here is the work-in-progress PR.

An OBJ file can contain multiple meshes and curves which makes it a bit challenging to output the data from the node, but geometry instances and geometry sets can be used for this.

As for caching I’ll do the simplest thing I can and then go from there based on reviews.


Week 6

This week, Curves threw me a curve ball :joy: (I had no idea that OBJ files can contain Curves). Handling imports for multiple meshes and curves while refactoring the importer took a while, but it’s working now. Changes can be seen here; as always, once merged, the node will be available behind the new import nodes’ experimental toggle.

This week I’ll start with the most basic implementation of caching, which is more essential now, as I can see Blender freeze up while importing OBJ files.


Week 7

Work on caching has finally begun; you can check it out here. I will be developing this incrementally, as there are still some unanswered questions, but for the coming week I will focus on a manual cache clear button and start on an LRU cache implementation.

Unanswered Questions

  • Cache invalidation - should we hook into the dependency graph, or use LRU-based caching?
  • Cache keys - for now the file path is the cache key, but what if the file is modified?
  • Cache cleanup on exit - for now I have added the cleanup call in the window manager exit, but is there a better place to do it?

You can add your feedback on the Blender chat thread.


Week 8

This week I added an operation to the geometry nodes editor to manually clear the cache and experimented with an LRU cache using Boost’s implementation.

Since Boost is an optional dependency we can’t rely on it, so next I will implement the LRU cache using Blender’s containers.


Week 9

I learned that Boost is an optional dependency, so I removed it and implemented the LRU cache without any dependencies. The core is done and working, but some smaller things around the cache are still pending (limiting the cache size). This depends on a PR (yet to be raised) that adds approximate size computation to GeometrySet.

I also need some feedback on how cache-related options should be exposed to the user (what is the right place?), for which I have started a general feedback thread for my project.
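For readers unfamiliar with the idea, here is a minimal sketch of an LRU cache limited by entry count. It is written in Python for brevity and is not the Blender C++ implementation; the class name `LRUCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Least-recently-used cache limited by a fixed number of entries."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Entries are kept in usage order: oldest first, newest last.
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None if the key is not cached."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Insert or update an entry, evicting the oldest ones if needed."""
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)
```

With a file path as the key (as described above), a lookup that hits the cache skips the import entirely, and the entry that has gone unused the longest is dropped first when the cache is full.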

On the side I completed the PLY import node which is ready for review - PR


Week 10

I spent this week adding a size_in_bytes_approximate() method to GeometrySet, by adding it as a virtual method on GeometryComponent and implementing it for the required components. Of these, only MeshComponent gives a somewhat usable value; the rest are way off :smiley: . Check out the PR here.

I have some review changes to make on the PLY import node PR; I will get them done tonight and get it into main early next week.

I also made a post about my plan to implement the CSV import node in the feedback thread, do check it out. The CSV node was the last thing in my proposal; it feels surreal how far I got.
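The shape of this design can be sketched as follows. This is a Python illustration only, not the C++ code from the PR: the class and method names mirror the ones mentioned above, but the constructors and the per-element byte counts are assumptions made up for the example.

```python
from abc import ABC, abstractmethod


class GeometryComponent(ABC):
    """Base class: every component reports its own approximate size."""

    @abstractmethod
    def size_in_bytes_approximate(self) -> int:
        ...


class MeshComponent(GeometryComponent):
    def __init__(self, vert_count, corner_count):
        self.vert_count = vert_count
        self.corner_count = corner_count

    def size_in_bytes_approximate(self) -> int:
        # Assumed layout: 3 float32 per vertex, 1 int32 per face corner.
        return self.vert_count * 3 * 4 + self.corner_count * 4


class PointCloudComponent(GeometryComponent):
    def __init__(self, point_count):
        self.point_count = point_count

    def size_in_bytes_approximate(self) -> int:
        # Assumed layout: 3 float32 position plus 1 float32 radius per point.
        return self.point_count * 4 * 4


class GeometrySet:
    def __init__(self, components):
        self.components = list(components)

    def size_in_bytes_approximate(self) -> int:
        # The set's size is the sum over its components.
        return sum(c.size_in_bytes_approximate() for c in self.components)
```

The point of the virtual method is that GeometrySet does not need to know what kind of data each component holds; it only adds up what the components report.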


Week 11

I started work on the bf_io_csv project, implementing a CSV importer, and also spent some time getting a better understanding of the attribute system for mesh data.

This week was pretty busy for me and I wasn’t able to put in much time, but I hope to catch up in the current week.


Week 12

Covered a lot of ground this week for both the importer caching and csv import node, which are both now in a working stage*.

CSV Import Node


We love string processing don’t we :slight_smile: , but I learned a lot while implementing the csv_import project from scratch, this was the first time I really had to sit down and architect something for this summer’s work on blender.

I would love to say the CSV import node is perfect, but for now it only supports INT and FLOAT columns and comes with some caveats (known issues are listed in the PR).

Memory Size based importer cache

Thanks to @jacqueslucke and his work on the MemoryCounter API I was able to change the LRU cache implementation to be based on a fixed size (256MB for example) rather than number of cache entries.

Check out the updated PR.

That being said, @jacqueslucke is working on a more generic, core cache system. I’ll be updating my implementation to use this system as it gets merged. Looking at his implementation is also a great learning experience for me, as my LRU implementation felt more like a toy, similar to the Leetcode question :smiley:

Final Weeks

As the end of summer gets closer, my goal is to get the CSV and caching PRs merged, and I’ll continue working on improvements and user control after the summer ends. Collecting user feedback and my discussion with @HooglyBoogly made me realise there is so much possible with the CSV node (I also have some personal goals to integrate Blender into my 3D vision research). To do all this I’ll be working past the summer and continue enjoying it :smiley:


Week 13

This week I was wrapping up the reviews for both the CSV and caching PRs. The WIP tags have been removed from both of them and hopefully they will be merged soon.

CSV Import Node

I heavily refactored csv_reader.cc this week following reviews from @HooglyBoogly. The code is easier to understand now (less raw pointer manipulation), and we also scoped the import options out as a separate PR, which I will continue working on after GSoC.

There was also a bug where the last character of the last line was not being read; it took me a while to notice :smiley:

Geometry Import Cache

I refactored the caching implementation to build upon @jacqueslucke’s work on the global memory cache (no more custom LRU logic :smiling_face_with_tear:). I also had a discussion regarding the cache key and want to look into incorporating the file’s last modified time into the key, to deal with cases where a file is updated but its name is not changed. This would require the underlying importers to also return the last modified time, or the node could query it directly.
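To illustrate the cache key idea, here is a small Python sketch in which the key combines the path with the file's last modified time. It corresponds to the "query it directly" option; the function name `make_cache_key` is hypothetical and this is not how the Blender code is written.

```python
import os


def make_cache_key(path):
    """Build a cache key that changes whenever the file is rewritten.

    The key is (absolute path, last modified time in nanoseconds), so an
    updated file with an unchanged name no longer matches the old entry.
    """
    stat = os.stat(path)
    return (os.path.abspath(path), stat.st_mtime_ns)
```

A stale entry is then simply never looked up again and ages out of the cache like any other unused entry. One caveat of this approach is that it relies on the file system updating the modification time reliably.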


GSoC 2024: File Import Nodes - Final Report

First of all I wanna thank @HooglyBoogly , @jacqueslucke , Blender Foundation and Google for this program as I have learnt so much in the last few months. 12 years ago when I started playing with blender as a kid I never imagined one day I would be contributing to it.

This summer I developed the File Import Nodes for Geometry Nodes, which aim to reduce disk usage by externalizing data and to enable data visualization workflows in Blender.

Creation of New Node

Adding new nodes to Geometry Nodes is very well documented in the Blender developer handbook, which also links a PR to follow as an example. The API around nodes is also super well designed, with the ability to customise inputs and outputs as well as hook into the node search logic.

To follow iterative development, at the very start @HooglyBoogly suggested that we put all the nodes behind an experimental feature toggle and keep merging each node as it’s developed. You can test these nodes in Blender 4.3 Alpha by switching on the new file import nodes feature in the preferences.

Refactoring Existing Importers

The existing importers (STL, OBJ, PLY) all work through the Blender context and store the data loaded from a file into the Blender context. This posed the first issue, as we don’t have access to the Blender context when the nodes in a geometry node tree are being executed. This required a major refactoring of the existing importers to expose the loaded geometry to the node graph outside of the Blender context.

STL Files

STL, being the simplest of the three formats, was the first one I tried to refactor, and it was pretty straightforward to abstract away the importing logic and wrap it with two exposed methods: one for the existing Blender context and the other for geometry nodes.
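The "shared core, two thin wrappers" structure can be sketched like this. It is a Python illustration for ASCII STL only, not the refactored C++ importer; all function names are hypothetical, and a plain dict stands in for the Blender context.

```python
SAMPLE_STL = """solid tri
facet normal 0 0 1
outer loop
vertex 0 0 0
vertex 1 0 0
vertex 0 1 0
endloop
endfacet
endsolid tri
"""


def parse_ascii_stl(text):
    """Core importer: return triangles as tuples of three (x, y, z) vertices."""
    verts = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "vertex":
            verts.append(tuple(float(p) for p in parts[1:]))
    if len(verts) % 3 != 0:
        raise ValueError("vertex count is not a multiple of 3")
    return [tuple(verts[i:i + 3]) for i in range(0, len(verts), 3)]


def import_stl_to_scene(scene, name, text):
    """Wrapper for the classic import path: store the mesh in the scene."""
    scene[name] = parse_ascii_stl(text)
    return name


def import_stl_as_geometry(text):
    """Wrapper for the node path: return the mesh, touching no shared state."""
    return {"mesh": parse_ascii_stl(text)}
```

Both wrappers call the same parsing function, so the file format logic lives in exactly one place and only the destination of the result differs.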

OBJ Files

OBJ files were a bit tricky, as not only can they contain multiple meshes, they can also contain curves. At first I had no idea how we would handle such complications, but GeometrySet has an Instances component, which is exactly what’s needed for this use case.

Importing as instances does add another challenge, though, as we can’t just extract the core OBJ functionality and wrap it with two exposed functions like we did for STL. When the import is invoked from geometry nodes, we have to convert the imported meshes and curves into instances and then create the GeometrySet.

Another issue is the MTL file, as OBJ files can have materials. For now these are ignored, as there is no way to create materials from geometry nodes.
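As a rough illustration of the "one file, many geometries" problem, the Python sketch below splits OBJ text on its `o` statements and turns each object into one instance entry tagged as a mesh or a curve. The function names and the dict-based stand-in for a GeometrySet are hypothetical, and only `o`, `f` and `curv` statements are looked at.

```python
SAMPLE_OBJ = """# two objects in one file
o Cube
v 0 0 0
v 1 0 0
v 1 1 0
v 0 1 0
f 1 2 3
f 1 3 4
o Path
cstype bspline
curv 0.0 1.0 1 2 3 4
"""


def split_obj_objects(text):
    """Count face and curve elements per `o` object in OBJ text."""
    objects = []
    current = None
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        keyword = parts[0]
        if keyword == "o":
            current = {"name": " ".join(parts[1:]), "faces": 0, "curves": 0}
            objects.append(current)
        elif keyword in ("f", "curv"):
            if current is None:  # elements that appear before any `o` line
                current = {"name": "", "faces": 0, "curves": 0}
                objects.append(current)
            current["faces" if keyword == "f" else "curves"] += 1
    return objects


def to_instances(objects):
    """Wrap every object as one instance in a GeometrySet-like dict."""
    instances = []
    for obj in objects:
        is_curve = obj["curves"] > 0 and obj["faces"] == 0
        instances.append({"name": obj["name"],
                          "kind": "curve" if is_curve else "mesh"})
    return {"instances": instances}
```

The node can then hand back a single output containing any number of meshes and curves, which is the role the Instances component plays in the real implementation.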

PLY Files

Point cloud files were pretty simple and similar to STL files. Personally, PLY support is the main reason I got involved with this project, as I have been looking to integrate Blender into my day-to-day computer vision research.

One thing that troubles me is the fact that the PLY importer loads the data as a Mesh rather than a point cloud, something I further want to look into and see if this can be improved.

File path selector

I mentioned before that the nodes API is very well designed, and I felt this the most when adding the file path selection UI to the import nodes. All I had to do was implement file_path as a socket subtype of String and everything else just worked out of the box. This was something that I thought would take me a lot of time, but I was able to get it done pretty quickly.

Midway

All the work so far was either refactoring existing code or building upon some well designed APIs, but the next two tasks, CSV Import and Geometry Caching, needed to be built from the ground up.

CSV Importer

Blender doesn’t have the capability to import CSV files, so the first thing was to implement the importer. We didn’t wanna use existing CSV libraries cuz it’s just simple string handling, right? Well, the importer is still pretty simple, but there were a lot of edge cases. Most of my time, though, went into designing the importer: figuring out which individual functions and utility classes should be created.

The following questions came up about how to import and use CSV data:

  • Should a custom data type be created for CSV data?
    At first I thought this would be required, but after thinking about what the CSV data actually represents, it became apparent that we could just use a PointCloud, where each row is a point and each point can have arbitrary attributes.

  • Which data types to support and how to decide the data type of each column?
    For now we just support integers and floats, and we only look at the first row to decide which type should be used for the whole column (first try int, else use float). This is something I hope to expand upon in the future.
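The two answers above can be sketched together in a few lines of Python. This is an illustration of the idea only, not the bf_io_csv code; the function names are hypothetical, and a dict of named columns stands in for point attributes.

```python
def infer_column_type(cell):
    """Pick int if the cell parses as an integer, otherwise float."""
    try:
        int(cell)
        return int
    except ValueError:
        float(cell)  # raises ValueError if the cell is not numeric at all
        return float


def read_csv_columns(text, delimiter=","):
    """Read CSV text into named columns, one value per row (point)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("need a header line and at least one data row")
    header = [name.strip() for name in lines[0].split(delimiter)]
    # Only the first data row decides the type of each column.
    first_row = [cell.strip() for cell in lines[1].split(delimiter)]
    types = [infer_column_type(cell) for cell in first_row]
    columns = {name: [] for name in header}
    for line in lines[1:]:
        cells = line.split(delimiter)
        for name, convert, cell in zip(header, types, cells):
            columns[name].append(convert(cell.strip()))
    return columns
```

Each key of the result plays the role of an attribute name and each list index the role of a point. Because only the first row is inspected, a column that starts with `1` and later contains `2.5` fails to convert in this sketch; how the real importer treats that case is one of the caveats to check in the PR.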

I spent a week looking at the existing OBJ and PLY importers for inspiration before writing the CSV importer, and ended up with an implementation that I am quite proud of.

There are still a lot of features that need to be added to the CSV importer to make it fully useful, like user-selected column data types, and I’ll work on them post GSoC. It would also be nice to integrate this importer into the Blender context, to be able to import CSV files outside of Geometry Nodes.

Geometry Caching

I have to be honest: my first implementation of caching was very similar to my solution to the Leetcode question “design an LRU cache”, and it worked, but I felt uneasy about it. Technically the final solution is also an LRU cache, but it is much more robust and beautiful, done by @jacqueslucke.

First Attempt

To get started quickly I just used Boost’s lru_cache and got something up and running. Since Boost is an optional dependency, I soon started implementing my own LRU cache, which didn’t take much time (copying from the Leetcode question).

At this point we had a cache that worked on a fixed number of items.

Counting Memory

A fixed number of items is not at all ideal for Blender, as the size of each item is not fixed (one mesh could be a simple 8-vertex cube, another could be a million-vertex cube), so instead of limiting the cache by a fixed item count, the goal was to limit it by the total size of the cache (256 MB, for example). This required computing the size of a GeometrySet.

You can look at my attempt in this PR. I could only get size computation working for mesh components; then @jacqueslucke implemented the MemoryCounter API and, soon after, the global cache mechanism.
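A byte-budget variant of the LRU idea might look like the Python sketch below. It is not the global cache that ended up in Blender; the class name `SizeLimitedCache` is hypothetical, and the caller is assumed to pass in each entry's approximate size.

```python
from collections import OrderedDict


class SizeLimitedCache:
    """LRU cache limited by the total size of its entries in bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        # key -> (value, size), ordered from least to most recently used.
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None if the key is not cached."""
        entry = self._items.get(key)
        if entry is None:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return entry[0]

    def put(self, key, value, size):
        """Store an entry; return False if it can never fit in the budget."""
        if key in self._items:
            self.used_bytes -= self._items.pop(key)[1]
        if size > self.max_bytes:
            return False
        self._items[key] = (value, size)
        self.used_bytes += size
        # Evict from the least recently used end until the budget holds.
        while self.used_bytes > self.max_bytes:
            _, (_, evicted_size) = self._items.popitem(last=False)
            self.used_bytes -= evicted_size
        return True
```

With this shape, one huge import can push out many small ones, while thousands of tiny cubes can stay cached together, which a fixed item count cannot express.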

Big Boi Cache

Integrating with the global cache mechanism was super easy and makes the code so much cleaner.

Future Work

I’ll just throw out some wild ideas I got while working over the summer.

  • CSV improvements
    • more user controlled import options
    • support more types like strings and complex types like vectors and colors
    • csv exporter
  • Add support for more non-3d data types (JSON, YAML, SQL) and enable visual data pipeline curation workflows
  • Add support for scientific computing for Geometric Data
  • List and iterator contexts in geometry nodes (suggested by @clankill3r here)

Summary

I learned a lot over this summer and also realised how much I still have to learn to become a decent C++ developer. I come from a research background and deal with Computer Vision, Machine Learning, AR/VR and Graphics, but I did spend 3 years in the industry writing JavaScript.

This opportunity is something I had been looking for for a long time. I enjoyed the whole process, and I’ll definitely continue to contribute to Blender.

Just a shameless plug in the end - check out my YouTube Channel
