GSoC 2020: Faster IO for OBJ, STL & PLY. Feedback

Hi all!
I’m Ankit, & I’m working on https://wiki.blender.org/wiki/User:Ankitm/GSoC_2020_Proposal_IO_Perf .

Here’s the initial discussion about memory mapping: Clarifications for Fast IO project; and the proposal (it was decided not to use it). The draft on devtalk was not updated to reflect the prioritisation of OBJ.

Please find the weekly and daily reports at https://wiki.blender.org/wiki/User:Ankitm. I’ll be posting the link here regularly as well.

Project status tracker is at https://developer.blender.org/T68936.

Branch is at https://developer.blender.org/diffusion/B/browse/soc-2020-io-performance/

Build to test: https://blender.community/c/graphicall/Pmbbbc/
Cheers!

Main feedback I have is UI-related. The Python IO scripts use real panels, but the C-based IO operators use clunky inconsistent UI boxes. Using real panels instead would be more consistent and preferable.

I’m here for the obj. :slight_smile:
I already gave up on importing big obj files in blender, I hope it turns out ok this time.
Good luck!

I wish fbx was on your list too, but that’s another story. :v:

What do you think about asynchronous loading as Alembic importer has?

All the examples I saw of an operator written in C and accessed via Python were bound to a button, row, column, or layout. I had written an IO panel with ImportHelper, and I didn’t see the flexibility to bind a C operator to it. It could only execute other Python functions.
But if I find a way to make it work, sure I can add that. It’ll help MTL also, which as of now is in limbo.

@ThinkingPolygons, thank you!
fbx bugs can be fixed in python itself. I’m aware of the number of reports.

Please see the discussion in the mmap clarification post. The whole file is to be read into memory at once (or in chunks, as appropriate), but worrying about disk IO issues before having a correctly working parser is not good, IMHO.
For the exporter, I’m trying to keep the preparation of the data to write and the writing operation not strictly sequential. That is, writing can continue separately while the next batch is readied.
For importing, I think it’s going to be sequential.
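
The idea of letting writing continue while the next batch is readied can be illustrated with a small producer/consumer sketch. This is not the exporter's actual code (which is written in C++); the names `format_batch` and `export_overlapped` and the `max_pending` parameter are made up for illustration, and Python's standard `queue` and `threading` modules stand in for whatever mechanism the branch really uses.

```python
import io
import queue
import threading


def format_batch(vertices):
    """Turn a batch of (x, y, z) tuples into OBJ 'v' lines."""
    return "".join(f"v {x:.6f} {y:.6f} {z:.6f}\n" for x, y, z in vertices)


def export_overlapped(batches, out, max_pending=2):
    """Write batches to `out` on a worker thread while the caller formats the next one."""
    # A bounded queue keeps memory in check: the producer blocks once
    # `max_pending` formatted batches are waiting to be written.
    pending = queue.Queue(maxsize=max_pending)

    def writer():
        while True:
            text = pending.get()
            if text is None:  # sentinel: no more batches
                break
            out.write(text)

    worker = threading.Thread(target=writer)
    worker.start()
    try:
        for batch in batches:
            pending.put(format_batch(batch))
    finally:
        pending.put(None)
        worker.join()


# Hypothetical data: two small batches of vertices.
batches = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 1.0, 0.0)]]
buffer = io.StringIO()
export_overlapped(batches, buffer)
print(buffer.getvalue(), end="")
```

A single writer thread and a FIFO queue keep the output in the same order as the input batches. Error handling on the writer side is left out to keep the sketch short.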

@ThinkingPolygons I feel the same and it’s a big pain that keeps me away from doing any serious stuff in Blender, since I often work with very detailed architectural models.

I am not sure if this would be part of this GSoC project, but maybe it is somehow related. A few years ago I tried to import large generated scenes into Blender for further processing and animation. I even wrote my own Python importer for scenes consisting of primitives (boxes, cylinders, etc.) and triangular meshes (stored in OBJ files) with PBR materials, which were quite new back then. However, I lost two months just to realize that Blender kept loading my large scene on a single thread for more than a day. For comparison, loading the scene in the Mitsuba CLI took around a minute. I couldn’t believe Blender could be so slow.

Someone suggested that I merge the objects by material. That is easy to say if you don’t have hundreds of unique materials and don’t want to animate the objects, so merging was not an option for me. Then I got feedback from the developers saying that Blender has, deep in its core, an update mechanism that updates the whole scene after each object addition. This update iterates over all existing objects, resulting in quadratic complexity! I started feeling it at around 2000 objects, it became a pain at around 5000, and with more than 10K it was already impossible. I needed to reach at least 50K.

I tried to mitigate this by hierarchically merging scene parts into .blend files and then merging those together, without success… I searched for a scenario or setup that would allow loading a batch of objects without any scene updates in between, but I failed. There was no way to postpone the internal scene updates until all objects had been loaded through the Python API. Since then I have tried loading larger scenes a few times, most recently two weeks ago, and it is still a pain (loading 300 MB of data took 2.5 hours). The inevitable scene update is still being called somewhere.
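
To put rough numbers on the quadratic growth described above, here is a toy model. It is not Blender code and makes a simplifying assumption: every object addition triggers one pass over all objects currently in the scene, and each visit costs the same. The function names are hypothetical.

```python
def visits_with_per_object_update(n_objects):
    """Objects visited if every addition triggers a pass over the scene so far."""
    visited = 0
    for count_in_scene in range(1, n_objects + 1):
        visited += count_in_scene  # one full pass after each addition
    return visited


def visits_with_single_update(n_objects):
    """Objects visited if the update is postponed until everything is loaded."""
    return n_objects


for n in (2_000, 5_000, 10_000, 50_000):
    per_object = visits_with_per_object_update(n)
    ratio = per_object // visits_with_single_update(n)
    print(f"{n:>6} objects: {per_object:>13,} visits, about {ratio:,}x a single deferred update")
```

Under this model the total is n(n+1)/2 visits, so going from 2000 to 50K objects (25 times more) costs roughly 625 times more update work, which is in line with the jump from "noticeable" to "impossible" reported above.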

Blender improved a lot over the years and there is awesome new experimental stuff like Spectral Cycles going on, but import of highly complex scenes remains one of the last barriers for using it on a regular basis, at least for me.

Sorry for the long story. @ankitm, just to let you know, you may run into the same issue as I did. I hope you will be able to create a glorious workaround. I wish you all the best with the GSoC project! If you succeed, Blender will finally become my tool of choice for any project.

Glad to see this being picked up again! I’m sure you’ve seen it but if not, it might be worth a scan through this thread to see where the 2019 GSoC project stumbled.

Hi @ankitm,

Good luck with your project, landing this would be incredibly useful to many people!

If you haven’t seen it, I recommend Matt Pharr’s series on the changes it took for PBRT to be able to load the Moana island scene: https://pharr.org/matt/blog/2018/07/08/moana-island-pbrt-1.html
Maybe some of these ideas apply here too!

Thanks for that link, _name. Interesting read.
For this project, part 3, https://pharr.org/matt/blog/2018/07/13/moana-island-pbrt-3.html , seems the most relevant (for the importer). I think the OBJ/STL/PLY formats are simple enough that a tokenizer/lexer split may not make sense. It is interesting that he finds memory mapping to be the right approach. There is some controversy here about whether it is right or not. I’ve seen the point before that special handling of number parsing may be worth it.
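
As an illustration of why a separate tokenizer/lexer stage may be unnecessary for OBJ, here is a minimal sketch that handles each line directly based on its first keyword. It is not the importer planned for this project: `parse_obj` is a hypothetical name, only `v` and `f` records are read, and relative (negative) face indices are not handled.

```python
def parse_obj(text):
    """Collect vertex positions and 0-based face vertex indices from OBJ text."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment
        if parts[0] == "v":
            vertices.append(tuple(float(p) for p in parts[1:4]))
        elif parts[0] == "f":
            # Corners look like "1", "1/2", "1//3" or "1/2/3": keep only the
            # vertex index and convert from OBJ's 1-based to 0-based.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:]))
        # Other record types (vn, vt, usemtl, ...) are ignored in this sketch.
    return vertices, faces


sample = """# one triangle
v 0 0 0
v 1 0 0
v 0 1 0
vn 0 0 1
f 1//1 2//1 3//1
"""
vertices, faces = parse_obj(sample)
print(vertices)
print(faces)
```

The `float()` and `int()` calls are where a hand-written number parser would slot in if profiling showed number parsing to dominate.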


@isolin That sounds painful! I’m currently (in the first 4 weeks) not working on the importer, so I cannot comment on what the new one will look like, but definitely not this bad. (See the next comment for the hopeful preliminary results.)

@testure Yes, I had seen the thread when I was going over the project list some months back. I’m taking care not to make those mistakes.

Thank you @_name for telling me about it!
Reminds me to be careful about memory allocations & function overhead.

Week 1 Report : June 1 to June 6

  • Received a lot of input from the mentors regarding coding style, comments, documentation, and file structures. Spent quite some time on cleaning the hastily committed rB485cc433. See current version at Diffusion
  • Currently the exporter exports vertices, face normals, and face indices, in world coordinates.
  • It still lacks (1) multiple objects, (2) texture coordinates, and (3) axes transformation. If you compare the OBJ text files written by the Python and C++ exporters, you’ll see differences in vertex order, and some coordinates with opposite signs. But the shape is correct.
  • Did a comparison between the old & new exporters. Release configuration, lite build + opensubdiv, default cube with 6+3 subsurf modifiers. See images: F8586987 F8586984
    • wm_handler_fileselect_do takes 82 seconds (Python) vs 9 seconds (C++); the C++ time is expected to stay below 14-15 s once texture coordinates are added.
    • File size is 186 MB.
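
The opposite-sign coordinates mentioned in the list above are what one would expect when an axes transformation is missing. A minimal sketch of such a transform follows, assuming the common convention of Blender's Z-up data being written as Y-up, -Z-forward OBJ; the report itself does not state which convention the exporters use, and the function names are hypothetical.

```python
def blender_to_obj_axes(co):
    """Map a Z-up coordinate to Y-up, -Z-forward (assumed convention)."""
    x, y, z = co
    return (x, z, -y)


def obj_vertex_line(co):
    """Format one transformed vertex as an OBJ 'v' record."""
    x, y, z = blender_to_obj_axes(co)
    return f"v {x:.6f} {y:.6f} {z:.6f}"


print(obj_vertex_line((1.0, 2.0, 3.0)))
```

With this mapping, a file written without the transform differs from one written with it by swapped Y/Z columns and a sign flip on one of them, while the shape stays the same.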

Week 2 Todo:

  • Textures, multiple objects in the same file.
  • Modifiers, curves, multiple frames, axes transform, progress logging.

https://wiki.blender.org/wiki/User:Ankitm/GSoC_2020_Weekly_Reports
https://wiki.blender.org/wiki/User:Ankitm/GSoC_2020_Daily_Reports#Week_1

A new thread for reports, separate from this one (feedback) would be better, no?

The only possible upside I see is a single place to read all the weekly reports. I’m already doing that at the wiki.

And more-than-needed links = bad UX (:

In previous years we had two threads: one for reporting, so it was easy to see what was going on, and one with lively back-and-forth between users and the student, where a weekly report could easily have gotten lost in the noise.

An update on the OBJ exporter: it now writes vertex coordinates, texture coordinates, and face normals.

Memory usage in the last case (icosphere + 8 subsurf): ~6 GB in Python vs 1.67 GB in C++.

The table in the weekly reports is 3 days old. If you don’t use texture coordinates, check the smaller file size and shorter time at
https://wiki.blender.org/wiki/User:Ankitm/GSoC_2020_Weekly_Reports#Week_1:_June_1-6

A build to test should be available (thanks to Lazydodo!) after these changes are merged into soc-2020-io-performance.
The code up to rB86845ea907f8ae24b3bed4ab591f8e3687ac5f22 crashes, so please don’t try that.

Wow, those export times look amazing!
FINALLY! Quite a few times I have been embarrassed to first show people how great Blender can be and then have to show them how to import assets into Blender, just to sit there in awkward silence for 5 min.

Please test the build!
Thank you @lazydodo
f01a7d508d0d is the latest hash as of now.

The exporter is in the expected place, File > export > OBJ new. As of now it only exports the single selected object, and that object should be a mesh, not a curve etc.
Also, if you run Blender from a console, you can see the export timings too.

Let me know about any bugs! It is best to share the steps to make a problematic mesh. If sharing the file is a must, please minimise its size.

It is currently crashing for me

@EitanSomething Thanks!
Fixed. rB3cbd4516c68f89942742c175ba7cdc2ce951b2d7 is the latest hash.
rBc2fbfb421ba4 fixed the issue.

Updated build should be available about 12 hours from now.

I kicked it off manually; it will probably be done in 8-10 minutes!
