from_pydata() crashes on large mesh

When attempting to add a large mesh (176,230,815 vertices, 528,692,445 triangles) using mesh.from_pydata() in Blender 2.79b, I get a crash with the following error traceback:

Error: Array length mismatch (expected -115552647, got 528692445)
Traceback (most recent call last):
  File "/home/avery/work/maxibone/src/visualization/blender-vessels.py", line 120, in <module>
    add_numpy_quad_object("vessels",vertices,faces,(0.,0.,0.));
  File "/home/avery/work/maxibone/src/visualization/blender-vessels.py", line 45, in add_numpy_quad_object
    mesh.from_pydata(bpy_vertices,[],bpy_faces);
  File "/opt/blender/2.79/scripts/modules/bpy_types.py", line 429, in from_pydata  
  self.vertices.foreach_set("co", tuple(chain.from_iterable(vertices)))
RuntimeError: internal error setting the array

This looks like an integer overflow error, except that 528692445 is much less than 2^31 and should be no problem to represent in a 32-bit int. I traced the error message to blender/makesrna/intern/rna_access.c, and all the relevant variables seem to be ints, which, while size_t would perhaps be more sensible, should have no trouble storing 500 million.

I would be happy to try to fix this issue, but I am very new to the Blender code. Where could the overflow occur, and how should it be fixed?

Another issue is the extreme slowness of from_pydata(): the data takes about 5 seconds to load from disk, but about an hour to load through from_pydata(). I would be happy to help implement a from_numpydata() that should be orders of magnitude faster, but I will raise a separate issue for that.

Thank you so much for any help you may have!

I don’t know the exact reason why it does not work; maybe it is memory related. How much RAM do you have? (Edit: oops, I didn’t see the first line of your error message. That does indeed look like an integer overflow; I will have to take a look at the source to see why it happens.)

Regarding performance, I’d suggest you just fill in the mesh yourself, similar to how Animation Nodes does it. Check out this for more details.

What makes it faster is that Blender’s foreach_set methods are used with objects that support the buffer protocol instead of normal Python lists (which is what from_pydata uses).

You may have to reshape your numpy array, but you should not have to make a copy; see the sketch below.
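
Roughly, a minimal sketch of what I mean for a pure triangle mesh (mesh_from_numpy and the array names are placeholders, not an existing Blender API; this assumes numpy arrays with dtypes matching Blender’s internal storage):

import bpy
import numpy as np

def mesh_from_numpy(name, verts, tris):
    # verts: float32 array of shape (n_verts, 3)
    # tris:  int32 array of shape (n_tris, 3)
    mesh = bpy.data.meshes.new(name)

    # Allocate all vertices, loops and polygons up front.
    mesh.vertices.add(len(verts))
    mesh.loops.add(tris.size)          # three loops per triangle
    mesh.polygons.add(len(tris))

    # Each foreach_set call is a single bulk copy when it is given a
    # contiguous buffer of the matching item type instead of a list.
    mesh.vertices.foreach_set("co", verts.ravel())
    mesh.loops.foreach_set("vertex_index", tris.ravel())
    mesh.polygons.foreach_set("loop_start", np.arange(0, tris.size, 3, dtype=np.int32))
    mesh.polygons.foreach_set("loop_total", np.full(len(tris), 3, dtype=np.int32))

    mesh.validate()
    mesh.update(calc_edges=True)
    return mesh

Note that ravel() returns a view rather than a copy as long as the array is C-contiguous, which is what I mean by not having to copy the data.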

(EDIT: I am not sure how likely it is that this overflow will be fixed soon. I guess if it has to be fixed there, it has to be fixed in quite a few other places as well; it is probably not worth the effort at the moment. Maybe you should try to find a way to split your mesh into multiple smaller ones.)

Thank you very much!

I solved the overflow problem by changing the len and stride fields in RawArray from int to size_t in blender/makesrna/RNA_types.h:

typedef struct RawArray {
        void *array;
        RawPropertyType type;
        size_t len;
        size_t stride;
} RawArray;

as well as changing itemlen and arraylen from int to size_t in rna_raw_access() in blender/makesrna/intern/rna_access.c. There are probably many more lengths that should be changed from int to size_t, but this fixes this particular problem.

Regarding performance of filling out the mesh: Thank you so much for the pointer!

But: could I perhaps just write a C++ function that fills in the mesh and export it to Python as, e.g., from_numpydata()? The numpy data is easily accessible from C++, which seems like it should be faster than doing anything from Python. That would also let me do it in parallel (it is frustrating to have 31 CPU cores sitting idle while one bravely but slowly builds a mesh). Any pointers to where I can find the relevant data structures in the code?

Thank you again for your help!

I would not expect a big performance boost from using multiple threads here, because internally this is mostly a memcpy, so the performance is probably limited by memory bandwidth rather than by your CPU. The Python overhead is negligible when you pass numpy arrays or other memory views into the foreach_set method.
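
To illustrate (a sketch with made-up sizes): the only Python-side preparation that matters is making the array contiguous with the matching item type, so that foreach_set can take the buffer path:

import bpy
import numpy as np

vertices = np.random.rand(1000000, 3)   # stand-in for the data loaded from disk
mesh = bpy.data.meshes.new("test")
mesh.vertices.add(len(vertices))

# Convert once to single precision ("co" is stored as float); after that,
# ravel() is a view, not a copy, and the call below is one bulk transfer.
coords = np.ascontiguousarray(vertices, dtype=np.float32)
mesh.vertices.foreach_set("co", coords.ravel())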

Maybe you could also get the raw memory pointers from Blender and copy the data in yourself. However, I don’t think that is worth the effort.