Performance disparity for rendering between the local machine and the cluster

I am trying to use Blender (v3.3.1) to render some simple scenes. My script runs smoothly on my local machine but is much slower on the cluster, which is managed with Slurm. The main bottleneck is bpy.ops.render.render(animation=True, write_still=True). I had assumed the cluster machine had better hardware specs than my local one. I tried both CPU and GPU rendering, and both show the same performance drop. For details, I also posted a similar question on GitHub: "Question about performance drop on the cluster" (Issue #950, DLR-RM/BlenderProc).

I found a similar question on the forum, "Rendering on GPU cluster", but no further ideas have been posted there since.

Any ideas on solving the issue? Thank you in advance.

This may seem like a silly thing, but have you checked how long saving the file takes?

If everything is OK, it should take less than a second. A few weeks ago, though, I ran into a problem on our network where saving a 10 MB file suddenly took nearly 3 minutes. It was fine from my computer, but it was extremely slow from the farm.

The problem I had was related to the network and our NAS, so it may not apply to you if you are using something like AWS, but I mention it just in case the bottleneck is in the file-saving phase.
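One way to check this is to time a plain file write of about the same size into the render output directory, once from the local machine and once from inside a cluster job. The following is only a minimal sketch: the function name `time_write` and the 10 MB default are illustrative, and the directory used in the example is the system temp directory, which you would swap for your actual output path.

```python
import os
import tempfile
import time


def time_write(directory, size_mb=10):
    """Write size_mb of random bytes into directory and return the elapsed seconds."""
    payload = os.urandom(size_mb * 1024 * 1024)
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bin")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            # Ask the OS to push the data to the storage backend before we stop the clock.
            os.fsync(handle.fileno())
        return time.perf_counter() - start
    finally:
        # Clean up the temporary test file.
        os.remove(path)


if __name__ == "__main__":
    # Placeholder directory: replace with the directory your renders are written to.
    target = tempfile.gettempdir()
    print(f"{target}: {time_write(target):.3f} s for 10 MB")
```

If the number from the cluster job is orders of magnitude larger than the local one, the storage path is a likely suspect rather than the renderer itself.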

I wish you luck solving it 🙂

This is not a question about contributing code to Blender. Please see the header for links to other websites for support in using Blender, rather than Blender development.

The logical things to test would be:

  • Run Blender with --debug-cycles to get more detailed information.
  • Try rendering with just 1, 2, 3, … GPUs to see if it’s a scaling issue.
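The two tests above could be combined roughly as follows. This is a hedged sketch, not a recommended setup: it assumes NVIDIA GPUs (so that `CUDA_VISIBLE_DEVICES` limits what Blender sees and `--cycles-device CUDA` applies), a `blender` executable on the PATH, and a hypothetical scene file `scene.blend`. The helper names `build_render_job` and `time_render` are made up for illustration. Note that Slurm may already set `CUDA_VISIBLE_DEVICES` for the job, in which case the IDs are relative to that allocation.

```python
import os
import subprocess
import time


def build_render_job(blend_file, gpu_ids, frame=1, blender="blender"):
    """Return (command, env) for one background Cycles render on the given GPUs."""
    command = [
        blender, "-b", blend_file,       # run without the UI
        "-E", "CYCLES",                  # select the Cycles engine
        "--debug-cycles",                # extra Cycles logging, placed before the render
        "-f", str(frame),                # render a single frame
        "--", "--cycles-device", "CUDA", # assumption: CUDA devices
    ]
    env = dict(os.environ)
    # Assumption: NVIDIA GPUs, so this variable restricts the visible devices.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return command, env


def time_render(blend_file, gpu_ids):
    """Run one render and return the wall-clock time in seconds."""
    command, env = build_render_job(blend_file, gpu_ids)
    start = time.perf_counter()
    subprocess.run(command, env=env, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Dry run: print the jobs only. Call time_render(...) inside a cluster job to measure.
    for count in (1, 2, 3):
        command, env = build_render_job("scene.blend", range(count))
        print(env["CUDA_VISIBLE_DEVICES"], " ".join(command))
```

If the render time barely changes, or gets worse, as GPUs are added, that points to a scaling or device-setup problem; if it is slow even with one GPU, the debug log is the better place to look.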