Hardware-specific rendering bucket size in Cycles, CPU+GPU

I’ve noticed in the Blender manual that the recommended rendering bucket size depends on the hardware:
https://docs.blender.org/manual/en/dev/render/cycles/render_settings/performance.html

But what if I’m rendering with both CPU and GPU? Should the user test multiple options and pick whichever is fastest for their hardware configuration?

If yes, then I have a question/proposal:
Currently one bucket is assigned to each CPU thread and one to the GPU as a whole. But what if the CPU got its own meta-bucket, the same size as the GPU bucket, and this meta-bucket were then partitioned among the individual cores/threads? (See image below.)
Could this improve rendering times, or is it a bad idea?

3 Likes

As a general rule of thumb, with the current state of Cycles you should use smaller bucket sizes for CPU+GPU rendering. You can run your own tests to check whether this holds on your hardware, but usually 16x16 or 32x32 is faster.
This can change in the (near) future.

4 Likes

I have been suggesting this for a while now. The whole speed advantage of CPU+GPU rendering is lost when the GPU has finished and the render is stuck waiting on a few slower CPU buckets, especially in a tricky area that needs lots of samples. It would be great if the GPU could assist with or take over the CPU buckets at the end. I guess your meta-bucket system might work as well.
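
To make the "stuck on slow CPU buckets" effect concrete, here is a rough toy simulation in Python. It is not Cycles code: the device speeds are made-up numbers, every pixel is assumed to cost the same, and it ignores the fact that a real GPU is less efficient on small buckets. Each idle device simply grabs the next bucket from a shared queue.

```python
import heapq


def tile_areas(width, height, tile):
    """Pixel area of every bucket, clipping buckets at the right/bottom edges."""
    areas = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            areas.append(min(tile, width - x) * min(tile, height - y))
    return areas


def simulate(width, height, tile, speeds):
    """Return the total render time when idle devices grab the next bucket.

    speeds: pixels per second for each device (hypothetical values).
    """
    # Min-heap of (time at which the device becomes free, device index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for area in tile_areas(width, height, tile):
        t, dev = heapq.heappop(free_at)       # earliest idle device
        done = t + area / speeds[dev]         # time it finishes this bucket
        finish = max(finish, done)
        heapq.heappush(free_at, (done, dev))
    return finish


if __name__ == "__main__":
    # Assumed setup: one GPU that is 16x faster than each of 4 CPU threads.
    speeds = [8000.0] + [500.0] * 4
    for tile in (256, 128, 64, 32, 16):
        print(tile, round(simulate(512, 512, tile, speeds), 2))
```

In this model the 256 px buckets finish far later than the 32 px ones, because the GPU goes idle while the CPU threads are still working on their single large bucket. Smaller buckets shrink that tail, which is the same problem the meta-bucket idea is trying to address.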

4 Likes

Here is my quick and dirty pseudocode. I haven’t actually looked into the Blender code yet, so if anyone is more willing to test this, please don’t hesitate.

// Read input from the user.
if GPU + CPU
	take GPU_bucket_size
	take CPU_bucket_size
	// If the CPU bucket is at least as large as the GPU one, set it equal to the GPU bucket size and render as usual.
	if CPU_bucket_size >= GPU_bucket_size
		assign CPU_bucket_size = GPU_bucket_size
		render
	else
		// The CPU bucket is smaller than the GPU one: divide the GPU bucket size by the CPU bucket size, round up, and square.
		x = ((GPU_bucket_size / CPU_bucket_size) round up) ^ 2
		assign x number of CPU_buckets to CPU_meta_bucket
		if there are not enough CPU_buckets
			create placeholder_buckets for the remaining render space
		clip render region of CPU_meta_bucket
		render
		repeat for remaining number of threads
// Assign a finished bucket to a new or existing CPU meta-bucket.
if rendering CPU_bucket is finished
	if there are placeholder_buckets in existing CPU_meta_bucket
		assign finished CPU_bucket to them
	else
		spawn new CPU_meta_bucket
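
The partitioning step ("divide, round up and square", then clip) could look roughly like this in Python. This is only a sketch of the idea, not Blender code; `split_meta_bucket` and its parameters are names invented for illustration, and buckets are assumed to be square.

```python
import math


def split_meta_bucket(x0, y0, gpu_tile, cpu_tile):
    """Split one GPU-sized meta-bucket at (x0, y0) into CPU sub-buckets.

    Returns a list of (x, y, width, height) rectangles. Sub-buckets that
    would overhang the meta-bucket are clipped to its edge.
    """
    # A CPU bucket is never larger than the GPU bucket.
    cpu_tile = min(cpu_tile, gpu_tile)
    # Sub-buckets per side, rounded up; the total is per_side squared.
    per_side = math.ceil(gpu_tile / cpu_tile)
    rects = []
    for row in range(per_side):
        for col in range(per_side):
            dx, dy = col * cpu_tile, row * cpu_tile
            w = min(cpu_tile, gpu_tile - dx)  # clip at the right edge
            h = min(cpu_tile, gpu_tile - dy)  # clip at the bottom edge
            rects.append((x0 + dx, y0 + dy, w, h))
    return rects


if __name__ == "__main__":
    # A 256 px meta-bucket with 100 px CPU buckets gives a 3x3 grid,
    # where the last row and column are clipped to 56 px.
    for rect in split_meta_bucket(0, 0, 256, 100):
        print(rect)
```

Each CPU thread would then take sub-buckets from the list until the meta-bucket is done, so the CPU as a whole behaves like one device working on a GPU-sized bucket.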
1 Like