Hello! I am working on implementing a small optimization (but huge for my use-case) centered around denoising. I want to check whether a tile is completely transparent, and if so skip the denoising for that tile. I’ve been through session.cpp and I implemented something that I think is on the right path. However, the pipeline is somewhat opaque to me; I’m not sure that I am trying to implement this in the right place.
Basically, I have this function:
bool FullyTransparent(ccl::Tile *tile)
{
  BufferParams *params = &tile->buffers[0].params;
  int pass_stride = params->get_passes_size();
  int size = params->width * params->height;

  /* Assumes the combined (RGBA) pass is first in the buffer, so channel 3 is alpha. */
  float *in_combined = tile->buffers[0].buffer.data();

  for (int i = 0; i < size; i++, in_combined += pass_stride) {
    if (in_combined[3] > 0.01f) {
      return false;
    }
  }
  return true;
}
Down in session::AcquireTile, where we normally have rtile.task = RenderTile::DENOISE; I added a little check:
if (tile->state == Tile::DENOISE) {
  if (!FullyTransparent(tile)) {
    rtile.task = RenderTile::DENOISE;
  }
  else {
    tile->state = Tile::DENOISED;
  }
}
I’m pretty sure that FullyTransparent() is on the right track (I know it doesn’t work quite right yet), but I’m not at all sure whether I’m putting the check in the right place.
I’m not looking for somebody to write the code for me, just point me in the right direction please!
I would add this check in the device code that does the denoising, for example CPUDevice::denoise_openimagedenoise_buffer. For two reasons:
You don’t need to change the overall rendering logic; I don’t think changing the task or state in the tile will be sufficient on its own.
It ensures your pixels have been copied to CPU memory, which your code seems to assume. If you are using a GPU denoiser, you’d need to either copy them to the CPU or do the test on the GPU.
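Roughly, the placement I mean looks like this. It’s only a sketch: the check itself is left as a placeholder since you wanted to write it yourself, and the signature is from memory, so verify it against your source tree.

/* Placement sketch only. tile_is_fully_transparent() is a hypothetical helper:
 * your scan over the combined pass in rtile.buffers. */
void CPUDevice::denoise_openimagedenoise(DeviceTask &task, RenderTile &rtile)
{
  /* Early out before any OpenImageDenoise setup when the tile is empty. */
  if (tile_is_fully_transparent(rtile)) {
    return;
  }

  /* ... existing OpenImageDenoise filter setup and execution ... */
}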
Ah, I understand. I hadn’t considered that the render pass data might not be immediately CPU-accessible. I figured that implementing it in the renderer pipeline would make it more generically useful, but as it happens I AM only interested in OpenImageDenoise.
It looks like I can probably just pop the check into denoise_openimagedenoise. Thanks!
Hehey, I got it! With almost no understanding of what I was doing! I think that’s a skill.
I’d never submit a pull request for this code, but just adding this to the start of the denoising function cut my render time by a third (it will probably be two thirds once I modify my project to take advantage of it):
BufferParams *params = &rtile.buffers[0].params;
int pass_stride = params->get_passes_size();
int size = params->width * params->height;

/* Combined (RGBA) pass is assumed to be first in the buffer, so channel 3 is alpha. */
float *in_combined = rtile.buffers[0].buffer.data();

bool all_transparent = true;
for (int i = 0; i < size; i++, in_combined += pass_stride) {
  if (in_combined[3] > 0.01f) {
    all_transparent = false;
    break;
  }
}
if (all_transparent) {
  return;
}
It’s a little bit tricky, because 99% of the stuff I do with Blender these days is work-related. I would need a sign-off from on high to work on anything for submission upstream. Also, this needs to be toggle-able in the settings, so there is a bit more to do.
But you know, if you want a custom build, the fix is right there.
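For a custom build, one quick way to make it switchable without any UI work (just a sketch; the environment variable name is made up for illustration) would be to gate the early-out behind an environment variable:

#include <cstdlib>

/* Sketch only: lets a custom build turn the optimization on and off without
 * touching the settings UI. CYCLES_SKIP_TRANSPARENT_DENOISE is a made-up name. */
static bool skip_transparent_tiles_enabled()
{
  static const bool enabled = []() {
    const char *env = std::getenv("CYCLES_SKIP_TRANSPARENT_DENOISE");
    return env != nullptr && env[0] == '1';
  }();
  return enabled;
}

Then the scan at the start of the denoise function would only run when skip_transparent_tiles_enabled() returns true.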