Tiled EXR / texture caching support?

I wondered what the plan/timeline might be for supporting formats like tiled EXR to deliver memory-efficient texture handling during rendering with Cycles.

We don’t really have a timeline for any feature, since most Cycles development is done by volunteers or developers working for studios with their own priorities.

All I can do is tell you the current status, which is that there is a branch with an implementation of it (CPU only), but it’s not ready for a stable release.

Sounds perfectly fair. What’s needed to progress it to a state ready for a stable release? Is that branch available somewhere for curious folks to experiment with?

The branch is here:
https://developer.blender.org/diffusion/B/history/cycles_texture_cache/

I’m not sure what exactly is still needed; @StefanW is the one who implemented this.

The actual caching part is working quite well, since it’s OpenImageIO doing all the work. There are a few open issues to solve:

  • Right now it’s placing tiled EXR files next to the original texture without any option for overrides. This should become a user option, since it may not always be desired or even possible to write to that location.
  • In a few cases the texture differentials are off, which not only gives incorrect filtering but also shows up as artefacts in bump mapping.
  • There’s always room for performance improvement.
  • Normal maps are filtered like ordinary maps, which is not ideal. To enable a separate filtering path for normal maps, there needs to be a user-visible setting where normal maps can be marked as such. (The “Color/Non-Color Data” menu could have a third entry for normal maps.)

@StefanW: are those tasks that you’re working on, or do they need someone else to pick them up?

Those are tasks I hope to be able to work on again in a couple of weeks or months - unless someone else feels like helping out, that’s more than welcome!

For the immediate future, I have some more urgent things to work on.

I need some time to figure out how to build that branch and test it with Blender. In the meantime, I have a specific question arising from limitations/issues in other applications’ EXR handling.

In the case of a multi-threaded render, with many different tiled EXR assets being read in, does each thread open its own stream(s) for those files, or is the I/O centrally managed in OpenImageIO?

I’ve run into trouble before with heavy use of tiled EXRs, where the cumulative effect of each thread having its own streams could crash the renderer because the data could not be streamed due to file handle limits, etc. Depending on your approach, this may (or may not) be an issue.

File handle limits are not an issue. All file I/O is handled by OpenImageIO’s texture system, which exposes the max number of open files as a parameter that can be changed.
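To illustrate the idea of centrally managed I/O with a cap on open files, here is a minimal Python sketch. It is not OpenImageIO's implementation and does not use its API; the class `SharedFileCache`, its methods, and the `demo` helper are hypothetical names made up for this example. It only shows why a single shared pool with a least-recently-used eviction policy keeps the number of open handles bounded no matter how many threads are reading.

```python
import os
import tempfile
import threading
from collections import OrderedDict


class SharedFileCache:
    """Hypothetical sketch: one lock-protected pool of open files shared by all threads."""

    def __init__(self, max_open_files):
        self.max_open_files = max_open_files
        self.peak_open = 0
        self._handles = OrderedDict()  # path -> open file, least recently used first
        self._lock = threading.Lock()

    def read(self, path, offset, size):
        with self._lock:
            handle = self._handles.pop(path, None)
            if handle is None:
                # Close least recently used files until there is room for one more.
                while len(self._handles) >= self.max_open_files:
                    _, oldest = self._handles.popitem(last=False)
                    oldest.close()
                handle = open(path, "rb")
            self._handles[path] = handle  # re-insert as most recently used
            self.peak_open = max(self.peak_open, len(self._handles))
            handle.seek(offset)
            return handle.read(size)

    def close(self):
        with self._lock:
            for handle in self._handles.values():
                handle.close()
            self._handles.clear()


def demo(num_files=8, num_threads=4, max_open_files=3):
    """Read more files than the limit allows from several threads at once."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(num_files):
            path = os.path.join(tmp, f"tex_{i}.bin")
            with open(path, "wb") as f:
                f.write(bytes([i]) * 16)
            paths.append(path)

        cache = SharedFileCache(max_open_files)
        errors = []

        def worker():
            for i, path in enumerate(paths):
                if cache.read(path, 0, 4) != bytes([i]) * 4:
                    errors.append(path)

        threads = [threading.Thread(target=worker) for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        peak = cache.peak_open
        cache.close()
    return peak, errors


if __name__ == "__main__":
    peak, errors = demo()
    print("peak open files:", peak, "read errors:", len(errors))
```

With per-thread streams the worst case would be threads times files; with a shared pool the peak never exceeds the configured limit, at the cost of re-opening files that were evicted.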

This is one of the patches that I’m most interested in. Glad to see it gets some attention again.
Just a quick question: is this compatible with GPU out of the box or is it CPU only at the moment?

I come from an Arnold background (Softimage + SItoA) and the texture cache for rendering GBs of textures in a few MBs of RAM is the thing I miss most now that I’m using Blender for mostly everything.

Doing this for the GPU is a whole other story. GPU kernels can’t call OpenImageIO (or any disk access for that matter), so fully out of core (going to disk instead of just to pinned main memory) texturing on GPUs will be more complex.

Thanks for clearing this up.

I pushed some updates to the branch and uploaded a Win64 build:
https://1drv.ms/u/s!Ap47HIkOUUa3hEVXNWlHo01iB1H0

Quick guide:

  • Texture caching works only with external textures, not packed textures. You can exploit this fact to selectively exclude textures from mip mapping/caching by packing them.
  • Tiled mip maps are placed next to the original textures. Make sure Blender has write access to that location. It is planned to make the location of .tx files customizable.
  • Texture cache options are in the “Performance” section of Cycles’ options. They’re largely equivalent to Arnold’s options as they both use OpenImageIO.
  • Texture cache size 0 = texture cache turned off. I recommend a cache size of ~500MB as a starting point. Adjust to your needs based on how much memory your machine has.
  • Most other options can be safely left at their defaults.
  • Diffuse and Glossy blur are speed/quality tradeoffs in the range of 0-1. Larger = faster.
  • When you launch it from the console, it will print some statistics after the render.
  • With mip mapping, it is even more important to make sure that maps like bump, normals, etc. are set to non-color data.
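As a rough way to reason about the cache size setting, the following Python sketch estimates how many tiles fit into a given budget. The tile dimensions, channel count, and the treatment of 1 MB as 1024 * 1024 bytes are assumptions chosen for illustration, not values confirmed for this branch, and the function names are made up for the example.

```python
def tile_bytes(tile_width, tile_height, channels, bytes_per_channel):
    """Memory taken by one decoded tile."""
    return tile_width * tile_height * channels * bytes_per_channel


def tiles_in_cache(cache_mb, bytes_per_tile):
    """Whole tiles that fit in a cache of cache_mb megabytes (1 MB = 1024 * 1024 bytes)."""
    return (cache_mb * 1024 * 1024) // bytes_per_tile


# Assumed example: 64x64 tiles, RGBA, held as 32-bit float in memory.
float_tile = tile_bytes(64, 64, 4, 4)
print(float_tile // 1024, "KB per tile")
print(tiles_in_cache(500, float_tile), "tiles fit in a 500 MB cache")
```

Under these assumptions a 500 MB cache holds a few thousand tiles, so the right size depends mostly on how many distinct tiles a frame touches rather than on the total size of the textures on disk.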

Known issues:

  • Wrong color space used when using OSL.
  • Bump maps may look wrong in some situations.
  • Mip maps for normal maps are not guaranteed to be normalized.
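The last point can be seen with a small Python sketch: averaging unit-length normals, which is what an ordinary mip map filter does, generally yields a shorter vector. The helper names are made up for this example, and renormalizing is shown only as one common remedy, not as what the branch does or plans to do.

```python
import math


def length(v):
    return math.sqrt(sum(c * c for c in v))


def normalize(v):
    l = length(v)
    if l == 0.0:
        return v  # leave a degenerate vector untouched
    return tuple(c / l for c in v)


def average(vectors):
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(3))


# Two unit normals tilted 45 degrees to either side of +Z.
a = normalize((1.0, 0.0, 1.0))
b = normalize((-1.0, 0.0, 1.0))

filtered = average([a, b])          # what a plain box filter produces
print("filtered length:", length(filtered))            # shorter than 1
print("renormalized:", normalize(filtered))            # unit length again
```

A shading calculation that assumes unit-length normals will behave differently at coarser mip levels unless the filtered value is renormalized or the shortening is accounted for.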

This is some exciting news. I’ll give it a try as soon as I can.

Letting it chew through one of my test cases, it’s shaved at least 7 GB off the memory footprint during rendering (down from 24 GB to 17 GB). That’s pleasant. No issues so far in terms of failures and CPU usage seems pretty stable throughout.

I want to see how this works over a network as well.

I’m not yet sure if this is coming from the tiling changes. In a local render with a procedural noise texture, I’m getting bucket-wise differences in the output. I’m trying to build a simplified test case to explore this further.

Some additional feedback. If I disable the auto-convert (as I’m using tiled EXRs and don’t really want a set of .tx files around as well), the render ends up much darker than for .tx files. This is for EXRs driving the color of an emission node. Expected?

Console shows (first for the conversion being prevented, and then for the conversion allowed):

Warning: Total files 27 | Changed 26 | Failed 1

RNA_boolean_get: WM_OT_save_as_mainfile.exit not found.
Cycles shader graph connect: input already connected.
Cycles shader graph connect: input already connected.
OpenImageIO Texture statistics
Options: gray_to_rgb=1 flip_t=1 max_tile_channels=5
Queries/batches :
texture : 34308586 queries in 34308586 batches
texture 3d : 0 queries in 0 batches
shadow : 0 queries in 0 batches
environment : 0 queries in 0 batches
Interpolations :
closest : 11175825
bilinear : 433398531
bicubic : 0
Average anisotropic probes : 10.1
Max anisotropy in the wild : 1e+06

OpenImageIO ImageCache statistics (shared) ver 1.7.15
Options: max_memory_MB=2000.0 max_open_files=100 autotile=64
autoscanline=0 automip=1 forcefloat=1 accept_untiled=1
accept_unmipped=1 read_before_insert=0 deduplicate=1
unassociatedalpha=0 failure_retries=0
Images : 1 unique
ImageInputs : 1 created, 1 current, 1 peak
Total pixel data size of all images referenced : 10.7 MB
Total actual file size of all images referenced : 5.8 MB
Pixel data read : 10.9 MB
File I/O time : 0.9s (0.1s average per thread)
File open time only : 0.0s
Tiles: 349 created, 348 current, 348 peak
total tile requests : 1057733845
micro-cache misses : 27285820 (2.57965%)
main cache misses : 349 (3.29951e-05%)
redundant reads: 1 tiles, 32 KB
Peak cache memory : 21.8 MB
Image file statistics:
opens tiles MB read --redundant-- I/O time res File
1 1 349 10.9 ( 1 0.0) 0.9s 2048x 512x4.f16 D:\blender\tiled_cache_support\frm8k0012_tiled.exr MIP-COUNT[256,64,16,4,2,1,1,1,1,2,0,1]

Tot: 1 349 10.9 ( 1 0.0) 0.9s
Broken or invalid files: 0

Cycles shader graph connect: input already connected.
Cycles shader graph connect: input already connected.
OpenImageIO ImageCache statistics (shared) ver 1.7.15
Options: max_memory_MB=2000.0 max_open_files=100 autotile=64
autoscanline=0 automip=1 forcefloat=1 accept_untiled=1
accept_unmipped=1 read_before_insert=0 deduplicate=1
unassociatedalpha=0 failure_retries=0
No images opened
Tiles: 349 created, 0 current, 348 peak
total tile requests : 0
micro-cache misses : 0 (-nan(ind)%)
main cache misses : 0 (-nan(ind)%)
redundant reads: 1 tiles, 32 KB
Peak cache memory : 0 B
Image file statistics:
opens tiles MB read --redundant-- I/O time res File
BROKEN D:\blender\tiled_cache_support\frm8k0012_tiled.exr

Tot: 0 0 0.0 ( 1 0.0) 0.0s
1 was constant-valued in all pixels
Broken or invalid files: 0

Cycles shader graph connect: input already connected.
Cycles shader graph connect: input already connected.
OpenImageIO Texture statistics
Options: gray_to_rgb=1 flip_t=1 max_tile_channels=5
Queries/batches :
texture : 34308586 queries in 34308586 batches
texture 3d : 0 queries in 0 batches
shadow : 0 queries in 0 batches
environment : 0 queries in 0 batches
Interpolations :
closest : 11175825
bilinear : 433398648
bicubic : 0
Average anisotropic probes : 10.1
Max anisotropy in the wild : 1e+06

OpenImageIO ImageCache statistics (shared) ver 1.7.15
Options: max_memory_MB=2000.0 max_open_files=100 autotile=64
autoscanline=0 automip=1 forcefloat=1 accept_untiled=1
accept_unmipped=1 read_before_insert=0 deduplicate=1
unassociatedalpha=0 failure_retries=0
Images : 1 unique
ImageInputs : 3 created, 2 current, 2 peak
Total pixel data size of all images referenced : 21.3 MB
Total actual file size of all images referenced : 9.3 MB
Pixel data read : 21.8 MB
File I/O time : 0.7s (0.0s average per thread)
File open time only : 0.0s
Tiles: 698 created, 348 current, 348 peak
total tile requests : 1057739090
micro-cache misses : 27285547 (2.57961%)
main cache misses : 349 (3.29949e-05%)
redundant reads: 2 tiles, 96 KB
Peak cache memory : 21.8 MB
Image file statistics:
opens tiles MB read --redundant-- I/O time res File
1 1 0 0.0 ( 1 0.0) 0.0s 2048x 512x4.f16 D:\blender\tiled_cache_support\frm8k0012_tiled.exr MIP-COUNT[0,0,0,0,0,0,0,0,0,0,0,0]
2 1 349 21.8 ( 1 0.1) 0.7s 2048x 512x4.f32 D:\blender\tiled_cache_support\frm8k0012_tiled.tx MIP-COUNT[256,64,16,4,2,1,1,1,1,2,0,1]

Tot: 2 349 21.8 ( 2 0.1) 0.7s
Broken or invalid files: 0

Right now the cache works on the assumption that all cache files are in linear color space. Mip mapping/filtering in any non-linear space will give the wrong results.
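A short Python sketch of why the color space matters for filtering: averaging two pixels stored with the sRGB transfer curve gives a different result than decoding to linear, averaging, and encoding again. The sRGB curve is used here only as a familiar example of a non-linear encoding; the function names are made up for the example, and nothing here is taken from the branch.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded channel value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode one linear channel value in [0, 1] with the sRGB curve."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1.0 / 2.4) - 0.055


black, white = 0.0, 1.0  # two neighbouring pixels, stored sRGB-encoded

# Filtering the encoded values directly.
naive = (black + white) / 2.0

# Filtering in linear space, then encoding the result again.
correct = linear_to_srgb((srgb_to_linear(black) + srgb_to_linear(white)) / 2.0)

print("average of encoded values:", naive)
print("average in linear space, re-encoded:", round(correct, 3))
```

The two results differ noticeably (0.5 versus roughly 0.735), so a cache file in a non-linear space that is filtered as if it were linear will come out with the wrong brightness.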

Do your cache files contain mip maps? Files that are tiled but don’t contain mip maps can cost quite a performance penalty.
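For reference, a mip chain halves the resolution at each level until it reaches 1x1. The Python sketch below computes the chain and the number of tiles per level for the 2048x512 test image from the log above. The 64x64 tile size is an assumption, and reading the log's MIP-COUNT list as per-level tile counts is my interpretation rather than something stated in the thread.

```python
def mip_chain(width, height):
    """Resolutions of all mip levels, from full size down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def tiles_per_level(levels, tile_size):
    """Tiles needed to cover each level (ceiling division per axis)."""
    return [(-(-w // tile_size)) * (-(-h // tile_size)) for w, h in levels]


levels = mip_chain(2048, 512)
tiles = tiles_per_level(levels, 64)  # assumed 64x64 tiles

print(len(levels), "mip levels")
print(tiles)
print(sum(tiles), "tiles in total")
```

This gives 12 levels and 349 tiles, the same figures as the number of MIP-COUNT entries and the "349 created" tile count in the log. Without the smaller levels, a distant or heavily minified lookup has to pull in many full-resolution tiles, which is where the performance penalty comes from.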

They ought to. I did the conversion in modo and it’s supposed to perform the colorspace conversion and mipmap them automatically. I’ve uploaded some test content here: Test for tiled EXR vs tx

For a 15 GB TrueMarble tiled EXR (172800 x 86400), which renders fine in modo and was originally built a few years ago using db&w’s infiniMap EXR converter, trying to use it in this branch of Blender gives:

Calloc returns null: len=2655518720 in imb_addrectfloatImBuf, total 14804172
Error : EXCEPTION_ACCESS_VIOLATION
Address : 0x00007FF651BF69F5
Module : D:\Downloads\blender texture cache Win64\blender.exe

Also, when I tried to replace an image in an existing material network and render, I got this error:

Malloc returns null: len=3475243008 in Cycles Aligned Alloc, total 764703032

The box has 64 GB RAM.
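One observation about the first failure, offered as arithmetic rather than a diagnosis: if the whole image were loaded at full resolution as a 4-channel, 32-bit float buffer (an assumption on my part, based only on the function name in the message), the size would be far beyond 64 GB, and the `len` value in the Calloc message equals that size wrapped to 32 bits.

```python
width, height = 172800, 86400
channels = 4            # assumed RGBA
bytes_per_channel = 4   # assumed 32-bit float

full_size = width * height * channels * bytes_per_channel
wrapped = full_size % 2**32  # what an unsigned 32-bit size computation would hold

print("full-resolution buffer:", full_size, "bytes")
print("in GiB:", round(full_size / 2**30, 1))
print("wrapped to 32 bits:", wrapped)
print("matches reported len:", wrapped == 2655518720)
```

Whether a 32-bit overflow actually occurs in that code path would need to be checked in the source; either way, an image of this size can only work if it is read tile by tile and never loaded in full.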