GSoC 2025: Improving Sampling in the Compositor

Proposal link: Google Summer of Code

Benjamin Beilharz
Blender Chat: Ben Beilharz @ben:blender.org
email redacted

Some handles:

  • GitHub: pixelsandpointers
  • BlueSky: ben.graphics

Synopsis

The goal of this project is to improve the compositor's sampling situation. This includes exposing the interpolation option to all nodes that rely on it; the interpolation options are already partially implemented, as seen in #119592.

Furthermore, sampling requires border conditions, such as zero, extend, or wrap. Currently, these conditions are not exposed to the user; the wrap condition is used across methods. The conditions should also be selectable in the node, giving users more freedom.
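For illustration, here is a minimal sketch of the three conditions for a 1D texel index (the helper names are mine, not Blender's API; in 2D this is applied per axis):

```cpp
#include <algorithm>

/* Hypothetical helpers showing the three border conditions for a
 * texel index x in an image of width w. */
int border_extend(int x, int w)
{
  return std::clamp(x, 0, w - 1); /* extend: repeat the edge texel */
}

int border_wrap(int x, int w)
{
  const int m = x % w;
  return m < 0 ? m + w : m; /* wrap: tile the image periodically */
}

bool border_is_inside(int x, int w)
{
  return x >= 0 && x < w; /* zero: out-of-range samples contribute 0 */
}
```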

Lastly, there is one sampling method that requires special attention: BLI_ewa_filter.
Currently, EWA sampling uses different implementations on CPU and GPU, leading to different results across devices and operating systems. Blender uses EWA sampling primarily for anisotropic texture sampling. The goal is to write a CPU implementation of the EWA filter that is transferable to the GPU, aiming for similar or equal results on both, so it can be used in the compositor to enable anisotropic compositing. Literature is available:

  • Heckbert’s original algorithm [1]
  • Modern GPU-based approximations exist [2]

To summarize the project proposal:

  • Unify compositor nodes by exposing interpolation and border conditions to eligible nodes
  • Improve the EWA sampling situation by writing a coherent implementation for CPU and GPU that produces similar results

Benefits

Exposing the sampling and border conditions will give users more freedom in the compositing workflow.
An anisotropic texture filtering node would allow users to perform high-quality resampling with reduced aliasing, particularly useful after transformations such as rotation, scaling, or camera projection. This improvement will enhance texture and compositing workflows.

Deliverables

  • Compositor nodes with exposed interpolation and border conditions
  • A modernized CPU implementation of EWA filtering.
  • A GPU implementation of EWA filtering based on the CPU implementation.

Project Schedule

| Phase | Duration | Tasks |
| --- | --- | --- |
| Literature Research | 1-2 weeks | Familiarize with the EWA literature, researching modern methods and approximations. Comparative analysis of methods. Potential discussion with mentors to weigh factors and pick an implementation. |
| Phase 1: CPU Implementation | ~2-3 weeks | Rewrite and optimize BLI_ewa_filter, add unit tests. |
| Phase 2: GPU Implementation | ~1-2 weeks | Adapt a GPU version based on the CPU one. |
| Phase 3: Interpolation Exposure | 2 weeks | Add EWA to the interpolation methods and expose it to eligible nodes. |
| Phase 4: Border Condition Exposure | 2 weeks | Add border conditions to eligible nodes. |
| Testing & Final Adjustments | 2 weeks | Bug fixes and documentation. |

Total: 175 hours over 13 weeks (flexible start date from May 16).

Potential Challenges & Considerations

  • Ensuring the filter is robust, producing the same results across different hardware and platforms
  • Potentially writing an approximation of EWA sampling

Bio

G’day, this is Ben! I’m a PhD student at TU Darmstadt, researching the intersection of physically based rendering (PBR) and vision science. I hold a Bachelor’s in Computational Linguistics (Heidelberg University) and a Master’s in Computer Science with a focus on Visual Computing (Technical University of Darmstadt).

In my free time, I enjoy learning new languages, traveling, photography, and working on some personal projects.

My passion for computer graphics was ignited when I watched Avatar, set aside for a while, and reignited after Avatar 2. This led me to pivot from NLP/AI to PBR and differentiable rendering. Since my university lacks dedicated global illumination courses, I've been self-teaching anything PBR.

My aspiration is to work as a rendering engineer someday, but that requires more time on my end to hone my skills over the next three years, which I hope to achieve in part by contributing to Blender and growing with those contributions.

For development, I have mainly been in the machine learning domain and have become comfortable with Python, having worked on research projects and at companies and startups. I also taught Python at university to first-year students for a year. In 2023, I was part of the ASWF summer learning program, where I learned to write custom tooling for DCCs in Python and got in touch with MaterialX and OSL. C++ became increasingly important to me as my interest in CG grew. I haven't been able to write C++ on a day-to-day basis, but I have completed some projects, such as a minimalistic 3D editor using OpenGL and some rendering projects. So there is a strong urge to write more C++ and become as comfortable with it as with Python. Apart from contributing to Blender and potentially to open standards at the ASWF, I have also started working on a production renderer as a personal project that I want to develop over the coming years.

After GSoC was announced, I looked into the proposed projects and realistically picked one that seemed slightly above my current C++ knowledge, so I have room to grow while learning more about Blender's internals: the compositor. I opened a PR for a good first issue. It touched on surprisingly many things (the RNA/DNA system, the node system, UI builder patterns, GPU contexts, etc.), so it was a very nice introduction to Blender and the APIs that will be used to implement this proposal.

References

[1] Heckbert, P. S. (1989). Fundamentals of texture mapping and image warping. Master's thesis, University of California, Berkeley.
[2] Mavridis, P., & Papaioannou, G. (2011). High-quality elliptical texture filtering on GPU. In Symposium on Interactive 3D Graphics and Games (pp. 23-30).


Week 1


This week, I have mainly been looking into the state of the leftover issues regarding interpolation in the compositor's transformation nodes (#119592). The last node still missing the interpolation mode was the displacement node, for which I filed PR #139802. While working on that PR, I encountered some nodes that already had EWA interpolation. Therefore, I refactored some DNA and RNA code and exposed EWA in the compositor's interpolation enum in another PR, #139833.

Alongside fulfilling #119592, I also started reading up on EWA filtering to get a general idea. The rest of this update is a summary of my findings.

Elliptical Weighted Average sampling

Elliptical Weighted Average (EWA) sampling and filtering is a technique primarily used in computer graphics to improve the quality of textures on surfaces that are viewed at oblique angles. It’s a form of anisotropic filtering, meaning it accounts for the distortion of the texture footprint when a surface is not perpendicular to the viewer.

When a textured surface is viewed at a steep angle, a screen pixel doesn’t map to a simple square or circle on the texture. Instead, it projects to an ellipse in texture space. Simpler filtering methods (like bilinear or trilinear filtering) treat the texture sampling area as a square, leading to blurriness (if the sampling area is too large) or aliasing/jaggies (if it’s too small or doesn’t cover the correct texels).

EWA filtering accurately models this elliptical footprint. It then calculates a weighted average of the texels (texture pixels) that fall within this ellipse. The weighting is typically done using a Gaussian function, giving more importance to texels closer to the center of the ellipse. This approach helps to reduce aliasing and preserve texture detail better than isotropic (uniform direction) filtering methods.
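To make this concrete, here is a minimal, unoptimized CPU sketch of the EWA loop. The conic form of the ellipse and the helper names are my own illustration of the technique, not Blender's BLI_ewa_filter:

```cpp
#include <algorithm>
#include <cmath>

struct Color { float r = 0, g = 0, b = 0, a = 0; };

/* Texel fetch with border handling simplified to "extend". */
Color fetch_texel(const Color *image, int width, int height, int x, int y)
{
  x = std::clamp(x, 0, width - 1);
  y = std::clamp(y, 0, height - 1);
  return image[y * width + x];
}

/* Gaussian-weighted average over the ellipse A*u^2 + B*u*v + C*v^2 <= F,
 * centered at (cx, cy) in texel space. */
Color ewa_filter(const Color *image, int width, int height,
                 float cx, float cy, float A, float B, float C, float F)
{
  /* Normalize so the conic evaluates to 1 on the ellipse boundary. */
  A /= F; B /= F; C /= F;
  /* Axis-aligned bounding box of the ellipse. */
  const float det = 4.0f * A * C - B * B;
  const float u_half = 2.0f * std::sqrt(C / det);
  const float v_half = 2.0f * std::sqrt(A / det);
  const int x0 = int(std::ceil(cx - u_half)), x1 = int(std::floor(cx + u_half));
  const int y0 = int(std::ceil(cy - v_half)), y1 = int(std::floor(cy + v_half));

  Color sum;
  float weight_sum = 0.0f;
  for (int y = y0; y <= y1; y++) {
    const float v = y - cy;
    for (int x = x0; x <= x1; x++) {
      const float u = x - cx;
      const float q = A * u * u + B * u * v + C * v * v;
      if (q >= 1.0f) continue; /* texel lies outside the ellipse */
      const float w = std::exp(-2.0f * q); /* Gaussian falloff in q */
      const Color t = fetch_texel(image, width, height, x, y);
      sum.r += w * t.r; sum.g += w * t.g; sum.b += w * t.b; sum.a += w * t.a;
      weight_sum += w;
    }
  }
  if (weight_sum > 0.0f) {
    sum.r /= weight_sum; sum.g /= weight_sum;
    sum.b /= weight_sum; sum.a /= weight_sum;
  }
  return sum;
}
```

The cost of this loop is proportional to the number of texels inside the ellipse's bounding box, which is exactly what the comparative analysis below is about.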

Terminology

  • Anisotropic filtering: A general term for texture filtering techniques that account for the directionality of the texture mapping. EWA is a high-quality method of anisotropic filtering.
  • Footprint: The shape on the texture that corresponds to a single pixel on the screen. In EWA, this is an ellipse.
  • Resampling Filter: EWA combines a reconstruction filter (to reconstruct the continuous texture signal from discrete texels) and a low-pass filter (to prevent aliasing when sampling this continuous signal for the screen pixel). Often, both are Gaussian functions, and their combination results in another Gaussian (see the identity after this list).
  • Splatting: While originally developed for texture filtering, the EWA concept has been extended to other areas like point rendering and volume rendering. In these contexts, “EWA splatting” refers to projecting an elliptical (or ellipsoidal in 3D) kernel onto the screen for each point or voxel, effectively “splatting” its contribution over an area.
  • Screen-Space EWA: This is the classic formulation, notably detailed by Heckbert. In this approach, the filter is initially defined in screen space. For instance, a circular pixel footprint is considered in the output image. This screen-space footprint is then inversely mapped into the 2D texture space. Due to the nature of perspective projection and texture mapping, this inverse mapping typically transforms the circular screen-space region into an ellipse in texture space. The EWA filter then computes a weighted average of the texels that fall under this ellipse. While conceptually direct, this method can be computationally intensive due to the per-pixel inverse mapping required to determine the ellipse parameters and the subsequent iteration over texels in texture space.
  • Object-Space EWA: In contrast, object-space EWA formulates the filter in the object’s local parameter space (e.g., UV coordinates on a mesh) or directly on surface elements like “surfels” (surface elements, often used in point-based rendering). These object-space filters, which are themselves elliptical, are then projected or rendered onto the screen. This approach can be more amenable to hardware acceleration using standard graphics pipelines, as it can leverage GPU vertex and pixel shaders more directly. For example, surfel polygons can be deformed in a vertex shader to match the view-dependent EWA filter shape, and then rasterized and textured. This method is particularly efficient if a single object-space primitive (like a surfel or a polygon) covers multiple screen pixels, as the filter setup cost can be amortized.
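The identity behind the Resampling Filter entry above is that convolving two Gaussians yields another Gaussian, so a reconstruction kernel with covariance $V_r$ and a low-pass kernel with covariance $V_h$ combine analytically into a single filter:

$$
\left(\mathcal{G}_{V_r} \otimes \mathcal{G}_{V_h}\right)(\mathbf{x})
= \mathcal{G}_{V_r + V_h}(\mathbf{x}),
\qquad
\mathcal{G}_V(\mathbf{x}) = \frac{1}{2\pi\sqrt{|V|}}\,
  e^{-\frac{1}{2}\,\mathbf{x}^{\mathsf{T}} V^{-1} \mathbf{x}}
$$

This is what makes the combined EWA resampling filter cheap to set up: only the covariance matrices need to be added.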

Core papers

Heckbert, Fundamentals of Texture Mapping and Image Warping

  • introduces EWA sampling
  • screen pixel projects to an ellipse in texture space
  • uses a Gaussian filter weighted over the ellipse
  • effectively addresses aliasing and blurring at oblique angles
  • combines the reconstruction and low-pass filters into a single filter

Zwicker et al. 2001, Surface Splatting

  • introduced the screen space EWA formulation, which allows the filter to be evaluated in screen space

Mavridis et al. 2011, High Quality Elliptical Texture Filtering on GPU

  • brings EWA quality to real-time applications, leveraging the GPU
  • uses approximations/multi-pass techniques
  • uses GPU hardware features
  • may not be helpful for the CPU implementation

Comparative analysis

The computational complexity of EWA filtering is a critical factor for its practical implementation, especially in interactive applications like Blender’s compositor. This section examines the key determinants of computational cost and profiles the complexity of different EWA formulations.

Several factors contribute to the overall computational complexity of EWA methods:

  1. Filter Extent/Area: The primary determinant is the size of the elliptical filter kernel in texture space. This area, representing the number of texels that contribute to a single output pixel, varies with the degree of minification and anisotropy; the cost is directly proportional to this projected area.
  2. Jacobian Calculation: For screen-space EWA, computing the partial derivatives of the screen-to-texture mapping (the Jacobian matrix) is necessary for each pixel to define the ellipse.
  3. Texture Fetches: The number of texture samples read from memory per output pixel. This is directly related to the filter extent and is a major bottleneck, especially on GPUs.
  4. Arithmetic Operations: These include calculations for ellipse parameters, evaluating the weighting function (e.g., Gaussian) for each texel, and accumulating the weighted texel values.
  5. Precomputation: Some EWA variants might involve pre-processing steps. For example, in the context of point-based rendering, optimal sampling of textures to irregular point data might be done as a pre-process. While standard mipmaps are a form of prefiltering, EWA’s dynamic nature usually requires more on-the-fly computation.
  6. Control Flow: In GPU implementations, conditional branching and divergent execution paths (e.g., if filter complexity or loop iterations vary between adjacent pixels processed in parallel) can negatively impact performance.

The “projected texture area” in texture space, corresponding to a single screen pixel, is the core element of EWA’s cost. The cost per screen pixel is proportional to the number of texture pixels within this projected ellipse. The size and shape of this area are determined by the Jacobian of the texture mapping. Thus, the central challenge in optimizing EWA lies in efficiently calculating or approximating this Jacobian and then effectively summing the contributions of texels within the resulting ellipse.
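As a concrete sketch of this setup step, the conic coefficients can be built from the screen-space derivatives of the texture coordinates, following Heckbert's construction (the variable names are my own):

```cpp
/* Ellipse A*u^2 + B*u*v + C*v^2 = F in texture space for one screen
 * pixel, derived from the Jacobian of the screen-to-texture mapping
 * (du_dx = du/dx, and so on). */
struct EllipseCoefficients { float A, B, C, F; };

EllipseCoefficients ellipse_from_jacobian(float du_dx, float dv_dx,
                                          float du_dy, float dv_dy)
{
  EllipseCoefficients e;
  e.A = dv_dx * dv_dx + dv_dy * dv_dy;
  e.B = -2.0f * (du_dx * dv_dx + du_dy * dv_dy);
  e.C = du_dx * du_dx + du_dy * du_dy;
  const float det = du_dx * dv_dy - du_dy * dv_dx;
  e.F = det * det;
  return e;
}
```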

| Method | Core Idea | Primary Computational Steps | Qualitative CPU Complexity | Qualitative GPU Complexity | Key Advantages | Key Disadvantages/Challenges | Relevance/Potential for Blender Compositor |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Screen-Space EWA (Exact/Naive) | Inverse map pixel footprint to texture space ellipse; sum weighted texels. | Jacobian, ellipse params, iterate texels in bounding box, check inside ellipse, weight, fetch, accumulate. | High, variable (O(N_ellipse)) | High, variable (shader-bound, texture-bound due to many fetches & dynamic loops) | High quality (ground truth). | Very slow for large ellipses; thread divergence on GPU. | CPU reference; high-quality offline option if performance allows. |
| Screen-Space EWA (Optimized Fetches) | Exact EWA using hardware bilinear/trilinear fetches to sample 2x2 texel quads. | Similar to naive, but fewer, higher-quality fetches. | Medium-high, variable | Medium-high, variable (fewer fetches, but still dynamic loops) | Exact quality with 2-4x fewer fetches than naive. | Still potentially slow for very large ellipses. | Better GPU exact option; possible high-quality mode. |
| Object-Space EWA (Exact) | Define EWA filter on object primitives; project to screen. | Deform primitive (vertex shader), rasterize, sample texture (pixel shader). | Medium (per primitive) | Medium (vertex-bound for deformation, pixel-bound for texturing) | Amortizes filter setup over pixels; good for large primitives. | More complex setup; visibility handling (splatting) adds overhead. | Relevant for 3D compositing operations or point-based elements. |
| GPU Hardware Approx. (Multi-Probe) | Approximate EWA ellipse with few (e.g., 5-16) hardware anisotropic/linear samples. | Calculate probe locations/orientations, perform N hardware-accelerated fetches, combine weighted results. | N/A | Low-medium, fixed (texture-bound due to N fetches) | Fast, predictable performance; good quality for moderate anisotropy. | Approximation (may show artifacts in extreme cases); quality depends on N and hardware. | Ideal for real-time viewport compositor; default GPU option. |
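To illustrate the multi-probe row above, here is a rough sketch in the spirit of [1] (not its exact formulation); sample_bilinear stands in for a hardware-filtered fetch:

```cpp
#include <cmath>

struct Color { float r = 0, g = 0, b = 0, a = 0; };

/* Stand-in for a hardware bilinear/anisotropic fetch at continuous
 * texel coordinates; a real implementation would sample the texture. */
static Color sample_bilinear(float /*u*/, float /*v*/) { return {}; }

/* Approximate the EWA ellipse with N probes spaced along its major
 * axis: (cu, cv) is the ellipse center, (mx, my) the half major-axis
 * vector not already covered by the hardware filter. */
Color ewa_multi_probe(float cu, float cv, float mx, float my, int N)
{
  Color sum;
  float weight_sum = 0.0f;
  for (int i = 0; i < N; i++) {
    /* t in [-1, 1]: parametric position of probe i on the major axis. */
    const float t = (N > 1) ? 2.0f * i / float(N - 1) - 1.0f : 0.0f;
    const float w = std::exp(-2.0f * t * t); /* Gaussian falloff */
    const Color c = sample_bilinear(cu + t * mx, cv + t * my);
    sum.r += w * c.r; sum.g += w * c.g; sum.b += w * c.b; sum.a += w * c.a;
    weight_sum += w;
  }
  sum.r /= weight_sum; sum.g /= weight_sum;
  sum.b /= weight_sum; sum.a /= weight_sum;
  return sum;
}
```

Because N is fixed and small, this variant has the predictable, texture-bound cost the table attributes to it, at the price of being an approximation.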

Sources:
[1] P. Mavridis and G. Papaioannou, “High quality elliptical texture filtering on GPU,” in Symposium on Interactive 3D Graphics and Games, San Francisco California: ACM, Feb. 2011, pp. 23–30. doi: 10.1145/1944745.1944749.

[2] M. Zwicker, H. Pfister, J. Van Baar, and M. Gross, “Surface splatting,” in Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM, Aug. 2001, pp. 371–378. doi: 10.1145/383259.383300.

Next week

  • merge open PRs
  • more in-depth reading
  • summarize comparable results to discuss in module meeting (what trade-offs are acceptable)
  • look at existing EWA implementation
  • potentially start drafting the boundary condition PR