Cycles feedback

I think he is pointing out that even with high sample counts the details are lost in OIDN.

This is strange, because NLM has been consistently worse than Open Image Denoise (OIDN) for me.

I just disagree with that

This is just a test. I have always used NLM below 600 samples in animations, and I can state that it has a lot of potential and deserves to stay in Blender. OIDN wipes out fine details even with a large number of samples. It’s good to always have both denoisers as options.

I have done amazing animations in versions 2.79 to 2.83 using only NLM, and I can say that it was always very useful.
Some people in this topic should learn more about compositing and know when to use each denoiser for each situation!

Your experience seems to be the opposite of mine. I don’t understand why your results are so different from mine; I am just stating that from day 1 NLM has been giving me this result:

That was how it ruined my first render of the donut scene in version 2.78. After that I avoided Blender’s denoiser and used third-party software instead, until 2.81 brought the AI denoiser, which was such big news.

EDIT: Look here, Blender Guru once stated this:



Andrew is right, but that doesn’t mean the native denoiser isn’t good for other purposes.
It is better to use OIDN for materials with reflections and glass shaders.
Do not use NLM in very dark scenes or with very few samples. Use portals in closed scenes such as interiors. You need to balance or separate the objects into view layers and then do the compositing.
You don’t have to render everything in just one scene.
Here’s an example of what I do: view layers.
It’s not a rule, just a hint.

Hello, I want to share my test with Cycles X.
I only did a CPU benchmark comparing Cycles and Cycles X.
I have an AMD Epyc workstation (2x 7742) with 128 cores. I rendered one frame from one of my latest projects.
Cycles: 4m 4s
Cycles X: 2m 56s
The speedup on this scene on the CPU side is impressive: about 28% less render time!
Keep it up!
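
For anyone comparing numbers across posts, note that the 28% figure above is a reduction in render time, which is not the same thing as the speedup factor. A minimal sketch of the arithmetic, using only the two timings quoted above (the helper name `to_seconds` is just for illustration):

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' timing to plain seconds."""
    return minutes * 60 + seconds

cycles = to_seconds(4, 4)      # Cycles:   4m 4s
cycles_x = to_seconds(2, 56)   # Cycles X: 2m 56s

time_saved = (cycles - cycles_x) / cycles   # fraction of render time removed
speedup = cycles / cycles_x                 # how many times faster

print(f"render time reduced by {time_saved:.1%}")   # 27.9%
print(f"speedup factor: {speedup:.2f}x")            # 1.39x
```

So the same result can be read as "28% less time" or "about 1.39x faster", depending on which convention a post uses.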

I agree OIDN is really better. It’s the one I use. But it would be better if texture details were better preserved.
If passes are denoised in the compositor, the textures keep their details. But I found issues with Cycles X passes like diffdir and diffind when the shader is a mix of Principled and Translucent. It is kind of subtle, but the results are darker than they should be.
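
To illustrate why per-pass denoising keeps texture detail, here is a toy sketch, not Blender code. It assumes the usual recombination of diffuse passes, (diffdir + diffind) * diffcol, and uses a simple box blur as a stand-in for a real denoiser. All sample values are made up for the example.

```python
def box_blur(signal, radius=1):
    """Stand-in for a denoiser: average each sample with its neighbours."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

def contrast(signal):
    """Crude measure of how much texture detail survives."""
    return max(signal) - min(signal)

# Hypothetical 1D "scanline": a striped texture under noisy lighting.
diff_col = [0.2, 0.8] * 4                             # texture, noise-free
diff_dir = [0.7, 0.5, 0.6, 0.8, 0.5, 0.7, 0.6, 0.6]   # noisy direct light
diff_ind = [0.4, 0.4, 0.3, 0.4, 0.4, 0.5, 0.3, 0.5]   # noisy indirect light

light = [d + i for d, i in zip(diff_dir, diff_ind)]
combined = [l * c for l, c in zip(light, diff_col)]

# Option A: denoise the final image, which smears the texture too.
blurred_combined = box_blur(combined)

# Option B: denoise only the lighting, then multiply by the sharp texture.
per_pass = [l * c for l, c in zip(box_blur(light), diff_col)]

print("contrast, denoised combined:", round(contrast(blurred_combined), 3))
print("contrast, per-pass denoise: ", round(contrast(per_pass), 3))
```

In this toy case option B keeps clearly more of the stripe contrast, because the texture never passes through the blur. It does not model the darker diffdir/diffind results mentioned above.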

Those are very impressive numbers!
Can I ask what OS you are using and what scene that was?
I cannot wrap my head around CPU performance in Cycles X, considering the performance regression seen in my benchmarks.

Hello, I am using Windows 10 Pro for Workstations as the OS, because of the many cores and the AMD Epyc processors.
As for the scene, as I wrote, I used one of my latest projects. See screenshot.

Maybe the speedup in CPU rendering is only seen on my computer? With so many cores, performance did not scale proportionally. I tested the same scene on a Threadripper 1950X (Zen) with 16 cores. The 128-core machine was only about 5x as fast despite having 8x the core count. I thought it was a threading problem…
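
A rough way to put a number on that scaling gap, assuming "500%" means about 5x as fast and ignoring that the two CPUs differ in generation and clock speed:

```python
cores_small, cores_big = 16, 128
observed_speedup = 5.0   # "500%" read as 5x as fast (assumption)

ideal_speedup = cores_big / cores_small          # 8x if scaling were perfect
efficiency = observed_speedup / ideal_speedup    # share of the ideal achieved

print(f"ideal {ideal_speedup:.0f}x, observed {observed_speedup:.0f}x, "
      f"scaling efficiency {efficiency:.1%}")
# prints: ideal 8x, observed 5x, scaling efficiency 62.5%
```

This only says the big machine reached about 62.5% of linear core scaling; it cannot tell threading overhead apart from per-core speed differences.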

CPU rendering performance is not something we have tuned yet. There are multiple factors that depend on the scene and hardware; some will make it slower and some will make it faster than before. For a 128-core machine, I imagine that scheduling work at a per-pixel level rather than a per-tile level is helpful.
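
A toy model of why finer-grained work can help on a machine with many cores. This is not how Cycles X actually schedules work; it assumes every work item costs the same, and the resolution and tile size are hypothetical values picked for illustration.

```python
import math

def utilisation(work_items, workers):
    """Fraction of worker time spent busy if every item costs the same."""
    rounds = math.ceil(work_items / workers)
    return work_items / (rounds * workers)

width, height, workers = 1920, 1080, 128
tile = 240  # hypothetical tile size, for illustration only

tiles = math.ceil(width / tile) * math.ceil(height / tile)   # 8 * 5 = 40
pixels = width * height

print(f"{tiles} tiles -> utilisation {utilisation(tiles, workers):.0%}")
print(f"{pixels} pixels -> utilisation {utilisation(pixels, workers):.0%}")
```

With only 40 tiles, at most 40 of the 128 threads have anything to do, while per-pixel work items keep all of them busy in this simplified model. Real scenes have uneven per-pixel cost, so actual behaviour will differ.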

We think all the performance regressions we know of can be solved, but Cycles X is a prototype for GPU rendering at the moment; we’ll get to CPU performance later.

For GPU rendering, there were some additional optimizations in the past few days, especially for RTX cards. If anyone is curious here are new charts based on our testing.

[Chart: Quadro RTX 6000 (OptiX)]

[Chart: Quadro RTX A6000 (OptiX)]

I just downloaded the new build but on my end it is still slower than master. Is it because I am using a lower end GPU? You guys seem to be testing on RTX a lot, but my lower end GPU does not seem to benefit much

EDIT: To be fair though, Cycles X does seem to be more stable than master. Some scenes that would crash master on my computer rendered normally on Cycles X. I do appreciate the stability.

It may depend on the GPU, you didn’t mention which one you are using. There may be some bottleneck, but it’s too early for us to test on a wider range of GPUs and scenes.

OK, I am using an MX130 on my laptop.

Hi there!
Great to see the progress you are making with this new version of Cycles!
I am really looking forward to seeing what comes of it in the next few months, especially with “indoor” scenes.
I just tested today’s Cycles X build against the 2.93 daily build on macOS, on the CPU.
Metin’s “monster under the bed” scene went from 8m48 (v2.93) to 5m28 on an i9-10850K.
I noticed the displacement on the mask’s shader is causing a few problems.
Keep it up!

I did some testing using my own test scene (100 samples, OIDN)

3600X / 1660GTX (466.11)

Time in seconds:

CPU + GPU:

Cycles (2.92) (02948a2cab44) / Tiles x240 / CUDA: 23:71
Cycles (2.92) (02948a2cab44) / Tiles x240 / OptiX: 19:81

Cycles (2.93) (96abe5ebbc55) / Tiles x240 / CUDA: 22:55
Cycles (2.93) (96abe5ebbc55) / Tiles x240 / OptiX: 18:79

Cycles (3.0) (d08cc63e2f98) / Tiles x240 / CUDA: 22:64
Cycles (3.0) (d08cc63e2f98) / Tiles x240 / OptiX: 18:53


GPU Only:

Cycles (2.92) (02948a2cab44) / Tiles x240 / CUDA: 28:36
Cycles (2.92) (02948a2cab44) / Tiles x240 / OptiX: 23:15

Cycles (2.93) (96abe5ebbc55) / Tiles x240 / CUDA: 27:32
Cycles (2.93) (96abe5ebbc55) / Tiles x240 / OptiX: 21:69

Cycles (3.0) (d08cc63e2f98) / Tiles x240 / CUDA: 27:44
Cycles (3.0) (d08cc63e2f98) / Tiles x240 / OptiX: 21:80

Cycles X (1dea1d93d39a) CUDA: 13:62
Cycles X (1dea1d93d39a) OptiX: 14:17

Cycles X (43789b764d9a) CUDA: 12:45
Cycles X (43789b764d9a) OptiX: 12:98

Cycles X (a176118b287f) CUDA: 10:47
Cycles X (a176118b287f) OptiX: 10:89
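
For readers who want speedup factors rather than raw times, here is a small sketch using the GPU-only numbers above. It assumes the colon separates whole seconds from hundredths, so "27:44" means 27.44 seconds; that reading is a guess based on the "Time in seconds" label.

```python
def parse_time(text):
    """Read '27:44' as 27.44 seconds (assumed meaning of the colon)."""
    whole, frac = text.split(":")
    return int(whole) + int(frac) / 100

# GPU-only timings copied from the list above.
baseline = {"CUDA": "27:44", "OptiX": "21:80"}   # Cycles (3.0) d08cc63e2f98
cycles_x = {"CUDA": "10:47", "OptiX": "10:89"}   # Cycles X a176118b287f

speedups = {
    backend: parse_time(baseline[backend]) / parse_time(cycles_x[backend])
    for backend in baseline
}
for backend, factor in speedups.items():
    print(f"{backend}: {factor:.2f}x faster")
```

Under that reading, the latest Cycles X build is roughly 2.6x faster than the 3.0 build with CUDA and about 2.0x faster with OptiX on this scene.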


GPU results (chart):

I’ve put together a new scene for my tests and here are the results.

The scene. There is a whole city outside the frame with emissive materials, lights, etc. that reflect in the windows during animation.

3600X / 1660GTX (466.11) / 100 Samples / OIDN / 1920x1080

GPU only (time in seconds):

Cycles (2.92) / Tiles x240 / CUDA: 62:75
Cycles (2.92) / Tiles x240 / OptiX: 50:76

Cycles (2.93) (96abe5ebbc55) / Tiles x240 / CUDA: 61:96
Cycles (2.93) (96abe5ebbc55) / Tiles x240 / OptiX: 49:39

Cycles (3.0) (d08cc63e2f98) / Tiles x240 / CUDA: 61:96
Cycles (3.0) (d08cc63e2f98) / Tiles x240 / OptiX: 49:56

Cycles X (1dea1d93d39a) CUDA: 31:64
Cycles X (1dea1d93d39a) OptiX: 32:60

Cycles X (43789b764d9a) CUDA: 28:81
Cycles X (43789b764d9a) OptiX: 29:98

Cycles X (a176118b287f) CUDA: 24:31
Cycles X (a176118b287f) OptiX: 25:12

GPU results (chart):

Thanks for the tests @JohnDow. Somewhat curious that CUDA is faster than OptiX here, but not entirely surprising since there is no hardware ray-tracing and our own ray-tracing implementation has some optimizations that Embree and OptiX do not.
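
The CUDA versus OptiX flip can be read straight off the second scene's numbers. A minimal sketch, again assuming "61:96" means 61.96 seconds:

```python
# (CUDA seconds, OptiX seconds) from the second test scene, GPU only.
timings = {
    "Cycles (3.0) d08cc63e2f98": (61.96, 49.56),
    "Cycles X a176118b287f": (24.31, 25.12),
}

# Ratio below 1 means OptiX finished sooner than CUDA.
ratios = {build: optix / cuda for build, (cuda, optix) in timings.items()}

for build, ratio in ratios.items():
    faster = "OptiX" if ratio < 1 else "CUDA"
    print(f"{build}: OptiX/CUDA time ratio {ratio:.2f}, {faster} is faster")
```

In the 3.0 build OptiX takes about 80% of the CUDA time, while in Cycles X it takes about 3% longer than CUDA on this GPU.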

Cycles X still uses a single CPU core while GPU rendering, just as regular Cycles does today. Could this quirk be eliminated?

Hi Brecht, as I posted earlier, you’re right that OIDN in 2.93 is better than in 2.92, but NLM is still better at preserving details. OIDN was also slower than NLM. Are there any newer methods similar to NLM, as an alternative to AI solutions?

@Eary
No one cares about low-sample renderings. For real production and final rendering, preserving details is more desirable than a low-sample, blurred preview image. If I want an archviz rendering, I want a sharp, fine image where details like grass, leather, carpet or similar things are not blurred, which is often the case with AI solutions. OIDN 1.3 is better at preserving details than 1.2 but still has a tendency to blur details.

There are always new algorithms with pros and cons, but we’re not implementing a new denoiser as part of this project. And it makes little sense to try that when we haven’t yet looked for ways to improve detail preservation with OIDN.

Not likely without significant performance loss. Maybe for the multi-GPU case we can use a single core for all.