Multithreading: Where is Blender slow?

Cloth sim could be improved a lot because it’s completely single-threaded right now AFAIK.

4 Likes

I’m not sure why I thought this was a multi-threaded simulation. You are right, this is single-threaded.

3 Likes

Theoretically some part of it was multithreaded, @LucaRood should know better :slight_smile:

1 Like

Yes, some of it is multithreaded. The last thing I did in that regard was multithreading collision detection, along with other improvements and optimisations. More recently, other people (@mano-wii @angavrilov) have also made further optimisations.

There are still some parts that could be multithreaded, but the gains would be small at this point. Multithreading is not a magic solution that you just slap on top of inefficient code, suddenly making it super fast. I haven’t profiled the cloth code in a while, but off the top of my head, the slowest step right now is the linear solver itself, which can’t really be multithreaded. And the parts that can be multithreaded probably wouldn’t benefit from it, as they are already pretty quick.
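To put rough numbers on this, Amdahl's law gives an upper bound on the overall speedup when only a fraction of the run time can be parallelised. The sketch below is illustrative only: the 30% parallel fraction is a made-up assumption, not a profile of Blender's cloth code, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on overall speedup when only part of the work is parallel.

    parallel_fraction: share of the single-threaded run time that can be
                       spread across workers (0.0 to 1.0).
    workers:           number of threads/cores working on that share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part always costs its full time; only the rest shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed split: 70% of the time in a serial solver, 30% parallelisable.
    for cores in (4, 16, 64):
        print(cores, "cores ->", round(amdahl_speedup(0.3, cores), 3), "x")
```

With that assumed split, the bound can never exceed 1/0.7 (about 1.43x) however many cores are added, which is the "small gains" situation described above.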

Keep in mind that multithreading very small and quick operations can actually make the code slower, as the thread initialisation alone can take longer than the operation it’s executing.
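A minimal Python sketch of that overhead, assuming a deliberately tiny per-item operation. Blender itself is written in C and C++, so the absolute numbers will differ; the function names here (`tiny_op`, `run_inline`, `run_thread_per_item`) are hypothetical and only illustrate the general principle.

```python
import threading
import time


def tiny_op(values, out, i):
    # A very small unit of work: one multiplication.
    out[i] = values[i] * 2.0


def run_inline(values):
    """Do all the work on the calling thread."""
    out = [0.0] * len(values)
    for i in range(len(values)):
        tiny_op(values, out, i)
    return out


def run_thread_per_item(values):
    """Start one new thread per item (the naive 'just add threads' approach)."""
    out = [0.0] * len(values)
    threads = [
        threading.Thread(target=tiny_op, args=(values, out, i))
        for i in range(len(values))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


def timed(fn, values):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(values)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = [float(i) for i in range(200)]
    _, t_inline = timed(run_inline, data)
    _, t_threads = timed(run_thread_per_item, data)
    print(f"inline:          {t_inline * 1e6:.0f} microseconds")
    print(f"thread per item: {t_threads * 1e6:.0f} microseconds")
```

Both versions compute the same result; the threaded one additionally pays for creating, starting and joining a thread for every multiplication, so on typical machines it should come out noticeably slower.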

The cloth engine in Blender is reaching its limits. We’re just pushing diminishing returns at this point. What we need for faster and better simulations is better algorithms. Blender’s cloth implementation is almost 15 years old now, with a bunch of different stuff getting bolted on top over the years, and while some parts have been rewritten, it is fundamentally still the same as 15 years ago.

There have been plenty of developments in the research community over these 15 years, and there is plenty of good stuff to implement now. But I’m afraid that it would be a rather major endeavour. We really are running out of low hanging fruit here. Multithreading is not the solution.

14 Likes

From my point of view, smoke simulation has lost its productivity, especially when you compare Blender 2.8 with 2.9.
Would be very cool if this can be improved!
Thanks in advance for your feedback!

2 Likes

Would Freestyle be a good candidate for multi-threading? It is profoundly slow, single-threaded, and doesn’t seem to use GPU resources above 1-3% (I watched!) while the CPU goes from 15-100% and memory maxes out all too quickly.

1 Like

Hi there, thank you for your suggestion, do you have a workflow I can look at?

1 Like

I am far from being fluent in code matters, so I was more asking as a noob, wanting better NPR experience. I don’t have a clue how this would be approached. Sorry.

1 Like

Hi @paulhart2 I should be more clear, I just meant a workflow in Blender that uses Freestyle.

1 Like

Hi everyone, quick update. I’ve got a (small) code change here to improve the modifier workflow (which is the task I started on): https://developer.blender.org/D10609

I’ll keep looking at the modifier side to see what else can be done. Please keep the suggestions coming.

15 Likes

Congrats! It’s committed :slight_smile:

2 Likes

Blender multithreading modifiers.
Blender is really ahead of its competitors in terms of modeling, but as soon as the poly count reaches 1 million the software starts to slow down compared to other software on the same model. Working with modifiers such as Boolean, Decimate, Subdivision, etc. is far too slow, while the same operations are much faster elsewhere. Here is the secret: other software also uses temp files on the hard drive, while Blender only uses RAM for storing geometry; the others use all threads, while Blender uses just a single core; and they use GPU computation for geometry changes, which makes the mesh calculate much faster (in Blender, editing a very dense mesh in Edit Mode is terrible). These are the only reasons those companies call their own software a design standard compared to Blender, and call Blender hobby software. These changes shouldn’t take too much time to program.
(Also, mesh editing is not multithreaded.)

3 Likes

You might not be a programmer and may be confused about how modern CPUs work, but almost 100% of the extra processing power we have today is actually due to many many many MANY cores. Server CPUs today have 64 cores, not just 2 super fast cores. We know from experience that pushing clock frequencies very high does not work: the power consumption and heat output are astronomical, and it is much easier to parallelize. The fact is that many things are easy to do in parallel, it just needs a little organization. Staying single-threaded is absolutely not an option; a powerful desktop CPU today has at least 16 cores if not more, and everything single-core is very slow and unusable.
You greatly overstate the problem with overhead. Today all software is multi-threaded, except for a few cases where it physically can’t be done; there is just no alternative.
You might not be aware of how things are today. Windows is not transparent about how tasks are organized, so start up Linux, launch a browser, a few games and any tool you can think of, and be amazed at how many threads are running in parallel.

No, again, you don’t understand how this works. It is all a matter of how you organize everything.

The truth is, everybody IS doing it and it IS easy.
Your knowledge is highly outdated. Games have been parallelizing things for 15 years already, because most of the extra power we have gained over the last 18 years or so has come from more cores, not vastly faster cores.
I know, these (synthetic) meaningless benchmarks like Geekbench still show single-core performance, but that is only important on a small, very underpowered system like a cellphone that has very few CPU cores. Any good and powerful CPU today has at least 16 cores (32 threads with hyperthreading) that all work in parallel. I don’t know of a single AAA game that uses considerable CPU power and is not multi-threaded.
If code is single-threaded, porting it over to multi-threading is a little bit of an effort, but that transition has been under way for over 10 years already. GPUs are another example of massive parallelism, where you use thousands of tiny little cores, split up the workload between them, and they crunch numbers multiple times faster than any CPU can. Check out Blender Benchmark and see how fast an Nvidia RTX 3090 is compared to the fastest CPUs, then look at the price (the normal price, not the inflated Covid 19 price due to the chip shortage), and you will notice that you would spend pretty much 5 times as much for a comparable CPU.
And even the fastest 64 core Threadripper Server CPU can’t reach the performance of an RTX 3090, it just does not have enough cores.

Companies are another example of parallelism: you don’t have really powerful companies where just 4 geniuses do the work of thousands of people. Small companies are very weak and can’t best very large companies that employ many thousands of people. Work has to be split up between these armies of workers and yes, there is overhead, but it is negligible. Your mindset is very old and outdated; it feels more at home at the beginning of this millennium…

But not doing multithreading is not a solution either. Have there been new algorithms that can actually be parallelized, or are we going to be stuck with slow simulations forever, without a solution? I do understand that some simulation things physically can’t be parallelized, but isn’t there a solution?

@MarkusBawidamann I think you’re highly optimistic about what can be multithreaded and what cannot, and while “everyone is doing it”, it is NOT an easy problem.

See it like this: if you’re painting a house and your painter says, “well, it’s a big house, it’ll take me 6 days to do it”, you can go, well, let’s hire 5 more guys and get this thing done in 1 day! Not a problem.

Now if you hire the same painter to paint your small bathroom and he goes “should take me an hour”, hiring 5 extra guys does not get it done in 10 minutes; there is just not enough room for all the painters to be in there at once.

The current cloth engine is a bathroom, not a house. Can we redesign the bathroom so it would easily fit 6 people? Quite likely but that is not the bathroom we have today.

What @LucaRood is suggesting is: demolish the bathroom, build something better but that’ll take quite a bit of time.

What you are suggesting is: just push the 5 extra painters in there, it’s easy, everyone is doing it.

and that just… doesn’t work…

12 Likes

Your example does not work: we are talking about a high-performance workload. This bathroom of yours is a huge commercial bathroom with 30 stalls (it is Blender 3D, a professional production-grade tool, not a children’s toy app), and then you again need lots of painters. Or let’s just say: when it takes a long time, then you MUST do it in parallel, or it takes a looong time. If it is quick, then no, you don’t need it, but then you don’t have a problem at all, do you? If you just simulate a little towel, then it is not too slow; the problem comes when you have lots of cloth objects and complex clothing with self-collision etc…

The problem that we are having is that the current situation can be so slow that it is unusable. What other alternative solution do you propose that is not using all these cores?

That is actually the right way to do it: Once a system is obsolete and you run into a bottleneck that does not let it scale anymore, you need to redesign it to fix it.

I don’t know what you are talking about, we do this in all applications today. That’s why a 64 core Threadripper finishes a render task in Blender much faster than your 4 core. If you don’t know what you are talking about and have little experience in IT, you might not want to talk so loudly and confidently.
There are only rare examples where parallelization can’t be done.

Think we have a slight misunderstanding here. I am not saying cloth simulations cannot be multi-threaded ever… I’m saying you can’t [easily] put threading in the current cloth sim we have.

We do seem to agree though: demolish the current sim, replace it with a state-of-the-art multi-threaded one.

5 Likes

Yeah, it being 15-year-old code :slight_smile:

I come late to this thread; I somewhat ignored it for whatever reason. But the fact is that before talking down to LazyDodo like that, and assuming authority because you have coded using multithreading, I’d take a look at his profile, role and work at the Blender Foundation.

You’ll soon notice he indeed knows what he is talking about, and then some.

Everything can be parallelized, no one is saying the opposite. But some tasks need a complete overhaul for the parallelization to be fully effective. Even Maya or Max aren’t completely parallelized.

And even so, some tasks really gain no benefit from parallelization. About games: the main thread of these is still a single thread as of today. What’s parallelized is the creation of the render passes, which is usually done on the CPU using threads; that’s actually a perfect task to parallelize. A few big tasks.

Blender is faster on a Threadripper than on a 4 core CPU, of course; tons of it is parallelized. But some tasks and sims aren’t as of today, and people assume it is because of laziness or incompetence. That’s wrong.

Don’t assume people are ignorant because you disagree with them. That’s the wrong approach to discussions, especially in a forum like this where most people look at the code more often than not.

1 Like