Cycles AMD HIP device feedback

No, it has not been changed.

Recently found these issues with Vega 64:

• Using CPU + GPU takes up to 5x longer to render (compared to GPU only, i.e. the Vega 64).

• Enabling the viewport denoiser makes the viewport very laggy and rendering very slow on both GPU and CPU.

Currently using Windows 10 with the latest drivers (22 Q4)
CPU - Ryzen 7 2700
GPU - RX Vega 64

Didn’t submit any reports because I don’t know if it is my system or actually a bug. But I have tested it with 22 Q3 and 22 Q4, and both give the same results.

Also didn’t include any file because you can test it by just dragging in an asset with some image textures on it.

Added some tests (on Blender 3.3.1 LTS):

With CPU+GPU it takes around 00:04:36 to render

With GPU only it takes around 00:01:33
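
For reference, a quick sketch of the ratio those two timings give. The helper name `to_seconds` is made up for illustration; only the two timings come from the test above.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' render time into whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


cpu_gpu = to_seconds("00:04:36")   # CPU + GPU render time from the test
gpu_only = to_seconds("00:01:33")  # GPU-only render time from the test

# How many times slower CPU + GPU is than GPU only.
slowdown = cpu_gpu / gpu_only
print(f"CPU+GPU: {cpu_gpu}s, GPU only: {gpu_only}s, slowdown: {slowdown:.2f}x")
```

For this particular test that works out to roughly 3x slower, so the 5x case presumably depends on the scene.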

Kindly have a look into it @brecht @bsavery

There are some known performance issues with CPU + GPU rendering. I would not expect 5x slower, that would be worth investigating.

Viewport denoising is relatively slow on AMD GPUs because it’s using OpenImageDenoise on the CPU. GPU-accelerated denoising is currently only available with OptiX on NVIDIA GPUs. GPU acceleration for OpenImageDenoise is being worked on by Intel.

CPU and GPU can’t mix rendering, they can only render separate blocks, and the CPU has higher priority than the GPU, so the CPU+GPU option is effectively CPU rendering.

Since it doesn’t look like HIP will ever be possible on older cards like Hawaii GPUs and such, any chance that the AMD-flavoured dev going into HIP might spawn a spin-off series with Vulkan?

I posted a while back on bsavery’s GitHub repo for rprblender that whatever was done to RPR made it render better and faster on my R9 390 than anything prior, including Cycles OpenCL. Is it possible that some of that magic can be implemented alongside HIP as an alternative so Cycles can be used again? Or is the sauce just too different despite AMD code being involved with HIP?

It is possible for a developer to go in and add a Vulkan rendering backend for Cycles; however, I believe this is unlikely to happen or be accepted into the Cycles code base. See Brecht’s response to my question about implementing Vulkan here:

I realize this is sacrilege in the intel HIP thread but:

At BCon22 I saw a presentation by an Intel guy. The version of Blender on Intel’s oneAPI he showed off could run on Intel Arc, NVIDIA (via a oneAPI CUDA backend) or AMD (via a oneAPI OpenCL backend).

I’m not sure if all those backends are publicly available yet. I’m also not sure if it buys you anything on older hardware, because OpenCL support alone is not enough: the GPU needs enough registers, local memory, etc.

But still it might be interesting to watch.

/ducks and runs

Not likely.

So the way that was working is piping the C++ through the HIP compiler or CUDA compiler in the case of NV, and translating the relevant library calls. It wouldn’t really give any advantage or hardware compatibility fixes that weren’t already in HIP.

Where it is interesting is on the developer side where you could have one unified backend for CUDA/HIP/OneAPI, but I’m not sure it would have a direct effect for users.

As far as I understood it, oneAPI had an OpenCL backend. OpenCL predates HIP, no? I used OpenCL on AMD cards before I had ever heard of HIP, but maybe I’m just uninformed.

But probably those older cards don’t have the needed hardware capabilities anyway.

I don’t know enough about oneAPI to comment on whether there’s an OpenCL backend, but there was no OpenCL involved in the demo they were showing at BCon. At least not on the AMD device.

Maybe the OpenCL backend was used to run on CPU? I don’t remember exactly.
I was referring to this slide: oneAPI backend: Cycles on Intel GPUs - YouTube

edit: now that I look at it again, I see that it was indeed used to run on CPU and older Intel GPUs. Still, older AMD cards do support OpenCL, so it might be interesting for people having that hardware…

But this is getting off-topic, so I’ll shut up about it :wink:

Before I go down the rabbit hole of trying to build around my graphics card, is there a hard, insurmountable obstacle I’m not aware of? It’s GFX702, so… ROCm support died officially in 2.0, I believe, but IIRC Hawaii GPUs still kind of worked until 3.5 with patches. I need to re-read whatever my brain kept about the CUDA warp being 32 and the AMD wavefront being 64, and the workaround for that, I think. warpSize or something.
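
A rough sketch of why the 32 vs 64 difference matters, written in plain Python rather than HIP so it runs anywhere. The function names are made up for illustration; in actual HIP device code the built-in `warpSize` is what reports the width, and this only models the index and mask arithmetic, not real GPU behaviour.

```python
CUDA_WARP = 32        # lanes per warp on NVIDIA GPUs
GCN_WAVEFRONT = 64    # lanes per wavefront on GCN GPUs such as Hawaii


def lane_and_group(thread_idx, width):
    """Split a 1-D thread index into (lane within group, group index)."""
    return thread_idx % width, thread_idx // width


def full_lane_mask(width):
    """Bit mask with one bit set per lane in a warp/wavefront."""
    return (1 << width) - 1


# Thread 40 sits in the second warp on a 32-wide device,
# but still in the first wavefront on a 64-wide one.
print(lane_and_group(40, CUDA_WARP))
print(lane_and_group(40, GCN_WAVEFRONT))

# A mask hard-coded for 32 lanes covers only half of a 64-lane wavefront.
hardcoded = 0xFFFFFFFF
covered = bin(hardcoded & full_lane_mask(GCN_WAVEFRONT)).count("1")
print(covered, "of", GCN_WAVEFRONT, "lanes covered")
```

So any kernel code that bakes in 32 (lane indices, shuffle widths, ballot masks) needs to take the width as a parameter instead.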

So make a brew with a version of llvm, hip clang, cuda and shazbot that magically works, then build Blender and Cycles from scratch re-enabling the GFX7xx’s. If I can’t find whatever magical sauce people used to get Hawaiis working in 3.5.0 and can’t use hipify tools and such then that means I need to use the deprecated HCC, yeah? This will be fun for a highschool dropout. I didn’t see anything in the llvm docs in the target triples there that specifically said this won’t work. Prorender works like a slap of hot damn these days but I feel like I need to back my 390 like it’s a kid being bullied out of a little league baseball game.

Might be easier to break into AMD headquarters and thief me up a RDNA card holding up a wobbly table somewhere.

If I pull this off, do you think AMD would sponsor my hardware for the next three years of my interactive media design diploma out of pity? Perhaps disgust? Let this abomination begin, however it turns out, because of AMD’s documentation standards.

From the AMD GPU GCN3 ISA blurb.
“AMD GCN3 ISA Architecture document describes the environment, organization, and program state of AMD GCN Generation 3 devices which includes Radeon R9 family of devices. It details the instruction set and the microcode formats native to this family of processors that are accessible to programmers and compilers.”

That’s not confusing at all, what with half of R9 being GCN2. Cycles requires a minimum of CUDA compute capability 3.0, so… as long as I meet the equivalent of compute 3.0 in HIP code it should work, sort of, right?

Wish me luck.

Quick Feedback:

  • Not having access to the HIP libraries to compile with them openly is… making it really hard to troubleshoot or make custom builds with it.

When will these libraries be placed in the lib dependencies? Is there another open access point to these libs?

Edit: Dumb.

If there’s no HIP compiler for Windows, afaik, can we use hiprtc? I was dicking around with ProRender and realized I’ve got amd_comgr.dll, which is all hiprtc needs, independent of actually having the HIP SDK?

From github on hiprtc.dll
" * This library can be used on systems without HIP install nor AMD GPU driver installed at all (offline compilation). Therefore it does not depend on any HIP runtime library"
" * But it does depend on COMGr. We may try to statically link COMGr into hipRTC to avoid any ambiguity."

So can a person on Windows use these tools to get from point A to B? Using hiprtc to make/try to make a fatbin for their unsupported architecture then try to walk backwards to make it compile and run properly?

First of all, congratulations on the 6800 XT speedup in 3.4, down to 30s (mine is actually overclocked to 2560). In addition, a friend who reviewed the 7900 XT told me it only completes the same render in 25s. 84/72 ≈ 1.17, so is the doubled FP32 performance not being used? :smiling_face_with_tear:
Of course I don’t mean this in a derogatory way; it’s just that with this level of improvement it is hard for me to consider upgrading. I only ended up with the 6800 XT because, early on, the bitcoin boom made GPUs impossible to buy, so I tried to snap up both it and the 3080 and took what I could get. From what I hear the 7900 XT is very good for games, though I’m not sure of the specific performance; my friend also signed a non-disclosure agreement, so he can’t tell me any details beyond Cycles.
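
To put those numbers side by side, here is a small sketch. It assumes the 84 and 72 in the post are the compute-unit counts of the 7900 XT and 6800 XT; the variable names are only for illustration.

```python
time_6800xt = 30.0   # seconds, overclocked 6800 XT (from the post)
time_7900xt = 25.0   # seconds, 7900 XT as reported by the reviewer
units_6800xt = 72    # assumed: compute units on the 6800 XT
units_7900xt = 84    # assumed: compute units on the 7900 XT

observed_speedup = time_6800xt / time_7900xt
unit_ratio = units_7900xt / units_6800xt

# Speedup left over once the extra units are accounted for.
per_unit_gain = observed_speedup / unit_ratio

print(f"observed speedup: {observed_speedup:.2f}x")
print(f"unit ratio:       {unit_ratio:.2f}x")
print(f"per-unit gain:    {per_unit_gain:.2f}x")
```

In other words, the reported result is almost entirely explained by the extra units, with very little gain per unit.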

Blender can only support and optimize for released hardware. I suppose the 7900 series will get a boost after AMD releases the corresponding drivers. And when hardware ray tracing is enabled in the future, there will be another huge boost, I guess.

Since the 7000 series raises the ROP count to 192 (192/128 = 1.5), the doubled FP32 should increase the actual computing speed by at least 50 percent to make sense. With 1.5x the execution efficiency of RDNA2, the render should take 30/((84/72)*1.5) ≈ 17s.
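
The estimate above can be checked with a few lines. This is only a sketch of the poster's arithmetic: the 1.5x efficiency factor is the assumption from the post, not a measured value, and `projected_seconds` is a made-up helper name.

```python
def projected_seconds(baseline_seconds, unit_ratio, efficiency):
    """Scale a baseline render time by a unit-count ratio and a per-unit efficiency factor."""
    return baseline_seconds / (unit_ratio * efficiency)


baseline = 30.0            # 6800 XT render time in seconds (from the thread)
unit_ratio = 84 / 72       # ratio used in the post
assumed_efficiency = 1.5   # assumption: 1.5x RDNA2 execution efficiency

estimate = projected_seconds(baseline, unit_ratio, assumed_efficiency)
print(f"projected 7900 XT time: {estimate:.1f}s")
```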

Maybe it is restricted by the driver? An old driver would never know how much power future hardware has.

Just caught a 7900 XT benchmark on opendata, and I don’t know what to think about it. I know it’s only one bench, but 1/3rd of the 4080 performance? I am trying to wrap my head around this: less performance than a 3070?
So… with the XTX having a slight bump in hardware spec, it’s still going to be outpaced by leagues? Even if it hits 4,000, it’s still under 1/2 the performance of the 4080, a card nobody seems to care for because of its bad price to performance ratio.

I’d better move my attention from the equipment to the technology, and when the technology arrives to focus on better equipment. :sob: :sob: :sob: :sob:
