Cycles AMD HIP device feedback

gfx1035 is listed as an APU in the “User Guide for AMDGPU Backend” (LLVM 15.0.0git documentation), and that is the list we are going by, since it should correspond to how the LLVM compiler in the HIP SDK treats it. The bug report also mentions “Rembrandt RDNA2 IGP”, where IGP means integrated graphics processor.

Also, it’s not obvious to me that switchable graphics means that one of the two GPUs is necessarily dedicated. Maybe, but I’m not sure.

Maybe @bsavery can clarify.


Actually @brecht, gfx1035 can be enabled, I think. We have tested it internally, and the only issue is that the older HIP SDK would throw an unknown-architecture warning. The one installed on the build bots now should be fine with it.

gfx1035 has now been enabled in 3.2 and 3.3; tomorrow’s build should have it.


Hello dear Blender devs, I noticed that in the task T91571 (Cycles HIP device), the “Linux support and stability” part does not list the Polaris architecture. However, the Polaris gfx8** architecture does support HIP, at least on Linux, as shown by the hipInfo output from my personal machine:
/opt/rocm/hip/bin/hipInfo


device# 0
Name: AMD Radeon RX 580 Series
pciBusID: 1
pciDeviceID: 0
pciDomainID: 0
multiProcessorCount: 36
maxThreadsPerMultiProcessor: 2560
isMultiGpuBoard: 0
clockRate: 1340 Mhz
memoryClockRate: 2000 Mhz
memoryBusWidth: 256
totalGlobalMem: 8.00 GB
totalConstMem: 8589934592
sharedMemPerBlock: 64.00 KB
canMapHostMemory: 1
regsPerBlock: 65536
warpSize: 64
l2CacheSize: 0
computeMode: 0
maxThreadsPerBlock: 1024
maxThreadsDim.x: 1024
maxThreadsDim.y: 1024
maxThreadsDim.z: 1024
maxGridSize.x: 2147483647
maxGridSize.y: 2147483647
maxGridSize.z: 2147483647
major: 8
minor: 0
concurrentKernels: 1
cooperativeLaunch: 0
cooperativeMultiDeviceLaunch: 0
isIntegrated: 0
maxTexture1D: 16384
maxTexture2D.width: 16384
maxTexture2D.height: 16384
maxTexture3D.width: 16384
maxTexture3D.height: 16384
maxTexture3D.depth: 8192
isLargeBar: 0
asicRevision: 1
maxSharedMemoryPerMultiProcessor: 64.00 KB
clockInstructionRate: 1000.00 Mhz
arch.hasGlobalInt32Atomics: 1
arch.hasGlobalFloatAtomicExch: 1
arch.hasSharedInt32Atomics: 1
arch.hasSharedFloatAtomicExch: 1
arch.hasFloatAtomicAdd: 1
arch.hasGlobalInt64Atomics: 1
arch.hasSharedInt64Atomics: 1
arch.hasDoubles: 1
arch.hasWarpVote: 1
arch.hasWarpBallot: 1
arch.hasWarpShuffle: 1
arch.hasFunnelShift: 0
arch.hasThreadFenceSystem: 1
arch.hasSyncThreadsExt: 0
arch.hasSurfaceFuncs: 0
arch.has3dGrid: 1
arch.hasDynamicParallelism: 0
gcnArchName: gfx803
peers:
non-peers: device#0

memInfo.total: 8.00 GB
memInfo.free: 8.00 GB (100%)

Edit: I know that you might not support it at this time, but please add it as a task so that someone might work on it.
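For anyone who wants to pull the relevant fields out of a dump like the one above, here is a minimal sketch. It assumes the output keeps the `key: value` layout shown in this post and that each device section starts with a `device#` line; the function name `parse_hipinfo` and the shortened sample are made up for illustration.

```python
def parse_hipinfo(text):
    """Parse `key: value` lines of hipInfo-style output into one dict per device."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: every device section starts with a line like "device# 0".
        if line.startswith("device#"):
            current = {}
            devices.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return devices


# Shortened sample taken from the output quoted in this post.
sample = """\
device# 0
Name: AMD Radeon RX 580 Series
isIntegrated: 0
gcnArchName: gfx803
"""

for dev in parse_hipinfo(sample):
    kind = "integrated" if dev.get("isIntegrated") == "1" else "discrete"
    print(dev["Name"], dev["gcnArchName"], kind)
```

The `isIntegrated` and `gcnArchName` fields are the ones that matter for the APU and architecture questions discussed earlier in the thread.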


I only added Vega because that’s planned to be worked on. Polaris may or may not happen, but I don’t want to give the impression that it is on the roadmap. Those cards are no longer listed as supported GPUs in the ROCm docs, so even if we do get it working at some point, that may be difficult to maintain.


So after removing ROCm 5.1.3 I’ve finally been able to compile Blender 3.3 alpha from git, obviously with no HIP or Vega support. Something is still broken in ROCm 5.1.3 (or in Blender itself), so the latest Blender fails at generating HIP kernels during compilation, at least on Arch with the AUR version.

Have you tried with the opencl-amd package? Maybe there are differences between the packages that will help you compile it.

I did that and blender-git now builds, skipping the fatbins, with Vega enabled. But it still doesn’t render, failing with the error “hip hipcc compiler not found, install hip toolkit in default location”. I don’t know how to address that.
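A quick way to check whether hipcc can be found at all is sketched below. The candidate paths are only the ones quoted in this thread, and where Cycles itself looks for hipcc is not confirmed here, so treat the list as an assumption; the helper names `first_executable` and `find_hipcc` are hypothetical.

```python
import os
import shutil

# Assumption: candidate locations copied from paths quoted in this thread.
# Where Cycles actually searches for hipcc is not confirmed here.
CANDIDATES = [
    "/opt/rocm/hip/bin/hipcc",
    "/opt/rocm-5.1.3/hip/bin/hipcc",
]


def first_executable(paths):
    """Return the first path that is an existing executable file, else None."""
    for path in paths:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def find_hipcc(candidates=CANDIDATES):
    """Prefer hipcc from PATH, then fall back to the candidate list."""
    return shutil.which("hipcc") or first_executable(candidates)


print(find_hipcc() or "hipcc not found")
```

If this prints "hipcc not found", the compiler is missing from both PATH and the listed directories, which would be consistent with the error message above.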

I’m not on my PC at the moment to check but it’s possible that you are missing files included in the opencl-amd-dev package. (It’s a big package about 9-10GB)

I’ve installed opencl-amd-dev, and now it fails to build the gfx900 kernel, and the build later fails to complete.

[360/4615] Generating kernel_gfx900.fatbin
FAILED: intern/cycles/kernel/kernel_gfx900.fatbin /home/user/12345blnd/src/build/intern/cycles/kernel/kernel_gfx900.fatbin 
cd /home/user/12345blnd/src/build/intern/cycles/kernel && /opt/rocm-5.1.3/hip/bin/hipcc --amdgpu-target=gfx900 --genco /home/user/12345blnd/src/blender/intern/cycles/kernel/device/hip/kernel.cpp -D CCL_NAMESPACE_BEGIN= -D CCL_NAMESPACE_END= -D HIPCC -I /home/user/12345blnd/src/blender/intern/cycles/kernel/.. -I /home/pink/12345blnd/src/blender/intern/cycles/kernel/device/hip -Wno-parentheses-equality -Wno-unused-value --hipcc-func-supp -ffast-math -o /home/user/12345blnd/src/build/intern/cycles/kernel/kernel_gfx900.fatbin
error: unhandled SGPR spill to memory
error: unhandled SGPR spill to memory
2 errors generated when compiling for gfx900.

To be clear, getting Vega to work is not a matter of build configuration or applying existing patches; there are still other issues to be fixed. Otherwise we would have enabled it already.

What about ROCm 5.1.3 breaking HIP kernel compilation for all AMD GPUs with blender-git 3.3 alpha?

Since we’re already at Bcon4, just a few days away from the scheduled release, and with no update from Sayak, looks like RDNA support is gonna slip again?

Edit: Or are we actually running into a bug in HIP itself? AMD seems to be preparing another release; I see an activity spike on the repos, which usually happens right before an update.

What makes you think RDNA isn’t supported by Cycles HIP? Both RDNA and RDNA2 GPUs are officially enabled in Cycles HIP right now.

Yeah, but it’s still broken on Linux: T97591 Cycles HIP error with image textures on Linux and RDNA1

It’s a driver side issue. So I don’t think any changes on the Blender side right now will help.

Not only that but AMD confirmed that RDNA will not be supported for ROCm so don’t get your hopes up. I enjoyed having this GPU in these crazy times but I’ve completely abandoned the idea of doing anything else other than gaming with it.


Some scenes do work just fine though, which means there’s clearly a workaround at least. But I don’t understand why they don’t cause Blender to crash. Junk Shop renders just fine, and looking at the log, I see many of the same hipTexObjectCreate calls that normally cause immediate segfaults, except they execute without issue in this particular scene.

@Luciddream, I work for AMD :wink: The driver team currently supports only RDNA2 for Blender with HIP. However, we’re doing what we can to make sure RDNA is enabled in Blender.

@wsippel I’m not sure what makes you say “clearly there is a workaround”. It’s just that some scenes don’t cause the issue. It does seem to be in the hipTexObjectCreate call; the driver team is looking at why it’s only on the RDNA cards.


Well, I dug around some more to figure out why Junk Shop works. There appears to be a workaround, it’s just not a very convenient workaround: As far as I can tell, what’s causing the crash is textures with an odd horizontal resolution (odd vertical resolution is fine though). I made a bunch of test textures in different resolutions. 2048x2048, 4096x4096, 2048x4096, 768x4096 and 1024x1023 all worked, 2047x2048 crashed.
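The pattern in those test results can be written down as a tiny check. This is only a sketch of the observation above (odd width crashes, odd height is fine), not a confirmed description of the driver bug; `has_odd_width` is a made-up helper name.

```python
def has_odd_width(width):
    """True if the horizontal resolution is odd, the case that crashed in my tests."""
    return width % 2 == 1


# (width, height) pairs from the test textures listed above.
tested = [
    (2048, 2048),
    (4096, 4096),
    (2048, 4096),
    (768, 4096),
    (1024, 1023),
    (2047, 2048),
]

for width, height in tested:
    status = "suspect (odd width)" if has_odd_width(width) else "ok"
    print(f"{width}x{height}: {status}")
```

Only 2047x2048 gets flagged, which matches the one texture that crashed.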
