Issues Building on MacOS Big Sur (11.4)

Discussion related to the LuxCore functionality, implementations and API.
danielbui78
Posts: 17
Joined: Wed Jun 23, 2021 10:47 pm

Re: Issues Building on MacOS Big Sur (11.4)

Post by danielbui78 »

u3dreal wrote: Mon Nov 15, 2021 2:29 pm
danielbui78 wrote: Mon Nov 15, 2021 12:26 pm FYI, I think Big Sur 11.6 did break some OpenCL support (at least for HD4000) with LuxCore 2.5 officially distributed binaries. LuxCore 2.6 still works.
I'm still baffled by how randomly it works or doesn't. Every update is Russian roulette.
Update: I figured out my OpenCL issues with 11.6: If something triggers a kernel recompilation, LuxCore will fail unless the kernel cache is manually deleted. Maybe this problem was present for me in 11.4 as well but I never noticed it because I never made configuration changes requiring kernel recompilation.

In other words: in order for me to get kernel recompilation to work in 11.6, I must go into the kernel cache folder for the appropriate device (~/luxcorerener.org/ocl_kernel_cache/....) and do `rm *.ocl`. NOTE: this only relates to runtime OpenCL error messages during kernel compilation. I still have device-specific system crashes due to a runaway kernel compilation process eating up 80+ GB of RAM.

Just to clarify, my previous statement was wrong: LuxCore 2.5 + Big Sur 11.6 + HD4000 is working, at least for imagepipeline OCL -- as long as you manually delete *.ocl files.
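The manual cleanup described above can be scripted. The following is a minimal Python sketch, not part of LuxCore: `clear_kernel_cache` is a hypothetical helper, and the cache directory has to be supplied by the caller because the full path is device-specific. It defaults to a dry run so the matched files can be reviewed before anything is deleted.

```python
from pathlib import Path


def clear_kernel_cache(cache_dir, dry_run=True):
    """Find cached *.ocl kernel binaries under cache_dir and optionally delete them.

    cache_dir is an assumption supplied by the caller (for example, the
    per-device folder inside the kernel cache). Returns the matched paths.
    """
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return []
    # Search recursively so that a vendor/device folder layout is covered.
    matched = sorted(root.rglob("*.ocl"))
    if not dry_run:
        for path in matched:
            path.unlink()
    return matched
```

Calling it with `dry_run=False` on a single device folder is the equivalent of running `rm *.ocl` there; calling it on the parent cache folder clears every device.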
Last edited by danielbui78 on Wed Nov 17, 2021 12:43 pm, edited 1 time in total.
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: Issues Building on MacOS Big Sur (11.4)

Post by Dade »

danielbui78 wrote: Wed Nov 17, 2021 12:29 pm In other words: in order for me to get kernel recompilation to work in 11.6, I must go into the kernel cache folder for the appropriate device (~/luxcorerener.org/ocl_kernel_cache/....) and do `rm *.ocl`
It sounds like the (driver) bug is related to returning (or reading back) the binaries of compiled kernels.
Support LuxCoreRender project with salts and bounties
danielbui78
Posts: 17
Joined: Wed Jun 23, 2021 10:47 pm

Re: Issues Building on MacOS Big Sur (11.4)

Post by danielbui78 »

Dade wrote: Wed Nov 17, 2021 12:38 pm
danielbui78 wrote: Wed Nov 17, 2021 12:29 pm In other words: in order for me to get kernel recompilation to work in 11.6, I must go into the kernel cache folder for the appropriate device (~/luxcorerener.org/ocl_kernel_cache/....) and do `rm *.ocl`
It sounds like the (driver) bug is related to returning (or reading back) the binaries of compiled kernels.
Thanks for the tip. I'm looking at the kernel cache's filepath/hash algorithm: cached kernels are segregated into folders by vendor and device name, then a filename hash is created from the compiler options + kernel source code. However, I do not see the work-group size being included in the compiler options list that is hashed. The parameter I am changing in the cfg file is opencl.gpu.workgroup.size, and changing it is what causes the OpenCL error that requires me to delete the *.ocl files.

Does workgroup size need to be added to the compiler options hash? Also, if the device vendor changes the default workgroup size in a driver update, do we need to catch this and force kernel recompilation?
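To make the cache layout described above concrete, here is a minimal Python sketch. It is an illustration only and not LuxCore's actual code: `kernel_cache_path` is a hypothetical name, and SHA-256 is a stand-in for whatever hash the real implementation uses. What it shows is that the file name depends only on the compiler options and the kernel source, while vendor and device name select the folder.

```python
import hashlib
from pathlib import Path


def kernel_cache_path(cache_root, vendor, device, compiler_opts, kernel_source):
    """Build a cache file path from vendor, device, options and source.

    Hypothetical illustration: SHA-256 stands in for the real hash function.
    """
    digest = hashlib.sha256()
    digest.update(compiler_opts.encode("utf-8"))
    digest.update(b"\0")  # separator so options and source cannot blur together
    digest.update(kernel_source.encode("utf-8"))
    # Vendor and device name pick the folder; the hash picks the file.
    return Path(cache_root) / vendor / device / (digest.hexdigest() + ".ocl")
```

Under this scheme a run-time setting such as the work-group size is not an input, so changing it maps to the same cached file, which matches the observation that the old *.ocl file is picked up again after the change.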
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: Issues Building on MacOS Big Sur (11.4)

Post by Dade »

danielbui78 wrote: Wed Nov 17, 2021 1:29 pm Does workgroup size need to be added to the compiler options hash? Also, if the device vendor changes the default workgroup size in a driver update, do we need to catch this and force kernel recompilation?
As far as I know it is not a kernel compiler option so it shouldn't be included in the hash.

Group size is pretty much a hardware characteristic: it has always been 32 for NVIDIA GPUs and was 64 for old AMD GPUs, but it is now 32 for AMD too.

What value are you using? (And why are you changing it? On what GPU?)
danielbui78
Posts: 17
Joined: Wed Jun 23, 2021 10:47 pm

Re: Issues Building on MacOS Big Sur (11.4)

Post by danielbui78 »

Dade wrote: Wed Nov 17, 2021 2:29 pm
danielbui78 wrote: Wed Nov 17, 2021 1:29 pm Does workgroup size need to be added to the compiler options hash? Also, if the device vendor changes the default workgroup size in a driver update, do we need to catch this and force kernel recompilation?
As far as I know it is not a kernel compiler option so it shouldn't be included in the hash.

Group size is pretty much a hardware characteristic: it has always been 32 for NVIDIA GPUs and was 64 for old AMD GPUs, but it is now 32 for AMD too.

What value are you using? (And why are you changing it? On what GPU?)
I would agree that MAXIMUM work-group size is a hardware characteristic, but the actual work-group size can vary. I am using the Intel OpenCL CPU driver as well as the Intel HD4000 OpenCL driver, which is used by default on my MacBook Air for imagepipeline OpenCL acceleration. Previously, I had to manually set opencl.cpu.workgroup = 32 for the Intel OpenCL CPU driver to work. This now appears to cause an OpenCL error with Big Sur (at least for 11.6): CL_INVALID_WORK_GROUP_SIZE. After trial and error, it appears the only value that now works is opencl.cpu.workgroup = 1 (at least for 11.6).

Both while working out this value and when simply upgrading from Big Sur 11.4 to 11.6, it was necessary to remove the previous *.ocl files to avoid CL_BUILD_PROGRAM_FAILURE or CL_INVALID_VALUE errors after the call to clBuildProgram() on line 373 of ocl.cpp, and its corresponding CHECK_OCL_ERROR() call on line 374.
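For context on CL_INVALID_WORK_GROUP_SIZE: in OpenCL 1.x, an explicitly requested local work size is rejected when it exceeds the limit the driver reports for the kernel or device, or when it does not evenly divide the global work size. The Python sketch below models only those two checks in one dimension. Both function names are hypothetical, and whether either condition is what actually changed on Big Sur 11.6 is an assumption, not something confirmed in this thread.

```python
def local_size_is_valid(global_size, local_size, max_work_group_size):
    """Model two OpenCL 1.x conditions for accepting an explicit local size."""
    # The local size must be positive and within the limit the driver reports.
    if local_size < 1 or local_size > max_work_group_size:
        return False
    # The global size must be an exact multiple of the local size.
    return global_size % local_size == 0


def largest_valid_local_size(global_size, preferred, max_work_group_size):
    """Return the largest size <= preferred that passes the check above."""
    for size in range(min(preferred, max_work_group_size), 0, -1):
        if local_size_is_valid(global_size, size, max_work_group_size):
            return size
    return 1  # fallback when no candidate exists (e.g. a reported limit of 0)
```

If a driver reports a limit of 1 for a kernel, 32 fails this check and 1 is the only value that passes, which would be consistent with the trial-and-error result above.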
u3dreal
Developer
Posts: 560
Joined: Tue Dec 03, 2019 3:23 pm
Location: Ulm

Re: Issues Building on MacOS Big Sur (11.4)

Post by u3dreal »

MCurto just pointed out that the CL_INVALID_VALUE comes from an old kernel cache ... after deleting it, things work now on 11.6.1.
At least for the image pipeline.

I also get the same CL_BUILD_PROGRAM_FAILURE here, but deleting the cache does not help and brings up the same error after 20 min.
Could setting

opencl.cpu.workgroup = 1

help here too?

Or did you mean

opencl.gpu.workgroup = 1

CPU or GPU?
I have a GT750m and an Iris Pro. The Iris has not worked for a year, and the 750m stopped working some versions ago.

Interesting that you got the Intel driver to work.
Quoting Dade:

We don't touch it ... not even with a 2 meter long stick :)
check out my newest stuff http://q3de.com/research/
portfolio http://q3de.com/


MB Pro i7 2.3Ghz, IrisPro 1.5GB, GTX750m 2GB - BigSur
Xeon X5650@4Ghz, RX 5700 - BigSur , Windows 10, Ubuntu 20.04
danielbui78
Posts: 17
Joined: Wed Jun 23, 2021 10:47 pm

Re: Issues Building on MacOS Big Sur (11.4)

Post by danielbui78 »

u3dreal wrote: Wed Nov 17, 2021 6:27 pm Interesting you got the intel driver to work..
Sorry for the confusion: I only got the Intel OpenCL CPU driver fully working for general rendering, not the OpenCL GPU (HD4000) driver. After I do the workaround with `rm *.ocl`, the Intel OpenCL GPU driver will start compiling the kernel for general rendering, but will then use 40+ GB of RAM (my MacBook Air only has 8 GB of physical RAM), run out of swap space and effectively crash macOS. This also happens for me on AMD RX 580 + LuxCore 2.6 (system crashing after 80+ GB of RAM usage on my desktop with 32 GB of RAM), but AMD RX 580 + LuxCore 2.5 is able to finish kernel compiling and work, although with artifacts.

However, HD4000 (OpenCL GPU) is working for the imagepipeline operations.
u3dreal
Developer
Posts: 560
Joined: Tue Dec 03, 2019 3:23 pm
Location: Ulm

Re: Issues Building on MacOS Big Sur (11.4)

Post by u3dreal »

Ah OK, same here: for Iris Pro and GT750m, imagepipeline works.
Strange that you have problems with the 580.
I run an RX 5700 and it works fine for now.

Thanks for clearing things up.