Mac OS

Discussion related to the Engine functionality, implementations and API.
B.Y.O.B.
Developer
Posts: 1811
Joined: Mon Dec 04, 2017 10:08 pm
Location: Germany

Re: Mac OS

Post by B.Y.O.B. » Thu Oct 11, 2018 9:21 pm

robbrown wrote:
Thu Oct 11, 2018 9:18 pm
Weirdly if I try to deselect the GPU and leave CPU enabled in the LuxCore Device Settings panel of Blender
What you are doing here is leaving the C++ CPU device enabled, not the OpenCL CPU device (i.e. hybrid rendering, but without the GPU).
What you want (only OpenCL CPU) is not possible in BlendLuxCore; that's why I added the option in the debug panel I mentioned above.

And yes, try commenting out the opencl.devices.select="01" line.

Try to use the same settings as I do in the new debug option:
https://github.com/LuxCoreRender/BlendL ... fbf203R100
Only these 3 config lines for OpenCL.
Support LuxCoreRender project with salts and bounties

robbrown
Developer
Posts: 47
Joined: Mon Sep 03, 2018 1:04 am

Re: Mac OS

Post by robbrown » Thu Oct 11, 2018 9:32 pm

Ah I see, so Use CPUs in Blender only works when a GPU device is selected... I wonder if there should be a warning, or if Use CPUs should be grayed out or something, to indicate that from a usability perspective. Although I don't think the average user is doing this. :lol:

Removing the devices select line fixed the issue where it kept using the GPU. It's now using the OpenCL CPU only and seems to be OK, so we're definitely headed for driver-issue territory.

robbrown
Developer
Posts: 47
Joined: Mon Sep 03, 2018 1:04 am

Re: Mac OS

Post by robbrown » Thu Oct 11, 2018 9:33 pm

Doh... There is one, I'm just not reading today

robbrown
Developer
Posts: 47
Joined: Mon Sep 03, 2018 1:04 am

Re: Mac OS

Post by robbrown » Wed Oct 17, 2018 4:35 am

Continued progress on the search for the root cause of the OpenCL crash with LuxCore2.1Benchmark on Nvidia cards: I narrowed it down to the material Mix node being the culprit. If I delete the Mix node from all materials and just route the previous color output to the diffuse color input, the program no longer results in OpenCL throwing a SIGABRT...

I also figured out that this occurs at oclQueue.finish() and triggers a GPU restart. The system diagnostics report usually shows:

Channel exception! Exception type = 0x1f Access Violation Error (MMU Error 2)
Occasionally:

Channel exception! Exception type = 0xd Graphics Engine Error (GR Exception Error)
I haven't seen any additional OpenCL error checking I could add in the LuxCore code, so it seems weird that the error is not caught before oclQueue.finish(). Sadly I don't have a lot of path tracing experience under my belt (work in progress :lol:), so I'm probably stuck with printf statements unless anyone has any suggestions.
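
One likely reason the failure only surfaces at oclQueue.finish() is that OpenCL enqueue calls are asynchronous: a kernel that faults on the device is typically reported at a later synchronization point, not where it was enqueued. The toy model below illustrates that behaviour in plain C++. ToyQueue, Enqueue, Finish and the command names are all hypothetical; this is not the OpenCL or LuxCore API, only a sketch of the idea.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an asynchronous command queue. This is NOT the OpenCL API;
// every name here is hypothetical and exists only for illustration.
class ToyQueue {
public:
    // Enqueue returns immediately; the command has not run yet.
    void Enqueue(std::string name, std::function<bool()> cmd) {
        pending.push_back({std::move(name), std::move(cmd)});
    }

    // Finish runs everything that is pending. A failure in any earlier
    // command is only observed here, at the synchronization point.
    void Finish() {
        std::vector<Command> cmds;
        cmds.swap(pending);
        for (const Command &c : cmds) {
            if (!c.run())
                throw std::runtime_error("command failed: " + c.name);
        }
    }

private:
    struct Command {
        std::string name;
        std::function<bool()> run;
    };
    std::vector<Command> pending;
};

// Returns the message reported when a faulty command sits in the queue.
std::string FirstFailure() {
    ToyQueue q;
    q.Enqueue("init", [] { return true; });
    q.Enqueue("mix_material_kernel", [] { return false; }); // simulated fault
    q.Enqueue("film_merge", [] { return true; });
    try {
        q.Finish();
    } catch (const std::runtime_error &e) {
        return e.what();
    }
    return "";
}
```

In this model, calling Finish() after every Enqueue() would pin the failure to a single command. The same idea (synchronizing after each kernel launch while debugging) can help narrow down which kernel faults, at the cost of performance.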

I also noticed that a pyunittest fails: test_InfiniteLight_BIDIRCPU_METROPOLIS, with an abort trap: 6

Dade
Developer
Posts: 1530
Joined: Mon Dec 04, 2017 8:36 pm

Re: Mac OS

Post by Dade » Wed Oct 17, 2018 9:42 am

I have fought AMD driver bugs for a long time; it is an incredibly tedious process. Finding a workaround requires a huge number of tries and luck (if a workaround even exists). It may not be worth your time.
Mix material is one of the few materials referencing other materials, and it is exactly the kind of complex thing that can drive weak OpenCL compilers nuts. You could try to disable (i.e. declare as "") the OPENCL_FORCE_INLINE/OPENCL_FORCE_NOT_INLINE macros; they are defined only for NVIDIA drivers, as a workaround for an NVIDIA compiler problem (i.e. trying to inline everything) on Linux and Windows.
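
As a rough illustration of what "declaring the macros as empty" means, the sketch below shows the general pattern in plain C++. The attribute spellings, the SKETCH_NVIDIA_WORKAROUND switch, and the two functions are assumptions made for this example; they are not copied from the LuxCore kernel sources.

```cpp
// Sketch only: the attribute spellings and the SKETCH_* switch are
// assumptions for illustration, not taken from the LuxCore sources.
#if defined(SKETCH_NVIDIA_WORKAROUND)
#define OPENCL_FORCE_INLINE __attribute__((always_inline))
#define OPENCL_FORCE_NOT_INLINE __attribute__((noinline))
#else
// "Declared as empty": the macros expand to nothing, so the compiler
// is free to make its own inlining decisions.
#define OPENCL_FORCE_INLINE
#define OPENCL_FORCE_NOT_INLINE
#endif

// Hypothetical stand-in for evaluating a child material.
OPENCL_FORCE_NOT_INLINE float EvalChild(float w) { return 1.f - w; }

// Hypothetical stand-in for a mix that references another material.
OPENCL_FORCE_INLINE inline float MixWeights(float w) { return w + EvalChild(w); }
```

With the switch undefined, both macros vanish and the code compiles exactly as if the annotations were never there, which is the experiment suggested above.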

The crash of test_InfiniteLight_BIDIRCPU_METROPOLIS is quite strange. Can you post a stack trace of the crash?

robbrown
Developer
Posts: 47
Joined: Mon Sep 03, 2018 1:04 am

Re: Mac OS

Post by robbrown » Fri Oct 19, 2018 6:25 pm

I was hoping to get the Xcode debugger attached to the Python tests but haven't had success with that yet. I can run the debugger with luxcoreui now; I still need to make a PR with some cmake additions that add macOS debug symbols.

Anyway, I've attached the crash log from the OS, which has the stack trace. I suspect lightVertex.throughput is becoming an Inf since lightEmitPdfW and lightDirectPdfW look to be zero. That is interesting, though, since GetRadiance returns what look to be zero values many times before the crash occurs.

If it looks like there's more digging to do, I can do so with print statements for now.
Attachments
Python_2018-10-18-crash.zip

robbrown
Developer
Posts: 47
Joined: Mon Sep 03, 2018 1:04 am

Re: Mac OS

Post by robbrown » Sat Oct 20, 2018 6:30 pm

Has LuxCore been compiled with clang on Linux before?

I was hoping there was some GCC compiler flag handling floating point rounding, but I don't see anything other than the Windows fp-precise.

Anyway, to rule out floating point accuracy/variability among compilers, I'm thinking of compiling on Linux with clang to see if the assert fails there.

I added an if block to avoid the Inf problem and quickly ran into a different assert failure around line 600, leading me to think it's the floating point differences.
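
When comparing builds across compilers, it can help to print the floating-point settings each binary was compiled with. The sketch below is a hypothetical diagnostic helper (FloatBuildInfo is not part of LuxCore). It relies only on the standard FLT_EVAL_METHOD macro from <cfloat> and on the __FAST_MATH__, __clang__ and __GNUC__ macros that GCC and clang predefine.

```cpp
#include <cfloat>
#include <string>

// Reports compile-time floating-point settings that can differ between
// compilers and flag sets. Purely diagnostic; not part of LuxCore.
std::string FloatBuildInfo() {
    // FLT_EVAL_METHOD describes the precision used for intermediate results.
    std::string info = "FLT_EVAL_METHOD=" + std::to_string(FLT_EVAL_METHOD);
#if defined(__FAST_MATH__)
    info += " fast-math=on";   // GCC and clang define this under -ffast-math
#else
    info += " fast-math=off";
#endif
#if defined(__clang__)
    info += " compiler=clang"; // checked first: clang also defines __GNUC__
#elif defined(__GNUC__)
    info += " compiler=gcc";
#else
    info += " compiler=other";
#endif
    return info;
}
```

Printing this string once at startup in both the macOS and Linux builds would show whether the two binaries differ in evaluation method or fast-math mode before digging further into the asserts.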

B.Y.O.B.
Developer
Posts: 1811
Joined: Mon Dec 04, 2017 10:08 pm
Location: Germany

Re: Mac OS

Post by B.Y.O.B. » Sat Oct 20, 2018 7:11 pm

robbrown wrote:
Sat Oct 20, 2018 6:30 pm
Has LuxCore been compiled with clang on Linux before?
The readme of the compilation script for Linux says that clang is supported, but I have not tested it yet.

Dade
Developer
Posts: 1530
Joined: Mon Dec 04, 2017 8:36 pm

Re: Mac OS

Post by Dade » Sun Oct 21, 2018 9:51 am

robbrown wrote:
Fri Oct 19, 2018 6:25 pm
I was hoping to get the Xcode debugger attached to the Python tests but haven't had success with that yet. I can run the debugger with luxcoreui now; I still need to make a PR with some cmake additions that add macOS debug symbols.

Anyway, I've attached the crash log from the OS, which has the stack trace. I suspect lightVertex.throughput is becoming an Inf since lightEmitPdfW and lightDirectPdfW look to be zero. That is interesting, though, since GetRadiance returns what look to be zero values many times before the crash occurs.

If it looks like there's more digging to do, I can do so with print statements for now.
I'm trying to reproduce the problem here but it doesn't happen. I have enabled asserts but they never fail when rendering test_InfiniteLight_BIDIRCPU_METROPOLIS.
The code has a guard "if (!lightVertex.throughput.Black()) { }" before the division by the PDF, and a non-black color with a zero PDF should never occur.
If it isn't something strange like a compiler bug, it should happen here too.
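
To make the role of that guard concrete, here is a minimal sketch with a stand-in Spectrum type. The struct, IsValid and ScaleByPdf are hypothetical and much simpler than the real LuxCore classes; the sketch assumes IEEE 754 float behaviour, where a non-zero value divided by zero gives Inf and 0/0 gives NaN.

```cpp
#include <cmath>

// Minimal stand-in for an RGB spectrum; not the LuxCore Spectrum class.
struct Spectrum {
    float c[3];

    // True when every component is exactly zero.
    bool Black() const { return c[0] == 0.f && c[1] == 0.f && c[2] == 0.f; }

    // True when no component is NaN or Inf.
    bool IsValid() const {
        return std::isfinite(c[0]) && std::isfinite(c[1]) && std::isfinite(c[2]);
    }
};

// Divide the throughput by a PDF only when it is not black. A black
// throughput is left untouched, so 0/0 (NaN) never occurs.
Spectrum ScaleByPdf(Spectrum throughput, float pdf) {
    if (!throughput.Black()) {
        for (float &v : throughput.c)
            v /= pdf;
    }
    return throughput;
}
```

In this model a black throughput with a zero PDF stays valid, while a non-black throughput with a zero PDF becomes Inf. That second combination is the one that should never occur, so seeing it would point at the values feeding the guard, not at the guard itself.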

Does it always happen?

You can execute:

cd pyunittests
../bin/luxcoreui -D renderengine.type BIDIRCPU -D sampler.type METROPOLIS resources/scenes/simple/light-infinite.cfg
to run the scene by hand.

robbrown
Developer
Posts: 47
Joined: Mon Sep 03, 2018 1:04 am

Re: Mac OS

Post by robbrown » Sun Oct 21, 2018 6:26 pm

It happens every time with METROPOLIS for me. If I add additional checks there to skip when the PDF is zero, it fails a few iterations later in the TraceLightPath IsAllValid assert with more Inf floats. If I run the RANDOM test instead, it occasionally makes it past this point, which is what leads me to think some sort of floating point behavior is causing the issue.

...
// Debug check: when both PDFs are zero, log them and force a black throughput
if (lightEmitPdfW == 0.f && lightDirectPdfW == 0.f)
{
    fprintf(stdout, "lightEmitPdfW=%1.20f, lightDirectPdfW=%1.20f\n", lightEmitPdfW, lightDirectPdfW);
    lightVertex.throughput = 0.0;
}

if (!lightVertex.throughput.Black()) {
...
I only get the fprintf with PDF at zero twice for METROPOLIS and about the same for RANDOM.

Manually running the rendering through luxcoreui yields the same result.

Currently, trying to get clang to work on my Linux box isn't going too smoothly; zlib fails right away. Using the precompiled gcc libs, the LuxCore cmake step fails with "LLVMgold.so: error loading plugin:", so I'm still working on that front.
