MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Wumme
Posts: 7
Joined: Fri Feb 05, 2021 7:14 am

MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Wumme »

Hello all,
I did a quick search and could not find anyone with a similar problem, so here we are.

The problem I am having is the following. I run a stress test consisting of the LuxBall HDR scene stress test on all GPUs and a Prime95 torture test on the CPU. After some time, usually 1, 2 or even 3 GPUs simply stop running the kernel and their usage drops to 0%. ("Some time" can be 2 hours or 20 hours.)
If I pause LuxMark, all the other GPUs that are still calculating stop as well and LuxMark freezes. Maybe because it does not receive an answer from the GPU(s) that stopped calculating?
The weird thing is that if I open a new instance of LuxMark, everything is back to normal and I can benchmark again, even if the other instance is still open.

The reason I am doing this is that I need to verify the integrity of a PC build, and for that I have set up a couple of tests.

The build consists of:
MB: Asus Pro ws x299 sage II
CPU: Intel Core i9-10900X
GPUs: 4x AMD Radeon Pro VII

Two of the GPUs are connected directly to the motherboard and the other two are connected via riser cables. The reason is that all 7 PCIe slots are needed (for USB expansion cards).

I thought about it, and a lot of different possible causes came to mind. Maybe the CPU is too slow for whatever task LuxMark performs while stress testing. Or maybe having only 8 PCIe lanes per GPU is the bottleneck. Or maybe the PSU sees a power spike and briefly cuts power to a GPU.
BUT it happened with a 2-GPU setup too. The wattage is fine as well and I could not confirm any power spikes. I could not reproduce the problem without the Prime95 test, so maybe the CPU really is too busy.

My question, given LuxMark's odd behaviour (after a GPU drops its usage, the process must be killed): is it possible that LuxMark itself has a bug?

I have attached a screenshot of MSI Afterburner with the 2-GPU setup. As you can see there, GPU1 drops in usage, power and temperature while GPU2 keeps running normally.

Hopefully somebody can help me.
Thanks in advance,
Wumme
Attachments
screenshot_GPU_fail.PNG
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Dade »

Wumme wrote: Fri Feb 05, 2021 8:37 am My question, given LuxMark's odd behaviour (after a GPU drops its usage, the process must be killed): is it possible that LuxMark itself has a bug?
Maybe, but the latest LuxMark version has been in use for many years and no one has ever reported a similar problem. It may still be a software problem (rather than a hardware one) but, given the track record, it may be in the AMD drivers rather than in LuxMark. Do you need to use AMD GPUs, or can you try NVIDIA too?

You can try to run 4 instances of LuxMark (one for each GPU) instead of a single one using all GPUs: if it works, it should be a good hint that the problem is software and not hardware related.

Does it also happen if you run LuxMark on a single GPU?
Support LuxCoreRender project with salts and bounties
Wumme
Posts: 7
Joined: Fri Feb 05, 2021 7:14 am

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Wumme »

Dade wrote: Fri Feb 05, 2021 11:09 am Maybe, but the latest LuxMark version has been in use for many years and no one has ever reported a similar problem.
OK, yes, I have already checked that I am using the right version: LuxMark v3.1.
Dade wrote: Fri Feb 05, 2021 11:09 am Do you need to use AMD GPUs, or can you try NVIDIA too?
Yes, I need to use AMD GPUs.
Dade wrote: Fri Feb 05, 2021 11:09 am You can try to run 4 instances of LuxMark (one for each GPU) instead of a single one using all GPUs: if it works, it should be a good hint that the problem is software and not hardware related.
Alright, I will try this. I hadn't thought of this approach! :lol:
Dade wrote: Fri Feb 05, 2021 11:09 am Does it also happen if you run LuxMark on a single GPU?
I haven't tried that because, as far as I have noticed, the GPU that never dropped out was the one the display is connected to.
Wumme
Posts: 7
Joined: Fri Feb 05, 2021 7:14 am

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Wumme »

Dade wrote: Fri Feb 05, 2021 11:09 am You can try to run 4 instances of LuxMark (one for each GPU) instead of a single one using all GPUs: if it works, it should be a good hint that the problem is software and not hardware related.
I just came back from my lunch break and wanted to start a separate instance of LuxMark for every GPU, but the moment I wanted to start the stress test I saw that you can only stress test all GPUs at once and not each on its own.
Is it somehow possible to stress test just one at a time?
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Dade »

Wumme wrote: Fri Feb 05, 2021 12:49 pm I just came back from my lunch break and wanted to start a separate instance of LuxMark for every GPU, but the moment I wanted to start the stress test I saw that you can only stress test all GPUs at once and not each on its own.
Is it somehow possible to stress test just one at a time?
Mmm, right, you could download the LuxCoreRender standalone version (https://luxcorerender.org/download/), download the LuxMark test scene (https://github.com/LuxCoreRender/LuxCor ... rk-LuxBall) and run LuxCoreUI from the command line :idea:

It is not LuxMark, but it is an even more complex and heavier test.
Wumme
Posts: 7
Joined: Fri Feb 05, 2021 7:14 am

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Wumme »

Dade wrote: Fri Feb 05, 2021 4:35 pm Mmm, right, you could download the LuxCoreRender standalone version (https://luxcorerender.org/download/), download the LuxMark test scene (https://github.com/LuxCoreRender/LuxCor ... rk-LuxBall) and run LuxCoreUI from the command line :idea:

It is not LuxMark, but it is an even more complex and heavier test.
So first of all, sorry for the late reply. I have downloaded the standalone version of LuxCoreRender and also the LuxMark test scene.
It took me some time to get it to work, and in the end I still could not figure out why the CPU is being used as an intersection device, even though I used all the configs I could find to prohibit the use of the CPU. I tried to run the scene from the LuxMark folder with LuxCoreUI, but that did not work either.

BUT! I came up with an idea. If I configure the render.cfg for LuxMark, it might simply use those settings rather than override them.
So I added the following line to the render.cfg in LuxMark/scenes/luxball/, started LuxMark... and it worked!! It ran the stress test on just 1 GPU.
opencl.devices.select = "0001"
What I then did was copy LuxMark into 4 folders and then adjust the settings so that every LuxMark instance uses a different GPU.
The stress test has now been running for 17 hours, but I will start it again now. I only need to verify 12 hours under max load, and I want to be sure it was not a one-in-a-hundred lucky run.
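
The per-instance setup described above can be sketched as a small script. This is only an illustration under assumptions: the helper names `device_mask` and `select_line` are hypothetical, and which character position maps to which physical GPU is not established in this thread, so the masks should be checked against the device list that LuxMark reports.

```python
def device_mask(gpu_index: int, gpu_count: int) -> str:
    """Return a one-hot string such as "0100" that enables a single device.

    Assumption: one character per OpenCL device, "1" meaning enabled.
    """
    if not 0 <= gpu_index < gpu_count:
        raise ValueError("gpu_index must be in range(gpu_count)")
    return "".join("1" if i == gpu_index else "0" for i in range(gpu_count))


def select_line(gpu_index: int, gpu_count: int) -> str:
    """Build the render.cfg line for one LuxMark copy."""
    return f'opencl.devices.select = "{device_mask(gpu_index, gpu_count)}"'


if __name__ == "__main__":
    # One line per LuxMark copy; each copy gets a different GPU.
    for index in range(4):
        print(f"copy {index}: {select_line(index, 4)}")
```

Each printed line would go into the render.cfg of one LuxMark copy, in the same place as the single line shown above.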
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Dade »

Wumme wrote: Tue Feb 09, 2021 7:49 am It took me some time to get it to work, and in the end I still could not figure out why the CPU is being used as an intersection device, even though I used all the configs I could find to prohibit the use of the CPU.
The latest LuxCore uses C++ code for rendering, by default, with the CPU (i.e. no need to use/have an OpenCL CPU device). To disable the CPU usage, you have to set the number of CPU threads to 0 by setting "native.threads.count" to 0.
Wumme
Posts: 7
Joined: Fri Feb 05, 2021 7:14 am

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Wumme »

Dade wrote: Tue Feb 09, 2021 11:30 am The latest LuxCore uses C++ code for rendering, by default, with the CPU (i.e. no need to use/have an OpenCL CPU device). To disable the CPU usage, you have to set the number of CPU threads to 0 by setting "native.threads.count" to 0.
I have already tried that. It does not work for me... or I am overlooking something in the configs. Here is the output of the luxcoreui call. 21 intersection devices are being allocated; 20 is the number of logical processors.

PS: With the workaround of 4 LuxMark instances, a GPU dropped out again, so that did not work either.

LuxCoreUI v2.4 (LuxCore demo: http://www.luxcorerender.org)
[LuxCore][0.000] Configuration: 
[LuxCore][0.000]   opencl.cpu.use = "0"
[LuxCore][0.000]   native.threads.count = "0"
[LuxCore][0.000]   opencl.gpu.use = "1"
[LuxCore][0.000]   opencl.devices.select = "1000"
[LuxCore][0.000]   path.pathdepth.total = "9"
[LuxCore][0.000]   path.pathdepth.diffuse = "5"
[LuxCore][0.000]   path.pathdepth.glossy = "5"
[LuxCore][0.000]   path.pathdepth.specular = "8"
[LuxCore][0.000]   path.hybridbackforward.enable = "0"
[LuxCore][0.000]   path.hybridbackforward.partition = "0"
[LuxCore][0.000]   path.hybridbackforward.glossinessthreshold = "0.048999998718500137"
[LuxCore][0.000]   film.noiseestimation.warmup = "8"
[LuxCore][0.000]   film.noiseestimation.step = "32"
[LuxCore][0.000]   sampler.sobol.adaptive.strength = "0.5"
[LuxCore][0.000]   sampler.random.adaptive.strength = "0.5"
[LuxCore][0.000]   sampler.metropolis.largesteprate = "0.40000000000000002"
[LuxCore][0.000]   sampler.metropolis.maxconsecutivereject = "512"
[LuxCore][0.000]   sampler.metropolis.imagemutationrate = "0.10000000000000001"
[LuxCore][0.000]   renderengine.type = "PATHOCL"
[LuxCore][0.000]   sampler.type = "SOBOL"
[LuxCore][0.000]   film.width = "800"
[LuxCore][0.000]   film.height = "800"
[LuxCore][0.000]   film.filter.type = "BLACKMANHARRIS"
[LuxCore][0.000]   film.filter.width = "1.5"
[LuxCore][0.000]   lightstrategy.type = "POWER"
[LuxCore][0.000]   scene.epsilon.min = "9.9999997473787516e-06"
[LuxCore][0.000]   scene.epsilon.max = "0.10000000149011612"
[LuxCore][0.000]   film.opencl.enable = "1"
[LuxCore][0.000]   film.opencl.device = "0"
[LuxCore][0.000]   path.forceblackbackground.enable = "0"
[LuxCore][0.000]   filesaver.format = "TXT"
[LuxCore][0.000]   filesaver.renderengine.type = "PATHOCL"
[LuxCore][0.000]   renderengine.seed = "1"
[LuxCore][0.000]   batch.haltspp = "0"
[LuxCore][0.000]   batch.halttime = "43200"
[LuxCore][0.000]   film.imagepipelines.1.0.type = "INTEL_OIDN"
[LuxCore][0.000]   film.imagepipelines.1.0.oidnmemory = "6000"
[LuxCore][0.000]   film.imagepipelines.1.1.type = "NOP"
[LuxCore][0.000]   film.imagepipelines.1.2.type = "TONEMAP_LINEAR"
[LuxCore][0.000]   film.imagepipelines.1.2.scale = "0.77999997138977051"
[LuxCore][0.000]   film.imagepipelines.1.3.type = "GAMMA_CORRECTION"
[LuxCore][0.000]   film.imagepipelines.1.3.value = "2.2000000000000002"
[LuxCore][0.000]   film.imagepipelines.0.0.type = "NOP"
[LuxCore][0.000]   film.imagepipelines.0.1.type = "TONEMAP_LINEAR"
[LuxCore][0.000]   film.imagepipelines.0.1.scale = "0.77999997138977051"
[LuxCore][0.000]   film.imagepipelines.0.2.type = "GAMMA_CORRECTION"
[LuxCore][0.000]   film.imagepipelines.0.2.value = "2.2000000000000002"
[LuxCore][0.000]   film.imagepipelines.1.radiancescales.0.enabled = "1"
[LuxCore][0.000]   film.imagepipelines.1.radiancescales.0.globalscale = "1"
[LuxCore][0.000]   film.imagepipelines.1.radiancescales.0.rgbscale = "1" "1" "1"
[LuxCore][0.000]   film.imagepipelines.0.radiancescales.0.enabled = "1"
[LuxCore][0.000]   film.imagepipelines.0.radiancescales.0.globalscale = "1"
[LuxCore][0.000]   film.imagepipelines.0.radiancescales.0.rgbscale = "1" "1" "1"
[LuxCore][0.000]   film.outputs.0.type = "RGB_IMAGEPIPELINE"
[LuxCore][0.000]   film.outputs.0.index = "0"
[LuxCore][0.000]   film.outputs.0.filename = "RGB_IMAGEPIPELINE_0.png"
[LuxCore][0.000]   film.outputs.1.type = "ALBEDO"
[LuxCore][0.000]   film.outputs.1.filename = "ALBEDO.exr"
[LuxCore][0.000]   film.outputs.2.type = "AVG_SHADING_NORMAL"
[LuxCore][0.000]   film.outputs.2.filename = "AVG_SHADING_NORMAL.exr"
[LuxCore][0.000]   film.outputs.3.type = "RGB_IMAGEPIPELINE"
[LuxCore][0.000]   film.outputs.3.index = "1"
[LuxCore][0.000]   film.outputs.3.filename = "RGB_IMAGEPIPELINE_1.png"
[LuxCore][0.000]   scene.file = "scene.scn"
[LuxCore][0.000] File Name Resolver Configuration: 
[LuxCore][0.000]   .
[LuxCore][0.000]   ./LuxCoreTestScenes/scenes/LuxBall/LuxCoreScene
[SDL][0.000] Reading scene: ./LuxCoreTestScenes/scenes/LuxBall/LuxCoreScene/scene.scn
[SDL][0.000] Texture definition: 1950917966792Color
[SDL][0.000] Reading texture map: ./LuxCoreTestScenes/scenes/LuxBall/LuxCoreScene/imagemap-00000.png
[SDL][0.015] Material definition: __CLAY__
[SDL][0.015] Material definition: 1950790513832_AREA_LIGHT_MAT
[SDL][0.015] Material definition: 1950790488536_AREA_LIGHT_MAT
[SDL][0.015] Material definition: 1950790484072_AREA_LIGHT_MAT
[SDL][0.015] Material definition: walls1950785125336
[SDL][0.015] Material definition: floor1950785130040
[SDL][0.015] Material definition: luxball1950785136424
[SDL][0.015] Material definition: luxball__stand1950785131720
[SDL][0.015] Material definition: stand__text1950785130376
[SDL][0.015] Camera type: perspective
[SDL][0.015] Camera position: Point[0.117648, -0.341081, 0.27677]
[SDL][0.015] Camera target: Point[0.114897, -0.333106, 0.271402]
[SDL][0.015] Camera clipping plane disabled
[SDL][0.015] The 19507905138320 object is a light sources with 2 triangles
[SDL][0.015] The 19507904885360 object is a light sources with 2 triangles
[SDL][0.015] The 19507904840720 object is a light sources with 2 triangles
[SDL][0.047] Scene objects count: 11
[SDL][0.218] Camera type: perspective
[SDL][0.218] Camera position: Point[0.117648, -0.341081, 0.27677]
[SDL][0.218] Camera target: Point[0.114897, -0.333106, 0.271402]
[SDL][0.218] Camera clipping plane disabled
Film size adjusted: 800x800 (Frame buffer size: 800x800)
RenderConfig has cached kernels: True
[LuxCore][0.234] Film resolution: 800x800
[SDL][0.234] Film output definition: RGB_IMAGEPIPELINE [image.png]
[SDL][0.234] Image pipeline: film.imagepipelines.0
[SDL][0.234] Image pipeline step 0: NOP
[SDL][0.234] Image pipeline step 1: TONEMAP_LINEAR
[SDL][0.234] Image pipeline step 2: GAMMA_CORRECTION
[SDL][0.234] Image pipeline: film.imagepipelines.1
[SDL][0.234] Image pipeline step 0: INTEL_OIDN
[SDL][0.234] Image pipeline step 1: NOP
[SDL][0.234] Image pipeline step 2: TONEMAP_LINEAR
[SDL][0.234] Image pipeline step 3: GAMMA_CORRECTION
[SDL][0.234] Film output definition: RGB_IMAGEPIPELINE [RGB_IMAGEPIPELINE_0.png]
[SDL][0.234] Film output definition: ALBEDO [ALBEDO.exr]
[SDL][0.234] Film output definition: AVG_SHADING_NORMAL [AVG_SHADING_NORMAL.exr]
[SDL][0.234] Film output definition: RGB_IMAGEPIPELINE [RGB_IMAGEPIPELINE_1.png]
[LuxRays][0.234] OpenCL support: enabled
[LuxRays][0.500] OpenCL Platform 0: AMD Accelerated Parallel Processing
[LuxRays][0.500] CUDA support: enabled
[LuxRays][0.500] Device 0 name: Native
[LuxRays][0.500] Device 0 type: NATIVE_THREAD
[LuxRays][0.500] Device 0 compute units: 1
[LuxRays][0.500] Device 0 preferred float vector width: 4
[LuxRays][0.500] Device 0 max allocable memory: 17592186044415MBytes
[LuxRays][0.500] Device 0 max allocable memory block size: 17592186044415MBytes
[LuxRays][0.500] Device 0 has out of core memory support: 0
[LuxRays][0.500] Device 1 name: gfx906
[LuxRays][0.500] Device 1 type: OPENCL_GPU
[LuxRays][0.500] Device 1 compute units: 60
[LuxRays][0.500] Device 1 preferred float vector width: 1
[LuxRays][0.500] Device 1 max allocable memory: 16368MBytes
[LuxRays][0.500] Device 1 max allocable memory block size: 13695MBytes
[LuxRays][0.500] Device 1 has out of core memory support: 0
[LuxRays][0.500] Device 2 name: gfx906
[LuxRays][0.500] Device 2 type: OPENCL_GPU
[LuxRays][0.500] Device 2 compute units: 60
[LuxRays][0.500] Device 2 preferred float vector width: 1
[LuxRays][0.500] Device 2 max allocable memory: 16368MBytes
[LuxRays][0.500] Device 2 max allocable memory block size: 13695MBytes
[LuxRays][0.500] Device 2 has out of core memory support: 0
[LuxRays][0.500] Device 3 name: gfx906
[LuxRays][0.500] Device 3 type: OPENCL_GPU
[LuxRays][0.500] Device 3 compute units: 60
[LuxRays][0.500] Device 3 preferred float vector width: 1
[LuxRays][0.500] Device 3 max allocable memory: 16368MBytes
[LuxRays][0.500] Device 3 max allocable memory block size: 13695MBytes
[LuxRays][0.500] Device 3 has out of core memory support: 0
[LuxRays][0.500] Device 4 name: gfx906
[LuxRays][0.500] Device 4 type: OPENCL_GPU
[LuxRays][0.500] Device 4 compute units: 60
[LuxRays][0.500] Device 4 preferred float vector width: 1
[LuxRays][0.500] Device 4 max allocable memory: 16368MBytes
[LuxRays][0.500] Device 4 max allocable memory block size: 13695MBytes
[LuxRays][0.500] Device 4 has out of core memory support: 0
[LuxRays][0.500] Creating 21 intersection device(s)
[LuxRays][0.500] Allocating intersection device 0: gfx906 (Type = OPENCL_GPU)
[LuxRays][0.500] Allocating intersection device 1: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 2: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 3: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 4: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 5: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 6: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 7: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 8: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 9: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 10: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 11: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 12: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 13: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 14: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 15: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 16: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 17: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 18: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 19: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 20: Native (Type = NATIVE_THREAD)
[LuxCore][0.500] CUDA devices used:
[LuxCore][0.500] OpenCL devices used:
[LuxCore][0.500] [gfx906 OpenCLIntersect]
[LuxCore][0.500]   Device OpenCL version: OpenCL 2.0 AMD-APP (3075.13)
[LuxCore][0.500] Native devices used: 20
[LuxCore][0.500] Configuring 1 OpenCL render threads
[LuxCore][0.500] Configuring 20 native render threads
[LuxRays][0.500] Preprocessing DataSet
[LuxRays][0.500] Total vertex count: 69630
[LuxRays][0.500] Total triangle count: 116976
[LuxRays][0.500] Preprocessing DataSet done
[LuxRays][0.500] Adding DataSet accelerator: BVH
[LuxRays][0.500] Total vertex count: 69630
[LuxRays][0.500] Total triangle count: 116976
[LuxRays][0.500] BVH Dataset preprocessing time: 0ms
[LuxRays][0.500] BVH builder: EMBREE_BINNED_SAH
[LuxRays][0.547] BVH build hierarchy time: 47ms
[LuxRays][0.547] BVH total build time: 47ms
[LuxRays][0.547] Total BVH memory usage: 5451Kbytes
[LuxRays][0.547] Adding DataSet accelerator: EMBREE
[LuxRays][0.547] Total vertex count: 69630
[LuxRays][0.547] Total triangle count: 116976
[LuxRays][0.562] EmbreeAccel build time: 14ms
[LuxRays][0.797] [Device gfx906 OpenCLIntersect] BVH mesh vertices buffer size: 815Kbytes
[LuxRays][0.812] [Device gfx906 OpenCLIntersect] BVH nodes buffer size: 5451Kbytes
[LuxRays][0.812] [BVHKernel] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D PARAM_RAY_EPSILON_MIN=1e-05f -D PARAM_RAY_EPSILON_MAX=0.1f -D LUXRAYS_OPENCL_DEVICE -D LUXRAYS_OS_WINDOWS -cl-fast-relaxed-math -cl-mad-enable
[LuxRays][0.812] [BVHKernel] Compiling kernels 
[LuxRays][0.812] [BVHKernel] Program cached
[LuxCore][0.828] [PathOCLRenderEngine] OpenCL task count: 524288
[LuxCore][0.828] [PathOCLBaseRenderEngine] OpenCL max. page memory size: 14023884Kbytes
[LuxCore][0.828] Compile Geometry
[LuxCore][0.843] Scene geometry compilation time: 14ms
[LuxCore][0.843] Compile 28 Textures
[LuxCore][0.843] Texture evaluation ops count: 84
[LuxCore][0.843] Texture evaluation max. stack size: 3
[LuxCore][0.843] Textures compilation time: 0ms
[LuxCore][0.843] Compile 9 Materials
[LuxCore][0.843] Material evaluation ops count: 63
[LuxCore][0.843] Material evaluation max. stack size: 8
[LuxCore][0.843] Material compilation time: 0ms
[LuxCore][0.843] Compile Lights
[LuxCore][0.843] Lights compilation time: 0ms
[LuxCore][0.843] Compile ImageMaps
[LuxCore][0.843] Image maps page(s) count: 1
[LuxCore][0.843]  RGB channel page 0 size: 7475Kbytes
[LuxCore][0.843] Image maps compilation time: 0ms
[LuxCore][0.843] Always enabled OpenCL code: 
[LuxCore][0.843] Compile Geometry
[LuxCore][0.843] Scene geometry compilation time: 0ms
[LuxCore][0.843] Compile 28 Textures
[LuxCore][0.843] Texture evaluation ops count: 84
[LuxCore][0.843] Texture evaluation max. stack size: 3
[LuxCore][0.843] Textures compilation time: 0ms
[LuxCore][0.843] Compile 9 Materials
[LuxCore][0.843] Material evaluation ops count: 63
[LuxCore][0.843] Material evaluation max. stack size: 8
[LuxCore][0.843] Material compilation time: 0ms
[LuxCore][0.843] Compile Lights
[LuxCore][0.843] Lights compilation time: 0ms
[LuxCore][0.843] Compile ImageMaps
[LuxCore][0.843] Image maps page(s) count: 1
[LuxCore][0.843]  RGB channel page 0 size: 7475Kbytes
[LuxCore][0.843] Image maps compilation time: 0ms
[LuxCore][0.843] Starting 1 OpenCL render threads
[LuxRays][0.859] [Device gfx906 OpenCLIntersect] RADIANCE_PER_PIXEL_NORMALIZEDs[0] buffer size: 10000Kbytes
[LuxRays][0.859] [Device gfx906 OpenCLIntersect] ALBEDO buffer size: 10000Kbytes
[LuxRays][0.859] [Device gfx906 OpenCLIntersect] AVG_SHADING_NORMAL buffer size: 10000Kbytes
[LuxRays][0.859] [Device gfx906 OpenCLIntersect] NOISE buffer size: 2500Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] RADIANCE_PER_PIXEL_NORMALIZEDs[0] buffer size: 10000Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] Camera buffer size: 5468bytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] Normals buffer size: 815Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] UVs buffer size: 426Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] Triangle normals buffer size: 1370Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] Vertices buffer size: 815Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] Triangles buffer size: 1370Kbytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] Mesh description buffer size: 3432bytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] ImageMap descriptions buffer size: 32bytes
[LuxRays][0.890] [Device gfx906 OpenCLIntersect] ImageMaps buffer size: 7475Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Textures buffer size: 8288bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Texture evaluation ops buffer size: 672bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Texture evaluation stacks buffer size: 6144Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Materials buffer size: 2016bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Material evaluation ops buffer size: 756bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Material evaluation stacks buffer size: 16384Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Scene objects buffer size: 264bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Lights buffer size: 1992bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Light offsets (Part I) buffer size: 44bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Light offsets (Part II) buffer size: 24bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] LightsDistribution buffer size: 56bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] InfiniteLightSourcesDistribution buffer size: 56bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Ray buffer size: 24576Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] RayHit buffer size: 10240Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] GPUTaskConfiguration buffer size: 288bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] GPUTask buffer size: 339968Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] GPUTaskDirectLight buffer size: 30720Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] GPUTaskState buffer size: 200704Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] GPUTask Stats buffer size: 2048Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] SamplerSharedData buffer size: 2511Kbytes
[LuxCore][0.906] [PathOCLBaseRenderThread::0] Size of a Sample: 40bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Sample buffer size: 20480Kbytes
[LuxCore][0.906] [PathOCLBaseRenderThread::0] Size of a SampleData: 8bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] SampleData buffer size: 4096Kbytes
[LuxCore][0.906] [PathOCLBaseRenderThread::0] Size of a SampleResult: 304bytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Sample buffer size: 155648Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] PathInfo buffer size: 55296Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] DirectLightVolumeInfo buffer size: 22528Kbytes
[LuxRays][0.906] [Device gfx906 OpenCLIntersect] Pixel Filter Distribution buffer size: 33Kbytes
[LuxCore][0.906] [PathOCLBaseRenderThread::0] Compiling kernels 
[LuxRays][0.906] [PathOCL kernel] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D RENDER_ENGINE_PATHOCL -D PARAM_RAY_EPSILON_MIN=1e-05f -D PARAM_RAY_EPSILON_MAX=0.1f -D LUXCORE_AMD_OPENCL -D LUXRAYS_OPENCL_DEVICE -D LUXRAYS_OS_WINDOWS -cl-fast-relaxed-math -cl-mad-enable
[LuxRays][0.906] [PathOCL kernel] Compiling kernels 
[LuxRays][0.984] [PathOCL kernel] Program cached
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling Film_Clear Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling InitSeed Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling Init Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_RT_NEXT_VERTEX Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_HIT_NOTHING Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_HIT_OBJECT Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_RT_DL Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_DL_ILLUMINATE Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_DL_SAMPLE_BSDF Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_GENERATE_NEXT_VERTEX_RAY Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_SPLAT_SAMPLE Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_NEXT_SAMPLE Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_GENERATE_CAMERA_RAY Kernel
[LuxCore][0.984] [PathOCLBaseRenderThread::0] AdvancePaths_MK_* workgroup size: 32
[LuxCore][0.984] [PathOCLBaseRenderThread::0] Kernels compilation time: 78ms
[LuxCore][1.000] Starting 20 native render threads
[LuxCore][1.156] Film hardware image pipeline
[LuxCore][1.156] Film hardware device used: gfx906 OpenCLIntersect (Type: OPENCL_GPU)
[LuxCore][1.156]   Device OpenCL version: OpenCL 2.0 AMD-APP (3075.13)
[LuxRays][1.250] [Device gfx906 OpenCLIntersect] IMAGEPIPELINE buffer size: 7500Kbytes
[LuxRays][1.359] [Device gfx906 OpenCLIntersect] Merge buffer size: 10000Kbytes
[LuxRays][1.359] [MergeSampleBuffersOCL] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_OPENCL_DEVICE -D LUXRAYS_OS_WINDOWS
[LuxRays][1.359] [MergeSampleBuffersOCL] Compiling kernels 
[LuxRays][1.515] [MergeSampleBuffersOCL] Program cached
[LuxCore][1.515] [MergeSampleBuffersOCL] Compiling Film_MergeBufferInitialize Kernel
[LuxCore][1.515] [MergeSampleBuffersOCL] Compiling Film_MergeRADIANCE_PER_PIXEL_NORMALIZED Kernel
[LuxCore][1.515] [MergeSampleBuffersOCL] Compiling Film_MergeRADIANCE_PER_SCREEN_NORMALIZED Kernel
[LuxCore][1.515] [MergeSampleBuffersOCL] Compiling Film_MergeBufferFinalize Kernel
[LuxCore][1.515] [MergeSampleBuffersOCL] Kernels compilation time: 155ms
[LuxRays][1.703] [LinearToneMap] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_OPENCL_DEVICE -D LUXRAYS_OS_WINDOWS
[LuxRays][1.703] [LinearToneMap] Compiling kernels 
[LuxRays][1.937] [LinearToneMap] Program cached
[LuxCore][1.937] [AutoLinearToneMap] Compiling LinearToneMap_Apply Kernel
[LuxCore][1.937] [LinearToneMap] Kernels compilation time: 233ms
[LuxRays][1.937] [Device gfx906 OpenCLIntersect] Gamma table buffer size: 16Kbytes
[LuxRays][1.937] [GammaCorrectionPlugin] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_OPENCL_DEVICE -D LUXRAYS_OS_WINDOWS
[LuxRays][1.937] [GammaCorrectionPlugin] Compiling kernels 
[LuxRays][2.125] [GammaCorrectionPlugin] Program cached
[LuxCore][2.125] [GammaCorrectionPlugin] Compiling GammaCorrectionPlugin_Apply Kernel
[LuxCore][2.125] [GammaCorrectionPlugin] Kernels compilation time: 187ms
[LuxCore][4.406] [gfx906 OpenCLIntersect] Memory used for hardware image pipeline: 17500Kbytes
Done.
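
A quick way to check what a log like the one above reports is to count the allocated intersection devices by type. The following is a rough sketch, not part of LuxCore: the function name `count_device_types` is hypothetical, and the regular expression only assumes the "Allocating intersection device N: NAME (Type = TYPE)" line format visible in this output.

```python
import re
from collections import Counter

# Matches lines such as:
# [LuxRays][0.500] Allocating intersection device 1: Native (Type = NATIVE_THREAD)
ALLOC_RE = re.compile(r"Allocating intersection device \d+: .+ \(Type = (\w+)\)")

SAMPLE = """\
[LuxRays][0.500] Creating 21 intersection device(s)
[LuxRays][0.500] Allocating intersection device 0: gfx906 (Type = OPENCL_GPU)
[LuxRays][0.500] Allocating intersection device 1: Native (Type = NATIVE_THREAD)
[LuxRays][0.500] Allocating intersection device 2: Native (Type = NATIVE_THREAD)
"""


def count_device_types(log_text: str) -> Counter:
    """Count allocated intersection devices per device type."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ALLOC_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for device_type, count in sorted(count_device_types(SAMPLE).items()):
        print(device_type, count)
```

If the CPU were really disabled, one would expect the NATIVE_THREAD count to be zero when this is run over the full log.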
Martini
Posts: 125
Joined: Fri Nov 23, 2018 11:36 am
Location: Australia

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Martini »

Wumme wrote: Wed Feb 10, 2021 7:15 am
Dade wrote: Tue Feb 09, 2021 11:30 am The latest LuxCore uses C++ code for rendering, by default, with the CPU (i.e. no need to use/have an OpenCL CPU device). To disable the CPU usage, you have to set the number of CPU threads to 0 by setting "native.threads.count" to 0.
...

LuxCoreUI v2.4 (LuxCore demo: http://www.luxcorerender.org)
[LuxCore][0.000] Configuration: 
[LuxCore][0.000]   opencl.cpu.use = "0"
[LuxCore][0.000]   native.threads.count = "0"
I tried exporting from BlendLuxCore and what it output for me (and worked) is:

opencl.native.threads.count = 0
I think you are missing the opencl. prefix.
Wumme wrote: Wed Feb 10, 2021 7:15 am

[LuxCore][0.000]   opencl.cpu.use = "0"
[LuxCore][0.000]   native.threads.count = "0"
[LuxCore][0.000]   opencl.gpu.use = "1"
[LuxCore][0.000]   opencl.devices.select = "1000"
I notice that all your values are quoted. I'm not sure if it makes a difference, but I think if the value is a pure int or float, then it should not be quoted?

opencl.cpu.use = 0
opencl.native.threads.count = 0
opencl.gpu.use = 1
opencl.devices.select = "1000"
Hope this helps :)
AMD Ryzen Threadripper PRO 5995WX 64-Cores | 2x Gigabyte RTX 4090 Gaming OC
ASUS Pro WS WRX80E-SAGE SE WIFI | 256GB Kingston Server Premier ECC Unbuffered DDR4
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: MultiGPU stresstest - GPU(s) stop(s) calculating kernel

Post by Dade »

Martini wrote: Wed Feb 10, 2021 10:02 am I think you are missing the opencl. prefix.
Yes, I was wrong: "native.threads.count" is for PATHCPU; PATHOCL requires "opencl.native.threads.count".
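
To make the distinction concrete, here is a minimal sketch of picking the property name from the render engine and writing it into a render.cfg text. The mapping contains only the two engines named in this thread; the helper names `set_property` and `set_native_thread_count` are hypothetical and not part of any LuxCore API.

```python
import re

# Engine-to-property mapping as stated in this thread; other engines are not covered.
THREAD_COUNT_KEY = {
    "PATHCPU": "native.threads.count",
    "PATHOCL": "opencl.native.threads.count",
}


def set_property(cfg_text: str, key: str, value: str) -> str:
    """Replace an existing `key = ...` line, or append one if it is missing."""
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key} = {value}"
    if pattern.search(cfg_text):
        return pattern.sub(lambda _match: new_line, cfg_text)
    separator = "" if not cfg_text or cfg_text.endswith("\n") else "\n"
    return f"{cfg_text}{separator}{new_line}\n"


def set_native_thread_count(cfg_text: str, engine: str, count: int) -> str:
    """Set the native thread count using the property that matches the engine."""
    return set_property(cfg_text, THREAD_COUNT_KEY[engine], str(count))


if __name__ == "__main__":
    cfg = 'renderengine.type = "PATHOCL"\nnative.threads.count = "0"\n'
    print(set_native_thread_count(cfg, "PATHOCL", 0))
```

Note that the sketch leaves a stale "native.threads.count" line in place; it only adds or updates the property that applies to the chosen engine.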