PATHGPU with OpenCL and CUDA support

Discussion related to the LuxCore functionality, implementations and API.
u3dreal
Developer
Posts: 428
Joined: Tue Dec 03, 2019 3:23 pm
Location: Ulm

Re: PATHGPU with OpenCL and CUDA support

Post by u3dreal » Thu Apr 23, 2020 9:51 pm

Meanwhile on macOS High Sierra...

Code:

[LuxRays][28.533] EmbreeAccel build time: 0ms
[LuxRays][28.533] [Device GeForce GT 750M CUDAIntersect] BVH mesh vertices buffer size: 768bytes
[LuxRays][28.533] [Device GeForce GT 750M CUDAIntersect] BVH nodes buffer size: 1536bytes
[LuxRays][28.533] [BVHKernel] Compiler options: -D LUXRAYS_CUDA_DEVICE  -D LUXRAYS_OPENCL_KERNEL -D PARAM_RAY_EPSILON_MIN=1e-05f -D PARAM_RAY_EPSILON_MAX=0.1f
[LuxRays][28.533] [BVHKernel] Compiling kernels
[LuxRays][28.674] [BVHKernel] CUDA program compilation error: 
epsilon_funcs.cl(32): error: identifier "PARAM_RAY_EPSILON_MIN" is undefined

epsilon_funcs.cl(32): error: identifier "PARAM_RAY_EPSILON_MAX" is undefined

ray_funcs.cl(77): error: identifier "VSTORE3F" is undefined

ray_funcs.cl(90): error: identifier "VSTORE3F" is undefined

ray_funcs.cl(102): error: identifier "VSTORE3F" is undefined

bvh.cl(162): error: identifier "VLOAD3F" is undefined

6 errors detected in the compilation of "BVHKernel".

RenderSession starting error: 
BVHKernel CUDA program compilation error
[LuxRays][28.684] WARNING: there is a memory leak in LuxRays HardwareD
check out my newest stuff http://q3de.com/research/
portfolio http://q3de.com/


MB Pro i7 2.3Ghz, IrisPro 1.5GB, GTX750m 2GB - High Sierra
Xeon X5650@4Ghz, 2x GTX 770 Phantom RX 5700 - Catalina / High Sierra, Windows 10, Ubuntu 18.04

juangea
Donor
Posts: 197
Joined: Thu Jan 02, 2020 6:23 pm

Re: PATHGPU with OpenCL and CUDA support

Post by juangea » Fri Apr 24, 2020 9:50 am

The out of core is awesome!

This raises a question: if OOC rendering needs buckets, will Light Tracing be compatible with buckets?

I really like buckets, especially the multi-pass version. I'm not sure if it's faster, but it feels faster. Right now it is a bit unstable, though: it can leave incomplete buckets when it finishes.

Anyways, OOC is awesome!

lacilaci
Donor
Posts: 1940
Joined: Fri May 04, 2018 5:16 am

Re: PATHGPU with OpenCL and CUDA support

Post by lacilaci » Fri Apr 24, 2020 11:09 am

juangea wrote:
Fri Apr 24, 2020 9:50 am
The out of core is awesome!

This raises a question: if OOC rendering needs buckets, will Light Tracing be compatible with buckets?

I really like buckets, especially the multi-pass version. I'm not sure if it's faster, but it feels faster. Right now it is a bit unstable, though: it can leave incomplete buckets when it finishes.

Anyways, OOC is awesome!
I think it's gonna be something like Octane, where you don't really have buckets... but at the same time you kinda do :D. I guess light tracing can be sampled separately, as it runs on the CPU and is just added on top of the rendering...?

Dade
Developer
Posts: 4817
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: PATHGPU with OpenCL and CUDA support

Post by Dade » Fri Apr 24, 2020 11:30 am

lacilaci wrote:
Fri Apr 24, 2020 11:09 am
I think it's gonna be something like Octane, where you don't really have buckets... but at the same time you kinda do :D.
Yes, I'm playing with the idea of a special Random/Sobol sampler mode to improve "locality" (i.e. a TILEPATHOCL-like rendering without using TILEPATHOCL).
lacilaci wrote:
Fri Apr 24, 2020 11:09 am
I guess light tracing can be sampled separately, as it runs on the CPU and is just added on top of the rendering...?
Yes, that is the main reason to have a special out-of-core mode for the Random/Sobol sampler instead of just using TILEPATHOCL (where you cannot enable light tracing).
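To make the "locality" idea concrete, here is a minimal Python sketch of a pixel order that finishes one tile before moving to the next, so that consecutive samples touch neighbouring pixels. This is only an illustration and not LuxCore code: the function name `tile_ordered_pixels` and the tile size are hypothetical, and the actual sampler mode may work quite differently.

```python
from typing import Iterator, Tuple


def tile_ordered_pixels(width: int, height: int, tile: int) -> Iterator[Tuple[int, int]]:
    """Yield every (x, y) of the image exactly once, one tile at a time.

    Hypothetical helper for illustration only; it is not part of LuxCore.
    """
    for ty in range(0, height, tile):          # top edge of each tile row
        for tx in range(0, width, tile):       # left edge of each tile
            # Clamp the tile to the image borders so partial tiles work too.
            for y in range(ty, min(ty + tile, height)):
                for x in range(tx, min(tx + tile, width)):
                    yield x, y


# A tiny 5x3 "film" with 2x2 tiles: the first four pixels all belong to
# the top-left tile instead of being spread across a whole scanline.
order = list(tile_ordered_pixels(5, 3, 2))
```

Because the order is just a different traversal of the same pixels, anything accumulated on top of the film (such as a separately sampled light tracing pass) would not need to know about tiles at all.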
Support LuxCoreRender project with salts and bounties

mischterlampe
Posts: 44
Joined: Fri Apr 03, 2020 6:22 pm

Re: PATHGPU with OpenCL and CUDA support

Post by mischterlampe » Sat Apr 25, 2020 11:38 am

This is very good news.

I tried to test it myself, but I get an error.

I downloaded today's CUDA build for Windows.
I chose "OpenCL" as the render device. Is that correct?

When I hit render I get this error:

"Hardware device selection string has the wrong length, must be 4 instead of 2"

I'm using an RTX 2070 and an RTX 2060 SUPER.

The console keeps printing this:

Code:

==================================================
[Engine/Viewport] New session
[Exporter] Creating session
[SDL][136.703] Camera type: perspective
[SDL][136.703] Camera position: Point[14.7279, -6.50511, 8.01803]
[SDL][136.703] Camera target: Point[13.9091, -6.14344, 7.57226]
[SDL][136.703] Camera clipping plane disabled
[SDL][136.703] Light definition: 1690932647784
[SDL][136.703] Light definition: __WORLD_BACKGROUND_LIGHT__
[LuxCore][136.719] Configuration:
[LuxCore][136.719]   path.pathdepth.total = 7
[LuxCore][136.719]   path.pathdepth.diffuse = 5
[LuxCore][136.719]   path.pathdepth.glossy = 5
[LuxCore][136.719]   path.pathdepth.specular = 6
[LuxCore][136.719]   path.hybridbackforward.enable = 0
[LuxCore][136.719]   path.hybridbackforward.partition = 0
[LuxCore][136.735]   path.hybridbackforward.glossinessthreshold = 0.048999998718500137
[LuxCore][136.735]   rtpath.resolutionreduction.preview = 4
[LuxCore][136.735]   rtpath.resolutionreduction.preview.step = 2
[LuxCore][136.735]   rtpath.resolutionreduction = 2
[LuxCore][136.735]   opencl.cpu.use = 0
[LuxCore][136.735]   opencl.gpu.use = 1
[LuxCore][136.735]   opencl.devices.select = "11"
[LuxCore][136.735]   opencl.native.threads.count = 0
[LuxCore][136.735]   renderengine.type = "RTPATHOCL"
[LuxCore][136.735]   sampler.type = "TILEPATHSAMPLER"
[LuxCore][136.735]   film.width = 1526
[LuxCore][136.735]   film.height = 1044
[LuxCore][136.735]   film.filter.type = "BLACKMANHARRIS"
[LuxCore][136.735]   film.filter.width = 1.5
[LuxCore][136.735]   lightstrategy.type = "LOG_POWER"
[LuxCore][136.735]   scene.epsilon.min = 9.9999997473787516e-06
[LuxCore][136.735]   scene.epsilon.max = 0.10000000149011612
[LuxCore][136.735]   film.opencl.enable = 1
[LuxCore][136.735]   film.opencl.device = 0
[LuxCore][136.735]   path.forceblackbackground.enable = 0
[LuxCore][136.735]   renderengine.seed = 1
[LuxCore][136.735]   film.outputs.0.type = "RGB_IMAGEPIPELINE"
[LuxCore][136.735]   film.outputs.0.index = 0
[LuxCore][136.735]   film.outputs.0.filename = "RGB_IMAGEPIPELINE_0.png"
[LuxCore][136.735]   film.outputs.1.type = "ALBEDO"
[LuxCore][136.735]   film.outputs.1.filename = "ALBEDO.exr"
[LuxCore][136.735]   film.outputs.2.type = "AVG_SHADING_NORMAL"
[LuxCore][136.735]   film.outputs.2.filename = "AVG_SHADING_NORMAL.exr"
[LuxCore][136.735]   film.imagepipelines.000.0.type = "NOP"
[LuxCore][136.735]   film.imagepipelines.000.1.type = "TONEMAP_LINEAR"
[LuxCore][136.735]   film.imagepipelines.000.1.scale = 1
[LuxCore][136.735]   film.imagepipelines.000.radiancescales.0.enabled = 1
[LuxCore][136.735]   film.imagepipelines.000.radiancescales.0.globalscale = 1
[LuxCore][136.735]   film.imagepipelines.000.radiancescales.0.rgbscale = 1 1 1
[LuxCore][136.750]   batch.haltspp = 0
[LuxCore][136.750]   batch.halttime = 0
[LuxCore][136.750] File Name Resolver Configuration:
Export took 0.0 s
[LuxCore][136.750] Film resolution: 1526x1044
[SDL][136.750] Film output definition: RGB_IMAGEPIPELINE [image.png]
[SDL][136.750] Image pipeline: film.imagepipelines.000
[SDL][136.750] Image pipeline step 0: NOP
[SDL][136.750] Image pipeline step 1: TONEMAP_LINEAR
[SDL][136.750] Film output definition: RGB_IMAGEPIPELINE [RGB_IMAGEPIPELINE_0.png]
[SDL][136.750] Film output definition: ALBEDO [ALBEDO.exr]
[SDL][136.750] Film output definition: AVG_SHADING_NORMAL [AVG_SHADING_NORMAL.exr]
[LuxRays][136.750] OpenCL Platform 0: NVIDIA Corporation
[LuxRays][136.750] CUDA driver version: 11.0
[LuxRays][136.750] CUDA device count: 2
[LuxRays][136.750] Device 0 name: Native
[LuxRays][136.750] Device 0 type: NATIVE_THREAD
[LuxRays][136.750] Device 0 compute units: 1
[LuxRays][136.750] Device 0 preferred float vector width: 4
[LuxRays][136.750] Device 0 max allocable memory: 17592186044415MBytes
[LuxRays][136.750] Device 0 max allocable memory block size: 17592186044415MBytes
[LuxRays][136.750] Device 0 has out of core memory support: 0
[LuxRays][136.750] Device 1 name: GeForce RTX 2070
[LuxRays][136.750] Device 1 type: OPENCL_GPU
[LuxRays][136.766] Device 1 compute units: 36
[LuxRays][136.766] Device 1 preferred float vector width: 1
[LuxRays][136.766] Device 1 max allocable memory: 8192MBytes
[LuxRays][136.766] Device 1 max allocable memory block size: 2048MBytes
[LuxRays][136.766] Device 1 has out of core memory support: 0
[LuxRays][136.766] Device 2 name: GeForce RTX 2060 SUPER
[LuxRays][136.766] Device 2 type: OPENCL_GPU
[LuxRays][136.766] Device 2 compute units: 34
[LuxRays][136.766] Device 2 preferred float vector width: 1
[LuxRays][136.766] Device 2 max allocable memory: 8192MBytes
[LuxRays][136.766] Device 2 max allocable memory block size: 2048MBytes
[LuxRays][136.766] Device 2 has out of core memory support: 0
[LuxRays][136.766] Device 3 name: GeForce RTX 2070
[LuxRays][136.766] Device 3 type: CUDA_GPU
[LuxRays][136.766] Device 3 compute units: 64
[LuxRays][136.766] Device 3 preferred float vector width: 1
[LuxRays][136.766] Device 3 max allocable memory: 8192MBytes
[LuxRays][136.766] Device 3 max allocable memory block size: 17592186044415MBytes
[LuxRays][136.766] Device 3 has out of core memory support: 1
[LuxRays][136.766] Device 4 name: GeForce RTX 2060 SUPER
[LuxRays][136.766] Device 4 type: CUDA_GPU
[LuxRays][136.766] Device 4 compute units: 64
[LuxRays][136.766] Device 4 preferred float vector width: 1
[LuxRays][136.766] Device 4 max allocable memory: 8192MBytes
[LuxRays][136.766] Device 4 max allocable memory block size: 17592186044415MBytes
[LuxRays][136.766] Device 4 has out of core memory support: 1
ERROR: Hardware device selection string has the wrong length, must be 4 instead of 2
Traceback (most recent call last):
  File "C:\Users\philip\AppData\Roaming\Blender Foundation\Blender\2.83\scripts\addons\BlendLuxCore\engine\viewport.py", line 51, in view_update
    engine.session = engine.exporter.create_session(depsgraph, context, engine=engine)
  File "C:\Users\philip\AppData\Roaming\Blender Foundation\Blender\2.83\scripts\addons\BlendLuxCore\export\__init__.py", line 251, in create_session
    return pyluxcore.RenderSession(renderconfig)
RuntimeError: Hardware device selection string has the wrong length, must be 4 instead of 2
[SDL][136.782] Camera type: perspective
[SDL][136.782] Camera position: Point[0, 0, 0]
[SDL][136.782] Camera target: Point[0, 0, 0]
[SDL][136.782] Camera clipping plane disabled
[LuxCore][136.782] Configuration:
[LuxCore][136.782]   renderengine.type = "RTPATHOCL"
[LuxCore][136.782]   sampler.type = "TILEPATHSAMPLER"
[LuxCore][136.782]   scene.epsilon.min = 9.9999997473787516e-06
[LuxCore][136.782]   scene.epsilon.max = 0.10000000149011612
[LuxCore][136.782] File Name Resolver Configuration:
==================================================

Any idea?

acasta69
Developer
Posts: 361
Joined: Tue Jan 09, 2018 3:45 pm

Re: PATHGPU with OpenCL and CUDA support

Post by acasta69 » Sat Apr 25, 2020 12:04 pm

mischterlampe wrote:
Sat Apr 25, 2020 11:38 am
"Hardware device selection string has the wrong length, must be 4 instead of 2"
It might be that CUDA support also has to be implemented in the BlendLuxCore addon:
viewtopic.php?f=5&t=1950&p=22196#p22196
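For illustration, the log above lists four GPU entries (two OPENCL_GPU and two CUDA_GPU), while `opencl.devices.select` is "11", which would explain the "must be 4 instead of 2" message. Below is a rough Python sketch of how a selection string with one character per device could be built. It assumes that '1' enables and '0' disables a device, that the characters follow the order in which the devices are listed in the log, and that the native CPU device is not counted. The helper `build_selection_string` is hypothetical and is not part of BlendLuxCore or pyluxcore.

```python
from typing import Sequence


def build_selection_string(device_types: Sequence[str], wanted_type: str) -> str:
    """Return one character per device: '1' if its type matches, else '0'.

    Hypothetical helper for illustration; not a BlendLuxCore/pyluxcore API.
    """
    return "".join("1" if t == wanted_type else "0" for t in device_types)


# Device types copied from the log above, in the order they were printed.
# Assumption: the NATIVE_THREAD device is not part of the selection string.
devices = ["OPENCL_GPU", "OPENCL_GPU", "CUDA_GPU", "CUDA_GPU"]

cuda_only = build_selection_string(devices, "CUDA_GPU")
opencl_only = build_selection_string(devices, "OPENCL_GPU")
```

With these assumptions, both strings have length 4, matching the number of GPU entries the error message asks for.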
Support LuxCoreRender project with salts and bounties

Windows 10 64 bits, i7-4770 3.4 GHz, RAM 16 GB, GTX 970 4GB v445.87

Odilkhan Yakubov
Posts: 127
Joined: Fri Jan 26, 2018 10:07 pm
Location: Tashkent, Uzbekistan

Re: PATHGPU with OpenCL and CUDA support

Post by Odilkhan Yakubov » Mon Apr 27, 2020 10:33 am

Wow! Very promising. Is it possible to test it with a GTX 1060 6GB? Regarding bucket rendering, I'd like to have it too. It is faster than "Progressive Refine", as Cycles named it a long time ago. ;)

B.Y.O.B.
Developer
Posts: 3872
Joined: Mon Dec 04, 2017 10:08 pm
Location: Germany

Re: PATHGPU with OpenCL and CUDA support

Post by B.Y.O.B. » Mon Apr 27, 2020 10:41 am

Odilkhan Yakubov wrote:
Mon Apr 27, 2020 10:33 am
Wow! Very promising. Is it possible to test it with a GTX 1060 6GB? Regarding bucket rendering, I'd like to have it too. It is faster than "Progressive Refine", as Cycles named it a long time ago. ;)
Just because progressive refine is super slow in Cycles doesn't mean that that's an inherent property of progressive refine compared to buckets.

juangea
Donor
Posts: 197
Joined: Thu Jan 02, 2020 6:23 pm

Re: PATHGPU with OpenCL and CUDA support

Post by juangea » Mon Apr 27, 2020 10:57 am

My feeling is that it's also a bit faster in Lux, and the progressive tile mode seemed even more convenient, but it has some flaws right now. However, I don't have a proper comparison, so I can't be completely sure that tile rendering is actually faster.

Anyway, I like it more in general, but we don't use it because it has some problems, like leaving tiles or parts of tiles unfinished and, of course, not supporting Light Tracing :)

Odilkhan Yakubov
Posts: 127
Joined: Fri Jan 26, 2018 10:07 pm
Location: Tashkent, Uzbekistan

Re: PATHGPU with OpenCL and CUDA support

Post by Odilkhan Yakubov » Mon Apr 27, 2020 12:20 pm

B.Y.O.B. wrote:
Mon Apr 27, 2020 10:41 am
Odilkhan Yakubov wrote:
Mon Apr 27, 2020 10:33 am
Wow! Very promising. Is it possible to test it with a GTX 1060 6GB? Regarding bucket rendering, I'd like to have it too. It is faster than "Progressive Refine", as Cycles named it a long time ago. ;)
Just because progressive refine is super slow in Cycles doesn't mean that that's an inherent property of progressive refine compared to buckets.
Yes, of course. It was just a reminder ;)
