PATHGPU with OpenCL and CUDA support

Discussion related to the LuxCore functionality, implementations and API.
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: PATHGPU with OpenCL and CUDA support

Post by Dade »

I have merged the cuda_rendering branch with the master: the CUDA support is officially on.
Support LuxCoreRender project with salts and bounties

Re: PATHGPU with OpenCL and CUDA support

Post by Dade »

Some interesting new CUDA profiling data. This is a rendering of a scene with ~14,000,000 triangles (a hair scene with high tessellation for the test):

Screenshot from 2020-04-22 16-42-53.png

87.8% of the time is spent running the ray/triangle intersection kernel: RTX can potentially destroy this time.

This is LuxCore2.1Benchmark scene (~1,400,000 triangles):

Screenshot from 2020-04-22 16-46-10.png

Only 17.5% of the time is spent running the ray/triangle intersection kernel: RTX can offer very little help here.

Short version: the RTX importance will scale up with the scene geometry complexity and down with the shading complexity. Most modern scenes are usually in a 40%/60% or 60%/40% ratio range.
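
The scaling described above can be sketched with Amdahl's law: only the ray/triangle intersection share of the frame time gets faster. This is a minimal sketch, not a measurement. The two fractions come from the profiles above, while the 10x kernel speedup is a made-up figure used purely for illustration, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(intersection_fraction, intersection_speedup):
    """Amdahl's law: only the intersection share of the frame time gets faster."""
    remaining = (1.0 - intersection_fraction) + intersection_fraction / intersection_speedup
    return 1.0 / remaining


# Fractions taken from the two profiles above.
# ASSUMPTION: the 10x intersection kernel speedup is invented for illustration.
ASSUMED_KERNEL_SPEEDUP = 10.0

hair_scene = overall_speedup(0.878, ASSUMED_KERNEL_SPEEDUP)
benchmark_scene = overall_speedup(0.175, ASSUMED_KERNEL_SPEEDUP)

print(f"hair scene: {hair_scene:.2f}x, benchmark scene: {benchmark_scene:.2f}x")
```

Under that assumed kernel speedup, the hair scene would render several times faster overall, while the benchmark scene would gain less than 20%. Even an infinitely fast intersection kernel could not push the benchmark scene past 1 / (1 - 0.175), roughly 1.2x.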
Sharlybg
Donor
Posts: 3101
Joined: Mon Dec 04, 2017 10:11 pm
Location: Ivory Coast

Re: PATHGPU with OpenCL and CUDA support

Post by Sharlybg »

Dade wrote: Short version: the RTX importance will scale up with the scene geometry complexity and down with the shading complexity. Most modern scenes are usually in a 40%/60% or 60%/40% ratio range.
Maybe the next GPU iterations will include specific hardware acceleration for shading complexity.
Dade wrote: I have merged the cuda_rendering branch with the master: the CUDA support is officially on.
Is anything missing, or is there anything we shouldn't expect from this first CUDA support? You are so fast it is barely believable.

Portfolio : https://www.behance.net/DRAVIA
lacilaci
Donor
Posts: 1969
Joined: Fri May 04, 2018 5:16 am

Re: PATHGPU with OpenCL and CUDA support

Post by lacilaci »

Dade wrote: Wed Apr 22, 2020 3:16 pm Some interesting new CUDA profiling data. This is a rendering of a scene with ~14,000,000 triangles (a hair scene with high tessellation for the test):


Screenshot from 2020-04-22 16-42-53.png


87.8% of the time is spent running the ray/triangle intersection kernel: RTX can potentially destroy this time.

This is LuxCore2.1Benchmark scene (~1,400,000 triangles):


Screenshot from 2020-04-22 16-46-10.png


Only 17.5% of the time is spent running the ray/triangle intersection kernel: RTX can offer very little help here.

Short version: the RTX importance will scale up with the scene geometry complexity and down with the shading complexity. Most modern scenes are usually in a 40%/60% or 60%/40% ratio range.
From personal experience in my archviz/productviz work, Cycles on RTX was always at least 2x faster than on CUDA, and from what I have gathered about Octane, it should be 2-3x the performance of CUDA as well.

Sure, a scene with 500 polygons won't benefit, but we don't do such things in 2020, do we?

Re: PATHGPU with OpenCL and CUDA support

Post by Sharlybg »

Sure, RTX + out of core will rock on heavy projects, e.g. those with lots of subdivision + displacement + heavy vegetation + megascans. ;)

Re: PATHGPU with OpenCL and CUDA support

Post by Dade »

Sharlybg wrote: Wed Apr 22, 2020 3:39 pm Is anything missing, or is there anything we shouldn't expect from this first CUDA support? You are so fast it is barely believable.
As far as I know, everything that has been tested works (i.e. anything not yet tested might not work).

Re: PATHGPU with OpenCL and CUDA support

Post by Dade »

I .. couldn't ... resist ...

Very first out of core rendering: a 10+GB scene rendered on an 8GB card (the 8GB is also shared with the OS, applications, etc.; it is not a dedicated compute-only GPU):

outofcore.jpg

The scene has more than 8GB of artificially upscaled textures, to use more RAM:

[LuxRays][26.806] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMap descriptions buffer size: 784bytes (OUT OF CORE)
[LuxRays][26.809] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 4181238Kbytes (OUT OF CORE)
[LuxRays][36.726] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 3806677Kbytes (OUT OF CORE)
[LuxRays][46.090] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 551053Kbytes (OUT OF CORE)
Out of core rendering is going to be a delicate beast (notice how I'm using tile rendering to get good sample locality), but it still looks like black magic.
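
As a quick sanity check of the "more than 8GB" figure, the out of core ImageMaps buffers in the log above can be summed. This is only a sketch: the regular expression and the helper name `out_of_core_kbytes` are mine, and it assumes "Kbytes" in the log means 1024 bytes.

```python
import re

# The four log lines quoted above, copied verbatim.
LOG = """\
[LuxRays][26.806] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMap descriptions buffer size: 784bytes (OUT OF CORE)
[LuxRays][26.809] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 4181238Kbytes (OUT OF CORE)
[LuxRays][36.726] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 3806677Kbytes (OUT OF CORE)
[LuxRays][46.090] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 551053Kbytes (OUT OF CORE)
"""

# Matches only the "ImageMaps" buffers, not the small descriptions buffer.
PATTERN = re.compile(r"ImageMaps buffer size: (\d+)Kbytes \(OUT OF CORE\)")


def out_of_core_kbytes(log_text):
    """Sum the sizes (in Kbytes) of all out of core ImageMaps buffers."""
    return sum(int(match.group(1)) for match in PATTERN.finditer(log_text))


total_kb = out_of_core_kbytes(LOG)
# ASSUMPTION: 1 Kbyte = 1024 bytes, so 1024**2 Kbytes = 1 GiB.
print(f"{total_kb} Kbytes = {total_kb / 1024**2:.2f} GiB")
```

The three texture buffers add up to 8538968 Kbytes, i.e. a bit over 8GB of image maps alone, before geometry and the rest of the scene.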

Re: PATHGPU with OpenCL and CUDA support

Post by lacilaci »

Dade wrote: Thu Apr 23, 2020 3:09 pm I .. couldn't ... resist ...

Very first out of core rendering: a 10+GB scene rendered on an 8GB card (the 8GB is also shared with the OS, applications, etc.; it is not a dedicated compute-only GPU):


outofcore.jpg


The scene has more than 8GB of artificially upscaled textures, to use more RAM:

[LuxRays][26.806] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMap descriptions buffer size: 784bytes (OUT OF CORE)
[LuxRays][26.809] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 4181238Kbytes (OUT OF CORE)
[LuxRays][36.726] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 3806677Kbytes (OUT OF CORE)
[LuxRays][46.090] [Device GeForce RTX 2070 SUPER CUDAIntersect] ImageMaps buffer size: 551053Kbytes (OUT OF CORE)
Out of core rendering is going to be a delicate beast (notice how I'm using tile rendering to have good sample locality) but still looks like black magic.
Pretty cool. What's the performance hit though?

Re: PATHGPU with OpenCL and CUDA support

Post by Sharlybg »

Dade wrote: Thu Apr 23, 2020 3:09 pm I .. couldn't ... resist ...

Very first out of core rendering: a 10+GB scene rendered on an 8GB card (the 8GB is also shared with the OS, applications, etc.; it is not a dedicated compute-only GPU):
Euh what :shock:

You're doing out of core right now and it works? Only for Nvidia, I guess :mrgreen:

Re: PATHGPU with OpenCL and CUDA support

Post by Dade »

lacilaci wrote: Thu Apr 23, 2020 3:16 pm Pretty cool. What's the performance hit though?
Quite big: about 50% slower in this first test, but it is all about how much "locality" you have (i.e. GPU RAM is used like a cache, so it is all about the cache hit rate). I'm thinking of some special SOBOL option, dedicated to out of core rendering, to improve the "locality" of the samples.

At the end of the day, as usual, there are no free lunches: you have to give something (rendering time) to get something (bigger scene rendering).

P.S. Indeed, it is CUDA only stuff.
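
To illustrate why sample "locality" matters when GPU RAM behaves like a cache, here is a toy simulation. It is not how LuxCore is implemented: the LRU policy, the image, tile and page sizes, and the rule that each 8x8 pixel block reads its own texture page are all assumptions chosen to keep the example small.

```python
import random
from collections import OrderedDict


def hit_rate(accesses, cache_pages):
    """Fraction of accesses served from a small LRU cache (the toy 'GPU RAM')."""
    cache = OrderedDict()
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True             # "upload" the page from host memory
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / len(accesses)


WIDTH = HEIGHT = 64  # toy image size (assumption)
TILE = 16            # toy tile size (assumption)
PAGE = 8             # each 8x8 pixel block reads its own texture page (assumption)


def page_of(x, y):
    return (x // PAGE, y // PAGE)


# Tile order: finish every pixel of one tile before moving to the next tile.
tiled = [page_of(tx + x, ty + y)
         for ty in range(0, HEIGHT, TILE)
         for tx in range(0, WIDTH, TILE)
         for y in range(TILE)
         for x in range(TILE)]

# Scattered order: the same accesses, shuffled with a fixed seed.
scattered = tiled[:]
random.Random(0).shuffle(scattered)

tiled_rate = hit_rate(tiled, cache_pages=8)
scattered_rate = hit_rate(scattered, cache_pages=8)
print(f"tiled: {tiled_rate:.3f}, scattered: {scattered_rate:.3f}")
```

In this toy setup the tile order misses each page only once, because the four pages of a tile fit in the cache together, while the scattered order keeps evicting pages it will need again shortly afterwards.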