Fully empty scene takes 1.07Gb of VRam... broken?

juangea
Donor
Posts: 332
Joined: Thu Jan 02, 2020 6:23 pm

Fully empty scene takes 1.07Gb of VRam... broken?

Post by juangea »

I was debugging one of our scenes (the exterior I showed before) when I noticed something weird: after emptying and cleaning EVERYTHING in the scene, it still takes 1.07 GB of VRAM to render.

A new scene takes 270 MB, but for some reason this one takes 1.07 GB. No caches are enabled and I can't see anything unusual in it, so I don't understand what's happening. Here is the file:
FULLY_EMPTY_1Gb_VRAM.zip
(178.68 KiB) Downloaded 135 times
There may be something wrong with my computer, but a new Blender scene takes just 270 MB, so something must be going on in this file.

P.S.: Blender 2.92 - LuxCore 2.5 Release - RTX2080Ti
Sharlybg
Donor
Posts: 3101
Joined: Mon Dec 04, 2017 10:11 pm
Location: Ivory Coast

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by Sharlybg »

I remember that yesterday, enabling "pack all data into blend file" also packed the LOL thumbnails.
I'd like to test it before confirming.
Support LuxCoreRender project with salts and bounties

Portfolio : https://www.behance.net/DRAVIA
juangea
Donor
Posts: 332
Joined: Thu Jan 02, 2020 6:23 pm

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by juangea »

But I’m talking about VRAM during render, not file size. Besides, I deleted everything. In that version some como nodes were still there, but on Discord I uploaded another version without them and it still uses 1 GB of VRAM. Something is happening there, but I don’t know what :S
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by Dade »

But where does Blender get the VRAM usage from? I doubt it asks BlendLuxCore for it, so it may be a totally wrong number. If I render your scene with LuxCoreUI, it takes about 600 MB (about the same as the 500 MB required to render the Cornell box test scene without denoiser AOVs):
vram.jpg
0.5 GB is about the fixed cost of the frame buffers, AOVs, and half a million GPU thread states.
juangea
Donor
Posts: 332
Joined: Thu Jan 02, 2020 6:23 pm

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by juangea »

Then it must be something related to my system: others have tested it on their systems and say it takes 500 or 600 MB for them, but on my system the fully empty scene takes 1 GB. Here is a picture:
blender_vEgNXsOfXQ.png
And here is a picture of a newly created .blend:
blender_tCAflhvApM.png
P.S.: The total amount of VRAM (4 GB) shown in the second picture is because I left the GTX 970 enabled, so the limit is 4 GB, but the result is the same with the RTX 2080 Ti alone.
B.Y.O.B.
Developer
Posts: 4146
Joined: Mon Dec 04, 2017 10:08 pm
Location: Germany

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by B.Y.O.B. »

Dade wrote: Sat May 08, 2021 12:02 pm But where does Blender get the VRAM usage from? I doubt it asks BlendLuxCore for it. It may be a totally wrong number.
It is retrieved from LuxCore stats: https://github.com/LuxCoreRender/BlendL ... cs.py#L102
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by Dade »

juangea wrote: Sat May 08, 2021 12:49 pm Then it must be something related to my system: others have tested it on their systems and say it takes 500 or 600 MB for them, but on my system the fully empty scene takes 1 GB. Here is a picture:
Post the complete console output of a render here. It includes a lot of detail about the amount of memory allocated on the GPUs.
juangea
Donor
Posts: 332
Joined: Thu Jan 02, 2020 6:23 pm

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by juangea »

Ok, I’m out but I’ll post it later 👍
juangea
Donor
Posts: 332
Joined: Thu Jan 02, 2020 6:23 pm

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by juangea »

In this test, 959 MB was used. Here is the log:

Read blend: X:\cr_7villas\Blender\Scenes\7villas_base_test_EMPTY.blend
==================================================
[Engine/Final] Rendering layer "View Layer"
[Exporter] Creating session
[SDL][31.906] Define ImageMap: NamedObject
[SDL][31.906] Camera type: perspective
[SDL][31.906] Camera position: Point[20.7337, 72.6388, 26.079]
[SDL][31.906] Camera target: Point[20.6341, 73.5867, 25.7766]
[SDL][31.906] Camera clipping plane disabled
[SDL][31.906] Light definition: __WORLD_BACKGROUND_LIGHT__
[LuxCore][31.921] Configuration:
[LuxCore][31.921]   path.pathdepth.total = 9
[LuxCore][31.921]   path.pathdepth.diffuse = 6
[LuxCore][31.921]   path.pathdepth.glossy = 6
[LuxCore][31.921]   path.pathdepth.specular = 8
[LuxCore][31.921]   path.hybridbackforward.enable = 0
[LuxCore][31.921]   path.hybridbackforward.partition = 0.9
[LuxCore][31.921]   path.hybridbackforward.glossinessthreshold = 0.049
[LuxCore][31.921]   opencl.cpu.use = 0
[LuxCore][31.921]   opencl.gpu.use = 1
[LuxCore][31.921]   opencl.devices.select = "00010"
[LuxCore][31.921]   film.noiseestimation.warmup = 8
[LuxCore][31.921]   film.noiseestimation.step = 32
[LuxCore][31.921]   sampler.sobol.adaptive.strength = 0.95
[LuxCore][31.921]   sampler.sobol.bucketsize = 1
[LuxCore][31.921]   sampler.sobol.tilesize = 16
[LuxCore][31.921]   sampler.sobol.supersampling = 1
[LuxCore][31.921]   sampler.sobol.overlapping = 32
[LuxCore][31.921]   renderengine.type = "PATHOCL"
[LuxCore][31.921]   sampler.type = "SOBOL"
[LuxCore][31.921]   film.width = 1920
[LuxCore][31.921]   film.height = 1080
[LuxCore][31.921]   film.filter.type = "NONE"
[LuxCore][31.921]   film.filter.width = 1.5
[LuxCore][31.921]   lightstrategy.type = "LOG_POWER"
[LuxCore][31.921]   scene.epsilon.min = 1e-05
[LuxCore][31.921]   scene.epsilon.max = 0.1
[LuxCore][31.921]   film.opencl.enable = 1
[LuxCore][31.921]   film.opencl.device = 3
[LuxCore][31.921]   path.forceblackbackground.enable = 0
[LuxCore][31.921]   renderengine.seed = 1
[LuxCore][31.921]   film.outputs.0.type = "RGB_IMAGEPIPELINE"
[LuxCore][31.921]   film.outputs.0.index = 0
[LuxCore][31.921]   film.outputs.0.filename = "RGB_IMAGEPIPELINE_0.png"
[LuxCore][31.921]   film.outputs.1.type = "RGB_IMAGEPIPELINE"
[LuxCore][31.921]   film.outputs.1.index = 1
[LuxCore][31.921]   film.outputs.1.filename = "RGB_IMAGEPIPELINE_1.png"
[LuxCore][31.921]   film.imagepipelines.001.0.type = "NOP"
[LuxCore][31.921]   film.imagepipelines.001.1.type = "TONEMAP_LINEAR"
[LuxCore][31.921]   film.imagepipelines.001.1.scale = 1
[LuxCore][31.921]   film.imagepipelines.001.radiancescales.0.enabled = 1
[LuxCore][31.921]   film.imagepipelines.001.radiancescales.0.globalscale = 1
[LuxCore][31.921]   film.imagepipelines.001.radiancescales.0.rgbscale = 1 1 1
[LuxCore][31.921]   film.imagepipelines.001.2.type = "GAMMA_CORRECTION"
[LuxCore][31.921]   film.imagepipelines.001.2.value = 2.2
[LuxCore][31.921]   film.noiseestimation.index = 1
[LuxCore][31.921]   film.imagepipelines.000.0.type = "NOP"
[LuxCore][31.921]   film.imagepipelines.000.1.type = "TONEMAP_LINEAR"
[LuxCore][31.921]   film.imagepipelines.000.1.scale = 1
[LuxCore][31.921]   film.imagepipelines.000.radiancescales.0.enabled = 1
[LuxCore][31.921]   film.imagepipelines.000.radiancescales.0.globalscale = 1
[LuxCore][31.921]   film.imagepipelines.000.radiancescales.0.rgbscale = 1 1 1
[LuxCore][31.921]   batch.haltspp = 512 0
[LuxCore][31.921]   batch.halttime = 0
[LuxCore][31.921] File Name Resolver Configuration:
Export took 0.0 s
[LuxCore][31.937] Film resolution: 1920x1080
[SDL][31.937] Film output definition: RGB_IMAGEPIPELINE [image.png]
[SDL][31.937] Image pipeline: film.imagepipelines.000
[SDL][31.937] Image pipeline step 0: NOP
[SDL][31.937] Image pipeline step 1: TONEMAP_LINEAR
[SDL][31.937] Image pipeline: film.imagepipelines.001
[SDL][31.937] Image pipeline step 0: NOP
[SDL][31.937] Image pipeline step 1: TONEMAP_LINEAR
[SDL][31.937] Image pipeline step 2: GAMMA_CORRECTION
[SDL][31.937] Film output definition: RGB_IMAGEPIPELINE [RGB_IMAGEPIPELINE_0.png]
[SDL][31.937] Film output definition: RGB_IMAGEPIPELINE [RGB_IMAGEPIPELINE_1.png]
[LuxRays][31.937] OpenCL support: enabled
[LuxRays][31.937] OpenCL Platform 0: NVIDIA CUDA
[LuxRays][31.953] OpenCL Platform 1: Intel(R) OpenCL
[LuxRays][31.953] CUDA support: enabled
[LuxRays][31.953] CUDA support: available
[LuxRays][31.953] CUDA driver version: 11.30
[LuxRays][31.953] CUDA device count: 2
[LuxRays][31.953] Optix support: available
[LuxRays][31.953] Device 0 name: Native
[LuxRays][31.953] Device 0 type: NATIVE_THREAD
[LuxRays][31.953] Device 0 compute units: 1
[LuxRays][31.968] Device 0 preferred float vector width: 4
[LuxRays][31.968] Device 0 max allocable memory: 17592186044415MBytes
[LuxRays][31.968] Device 0 max allocable memory block size: 17592186044415MBytes
[LuxRays][31.968] Device 0 has out of core memory support: 0
[LuxRays][31.968] Device 1 name: NVIDIA GeForce RTX 2080 Ti
[LuxRays][31.968] Device 1 type: OPENCL_GPU
[LuxRays][31.968] Device 1 compute units: 68
[LuxRays][31.968] Device 1 preferred float vector width: 1
[LuxRays][31.968] Device 1 max allocable memory: 11264MBytes
[LuxRays][31.968] Device 1 max allocable memory block size: 2816MBytes
[LuxRays][31.968] Device 1 has out of core memory support: 0
[LuxRays][31.968] Device 2 name: NVIDIA GeForce GTX 970
[LuxRays][31.968] Device 2 type: OPENCL_GPU
[LuxRays][31.968] Device 2 compute units: 13
[LuxRays][31.968] Device 2 preferred float vector width: 1
[LuxRays][31.984] Device 2 max allocable memory: 4096MBytes
[LuxRays][31.984] Device 2 max allocable memory block size: 1024MBytes
[LuxRays][31.984] Device 2 has out of core memory support: 0
[LuxRays][31.984] Device 3 name: Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
[LuxRays][31.984] Device 3 type: OPENCL_CPU
[LuxRays][31.984] Device 3 compute units: 16
[LuxRays][31.984] Device 3 preferred float vector width: 8
[LuxRays][31.984] Device 3 max allocable memory: 65446MBytes
[LuxRays][31.984] Device 3 max allocable memory block size: 16361MBytes
[LuxRays][31.984] Device 3 has out of core memory support: 0
[LuxRays][31.984] Device 4 name: NVIDIA GeForce RTX 2080 Ti
[LuxRays][31.984] Device 4 type: CUDA_GPU
[LuxRays][31.984] Device 4 compute units: 64
[LuxRays][32.000] Device 4 preferred float vector width: 1
[LuxRays][32.000] Device 4 max allocable memory: 11264MBytes
[LuxRays][32.000] Device 4 max allocable memory block size: 17592186044415MBytes
[LuxRays][32.000] Device 4 has out of core memory support: 1
[LuxRays][32.000] Device 4 CUDA compute capability: 7.5
[LuxRays][32.000] Device 5 name: NVIDIA GeForce GTX 970
[LuxRays][32.000] Device 5 type: CUDA_GPU
[LuxRays][32.000] Device 5 compute units: 128
[LuxRays][32.000] Device 5 preferred float vector width: 1
[LuxRays][32.000] Device 5 max allocable memory: 4096MBytes
[LuxRays][32.000] Device 5 max allocable memory block size: 17592186044415MBytes
[LuxRays][32.000] Device 5 has out of core memory support: 1
[LuxRays][32.000] Device 5 CUDA compute capability: 5.2
[LuxRays][32.000] Creating 17 intersection device(s)
[LuxRays][32.015] Allocating intersection device 0: NVIDIA GeForce RTX 2080 Ti (Type = CUDA_GPU)
[LuxRays][32.171] [Optix][4][KNOBS] All knobs on default.

[LuxRays][32.281] [Optix][4][DISK CACHE] Opened database: "C:\Users\jgea\AppData\Local\NVIDIA\OptixCache\cache7.db"
[LuxRays][32.281] [Optix][4][DISK CACHE]     Cache data size: "23.6 KiB"
[LuxRays][32.281] Allocating intersection device 1: Native (Type = NATIVE_THREAD)
[LuxRays][32.281] Allocating intersection device 2: Native (Type = NATIVE_THREAD)
[LuxRays][32.281] Allocating intersection device 3: Native (Type = NATIVE_THREAD)
[LuxRays][32.281] Allocating intersection device 4: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 5: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 6: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 7: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 8: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 9: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 10: Native (Type = NATIVE_THREAD)
[LuxRays][32.296] Allocating intersection device 11: Native (Type = NATIVE_THREAD)
[LuxRays][32.312] Allocating intersection device 12: Native (Type = NATIVE_THREAD)
[LuxRays][32.312] Allocating intersection device 13: Native (Type = NATIVE_THREAD)
[LuxRays][32.312] Allocating intersection device 14: Native (Type = NATIVE_THREAD)
[LuxRays][32.312] Allocating intersection device 15: Native (Type = NATIVE_THREAD)
[LuxRays][32.312] Allocating intersection device 16: Native (Type = NATIVE_THREAD)
[LuxCore][32.312] CUDA devices used:
[LuxCore][32.312] [NVIDIA GeForce RTX 2080 Ti CUDAIntersect (Optix enabled: 1)]
[LuxCore][32.312] OpenCL devices used:
[LuxCore][32.312] Native devices used: 16
[LuxCore][32.312] Configuring 1 OpenCL render threads
[LuxCore][32.328] Configuring 16 native render threads
[LuxRays][32.328] Preprocessing DataSet
[LuxRays][32.328] Total vertex count: 0
[LuxRays][32.328] Total triangle count: 0
[LuxRays][32.328] Preprocessing DataSet done
[LuxRays][32.328] Adding DataSet accelerator: OPTIX
[LuxRays][32.328] Total vertex count: 0
[LuxRays][32.328] Total triangle count: 0
[LuxRays][32.328] Empty Optix accelerator
[LuxRays][32.328] Adding DataSet accelerator: EMBREE
[LuxRays][32.343] Total vertex count: 0
[LuxRays][32.343] Total triangle count: 0
[LuxRays][32.343] EmbreeAccel build time: 0ms
[LuxRays][32.343] Building Optix accelerator
[LuxRays][32.343] Optix accelerator is empty
[LuxRays][32.343] [OptixEmptyAccelKernel] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D PARAM_RAY_EPSILON_MIN=1e-05f -D PARAM_RAY_EPSILON_MAX=0.1f -D LUXRAYS_CUDA_DEVICE -D LUXRAYS_OS_WINDOWS --use_fast_math
[LuxRays][32.343] [OptixEmptyAccelKernel] Compiling kernels
[LuxRays][32.359] [OptixEmptyAccelKernel] Program cached
[LuxCore][32.421] [PathOCLRenderEngine] OpenCL task count: 524288
[LuxCore][32.421] [PathOCLBaseRenderEngine] OpenCL max. page memory size: 18014398509481983Kbytes
[LuxCore][32.421] Compile Geometry
[LuxCore][32.421] Scene geometry compilation time: 0ms
[LuxCore][32.421] Compile 0 Textures
[LuxCore][32.421] Texture evaluation ops count: 0
[LuxCore][32.437] Texture evaluation max. stack size: 0
[LuxCore][32.437] Textures compilation time: 16ms
[LuxCore][32.437] Compile 0 Materials
[LuxCore][32.437] Material evaluation ops count: 0
[LuxCore][32.437] Material evaluation max. stack size: 0
[LuxCore][32.437] Material compilation time: 0ms
[LuxCore][32.437] Compile Lights
[LuxCore][32.437] Lights compilation time: 0ms
[LuxCore][32.437] Compile ImageMaps
[LuxCore][32.437] Image maps page(s) count: 1
[LuxCore][32.437]  RGB channel page 0 size: 3072Kbytes
[LuxCore][32.437] Image maps compilation time: 0ms
[LuxCore][32.437] Always enabled OpenCL code:
[LuxCore][32.437] Compile Geometry
[LuxCore][32.437] Scene geometry compilation time: 0ms
[LuxCore][32.437] Compile 0 Textures
[LuxCore][32.437] Texture evaluation ops count: 0
[LuxCore][32.437] Texture evaluation max. stack size: 0
[LuxCore][32.437] Textures compilation time: 0ms
[LuxCore][32.437] Compile 0 Materials
[LuxCore][32.453] Material evaluation ops count: 0
[LuxCore][32.453] Material evaluation max. stack size: 0
[LuxCore][32.453] Material compilation time: 0ms
[LuxCore][32.453] Compile Lights
[LuxCore][32.453] Lights compilation time: 0ms
[LuxCore][32.453] Compile ImageMaps
[LuxCore][32.453] Image maps page(s) count: 1
[LuxCore][32.453]  RGB channel page 0 size: 3072Kbytes
[LuxCore][32.453] Image maps compilation time: 0ms
[LuxCore][32.453] Starting 1 OpenCL render threads
[LuxRays][32.500] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] RADIANCE_PER_PIXEL_NORMALIZEDs[0] buffer size: 32400Kbytes
[LuxRays][32.500] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] NOISE buffer size: 8100Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] RADIANCE_PER_PIXEL_NORMALIZEDs[0] buffer size: 32400Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Camera buffer size: 5492bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] ImageMap descriptions buffer size: 32bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] ImageMaps buffer size: 3072Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Lights buffer size: 344bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Env. light indices buffer size: 4bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Env. light distributions buffer size: 1028Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] LightsDistribution buffer size: 16bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] InfiniteLightSourcesDistribution buffer size: 16bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Ray buffer size: 24576Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] RayHit buffer size: 10240Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] GPUTaskConfiguration buffer size: 328bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] GPUTask buffer size: 339968Kbytes
[LuxRays][32.562] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] GPUTaskDirectLight buffer size: 30720Kbytes
[LuxRays][32.562] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] GPUTaskState buffer size: 200704Kbytes
[LuxRays][32.562] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] GPUTask Stats buffer size: 2048Kbytes
[LuxRays][32.562] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] SamplerSharedData buffer size: 8111Kbytes
[LuxCore][32.578] [PathOCLBaseRenderThread::0] Size of a Sample: 40bytes
[LuxRays][32.578] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Sample buffer size: 20480Kbytes
[LuxCore][32.578] [PathOCLBaseRenderThread::0] Size of a SampleData: 8bytes
[LuxRays][32.578] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] SampleData buffer size: 4096Kbytes
[LuxCore][32.578] [PathOCLBaseRenderThread::0] Size of a SampleResult: 428bytes
[LuxRays][32.578] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] SampleResult buffer size: 219136Kbytes
[LuxRays][32.578] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] PathInfo buffer size: 55296Kbytes
[LuxRays][32.593] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] DirectLightVolumeInfo buffer size: 22528Kbytes
[LuxRays][32.593] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Pixel Filter Distribution buffer size: 33Kbytes
[LuxCore][32.593] [PathOCLBaseRenderThread::0] Compiling kernels
[LuxRays][32.593] [PathOCL kernel] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D RENDER_ENGINE_PATHOCL -D PARAM_RAY_EPSILON_MIN=1e-05f -D PARAM_RAY_EPSILON_MAX=0.1f -D LUXRAYS_CUDA_DEVICE -D LUXRAYS_OS_WINDOWS --use_fast_math
[LuxRays][32.593] [PathOCL kernel] Compiling kernels
[LuxRays][32.812] [PathOCL kernel] Program cached
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling Film_Clear Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling InitSeed Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling Init Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_RT_NEXT_VERTEX Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_HIT_NOTHING Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_HIT_OBJECT Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_RT_DL Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_DL_ILLUMINATE Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_DL_SAMPLE_BSDF Kernel
[LuxCore][32.812] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_GENERATE_NEXT_VERTEX_RAY Kernel
[LuxCore][32.828] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_SPLAT_SAMPLE Kernel
[LuxCore][32.828] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_NEXT_SAMPLE Kernel
[LuxCore][32.828] [PathOCLBaseRenderThread::0] Compiling AdvancePaths_MK_GENERATE_CAMERA_RAY Kernel
[LuxCore][32.828] [PathOCLBaseRenderThread::0] AdvancePaths_MK_* workgroup size: 32
[LuxCore][32.828] [PathOCLBaseRenderThread::0] Kernels compilation time: 235ms
[LuxCore][32.828] Starting 16 native render threads
Session started in 0.6 s
[LuxCore][33.000] Film hardware image pipeline
[LuxCore][33.812] Film hardware device used: NVIDIA GeForce RTX 2080 Ti CUDAIntersect (Type: CUDA_GPU)
[LuxRays][33.812] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] IMAGEPIPELINE buffer size: 24300Kbytes
[LuxRays][33.984] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Merge buffer size: 32400Kbytes
[LuxRays][33.984] [MergeSampleBuffersOCL] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_CUDA_DEVICE -D LUXRAYS_OS_WINDOWS --use_fast_math
[LuxRays][33.984] [MergeSampleBuffersOCL] Compiling kernels
[LuxRays][34.015] [MergeSampleBuffersOCL] Program cached
[LuxCore][34.015] [MergeSampleBuffersOCL] Compiling Film_MergeBufferInitialize Kernel
[LuxCore][34.015] [MergeSampleBuffersOCL] Compiling Film_MergeRADIANCE_PER_PIXEL_NORMALIZED Kernel
[LuxCore][34.031] [MergeSampleBuffersOCL] Compiling Film_MergeRADIANCE_PER_SCREEN_NORMALIZED Kernel
[LuxCore][34.031] [MergeSampleBuffersOCL] Compiling Film_MergeBufferFinalize Kernel
[LuxCore][34.031] [MergeSampleBuffersOCL] Kernels compilation time: 46ms
[LuxRays][34.531] [LinearToneMap] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_CUDA_DEVICE -D LUXRAYS_OS_WINDOWS --use_fast_math
[LuxRays][34.531] [LinearToneMap] Compiling kernels
[LuxRays][34.562] [LinearToneMap] Program cached
[LuxCore][34.562] [AutoLinearToneMap] Compiling LinearToneMap_Apply Kernel
[LuxCore][34.578] [LinearToneMap] Kernels compilation time: 47ms
[LuxRays][36.218] [LinearToneMap] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_CUDA_DEVICE -D LUXRAYS_OS_WINDOWS --use_fast_math
[LuxRays][36.218] [LinearToneMap] Compiling kernels
[LuxRays][36.234] [LinearToneMap] Program cached
[LuxCore][36.234] [AutoLinearToneMap] Compiling LinearToneMap_Apply Kernel
[LuxCore][36.234] [LinearToneMap] Kernels compilation time: 16ms
[LuxRays][36.234] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Gamma table buffer size: 16Kbytes
[LuxRays][36.234] [GammaCorrectionPlugin] Compiler options: -D LUXRAYS_OPENCL_KERNEL -D SLG_OPENCL_KERNEL -D LUXRAYS_CUDA_DEVICE -D LUXRAYS_OS_WINDOWS --use_fast_math
[LuxRays][36.250] [GammaCorrectionPlugin] Compiling kernels
[LuxRays][36.265] [GammaCorrectionPlugin] Program cached
[LuxCore][36.265] [GammaCorrectionPlugin] Compiling GammaCorrectionPlugin_Apply Kernel
[LuxCore][36.281] [GammaCorrectionPlugin] Kernels compilation time: 46ms
[LuxCore][36.390] Noise estimation: first pass
Recommended clamp value: 58018.053166620666
[LuxCore][39.000] Noise estimation: Error mean = 0
[LuxCore][40.890] Noise estimation: Error mean = 0
[LuxRays][42.406] [Optix][4][DISK CACHE] Closed database: "C:\Users\jgea\AppData\Local\NVIDIA\OptixCache\cache7.db"
[LuxRays][42.406] [Optix][4][DISK CACHE]     Cache data size: "23.6 KiB"
[LuxCore][42.437] [NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Memory used for hardware image pipeline: 56700Kbytes
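
The "buffer size" lines in a log like the one above can be totalled per device with a short script. This is only a rough sketch: the function name `sum_buffers` and the regular expression are made up for illustration, it assumes "Kbytes" means 1024 bytes, and it only counts the buffers that are logged in this format, so the result will not necessarily match the figure Blender shows.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [LuxRays][32.546] [Device <name>] Ray buffer size: 24576Kbytes
BUFFER_RE = re.compile(
    r"\[Device (?P<device>[^\]]+)\] (?P<name>.+?) buffer size: "
    r"(?P<size>\d+)(?P<unit>Kbytes|bytes)"
)

def sum_buffers(log_text):
    """Return {device name: total logged buffer bytes} (hypothetical helper)."""
    totals = defaultdict(int)
    for match in BUFFER_RE.finditer(log_text):
        size = int(match.group("size"))
        if match.group("unit") == "Kbytes":
            size *= 1024  # assumption: 1 Kbyte = 1024 bytes
        totals[match.group("device")] += size
    return dict(totals)

# Three lines copied from the log in this thread.
SAMPLE = """\
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Camera buffer size: 5492bytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] Ray buffer size: 24576Kbytes
[LuxRays][32.546] [Device NVIDIA GeForce RTX 2080 Ti CUDAIntersect] RayHit buffer size: 10240Kbytes
"""

for device, total in sum_buffers(SAMPLE).items():
    print(f"{device}: {total / 1024**2:.1f} MiB")
```

Feeding it the whole console output instead of `SAMPLE` gives one total per device name.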
Dade
Developer
Posts: 5672
Joined: Mon Dec 04, 2017 8:36 pm
Location: Italy

Re: Fully empty scene takes 1.07Gb of VRam... broken?

Post by Dade »

This:

[LuxCore][32.421] [PathOCLRenderEngine] OpenCL task count: 524288
While my RTX 2070 with 8 GB of RAM uses:

[LuxCore][0.461] [PathOCLRenderEngine] OpenCL task count: 262144
LuxCore picks the number of GPU threads based on the total amount of GPU RAM: https://github.com/LuxCoreRender/LuxCor ... l.cpp#L223

- GPUs with more than 8GB => 524288 threads
- GPUs with more than 4GB => 262144 threads
- GPUs with more than 2GB => 131072 threads
- otherwise => 65536 threads
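
As a minimal sketch, the thresholds listed above can be written out like this. It is not the actual LuxCore code behind the link: the function name `default_task_count` is hypothetical, and it assumes that "GB" means 2^30 bytes and that the comparisons are strict ("more than").

```python
GIB = 1024 ** 3  # assumption: the thresholds are in binary gigabytes

def default_task_count(total_vram_bytes: int) -> int:
    """Pick a task count from total GPU memory, per the list in this post."""
    if total_vram_bytes > 8 * GIB:
        return 524288
    if total_vram_bytes > 4 * GIB:
        return 262144
    if total_vram_bytes > 2 * GIB:
        return 131072
    return 65536

# 11264 MB card (the RTX 2080 Ti in the log) versus an 8 GB card.
print(default_task_count(11264 * 1024 ** 2))  # 524288
print(default_task_count(8 * GIB))            # 262144
```

Under these assumptions a card reporting exactly 8 GB stays in the 262144 tier, which is consistent with the two task counts quoted above.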

In your case, the fixed memory cost of all those threads is doubled.

You can force the number of threads/tasks with the "opencl.task.count" property. It is something we have already discussed in the past: more is faster, but with diminishing returns and, as you noticed, more memory used.
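
To put rough numbers on that, here is a small sketch using buffer sizes copied from the log posted earlier in this thread. Which buffers scale with the task count is an assumption on my part: these are the ones whose size divides evenly by 524288, and for Sample, SampleData and SampleResult the result matches the per-item sizes printed in the log (40, 8 and 428 bytes). The helper names are hypothetical, and film and image pipeline buffers are left out because they depend on resolution rather than task count.

```python
# Buffer sizes in Kbytes, copied from the log in this thread (524288 tasks).
# Assumption: each of these buffers grows linearly with the task count.
PER_TASK_BUFFERS_KB = {
    "Ray": 24576,
    "RayHit": 10240,
    "GPUTask": 339968,
    "GPUTaskDirectLight": 30720,
    "GPUTaskState": 200704,
    "GPUTask Stats": 2048,
    "Sample": 20480,
    "SampleData": 4096,
    "SampleResult": 219136,
    "PathInfo": 55296,
    "DirectLightVolumeInfo": 22528,
}
LOGGED_TASK_COUNT = 524288

def bytes_per_task() -> int:
    """Bytes of task-dependent buffers per task, derived from the log."""
    return sum(PER_TASK_BUFFERS_KB.values()) * 1024 // LOGGED_TASK_COUNT

def per_task_memory_mib(task_count: int) -> float:
    """Estimated size in MiB of the task-dependent buffers."""
    return bytes_per_task() * task_count / 1024 ** 2

print(bytes_per_task())             # 1816
print(per_task_memory_mib(524288))  # 908.0
print(per_task_memory_mib(262144))  # 454.0
```

So, under these assumptions, halving the task count with "opencl.task.count" would free roughly 450 MiB on this card.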