Network Render Crashes / Issues
Just found myself with a scene complex enough that I wanted to bother with network rendering again, and I can't seem to get it to work. I think I'm encountering two separate issues, but I'm not sure. Any ideas?
- I'm using Blender 2.82 and LuxCore 2.3
- I'm running Windows 10 on both computers
- Rendering inside Blender with LuxCore works fine
- Generating the BCF file appears to work fine
- Running both Console and Node on the same computer results in the Node randomly stopping work a few seconds into receiving the files. It doesn't crash, as I can still interact with the window (e.g. scroll through the log). It just stops work, and the status reverts to "Waiting for new connection". The Console continues running, just waiting infinitely for the Node to have a film ready.
- For the complex scene, processing stops in the middle of receiving one of the mesh files.
- If I export just the basic startup scene (single cube), processing stops later (after receiving the mesh file, but before rendering starts). It's roughly the same amount of time that passes (5-10 sec) in both scenarios.
- If I run Node on a networked computer only, I get the error "Permission denied" on the Node when trying to receive the BCF file (and the Console reports that the Node shut down the connection). It makes several attempts at this with the same errors before giving up.
- I've expressly allowed both inbound and outbound connections on the port.
- I've expressly allowed both public and private network communication for pyluxcoretool on both computers (maybe something else also needs permission?).
- For what it's worth, in all cases I was hosting the BCF and related files on my network file share (a 3rd computer running FreeNAS), but both Windows PCs running LuxCore have the file share mapped as a network drive (so it shouldn't behave any differently than a local file), and both were connected to the drive at the time of testing. When rendering locally in Blender all files (textures, .blend, caches, etc) are located on this same network file share.
Re: Network Render Crashes / Issues
It is somewhat a known problem: https://github.com/LuxCoreRender/LuxCore/issues/100
I have never been able to replicate the problem on Linux, but it happens for some users on Windows. Due to lack of interest in network rendering, no one has fixed it and/or further developed the related Python code.
If you are using only 2 nodes it may be a lot simpler to just do network rendering "by hand":
1) export the scene in .bcf format (a complete standalone format: it includes everything required for the rendering, so there is no need for a shared file system, etc.);
2) start the rendering with standalone LuxCore on each node using a different random number generator seed;
3) save the film on each node;
4) merge the saved films with pyluxcoretools and save the PNG/JPG/EXR/whatever.
I can further explain each single step if you are interested.
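The bookkeeping behind steps 2 and 3 can be sketched in a few lines of Python. This is only a planning helper, not part of pyluxcoretools: the node names, the `.flm` film file names and the function itself are hypothetical, and `renderengine.seed` is assumed to be the LuxCore property that sets the seed. The one point it illustrates is that every node must get a different seed, otherwise the merged films just repeat the same samples.

```python
import random


def plan_manual_render(nodes, bcf_name, rng=None):
    """Give every node its own seed and its own film file name.

    Hypothetical helper for illustration; it only builds a plan and
    does not start any rendering.
    """
    rng = rng or random.Random()
    # sample() draws without replacement, so no two nodes share a seed.
    seeds = rng.sample(range(1, 2**31 - 1), len(nodes))
    plan = []
    for node, seed in zip(nodes, seeds):
        plan.append({
            "node": node,
            "scene": bcf_name,
            # Assumed property name; check it against the LuxCore docs.
            "property": f"renderengine.seed = {seed}",
            # Hypothetical per-node film file, to be merged in step 4.
            "film": f"{node}.flm",
        })
    return plan


if __name__ == "__main__":
    for entry in plan_manual_render(["main-pc", "slave-pc"], "00001.bcf"):
        print(entry)
```

Each entry says which seed property to set before rendering on that node and which film file to collect from it afterwards.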
Re: Network Render Crashes / Issues
I have tried it here, both with startup cube scene and something more complex, but it's working fine.
Maybe you could send the node and console log?
Re: Network Render Crashes / Issues
Well the plot thickens. I'd be fine to manually start the render on both PCs instead of using the network render capability. My only concern would be that if both PCs are on the same network anyway, aren't they going to detect each other's nodes and cause problems? But we can deal with that later. Right now I'm still having the same two (different) issues on each PC when using the external renderer.
On my slave PC, here are the logs when trying to start up the job locally (I moved the BCF and all supporting files for the basic cube scene to the local PC's desktop, and am running both Console and Node on this PC which I previously had configured as only a Node). These look identical to the logs I was getting when trying to use this PC as a Node for a network render. My main PC is having the other issue where the Node seems to time out on itself after a few seconds. At the moment I have a render running inside Blender though, and I don't want to screw it up running tests in the external renderer. Hopefully will be able to post some logs from that PC tomorrow though.
I'm slightly surprised that no one else is clamoring to get network rendering working, as I consider it one of the main reasons for using LuxCore (the other being caustics). Of course after years of using Lux I'm pretty comfortable with its material pipeline, but with the switch to node-based materials and the fact that I run most of my texturing through Substance these days... it's really distributed network rendering and caustics that keep me on Lux.
Console:
[MainThread][2020-04-13 20:40:50,342] LuxCore 2.3
[NetBeaconReceiverThread][2020-04-13 20:40:50,343] NetBeaconReceiver thread started.
[NetBeaconReceiverThread][2020-04-13 20:40:52,084] Discovered new node: 192.168.200.218:18018
[MainThread][2020-04-13 20:41:02,913] Creating single image render farm job: C:/Users/Andrew/Desktop/Untitled_LuxCore/00001.bcf
[MainThread][2020-04-13 20:41:02,913] New render farm job: C:/Users/Andrew/Desktop/Untitled_LuxCore/00001.bcf
[MainThread][2020-04-13 20:41:02,914] Job file md5: eb59ee38e35900900dd44b3d78d3da60
[MainThread][2020-04-13 20:41:02,915] -------------------------------------------------------
[MainThread][2020-04-13 20:41:02,915] Job started: C:/Users/Andrew/Desktop/Untitled_LuxCore/00001.bcf
[MainThread][2020-04-13 20:41:02,915] -------------------------------------------------------
[RenderFarmNodeThread-192.168.200.218:18018][2020-04-13 20:41:02,916] Node thread started
[FilmMergeThread][2020-04-13 20:41:02,917] Film merge thread started
[RenderFarmNodeThread-192.168.200.218:18018][2020-04-13 20:41:02,918] Remote node has the same pyluxcore verison
[RenderFarmNodeThread-192.168.200.218:18018][2020-04-13 20:41:02,918] Sending file: C:/Users/Andrew/Desktop/Untitled_LuxCore/00001.bcf
[RenderFarmNodeThread-192.168.200.218:18018][2020-04-13 20:41:02,920] [WinError 10054] An existing connection was forcibly closed by the remote host
Traceback (most recent call last):
  File "C:\Users\Andrew\AppData\Local\Temp\_MEI134522\pyluxcoretools.zip\pyluxcoretools\renderfarm\renderfarmjobsingleimage.py", line 409, in NodeThread
    socketutils.SendFile(nodeSocket, self.jobSingleImage.GetRenderConfigFileName())
  File "C:\Users\Andrew\AppData\Local\Temp\_MEI134522\pyluxcoretools.zip\pyluxcoretools\utils\socket.py", line 90, in SendFile
    RecvOk(soc)
  File "C:\Users\Andrew\AppData\Local\Temp\_MEI134522\pyluxcoretools.zip\pyluxcoretools\utils\socket.py", line 62, in RecvOk
    line = RecvLine(soc)
  File "C:\Users\Andrew\AppData\Local\Temp\_MEI134522\pyluxcoretools.zip\pyluxcoretools\utils\socket.py", line 42, in RecvLine
    data = soc.recv(BUFF_SIZE)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
[RenderFarmNodeThread-192.168.200.218:18018][2020-04-13 20:41:02,921] Node thread done
[NetBeaconReceiverThread][2020-04-13 20:41:04,085] Retrying node: 192.168.200.218:18018
Node:
[MainThread][2020-04-13 20:40:35,422] LuxCore 2.3
[NetBeaconSenderThread][2020-04-13 20:40:37,081] NetBeaconSender thread started.
[Thread-1][2020-04-13 20:40:37,081] Waiting for a new connection
[Thread-1][2020-04-13 20:41:02,917] Received connection from: ('192.168.200.218', 62539)
[Thread-1][2020-04-13 20:41:02,918] Remote pyluxcore version: 2.3
[Thread-1][2020-04-13 20:41:02,918] Local pyluxcore version: 2.3
[Thread-1][2020-04-13 20:41:02,918] Receiving RenderConfig serialized file: renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf
[Thread-1][2020-04-13 20:41:02,918] Receiving file: renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf
[Thread-1][2020-04-13 20:41:02,919] [Errno 13] Permission denied: 'renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf'
Traceback (most recent call last):
  File "C:\Users\Andrew\AppData\Local\Temp\_MEI139562\pyluxcoretools.zip\pyluxcoretools\renderfarm\renderfarmnode.py", line 152, in __HandleConnection
    socketutils.RecvFile(clientSocket, renderConfigFile)
  File "C:\Users\Andrew\AppData\Local\Temp\_MEI139562\pyluxcoretools.zip\pyluxcoretools\utils\socket.py", line 108, in RecvFile
    with open(fileName, "wb") as f:
PermissionError: [Errno 13] Permission denied: 'renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf'
[Thread-1][2020-04-13 20:41:02,920] Connection done: ('192.168.200.218', 62539)
[Thread-1][2020-04-13 20:41:02,920] Waiting for a new connection
Re: Network Render Crashes / Issues
gecko wrote: Tue Apr 14, 2020 12:53 am
> Well the plot thickens. I'd be fine to manually start the render on both PCs instead of using the network render capability. My only concern would be that if both PCs are on the same network anyway, aren't they going to detect each other's nodes and cause problems?
You can use luxcoreui.exe or the "console" mode of pyluxcoretools; there is no need to use the network rendering script.
gecko wrote: Tue Apr 14, 2020 12:53 am
> PermissionError: [Errno 13] Permission denied: 'renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf'
I assume you have installed BlendLuxCore with Admin rights and/or on a directory where it lacks the filesystem write permission. It can not create the temporary file 'renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf' so it throws an error. You should fix filesystem permissions.
Re: Network Render Crashes / Issues
Ah, ok got it.
Dade wrote: Tue Apr 14, 2020 7:41 am
> I assume you have installed BlendLuxCore with Admin rights and/or on a directory where it lacks the filesystem write permission. It can not create the temporary file 'renderfarmnode-e8d7161a-4d16-4899-a579-3aac381f0846.bcf' so it throws an error. You should fix filesystem permissions.
Ok, this seems really strange to me - LuxCore writes the temp BCF file to its own installation directory? That might explain the issue with Windows installations - if Lux is installed in Program Files (which would be the logical place to stick it), it won't get write access to that directory without explicitly setting it. Maybe also only an issue for standalone Lux (which is what I installed) - I'm pretty sure Blender sticks addons in the AppData folder, which should have write access enabled by default (I think).
Either way, slave PC is running. Now I just need to figure out why my main PC won't render outside of Blender...
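For anyone hitting the same "Permission denied": the file name in the node traceback has no directory part, so the node presumably creates it in whatever its current working directory happens to be (often the install directory when launched from there). A minimal sketch for checking whether a directory is writable before starting the node, using only the Python standard library; `can_write_here` is a made-up helper name, not something from pyluxcoretools:

```python
import os
import tempfile


def can_write_here(directory="."):
    """Return True if a new file can be created in `directory`."""
    try:
        # Create a uniquely named probe file; it is deleted again as soon
        # as the with-block exits.
        with tempfile.NamedTemporaryFile(dir=directory, prefix="probe-", suffix=".tmp"):
            pass
        return True
    except OSError:
        # PermissionError and FileNotFoundError are both OSError subclasses.
        return False


if __name__ == "__main__":
    print("working directory:", os.getcwd())
    print("writable:", can_write_here())
```

If this prints `writable: False` from the directory the node is started in, either fix the permissions on that directory or start the node from a directory you own.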
Re: Network Render Crashes / Issues
Ok, here are the logs from my main PC launching both the console and node on this computer from a BCF file on the local desktop. For what it's worth, I used the button inside Blender to launch LuxCore so I wouldn't need to go digging for wherever Blender decided to install it. I didn't notice this before, but the command line window is completely blank in this scenario, different from when I launch in standalone mode on my slave PC. So these logs are pulled from the interface window. Again, not sure if this matters.
Console logs follow the node logs below. The console will eventually announce that no film files were received from the node and continue waiting for it indefinitely.
Node logs. Note that this is for the startup cube scene. When loading a more complex scene, it stops at one of the "Loading serialized mesh" steps. Same amount of total runtime before the hang. I've waited over an hour for it to progress in both cases. It's not locked up, it just stops progressing.
LuxCore 2.3
Waiting for configuration...
Started
NetBeaconSender thread started.
Waiting for a new connection
Received connection from: ('192.168.200.46', 50223)
Remote pyluxcore version: 2.3
Local pyluxcore version: 2.3
Receiving RenderConfig serialized file: renderfarmnode-6a1a5c0f-a36b-4371-9fed-f24437117d8e.bcf
Receiving file: renderfarmnode-6a1a5c0f-a36b-4371-9fed-f24437117d8e.bcf
Transfered 2.06 Kbytes in 00:00:00 (2.02 Mbytes/sec)
Receiving RenderConfig serialized MD5: eb59ee38e35900900dd44b3d78d3da60
Received seed: 1
Reading RenderConfig serialized file: renderfarmnode-6a1a5c0f-a36b-4371-9fed-f24437117d8e.bcf
[SDL][55.437] Loading serialized mesh: Mesh_Cube2075980532072000
[SDL][55.437] Material definition: Material2075979363528
[SDL][55.437] Camera type: perspective
[SDL][55.437] Camera position: Point[7.35889, -6.92579, 4.95831]
[SDL][55.437] Camera target: Point[6.70733, -6.31162, 4.51304]
[SDL][55.437] Camera clipping plane disabled
[SDL][55.437] Scene objects count: 1
[SDL][55.437] Light definition: __WORLD_BACKGROUND_LIGHT__
[SDL][55.453] Light definition: 2075980533528
OpenCL render engines available
[LuxCore][55.453] Film resolution: 1920x1080
[SDL][55.453] Film output definition: RGB_IMAGEPIPELINE [image.png]
[SDL][55.453] Image pipeline: film.imagepipelines.0
[SDL][55.453] Image pipeline step 0: NOP
[SDL][55.453] Image pipeline step 1: TONEMAP_LINEAR
[SDL][55.453] Image pipeline step 2: GAMMA_CORRECTION
[SDL][55.453] Film output definition: RGB_IMAGEPIPELINE [RGB_IMAGEPIPELINE_0.png]
[LuxRays][55.484] OpenCL Platform 0: NVIDIA Corporation
[LuxRays][55.484] Device 0 name: NativeThread
[LuxRays][55.484] Device 0 type: NATIVE_THREAD
[LuxRays][55.484] Device 0 compute units: 1
[LuxRays][55.484] Device 0 preferred float vector width: 4
[LuxRays][55.484] Device 0 max allocable memory: 0MBytes
[LuxRays][55.484] Device 0 max allocable memory block size: 0MBytes
[LuxRays][55.484] Device 1 name: GeForce GTX 1060 6GB
[LuxRays][55.484] Device 1 type: OPENCL_GPU
[LuxRays][55.484] Device 1 compute units: 10
[LuxRays][55.484] Device 1 preferred float vector width: 1
[LuxRays][55.484] Device 1 max allocable memory: 6144MBytes
[LuxRays][55.484] Device 1 max allocable memory block size: 1536MBytes
[LuxRays][55.484] Creating 12 intersection device(s)
[LuxRays][55.484] Allocating intersection device 0: NativeThread (Type = NATIVE_THREAD)
[LuxRays][55.484] Allocating intersection device 1: NativeThread (Type = NATIVE_THREAD)
Console logs:
LuxCore 2.3
NetBeaconReceiver thread started.
Discovered new node: 192.168.200.46:18018
Creating single image render farm job: C:/Users/sauer/Desktop/Untitled_LuxCore/00001.bcf
New render farm job: C:/Users/sauer/Desktop/Untitled_LuxCore/00001.bcf
Job file md5: eb59ee38e35900900dd44b3d78d3da60
-------------------------------------------------------
Job started: C:/Users/sauer/Desktop/Untitled_LuxCore/00001.bcf
-------------------------------------------------------
Node thread started
Film merge thread started
Remote node has the same pyluxcore verison
Sending file: C:/Users/sauer/Desktop/Untitled_LuxCore/00001.bcf
Transfered 2.06 Kbytes in 00:00:00 (0 bytes/sec)
Sending seed: 1
Waiting for node rendering start
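Both logs report the same MD5 for the job file, which suggests the transfer itself completed intact and the hang happens later. When copying a BCF between machines by hand, the same kind of check can be done with a short script. This is a sketch that assumes the logged value is a plain MD5 of the file contents (the log label suggests so, but this was not checked against the pyluxcoretools source); `file_md5` is a hypothetical helper name.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run it on the BCF on each machine (for example `print(file_md5("00001.bcf"))`) and compare the two strings; if they differ, the copy is incomplete or corrupted.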
Re: Network Render Crashes / Issues
I've confirmed that the issue with the network render node randomly stopping during load is specific to launching PyLuxCoreTool from inside Blender. I downloaded the standalone version of LuxCore (and set filesystem permissions to allow it to write to its install directory), exported my complex scene from BlendLuxCore, and opened the BCF in standalone LuxCore. This works both locally and across the network. Definitely annoying, but at least it works.