Observed XID 31 on host while running LuxMark in Win2014x64 VM.

Post by dchandra » Mon Jun 17, 2019 8:52 pm

Any idea why we are observing XID 31 on the host when running LuxMark in a Win2014 x64 VM? Is there already a fix for this?

Details Below:

Configuration Details:
=================
System: HP iLO 4 ProLiant WS460c Gen9
Host: XS7.5
GPU: P6
vGPU: P6-1Q
VMs: Win2014x64
Driver: 390.94_392.05
XenDesktop: 7.15

Repro Steps:
===========
1. Install XenServer 7.5 and install host driver 390.94.
2. Install Win2014x64 VM.
3. Assign P6-1Q vGPU and install guest driver 392.05.
4. Assign a licence to the VM and install XenDesktop 7.15.
5. Access the session via XenDesktop Receiver (.ica file).
6. Copy the LuxMark application from \\netapp-pu02\wsapps02_apps\DCC\NEW APPS\LuxMark\3.1\luxmark-windows64-v3.1\LuxMark-v3.1
7. Run the LuxMark application and select scene "Neumann TLM" or "Hotel Lobby".
8. Check host logs.

Observed Behavior:
=================
Oct 1 16:01:02 xenserver-yhhuqwrl vgpu-67[30662]: error: vmiop_log: XID 31 detected on physical_chid:0x18, guest_chid:0x10
Oct 1 16:01:02 xenserver-yhhuqwrl vgpu-67[30662]: error: vmiop_log: MMU Exception data for XID 31: addrLo 0x1257000, addHi 0x6, faultType 2 engineId 1
Oct 1 16:01:02 xenserver-yhhuqwrl kernel: [12327.053163] NVRM: Xid (PCI:0000:8a:00): 31, Ch 00000018, engmask 00000101, intr 10000000
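For anyone scanning host logs for the same pattern, here is a minimal Python sketch that pulls the fields out of the two line formats shown above. The function names are hypothetical, and combining addrLo/addrHi as the low and high 32 bits of one faulting address is an assumption based on the field names, not something confirmed by NVIDIA documentation.

```python
import re

# Kernel line format as it appears in the log above:
#   NVRM: Xid (PCI:0000:8a:00): 31, Ch 00000018, ...
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): "
    r"(?P<xid>\d+), Ch (?P<channel>[0-9a-fA-F]+)"
)

# vmiop_log MMU exception line; the log above spells the second field
# "addHi", so the "r" is optional in the pattern.
MMU_RE = re.compile(
    r"addrLo (?P<lo>0x[0-9a-fA-F]+), addr?Hi (?P<hi>0x[0-9a-fA-F]+), "
    r"faultType (?P<fault>\d+) engineId (?P<engine>\d+)"
)


def parse_xid(line):
    """Return PCI address, Xid code and channel from a kernel NVRM line, or None."""
    m = XID_RE.search(line)
    if m is None:
        return None
    return {
        "pci": m["pci"],
        "xid": int(m["xid"]),
        "channel": int(m["channel"], 16),  # channel is printed in hex
    }


def parse_mmu_fault(line):
    """Return fault details from a vmiop_log MMU exception line, or None."""
    m = MMU_RE.search(line)
    if m is None:
        return None
    lo = int(m["lo"], 16)
    hi = int(m["hi"], 16)
    return {
        # Assumption: addrHi carries the upper 32 bits of the address.
        "address": (hi << 32) | lo,
        "fault_type": int(m["fault"]),
        "engine_id": int(m["engine"]),
    }
```

Under that assumption, the MMU line above would decode to address 0x601257000, and the kernel line to Xid 31 on channel 0x18, matching the physical_chid reported by vmiop_log.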

Expected behavior:
================
The host should not report any Xids or errors, and LuxMark should run smoothly on Windows.

Isolation:
=========
1. No repro with Win2014x64 + XenDesktop7.15 + P6_1Q profile + Scene: LuxBall HDR
2. No repro with Win2014x64 + VNC + P6_1Q profile
3. No repro with Win2010RS3x64 + XenDesktop7.19 + P6_1Q profile
4. No repro with Win2014x64 + XenDesktop7.15 + P6_2Q profile

......
1. Repro with Win2014x64 + XenDesktop7.15 + P6 pass-through + restricted FB (1020 MB)

Post by Dade » Mon Jun 17, 2019 10:29 pm

According to NVIDIA: https://docs.nvidia.com/deploy/xid-errors/index.html
What Is an Xid Message
The Xid message is an error report from the NVIDIA driver that is printed to the operating system's kernel log or event log. Xid messages indicate that a general GPU error occurred, most often due to the driver programming the GPU incorrectly or to corruption of the commands sent to the GPU. The messages can be indicative of a hardware problem, an NVIDIA software problem, or a user application problem.
So it looks like a driver bug/problem. Are you running inside a VM? Are you restricting the amount of available vGPU memory to 1 GB?

"LuxBall HDR" is a very simple scene and it usually works everywhere (including on some very limited Intel GPUs). The other scenes are more complex and require compiling more complex OpenCL kernels. This can often expose driver bugs (over the years we have found a lot of them, in both AMD and NVIDIA drivers).

P.S. LuxMark v3.1 is also extremely old (we are working on v4.0).

Post by dchandra » Tue Jun 25, 2019 5:51 pm

Yes, the error was seen when running inside a VM, and the FB memory was reduced to 896 MB.

However, when FB = 1024 MB everything works. Is this expected, given that the scene may need a certain minimum amount of memory to run?

Post by Dade » Tue Jun 25, 2019 6:43 pm

dchandra wrote:
Tue Jun 25, 2019 5:51 pm
However, when FB = 1024 MB everything works. Is this expected, given that the scene may need a certain minimum amount of memory to run?
An internal driver error is not expected (i.e. that is an NVIDIA bug). However, memory usage ramps up from the lightest scene (LuxBall) to the heaviest (Hotel Lobby), so LuxMark is not expected to work when only such a small amount of memory is available.