Proxmox 6.2 GPU Passthrough Tutorial

Hardware

MB: ASUS TUF GAMING B460-PLUS
CPU: Intel i7-10700
GPU: NVIDIA GTX-1060

BIOS Config

Your hardware, both the CPU and the mainboard, needs to support IOMMU (I/O Memory Management Unit) interrupt remapping.

A. Enable VT-x

In the Asus UEFI BIOS, this feature is in "Advanced -> CPU configuration" and is named "Intel Virtualization Technology".

B. Enable VT-d

Then, if your motherboard supports it, enable the "VT-d" option (Intel's name for the IOMMU feature) in "Advanced -> System Agent Configuration" or "Advanced -> North Bridge".

PS. An SR-IOV option can also be found on this MB/BIOS, but it is not used in this tutorial.

Host Config

A. Load Required Modules and Block GPU Drivers

Load VFIO Modules

Add to /etc/modules

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
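
Editing /etc/modules by hand works fine. If you script the setup, appending blindly adds duplicate lines on every run. The following is a minimal sketch of an idempotent append; the helper name ensure_line is my own and not part of Proxmox.

```shell
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
# (ensure_line is a hypothetical helper name, not a Proxmox tool.)
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on the host (as root):
# for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
#     ensure_line /etc/modules "$m"
# done
```

Because of the whole-line match, an existing vfio_pci entry does not count as a match for vfio.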

Block GPU Drivers

Blacklist the regular GPU drivers so that they do not claim the card and the vfio-pci driver can attach to it instead.

echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf 
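
After the next reboot you can confirm that none of the blacklisted drivers got loaded. This is a small sketch that filters lsmod output; the function name loaded_blacklisted is made up for illustration.

```shell
# loaded_blacklisted: read `lsmod` output on stdin and print any of the
# three blacklisted GPU modules that are still loaded (one per line).
# (loaded_blacklisted is a hypothetical helper name.)
loaded_blacklisted() {
    awk '$1 == "radeon" || $1 == "nouveau" || $1 == "nvidia" { print $1 }'
}

# Usage on the host after a reboot:
# lsmod | loaded_blacklisted
```

No output means the blacklist took effect.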

B. Enable the IOMMU for systemd-boot (Proxmox on UEFI)

Get VendorID and DeviceID of the GPU

lspci -nn | grep -i nvidia

Sample output

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)

=> For the first one: PCI ID: 01:00.0, VendorID: 10de, DeviceID: 1c03
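
Reading the IDs off by eye is easy with two devices. As a sketch, the same extraction can be done with sed so that the result can be pasted straight into the vfio-pci.ids parameter; vfio_ids is a name I picked for this example.

```shell
# vfio_ids: read `lspci -nn` lines on stdin and print the trailing
# [vendor:device] pair of each line, joined with commas.
# (vfio_ids is a hypothetical helper name.)
vfio_ids() {
    # Class codes such as [0300] have no colon, so only the ID pair matches.
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' | paste -sd, -
}

# Usage on the host:
# lspci -nn | grep -i nvidia | vfio_ids
```

For the two sample lines above, this yields 10de:1c03,10de:10f1.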

Add kernel parameters

Add the following parameters into /etc/kernel/cmdline

intel_iommu=on vfio-pci.ids=<VendorID>:<DeviceID>,<VendorID>:<DeviceID> disable_vga=1

For example /etc/kernel/cmdline becomes

root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on vfio-pci.ids=10de:1c03,10de:10f1 disable_vga=1
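
/etc/kernel/cmdline is a single line, so parameters have to be appended to that line rather than added as new lines. Here is a hedged sketch of a helper that does this without duplicating a parameter that is already present; add_cmdline_param is an illustrative name, and the sketch assumes the file has only one line.

```shell
# add_cmdline_param FILE PARAM: append PARAM to the single-line kernel
# command line in FILE unless that exact word is already present.
# (add_cmdline_param is a hypothetical helper; it assumes a one-line file.)
add_cmdline_param() {
    file=$1
    param=$2
    current=$(head -n 1 "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already there, nothing to do
    esac
    printf '%s %s\n' "$current" "$param" > "$file"
}

# Usage on the host (as root):
# add_cmdline_param /etc/kernel/cmdline intel_iommu=on
```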

Update and reboot

pve-efiboot-tool refresh
reboot

After the reboot, check that the kernel parameters were applied (see "D. Verification").

C. IOMMU Interrupt Remapping

It will not be possible to use PCI passthrough without interrupt remapping.

To check whether your system supports interrupt remapping:

dmesg | grep 'remapping'

Sample output

[    0.190148] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.191599] DMAR-IR: Enabled IRQ remapping in x2apic mode

If you see one of the following lines, then remapping is supported.

  • AMD-Vi: Interrupt remapping enabled
  • DMAR-IR: Enabled IRQ remapping in x2apic mode ('x2apic' can be different on old CPUs, but should still work)
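
If you want a yes/no answer instead of reading the log, the two messages above can be matched in one grep. This is only a sketch; remapping_enabled is a name chosen for this example, and it matches the DMAR-IR line without the mode suffix so that other modes are accepted too.

```shell
# remapping_enabled: read `dmesg` output on stdin; exit 0 if either of the
# two "remapping enabled" messages appears, non-zero otherwise.
# (remapping_enabled is a hypothetical helper name.)
remapping_enabled() {
    grep -qE 'AMD-Vi: Interrupt remapping enabled|DMAR-IR: Enabled IRQ remapping'
}

# Usage on the host:
# dmesg | remapping_enabled && echo "interrupt remapping: OK"
```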

Allow Unsafe Interrupts (if interrupt remapping is not supported)

If your system doesn't support interrupt remapping, you can allow unsafe interrupts (not tested by me)

echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
update-initramfs -u
reboot

D. Verification

Kernel Parameters Loaded

cat /proc/cmdline

IOMMU Working

dmesg | grep -E "DMAR|IOMMU"

Sample output

[    0.009442] ACPI: DMAR 0x000000007936D000 0000A8 (v01 INTEL  EDK2     00000002      01000013)
[    0.110221] DMAR: IOMMU enabled
[    0.190123] DMAR: Host address width 39
[    0.190125] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.190131] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
[    0.190133] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.190137] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.190140] DMAR: RMRR base: 0x00000079945000 end: 0x00000079b8efff
[    0.190142] DMAR: RMRR base: 0x0000007b000000 end: 0x0000007f7fffff
[    0.190144] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.190146] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.190148] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.191599] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.942691] DMAR: No ATSR found
[    0.942753] DMAR: dmar0: Using Queued invalidation
[    0.942757] DMAR: dmar1: Using Queued invalidation
[    0.951433] DMAR: Intel(R) Virtualization Technology for Directed I/O

VFIO Working

dmesg | grep -i vfio

You should see messages from the vfio_pci driver.

Sample output

[    0.000000] Command line: initrd=\EFI\proxmox\5.4.34-1-pve\initrd.img-5.4.34-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on vfio-pci.ids=10de:1c03,10de:10f1 disable_vga=1
[    0.110162] Kernel command line: initrd=\EFI\proxmox\5.4.34-1-pve\initrd.img-5.4.34-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on vfio-pci.ids=10de:1c03,10de:10f1 disable_vga=1
[    0.987220] VFIO - User Level meta-driver version: 0.3
[    0.987271] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    1.006157] vfio_pci: add [10de:1c03[ffffffff:ffffffff]] class 0x000000/00000000
[    1.026154] vfio_pci: add [10de:10f1[ffffffff:ffffffff]] class 0x000000/00000000
[    5.320737] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[   36.517767] vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[   36.520535] vfio-pci 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
[   36.538147] vfio-pci 0000:01:00.1: enabling device (0000 -> 0002)
[   39.276682] vfio-pci 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000

VFIO-PCI Driver Loaded

lspci -nnk

Kernel driver in use should be: vfio-pci

Sample output

...
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP106 [GeForce GTX 1060 6GB] [1462:3283]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] GP106 High Definition Audio Controller [1462:3283]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
...
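
The full lspci -nnk listing is long. As a sketch, awk can pull out the driver for one slot; driver_in_use is an illustrative name, and the slot is given in the same form lspci prints it (for example 01:00.0).

```shell
# driver_in_use SLOT: read `lspci -nnk` output on stdin and print the
# "Kernel driver in use" value for the device at SLOT (e.g. 01:00.0).
# (driver_in_use is a hypothetical helper name.)
driver_in_use() {
    awk -v slot="$1" '
        /^[0-9a-f]/ { current = $1 }   # device header line starts with the slot
        /Kernel driver in use:/ && current == slot { print $NF }
    '
}

# Usage on the host:
# lspci -nnk | driver_in_use 01:00.0
# lspci -nnk | driver_in_use 01:00.1
```

Both functions of the GPU should report vfio-pci.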

IOMMU Groups Isolation

For working PCI passthrough, you need a dedicated IOMMU group for all PCI devices you want to assign to a VM.

find /sys/kernel/iommu_groups/ -type l

Sample output

/sys/kernel/iommu_groups/7/devices/0000:00:1c.0
/sys/kernel/iommu_groups/7/devices/0000:04:00.3
/sys/kernel/iommu_groups/7/devices/0000:04:00.1
/sys/kernel/iommu_groups/7/devices/0000:04:00.2
/sys/kernel/iommu_groups/7/devices/0000:00:1c.4
/sys/kernel/iommu_groups/7/devices/0000:04:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:17.0
/sys/kernel/iommu_groups/3/devices/0000:00:14.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
/sys/kernel/iommu_groups/8/devices/0000:05:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:1d.0
/sys/kernel/iommu_groups/6/devices/0000:00:1b.0
/sys/kernel/iommu_groups/4/devices/0000:00:16.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:1f.2
/sys/kernel/iommu_groups/9/devices/0000:00:1f.0
/sys/kernel/iommu_groups/9/devices/0000:00:1f.3
/sys/kernel/iommu_groups/9/devices/0000:00:1f.6
/sys/kernel/iommu_groups/9/devices/0000:00:1f.4
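
The find output is unsorted, which makes it hard to see what shares a group with the GPU. The sketch below lists every device in the same IOMMU group as a given device; group_members is a name I made up, and the device is given with its domain prefix, as in the paths above.

```shell
# group_members DEVICE: read `find /sys/kernel/iommu_groups/ -type l` output
# on stdin and print every device that shares an IOMMU group with DEVICE.
# (group_members is a hypothetical helper name.)
group_members() {
    # With "/" as separator, field 5 is the group number, field 7 the device.
    awk -F/ -v dev="$1" '
        { group[$7] = $5; members[$5] = members[$5] $7 "\n" }
        END { if (dev in group) printf "%s", members[group[dev]] }
    ' | sort
}

# Usage on the host:
# find /sys/kernel/iommu_groups/ -type l | group_members 0000:01:00.0
```

In the sample output above, group 1 holds 0000:01:00.0 and 0000:01:00.1 (the two GPU functions) plus 0000:00:01.0, which is presumably the PCIe root port the card is attached to.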

Guest VM Config

Guest OS: Ubuntu 18.04.4 Server

A. Add PCI Device to VM

Only the root user can add a PCI device to a guest VM.

Hardware => Add => PCI Device

If the steps above were completed successfully, you should see the GPU here.

Check all options

[v] All Functions
[v] Primary GPU
[v] ROM-Bar
[v] PCI-Express

B. Other VM Hardware Configs

Configure the VM using either option 1 or option 2.

Warning: loss of web GUI terminal access

Once the Primary GPU (x-vga=1) option is set, the VNC console in the web GUI will NOT be able to connect to the VM. Make sure you can reach the VM over SSH and that its network settings will not change after a reboot.

Option 1. Edit the VM Config

Edit /etc/pve/qemu-server/<VMID>.conf

bios: seabios
machine: q35
hostpci0: 01:00,pcie=1,x-vga=1

Option 2. By the Web GUI

Set BIOS, Machine, PCI Device (hostpci0) by the Web GUI
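
The same three settings can also be applied from the host shell with qm set. I did not use this route in the tutorial, so treat the following as a sketch: it only prints the command for review instead of running it, qm_passthrough_cmd is a made-up helper name, and the VMID 100 in the usage line is a placeholder.

```shell
# qm_passthrough_cmd VMID SLOT: print (not run) a `qm set` command that
# applies the same bios, machine and hostpci0 values as the config file edit.
# (qm_passthrough_cmd is a hypothetical helper; VMID and SLOT are yours.)
qm_passthrough_cmd() {
    printf 'qm set %s --bios seabios --machine q35 --hostpci0 %s,pcie=1,x-vga=1\n' "$1" "$2"
}

# Usage on the host (100 is a placeholder VMID):
# qm_passthrough_cmd 100 01:00
```

Check the printed line, then run it as root if it matches your VM.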

Guest OS Config

A. Prerequisites

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install ubuntu-drivers-common

B. Install NVIDIA Driver

Check the latest available driver for the GPU

ubuntu-drivers devices

Sample Output

== /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001C03sv00001462sd00003283bc03sc00i00
vendor   : NVIDIA Corporation
model    : GP106 [GeForce GTX 1060 6GB]
driver   : nvidia-driver-415 - third-party free
driver   : nvidia-driver-435 - distro non-free
driver   : nvidia-driver-440 - distro non-free
driver   : nvidia-driver-440-server - distro non-free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-410 - third-party free
driver   : nvidia-driver-450 - third-party free recommended
driver   : nvidia-driver-418-server - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

=> The output shows that we can install nvidia-driver-440-server

sudo apt install nvidia-driver-440-server nvidia-utils-440-server
sudo reboot
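
The driver list from ubuntu-drivers devices is easier to scan when reduced to package names. A minimal sketch, with list_drivers as an illustrative name; it marks the entry that ubuntu-drivers flags as recommended with a trailing asterisk.

```shell
# list_drivers: read `ubuntu-drivers devices` output on stdin and print the
# package name from each "driver" line; the recommended one gets " *".
# (list_drivers is a hypothetical helper name.)
list_drivers() {
    awk '$1 == "driver" && $2 == ":" {
        mark = ($NF == "recommended") ? " *" : ""
        print $3 mark
    }'
}

# Usage in the guest:
# ubuntu-drivers devices | list_drivers
```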

Verification

lsmod | grep nvidia 

C. GRUB Options

Warning: loss of access to the VNC terminal

There will be no output on the VNC terminal after these options are set, so make sure SSH access is ready.

Add the following options to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub

video=vesafb:off,efifb:off
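
If you prefer not to edit the file by hand, the option can be appended with sed. This is a sketch under a few assumptions: GNU sed (as shipped with Ubuntu), a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line already present in the file, and an option string without regex metacharacters, "&" or "|". The name add_grub_option is my own.

```shell
# add_grub_option FILE OPTION: append OPTION inside the double-quoted value
# of GRUB_CMDLINE_LINUX_DEFAULT in FILE, unless it is already present.
# (add_grub_option is a hypothetical helper; assumes GNU sed and that
# OPTION contains no regex metacharacters, "&" or "|".)
add_grub_option() {
    file=$1
    opt=$2
    # Skip the edit if the option is already on that line.
    grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$opt" "$file" && return 0
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $opt\"|" "$file"
}

# Usage in the guest (as root):
# add_grub_option /etc/default/grub 'video=vesafb:off,efifb:off'
```

Existing options on the line are kept, and running it twice does not add the option a second time.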

Run

sudo update-grub
sudo reboot

D. Verification

sudo nvidia-smi

Sample output

Sat Aug  1 19:45:48 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   45C    P5    14W / 140W |      0MiB /  6078MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

=> The GPU information is successfully retrieved without any error.