# GPU PCI passthrough for instances

## Find the GPUs on the GPU host

```
lspci -nn | grep -i nvidia
1b:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
1c:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
3d:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
3e:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
b1:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
b2:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
db:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
dc:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
```

Here `1b:00.0` is the `PCI_SLOT`, `10de` is the `VENDOR_ID`, and `1db5` is the `PRODUCT_ID`.

## Delete all VMs on that host

```
openstack server delete $vm_name
```
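If you are not sure which instances are still running on the host, you can list them first. A minimal sketch, assuming admin credentials and a hypothetical `$gpu_host_name` variable holding the compute host name:

```
# list every instance scheduled on the GPU host (admin credentials assumed)
openstack server list --all-projects --host $gpu_host_name
```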
## Remove the NVIDIA driver

If it was installed with yum:

```
yum remove nvidia-driver
```

If it was installed with the `NVIDIA-xxx.run` installer:

```
NVIDIA-xxx.run --uninstall
```

Or force-remove every RPM with `nvidia` in its name:

```
for i in $(rpm -qa | grep nvidia); do rpm -e --nodeps $i; done
```

## Add vfio modules

Add the modules in `/etc/modules-load.d/vfio.conf`:

```
vfio
vfio_iommu_type1
vfio_pci
```

## Module blacklist

Create `blacklist.conf` in `/etc/modprobe.d`:

```
cat << EOF > /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia
blacklist nouveau
options nouveau modeset=0
EOF
```

## Add vfio options

Add the `ids` option with `VENDOR_ID:PRODUCT_ID`:

```
cat << EOF > /etc/modprobe.d/vfio-options.conf
options vfio-pci ids=10de:1db5
EOF
```

## Edit GRUB

Append `intel_iommu=on` to the end of the `GRUB_CMDLINE_LINUX` line:

```
vim /etc/default/grub

GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet intel_iommu=on"
```

## Regenerate the GRUB config

Use the first command on BIOS systems and the second on UEFI systems:

```
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
```

## Update the initramfs

```
dracut -f
```

## Reboot

If libvirtd has become a zombie process, you may need to force power off the node.

```
reboot
```

## Check IOMMU after reboot

After the reboot, dmesg should contain IOMMU messages:

```
dmesg | grep -E "DMAR|IOMMU"
...
[ 0.309935] DMAR: ATSR flags: 0x0
[ 0.309936] DMAR: RHSA base: 0x0000009d7fc000 proximity domain: 0x0
[ 0.309937] DMAR: RHSA base: 0x000000aaffc000 proximity domain: 0x0
[ 0.309937] DMAR: RHSA base: 0x000000b87fc000 proximity domain: 0x0
[ 0.309938] DMAR: RHSA base: 0x000000c5ffc000 proximity domain: 0x0
[ 0.309938] DMAR: RHSA base: 0x000000d37fc000 proximity domain: 0x1
[ 0.309939] DMAR: RHSA base: 0x000000e0ffc000 proximity domain: 0x1
[ 0.309940] DMAR: RHSA base: 0x000000ee7fc000 proximity domain: 0x1
[ 0.309940] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x1
[ 0.309941] DMAR-IR: IOAPIC id 12 under DRHD base 0xc5ffc000 IOMMU 6
[ 0.309942] DMAR-IR: IOAPIC id 11 under DRHD base 0xb87fc000 IOMMU 5
[ 0.309943] DMAR-IR: IOAPIC id 10 under DRHD base 0xaaffc000 IOMMU 4
[ 0.309944] DMAR-IR: IOAPIC id 18 under DRHD base 0xfbffc000 IOMMU 3
[ 0.309945] DMAR-IR: IOAPIC id 17 under DRHD base 0xee7fc000 IOMMU 2
[ 0.309945] DMAR-IR: IOAPIC id 16 under DRHD base 0xe0ffc000 IOMMU 1
[ 0.309946] DMAR-IR: IOAPIC id 15 under DRHD base 0xd37fc000 IOMMU 0
[ 0.309947] DMAR-IR: IOAPIC id 8 under DRHD base 0x9d7fc000 IOMMU 7
[ 0.309947] DMAR-IR: IOAPIC id 9 under DRHD base 0x9d7fc000 IOMMU 7
[ 0.309948] DMAR-IR: HPET id 0 under DRHD base 0x9d7fc000
[ 0.309949] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[ 0.309950] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[ 0.312475] DMAR-IR: Enabled IRQ remapping in xapic mode
[ 2.680256] DMAR: dmar6: Using Queued invalidation
[ 2.680266] DMAR: dmar5: Using Queued invalidation
[ 2.680273] DMAR: dmar4: Using Queued invalidation
[ 2.680278] DMAR: dmar3: Using Queued invalidation
[ 2.680283] DMAR: dmar2: Using Queued invalidation
[ 2.680289] DMAR: dmar1: Using Queued invalidation
[ 2.680294] DMAR: dmar0: Using Queued invalidation
[ 2.680300] DMAR: dmar7: Using Queued invalidation
[ 2.680355] DMAR: Setting RMRR:
[ 2.684286] DMAR: Setting identity map for device 0000:00:14.0 [0x6f01d000 - 0x6f02dfff]
[ 2.684295] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 2.688269] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 2.688285] DMAR: Intel(R) Virtualization Technology for Directed I/O
```
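You can also inspect how the host groups devices for passthrough by walking the standard sysfs layout. A minimal sketch; it simply prints every PCI device in every IOMMU group:

```
# walk /sys/kernel/iommu_groups and print the devices in each group
for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU group ${g##*/}:"
  for d in "$g"/devices/*; do
    echo -n "  "
    lspci -nns "${d##*/}"
  done
done
```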
## Check GPU driver

The kernel driver in use for each GPU should now be `vfio-pci`, not the nvidia or nouveau driver:

```
# lspci -nnk -d VENDOR_ID:PRODUCT_ID
lspci -nnk -d 10de:1db5
1b:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
1c:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
3d:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
3e:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
b1:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
b2:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
db:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
dc:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:1249]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
```

## Check vfio

```
lsmod | grep vfio
vfio_pci               41312  0
irqbypass              13503  2 kvm,vfio_pci
vfio_iommu_type1       22300  0
vfio                   32695  2 vfio_iommu_type1,vfio_pci
```

## Check the nova-compute log

The nova-compute service reports the host's PCI devices in the resource tracker's final resource view; the log will contain `pci_stats=[PciDevicePool(xxxxxx), ...]`:

```
tail -n 100 /var/log/container/nova/nova-compute.log

2019-06-26 12:05:17.786 8 INFO nova.compute.resource_tracker [req-748f84de-7bd8-422c-afee-926d8aef02a9 - - - - -] Final resource view: name=gn0215.twcc.ai phys_ram=785057MB used_ram=4096MB phys_disk=3078441GB used_disk=0GB total_vcpus=36 used_vcpus=0 pci_stats=[PciDevicePool(count=4,numa_node=0,product_id='1db5',tags={dev_type='type-PCI'},vendor_id='10de'), PciDevicePool(count=4,numa_node=1,product_id='1db5',tags={dev_type='type-PCI'},vendor_id='10de')]
```

## OpenStack config

1. nova-api: add a PCI alias.

   ```
   [pci]
   alias = { "vendor_id":"10de", "product_id":"1db5", "device_type":"type-PCI", "name":"gpu" }
   ```

2. nova-scheduler: add `PciPassthroughFilter` to the enabled scheduler filters.

3. nova-compute: add the passthrough whitelist.

   ```
   [pci]
   passthrough_whitelist = { "vendor_id": "10de", "product_id": "1db5" }
   ```
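To consume the passthrough device, reference the alias from a flavor extra spec and boot an instance with that flavor. A minimal sketch; the flavor name, sizes, image, and network are placeholders, and `gpu` is the alias name defined above:

```
# create a flavor that requests one GPU via the "gpu" alias (sizes are examples)
openstack flavor create --vcpus 8 --ram 16384 --disk 80 gpu.1x
openstack flavor set gpu.1x --property "pci_passthrough:alias"="gpu:1"

# boot an instance with that flavor (image/network names are placeholders)
openstack server create --flavor gpu.1x --image $image --network $network gpu-test-vm
```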