# GPU PCI passthrough for instances
## Find the GPUs on the GPU host
```
lspci -nn | grep -i nvidia
1b:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
1c:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
3d:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
3e:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
b1:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
b2:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
db:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
dc:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
^                                                        ^    ^
|                                                        |    |
|                                                        |    `-- PRODUCT_ID
`-- PCI_SLOT                                             `-- VENDOR_ID
```
## Delete all VMs on that host
```
openstack server delete $vm_name
```
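If many instances are running on the host, a sketch like the following can list and delete them in one go (assumes admin credentials; `$gpu_host` is a placeholder for this hypervisor's hostname):
```
# list every instance scheduled on this hypervisor and delete each one
openstack server list --all-projects --host "$gpu_host" -f value -c ID \
  | xargs -r -n1 openstack server delete
```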
## Remove the NVIDIA driver
If installed by yum:
```
yum remove nvidia-driver
```
If installed by the NVIDIA-xxx.run installer:
```
NVIDIA-xxx.run --uninstall
```
Or force-remove every NVIDIA-related RPM:
```
for i in $(rpm -qa | grep nvidia); do rpm -e --nodeps $i; done
```
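If the proprietary driver modules are still loaded, they can also be unloaded by hand before binding vfio-pci (module names vary by driver version; check `lsmod | grep nvidia` first):
```
# unload the proprietary NVIDIA kernel modules if present
# (rmmod will complain about modules that are not loaded; that is harmless)
rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
```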
## Add vfio modules
Add the following modules to `/etc/modules-load.d/vfio.conf` so they are loaded at boot:
```
vfio
vfio_iommu_type1
vfio_pci
```
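To activate them without waiting for the next reboot, the modules can also be loaded immediately:
```
# load the vfio modules now; vfio.conf makes them load automatically on boot
modprobe -a vfio vfio_iommu_type1 vfio_pci
```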
## Blacklist conflicting modules
Create `blacklist.conf` in `/etc/modprobe.d` so conflicting GPU and framebuffer drivers cannot claim the cards:
```
cat << EOF > /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia
blacklist nouveau
options nouveau modeset=0
EOF
```
## Add vfio options
Add the `ids` option using `VENDOR_ID:PRODUCT_ID` so vfio-pci claims the GPUs:
```
cat << EOF > /etc/modprobe.d/vfio-options.conf
options vfio-pci ids=10de:1db5
EOF
```
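Whether modprobe will pick up the option can be sanity-checked by dumping its effective configuration:
```
# the options line from /etc/modprobe.d should show up here
modprobe --showconfig | grep vfio
```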
## Edit the grub config
Append `intel_iommu=on` to the end of the `GRUB_CMDLINE_LINUX` line:
```
vim /etc/default/grub
GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet intel_iommu=on"
```
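The same change can be made non-interactively; a sketch (double-check `/etc/default/grub` afterwards):
```
# append intel_iommu=on inside the existing GRUB_CMDLINE_LINUX quotes
sed -i 's/^\(GRUB_CMDLINE_LINUX=".*\)"$/\1 intel_iommu=on"/' /etc/default/grub
```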
## Regenerate the grub config
Regenerate whichever grub.cfg the host actually boots from:
```
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
```
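Only one of the two paths applies, depending on whether the host boots via legacy BIOS or UEFI:
```
# UEFI hosts expose /sys/firmware/efi; legacy BIOS hosts do not
[ -d /sys/firmware/efi ] && echo "UEFI: /boot/efi/EFI/centos/grub.cfg" || echo "BIOS: /boot/grub2/grub.cfg"
```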
## Update initramfs
Rebuild the initramfs so the vfio modules and the blacklist take effect at boot:
```
dracut -f
```
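To confirm the rebuilt initramfs actually contains the vfio modules (assuming dracut's `lsinitrd` is available):
```
# list vfio-related files packed into the current initramfs
lsinitrd | grep vfio
```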
## Reboot
Reboot the node. If libvirtd becomes a zombie process, a forced power-off of the node may be needed.
```
reboot
```
## Check IOMMU after reboot
After the reboot, dmesg should contain DMAR/IOMMU messages:
```
dmesg | grep -E "DMAR|IOMMU"
...
[ 0.309935] DMAR: ATSR flags: 0x0
[ 0.309936] DMAR: RHSA base: 0x0000009d7fc000 proximity domain: 0x0
[ 0.309937] DMAR: RHSA base: 0x000000aaffc000 proximity domain: 0x0
[ 0.309937] DMAR: RHSA base: 0x000000b87fc000 proximity domain: 0x0
[ 0.309938] DMAR: RHSA base: 0x000000c5ffc000 proximity domain: 0x0
[ 0.309938] DMAR: RHSA base: 0x000000d37fc000 proximity domain: 0x1
[ 0.309939] DMAR: RHSA base: 0x000000e0ffc000 proximity domain: 0x1
[ 0.309940] DMAR: RHSA base: 0x000000ee7fc000 proximity domain: 0x1
[ 0.309940] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x1
[ 0.309941] DMAR-IR: IOAPIC id 12 under DRHD base 0xc5ffc000 IOMMU 6
[ 0.309942] DMAR-IR: IOAPIC id 11 under DRHD base 0xb87fc000 IOMMU 5
[ 0.309943] DMAR-IR: IOAPIC id 10 under DRHD base 0xaaffc000 IOMMU 4
[ 0.309944] DMAR-IR: IOAPIC id 18 under DRHD base 0xfbffc000 IOMMU 3
[ 0.309945] DMAR-IR: IOAPIC id 17 under DRHD base 0xee7fc000 IOMMU 2
[ 0.309945] DMAR-IR: IOAPIC id 16 under DRHD base 0xe0ffc000 IOMMU 1
[ 0.309946] DMAR-IR: IOAPIC id 15 under DRHD base 0xd37fc000 IOMMU 0
[ 0.309947] DMAR-IR: IOAPIC id 8 under DRHD base 0x9d7fc000 IOMMU 7
[ 0.309947] DMAR-IR: IOAPIC id 9 under DRHD base 0x9d7fc000 IOMMU 7
[ 0.309948] DMAR-IR: HPET id 0 under DRHD base 0x9d7fc000
[ 0.309949] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[ 0.309950] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[ 0.312475] DMAR-IR: Enabled IRQ remapping in xapic mode
[ 2.680256] DMAR: dmar6: Using Queued invalidation
[ 2.680266] DMAR: dmar5: Using Queued invalidation
[ 2.680273] DMAR: dmar4: Using Queued invalidation
[ 2.680278] DMAR: dmar3: Using Queued invalidation
[ 2.680283] DMAR: dmar2: Using Queued invalidation
[ 2.680289] DMAR: dmar1: Using Queued invalidation
[ 2.680294] DMAR: dmar0: Using Queued invalidation
[ 2.680300] DMAR: dmar7: Using Queued invalidation
[ 2.680355] DMAR: Setting RMRR:
[ 2.684286] DMAR: Setting identity map for device 0000:00:14.0 [0x6f01d000 - 0x6f02dfff]
[ 2.684295] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 2.688269] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 2.688285] DMAR: Intel(R) Virtualization Technology for Directed I/O
```
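IOMMU groups should also show up under sysfs once the IOMMU is active; a quick check (substitute one of the PCI_SLOT values found earlier, `0000:1b:00.0` here):
```
# a non-zero number of IOMMU groups means the IOMMU is in use
ls /sys/kernel/iommu_groups/ | wc -l
# show which IOMMU group one of the GPUs belongs to
readlink /sys/bus/pci/devices/0000:1b:00.0/iommu_group
```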
## Check GPU driver
The kernel driver in use for each GPU should now be `vfio-pci`, not nvidia or nouveau:
```
# lspci -nnk -d VENDOR_ID:PRODUCT_ID
lspci -nnk -d 10de:1db5
1b:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
1c:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
3d:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
3e:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
b1:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
b2:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
db:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
dc:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:1db5] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:1249]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
```
## Check vfio
```
lsmod | grep vfio
vfio_pci 41312 0
irqbypass 13503 2 kvm,vfio_pci
vfio_iommu_type1 22300 0
vfio 32695 2 vfio_iommu_type1,vfio_pci
```
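The devices actually claimed by vfio-pci can also be read straight from sysfs:
```
# PCI addresses currently bound to the vfio-pci driver
ls /sys/bus/pci/drivers/vfio-pci/ | grep -E '^[0-9a-f]{4}:'
```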
## Check the nova-compute log
The nova-compute service reports the discovered PCI devices in its resource tracker's "Final resource view";
the log should contain `pci_stats=[PciDevicePool(xxxxxx), ...]`:
```
tail -n 100 /var/log/container/nova/nova-compute.log
2019-06-26 12:05:17.786 8 INFO nova.compute.resource_tracker [req-748f84de-7bd8-422c-afee-926d8aef02a9 - - - - -] Final resource view: name=gn0215.twcc.ai phys_ram=785057MB used_ram=4096MB phys_disk=3078441GB used_disk=0GB total_vcpus=36 used_vcpus=0 pci_stats=[PciDevicePool(count=4,numa_node=0,product_id='1db5',tags={dev_type='type-PCI'},vendor_id='10de'), PciDevicePool(count=4,numa_node=1,product_id='1db5',tags={dev_type='type-PCI'},vendor_id='10de')]
```
## OpenStack config
The vendor and product IDs in the alias and whitelist must match the GPU found earlier (`10de:1db5` in this example).
1. nova-api: add a PCI alias
```
[pci]
alias = { "vendor_id":"10de", "product_id":"1db5", "device_type":"type-PCI", "name":"gpu" }
```
2. nova-scheduler: add `PciPassthroughFilter` to the enabled filters
3. nova-compute: add the passthrough whitelist
```
[pci]
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1db5" }
```
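After restarting the affected services, the alias is consumed through a flavor property; a minimal sketch (flavor name, image and network are placeholders):
```
# request one passthrough GPU per instance via the "gpu" alias defined above
openstack flavor create --vcpus 8 --ram 32768 --disk 80 \
  --property "pci_passthrough:alias"="gpu:1" gpu.small
openstack server create --flavor gpu.small --image $image --network $network gpu-vm-test
```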