Case Sharing - Multipath
===
###### tags: `IBM` `Linux Tools`

The customer's problem
> We have a Red Hat 6.10 machine that boots from SAN, uses the OS's built-in multipath function, and has two physical Fibre Channel HBAs. When it connects to the storage device, however, only one HBA shows a connection. How can we confirm that the multipath function is working properly?

### How can we confirm that the multipath function is working properly?
```multipath -ll```
![](https://i.imgur.com/ZBFqWje.png)
The path_status field must show `ready` for a path to count as healthy.

The customer reported:
> There is no output at all.

Ref. [Multipath Command Output](https://access.redhat.com/documentation/zh-tw/red_hat_enterprise_linux/6/html/dm_multipath/mpio_output)

### SAN Boot architecture diagram
![](https://i.imgur.com/I7kzOsl.png)

### Narrow down the faulty hardware and bring in the right people
* FC switch settings - not our support scope
* HBA firmware settings - not our support scope
* What we can do is check whether the HBA driver is loaded and working:
    * If the HBA driver does not work, the customer should bring in the HBA vendor or the SAN-side vendor/administrator to collaborate.
    * If the HBA driver does work, the problem lies at the OS layer or above.

### Check the HBA driver
```
# Check whether the host has HBA cards installed
$ lspci | grep -i fibre
15:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
15:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)

$ lspci -v -s 15:00.0
15:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
. . .
        Kernel driver in use: qla2xxx
        Kernel modules: qla2xxx

$ lspci -v -s 15:00.1
15:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
. . .
        Kernel driver in use: qla2xxx
        Kernel modules: qla2xxx

# Check whether the driver/module is loaded in the kernel
$ lsmod | grep qla2xxx

# Check that the module picked up by default matches the one for the running kernel
$ modinfo -n qla2xxx
/lib/modules/2.6.32-358.14.1.el6.x86_64/kernel/drivers/scsi/qla2xxx/qla2xxx.ko
$ modinfo -k `uname -r` -n qla2xxx
/lib/modules/2.6.32-358.14.1.el6.x86_64/kernel/drivers/scsi/qla2xxx/qla2xxx.ko

# Find the state of the HBA ports (online/offline)
$ more /sys/class/fc_host/host?/port_state
```
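The per-port checks above can be rolled into one pass over sysfs. This is a minimal sketch, assuming only the standard `fc_host` sysfs attributes (`port_name`, `port_state`, `speed`) exposed by the qla2xxx driver; on a healthy dual-HBA setup both ports should report `Online`:

```
# Print the WWPN, link state, and speed of every FC host port in one loop.
for h in /sys/class/fc_host/host*; do
    echo "$(basename $h): wwpn=$(cat $h/port_name) state=$(cat $h/port_state) speed=$(cat $h/speed)"
done
```

If one port reports `Linkdown` or `Offline` here while the other is `Online`, the fault sits on the cable/switch/HBA side rather than in multipath itself.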
### Check the multipath logs
* Is the multipathd process alive?
* Has the LUN's wwid landed in the blacklist?
* Other problems
```
$ service multipathd status
$ less /var/log/messages
$ multipath -v4
```
Also ask the customer to attach /etc/multipath.conf and all files under the /etc/multipath/* directory.

### Find the wwid of the SAN boot LUN
1. ```cat /proc/scsi/scsi``` lists every LUN the system has scanned.
2. Assuming the ATA LUN is the one used for SAN boot, note down its four identifiers (Host, Channel, Id, Lun).
3. ```ls -ld /sys/block/sd*/device``` matches those four identifiers to a block device, which tells us that LUN maps to, say, sda.
4. ```ls -la /dev/disk/by-id/``` reveals sda's wwid.

![](https://i.imgur.com/lSUVf7M.png)

The customer's reply:
> The SAN vendor is IBM.
> Host: scsi7 Channel: 00 Id: 03 Lun: 00
> Vendor: IBM Model: 2145 Rev: 0000
> Type: Direct-Access ANSI SCSI revision: 06
> The wwid is:
> lrwxrwxrwx. 1 root root 9 2021-08-08 23:32 scsi-3600507680c8081e68800000000000b0c -> ../../sdj

In the ```multipath -v4``` output I could indeed see wwid 3600507680c8081e68800000000000b0c being blacklisted.

1. First fix ```/etc/multipath.conf```:
```
blacklist_exceptions {
    wwid "3600507680c8081e68800000000000b0c"
}
multipaths {
    multipath {
        uid 0
        gid 0
        wwid "3600507680c8081e68800000000000b0c"
        mode 0600
    }
}
```
2. ```multipath -r```
3. ```multipath -ll``` and watch whether the paths come up.
4. If they do not, run ```multipath -v4``` again.

### The "other problems" of multipath
```multipath -v4``` confirmed that wwid 3600507680c8081e68800000000000b0c is now whitelisted.
Next, wwid 3600507680c8081e68800000000000b0c was automatically assigned the alias mpathd, so there is no need to adjust multipath.conf any further.
![](https://i.imgur.com/j1W4NF0.png)
As for the message that creating the map for mpathd failed, refer to https://access.redhat.com/solutions/641083
![](https://i.imgur.com/dC921uB.png)
Among its suggestions:
> * Check whether any application is holding those scsi devices.

-> Since the root file system lives on this very LUN, on a running system the answer is necessarily yes.

> * If this is happening on boot devices, make sure that the initramfs contains the correct multipath configuration so that the maps are created at boot time, before anything else takes control over them.

-> So we need to modify the initramfs image.

* Back up the current initramfs image.
* Update the initramfs image: ```dracut --force --add multipath --include /etc/multipath /etc/multipath --include /etc/multipath.conf /etc/multipath.conf```
* Reboot and confirm whether the multipath paths show up (a verification sketch follows after this list):
```
reboot
multipath -ll
multipath -v5
```
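Before or after the reboot, it is also worth confirming that the rebuilt image really carries the multipath configuration. A minimal sketch, assuming the `lsinitrd` tool that ships with dracut on RHEL 6 and the default initramfs path; ```etc/multipath.conf``` and the multipath module files should appear in the listing:

```
# List the contents of the running kernel's initramfs and keep only the
# multipath-related entries baked in by the dracut command above.
lsinitrd /boot/initramfs-$(uname -r).img | grep -i multipath
```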