# Windriver 22.06

#### Environment

OS: WindriverISO-USB (CentOS 6)
CPU: Intel® Xeon® 6226R @2.9GHz x64
RAM: 192GB
DISK: 500GB x3

**tips: Do not configure RAID 5.** Bootstrap needs two independent disks (the OS must see both).

## Windriver 22.06 iso (pro)

All images are in k200@60.250.198.199:~/windriver

**Release Version**

**wind-river-cloud-platform-host-installer-22.06-b14.iso**

**wind-river-cloud-platform-host-installer-22.06-b14-pro.iso**
Includes kernel-mft and the mft tool, by Windriver Jackey

**wind-river-cloud-platform-host-installer-22.06-b14-pro-2.iso**
Includes kernel-mft, the mft tool, and libatomic, by Windriver Jackey

## On Windriver node

### Install mst tool before Bootstrap

**tips:** If the mst tool is installed after bootstrap, the installed RPM packages are reverted on the next reboot.

See mft-tool.txt by Windriver Jackey

```
$ scl enable devtoolset-8 'bash'
$ which gcc
/opt/rh/devtoolset-8/root/usr/bin/gcc
$ cd ~/mft-4.18.0-106-x86_64-rpm
~/mft-4.18.0-106-x86_64-rpm$ sudo ./install.sh
Password:
```

## Install Nvidia CX-6 Driver

## Configure passwordless sudo

```
sudo visudo
# Add the following line:
sysadmin ALL=(ALL) NOPASSWD:ALL
```

## Configure a static IP

ref. https://blog.gtwang.org/linux/centos-linux-static-network-configuration-tutorial/

## Install mstflint

    cd workspace
    wget https://rpmfind.net/linux/centos/7.9.2009/os/x86_64/Packages/mstflint-4.13.3-2.el7.x86_64.rpm
    sudo rpm -ivh mstflint-4.13.3-2.el7.x86_64.rpm

## Install the mft kernel module

```
localhost:~$ scl enable devtoolset-8 'bash'
localhost:~$ which gcc
/opt/rh/devtoolset-8/root/usr/bin/gcc
localhost:~$ pwd
/home/sysadmin
localhost:~$ ls
mft-4.18.0-106-x86_64-rpm  mft-4.18.0-106-x86_64-rpm.tgz
localhost:~$ cd mft-4.18.0-106-x86_64-rpm/
localhost:~/mft-4.18.0-106-x86_64-rpm$ ls
install.sh  LICENSE.txt  old-mft-uninstall.sh  RPMS  SRPMS  uninstall.sh
localhost:~/mft-4.18.0-106-x86_64-rpm$ sudo ./install.sh
Password:
-I- Removing any old MFT file if exists...
-I- Building the MFT kernel binary RPM...
-I- Installing the MFT RPMs...
Preparing...                          ################################# [100%]
Updating / installing...
   1:kernel-mft-4.18.0-5.10.112_200.47################################# [100%]
Preparing...
################################# [100%]
Updating / installing...
   1:mft-4.18.0-106                   ################################# [100%]
-I- In order to start mst, please run "mst start".
localhost:~/mft-4.18.0-106-x86_64-rpm$ mst start
-E- You must be root to use mst tool
localhost:~/mft-4.18.0-106-x86_64-rpm$ sudo mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
-W- Missing "lsusb" command, skipping MTUSB devices detection
Unloading MST PCI module (unused) - Success
Unloading MST PCI configuration module (unused) - Success
localhost:~/mft-4.18.0-106-x86_64-rpm$ rpm -qa |grep kernel
kernel-5.10.112-200.47.tis.el7.x86_64
kernel-modules-extra-5.10.112-200.47.tis.el7.x86_64
kernel-mft-4.18.0-5.10.112_200.47.tis.el7.x86_64.x86_64
kernel-devel-5.10.112-200.47.tis.el7.x86_64
kernel-tools-libs-5.10.112-200.47.tis.el7.x86_64
erlang-kernel-18.3.4.4-2.el7.x86_64
mlnx-ofa_kernel-5.5-OFED.5.5.1.0.3.1.tis.34.x86_64
kernel-core-5.10.112-200.47.tis.el7.x86_64
kernel-headers-5.10.112-200.47.tis.el7.x86_64
kernel-tools-5.10.112-200.47.tis.el7.x86_64
mlnx-ofa_kernel-modules-5.5-OFED.5.5.1.0.3.1.tis.34.x86_64
kernel-modules-5.10.112-200.47.tis.el7.x86_64
```

[mft-4.18.0-106-x86_64-rpm.tgz needs to be placed on 60.250.198.199]

Download link for mft-4.18.0-106-x86_64-rpm.tgz:
https://network.nvidia.com/products/adapter-software/firmware-tools/

![](https://i.imgur.com/Wt32v2A.png)

## Install bootstrap by ansible

tips: Run the bootstrap inside screen so that a dropped SSH connection does not stop the command.

```
screen
# Detach: press Ctrl + a, then d
# Leave the current window: after entering the command to run, press Ctrl + c
# Return to screen: screen -r, or screen -r <id>
# In another window:
tail -f ansible.log
```

https://docs.starlingx.io/deploy_install_guides/r5_release/bare_metal/aio_simplex_install_kubernetes.html

#### Run the Ansible bootstrap playbook

```
sudo ntpdate 0.pool.ntp.org 1.pool.ntp.org
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/bootstrap.yml
```

## Once bootstrap completes without errors

```
# In controller-0
# Upload license
# file(windriver.lic) into /home/sysadmin
# Run the commands below to update the license. This is not necessary within 3 days
# after installation, since the system runs normally for 72 hours without a license.
source /etc/platform/openrc
# If you have already uploaded the official license file into /home/sysadmin/, install it;
# otherwise skip the license-install step.
system license-install ./windriver.lic
system license-install ./CompalElectronics.lic
# Run the commands below in the openrc system shell
OAM_IF=enp0s3
system host-if-modify controller-0 $OAM_IF -c platform
system interface-network-assign controller-0 $OAM_IF oam
# Add NTP servers
system ntp-modify ntpservers=0.pool.ntp.org,1.pool.ntp.org
# Create Ceph storage
system storage-backend-add ceph --confirmed
system host-disk-list controller-0
system host-disk-list controller-0 | awk '/\/dev\/sdb/{print $2}' | xargs -i system host-stor-add controller-0 osd {}
system host-stor-list controller-0
# Unlock controller-0 once system host-stor-list shows the storage is ready
system host-unlock controller-0
# Once controller-0 has rebooted successfully, log in again and run
source /etc/platform/openrc
```

#### Check the kernel version with uname -a

```
Linux controller-0 3.10.0-1127.el7.2.tis.x86_64 #1 SMP PREEMPT Sat Jun 27 18:46:23 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
```

#### Check the CentOS version with rpm -q centos-release

```
centos-release-7-6.1810.2.el7.centos.x86_64
```

### Install mlnx 5.4-1.0.3.0

```
wget http://60.250.198.199/MLNX_OFED_LINUX-5.4-1.0.3.0-rhel7.6-x86_64.tgz
tar axvf MLNX_OFED_LINUX-5.4-1.0.3.0-rhel7.6-x86_64.tgz
```

https://network.nvidia.com/products/ethernet-drivers/linux/mlnx_en/

![](https://i.imgur.com/FSoU9S9.png)

### Install Mellanox OFED 5.5-1.0.3.2

https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/

![](https://i.imgur.com/qO7qIii.png)

### Install libatomic

```
wget https://rpmfind.net/linux/centos/7.9.2009/os/x86_64/Packages/libatomic-4.8.5-44.el7.x86_64.rpm
# Install libatomic
sudo rpm -ivh libatomic-4.8.5-44.el7.x86_64.rpm
#
# The package is removed on reboot, so first copy the files out to /home/sysadmin/usrlib/
controller-0:~$ sudo rpm -ql libatomic
/usr/lib64/libatomic.so.1
/usr/lib64/libatomic.so.1.0.0
```

3. Run the gtpd daemon

```
$ sudo /opt/mec/sbin/gtpd -a 37:00.0 -a 37:00.1
/opt/mec/sbin/gtpd: /lib64/libmlx5.so.1: version `MLX5_OFED' not found (required by /opt/mec/sbin/gtpd)
MLNX_OFED_LINUX-5.4-1.0.3.0-rhel7.6-x86_64.tgz
```

### Required libraries

1. MLNX_OFED_LINUX-5.4-1.0.3.0-rhel7.6-x86_64.tgz
2. fast.dpdk.org/rel/dpdk-21.08.tar.xz
3. distfiles.dereferenced.org/pkgconf/pkgconf-1.8.0.tar.xz
4. libatomic-4.8.5-44.el7.x86_64.rpm

### build upf_gtpd_dpdk binary

Prerequisites:

    cd imec2.8_windriver_helm3/source_patch/tools/dpdk_sdk_centos/

The Dockerfile and mlnx_en.repo inside upf_gtpd_dpdk must be changed to mlnx-en-5.5-1.0.3.2-rhel7.6-x86_64.tgz

```
make build_binary_upf_gtpu_dpdk_centos
```

## Windriver configuration

### Configure SR-IOV

```
# WRCP: enable PCI-SRIOV interfaces
# Enter the WRCP system configuration CLI
source /etc/platform/openrc
# The host must be locked before adjusting resources
export NODE=controller-0
system host-lock ${NODE}
# List the NICs on which to enable SR-IOV, e.g.
# ens2f0, and record the corresponding UUID
system host-if-list -a ${NODE}
# Enable SR-IOV on the interfaces used for N3, N4/N9, and N6
# (SR-IOV must be enabled for the NIC in the BIOS; some are not enabled by default)
system host-if-modify -m 1500 -N <vf-num> -c pci-sriov ${NODE} <sriov1-if-uuid>
# Create the data network. The name is fixed and maps to the pcidp resource name
DATANET1='mgt_dpdk'
system datanetwork-add ${DATANET1} flat
# Bind data networks to SR-IOV interfaces
system interface-datanetwork-assign ${NODE} <sriov1-if-uuid> ${DATANET1}
# Enable the K8s SR-IOV device plugin
system host-label-assign controller-0 sriovdp=enabled
# Unlock the host and wait for it to reboot so the settings take effect
system host-unlock ${NODE}
```

### Configure hugepages

DPDK now runs as the native version, so run these commands directly.

```
source /etc/platform/openrc
system host-lock controller-0
# Show the current hugepage settings
system host-memory-list controller-0
# Set the hugepages
system host-memory-modify controller-0 0 -f application -2M 1
system host-unlock controller-0
```

## Install iMEC (helm3 by Compal)

### upf-mgt: remove the extra route rule so that notc can be pinged

```
kubectl exec -ti -n mec-system upf-mgt-XXXX bash
route del -net 0.0.0.0 gw 0.0.0.0
```

### notc steps

http://10.194.87.21:17987/v2/docs#/

1. **POST** upf-mgt **subscribe**

```
{
  "subscriber": "MEC-UPF-PROD-TEST",
  "address": "upf-mgt",
  "port": 15001,
  "dst_offload_url": "api/destination_offload",
  "src_offload_url": "api/source_offload"
}
```

2.
   **POST source offload**

```
{
  "node": "MEC-UPF-PROD-TEST",
  "address": "0.0.0.0"
}
```

### Confirm that DPDK runs correctly and all versions match

The DPDK used on Windriver needs a dedicated build.

```
cd imec2.8_windriver_helm3/source_patch/tools/dpdk_sdk_centos
make build_centos
```

```
cd imec2.8_windriver_helm3/source_patch/upf_gtpu_dpdk
./build_centos.sh
```

Copy the gtpd binary produced by build_centos.sh to /opt/mec/sbin on the Windriver node.

```
scp gtpd sysadmin@10.194.87.21:~/opt/mec/sbin
```

#### Build the CentOS gtpd binary in docker

```
docker stop dpdk_sdk_centos
docker rm dpdk_sdk_centos
docker run -ti --env PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig --name dpdk_sdk_centos dpdk_sdk_centos bash
```

### Enable GTP support for gtpd

```
sudo mstconfig -d 37:00.0 query
sudo mstconfig -d 37:00.0 set FLEX_PARSER_PROFILE_ENABLE=3
# Reboot after enabling the feature
sudo reboot
```

### Steps to run GTPD

1. Look up the Mellanox interfaces; ens2f1/ens2f0 should be visible

```
vim /var/log/dmesg
# Search for mlx5 by typing /mlx5
[    4.115405] mlx5_core 0000:37:00.1 ens2f1: renamed from eth1000
[    4.120238] svcrdma: svcrdma is obsoleted, loading rpcrdma instead
[    4.121969] xprtrdma: xprtrdma is obsoleted, loading rpcrdma instead
[    4.125091] mlx5_core 0000:37:00.0 ens2f0: renamed from eth1006
```

2. Confirm that ifconfig shows ens2f1/ens2f0

```
ens2f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::e42:a1ff:fec4:79d8  prefixlen 64  scopeid 0x20<link>
        ether 0c:42:a1:c4:79:d8  txqueuelen 1000  (Ethernet)
        RX packets 2  bytes 120 (120.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1499  bytes 412418 (402.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens2f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::e42:a1ff:fec4:79d9  prefixlen 64  scopeid 0x20<link>
        ether 0c:42:a1:c4:79:d9  txqueuelen 1000  (Ethernet)
        RX packets 4  bytes 240 (240.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1497  bytes 412238 (402.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
```

If ens2f0/ens2f1 is not running:

```
sudo ifconfig ens2f0 0.0.0.0 up
```

3.
   Edit /etc/pcidp/config.json

```
{
  "resourceList": [
    {
      "resourceName": "mgt_dpdk",
      "selectors": {
        "vendors": ["15b3"],
        "pfNames": ["ens2f0"]
      }
    }
  ]
}
```

4. Allocate hugepages

```
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
```

5. Run openibd and check the device status

```
sudo /etc/init.d/openibd status
HCA driver loaded
Configured Mellanox EN devices:
enp55s1 enp55s1f1 enp55s1f2 enp55s1f3 enp55s1f4 enp55s1f5 enp55s1f6 enp55s1f7 enp55s2 enp55s2f1 ens2f0 ens2f1 ens2f2 ens2f4 ens2f5 ens2f6 ens2f7
Currently active Mellanox devices:
ens2f0 ens2f1
The following OFED modules are loaded:
rdma_ucm rdma_cm ib_ipoib mlx5_core mlx5_ib ib_uverbs ib_umad ib_cm ib_core mlxfw
```

#### If the HCA driver is not found, stop and then start openibd to recover

```
sudo /etc/init.d/openibd stop
sudo /etc/init.d/openibd start
```

6. Check the bus number

```
lspci | grep Mell
12:00.0 Ethernet controller: Mellanox Technologies MT28841
12:00.1 Ethernet controller: Mellanox Technologies MT28841
12:00.2 Ethernet controller: Mellanox Technologies MT28850
12:00.3 Ethernet controller: Mellanox Technologies MT28850
12:00.4 Ethernet controller: Mellanox Technologies MT28850
12:00.5 Ethernet controller: Mellanox Technologies MT28850
12:00.6 Ethernet controller: Mellanox Technologies MT28850
12:00.7 Ethernet controller: Mellanox Technologies MT28850
12:01.0 Ethernet controller: Mellanox Technologies MT28850
12:01.1 Ethernet controller: Mellanox Technologies MT28850
```

7. Configure gtpd

[gtpd port settings]

```
$ vim /lib/systemd/system/gtpd.service
[Service]
EnvironmentFile=-/etc/default/gtpd
ExecStart=/opt/mec/sbin/gtpd -a 37:00.4 -a 37:01.4
ExecReload=/opt/mec/sbin/gtpd

[Unit]
Description=UPF GTPU daemon
DefaultDependencies=no
After=network.target mst.service
```

[gtpd IP plane settings]

```
$ vim /etc/default/gtpd
DAEMON=yes
N3_ADDRESS=192.168.103.51
N6_ADDRESS=192.168.105.132
N9_ADDRESS=127.0.0.1
N6_GATEWAY=192.168.105.254
GTPU_MGR_PORT=15000
```

8. Run gtpd

```
sudo /opt/mec/sbin/gtpd -a 12:00.4 -a 12:01.4
```

9. Add k200 to the docker group in /etc/group

```
vim /etc/group
...
docker:x:967:k200,ki83tty
...
```

10. The docker socket sometimes returns permission denied

```
sudo chmod 666 /var/run/docker.sock
```

# Notes

## How to reset the root password on CentOS

ref: https://autumncher.pixnet.net/blog/post/462809249-%E3%80%90centos7%E3%80%91%E5%A6%82%E4%BD%95%E9%87%8D%E7%BD%AEroot%E5%AF%86%E7%A2%BC-(how-to-recover-root-pas

## helm3 delete mec

```
vim deploy/roles/undeploy/k8s/tasks/main.yml
- name: Uninstall the mec helm chart
  shell: "set -o pipefail && helm uninstall mec || true"
```

## Configuring RAID 5

Bootstrap needs two independent disks (the OS must see both).

Launching ACU with HP Intelligent Provisioning (Gen8 or later)

1. Boot the server.
2. Press **F10** to launch HP **Intelligent Provisioning**.
3. At the main screen, select Perform Maintenance.
4. At the Maintenance screen, select Array Configuration Utility (ACU). The system launches the ACU GUI.

## Install Nvidia CX-6 Driver

```
wget http://60.250.198.199/MLNX_OFED_LINUX-5.5-1.0.3.2-rhel7.6-x86_64.tgz
tar axvf MLNX_OFED_LINUX-5.5-1.0.3.2-rhel7.6-x86_64.tgz
rpm --import http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
```

add /etc/yum.repos.d/mlnx_en.repo

```
[mlnx_en]
name=MLNX_EN Repository
baseurl=file:///tmp/MLNX_OFED_LINUX-5.5-1.0.3.2-rhel7.6-x86_64/RPMS
enabled=1
gpgcheck=1
```

    sudo yum install mlnx-ofed-dpdk

## Install the mft kernel module

```
sudo rpm -ivh mft-4.17.2-12.x86_64.rpm
```

## Check the GLIBC version

    [root@58dbc2e0b94b /]# ls -la /lib64/libc.so.6
    lrwxrwxrwx 1 root root 12 Jun 24 03:01 /lib64/libc.so.6 -> libc-2.17.so
    [root@58dbc2e0b94b /]# ls -al /lib64/libmlx5.so.1
    lrwxrwxrwx 1 root root 20 Jun 24 03:00 /lib64/libmlx5.so.1 -> libmlx5.so.1.19.35.0

1. /lib64/libc.so.6 -> libc-2.17.so

```
$ strings /lib64/libc.so.6 |grep GLIBC_
GLIBC_2.2.5
GLIBC_2.2.6
GLIBC_2.3
GLIBC_2.3.2
GLIBC_2.3.3
GLIBC_2.3.4
GLIBC_2.4
GLIBC_2.5
GLIBC_2.6
GLIBC_2.7
GLIBC_2.8
GLIBC_2.9
GLIBC_2.10
GLIBC_2.11
GLIBC_2.12
GLIBC_2.13
GLIBC_2.14
GLIBC_2.15
GLIBC_2.16
GLIBC_2.17
```

2.
   /lib64/libmlx5.so.1 -> libmlx5.so.1.21.37.0

```
$ strings /lib64/libmlx5.so.1 |grep libmlx5
libmlx5.so.1
libmlx5.so.1.21.37.0.debug
```

## Install patches (only needed for Windriver 20.06)

The patch files are on the deploy node under ~/patch/

```
# Install wind-river-cloud-platform-host-installer-20.06-38-PATCH_0009.iso first,
# then install patches WRCP_20.06_PATCH_0010 to 0016 (skipping 0013) with the steps below,
# then run the bootstrap.
# The deploy node stores the patch files; send them to the Windriver server, then run:
sudo sw-patch upload *.patch
sudo sw-patch apply --all
sudo sw-patch install-local
sudo reboot
```
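
Before uploading, it can help to confirm that every patch in the 0010 to 0016 range (skipping 0013) is actually present. The snippet below is only a sketch: the function name `wrcp_patch_ids` is hypothetical, and the assumption that each file is named `<ID>.patch` under `~/patch/` is not confirmed by these notes, so adjust the naming to match the real files.

```shell
#!/bin/sh
# Hypothetical helper: print the patch IDs WRCP_20.06_PATCH_0010..0016, skipping 0013.
wrcp_patch_ids() {
    n=10
    while [ "$n" -le 16 ]; do
        if [ "$n" -ne 13 ]; then
            printf 'WRCP_20.06_PATCH_%04d\n' "$n"
        fi
        n=$((n + 1))
    done
}

# Report expected patch files that are absent from ~/patch/
# (the "<ID>.patch" file naming is an assumption).
for id in $(wrcp_patch_ids); do
    [ -f "$HOME/patch/$id.patch" ] || echo "missing: $id.patch"
done
```

Run on the node that holds the patch files, this prints one `missing:` line per absent file and nothing when all six are present; only then continue with `sudo sw-patch upload *.patch`.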