[Cloud] K8s / Operator / GPU
===
###### tags: `Cloud / K8s`
###### tags: `Cloud`, `K8s`, `operator`, `GPU`

<br>

![](https://i.imgur.com/vsCp0RX.png)

<br><br>

[TOC]

<br>

## [Clara with Kubernetes - GPU Multi Node](https://www.youtube.com/watch?v=lw0c1Ah-c-E)

> Topic: GPU Operator
> Covers:
> - How to do intelligent multi-GPU scheduling with K8s
> - Combining it with the NVIDIA Clara computing platform to train Deep Learning models for several projects at the same time

### ML pain points and needs

- Work in AI for healthcare is in full swing, but so much time goes into preprocessing that little is left to think about:
    - the neural network architecture
    - the number of layers
    - the loss function
    - the kernel size (convolution kernel)
    - the learning rate
    - and other hyperparameter choices
- Researchers would rather spend their effort on
    - **the best neural network architecture, not on engineering details**
- What is needed is a tool that handles **automated model training and deployment** and makes the workflow efficient

<br>

### Engineering pain points and needs

- **Q:** AI workflows need NVIDIA GPU support. What tool helps integrate GPUs with K8s?
- **A:** GPU Operator

<br>

### Evolution of K8s and GPU integration

[![](https://i.imgur.com/ZsBz4vs.png)](https://i.imgur.com/ZsBz4vs.png)

- ### ==Version 1: the traditional approach==
    - Install the NVIDIA driver directly on the host OS
- ### ==Version 2: [NVIDIA Device Plugin](https://github.com/NVIDIA/k8s-device-plugin#nvidia-device-plugin-for-kubernetes)==
    - Need
        - K8s has to be able to use NVIDIA GPUs
    - Improvement
        - NVIDIA released the NVIDIA Device Plugin, which lets K8s discover GPU capabilities and information
    - Mechanism: host-level management
    - Drawback
        - Because the NVIDIA Device Plugin is a host-level mechanism, IT staff must maintain two different kinds of compute nodes (CPU and GPU)
- ### ==Version 3: [NVIDIA GPU Operator](https://github.com/NVIDIA/gpu-operator)==
    - Need
        - Reduce the maintenance burden that the NVIDIA Device Plugin puts on IT staff
    - Improvement
        - NVIDIA released the NVIDIA GPU Operator
    - Mechanism:
        - Runs the following four containerized services on top of the K8s cluster
            - NVIDIA Driver
            - NVIDIA Runtime
            - NVIDIA Device Plugin
            - NVIDIA Monitoring
        - Cluster-level management

<br>

### NVIDIA GPU Operator

- ### GitHub
    https://github.com/NVIDIA/gpu-operator
- ### Install:
    ```helm install nvidia/gpu-operator```
- ### The GPU Operator automatically performs three steps:
    [![](https://i.imgur.com/MXJtkra.png)](https://i.imgur.com/MXJtkra.png)
    - ==Step 1==: detect whether a node has NVIDIA GPU hardware
        - How: via the NFD (Node Feature Discovery) DaemonSet
    - ==Step 2==: install the runtime environment matching the GPU
        - How:
            1. install the CUDA runtime
            2. install the NVIDIA driver
            3. run a CUDA sample program to confirm the installation succeeded
    - ==Step 3==: make GPU resources available to the K8s cluster
        - How: install the NVIDIA Device Plugin (a verification sketch follows the command listing below)

<br>

### Hands-on example

> https://youtu.be/lw0c1Ah-c-E?t=259
> ![](https://i.imgur.com/kk2Eh3J.png)

- Environment
    - Master Node
    - Computing Node x 2
    - Clara AI jobs
    - Prometheus (monitors the compute nodes)
    - Grafana (GPU metrics from the nodes are sent to Grafana and shown on a dashboard)
- Master Node
    - Install the K8s cluster
    - Install the GPU Operator
- Computing Node
    - Join it to the Master Node
    - The GPU Operator automatically installs
        - NVIDIA Driver
        - CUDA Runtime
        - NVIDIA Device Plugin

<br>

- Commands used in the demo

```bash
sudo kubeadm init --pod-network-cidr=192.16.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Use Calico to build the pod network
kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calico.yaml

# Install the GPU Operator with its Helm chart
helm install --version 1.1.7 --devel nvidia/gpu-operator --wait --generate-name

# After the install you can see the GPU Operator components running
kubectl get pods --all-namespaces
# NAMESPACE   NAME
# default     gpu-operator-***-node-feature-discovery-master-***
# default     gpu-operator-***-node-feature-discovery-worker-***

# On the master node: generate a join token
kubeadm token create --print-join-command
# Copy the printed "kubeadm join ..." command and run it on each computing node
# to join that node to the master
```
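Once the operator's device plugin is running, each GPU node advertises an `nvidia.com/gpu` resource that pods can request. The following is a minimal verification sketch, not from the talk itself: the pod name is hypothetical, and it assumes a generic CUDA base image (the `nvidia/cuda:11.0-base` tag that also appears later in this note) and at least one free GPU.

```bash
# Schedule a throwaway pod that requests one GPU and prints nvidia-smi output
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test            # hypothetical name, for illustration only
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:11.0-base  # any CUDA base image works for nvidia-smi
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1         # resource exposed by the NVIDIA Device Plugin
EOF

# After the pod completes, read the log and clean up
kubectl logs gpu-smoke-test
kubectl delete pod gpu-smoke-test
```

If the pod stays `Pending` with an `Insufficient nvidia.com/gpu` event, the device plugin has not (yet) registered the node's GPUs.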
<br>

### References

- NGC: Clara 3D Spleen Segmentation
    https://ngc.nvidia.com/catalog/models/nvidia:med:clara_ct_seg_spleen_nmap
- GPU-Operator
    https://github.com/NVIDIA/gpu-operator
- NVIDIA Device Plugin
    https://github.com/NVIDIA/k8s-device-plugin
- EGX-Platform
    https://github.com/NVIDIA/egx-platform
- DeepOps
    https://github.com/NVIDIA/deepops
- GPU-Monitoring-Tools
    https://github.com/NVIDIA/gpu-monitoring-tools
- Grafana
    https://grafana.com/docs/grafana/latest/installation/debian/
- DCGM-Exporter
    https://hub.docker.com/r/nvidia/dcgm-exporter

<br>
<hr>
<br>

## [[Official docs] NVIDIA GPU OPERATOR](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html)

### Getting Started

This document provides instructions, including prerequisites, for getting started with the NVIDIA GPU Operator.

<br>

### Prerequisites

> **Before installing the GPU Operator, you should ensure that the Kubernetes cluster meets some prerequisites.**

1. **Nodes must not be pre-configured with NVIDIA components (driver, container runtime, device plugin).**
2. **Nodes must be configured with Docker CE/EE, ```cri-o```, or ```containerd```. For Docker, follow the official install [instructions](https://docs.docker.com/engine/install/).**
3. **If the HWE kernel (e.g. kernel 5.x) is used with Ubuntu 18.04 LTS, then the nouveau driver for NVIDIA GPUs must be blacklisted before starting the GPU Operator. Follow the steps in the CUDA installation [guide](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau-ubuntu) to disable the nouveau driver and update initramfs.** (A sketch of these steps follows this list.)
    - Check the kernel version
    ```bash
    # uname: print system information
    $ uname -r
    4.15.0-72-generic
    ```
    - [How to enable HWE (Hardware Enablement Stack) on Ubuntu 18.04 LTS](https://go-linux.blogspot.com/2019/02/ubuntu-1804-ltshwehardware-enablement.html)
        - Main purpose of HWE: upgrading the kernel
        - How to check and upgrade?
            1. Check whether the system already runs an HWE stack:
            ```bash
            $ hwe-support-status --verbose
            You are not running a system with a Hardware Enablement Stack. Your system is supported until April 2023.
            ```
            2. Enable HWE:
            ```bash
            sudo apt install --install-recommends linux-generic-hwe-18.04
            ```
4. **Node Feature Discovery (NFD) is required on each node. By default, NFD master and worker are automatically deployed. If NFD is already running in the cluster prior to the deployment of the operator, set the Helm chart variable ```nfd.enabled``` to false during the Helm install step.** (See the install sketch after this list.)
5. **For monitoring in Kubernetes 1.13 and 1.14, enable the kubelet KubeletPodResources feature gate. From Kubernetes 1.15 onwards, it is enabled by default.**
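Prerequisite 3 defers the details to the CUDA installation guide. A minimal sketch of the nouveau blacklist on Ubuntu, assuming the standard `/etc/modprobe.d/` location:

```bash
# Blacklist the in-kernel nouveau driver so the NVIDIA driver can load
cat <<'EOF' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF

# Rebuild the initramfs and reboot for the change to take effect
sudo update-initramfs -u
sudo reboot
```

For prerequisite 4, if NFD is already running in the cluster, the chart variable quoted above can be switched off at install time, for example:

```bash
# Skip the bundled NFD deployment because the cluster already runs its own
helm install --wait --generate-name nvidia/gpu-operator --set nfd.enabled=false
```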
<br>

### Install the GPU Operator

```bash
$ helm repo add nvidia https://nvidia.github.io/gpu-operator && helm repo update
"nvidia" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "nvidia" chart repository
...Successfully got an update from the "ingress-nginx" chart repository
...Unable to get an update from the "prometheus-community" chart repository (https://prometheus-community.github.io/helm-charts): open /home/ubuntu/.cache/helm/repository/prometheus-community-index.yaml: permission denied
...Successfully got an update from the "prometheus-msteams" chart repository
...Successfully got an update from the "elastic" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈ Happy Helming!⎈

$ helm install --wait --generate-name nvidia/gpu-operator
NAME: gpu-operator-1608259104
LAST DEPLOYED: Fri Dec 18 10:38:27 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

$ helm ls
NAME                     NAMESPACE  REVISION  UPDATED                                  STATUS    CHART               APP VERSION
gpu-operator-1608259104  default    1         2020-12-18 10:38:27.205876076 +0800 CST  deployed  gpu-operator-1.4.0  1.4.0

$ kubectl get crd
NAME                         CREATED AT
...
clusterpolicies.nvidia.com   2020-12-18T02:38:24Z
...
```

<br>

### Check the status

```bash
$ kubectl get ns
$ kubectl get all -n gpu-operator-resources
$ kubectl get all -n default
```

![](https://i.imgur.com/l4KhSGh.png)

```bash
$ kubectl get po -A | grep -i 'nvidia\|gpu\|operator'
```

[![](https://i.imgur.com/ZIavsxH.png)](https://i.imgur.com/ZIavsxH.png)
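Once the pods in `gpu-operator-resources` are all up, each GPU node should report `nvidia.com/gpu` under its capacity and allocatable resources. A small sketch for checking this (the grep is just a convenience filter):

```bash
# List the DaemonSets the operator created in its resources namespace
kubectl get ds -n gpu-operator-resources

# Show the GPU count each node reports under Capacity / Allocatable
kubectl describe nodes | grep -i "nvidia.com/gpu"
```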
<br>

### Uninstalling the GPU Operator

```bash
$ helm delete $(helm list | grep gpu-operator | awk '{print $1}')
$ kubectl get pods -n gpu-operator-resources
```

<br>
<hr>
<br>

## Esc4000 / Ubuntu 16.04 / GPU information

```bash
$ ll /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Oct 17 09:53 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Oct 17 09:53 /dev/nvidia1
crw-rw-rw- 1 root root 195,   2 Oct 17 09:53 /dev/nvidia2
crw-rw-rw- 1 root root 195,   3 Oct 17 09:53 /dev/nvidia3
crw-rw-rw- 1 root root 195, 255 Oct 17 09:53 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Oct 21 11:32 /dev/nvidia-modeset
crw-rw-rw- 1 root root 238,   0 Oct 17 09:53 /dev/nvidia-uvm
crw-rw-rw- 1 root root 238,   1 Oct 21 11:32 /dev/nvidia-uvm-tools
```

```bash
$ sudo cat /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "insecure-registries": ["10.78.26.170:30352", "10.78.26.170:30350", "10.78.26.20:30350", "10.78.26.20:31352", "10.78.26.200:31350", "10.78.26.200:31352"]
}
```

(A quick check that the `nvidia` runtime really is the Docker default is sketched at the end of this section.)

```bash
$ docker images
REPOSITORY                    TAG                 IMAGE ID       CREATED        SIZE
gcr.io/k8s-minikube/kicbase   v0.0.15-snapshot4   06db6ca72446   2 weeks ago    941MB
nvidia/cuda                   10.0-base           0f12aac8787e   2 months ago   109MB
nvidia/cuda                   11.0-base           2ec708416bb8   4 months ago   122MB
```

```bash
$ docker run --rm -it nvidia/cuda:10.0-base bash
root@550709fcfc8b:/# nvidia-smi
Fri Dec 25 06:16:35 2020
...(omitted)...

root@550709fcfc8b:/# ll /dev/nvidia*
crw-rw-rw- 1 root root 238,   0 Oct 17 01:53 /dev/nvidia-uvm
crw-rw-rw- 1 root root 238,   1 Oct 21 03:32 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Oct 17 01:53 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Oct 17 01:53 /dev/nvidia1
crw-rw-rw- 1 root root 195,   2 Oct 17 01:53 /dev/nvidia2
crw-rw-rw- 1 root root 195,   3 Oct 17 01:53 /dev/nvidia3
crw-rw-rw- 1 root root 195, 255 Oct 17 01:53 /dev/nvidiactl
```

```bash
$ minikube ssh
docker@minikube:~$ ls /dev/nvidia*
/dev/nvidia-modeset  /dev/nvidia-uvm  /dev/nvidia-uvm-tools  /dev/nvidia0  /dev/nvidia1  /dev/nvidia2  /dev/nvidia3  /dev/nvidiactl
```
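The `daemon.json` above sets `nvidia` as the default Docker runtime. A quick way to confirm the setting is active after editing the file, assuming Docker is managed by systemd (the sample output lines are illustrative):

```bash
# Reload Docker so the edited /etc/docker/daemon.json is picked up
sudo systemctl restart docker

# The registered runtimes and the default runtime are listed in `docker info`
docker info | grep -i runtime
# Runtimes: nvidia runc
# Default Runtime: nvidia
```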