# Podman Migrate
# Podman Install
## Updating crun
* [crun official GitHub repository](https://github.com/containers/crun)

### Step 1: Install dependencies: crun | Dependencies - Ubuntu
```
$ sudo apt-get install -y make git gcc build-essential pkgconf libtool \
    libsystemd-dev libprotobuf-c-dev libcap-dev libseccomp-dev libyajl-dev \
    libgcrypt20-dev go-md2man autoconf python3 automake
```
### Step 2: Download crun
```
$ git clone https://github.com/containers/crun.git
$ cd crun
```
### Step 3: Build crun | build
```
$ ./autogen.sh
$ ./configure
$ make
```
```
$ sudo make install
```
### Step 4: Fix the Podman config
#### Step 4.1: Get the crun path
```
$ which crun
```
#### Step 4.2: Edit containers.conf
```
$ sudo vim /usr/share/containers/containers.conf
# Uncomment the crun = [.....] entry and add the path printed by $ which crun
## In vim, type /<search string> + Enter to jump to the entry
## Edit
#crun = [
# >"/usr/local/bin/crun", # added: the output of which crun
# "/usr/bin/crun",
# "/usr/sbin/crun",
# "/usr/local/bin/crun",
# "/usr/local/sbin/crun",
# "/sbin/crun",
# "/bin/crun",
# "/run/current-system/sw/bin/crun",
#]
```
## criu
```
$ wget http://github.com/checkpoint-restore/criu/archive/v3.18/criu-3.18.tar.gz
$ tar -xvf criu-3.18.tar.gz
$ sudo apt-get update
$ sudo apt install automake libtool autoconf curl make g++ unzip
$ sudo apt-get install libnet-dev
$ sudo apt-get install libnl-3-dev
$ sudo apt-get install libcap-dev
$ sudo apt-get install asciidoc
$ sudo apt-get install protobuf-c-compiler
$ sudo apt install protobuf-compiler
$ sudo apt-get install libbsd-dev
$ sudo apt-get install nftables
$ sudo apt-get install libnftables-dev
$ cd criu-3.18   # enter the extracted source tree (directory name assumed from the tarball)
$ make
$ sudo make install
$ sudo criu check
$ sudo criu -V
```
* Reference: [Installing criu-3.15 on Ubuntu](https://blog.csdn.net/code_aJack/article/details/115316996)
* Reference: [CRIU official site: Installation](https://criu.org/Installation)
* Reference: [Protocol Buffers - Google's data interchange format](https://github.com/protocolbuffers/protobuf/blob/main/src/README.md)

## Podman
### Install dependencies
* Reference: [Podman Install | Building from scratch - Ubuntu
](https://podman.io/docs/installation#building-from-scratch)
```
sudo apt-get install \
  btrfs-progs \
  crun \
  git \
  golang-go \
  go-md2man \
  iptables \
  libassuan-dev \
  libbtrfs-dev \
  libc6-dev \
  libdevmapper-dev \
  libglib2.0-dev \
  libgpgme-dev \
  libgpg-error-dev \
  libprotobuf-dev \
  libprotobuf-c-dev \
  libseccomp-dev \
  libselinux1-dev \
  libsystemd-dev \
  pkg-config \
  uidmap
```
### Download Podman
* Reference: [Podman Install | Get Source Code](https://podman.io/docs/installation#get-source-code)
```
git clone https://github.com/containers/podman/
cd podman
```
### Install Podman
```
# Install podman from apt first so that the file locations are set up automatically
$ sudo apt-get update
$ sudo apt-get install podman

# Upgrade Podman
## If you install from source directly, the expected files are not found and some dependencies are not yet installed
$ make BUILDTAGS="selinux seccomp" PREFIX=/usr
$ sudo make install PREFIX=/usr
```
## Checking the installed versions
```
renjie@worker2:~$ podman -v
podman version 4.5.1
renjie@worker2:~$ criu -V
Version: 3.18
renjie@worker2:~$ crun -v
crun version 1.8.5.0.0.0.44-017b
commit: 017bd29f5a99823c6c598fa875a41f316f16e58b
rundir: /run/user/1000/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
```
# Podman Migrate Example
## Looper
### At host 1
```
$ sudo podman run -d --name looper busybox /bin/sh -c \
    'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
$ sudo podman ps
$ sudo podman logs <containerID>
$ sudo podman container checkpoint <containerID> -e /tmp/chkpt_<containerID>.tar.gz
$ sudo scp /tmp/chkpt_<containerID>.tar.gz renjie@10.52.52.86:/tmp
```
### At host 2
```
$ sudo podman container restore -i /tmp/chkpt_<containerID>.tar.gz
```
### At host 1 and 2
```
$ sudo podman logs -f <containerID>
```
## MNIST Training
### At master
```
# Pull the tensorflow image
$ sudo podman image pull tensorflow/tensorflow:latest-py3
?
Please select an image:
    registry.fedoraproject.org/tensorflow/tensorflow:latest-py3
    registry.access.redhat.com/tensorflow/tensorflow:latest-py3
  ▸ docker.io/tensorflow/tensorflow:latest-py3
    quay.io/tensorflow/tensorflow:latest-py3
$ sudo podman image ls

# Download the MNIST project
$ git clone https://gitlab.com/ncu111522119/migratemnisttraningcontainer.git
$ cd migratemnisttraningcontainer/

## Build the MNIST training image
$ sudo podman build . -t traning_mnist
$ sudo podman image ls
REPOSITORY               TAG     IMAGE ID      CREATED        SIZE
localhost/traning_mnist  latest  a4f6f9f59f61  2 minutes ago  1.67 GB

# Run the MNIST training container
$ sudo podman run -d --name mnist_container localhost/traning_mnist
$ sudo podman logs -f mnist_container

# Create the checkpoint
$ sudo podman container checkpoint mnist_container -e /tmp/chkpt_1f020fe25afc.tar.gz

# Export the traning_mnist image
$ sudo podman save -o traning_mnist_master.tar localhost/traning_mnist

## Copy the checkpoint and the image to the worker
$ sudo scp /tmp/chkpt_1f020fe25afc.tar.gz renjie@10.52.52.87:/tmp
$ sudo scp ./traning_mnist_master.tar renjie@10.52.52.87:/home/renjie/lab
```
### At worker
```
$ sudo podman image pull tensorflow/tensorflow:latest-py3
# Load the image before restoring; the restore needs it in local storage
$ sudo podman load -i traning_mnist_master.tar
# Restore from the checkpoint archive copied from the master
$ sudo podman container restore -i /tmp/chkpt_1f020fe25afc.tar.gz
```
# Podman Error
## Error: Runtime Not Support
crun must be updated, and the path of the updated binary must then be written into Podman's config file.
### Updating crun
* [crun official GitHub repository](https://github.com/containers/crun)
#### Install dependencies: crun | Dependencies - Ubuntu
```
$ sudo apt-get install -y make git gcc build-essential pkgconf libtool \
    libsystemd-dev libprotobuf-c-dev libcap-dev libseccomp-dev libyajl-dev \
    libgcrypt20-dev go-md2man autoconf python3 automake
```
#### Download crun
```
$ git clone https://github.com/containers/crun.git
$ cd crun
```
#### Build: crun | build
```
$ ./autogen.sh
$ ./configure
$ make
```
```
$ sudo make install
```
### Fix the Podman config
#### Step 1: Get the crun path
```
$ which crun
```
#### Step 2: Edit containers.conf
```
$ sudo vi \
/usr/share/containers/containers.conf
# Uncomment the crun = [.....] entry and add the path printed by $ which crun
```
* [Enabling cgroups v2 in Ubuntu](https://blog.davy.tw/posts/enable-cgroups-v2-in-ubuntu/)
## Error: could not find a working conmon binary
### Problem description
Error: could not find a working conmon binary (configured options: [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon]: invalid argument)
### Solution
Install [conmon](https://github.com/containers/conmon/)
#### Install dependencies | Ubuntu
```
$ sudo apt-get install \
  gcc \
  git \
  libc6-dev \
  libglib2.0-dev \
  libseccomp-dev \
  pkg-config \
  make \
  runc
```
#### Install
```
$ git clone https://github.com/containers/conmon.git
$ cd conmon   # build inside the cloned repository
$ make
$ sudo make install
```
### References
* [How to build and deploy podman?](https://www.cmdschool.org/archives/16448)
* [Error: could not get runtime: could not find a working conmon binary](https://github.com/containers/podman/issues/5076)
* [conmon](https://github.com/containers/conmon/)
## Error: /etc/containers/registries.conf: No such file or directory
### Solution
Install podman from apt first, so that the default config files are created.
```
$ sudo apt-get update
$ sudo apt-get install podman
```
## Error: no unqualified-search registries are defined in "/etc/containers/registries.conf"
### Problem description
Error: short-name "busybox" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"
### Solution
```
$ echo "unqualified-search-registries = [\"docker.io\"]" | sudo tee -a /etc/containers/registries.conf
```
### References
* [Podman "Error: no registries found in registries.conf, a registry must be provided" while logging/pulling from docker.io](https://github.com/containers/podman/issues/16096)
* [Testcontainers with Podman - Verify that Podman is installed correctly](https://github.com/evgeniy-khist/podman-testcontainers#verify-that-podman-is-installed-correctly)
## Error: invalid repository name
### Problem description
```
## At worker
$ sudo podman container restore -i \
/tmp/chkpt_97d24604241d.tar.gz
Error: creating container storage: parsing named reference "a4f6f9f59f618396f4c49808098627dc810fbade1af32de34e50fd01235c6b22": invalid repository name (a4f6f9f59f618396f4c49808098627dc810fbade1af32de34e50fd01235c6b22), cannot specify 64-byte hexadecimal strings

## At master
# The localhost/traning_mnist image has the same ID as the one in the error above
$ sudo podman image ls
REPOSITORY               TAG     IMAGE ID      CREATED         SIZE
localhost/traning_mnist  latest  a4f6f9f59f61  18 minutes ago  1.67 GB
```
The image the checkpoint refers to exists only on the master, so it has to be exported there and loaded on the worker before restoring.
### Solution
```
## At master
$ sudo podman save -o traning_mnist_master.tar localhost/traning_mnist
$ sudo scp ./traning_mnist_master.tar renjie@10.52.52.87:/home/renjie/lab

## At worker
$ sudo podman load -i traning_mnist_master.tar
$ sudo podman image ls
REPOSITORY               TAG     IMAGE ID      CREATED         SIZE
localhost/traning_mnist  latest  a4f6f9f59f61  30 minutes ago  1.67 GB
```
* Reference: [Podman example: packaging a local image to send to others or upload](https://blog.csdn.net/qq_45834685/article/details/121148116)
## Error: container does not start and shows **Exited (132)**
### Problem description
#### Podman
```
$ sudo podman ps -a
CONTAINER ID  IMAGE                           STATUS                       NAMES
5f184f62ed7a  localhost/traning_mnist:latest  Exited (132) 16 seconds ago  mnist_container
```
The status shows Exited (132). Exit code 132 is 128 + 4, meaning the process was killed by SIGILL (illegal instruction), which matches the Python error below.
#### Running Python
```
$ python3 train_mnist.py
Illegal instruction (core dumped)
```
* Reference: [Easily fixing the TensorFlow error "illegal instruction (core dumped)"](https://zhuanlan.zhihu.com/p/145301215)
#### Solution
Enable AVX/AVX2 in the virtual machine.
* Windows Hyper-V must be disabled first
```
> Disable-WindowsOptionalFeature -online -FeatureName VirtualMachinePlatform
```
* Enable AVX ![](https://hackmd.io/_uploads/SyupkH553.png)
* Enable Nested VT-x/AMD-V (the checkbox becomes ticked)
```
> vboxmanage modifyvm [VM name] --nested-hw-virt on
```
### Solution
{%hackmd 9XNm_9uQRt6Yy8tgYArrXw#Docker-Exit-132 %}
## E: Unable to locate package podman
### Solution: upgrade Ubuntu
* [Upgrading Ubuntu from 20.04 to 22.04 (Jammy Jellyfish)](https://www.kwchang0831.dev/dev-env/ubuntu/upgrade-from-20.04-to-22.04)
## E: Sub-process /usr/bin/dpkg returned an error code (1)
### Solution
* [Solve E: Sub-process /usr/bin/dpkg returned an
error code (1) Error under Ubuntu System](https://medium.com/@scofield44165/ubuntu%E4%B8%AD%E8%A7%A3%E6%B1%BAe-sub-process-usr-bin-dpkg-returned-an-error-code-1-%E5%A0%B1%E9%8C%AF-solve-e-sub-process-f64833f9105f)
### new yaru-theme-gnome-shell package pre-installation script subprocess returned error exit status 1
new yaru-theme-gnome-shell package pre-installation script subprocess returned error exit status 1

Errors were encountered while processing: /var/cache/apt/archives/yaru-theme-gnome-shell_22.04.5_all.deb

E: Sub-process /usr/bin/dpkg returned an error code (1)
#### Solution: remove the package, then reinstall it
```
$ sudo apt remove yaru-theme-gnome-shell
$ sudo apt install yaru-theme-gnome-shell
```
## The following packages have unmet dependencies:
## note: module requires Go 1.19
note: module requires Go 1.19

`make: *** [Makefile:342: bin/podman] Error 2`
### Solution
#### Check the version
```
$ go version
go version go1.18.1 linux/amd64
```
#### Upgrade Go
```
$ export GOPATH=~/go
$ git clone https://go.googlesource.com/go $GOPATH
$ cd $GOPATH
$ cd src
$ ./all.bash
$ export PATH=$GOPATH/bin:$PATH
$ nano ~/.bash_profile
export GOPATH=~/go               # paste both lines into the file so GOPATH is defined in new shells
export PATH=$GOPATH/bin:$PATH
$ source ~/.bash_profile         # apply the environment variables
```
## CRIU: Error (criu/sockets.c:210): sockets: Diag module missing (-2)
```
$ sudo criu check
Error (criu/sockets.c:210): sockets: Diag module missing (-2)
Error (criu/sockets.c:210): sockets: Diag module missing (-2)
Error (criu/sockets.c:210): sockets: Diag module missing (-2)
Error (criu/util.c:643): exited, status=3
Error (criu/libnetlink.c:54): -95 reported by netlink: Operation not supported
Error (criu/net.c:3791): net: Unable to create a veth pair: -95
Warn (criu/net.c:3817): net: NSID isn't reported for network links
Error: Could not process rule: Operation not supported
create table inet CRIU
^^^^^^^^^^^^^^^^^^^^^^^
Error (criu/kerndat.c:1508): Can't create nftables table
Error (criu/util.c:1413): Can't wait or bad status: errno=0, status=256
Error (criu/kerndat.c:1632): kerndat_has_nftables_concat
failed when initializing kerndat.
Error (criu/crtools.c:263): Could not initialize kernel features detection.
```
### Solution
* [Failing lxd migration - Error (sockets.c:129): Diag module missing (-2)](https://bugs.launchpad.net/ubuntu/+source/criu/+bug/1591729)
    * You need to install the linux-image-extra-`uname -r`.
* [How can I resolve this problem : Unable to locate package linux-image-extra-4.15.0-29-generic](https://askubuntu.com/questions/1082472/how-can-i-resolve-this-problem-unable-to-locate-package-linux-image-extra-4-15)
    * Install linux-image-extra-`uname -r`
* [what are linux-image-extra and linux-image-generic? [duplicate]](https://askubuntu.com/questions/621234/what-are-linux-image-extra-and-linux-image-generic)
```
$ sudo apt update
$ sudo apt upgrade
$ sudo apt install --reinstall linux-image-generic
```
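
Several of the errors in this note come down to a tool being older than what the build expects, for example Go 1.18.1 when the Podman module requires Go 1.19. The sketch below shows one possible way to compare a found version against a required minimum before rebuilding. The helper names `version_ge` and `check_min` are hypothetical (they are not part of Podman, CRIU, or crun), and the comparison assumes GNU `sort` with the `-V` (version sort) option, as shipped on Ubuntu.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B.
# Assumption: GNU sort with -V (version sort) is available.
version_ge() {
    # The lower of the two versions sorts first; A >= B exactly when that one is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME FOUND REQUIRED: print a one-line verdict (hypothetical helper).
check_min() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: ok (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Example using the versions quoted in this note.
check_min go 1.18.1 1.19
check_min go 1.19 1.19
```

To check the real toolchain, the found version could be taken from the tool itself, for instance `go version | awk '{print $3}' | sed 's/^go//'` for Go; the minimums for the other tools would have to be filled in from their own build requirements.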