# ViTPose Installation Guide (Linux)
ver.2024.02.20
Prerequisite: the official ViTPose can only be installed on a Linux system.
- - - -
## This part matters: <font color='red'>version conflicts are the biggest problem with mmcv</font>
```
Nvidia Quadro RTX 6000
torch: '2.0.1+cu117' torchvision: '0.15.2+cu117'  # seems to work
NVIDIA GeForce RTX 2080 Ti
torch: '1.13.1+cu117' torchvision: '0.14.1+cu117'  # also seems to work
```
Tentative conclusion: the common factor is probably CUDA 11.7 (needs more testing).
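To confirm which builds are actually active in an environment, a minimal check (the cuDNN line assumes a working CUDA install):
```bash
python -c "import torch, torchvision; \
print('torch:', torch.__version__, 'torchvision:', torchvision.__version__); \
print('CUDA:', torch.version.cuda, 'cuDNN:', torch.backends.cudnn.version())"
```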
- - - -
## Setting Up the Environment
This guide offers two ways to set up the ViTPose environment: install it on top of Docker, or install it directly in an Anaconda environment.
If you choose not to go through Docker, skip past `Choosing a Docker Image` and create a new Anaconda environment dedicated to ViTPose, as sketched below.
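A minimal sketch of the non-Docker route; the environment name and Python version here are assumptions (3.8 chosen to match the NGC 21.06 image), not official requirements:
```bash
# create and activate a dedicated environment
# (name "vitpose" and python=3.8 are assumptions, adjust to taste)
conda create -n vitpose python=3.8 -y
conda activate vitpose
```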
### Choosing a Docker Image
The official ViTPose README recommends several setups; the two images below were both tested and run successfully:
- pytorch/pytorch <-- which tag? (not recorded) [link](<https://hub.docker.com/r/pytorch/pytorch>)
- NGC docker 21.06 <-- one of the official recommendations [link](<https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch>)
Docker Hub link: PyTorch (Google can't find it; clearly a TensorFlow-team conspiracy)
Once the container is created successfully, you can enter it and start installing ViTPose.
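For reference, creating such a container from the NGC 21.06 image might look like the following; the container name and mount path are assumptions:
```bash
# run the NGC PyTorch 21.06 image with GPU access
# (container name and volume mount are assumptions, adjust to taste)
docker run --gpus all -it --name vitpose \
    -v /path/to/workspace:/workspace \
    nvcr.io/nvidia/pytorch:21.06-py3
```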
### Installing ViTPose
For the first part, follow the official README. Note: the `MMCV_WITH_OPS` build step takes a long time, so be patient.
```bash
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
git checkout v1.3.9
MMCV_WITH_OPS=1 pip install -e .
cd ..
git clone https://github.com/ViTAE-Transformer/ViTPose.git
cd ViTPose
pip install -v -e .
```
```bash
pip install timm==0.4.9 einops
```
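At this point it is worth sanity-checking that mmcv and mmpose import cleanly and report the expected versions (mmcv should be 1.3.9 from the checkout above):
```bash
python -c "import mmcv, mmpose; print('mmcv:', mmcv.__version__, 'mmpose:', mmpose.__version__)"
```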
That completes the official installation; next, install YOLOv8.
```bash
pip install ultralytics  # this is YOLOv8
```
Now the key step: <font color='red'>uninstall the automatically installed opencv</font>
```bash
pip uninstall opencv-python
```
Because the container runs headless as root (no display), cv2 needs to be the headless build:
```bash
pip install opencv-python-headless
```
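A quick check that `cv2` now imports cleanly; if it fails, see the next subsection:
```bash
python -c "import cv2; print('cv2:', cv2.__version__)"
```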
### Is Your OpenCV Still Broken?
In the officially recommended `NGC docker 21.06` image, if `cv2` is unusable, you may see the following error:
```bash
File "/opt/conda/lib/python3.8/site-packages/cv2/gapi/__init__.py", line 290, in <module>
cv.gapi.wip.GStreamerPipeline = cv.gapi_wip_gst_GStreamerPipeline
AttributeError: partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
```
If so, uninstall it first:
```bash
pip uninstall opencv-python-headless
```
This pinned version has been tested and runs correctly:
```bash
pip install opencv-python-headless==4.5.5.64
```
[[Reference solution]](<https://stackoverflow.com/questions/72706073/attributeerror-partially-initialized-module-cv2-has-no-attribute-gapi-wip-gs>)
### Other Issues (Brief Notes)
- numpy version too old
```bash
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```
<!--```import xtcocotools._mask as _mask
File "xtcocotools/_mask.pyx", line 1, in init xtcocotools._mask
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject```-->
> Just upgrade it: `pip install --upgrade numpy`; a quick check follows below.
<!--```
Found existing installation: numpy 1.20.3
Uninstalling numpy-1.20.3:
Successfully uninstalled numpy-1.20.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
mmpose 0.24.0 requires opencv-python, which is not installed.
scipy 1.6.3 requires numpy<1.23.0,>=1.16.5, but you have numpy 1.24.4 which is incompatible.
Successfully installed numpy-1.24.4
``` -->
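A minimal verification that the upgrade resolved the binary mismatch (the `xtcocotools` import is where the error originally surfaced):
```bash
pip install --upgrade numpy
# should now import without the "numpy.ndarray size changed" error
python -c "import xtcocotools._mask"
```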
- A strange YOLOv8 conflict with torch 1.9.0:
```bash
AttributeError: module 'torch' has no attribute 'is_inference_mode_enabled'
```
> Comment the offending line out directly in the YOLO source; the sketch below helps find it.
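A sketch for locating the line to comment out; it assumes the attribute named in the traceback appears verbatim in the installed source:
```bash
# find where the failing attribute is referenced inside the installed ultralytics package
grep -rn "is_inference_mode_enabled" \
    "$(python -c 'import ultralytics, os; print(os.path.dirname(ultralytics.__file__))')"
```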
**<font color='red'>SSH sessions hit an /etc/profile issue; not yet written up</font>**
## Using ViTPose
### Reference File Layout

- `mmcv`: folder created automatically during installation
- `ViTPose`: folder created automatically during installation
- `weight_and_config`: folder for the weights and config files downloaded below
### Downloading ViTPose Weights
Download the paired `config` and `weight` files from the two links on the official page; as a rule, use the ones marked with the red check

- [[B^*^ config link]](<https://github.com/ViTAE-Transformer/ViTPose/blob/main/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/ViTPose_base_coco_256x192.py>) [[B^*^ weight link]](<https://onedrive.live.com/?authkey=%21AOUwHT3cnMm2qr4&id=E534267B85818129%21170&cid=E534267B85818129&parId=root&parQt=sharedby&o=OneUp>)
- L^*^ and H^*^ currently have some issues in testing
[[How to download a OneDrive file with wget in the terminal]](<https://hackmd.io/@CM3ye3grQDWDO8SbuGjnEw/HyqHlVz3p>) (a sketch of the trick follows)
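The linked note explains the trick; in short, the share URL is base64-encoded into a share token and the file content is fetched through the OneDrive shares API. A hedged sketch (the share-URL placeholder and output filename are assumptions to adapt):
```bash
# turn the OneDrive share link into a share token:
# "u!" prefix + URL-safe base64 of the link, with '=' padding stripped
SHARE_URL='https://onedrive.live.com/...your-share-link...'
TOKEN="u!$(echo -n "$SHARE_URL" | base64 -w0 | tr -d '=' | tr '+/' '-_')"
# fetch the file content through the shares API (output name is an assumption)
wget -O vitpose-b-multi-coco.pth "https://api.onedrive.com/v1.0/shares/$TOKEN/root/content"
```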
### Editing the Config Paths to Load the Weights
Taking the B^*^ `config` file `ViTPose_base_coco_256x192.py` as an example, edit the paths at the top of the file:
```
_base_ = [
    '../ViTPose/configs/_base_/default_runtime.py',
    '../ViTPose/configs/_base_/datasets/coco.py'
]
```
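To verify the edited paths resolve, load the config with mmcv (a minimal check, assuming the file layout shown earlier):
```bash
python -c "from mmcv import Config; \
cfg = Config.fromfile('./weight_and_config/ViTPose_base_coco_256x192.py'); \
print('loaded model type:', cfg.model.type)"
```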
### Example Script
Contents of ViTPose's `inference.py`:
```python=
import cv2
import numpy as np
import time
import random
# VITPOSE
from mmcv import Config
# from mmengine.config import Config
from mmpose.models import build_posenet
from mmcv.runner import load_checkpoint
from mmcv.cnn import fuse_conv_bn
from mmpose.apis import inference_top_down_pose_model
# YOLOV8
from ultralytics import YOLO


class PersonDetector(object):
    def __init__(self):
        self.model = YOLO('./weight_and_config/yolov8n.pt')  # load an official model
        # self.model.to('cuda:1')

    def detect(self, arr: np.ndarray):
        # Predict with the model
        # according to https://docs.ultralytics.com/modes/predict/#inference-sources
        # the input ndarray shape must be [H, W, C] and dtype must be uint8
        results = self.model(arr, classes=[0], verbose=False)  # class 0 = person
        # collect boxes and confidences from the results
        bboxs = None
        confs = None
        for r in results:
            bboxs = r.boxes.xyxy
            confs = r.boxes.conf
        res = []
        if bboxs is None:  # no detections
            return res
        for i in range(bboxs.size(0)):
            if confs[i] < 0.8:
                continue
            x0, y0, x1, y1 = int(bboxs[i][0]), int(bboxs[i][1]), int(bboxs[i][2]), int(bboxs[i][3])
            res.append({
                'bbox': [x0, y0, x1, y1, float(confs[i])]
            })
        return res


class PoseEstimator(object):
    def __init__(self, cfg_path, ckpt_path):
        cfg = Config.fromfile(cfg_path)
        self.estimator = build_posenet(cfg.model)
        load_checkpoint(self.estimator, ckpt_path, map_location='cpu')
        self.estimator = fuse_conv_bn(self.estimator)
        self.estimator.cfg = cfg
        # self.estimator.to('cuda:1')  # use GPU
        self.estimator.eval()

    def detect(self, img: np.ndarray, bboxs: list):
        pose_results, returned_outputs = inference_top_down_pose_model(self.estimator,
                                                                       img,
                                                                       person_results=bboxs,
                                                                       format='xyxy')
        return pose_results


def get_connections():
    # COCO keypoint pairs; index 17 is a virtual neck point
    # (midpoint of the shoulders, keypoints 5 and 6)
    return [(0, 1), (0, 2), (1, 3), (2, 4), (5, 6),
            (5, 7), (7, 9), (6, 8), (8, 10), (17, 11), (17, 12),
            (11, 13), (12, 14), (13, 15), (14, 16), (17, 0)]


def random_colors(n_persons: int) -> list:
    colors = []
    for i in range(n_persons):
        colors.append((random.randint(80, 255), random.randint(80, 255), random.randint(80, 255)))
    return colors


def plot_bboxs(img: np.ndarray, bboxs: list, colors: list, line_width: int):
    n_persons = len(bboxs)
    for i in range(n_persons):
        x0, y0, x1, y1 = bboxs[i]['bbox'][:4]
        cv2.rectangle(img, (x0, y0), (x1, y1), colors[i], line_width)


def plot_keypoints(img: np.ndarray, kps: list, threshold: float, colors: list, circle_size: int = 2):
    n_persons = len(kps)
    for i in range(n_persons):
        n_kps = len(kps[i]['keypoints'])
        for j in range(n_kps):
            x, y, score = kps[i]['keypoints'][j]
            if score < threshold:
                continue
            cv2.circle(img, (int(x), int(y)), circle_size, colors[i], -1)
            # msg = "{:.2f}".format(score)
            # cv2.putText(img, msg, (int(x), int(y)), 1, 1, colors[i], 1, 1)


def plot_skeletons(l_pair: list, img: np.ndarray, kps: list, threshold: float, colors: list, line_width: int = 2):
    n_persons = len(kps)
    for i in range(n_persons):
        for pair in l_pair:
            # index 17 is the virtual neck point, averaged from the shoulders
            if pair[0] == 17:
                start_pos = (kps[i]['keypoints'][5] + kps[i]['keypoints'][6]) / 2
            else:
                start_pos = kps[i]['keypoints'][pair[0]]
            if pair[1] == 17:
                end_pos = (kps[i]['keypoints'][5] + kps[i]['keypoints'][6]) / 2
            else:
                end_pos = kps[i]['keypoints'][pair[1]]
            x0, y0, s0 = int(start_pos[0]), int(start_pos[1]), start_pos[2]
            x1, y1, s1 = int(end_pos[0]), int(end_pos[1]), end_pos[2]
            if s0 < threshold or s1 < threshold:
                continue
            cv2.line(img, (x0, y0), (x1, y1), colors[i], line_width)


if __name__ == "__main__":
    person_model = PersonDetector()
    pose_model = PoseEstimator(cfg_path="./weight_and_config/ViTPose_base_coco_256x192.py",
                               ckpt_path="./weight_and_config/vitpose-b-multi-coco.pth")
    img = cv2.imread("./demo.jpg")
    # img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    # ========== warm-up ==========
    for _ in range(4):
        bboxs = person_model.detect(img)
        kps = pose_model.detect(img, bboxs)
    # ========== timed run ==========
    start_time = time.time()
    bboxs = person_model.detect(img)
    kps = pose_model.detect(img, bboxs)
    cost = time.time() - start_time
    print("Cost: {:.6f} secs/frame\nFPS: {:.6f} frame/sec".format(cost, 1 / cost))
    # ===== visualize =====
    kp_threshold = 0.7
    l_pair = get_connections()
    colors = random_colors(len(bboxs))
    plot_bboxs(img, bboxs, colors, line_width=2)
    plot_keypoints(img, kps, kp_threshold, colors, circle_size=2)
    plot_skeletons(l_pair, img, kps, kp_threshold, colors, line_width=2)
    cv2.imwrite('demo_result.jpg', img)
```
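Assuming the file layout above and a `demo.jpg` in the working directory, run it with:
```bash
python inference.py
# writes demo_result.jpg next to the script
```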
## DEMO Result


<!-- {width=10%} -->
<!-- {width=10%} -->
### Legacy Workflow
```
"""
# 1.1 Download mmcv Project
# git clone https://github.com/open-mmlab/mmcv.git
# 1.2 Installation
# version information (pytorch==1.9.0)
# git checkout v1.3.9
# 'MMCV_WITH_OPS=1 pip install -e .' in the mmcv repository
# 2.1 Download VitPose Project
# git clone from https://github.com/ViTAE-Transformer/ViTPose.git
# 2.2 Installation
# version information (20231012 test):
# please use 'pip install -v -e .' in the ViTPose repository
# 3. Download config file and pth file
# *config url:
# https://github.com/ViTAE-Transformer/ViTPose/blob/main/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/ViTPose_base_coco_256x192.py
# *pth urls:
# https://onedrive.live.com/?authkey=%21AOUwHT3cnMm2qr4&id=E534267B85818129%21170&cid=E534267B85818129&parId=root&parQt=sharedby&o=OneUp
# 3.1 change the _base_ in the ViTPose_base_coco_256x192.py
# e.g. '../../../../_base_/default_runtime.py' --> './ViTPose/configs/_base_/default_runtime.py'
# 4. yolov8
# pip install ultralytics
"""
```