# 1. Resources
* RK3588 (6 TOPS)
* rknn-toolkit2 (Python)
|Model|Input Size|Type|Inference Time|
|---|---|---|---|
|YOLOv5|640x640|int8|34ms|
|YOLOv8|640x640|int8|32.8ms|
|YOLOv11|640x640|int8|40ms|
* rknpu2 (C++)
|Model|Input Size|Type|Inference Time|
|---|---|---|---|
|YOLOv5|640x640|int8|25ms|
---
# 2. [Model Conversion and Inference with the Rockchip RK3588 NPU](https://blog.csdn.net/old_power/article/details/145590080) - Recommended
* (1) Download the code
```
git clone https://github.com/airockchip/rknn-toolkit2.git
git clone https://github.com/airockchip/rknn_model_zoo.git
git clone https://github.com/rockchip-linux/rknpu2.git
sudo cp rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so /usr/lib/librknnrt.so
```
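* (Optional) A quick sanity check that the runtime library was copied correctly: if the path and architecture match, Python's `ctypes` can load it (illustrative sketch, assumes you run it on the aarch64 board itself):
```python
# Illustrative check: loading /usr/lib/librknnrt.so with ctypes
# raises OSError if the library is missing or built for the wrong arch.
import ctypes

ctypes.CDLL("/usr/lib/librknnrt.so")
print("librknnrt.so loaded OK")
```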
* (2) Create a virtual environment
```
sudo apt-get update
sudo apt install python3.8-venv
cd <rk_path>
python3 -m venv rk
source rk/bin/activate
pip install rknn/rknn-toolkit2/rknn-toolkit2/packages/arm64/rknn_toolkit2-2.3.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
python3 -c 'from rknn.api import RKNN; RKNN(); print("RKNN module loaded successfully!")'
```
* (3) Download the model
```
cd rknn/rknn_model_zoo/examples/yolo11/model
bash download_model.sh
```
* (4) Convert to RKNN (ONNX to RKNN)
```
cd rknn/rknn_model_zoo/examples/yolo11/python
python convert.py ../model/yolo11n.onnx rk3588
```
```
(rk) orangepi@orangepi5plus:~/Desktop/Rorkchip/rknn/rknn_model_zoo/examples/yolo11/python$ python convert.py ../model/yolo11n.onnx rk3588
I rknn-toolkit2 version: 2.3.0
--> Config model
done
--> Loading model
I Loading : 100%|██████████████████████████████████████████████| 174/174 [00:00<00:00, 36276.41it/s]
done
--> Building model
I OpFusing 0: 100%|██████████████████████████████████████████████| 100/100 [00:00<00:00, 451.94it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 186.89it/s]
I OpFusing 0 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 62.39it/s]
I OpFusing 1 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 60.04it/s]
I OpFusing 2 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 58.17it/s]
I OpFusing 0 : 100%|██████████████████████████████████████████████| 100/100 [00:02<00:00, 46.59it/s]
I OpFusing 1 : 100%|██████████████████████████████████████████████| 100/100 [00:02<00:00, 45.27it/s]
I OpFusing 2 : 100%|██████████████████████████████████████████████| 100/100 [00:03<00:00, 30.04it/s]
W build: found outlier value, this may affect quantization accuracy
const name abs_mean abs_std outlier value
model.23.cv3.1.1.1.conv.weight 0.55 0.60 -12.173
model.23.cv3.0.0.0.conv.weight 0.25 0.35 -15.593
I GraphPreparing : 100%|████████████████████████████████████████| 223/223 [00:00<00:00, 3522.77it/s]
I Quantizating : 100%|████████████████████████████████████████████| 223/223 [00:14<00:00, 15.32it/s]
W build: The default input dtype of 'images' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '462' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'onnx::ReduceSum_476' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '480' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '487' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'onnx::ReduceSum_501' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '505' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '512' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'onnx::ReduceSum_526' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '530' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
I rknn building ...
I rknn building done.
done
--> Export rknn model
done
```
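* (PS) Under the hood, convert.py drives the rknn-toolkit2 API. A minimal sketch of the same config/load/build/export flow (the mean/std values and the calibration dataset path are illustrative assumptions; check convert.py for the actual settings):
```python
from rknn.api import RKNN

rknn = RKNN()
# Preprocessing is baked into the RKNN model at conversion time
# (mean/std values here are placeholders, not the script's exact values)
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
            target_platform='rk3588')
rknn.load_onnx(model='../model/yolo11n.onnx')
# do_quantization=True produces the int8 model; the dataset file lists
# calibration images (path is an assumption)
rknn.build(do_quantization=True, dataset='../model/dataset.txt')
rknn.export_rknn('../model/yolo11n.rknn')
rknn.release()
```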
* (5) Run the app
```
cd rknn/rknn_model_zoo/examples/yolo11/python
python yolo11.py --model_path ../model/yolo11.rknn --img_show --target 'rk3588'
```

```
------------
WPI Test
Model Size (640, 640, 3)
Data Type uint8
RK3588 NPU Inference Time = 65.28830528259277 ms
------------
```
* How inference is invoked (a minimal executor sketch follows below)
* Main
```python
model, platform = setup_model(args)
outputs = model.run([input_data])
```
* RK3588 NPU Func
```python
from py_utils.rknn_executor import RKNN_model_container
model = RKNN_model_container(args.model_path, args.target, args.device_id)
```
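* RKNN_model_container wraps the runtime calls. A simplified sketch of what such an executor does (modeled on the model zoo's py_utils/rknn_executor.py; the real class may differ in detail):
```python
from rknn.api import RKNN

class SimpleRKNNExecutor:
    """Illustrative executor: load an .rknn model and run inference on the NPU."""

    def __init__(self, model_path, target='rk3588', device_id=None):
        self.rknn = RKNN()
        self.rknn.load_rknn(model_path)
        self.rknn.init_runtime(target=target, device_id=device_id)

    def run(self, inputs):
        # inputs: a list of numpy arrays matching the model's input layout
        return self.rknn.inference(inputs=inputs)

    def release(self):
        self.rknn.release()
```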
---
# 3. [Revisiting the RK3588 NPU: YOLOv5 Inference with RKNN Models](https://www.coder4.com/archives/8229)
* **C/C++ approach**
* Install required packages
```
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install gcc cmake git build-essential
```
* Install rknpu2
```
git clone https://github.com/rockchip-linux/rknpu2
cd rknpu2
```
* Build the demo
```
cd examples/rknn_yolov5_demo
./build-linux_RK3588.sh
```
* Run the demo (image) - measured run time: 16.665700 ms
```
cd install/rknn_yolov5_demo_Linux
./rknn_yolov5_demo ./model/RK3588/yolov5s-640-640.rknn ./model/bus.jpg
```
* Run the demo (camera)
```
cd install/rknn_yolov5_demo_Linux
./rknn_yolov5_demo ./model/RK3588/yolov5s-640-640.rknn 0 640 480
```

* **Python approach**
* Install required packages
```
sudo apt-get install virtualenv git
sudo apt-get install python3 python3-dev python3-pip
sudo apt-get install libxslt1-dev zlib1g zlib1g-dev libglib2.0-0 libsm6 gcc
```
* Create a virtual environment
```
virtualenv -p /usr/bin/python3 venv
source venv/bin/activate
```
* Download the RKNN toolkit and install its packages
```
git clone https://github.com/rockchip-linux/rknn-toolkit2.git
# INSTALL
cd rknn-toolkit2
pip3 install -r packages/requirements_cp39-1.6.0.txt
pip3 install packages/rknn_toolkit2-1.6.0+81f21f4d-cp39-cp39-linux_x86_64.whl
```
* Run the demo
```
cd examples/tflite/mobilenet_v1
python test.py
```
---
# 4. Custom Code
* Benchmark.py
```
python benchmark.py \
    --model rknn_model_zoo/examples/yolo11/model/yolov8.rknn \
    --image rknn_model_zoo/examples/yolov5/model/bus.jpg
```
```python
import time
import argparse

import cv2
import numpy as np
from rknn.api import RKNN

# Parse command-line arguments
parser = argparse.ArgumentParser(description="Benchmark RKNN model on RK3588")
parser.add_argument("--model", type=str, required=True, help="Path to the RKNN model file")
parser.add_argument("--image", type=str, required=True, help="Path to the input image")
parser.add_argument("--target", type=str, default="rk3588", help="Target platform (default: rk3588)")
parser.add_argument("--iterations", type=int, default=100, help="Number of inference iterations (default: 100)")
args = parser.parse_args()

MODEL_PATH = args.model
IMAGE_PATH = args.image
TARGET_PLATFORM = args.target
N = args.iterations

# Initialize RKNN and load the pre-converted model
rknn = RKNN()
print(f"--> Loading RKNN model: {MODEL_PATH}")
rknn.load_rknn(MODEL_PATH)

# Initialize the runtime on the NPU
print(f"--> Initializing RKNN runtime on {TARGET_PLATFORM}")
rknn.init_runtime(target=TARGET_PLATFORM)

# Load the input image
image = cv2.imread(IMAGE_PATH)
if image is None:
    print(f"Error: Cannot load image {IMAGE_PATH}")
    exit(1)
image = cv2.resize(image, (640, 640))  # adjust to your model's input size
image = image.astype(np.uint8)

# Warm up with 10 inferences so the benchmark is more accurate
print("--> Warming up the model")
for _ in range(10):
    rknn.inference(inputs=[image])

# Benchmark
print(f"--> Running benchmark: {N} iterations")
start_time = time.time()
for _ in range(N):
    outputs = rknn.inference(inputs=[image])
end_time = time.time()

# Compute average latency and FPS
total_time = end_time - start_time
avg_time = total_time / N
fps = 1.0 / avg_time
print(f"Average Inference Time: {avg_time*1000:.6f} ms")
print(f"FPS: {fps:.2f}")

# Release RKNN resources
rknn.release()
```
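* If per-iteration jitter matters, the timing loop above can record each iteration and report percentiles instead of only the average. A small illustrative extension (assumes `rknn`, `image`, and `N` are already set up as in benchmark.py):
```python
import time
import numpy as np

# Time every iteration individually (illustrative; reuses rknn/image/N
# from the benchmark script above)
latencies = []
for _ in range(N):
    t0 = time.time()
    rknn.inference(inputs=[image])
    latencies.append((time.time() - t0) * 1000.0)  # milliseconds

lat = np.array(latencies)
print(f"p50: {np.percentile(lat, 50):.3f} ms, "
      f"p99: {np.percentile(lat, 99):.3f} ms, "
      f"max: {lat.max():.3f} ms")
```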
* YOLOv5s (video/camera)
```
python yolov5_camera.py --model_path ../model/yolov5.rknn --img_show --target 'rk3588'
```
```python
import os
import cv2
import sys
import argparse
import time

# Switch
SHOW_LOG = False

# Path: make the rknn_model_zoo root importable so py_utils can be found
realpath = os.path.abspath(__file__)
_sep = os.path.sep
realpath = realpath.split(_sep)
sys.path.append(os.path.join(realpath[0]+_sep, *realpath[1:realpath.index('rknn_model_zoo')+1]))

# Define
from py_utils.coco_utils import COCO_test_helper
import numpy as np

OBJ_THRESH = 0.25
NMS_THRESH = 0.45
IMG_SIZE = (640, 640)  # (width, height), such as (1280, 736)

CLASSES = ("person", "bicycle", "car","motorbike ","aeroplane ","bus ","train","truck ","boat","traffic light",
           "fire hydrant","stop sign ","parking meter","bench","bird","cat","dog ","horse ","sheep","cow","elephant",
           "bear","zebra ","giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee","skis","snowboard","sports ball","kite",
           "baseball bat","baseball glove","skateboard","surfboard","tennis racket","bottle","wine glass","cup","fork","knife ",
           "spoon","bowl","banana","apple","sandwich","orange","broccoli","carrot","hot dog","pizza ","donut","cake","chair","sofa",
           "pottedplant","bed","diningtable","toilet ","tvmonitor","laptop ","mouse ","remote ","keyboard ","cell phone","microwave ",
           "oven ","toaster","sink","refrigerator ","book","clock","vase","scissors ","teddy bear ","hair drier", "toothbrush ")

coco_id_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34,
                35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
                64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]


def filter_boxes(boxes, box_confidences, box_class_probs):
    """Filter boxes with object threshold."""
    box_confidences = box_confidences.reshape(-1)
    class_max_score = np.max(box_class_probs, axis=-1)
    classes = np.argmax(box_class_probs, axis=-1)
    _class_pos = np.where(class_max_score * box_confidences >= OBJ_THRESH)
    scores = (class_max_score * box_confidences)[_class_pos]
    boxes = boxes[_class_pos]
    classes = classes[_class_pos]
    return boxes, classes, scores


def nms_boxes(boxes, scores):
    """Suppress non-maximal boxes.
    # Returns
        keep: ndarray, index of effective boxes.
    """
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    areas = w * h
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])
        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1
        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= NMS_THRESH)[0]
        order = order[inds + 1]
    keep = np.array(keep)
    return keep


def box_process(position, anchors):
    grid_h, grid_w = position.shape[2:4]
    col, row = np.meshgrid(np.arange(0, grid_w), np.arange(0, grid_h))
    col = col.reshape(1, 1, grid_h, grid_w)
    row = row.reshape(1, 1, grid_h, grid_w)
    grid = np.concatenate((col, row), axis=1)
    stride = np.array([IMG_SIZE[1]//grid_h, IMG_SIZE[0]//grid_w]).reshape(1, 2, 1, 1)

    col = col.repeat(len(anchors), axis=0)
    row = row.repeat(len(anchors), axis=0)
    anchors = np.array(anchors)
    anchors = anchors.reshape(*anchors.shape, 1, 1)

    box_xy = position[:, :2, :, :]*2 - 0.5
    box_wh = pow(position[:, 2:4, :, :]*2, 2) * anchors
    box_xy += grid
    box_xy *= stride
    box = np.concatenate((box_xy, box_wh), axis=1)

    # Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
    xyxy = np.copy(box)
    xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :] / 2  # top left x
    xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :] / 2  # top left y
    xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :] / 2  # bottom right x
    xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :] / 2  # bottom right y
    return xyxy


def post_process(input_data, anchors):
    boxes, scores, classes_conf = [], [], []
    # 1*255*h*w -> 3*85*h*w
    input_data = [_in.reshape([len(anchors[0]), -1]+list(_in.shape[-2:])) for _in in input_data]
    for i in range(len(input_data)):
        boxes.append(box_process(input_data[i][:, :4, :, :], anchors[i]))
        scores.append(input_data[i][:, 4:5, :, :])
        classes_conf.append(input_data[i][:, 5:, :, :])

    def sp_flatten(_in):
        ch = _in.shape[1]
        _in = _in.transpose(0, 2, 3, 1)
        return _in.reshape(-1, ch)

    boxes = [sp_flatten(_v) for _v in boxes]
    classes_conf = [sp_flatten(_v) for _v in classes_conf]
    scores = [sp_flatten(_v) for _v in scores]

    boxes = np.concatenate(boxes)
    classes_conf = np.concatenate(classes_conf)
    scores = np.concatenate(scores)

    # filter according to threshold
    boxes, classes, scores = filter_boxes(boxes, scores, classes_conf)

    # nms
    nboxes, nclasses, nscores = [], [], []
    for c in set(classes):
        inds = np.where(classes == c)
        b = boxes[inds]
        c = classes[inds]
        s = scores[inds]
        keep = nms_boxes(b, s)
        if len(keep) != 0:
            nboxes.append(b[keep])
            nclasses.append(c[keep])
            nscores.append(s[keep])

    if not nclasses and not nscores:
        return None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)
    return boxes, classes, scores


def draw(image, boxes, scores, classes):
    for box, score, cl in zip(boxes, scores, classes):
        top, left, right, bottom = [int(_b) for _b in box]
        SHOW_LOG and print("%s @ (%d %d %d %d) %.3f" % (CLASSES[cl], top, left, right, bottom, score))
        cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
        cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                    (top, left - 6), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)


def setup_model(args):
    model_path = args.model_path
    if model_path.endswith('.pt') or model_path.endswith('.torchscript'):
        platform = 'pytorch'
        from py_utils.pytorch_executor import Torch_model_container
        model = Torch_model_container(args.model_path)
    elif model_path.endswith('.rknn'):
        platform = 'rknn'
        from py_utils.rknn_executor import RKNN_model_container
        model = RKNN_model_container(args.model_path, args.target, args.device_id)
    elif model_path.endswith('onnx'):
        platform = 'onnx'
        from py_utils.onnx_executor import ONNX_model_container
        model = ONNX_model_container(args.model_path)
    else:
        assert False, "{} is not rknn/pytorch/onnx model".format(model_path)
    print('Model-{} is {} model, starting val'.format(model_path, platform))
    return model, platform


def img_check(path):
    img_type = ['.jpg', '.jpeg', '.png', '.bmp']
    for _type in img_type:
        if path.endswith(_type) or path.endswith(_type.upper()):
            return True
    return False


if __name__ == '__main__':
    # Extra
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('--model_path', type=str, required=True, help='model path, could be .pt or .rknn file')
    parser.add_argument('--target', type=str, default='rk3566', help='target RKNPU platform')
    parser.add_argument('--device_id', type=str, default=None, help='device id')
    parser.add_argument('--img_show', action='store_true', default=False, help='draw the result and show')
    parser.add_argument('--img_save', action='store_true', default=False, help='save the result')
    parser.add_argument('--anno_json', type=str, default='../../../datasets/COCO/annotations/instances_val2017.json', help='coco annotation path')
    parser.add_argument('--img_folder', type=str, default='../model', help='img folder path')
    parser.add_argument('--coco_map_test', action='store_true', help='enable coco map test')
    parser.add_argument('--anchors', type=str, default='../model/anchors_yolov5.txt', help='target to anchor file, only yolov5, yolov7 need this param')
    args = parser.parse_args()

    # load anchors
    with open(args.anchors, 'r') as f:
        values = [float(_v) for _v in f.readlines()]
        anchors = np.array(values).reshape(3, -1, 2).tolist()
    SHOW_LOG and print("use anchors from '{}', which is {}".format(args.anchors, anchors))

    # init model
    model, platform = setup_model(args)

    file_list = sorted(os.listdir(args.img_folder))
    img_list = []
    for path in file_list:
        if img_check(path):
            img_list.append(path)
    co_helper = COCO_test_helper(enable_letter_box=True)

    # Setting Save Image
    img_name = 'output.jpg'

    # Setting Windows
    cv2.namedWindow("RK3588 - ai result", cv2.WND_PROP_FULLSCREEN)
    cv2.setWindowProperty("RK3588 - ai result", cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)

    # run
    # gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=720,framerate=30/1 ! jpegdec ! videoconvert ! autovideosink
    # gst_pipeline = "v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=720,framerate=30/1 ! jpegdec ! videoconvert ! appsink"
    cap = cv2.VideoCapture(0)
    while True:
        # Read a frame; stop if the camera read fails (guard added so a
        # failed read does not crash on frame.shape below)
        ret, frame = cap.read()
        if not ret:
            break
        img_src = frame
        SHOW_LOG and print(frame.shape)

        # Due to rga init with (0,0,0), we use pad_color (0,0,0) instead of (114, 114, 114)
        pad_color = (0, 0, 0)
        img = co_helper.letter_box(im=img_src.copy(), new_shape=(IMG_SIZE[1], IMG_SIZE[0]), pad_color=pad_color)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # preprocess if not rknn model
        if platform in ['pytorch', 'onnx']:
            input_data = img.transpose((2, 0, 1))
            input_data = input_data.reshape(1, *input_data.shape).astype(np.float32)
            input_data = input_data / 255.
        else:
            input_data = img

        interpreter_time_start = time.time()
        outputs = model.run([input_data])
        interpreter_time_end = time.time()
        # Report the measured NPU time when logging is enabled
        SHOW_LOG and print('NPU inference time: {:.2f} ms'.format(
            (interpreter_time_end - interpreter_time_start) * 1000))

        boxes, classes, scores = post_process(outputs, anchors)

        if args.img_show or args.img_save:
            SHOW_LOG and print('\n\nIMG: {}'.format(img_name))
            img_p = img_src.copy()
            if boxes is not None:
                draw(img_p, co_helper.get_real_box(boxes), scores, classes)

            if args.img_save:
                if not os.path.exists('./result'):
                    os.mkdir('./result')
                result_path = os.path.join('./result', img_name)
                cv2.imwrite(result_path, img_p)
                SHOW_LOG and print('Detection result save to {}'.format(result_path))

            if args.img_show:
                cv2.imshow("RK3588 - ai result", img_p)
                cv2.waitKeyEx(1)

        # record maps
        if args.coco_map_test is True:
            if boxes is not None:
                for i in range(boxes.shape[0]):
                    co_helper.add_single_record(image_id=int(img_name.split('.')[0]),
                                                category_id=coco_id_list[int(classes[i])],
                                                bbox=boxes[i],
                                                score=round(scores[i], 5).item())

    # calculate maps
    if args.coco_map_test is True:
        pred_json = args.model_path.split('.')[-2] + '_{}'.format(platform) + '.json'
        pred_json = pred_json.split('/')[-1]
        pred_json = os.path.join('./', pred_json)
        co_helper.export_to_json(pred_json)
        from py_utils.coco_utils import coco_eval_with_json
        coco_eval_with_json(args.anno_json, pred_json)

    # release
    model.release()
```
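* For reference, a standalone sketch of the letterbox preprocessing that `co_helper.letter_box` performs: scale the frame to fit the model input while preserving aspect ratio, then pad the borders (a conceptual re-implementation, not the model zoo's actual code):
```python
import cv2

def letter_box(im, new_shape=(640, 640), pad_color=(0, 0, 0)):
    """Resize `im` to fit inside new_shape (h, w) keeping aspect ratio,
    then pad the remaining border with pad_color."""
    h, w = im.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)         # scale ratio
    new_unpad = (int(round(w * r)), int(round(h * r)))  # (w, h) for cv2.resize
    dw = new_shape[1] - new_unpad[0]                    # total horizontal padding
    dh = new_shape[0] - new_unpad[1]                    # total vertical padding
    im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = dh // 2, dh - dh // 2
    left, right = dw // 2, dw - dw // 2
    return cv2.copyMakeBorder(im, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=pad_color)
```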
---
Q: RKNN init failed. error code: RKNN_ERR_FAIL
A: The board's librknnrt.so does not match the model version (note "Invalid RKNN model version" in the log below); install the matching runtime library:
$ sudo cp rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so /usr/lib/librknnrt.so
```
I target set by user is: rk3588
E RKNN: [11:28:24.309] 6, 1
E RKNN: [11:28:24.309] Invalid RKNN model version 6
E RKNN: [11:28:24.309] rknn_init, load model failed!
E init_runtime: Traceback (most recent call last):
File "rknn/api/rknn_log.py", line 344, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
File "rknn/api/rknn_base.py", line 2483, in rknn.api.rknn_base.RKNNBase.init_runtime
File "rknn/api/rknn_runtime.py", line 427, in rknn.api.rknn_runtime.RKNNRuntime.build_graph
Exception: RKNN init failed. error code: RKNN_ERR_FAIL
```
---
# References
* [Higher latency on RK3588 than YOLOv6 and DAMO-YOLO](https://github.com/THU-MIG/yolov10/issues/115)
* [Deploying YOLOv8 on RK3588 and Running the Example (Notes 2)](https://blog.csdn.net/m0_60657960/article/details/144234539)