# 1. Resources
* RK3588 (6 TOPS)
* rknn-toolkit2 (Python)
|Model|Input Size|Type|Inference Time|
|---|---|---|---|
|YOLOv5|640x640|int8|34ms|
|YOLOv8|640x640|int8|32.8ms|
|YOLOv11|640x640|int8|40ms|
* rknpu2 (C++)
|Model|Input Size|Type|Inference Time|
|---|---|---|---|
|YOLOv5|640x640|int8|25ms|
---
# 2. [Model Conversion and Inference with the Rockchip RK3588 NPU](https://blog.csdn.net/old_power/article/details/145590080) - Recommended
* (1) Download the code
```
git clone https://github.com/airockchip/rknn-toolkit2.git
git clone https://github.com/airockchip/rknn_model_zoo.git
git clone https://github.com/rockchip-linux/rknpu2.git
sudo cp rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so /usr/lib/librknnrt.so
```
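* (Optional) A quick sanity check that the runtime library was copied correctly: if the path and architecture match, Python's `ctypes` can load it (illustrative sketch, assumes you run it on the aarch64 board itself):
```python
# Illustrative check: loading /usr/lib/librknnrt.so with ctypes
# raises OSError if the library is missing or built for the wrong arch.
import ctypes

ctypes.CDLL("/usr/lib/librknnrt.so")
print("librknnrt.so loaded OK")
```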
* (2) Create a virtual environment
```
sudo apt-get update
sudo apt install python3.8-venv
cd <rk_path>
python3 -m venv rk
source rk/bin/activate
pip install rknn/rknn-toolkit2/rknn-toolkit2/packages/arm64/rknn_toolkit2-2.3.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
python3 -c 'from rknn.api import RKNN; RKNN(); print("RKNN module loaded successfully!")'
```
* (3) Download the model
```
cd rknn/rknn_model_zoo/examples/yolo11/model
bash download_model.sh
```
* (4) Convert to RKNN (ONNX to RKNN)
```
cd rknn/rknn_model_zoo/examples/yolo11/python
python convert.py ../model/yolo11n.onnx rk3588
```
```
(rk) orangepi@orangepi5plus:~/Desktop/Rorkchip/rknn/rknn_model_zoo/examples/yolo11/python$ python convert.py ../model/yolo11n.onnx rk3588
I rknn-toolkit2 version: 2.3.0
--> Config model
done
--> Loading model
I Loading : 100%|██████████████████████████████████████████████| 174/174 [00:00<00:00, 36276.41it/s]
done
--> Building model
I OpFusing 0: 100%|██████████████████████████████████████████████| 100/100 [00:00<00:00, 451.94it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 186.89it/s]
I OpFusing 0 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 62.39it/s]
I OpFusing 1 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 60.04it/s]
I OpFusing 2 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 58.17it/s]
I OpFusing 0 : 100%|██████████████████████████████████████████████| 100/100 [00:02<00:00, 46.59it/s]
I OpFusing 1 : 100%|██████████████████████████████████████████████| 100/100 [00:02<00:00, 45.27it/s]
I OpFusing 2 : 100%|██████████████████████████████████████████████| 100/100 [00:03<00:00, 30.04it/s]
W build: found outlier value, this may affect quantization accuracy
const name abs_mean abs_std outlier value
model.23.cv3.1.1.1.conv.weight 0.55 0.60 -12.173
model.23.cv3.0.0.0.conv.weight 0.25 0.35 -15.593
I GraphPreparing : 100%|████████████████████████████████████████| 223/223 [00:00<00:00, 3522.77it/s]
I Quantizating : 100%|████████████████████████████████████████████| 223/223 [00:14<00:00, 15.32it/s]
W build: The default input dtype of 'images' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '462' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'onnx::ReduceSum_476' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '480' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '487' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'onnx::ReduceSum_501' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '505' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '512' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'onnx::ReduceSum_526' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '530' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
I rknn building ...
I rknn building done.
done
--> Export rknn model
done
```
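* (PS) Under the hood, convert.py drives the rknn-toolkit2 API. A minimal sketch of the same config/load/build/export flow (the mean/std values and the calibration dataset path are illustrative assumptions; check convert.py for the actual settings):
```python
from rknn.api import RKNN

rknn = RKNN()
# Preprocessing is baked into the RKNN model at conversion time
# (mean/std values here are placeholders, not the script's exact values)
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
            target_platform='rk3588')
rknn.load_onnx(model='../model/yolo11n.onnx')
# do_quantization=True produces the int8 model; the dataset file lists
# calibration images (path is an assumption)
rknn.build(do_quantization=True, dataset='../model/dataset.txt')
rknn.export_rknn('../model/yolo11n.rknn')
rknn.release()
```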
* (5) Run the app
```
cd rknn/rknn_model_zoo/examples/yolo11/python
python yolo11.py --model_path ../model/yolo11.rknn --img_show --target 'rk3588'
```

```
------------
WPI Test
Model Size (640, 640, 3)
Data Type uint8
RK3588 NPU Inference Time = 65.28830528259277 ms
------------
```
* How inference is invoked (a minimal executor sketch follows below)
* Main
```python
model, platform = setup_model(args)
outputs = model.run([input_data])
```
* RK3588 NPU Func
```python
from py_utils.rknn_executor import RKNN_model_container
model = RKNN_model_container(args.model_path, args.target, args.device_id)
```
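* RKNN_model_container wraps the runtime calls. A simplified sketch of what such an executor does (modeled on the model zoo's py_utils/rknn_executor.py; the real class may differ in detail):
```python
from rknn.api import RKNN

class SimpleRKNNExecutor:
    """Illustrative executor: load an .rknn model and run inference on the NPU."""

    def __init__(self, model_path, target='rk3588', device_id=None):
        self.rknn = RKNN()
        self.rknn.load_rknn(model_path)
        self.rknn.init_runtime(target=target, device_id=device_id)

    def run(self, inputs):
        # inputs: a list of numpy arrays matching the model's input layout
        return self.rknn.inference(inputs=inputs)

    def release(self):
        self.rknn.release()
```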
---
# 3. [Revisiting the RK3588 NPU: YOLOv5 Inference with RKNN Models](https://www.coder4.com/archives/8229)
* **C/C++ approach**
* Install required packages
```
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install gcc cmake git build-essential
```
* Install rknpu2
```
git clone https://github.com/rockchip-linux/rknpu2
cd rknpu2
```
* Build the demo
```
cd examples/rknn_yolov5_demo
./build-linux_RK3588.sh
```
* Run the demo (image) - measured run time: 16.665700 ms
```
cd install/rknn_yolov5_demo_Linux
./rknn_yolov5_demo ./model/RK3588/yolov5s-640-640.rknn ./model/bus.jpg
```
* Run the demo (camera)
```
cd install/rknn_yolov5_demo_Linux
./rknn_yolov5_demo ./model/RK3588/yolov5s-640-640.rknn 0 640 480
```

* **Python approach**
* Install required packages
```
sudo apt-get install virtualenv git
sudo apt-get install python3 python3-dev python3-pip
sudo apt-get install libxslt1-dev zlib1g zlib1g-dev libglib2.0-0 libsm6 gcc
```
* Create a virtual environment
```
virtualenv -p /usr/bin/python3 venv
source venv/bin/activate
```
* Download the RKNN toolkit and install its packages
```
git clone https://github.com/rockchip-linux/rknn-toolkit2.git
# INSTALL
cd rknn-toolkit2
pip3 install -r packages/requirements_cp39-1.6.0.txt
pip3 install packages/rknn_toolkit2-1.6.0+81f21f4d-cp39-cp39-linux_x86_64.whl
```
* Run the demo
```
cd examples/tflite/mobilenet_v1
python test.py
```
---
# 4. Custom Code
* Benchmark.py
```
python benchmark.py \
    --model rknn_model_zoo/examples/yolo11/model/yolov8.rknn \
    --image rknn_model_zoo/examples/yolov5/model/bus.jpg
```
```python
import time
import argparse

import cv2
import numpy as np
from rknn.api import RKNN

# Parse command-line arguments
parser = argparse.ArgumentParser(description="Benchmark RKNN model on RK3588")
parser.add_argument("--model", type=str, required=True, help="Path to the RKNN model file")
parser.add_argument("--image", type=str, required=True, help="Path to the input image")
parser.add_argument("--target", type=str, default="rk3588", help="Target platform (default: rk3588)")
parser.add_argument("--iterations", type=int, default=100, help="Number of inference iterations (default: 100)")
args = parser.parse_args()

MODEL_PATH = args.model
IMAGE_PATH = args.image
TARGET_PLATFORM = args.target
N = args.iterations

# Initialize RKNN and load the pre-converted model
rknn = RKNN()
print(f"--> Loading RKNN model: {MODEL_PATH}")
rknn.load_rknn(MODEL_PATH)

# Initialize the runtime on the NPU
print(f"--> Initializing RKNN runtime on {TARGET_PLATFORM}")
rknn.init_runtime(target=TARGET_PLATFORM)

# Load the input image
image = cv2.imread(IMAGE_PATH)
if image is None:
    print(f"Error: Cannot load image {IMAGE_PATH}")
    exit(1)
image = cv2.resize(image, (640, 640))  # adjust to your model's input size
image = image.astype(np.uint8)

# Warm up with 10 inferences so the benchmark is more accurate
print("--> Warming up the model")
for _ in range(10):
    rknn.inference(inputs=[image])

# Benchmark
print(f"--> Running benchmark: {N} iterations")
start_time = time.time()
for _ in range(N):
    outputs = rknn.inference(inputs=[image])
end_time = time.time()

# Compute average latency and FPS
total_time = end_time - start_time
avg_time = total_time / N
fps = 1.0 / avg_time
print(f"Average Inference Time: {avg_time*1000:.6f} ms")
print(f"FPS: {fps:.2f}")

# Release RKNN resources
rknn.release()
```
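* If per-iteration jitter matters, the timing loop above can record each iteration and report percentiles instead of only the average. A small illustrative extension (assumes `rknn`, `image`, and `N` are already set up as in benchmark.py):
```python
import time
import numpy as np

# Time every iteration individually (illustrative; reuses rknn/image/N
# from the benchmark script above)
latencies = []
for _ in range(N):
    t0 = time.time()
    rknn.inference(inputs=[image])
    latencies.append((time.time() - t0) * 1000.0)  # milliseconds

lat = np.array(latencies)
print(f"p50: {np.percentile(lat, 50):.3f} ms, "
      f"p99: {np.percentile(lat, 99):.3f} ms, "
      f"max: {lat.max():.3f} ms")
```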
* YOLOv5s (video/camera)
```
python yolov5_camera.py --model_path ../model/yolov5.rknn --img_show --target 'rk3588'
```
```python
import os
import cv2
import sys
import argparse
import time

# Switch
SHOW_LOG = False

# Path: make the rknn_model_zoo root importable so py_utils can be found
realpath = os.path.abspath(__file__)
_sep = os.path.sep
realpath = realpath.split(_sep)
sys.path.append(os.path.join(realpath[0]+_sep, *realpath[1:realpath.index('rknn_model_zoo')+1]))

# Define
from py_utils.coco_utils import COCO_test_helper
import numpy as np

OBJ_THRESH = 0.25
NMS_THRESH = 0.45
IMG_SIZE = (640, 640)  # (width, height), such as (1280, 736)

CLASSES = ("person", "bicycle", "car","motorbike ","aeroplane ","bus ","train","truck ","boat","traffic light",
           "fire hydrant","stop sign ","parking meter","bench","bird","cat","dog ","horse ","sheep","cow","elephant",
           "bear","zebra ","giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee","skis","snowboard","sports ball","kite",
           "baseball bat","baseball glove","skateboard","surfboard","tennis racket","bottle","wine glass","cup","fork","knife ",
           "spoon","bowl","banana","apple","sandwich","orange","broccoli","carrot","hot dog","pizza ","donut","cake","chair","sofa",
           "pottedplant","bed","diningtable","toilet ","tvmonitor","laptop ","mouse ","remote ","keyboard ","cell phone","microwave ",
           "oven ","toaster","sink","refrigerator ","book","clock","vase","scissors ","teddy bear ","hair drier", "toothbrush ")

coco_id_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34,
                35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
                64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]


def filter_boxes(boxes, box_confidences, box_class_probs):
    """Filter boxes with object threshold."""
    box_confidences = box_confidences.reshape(-1)
    class_max_score = np.max(box_class_probs, axis=-1)
    classes = np.argmax(box_class_probs, axis=-1)
    _class_pos = np.where(class_max_score * box_confidences >= OBJ_THRESH)
    scores = (class_max_score * box_confidences)[_class_pos]
    boxes = boxes[_class_pos]
    classes = classes[_class_pos]
    return boxes, classes, scores


def nms_boxes(boxes, scores):
    """Suppress non-maximal boxes.
    # Returns
        keep: ndarray, index of effective boxes.
    """
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    areas = w * h
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])
        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1
        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= NMS_THRESH)[0]
        order = order[inds + 1]
    keep = np.array(keep)
    return keep


def box_process(position, anchors):
    grid_h, grid_w = position.shape[2:4]
    col, row = np.meshgrid(np.arange(0, grid_w), np.arange(0, grid_h))
    col = col.reshape(1, 1, grid_h, grid_w)
    row = row.reshape(1, 1, grid_h, grid_w)
    grid = np.concatenate((col, row), axis=1)
    stride = np.array([IMG_SIZE[1]//grid_h, IMG_SIZE[0]//grid_w]).reshape(1, 2, 1, 1)

    col = col.repeat(len(anchors), axis=0)
    row = row.repeat(len(anchors), axis=0)
    anchors = np.array(anchors)
    anchors = anchors.reshape(*anchors.shape, 1, 1)

    box_xy = position[:, :2, :, :]*2 - 0.5
    box_wh = pow(position[:, 2:4, :, :]*2, 2) * anchors
    box_xy += grid
    box_xy *= stride
    box = np.concatenate((box_xy, box_wh), axis=1)

    # Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
    xyxy = np.copy(box)
    xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :] / 2  # top left x
    xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :] / 2  # top left y
    xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :] / 2  # bottom right x
    xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :] / 2  # bottom right y
    return xyxy


def post_process(input_data, anchors):
    boxes, scores, classes_conf = [], [], []
    # 1*255*h*w -> 3*85*h*w
    input_data = [_in.reshape([len(anchors[0]), -1]+list(_in.shape[-2:])) for _in in input_data]
    for i in range(len(input_data)):
        boxes.append(box_process(input_data[i][:, :4, :, :], anchors[i]))
        scores.append(input_data[i][:, 4:5, :, :])
        classes_conf.append(input_data[i][:, 5:, :, :])

    def sp_flatten(_in):
        ch = _in.shape[1]
        _in = _in.transpose(0, 2, 3, 1)
        return _in.reshape(-1, ch)

    boxes = [sp_flatten(_v) for _v in boxes]
    classes_conf = [sp_flatten(_v) for _v in classes_conf]
    scores = [sp_flatten(_v) for _v in scores]

    boxes = np.concatenate(boxes)
    classes_conf = np.concatenate(classes_conf)
    scores = np.concatenate(scores)

    # filter according to threshold
    boxes, classes, scores = filter_boxes(boxes, scores, classes_conf)

    # nms
    nboxes, nclasses, nscores = [], [], []
    for c in set(classes):
        inds = np.where(classes == c)
        b = boxes[inds]
        c = classes[inds]
        s = scores[inds]
        keep = nms_boxes(b, s)
        if len(keep) != 0:
            nboxes.append(b[keep])
            nclasses.append(c[keep])
            nscores.append(s[keep])

    if not nclasses and not nscores:
        return None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)
    return boxes, classes, scores


def draw(image, boxes, scores, classes):
    for box, score, cl in zip(boxes, scores, classes):
        top, left, right, bottom = [int(_b) for _b in box]
        SHOW_LOG and print("%s @ (%d %d %d %d) %.3f" % (CLASSES[cl], top, left, right, bottom, score))
        cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
        cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                    (top, left - 6), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)


def setup_model(args):
    model_path = args.model_path
    if model_path.endswith('.pt') or model_path.endswith('.torchscript'):
        platform = 'pytorch'
        from py_utils.pytorch_executor import Torch_model_container
        model = Torch_model_container(args.model_path)
    elif model_path.endswith('.rknn'):
        platform = 'rknn'
        from py_utils.rknn_executor import RKNN_model_container
        model = RKNN_model_container(args.model_path, args.target, args.device_id)
    elif model_path.endswith('onnx'):
        platform = 'onnx'
        from py_utils.onnx_executor import ONNX_model_container
        model = ONNX_model_container(args.model_path)
    else:
        assert False, "{} is not rknn/pytorch/onnx model".format(model_path)
    print('Model-{} is {} model, starting val'.format(model_path, platform))
    return model, platform


def img_check(path):
    img_type = ['.jpg', '.jpeg', '.png', '.bmp']
    for _type in img_type:
        if path.endswith(_type) or path.endswith(_type.upper()):
            return True
    return False


if __name__ == '__main__':
    # Extra
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('--model_path', type=str, required=True, help='model path, could be .pt or .rknn file')
    parser.add_argument('--target', type=str, default='rk3566', help='target RKNPU platform')
    parser.add_argument('--device_id', type=str, default=None, help='device id')
    parser.add_argument('--img_show', action='store_true', default=False, help='draw the result and show')
    parser.add_argument('--img_save', action='store_true', default=False, help='save the result')
    parser.add_argument('--anno_json', type=str, default='../../../datasets/COCO/annotations/instances_val2017.json', help='coco annotation path')
    parser.add_argument('--img_folder', type=str, default='../model', help='img folder path')
    parser.add_argument('--coco_map_test', action='store_true', help='enable coco map test')
    parser.add_argument('--anchors', type=str, default='../model/anchors_yolov5.txt', help='target to anchor file, only yolov5, yolov7 need this param')
    args = parser.parse_args()

    # load anchors
    with open(args.anchors, 'r') as f:
        values = [float(_v) for _v in f.readlines()]
        anchors = np.array(values).reshape(3, -1, 2).tolist()
    SHOW_LOG and print("use anchors from '{}', which is {}".format(args.anchors, anchors))

    # init model
    model, platform = setup_model(args)

    file_list = sorted(os.listdir(args.img_folder))
    img_list = []
    for path in file_list:
        if img_check(path):
            img_list.append(path)
    co_helper = COCO_test_helper(enable_letter_box=True)

    # Setting Save Image
    img_name = 'output.jpg'

    # Setting Windows
    cv2.namedWindow("RK3588 - ai result", cv2.WND_PROP_FULLSCREEN)
    cv2.setWindowProperty("RK3588 - ai result", cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)

    # run
    # gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=720,framerate=30/1 ! jpegdec ! videoconvert ! autovideosink
    # gst_pipeline = "v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=720,framerate=30/1 ! jpegdec ! videoconvert ! appsink"
    cap = cv2.VideoCapture(0)
    while True:
        # Read a frame; stop if the camera read fails (guard added so a
        # failed read does not crash on frame.shape below)
        ret, frame = cap.read()
        if not ret:
            break
        img_src = frame
        SHOW_LOG and print(frame.shape)

        # Due to rga init with (0,0,0), we use pad_color (0,0,0) instead of (114, 114, 114)
        pad_color = (0, 0, 0)
        img = co_helper.letter_box(im=img_src.copy(), new_shape=(IMG_SIZE[1], IMG_SIZE[0]), pad_color=pad_color)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # preprocess if not rknn model
        if platform in ['pytorch', 'onnx']:
            input_data = img.transpose((2, 0, 1))
            input_data = input_data.reshape(1, *input_data.shape).astype(np.float32)
            input_data = input_data / 255.
        else:
            input_data = img

        interpreter_time_start = time.time()
        outputs = model.run([input_data])
        interpreter_time_end = time.time()
        # Report the measured NPU time when logging is enabled
        SHOW_LOG and print('NPU inference time: {:.2f} ms'.format(
            (interpreter_time_end - interpreter_time_start) * 1000))

        boxes, classes, scores = post_process(outputs, anchors)

        if args.img_show or args.img_save:
            SHOW_LOG and print('\n\nIMG: {}'.format(img_name))
            img_p = img_src.copy()
            if boxes is not None:
                draw(img_p, co_helper.get_real_box(boxes), scores, classes)

            if args.img_save:
                if not os.path.exists('./result'):
                    os.mkdir('./result')
                result_path = os.path.join('./result', img_name)
                cv2.imwrite(result_path, img_p)
                SHOW_LOG and print('Detection result save to {}'.format(result_path))

            if args.img_show:
                cv2.imshow("RK3588 - ai result", img_p)
                cv2.waitKeyEx(1)

        # record maps
        if args.coco_map_test is True:
            if boxes is not None:
                for i in range(boxes.shape[0]):
                    co_helper.add_single_record(image_id=int(img_name.split('.')[0]),
                                                category_id=coco_id_list[int(classes[i])],
                                                bbox=boxes[i],
                                                score=round(scores[i], 5).item())

    # calculate maps
    if args.coco_map_test is True:
        pred_json = args.model_path.split('.')[-2] + '_{}'.format(platform) + '.json'
        pred_json = pred_json.split('/')[-1]
        pred_json = os.path.join('./', pred_json)
        co_helper.export_to_json(pred_json)
        from py_utils.coco_utils import coco_eval_with_json
        coco_eval_with_json(args.anno_json, pred_json)

    # release
    model.release()
```
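* For reference, a standalone sketch of the letterbox preprocessing that `co_helper.letter_box` performs: scale the frame to fit the model input while preserving aspect ratio, then pad the borders (a conceptual re-implementation, not the model zoo's actual code):
```python
import cv2

def letter_box(im, new_shape=(640, 640), pad_color=(0, 0, 0)):
    """Resize `im` to fit inside new_shape (h, w) keeping aspect ratio,
    then pad the remaining border with pad_color."""
    h, w = im.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)         # scale ratio
    new_unpad = (int(round(w * r)), int(round(h * r)))  # (w, h) for cv2.resize
    dw = new_shape[1] - new_unpad[0]                    # total horizontal padding
    dh = new_shape[0] - new_unpad[1]                    # total vertical padding
    im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = dh // 2, dh - dh // 2
    left, right = dw // 2, dw - dw // 2
    return cv2.copyMakeBorder(im, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=pad_color)
```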
---
Q: RKNN init failed. error code: RKNN_ERR_FAIL
A: The board's librknnrt.so does not match the model version (note "Invalid RKNN model version" in the log below); install the matching runtime library:
$ sudo cp rknpu2/runtime/RK3588/Linux/librknn_api/aarch64/librknnrt.so /usr/lib/librknnrt.so
```
I target set by user is: rk3588
E RKNN: [11:28:24.309] 6, 1
E RKNN: [11:28:24.309] Invalid RKNN model version 6
E RKNN: [11:28:24.309] rknn_init, load model failed!
E init_runtime: Traceback (most recent call last):
File "rknn/api/rknn_log.py", line 344, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
File "rknn/api/rknn_base.py", line 2483, in rknn.api.rknn_base.RKNNBase.init_runtime
File "rknn/api/rknn_runtime.py", line 427, in rknn.api.rknn_runtime.RKNNRuntime.build_graph
Exception: RKNN init failed. error code: RKNN_ERR_FAIL
```
---
# References
* [Higher latency on RK3588 than YOLOv6 and DAMO-YOLO](https://github.com/THU-MIG/yolov10/issues/115)
* [Deploying YOLOv8 on RK3588 and Running the Example (Notes 2)](https://blog.csdn.net/m0_60657960/article/details/144234539)