Docker for DG2

---
title: 'Docker for DG2'
disqus: hackmd
---

Docker for DG2
===

## Table of Contents

[TOC]

## BKC

- OV2022.2
- Ubuntu 20.04
- Kernel: 5.14.0-1047-oem
- libdrm2: 2.4.107
- libva2: 2.7.0-2
- Media driver for VAAPI: 20.1.1
- Mesa: 21.2.6
- OpenCL* runtime: 22.39.24347
- compute-runtime: https://github.com/intel/compute-runtime/releases

## Docker installation and setup

```
sudo -E apt install docker.io

# Docker proxy setting: write the proxy config to a systemd drop-in file
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://proxy02.hd.intel.com:911"
Environment="HTTPS_PROXY=http://proxy02.hd.intel.com:911"
EOF

# Reload systemd and restart Docker so the proxy settings take effect
sudo systemctl daemon-reload
sudo systemctl restart docker
```

> If you want to avoid typing sudo whenever you run the docker command, add your username to the docker group:

```
sudo usermod -aG docker ${USER}
```

> Log out, then log back in for the group change to take effect.

To check whether you can access and download images from Docker Hub, type:

```
docker run hello-world
```

Download the OV Docker image from Docker Hub
---
Ref: https://hub.docker.com/r/openvino/ubuntu20_dev

```
docker pull openvino/ubuntu20_dev
```

> If your host system is Ubuntu 20, follow the Configuration Guide for the Intel® Graphics Compute Runtime for OpenCL™ on Ubuntu* 20.04.
> https://github.com/openvinotoolkit/docker_ci/blob/master/configure_gpu_ubuntu20.md

Run docker with DG2 and download Intel compute-runtime
---
Make sure DG2 can perform inference outside Docker first, then run the container with DG2 exposed.

Ref: https://github.com/intel/compute-runtime/releases

```
docker run -it --user root --device /dev/dri --mount type=bind,source=/home/hc-adlp,destination=/home/hc-adlp --name ov2022.2 openvino/ubuntu20_dev
```

> Inside the OV2022.2 container, download and install the Intel compute-runtime:

```
export https_proxy="http://proxy02.hd.intel.com:911"
export http_proxy="http://proxy02.hd.intel.com:911"
apt update
apt install wget dpkg clinfo vainfo
mkdir neo
cd neo
wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.12149.1/intel-igc-core_1.0.12149.1_amd64.deb
wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.12149.1/intel-igc-opencl_1.0.12149.1_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/22.39.24347/intel-level-zero-gpu-dbgsym_1.3.24347_amd64.ddeb
wget https://github.com/intel/compute-runtime/releases/download/22.39.24347/intel-level-zero-gpu_1.3.24347_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/22.39.24347/intel-opencl-icd-dbgsym_22.39.24347_amd64.ddeb
wget https://github.com/intel/compute-runtime/releases/download/22.39.24347/intel-opencl-icd_22.39.24347_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/22.39.24347/libigdgmm12_22.2.0_amd64.deb
dpkg -i *.deb

# Check that DG2 can be detected
cd /opt/intel/openvino_2022.2.0.7713
source setupvars.sh
python3 samples/python/hello_query_device/hello_query_device.py
```

Run benchmark_app on DG2
---

```
#run benchmark_app
root@f26ccd090972:/opt/intel/openvino# benchmark_app -m /home/hc-adlp/Desktop/openvino/yolo-v4-tf/FP16-INT8/yolo-v4-tf.xml -d GPU.1 -hint none
[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading OpenVINO
[ WARNING ] No device GPU.1 performance hint is set.
[ INFO ] OpenVINO:
         API version............. 2022.2.0-7713-af16ea1d79a-releases/2022/2
[ INFO ] Device info
         GPU
         Intel GPU plugin........ version 2022.2
         Build................... 2022.2.0-7713-af16ea1d79a-releases/2022/2

[Step 3/11] Setting device configuration
[Step 4/11] Reading network files
[ INFO ] Read model took 63.82 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'image_input' precision u8, dimensions ([N,H,W,C]): 1 608 608 3
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_542:0' precision f32, dimensions ([...]): 1 38 38 255
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_543:0' precision f32, dimensions ([...]): 1 19 19 255
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_544:0' precision f32, dimensions ([...]): 1 76 76 255
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 6280.09 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: GPU.1
[ INFO ]   AVAILABLE_DEVICES  , ['0', '1']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 2, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 2)
[ INFO ]   OPTIMAL_BATCH_SIZE  , 1
[ INFO ]   MAX_BATCH_SIZE  , 1
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Graphics [0x56a0] (dGPU)
[ INFO ]   DEVICE_TYPE  , Type.DISCRETE
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'BIN', 'FP16', 'INT8', 'GPU_HW_MATMUL']
[ INFO ]   GPU_UARCH_VERSION  , 12.7.1
[ INFO ]   GPU_EXECUTION_UNITS_COUNT  , 512
[ INFO ]   PERF_COUNT  , False
[ INFO ]   MODEL_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_HOST_TASK_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_QUEUE_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_QUEUE_THROTTLE  , Priority.MEDIUM
[ INFO ]   GPU_ENABLE_LOOP_UNROLLING  , True
[ INFO ]   CACHE_DIR  ,
[ INFO ]   PERFORMANCE_HINT  , PerformanceMode.UNDEFINED
[ INFO ]   COMPILATION_NUM_THREADS  , 24
[ INFO ]   NUM_STREAMS  , 1
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[ INFO ]   DEVICE_ID  , 1
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 2 infer requests took 16.46 ms
[ WARNING ] No input files were given for input 'image_input'!. This input will be filled with random values!
[ INFO ] Fill input 'image_input' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 2 inference requests, inference only: True, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 39.40 ms
[Step 11/11] Dumping statistics report
Count:          5734 iterations
Duration:       60029.51 ms
Latency:
    Median:     20.80 ms
    AVG:        20.82 ms
    MIN:        20.40 ms
    MAX:        32.75 ms
Throughput: 95.52 FPS
```

Results for the same model with different benchmark_app options:

| Options | Count (iterations) | Duration (ms) | Median latency (ms) | AVG latency (ms) | MIN latency (ms) | MAX latency (ms) | Throughput (FPS) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `-d GPU.1 -hint none` | 5734 | 60029.51 | 20.80 | 20.82 | 20.40 | 32.75 | 95.52 |
| `-d GPU.1 -b 8 -nstream 4 -hint none` | 1320 | 60334.69 | 364.93 | 364.83 | 172.10 | 389.45 | 175.02 |
| `-d GPU.1 -b 12 -nstream 4 -hint none` | 848 | 60848.08 | 573.37 | 572.47 | 277.51 | 596.19 | 167.24 |
| `-d GPU.1 -b 16 -nstream 4 -hint none` | 32 | 84140.84 | 20823.73 | 19729.30 | 10623.74 | 21496.96 | 6.09 |
| `-d GPU.1 -b 8 -nstream 8 -hint none` | 1344 | 61195.22 | 727.54 | 724.85 | 164.20 | 747.46 | 175.70 |
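
When comparing several runs, it can help to pull the summary numbers out of saved benchmark_app logs instead of reading them by eye. The sketch below is one possible way to do that, assuming the log text has the same `Count` / `Duration` / `Median` / `Throughput` layout as the output above. The helper names `parse_summary` and `expected_fps` are hypothetical and not part of OpenVINO. It also checks an observation from the numbers in this note: the reported throughput appears to match iterations × batch size / duration.

```python
import re


def parse_summary(log_text):
    """Extract summary fields from benchmark_app output text (hypothetical helper)."""
    patterns = {
        "count": r"Count:\s*(\d+) iterations",
        "duration_ms": r"Duration:\s*([\d.]+) ms",
        "median_ms": r"Median:\s*([\d.]+) ms",
        "throughput_fps": r"Throughput:\s*([\d.]+) FPS",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        if match is None:
            raise ValueError(f"missing field: {key}")
        result[key] = float(match.group(1))
    return result


def expected_fps(count, batch, duration_ms):
    """Frames per second implied by iteration count, batch size and duration."""
    return count * batch / (duration_ms / 1000.0)


# Summary copied from the "-d GPU.1 -b 8 -nstream 4 -hint none" run in this note.
sample = """
Count:          1320 iterations
Duration:       60334.69 ms
Latency:
    Median:     364.93 ms
    AVG:        364.83 ms
    MIN:        172.10 ms
    MAX:        389.45 ms
Throughput: 175.02 FPS
"""

summary = parse_summary(sample)
fps = expected_fps(summary["count"], batch=8, duration_ms=summary["duration_ms"])
print(summary)
print(f"recomputed throughput: {fps:.2f} FPS")
```

The batch size is passed in by hand because it comes from the `-b` option rather than from the summary lines. Since larger batches raise throughput only at the cost of per-request latency, keeping the median latency next to the FPS figure makes that trade-off easier to see.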