---
# System prepended metadata

title: DG2 performance on Windows
tags: [DG2]

---

DG2 performance on Windows
===

## Table of Contents

[TOC]

## BKC (Best Known Configuration)

* ADL-S RVP
* FRD4 128EU B1 Arc A380
* Win10 10.0.19044.2006
* DG2
    * FRD1: `FRD_DG2_512_C1_ES_136_IFWI_22WW38_02_GS1879_PC9771_1059_SN_V5_14GT_TRC_DS_C8.bin`
    * FRD4: `FRD4_DG2_128_B0_ES_276_IFWI_22WW39_04_GS1899_PC9775C_OP1059_SN_V5_15.5GT_C8_TR_DS.bin`
* Driver: 31.0.101.3276

## DG2 AIC FRD1 512 EU
Default run with `-d GPU.1` (no performance hint given, so `benchmark_app` falls back to THROUGHPUT):
```
C:\Users\Win10>openvino\Scripts\activate
(openvino) C:\Users\Win10>benchmark_app -m C:\Users\Win10\Desktop\openvino\openvino_models\public\yolo-v4-tf\FP16-INT8\yolo-v4-tf.xml -d GPU.1
[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device GPU.1 performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.2.0-7690-940e927a22b-refs/pull/1296/head
[ INFO ] Device info
         GPU
         Intel GPU plugin........ version 2022.2
         Build................... 2022.2.0-7690-940e927a22b-refs/pull/1296/head

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for GPU.1 device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read model took 127.79 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'image_input' precision u8, dimensions ([N,H,W,C]): 1 608 608 3
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_542:0' precision f32, dimensions ([...]): 1 38 38 255
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_543:0' precision f32, dimensions ([...]): 1 19 19 255
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_544:0' precision f32, dimensions ([...]): 1 76 76 255
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 16760.36 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: GPU.1
[ INFO ]   AVAILABLE_DEVICES  , ['0', '1']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 2, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 2)
[ INFO ]   OPTIMAL_BATCH_SIZE  , 1
[ INFO ]   MAX_BATCH_SIZE  , 1
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Arc(TM) A770 Graphics (dGPU)
[ INFO ]   DEVICE_TYPE  , Type.DISCRETE
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'BIN', 'FP16', 'INT8', 'GPU_HW_MATMUL']
[ INFO ]   GPU_UARCH_VERSION  , 12.7.1
[ INFO ]   GPU_EXECUTION_UNITS_COUNT  , 512
[ INFO ]   PERF_COUNT  , False
[ INFO ]   MODEL_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_HOST_TASK_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_QUEUE_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_QUEUE_THROTTLE  , Priority.MEDIUM
[ INFO ]   GPU_ENABLE_LOOP_UNROLLING  , True
[ INFO ]   CACHE_DIR  ,
[ INFO ]   PERFORMANCE_HINT  , PerformanceMode.THROUGHPUT
[ INFO ]   COMPILATION_NUM_THREADS  , 24
[ INFO ]   NUM_STREAMS  , 1
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[ INFO ]   DEVICE_ID  , 1
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 64 infer requests took 454.93 ms
[ WARNING ] No input files were given for input 'image_input'!. This input will be filled with random values!
[ INFO ] Fill input 'image_input' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 64 inference requests using 1 streams for GPU.1, inference only: True, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 652.58 ms
[Step 11/11] Dumping statistics report
Count:          16640 iterations
Duration:       60293.32 ms
Latency:
    Median:     231.60 ms
    AVG:        231.50 ms
    MIN:        119.99 ms
    MAX:        237.63 ms
Throughput: 275.98 FPS

(openvino) C:\Users\Win10>
```
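The summary printed in Step 11/11 is easier to compare across runs once it is pulled into a structure. The following is a minimal sketch that extracts those fields with regular expressions. The helper name `parse_summary` and the field names are illustrative assumptions, and the patterns only match the log layout shown above (OpenVINO 2022.2); other versions may format the report differently.

```python
import re

# Patterns for the summary lines printed in "Step 11/11" of the log above.
# Field names are hypothetical, chosen for this sketch only.
_PATTERNS = {
    "count": r"Count:\s+(\d+)\s+iterations",
    "duration_ms": r"Duration:\s+([\d.]+)\s+ms",
    "median_ms": r"Median:\s+([\d.]+)\s+ms",
    "avg_ms": r"AVG:\s+([\d.]+)\s+ms",
    "min_ms": r"MIN:\s+([\d.]+)\s+ms",
    "max_ms": r"MAX:\s+([\d.]+)\s+ms",
    "throughput_fps": r"Throughput:\s+([\d.]+)\s+FPS",
}


def parse_summary(log_text):
    """Return the summary fields found in benchmark_app output.

    Fields that are not present in the text are simply omitted.
    """
    result = {}
    for key, pattern in _PATTERNS.items():
        match = re.search(pattern, log_text)
        if match:
            value = match.group(1)
            result[key] = int(value) if key == "count" else float(value)
    return result


# Summary lines copied from the FRD1 default run above.
SAMPLE = """\
Count:          16640 iterations
Duration:       60293.32 ms
Latency:
    Median:     231.60 ms
    AVG:        231.50 ms
    MIN:        119.99 ms
    MAX:        237.63 ms
Throughput: 275.98 FPS
"""

summary = parse_summary(SAMPLE)
print(summary)
```

Feeding the full console output of a run to `parse_summary` should work the same way, since the patterns ignore every line they do not recognize.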

Additional runs on the same model. Only the options after `-m <model>` are listed; the batch size is 1 unless `-b` is given.

| `benchmark_app` options | Count (iterations) | Duration (ms) | Median (ms) | AVG (ms) | MIN (ms) | MAX (ms) | Throughput (FPS) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `-d GPU.1 -hint none` | 5452 | 60027.22 | 21.93 | 21.93 | 11.59 | 23.97 | 90.83 |
| `-d GPU.1 -b 8 -nstreams 4 -hint none` | 1136 | 60574.46 | 421.04 | 425.30 | 122.51 | 461.08 | 150.03 |
| `-d GPU.1 -b 12 -nstreams 4 -hint none` | 736 | 61253.30 | 666.41 | 662.91 | 213.72 | 697.80 | 144.19 |
| `-d GPU.1 -b 16 -nstreams 4 -hint none` | 32 | 94451.72 | 23589.75 | 21390.65 | 11717.85 | 23649.15 | 5.42 |
| `-d GPU.1 -b 8 -nstreams 8 -hint none` | 1152 | 61502.75 | 855.28 | 848.88 | 231.36 | 906.93 | 149.85 |
| `-d GPU.1 -hint throughput` | 16384 | 60663.84 | 472.02 | 472.80 | 246.64 | 516.02 | 270.08 |
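The reported throughput in the FRD1 runs above appears to be iterations times batch size divided by wall-clock duration. The sketch below recomputes it from the Count and Duration values as a sanity check. The function name `throughput_fps` is an assumption for illustration, and the formula is inferred from these numbers rather than quoted from the tool's documentation.

```python
def throughput_fps(iterations, batch_size, duration_ms):
    """Frames per second implied by iteration count, batch size, and wall time."""
    return iterations * batch_size / (duration_ms / 1000.0)


# (label, iterations, batch size, duration in ms, reported FPS),
# copied from the FRD1 runs above. Batch size is 1 where -b was not given.
RUNS = [
    ("hint none", 5452, 1, 60027.22, 90.83),
    ("b=8, streams=4", 1136, 8, 60574.46, 150.03),
    ("b=12, streams=4", 736, 12, 61253.30, 144.19),
    ("b=16, streams=4", 32, 16, 94451.72, 5.42),
    ("b=8, streams=8", 1152, 8, 61502.75, 149.85),
    ("hint throughput", 16384, 1, 60663.84, 270.08),
]

for label, iterations, batch, duration_ms, reported in RUNS:
    computed = throughput_fps(iterations, batch, duration_ms)
    print(f"{label:<18} computed={computed:7.2f} FPS  reported={reported:7.2f} FPS")
```

Every recomputed value agrees with the reported one to within rounding, which also makes the `-b 16` collapse easy to see: only 32 iterations completed in about 94 seconds.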

## DG2 AIC FRD4 128 EU
```
(openvino) C:\Users\Win10>benchmark_app -m C:\Users\Win10\Desktop\openvino\openvino_models\public\yolo-v4-tf\FP16-INT8\yolo-v4-tf.xml -d GPU.1
[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device GPU.1 performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.2.0-7690-940e927a22b-refs/pull/1296/head
[ INFO ] Device info
         GPU
         Intel GPU plugin........ version 2022.2
         Build................... 2022.2.0-7690-940e927a22b-refs/pull/1296/head

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for GPU.1 device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read model took 161.76 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'image_input' precision u8, dimensions ([N,H,W,C]): 1 608 608 3
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_542:0' precision f32, dimensions ([...]): 1 38 38 255
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_543:0' precision f32, dimensions ([...]): 1 19 19 255
[ INFO ] Model output 'Func/StatefulPartitionedCall/output/_544:0' precision f32, dimensions ([...]): 1 76 76 255
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 17912.19 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: GPU.1
[ INFO ]   AVAILABLE_DEVICES  , ['0', '1']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 2, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 2)
[ INFO ]   OPTIMAL_BATCH_SIZE  , 1
[ INFO ]   MAX_BATCH_SIZE  , 1
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Arc(TM) A380  Graphics (dGPU)
[ INFO ]   DEVICE_TYPE  , Type.DISCRETE
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'BIN', 'FP16', 'INT8', 'GPU_HW_MATMUL']
[ INFO ]   GPU_UARCH_VERSION  , 12.7.1
[ INFO ]   GPU_EXECUTION_UNITS_COUNT  , 128
[ INFO ]   PERF_COUNT  , False
[ INFO ]   MODEL_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_HOST_TASK_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_QUEUE_PRIORITY  , Priority.MEDIUM
[ INFO ]   GPU_QUEUE_THROTTLE  , Priority.MEDIUM
[ INFO ]   GPU_ENABLE_LOOP_UNROLLING  , True
[ INFO ]   CACHE_DIR  ,
[ INFO ]   PERFORMANCE_HINT  , PerformanceMode.THROUGHPUT
[ INFO ]   COMPILATION_NUM_THREADS  , 24
[ INFO ]   NUM_STREAMS  , 1
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[ INFO ]   DEVICE_ID  , 1
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 16 infer requests took 143.93 ms
[ WARNING ] No input files were given for input 'image_input'!. This input will be filled with random values!
[ INFO ] Fill input 'image_input' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 16 inference requests using 1 streams for GPU.1, inference only: True, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 912.59 ms
[Step 11/11] Dumping statistics report
Count:          4624 iterations
Duration:       60273.46 ms
Latency:
    Median:     207.98 ms
    AVG:        208.24 ms
    MIN:        103.81 ms
    MAX:        1146.41 ms
Throughput: 76.72 FPS
```
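The default (THROUGHPUT) runs report latencies above 200 ms even though a single inference is much faster. One plausible reading, offered here as an interpretation and not something the log states, is Little's law: with N requests kept in flight, each request spends roughly N divided by throughput in the pipeline. The sketch below checks that estimate against the two default runs; `expected_latency_ms` is a hypothetical helper and it assumes every infer request stays queued for the whole run.

```python
def expected_latency_ms(requests_in_flight, throughput_fps, batch_size=1):
    """Little's law estimate: time in system = items in flight / completion rate."""
    return requests_in_flight * batch_size / throughput_fps * 1000.0


# (card, infer requests, reported FPS, reported median latency in ms),
# taken from the two default -d GPU.1 logs.
DEFAULT_RUNS = [
    ("FRD1 512 EU", 64, 275.98, 231.60),
    ("FRD4 128 EU", 16, 76.72, 207.98),
]

for card, requests, fps, median_ms in DEFAULT_RUNS:
    estimate = expected_latency_ms(requests, fps)
    print(f"{card}: estimated {estimate:.1f} ms, reported median {median_ms:.2f} ms")
```

Both estimates land within about 1 ms of the reported medians, which suggests the high latency comes from queue depth, not from slow individual inferences.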

Additional runs on the same model. Only the options after `-m <model>` are listed; the batch size is 1 unless `-b` is given.

| `benchmark_app` options | Count (iterations) | Duration (ms) | Median (ms) | AVG (ms) | MIN (ms) | MAX (ms) | Throughput (FPS) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `-d GPU.1 -hint throughput` | 4704 | 60499.28 | 411.43 | 410.37 | 202.38 | 412.55 | 77.75 |
| `-d GPU.1 -b 8 -nstreams 8 -hint none` | 608 | 62618.32 | 1647.54 | 1627.46 | 381.35 | 1651.07 | 77.68 |

![](https://i.imgur.com/NvNzIWB.png)
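To put the two cards side by side, the sketch below compares the default-run throughput of the 512 EU and 128 EU parts against their EU ratio. The `scaling` helper is an illustrative assumption, and EU count is only one factor: clocks, memory bandwidth, and firmware also differ between the cards, so the efficiency figure is a rough indicator at best.

```python
def scaling(fps_small, eus_small, fps_large, eus_large):
    """Return (speedup, scaling efficiency) of the larger part over the smaller one."""
    speedup = fps_large / fps_small
    eu_ratio = eus_large / eus_small
    return speedup, speedup / eu_ratio


# Default -d GPU.1 throughput from the two logs: FRD4 (128 EU) and FRD1 (512 EU).
speedup, efficiency = scaling(76.72, 128, 275.98, 512)
print(f"speedup: {speedup:.2f}x for 4x the EUs, efficiency: {efficiency:.1%}")
```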


###### tags: `DG2` `OPENVINO`