# YOLO GPU DeepStream Real-Time Streaming Detection
## Source:
In this tutorial, I use an `rtsp` source. Specifically, I used the smartphone app `LarixBroadcaster` to stream `rtmp` to our server `172.18.240.131` on port `1937` with topic `app/viewsmall`, i.e., `rtmp://172.18.240.131:1937/app/viewsmall`.
Then the `rtsp_simple` server deployed on 172.18.240.131 re-broadcasts the real-time video as RTSP, RTMP, and HLS.
We will use `rtsp://172.18.240.131:8555/app/viewsmall` as the source for DeepStream.
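
Before starting DeepStream, it may help to confirm the re-broadcast RTSP stream is actually up. A minimal check, assuming `ffprobe` (from ffmpeg) is installed on a machine that can reach the server:

```shell
# Probe the video stream DeepStream will consume; prints codec, resolution,
# and frame rate if the RTSP endpoint is reachable and publishing.
RTSP_URL="rtsp://172.18.240.131:8555/app/viewsmall"
ffprobe -v error -rtsp_transport tcp \
        -select_streams v:0 \
        -show_entries stream=codec_name,width,height,avg_frame_rate \
        -of default=noprint_wrappers=1 "$RTSP_URL"
```

If this hangs or errors, fix the publishing side (LarixBroadcaster / the re-broadcast server) before debugging DeepStream itself.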


## DeepStream YOLO:
* Use my pre-built YOLO DeepStream (DS ver 6.0) image:
```
docker run -it --network=host --gpus 2 -w /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo_Mao ngovanmao/nvidia-deepstream-6.0.1-yolo:v03
```
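Before digging into configs, it may be worth verifying the container can actually see the GPUs. A quick sanity check, assuming the NVIDIA Container Toolkit is installed on the host:

```shell
# List the GPUs visible inside the container; nvidia-smi is mounted in
# by the NVIDIA container runtime when --gpus is passed.
docker run --rm --gpus all ngovanmao/nvidia-deepstream-6.0.1-yolo:v03 nvidia-smi -L
```

If no GPUs are listed, check the host driver and the container toolkit installation first.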
Inside the docker container, you can modify the config file. For example, here is the config file with `source0` as RTSP and `sink0` as RTSP. You can see some options are disabled (i.e., `enable=0`); we only use `source0` and
`sink0`, with no display. If you want to modify the pipeline, feel free to change it.
```
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl
[tiled-display]
enable=0
rows=1
columns=1
width=4096
height=2160
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=2
uri=rtsp://172.18.240.131:8555/app/viewsmall
latency=20
num-sources=1
gpu-id=0
# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0
[source1]
enable=0
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=1
intra-decode-enable=1
camera-width=4096
camera-height=2160
camera-fps-n=30
camera-fps-d=1
#device=/dev/video0
camera-v4l2-dev-node=0
nvbuf-memory-type=3
gpu-id=0
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265
codec=1
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
bitrate=4000000
# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400
[sink1]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
[osd]
enable=1
gpu-id=0
border-width=2
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0
[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=1
batch-size=2
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=4096
height=2160
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0
# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
#model-engine-file=model_b1_gpu0_int8.engine
labelfile-path=labels.txt
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=2
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt
[tracker]
enable=1
# For NvDCF and DeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_nvmultiobjecttracker.so
# ll-config-file required to set different tracker types
# ll-config-file=../../samples/configs/deepstream-app/config_tracker_IOU.yml
ll-config-file=../../samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
# ll-config-file=../../samples/configs/deepstream-app/config_tracker_NvDCF_accuracy.yml
# ll-config-file=../../samples/configs/deepstream-app/config_tracker_DeepSORT.yml
gpu-id=0
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1
[tests]
file-loop=0
```
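To toggle a section on or off without opening an editor, the `enable=` key can be flipped from the shell. A sketch, assuming GNU `sed` (as shipped in the container) and using the `[tracker]` section as a hypothetical example:

```shell
# Flip enable=1 to enable=0, but only inside the [tracker] section:
# the address range runs from the [tracker] header to the next [section] header.
CFG=deepstream_app_config_yoloV3_Mao.txt
sed -i '/^\[tracker\]/,/^\[/{s/^enable=1/enable=0/}' "$CFG"
```

The same pattern works for `[sink1]`, `[tiled-display]`, etc.; just swap the section name in the first address.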
* Then start the YOLO inference service on the edge server (e.g., the Xinmatrix GPU edge server):
The model is quite big, so loading is slow; be patient. You can also use YOLOv3-tiny or other models downloaded online.
```
root@xinmatrix:/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo_Mao# deepstream-app -c deepstream_app_config_yoloV3_Mao.txt
....
(100) conv-bn-leaky 128 x 76 x 76 256 x 76 x 76 61277790
(101) conv-bn-leaky 256 x 76 x 76 128 x 76 x 76 61311070
(102) conv-bn-leaky 128 x 76 x 76 256 x 76 x 76 61607006
(103) conv-bn-leaky 256 x 76 x 76 128 x 76 x 76 61640286
(104) conv-bn-leaky 128 x 76 x 76 256 x 76 x 76 61936222
(105) conv-linear 256 x 76 x 76 255 x 76 x 76 62001757
(106) yolo 255 x 76 x 76 255 x 76 x 76 62001757
Output yolo blob names :
yolo_83
yolo_95
yolo_107
Total number of yolo layers: 257
Building yolo network complete!
Building the TensorRT Engine...
....
```
Converting the model may take about 4 minutes (I do not know exactly why; my guess is platform differences, but just waiting is fine). After you see `**PERF: FPS 0 (Avg)`, online inference has started.
For YOLOv3, it can run at about 24 FPS:
```
**PERF: 23.80 (23.55)
**PERF: 24.17 (23.75)
**PERF: 23.78 (23.76)
**PERF: 24.07 (23.86)
**PERF: 24.00 (23.89)
**PERF: 23.85 (23.87)
**PERF: 24.08 (23.89)
**PERF: 23.92 (23.90)
**PERF: 23.96 (23.91)
**PERF: 23.96 (23.92)
**PERF: 23.81 (23.91)
```
## Client side:
As configured in `deepstream_app_config_yoloV3_Mao.txt`, the RTSP port is `8554` and the default topic is `ds-test`. You can play the stream remotely with the VLC player, with other mobile apps that support RTSP, or even with our GStreamer-based tool.
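
As a quick check, VLC can also open the annotated stream from the command line. A sketch, assuming VLC is installed and using `10.1.7.24` as the edge server's address (the same address used in the GStreamer script example in this tutorial):

```shell
# Play the DeepStream RTSP output with a small network cache to reduce latency.
vlc --network-caching=200 rtsp://10.1.7.24:8554/ds-test
```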

You can change the resolution and the border sizes of the bounding boxes to increase visibility.
To improve latency, we can use our GStreamer script `gst_client.sh`:
```
#./gst_client.sh -s 10.1.7.24 -p 8554 -t ds-test
...
gst-launch-1.0 rtspsrc location=rtsp://10.1.7.24:8554/ds-test ! rtph264depay ! h264parse ! d3d11h264dec ! glimagesink sync=false
```
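
Note that `d3d11h264dec` is a Windows (Direct3D 11) decoder element. On a Linux client, a software-decode equivalent would look like the following sketch, assuming the `gstreamer1.0-libav` and base plugin packages are installed:

```shell
# Linux variant of the low-latency playback pipeline: software H.264 decode
# via avdec_h264, with autovideosink picking a suitable display sink.
gst-launch-1.0 rtspsrc location=rtsp://10.1.7.24:8554/ds-test latency=0 ! \
    rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! autovideosink sync=false
```

On a machine with an NVIDIA GPU and the right plugins, swapping `avdec_h264` for a hardware decoder should lower CPU usage further.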

## Some extensions:
To get the details of the bounding box coordinates, you may follow this guide:
https://forums.developer.nvidia.com/t/get-detected-bounding-box-infomations-from-deepstream-yolo-app/77327/4
My first guess is that you would change the file `/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo_Mao/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp`, function `decodeYoloV3Tensor`.
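
To find where that parsing code lives before editing, a quick search inside the container (paths as used in this tutorial's image):

```shell
# Locate the YOLOv3 tensor-decoding code and any other references to it
# under the custom parser directory.
grep -rn "decodeYoloV3Tensor" \
  /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo_Mao/nvdsinfer_custom_impl_Yolo/
```

After editing, rebuild the custom parser library (`make` in `nvdsinfer_custom_impl_Yolo/`) so the app picks up the change.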