---
tags: NVIDIA, NVIDIA GEFORCE RTX-3070, NVIDIA GEFORCE RTX-3080
---
# NVIDIA RTX-3070/3080 benchmark and burn-in environment usage and build SOP
###### tags: NVIDIA GEFORCE RTX-3070, NVIDIA GEFORCE RTX-3080
## HW equipment
* Motherboard: BIS-3101 with NVIDIA GEFORCE RTX-3070/3080 x 1
* CPU: Intel® Core™ i7-9700E @ 2.60GHz x 1
* RAM: 16GB SODIMM x 2
* OS: Ubuntu 18.04 LTS Desktop, kernel 5.4.0 (UEFI)
* Docker: 19.03
* CUDA: 11.1
* NVIDIA TensorRT Docker image version: 20.11-py3
* NVIDIA TensorFlow Docker image version: 20.11-tf2-py3
## environment usage SOP
### gpu_monitor
We use the `nvidia-smi` tool to monitor GPU temperature, power draw (watts), running processes, and clock frequencies.
The script below prints these readings in a loop and appends them to a log file.
You can run a test and monitor the GPU at the same time by switching TTYs in Linux or by using a terminal multiplexer such as tmux (see the sketch after the script).
```bash=
#!/bin/bash
# GPU monitor: print nvidia-smi status plus clock details and append them to gpu_log.txt
echo " " > ./gpu_log.txt
read -p "please insert interval (sec) : " interval
for ((i = 1; i > 0; i++))                    # endless loop; stop with Ctrl-C
do
    echo -e "\n===== i : ${i} =====\n" > ./gpu_log_tmp.txt
    nvidia-smi >> ./gpu_log_tmp.txt
    cat ./gpu_log_tmp.txt
    sleep 2
    nvidia-smi -q -d CLOCK | grep -v N/A | grep -v "Not Found" >> ./gpu_log_tmp.txt
    cat ./gpu_log_tmp.txt
    cat ./gpu_log_tmp.txt >> ./gpu_log.txt   # keep the full history in gpu_log.txt
    sleep "${interval}"
done
```
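For example, here is a minimal tmux sketch for running the monitor and the test side by side; the session name `gpu_test` and the script filename `gpu_monitor.sh` are only assumptions, name them however you like:
```bash=
# Minimal sketch, assuming the monitor script above was saved as ./gpu_monitor.sh
tmux new-session -d -s gpu_test                       # detached session for the test run
tmux send-keys -t gpu_test './gpu_monitor.sh' C-m     # window 0: start the monitor (it will ask for the interval)
tmux new-window -t gpu_test -n benchmark              # window 1: run the benchmark / burn-in here
tmux attach -t gpu_test                               # switch windows with Ctrl-b 0 / Ctrl-b 1, detach with Ctrl-b d
```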
### benchmark method
```bash=
$ sudo su                    # enter root privileges
$ cd /home/aewin
$ chmod 777 ./benchmark.sh
$ ./benchmark.sh
```
After that, you will be inside the TensorRT Docker container.
```bash=
$ cd /workspace/tensorrt/data/resnet50
$ ./benchmark.sh             # there are two modes, INT8 and FP16; choose the one you need
```
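If you want to keep the benchmark output next to the GPU log, one option is to pipe the run through `tee`; the log file name and naming scheme below are only suggestions:
```bash=
# Sketch: save each benchmark run to a timestamped log file (file name is an assumption)
LOG="benchmark_$(date +%Y%m%d_%H%M%S).log"
./benchmark.sh 2>&1 | tee "${LOG}"
```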
### burn-in method
```bash=
$ sudo su                    # enter root privileges
$ cd /home/aewin
$ chmod 777 ./burn.sh
$ ./burn.sh
```
After that, you will be inside the TensorFlow Docker container.
```bash=
$ cd /workspace/nvidia-examples/cnn
$ ./burn.sh
```
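During a long burn-in it is also useful to watch utilization, temperature, and power draw in a compact form; `nvidia-smi` can print selected fields as CSV at a fixed interval:
```bash=
# Print GPU utilization, temperature, and power draw every 5 seconds (stop with Ctrl-C)
nvidia-smi --query-gpu=timestamp,utilization.gpu,temperature.gpu,power.draw --format=csv -l 5
```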
## environment build SOP
※ Execute the following steps with root privileges.
### set the CLI interface as the default interface
I installed the Ubuntu 18.04 Desktop version because switching it to a CLI interface is easier than upgrading the Linux kernel.
```bash=
sudo systemctl set-default multi-user.target
```
or
```bash=
sudo systemctl set-default runlevel3.target
```
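You can verify the default target, and switch back to the desktop GUI later if needed:
```bash=
systemctl get-default                          # should now print multi-user.target
sudo systemctl set-default graphical.target    # revert to the graphical interface if required
```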
### install NVIDIA drivers
Download the latest stable driver from the official NVIDIA website: https://www.nvidia.com/zh-tw/geforce/drivers/
After downloading the .run file:
```bash
# chmod 777 NVIDIA-Linux-x86_64-455.38.run   # note: the filename depends on the driver version you downloaded
# apt install gcc make
# ./NVIDIA-Linux-x86_64-455.38.run           # when the installer raises an error message, choose the "continue installation" option
# reboot
```
After rebooting, run:
```bash=
nvidia-smi
```
If the driver is installed successfully, `nvidia-smi` shows a status table like the picture below.
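As an additional check, you can query the GPU name and driver version directly; this only uses standard `nvidia-smi` query fields:
```bash=
# Print the detected GPU name and driver version as CSV
nvidia-smi --query-gpu=name,driver_version --format=csv
```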

### install docker
```bash=
$ sudo apt-get remove docker docker-engine docker.io containerd runc
$ sudo apt-get update
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo apt-key fingerprint 0EBFCD88
```
Make sure the key fingerprint in the output is:
9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
```bash=
$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io
```
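As a quick sanity check (following Docker's own install guide), you can run the hello-world image before moving on:
```bash=
# Verify that the Docker engine can pull and run containers
sudo docker run --rm hello-world
```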
### install nvidia container toolkit
```bash=
$ distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
$ sudo usermod -aG docker $USER
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
```
After rebooting, run:
```bash=
# docker run --gpus all nvidia/cuda:11.1-base nvidia-smi
```
If everything works, the output looks like the picture below:

### download the Docker image and build the benchmark environment (TensorRT)
```bash=
# docker pull nvcr.io/nvidia/tensorrt:21.09-py3
```
* We only have to pull the Docker image once.
```bash=
$ docker run --gpus '"device=0"' -it --name trt_2011_root -w /workspace/tensorrt/data/resnet50/ nvcr.io/nvidia/tensorrt:21.09-py3
```
* The BIS-3101 is equipped with one NVIDIA RTX-3070/3080. `--gpus '"device=0"'` assigns GPU No. 0 to the container as its resource, while `--gpus all` assigns every GPU to the container. For anything else, search for the keyword `docker run --gpus`.
* `-it`: `-i` means interactive, keeping STDIN open even when we are not attached to the container; `-t` allocates a pseudo-TTY for the container.
* `--rm`: the container is removed automatically when we exit it.
* `-v` (volume): mounts a host path into the container so the host and the container can exchange files (see the sketch after this list).
* `-w` (workdir): the working directory you land in after entering the container.
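For example, a hedged variant of the run above that also mounts a host folder; the `/home/aewin/shared` path is only an assumption, any host directory works:
```bash=
# Same TensorRT container as above, plus a shared volume for exchanging logs/models with the host
docker run --gpus '"device=0"' -it --rm \
    -v /home/aewin/shared:/workspace/shared \
    -w /workspace/tensorrt/data/resnet50/ \
    nvcr.io/nvidia/tensorrt:21.09-py3
```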
Inside the TensorRT container, run:
```bash=
$ /opt/tensorrt/python/python_setup.sh
$ cd /workspace/tensorrt/data/resnet50
$ vi benchmark.sh
```
In `benchmark.sh`:
```bash=
#!/bin/bash
# Run the ResNet-50 trtexec benchmark in INT8 or FP16 mode
read -p "for int8 test, press 1; for fp16 test, press 2 : " testmode
if [ "${testmode}" -eq 1 ]; then
    /workspace/tensorrt/bin/trtexec --batch=128 --iterations=400 --workspace=1024 --percentile=99 --deploy=ResNet50_N2.prototxt --model=ResNet50_fp32.caffemodel --output=prob --int8
elif [ "${testmode}" -eq 2 ]; then
    /workspace/tensorrt/bin/trtexec --batch=128 --iterations=400 --workspace=1024 --percentile=99 --deploy=ResNet50_N2.prototxt --model=ResNet50_fp32.caffemodel --output=prob --fp16
else
    echo "wrong input !!!"
fi
```
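Since the script reads the mode from standard input, it can also be driven non-interactively once it is executable, which is handy when scripting repeated runs:
```bash=
echo 1 | ./benchmark.sh    # INT8 run without waiting at the prompt
echo 2 | ./benchmark.sh    # FP16 run
```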
After that:
```bash=
$ chmod 777 ./benchmark.sh
$ exit
```
to return to the host terminal.
```bash=
$ docker commit trt_2011_root
$ docker images                          # the image without a name is the one we just committed; note its image ID
$ docker tag docker_image_ID trt:2011    # replace docker_image_ID with that image ID
$ vi ./benchmark.sh
```
In `benchmark.sh` (on the host):
```bash=
#!/bin/bash
docker run --gpus '"device=0"' -it --rm --name trt_2011 -w /workspace/tensorrt/data/resnet50/ trt:2011
```
### download the Docker image and build the burn-in environment (TensorFlow)
```bash=
# docker pull nvcr.io/nvidia/tensorflow:21.09-tf1-py3
```
* We only have to pull the Docker image once.
```bash
# docker run --gpus '"device=0"' -it --name tf_2011_root -w /workspace/nvidia-examples/cnn nvcr.io/nvidia/tensorflow:21.09-tf1-py3
```
Inside the TensorFlow container, run:
```bash=
$ cd /workspace/nvidia-examples/cnn
$ vi ./burn.sh
```
In `burn.sh`:
```bash=
#!/bin/bash
# Endless burn-in: keep re-running ResNet-50 FP16 training until stopped with Ctrl-C
for ((i = 1; i > 0; i++))
do
    mpirun --allow-run-as-root -np 1 --mca btl ^openib python -u ./resnet.py --batch_size 128 --num_iter 28800 --precision fp16 --iter_unit batch
done
```
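For scale, each pass of the loop processes 28800 batches × 128 images = 3,686,400 images, and the outer `for` loop restarts `resnet.py` indefinitely, so the burn-in keeps going until you stop it (Ctrl-C) or stop the container.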
After that:
```bash=
$ chmod 777 ./burn.sh
$ exit
```
to return to the host terminal.
```bash=
$ docker commit tf_2011_root
$ docker images                            # the image without a name is the one we just committed; note its image ID
$ docker tag docker_image_ID tf:2011tf2    # replace docker_image_ID with that image ID
$ vi ./burn.sh
```
In `burn.sh` (on the host):
```bash=
#!/bin/bash
docker run --gpus '"device=0"' -it --rm --name tf_2011tf2 -w /workspace/nvidia-examples/cnn tf:2011tf2
```
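If you prefer to leave the burn-in running unattended, a hedged variant of this wrapper starts the container detached instead of interactively and follows its output; it reuses the `tf:2011tf2` image committed above, and the container name `tf_2011tf2_bg` is only an assumption:
```bash=
# Sketch: run the burn-in detached and follow its log
docker run --gpus '"device=0"' -d --name tf_2011tf2_bg -w /workspace/nvidia-examples/cnn tf:2011tf2 ./burn.sh
docker logs -f tf_2011tf2_bg    # Ctrl-C stops following the log, not the container
docker stop tf_2011tf2_bg       # stop (and later `docker rm`) the container when the burn-in is done
```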
## sources
1. https://github.com/YeeHong/CB-1921A_with_NVIDIA-T4_benchmark
1. Benchmark_SOP_T4 Benchmark Guide.docx
1. NGC-Ready-Validated-Server-Cookbook-ubuntu-18.04-v1.4.1-2020-02-03 v1.docx
1. Measuring_Training_and_Inferencing_Performance_on_NVIDIA_AI_Platforms-nv.pdf
1. https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow
1. https://docs.docker.com/engine/install/ubuntu/
1. https://medium.com/@grady1006/ubuntu18-04%E5%AE%89%E8%A3%9Ddocker%E5%92%8Cnvidia-docker-%E4%BD%BF%E7%94%A8%E5%A4%96%E6%8E%A5%E9%A1%AF%E5%8D%A1-1e3c404c517d
1. https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt
1. https://blog.wu-boy.com/2019/10/three-ways-to-setup-docker-user-and-group/
1. https://docs.docker.com/config/containers/resource_constraints/
1. https://blog.csdn.net/Flying_sfeng/article/details/103343813