---
title: YOLOv5 training
tags: Object Detection, YOLOv5
description: YOLOv5 training process
---
# Object Detection
Object detection identifies and localizes object instances in images. Inference provides semantic understanding of images and videos for a variety of applications such as human behavior analysis, face recognition, and autonomous driving. [**YOLOv5**](https://github.com/ultralytics/yolov5) is a widely used detector that offers high accuracy and strong performance on object detection tasks.
## Data Preparation
**Dataset Annotation**
Use a tool such as [**Roboflow**](https://roboflow.com/), [**Labelme**](https://github.com/wkentaro/labelme), or [**DarkLabel**](https://github.com/darkpgmr/DarkLabel) to label your images and export the labels to YOLO format, with one *.txt file per image (if an image contains no objects, no *.txt file is required). The *.txt file specification is:
**One row per object**
Each row has the form `(class, center_x, center_y, label_w, label_h)`, with every value in the range 0 - 1. If your boxes are in pixels, take the label's center point, width, and height (x1, y1, w1, h1) in the image, then divide x1 and w1 by the image width and y1 and h1 by the image height to get the normalized coordinates (center_x, center_y, label_w, label_h), as in the sketch below.
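A minimal normalization sketch (the function name `to_yolo` and the sample numbers are hypothetical, chosen only to illustrate the division above):
```
def to_yolo(cx, cy, w, h, img_w, img_h):
    # Divide the pixel-space center/size by the image dimensions
    return cx / img_w, cy / img_h, w / img_w, h / img_h

# A 100x200 px box centered at (320, 240) in a 640x480 image:
print(to_yolo(320, 240, 100, 200, 640, 480))  # (0.5, 0.5, 0.15625, 0.4166...)
```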

For example, a label file for an image containing two persons (class 0) and a tie (class 27) would have three rows; class numbers are zero-indexed (start from 0).
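Such a label file might look like the following (the coordinate values here are illustrative, not taken from a real image):
```
0 0.48 0.63 0.69 0.71
0 0.74 0.52 0.31 0.93
27 0.36 0.79 0.07 0.40
```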

**Directory Organization**
Each *.txt file must have the same base name as its corresponding image file. After annotation is complete, all files sit in a single folder; the next step is to separate the *.txt and image files into different folders with the following structure (a helper sketch follows the tree):
```
├── data
│   ├── images
│   │   ├── 001.jpg
│   │   ├── 002.jpg
│   │   ├── 003.jpg
│   │   ├── 004.jpg
│   │   ├── 005.jpg
│   │   ├── 006.jpg
│   │   └── ...
│   └── labels
│       ├── 001.txt
│       ├── 002.txt
│       ├── 003.txt
│       ├── 004.txt
│       ├── 005.txt
│       ├── 006.txt
│       └── ...
```
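A minimal sketch of this separation step, assuming the mixed files live in a `data/` folder (the folder name and this helper are illustrative, not part of the YOLOv5 tooling):
```
import os
import shutil

src = "data"  # folder holding the mixed image and label files (assumed)
for sub in ("images", "labels"):
    os.makedirs(os.path.join(src, sub), exist_ok=True)

for name in os.listdir(src):
    path = os.path.join(src, name)
    if not os.path.isfile(path):
        continue
    if name.lower().endswith((".jpg", ".jpeg", ".png")):
        shutil.move(path, os.path.join(src, "images", name))
    elif name.lower().endswith(".txt"):
        shutil.move(path, os.path.join(src, "labels", name))
```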
## Augment Datasets
The next step is to augment the dataset. The following script rotates each image and varies its brightness, color, contrast, and sharpness, then splits the results into training and validation sets.
**augment_data.py**
```
import numpy as np
import os
import cv2
import argparse
import shutil
import logging
import random
import math
from tqdm import tqdm
from PIL import Image, ImageEnhance

augment_num = 10

def processdata(augment_num):
    """Shuffle the augmented samples, then write 9/10 to train and 1/10 to val.

    The split/write pass repeats augment_num times, so every sample is
    duplicated augment_num times across reshuffled splits."""
    global num
    shuffle_list = np.arange(len(label_list))
    for _ in range(augment_num):
        for i in range(3):
            random.shuffle(shuffle_list)
        total_l = math.ceil(len(shuffle_list) / 10)
        Train_list = shuffle_list[total_l:]
        Val_list = shuffle_list[:total_l]
        for train_index in Train_list:
            img_list[train_index].save("%s/images/train/%s.png" % (root, num))
            with open("%s/labels/train/%s.txt" % (root, num), "w") as f0:
                for label_index in label_list[train_index]:
                    for coor in label_index:
                        f0.write(coor)
            num += 1
        for val_index in Val_list:
            img_list[val_index].save("%s/images/val/%s.png" % (root, num))
            with open("%s/labels/val/%s.txt" % (root, num), "w") as f0:
                for label_index in label_list[val_index]:
                    for coor in label_index:
                        f0.write(coor)
            num += 1

def LabelAppend(coor):
    # Append the label rows unchanged (for a non-rotated image)
    l = []
    for i in range(len(coor)):
        coor1 = coor[i].split("\n")[0].split(" ")
        l.append([coor1[0], " ", coor1[1], " ", coor1[2], " ",
                  coor1[3], " ", coor1[4], "\n"])
    label_list.append(l)

def LabelRotateAppend(coor):
    # For a 90-degree clockwise rotation in normalized coordinates:
    # (x, y, w, h) -> (1 - y, x, h, w)
    l = []
    for i in range(len(coor)):
        coor1 = coor[i].split("\n")[0].split(" ")
        l.append([coor1[0], " ", str(1 - float(coor1[2])), " ",
                  coor1[1], " ", coor1[4], " ", coor1[3], "\n"])
    label_list.append(l)

def ImgRotate(img):
    # Append the image as-is, then a 90-degree clockwise rotated copy
    img_list.append(img)
    img = np.array(img)
    img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
    img = Image.fromarray(img)
    img_list.append(img)

def ImgFind(path):
    # Map an image filename to its label file path ("" if unknown extension)
    txtpath = ""
    if path.lower().endswith(".jpg"):
        txtpath = txt_orgpath + path.split(".jpg")[0] + ".txt"
    elif path.lower().endswith(".png"):
        txtpath = txt_orgpath + path.split(".png")[0] + ".txt"
    elif path.lower().endswith(".jpeg"):
        txtpath = txt_orgpath + path.split(".jpeg")[0] + ".txt"
    return txtpath

def call_functions(img, coor):
    # Store the enhanced image plus its rotated copy, with matching labels
    LabelAppend(coor)
    ImgRotate(img)
    LabelRotateAppend(coor)

def BrightEnchance(Img, factor):
    return ImageEnhance.Brightness(Img).enhance(factor)

def ColorEnchance(Img, factor):
    return ImageEnhance.Color(Img).enhance(factor)

def ContrastEnchance(Img, factor):
    return ImageEnhance.Contrast(Img).enhance(factor)

def SharpEnchance(Img, factor):
    return ImageEnhance.Sharpness(Img).enhance(factor)

def arg():
    parser = argparse.ArgumentParser()
    parser.add_argument('--img', type=str, default="/home/data/images/", help='source image folder')
    parser.add_argument('--txt', type=str, default="/home/data/labels/", help='source label folder')
    opt = parser.parse_args()
    return opt

if __name__ == "__main__":
    opt = arg()
    txt_orgpath = opt.txt
    img_orgpath = opt.img
    root = "/home/datasets/"
    # Recreate the output directory tree from scratch
    if os.path.isdir(root):
        shutil.rmtree(root)
    for task in ["images", "labels"]:
        for dir_p in ["train", "val"]:
            os.makedirs(os.path.join(root, task, dir_p))
    LOGGER = logging.getLogger("")
    logging.basicConfig(level='INFO')
    LOGGER.info("start augmenting dataset")
    num = 0
    for path in tqdm(os.listdir(img_orgpath)):
        img_list, label_list, coor = [], [], ''
        txtpath = ImgFind(path)
        Img = Image.open(os.path.join(img_orgpath, path))
        if os.path.exists(txtpath):
            with open(txtpath, "r") as f:
                coor = f.readlines()
        # Brightness, color, and sharpness sweeps
        # (each call also adds a rotated copy)
        for i in np.arange(0.4, 1.6, 0.3):
            call_functions(BrightEnchance(Img, i), coor)
            call_functions(ColorEnchance(Img, i), coor)
            call_functions(SharpEnchance(Img, i), coor)
        # Contrast sweep
        for i in np.arange(0.8, 1.3, 0.2):
            call_functions(ContrastEnchance(Img, i), coor)
        processdata(augment_num)
```
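The script can then be run against the annotated folders; the paths below are simply its argument defaults:
```
$ python3 augment_data.py --img /home/data/images/ --txt /home/data/labels/
```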
**Directory Organization**
After the augmentation process is done, the dataset has been split into training data and validation data: 9/10 of the samples go to the former and the remaining 1/10 to the latter.
The data directory is organized as follows (the script writes sequentially numbered *.png files):
```
├── datasets
│   ├── images
│   │   ├── train
│   │   │   ├── 0.png
│   │   │   ├── 1.png
│   │   │   └── ...
│   │   └── val
│   │       ├── 2.png
│   │       ├── 3.png
│   │       └── ...
│   ├── labels
│   │   ├── train
│   │   │   ├── 0.txt
│   │   │   ├── 1.txt
│   │   │   └── ...
│   │   └── val
│   │       ├── 2.txt
│   │       ├── 3.txt
│   │       └── ...
│   └── data.yaml
```
data.yaml contains the training path, validation path, class names, and number of classes:
```
names: ["person", "car"]              # class names
nc: 2                                 # number of classes
train: /home/datasets/images/train/  # training images folder
val: /home/datasets/images/val/      # validation images folder
```
## Start Training
Install the YOLOv5 environment; YOLOv5 also provides a Docker image for installation.
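A typical source install, following the YOLOv5 README, looks like:
```
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
```
Then start training with the `data.yaml` created above: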
```
$ python3.8 train.py --data /home/datasets/data.yaml --batch-size 64
```
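Other commonly used `train.py` options include `--weights` to start from pretrained weights, `--img` to set the input size, and `--epochs` to set the training length; for example (the values here are illustrative):
```
$ python3.8 train.py --data /home/datasets/data.yaml --weights yolov5s.pt --img 640 --epochs 300 --batch-size 64
```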
---
## References
* https://github.com/ultralytics/yolov5
## Thank you! :dash:
You can find me on
- GitHub: https://github.com/shaung08
- Email: a2369875@gmail.com