---
title: Yolov5 training
tags: Object Detection, Yolov5
description: Yolov5 training process
---

# Object Detection

Object detection identifies object instances in images and frames them with bounding boxes. Inference provides semantic understanding of images and videos for a variety of applications such as human behavior analysis, face recognition, and autonomous driving. [**Yolov5**](https://github.com/ultralytics/yolov5) delivers high accuracy and strong performance on object detection.

## Data Preparation

**Annotate Datasets**

Use a tool such as [**Roboflow**](https://roboflow.com/), [**Labelme**](https://github.com/wkentaro/labelme), or [**Darklabel**](https://github.com/darkpgmr/DarkLabel) to label your images, then export your labels to YOLO format, with one *.txt file per image (if there are no objects in an image, no *.txt file is required). The *.txt file specifications are:

**One row per object**

Each row stores one box as `(label, center_x, center_y, label_w, label_h)`, with all coordinates normalized to the range 0–1. If your boxes are in pixels, take the center point, width, and height `(x1, y1, w1, h1)` of the label in the image, then divide `x1` and `w1` by the image width and `y1` and `h1` by the image height to get the normalized coordinates `(center_x, center_y, label_w, label_h)`; a short conversion sketch follows the directory layout below.

![](https://i.imgur.com/Cv8689t.jpg)

The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27); class numbers are zero-indexed (start from 0):

![](https://i.imgur.com/VvUT8jZ.png)

**Directory Organization**

Each txt file should have the same name as its jpg file. After annotation is complete, all files sit in one folder; the next step is to separate the txt and jpg files into different folders with the following structure:

```
├── data
│   ├── images
│   │   ├── 001.jpg
│   │   ├── 002.jpg
│   │   ├── 003.jpg
│   │   ├── 004.jpg
│   │   ├── 005.jpg
│   │   ├── 006.jpg
│   │   .
│   │   .
│   │   .
│   ├── labels
│   │   ├── 001.txt
│   │   ├── 002.txt
│   │   ├── 003.txt
│   │   ├── 004.txt
│   │   ├── 005.txt
│   │   ├── 006.txt
│   │   .
│   │   .
│   │   .
```
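As a concrete illustration of the normalization described above, here is a minimal sketch of the pixel-to-YOLO conversion. `to_yolo_line` is a hypothetical helper written for this post (not part of YOLOv5), and it assumes the box is given as its center point plus width and height in pixels:

```
# Minimal sketch (hypothetical helper, not part of YOLOv5):
# convert one pixel-space box to a normalized YOLO label row.
# Assumes (x1, y1) is the box CENTER and (w1, h1) its size, all in pixels.
def to_yolo_line(label, x1, y1, w1, h1, img_w, img_h):
    return "%d %.6f %.6f %.6f %.6f" % (
        label, x1 / img_w, y1 / img_h, w1 / img_w, h1 / img_h)

# Example: a 100x50 box centered in a 640x480 image (class 0)
print(to_yolo_line(0, 320, 240, 100, 50, 640, 480))
# -> 0 0.500000 0.500000 0.156250 0.104167
```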
## Augment datasets

The next step is to augment the dataset. The following script rotates each image and varies its brightness, color, contrast, and sharpness to enlarge the dataset, then shuffles the results and splits them into training and validation sets.

**augment_data.py**

```
import numpy as np
import os
import cv2
import argparse
import shutil
import logging
import random
import math
from tqdm import tqdm
from PIL import Image, ImageEnhance

augment_num = 10  # number of shuffled copies written per source image


def processdata(augment_num):
    # Shuffle the augmented variants and split them 9:1 into train/val,
    # then write each image and its label file to disk.
    global num
    shuffle_list = np.arange(len(label_list))
    for _ in range(augment_num):
        for i in range(3):
            random.shuffle(shuffle_list)
        total_l = len(shuffle_list)
        total_l = math.ceil(total_l / 10)
        Train_list = shuffle_list[total_l:]
        Val_list = shuffle_list[:total_l]
        for train_index in Train_list:
            img_list[train_index].save("%s/images/train/%s.png" % (root, num))
            f0 = open("%s/labels/train/%s.txt" % (root, num), "w")
            for label_row in label_list[train_index]:
                for coor in label_row:
                    f0.write(coor)
            f0.close()
            num += 1
        for val_index in Val_list:
            img_list[val_index].save("%s/images/val/%s.png" % (root, num))
            f0 = open("%s/labels/val/%s.txt" % (root, num), "w")
            for label_row in label_list[val_index]:
                for coor in label_row:
                    f0.write(coor)
            f0.close()
            num += 1


def LabelAppend(coor):
    # Copy the label rows unchanged (for the unrotated image).
    l = []
    for i in range(len(coor)):
        coor1 = coor[i].split("\n")[0].split(" ")
        l.append([coor1[0], " ", coor1[1], " ", coor1[2], " ",
                  coor1[3], " ", coor1[4], "\n"])
    label_list.append(l)


def LabelRotateAppend(coor):
    # Rotate each normalized box 90 degrees clockwise:
    # (cx, cy) -> (1 - cy, cx), and swap width with height.
    l = []
    for i in range(len(coor)):
        coor1 = coor[i].split("\n")[0].split(" ")
        l.append([coor1[0], " ", str(1 - float(coor1[2])), " ", coor1[1], " ",
                  coor1[4], " ", coor1[3], "\n"])
    label_list.append(l)


def ImgRotate(img):
    # Keep the original image and append a 90-degree-clockwise rotated copy.
    img_list.append(img)
    img = np.array(img)
    img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
    img = Image.fromarray(img)
    img_list.append(img)


def ImgFind(path):
    # Map an image file name to its label file path; return None for
    # files that are not images.
    for ext in (".jpg", ".png", ".jpeg"):
        if path.lower().endswith(ext):
            return os.path.join(txt_orgpath, path[:-len(ext)] + ".txt")
    return None


def call_functions(img, coor):
    LabelAppend(coor)
    ImgRotate(img)
    LabelRotateAppend(coor)


def BrightEnhance(Img, factor):
    enh_bri = ImageEnhance.Brightness(Img)
    return enh_bri.enhance(factor)


def ColorEnhance(Img, factor):
    enh_col = ImageEnhance.Color(Img)
    return enh_col.enhance(factor)


def ContrastEnhance(Img, factor):
    enh_con = ImageEnhance.Contrast(Img)
    return enh_con.enhance(factor)


def SharpEnhance(Img, factor):
    enh_sha = ImageEnhance.Sharpness(Img)
    return enh_sha.enhance(factor)


def arg():
    parser = argparse.ArgumentParser()
    parser.add_argument('--img', type=str, default="/home/data/images/",
                        help='folder of source images')
    parser.add_argument('--txt', type=str, default="/home/data/labels/",
                        help='folder of source YOLO label files')
    opt = parser.parse_args()
    return opt


if __name__ == "__main__":
    opt = arg()
    txt_orgpath = opt.txt
    img_orgpath = opt.img
    root = "/home/datasets/"
    # Recreate the output directory tree from scratch.
    if os.path.isdir(root):
        shutil.rmtree(root)
    for task in ["images", "labels"]:
        for dir_p in ["train", "val"]:
            os.makedirs(os.path.join(root, task, dir_p))
    LOGGER = logging.getLogger("")
    logging.basicConfig(level='INFO')
    LOGGER.info("start augmenting dataset")
    num = 0
    for path in tqdm(os.listdir(img_orgpath)):
        img_list, label_list, coor = [], [], ''
        txtpath = ImgFind(path)
        if txtpath is None:  # skip non-image files
            continue
        Img = Image.open(os.path.join(img_orgpath, path))
        if os.path.exists(txtpath):
            f = open(txtpath, "r")
            coor = f.readlines()
            f.close()
        # Brightness / color / sharpness variants at several factors;
        # each variant is stored twice (original orientation and rotated).
        for i in np.arange(0.4, 1.6, 0.3):
            bright_img1 = BrightEnhance(Img, i)
            call_functions(bright_img1, coor)
            color_img1 = ColorEnhance(Img, i)
            call_functions(color_img1, coor)
            sharp_img1 = SharpEnhance(Img, i)
            call_functions(sharp_img1, coor)
        for i in np.arange(0.8, 1.3, 0.2):
            contrast_img1 = ContrastEnhance(Img, i)
            call_functions(contrast_img1, coor)
        processdata(augment_num)
```
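To run the script, point `--img` and `--txt` at the image and label folders prepared in the previous step (the paths below are the script's own defaults); the augmented, split dataset is written to `/home/datasets/`:

```
$ python3 augment_data.py --img /home/data/images/ --txt /home/data/labels/
```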
**Directory Organization**

After the augmentation process is done, the dataset has been split into training data and validation data, with 9/10 of the data points going to the former and the remaining 1/10 to the latter. The data directory organization will have the following structure:

```
├── datasets
    ├── images
    │   ├── train
    │   │   ├── 001.jpg
    │   │   ├── 002.jpg
    │   ├── val
    │   │   ├── 003.jpg
    │   │   ├── 004.jpg
    │   .
    │   .
    ├── labels
    │   ├── train
    │   │   ├── 001.txt
    │   │   ├── 002.txt
    │   ├── val
    │   │   ├── 003.txt
    │   │   ├── 004.txt
    │   .
    │   .
    └── data.yaml
```

data.yaml holds the training path, the validation path, the class names, and the number of classes:

```
names: ["person", "car"]  # class names
nc: 2  # number of classes
train: /home/datasets/images/train/  # train dataset folder
val: /home/datasets/images/val/  # val dataset folder
```

## Start Training

Install the YOLOv5 environment by cloning the repository and installing its requirements (YOLOv5 also provides a Docker image as an alternative), then start training with the data.yaml prepared above:

```
$ git clone https://github.com/ultralytics/yolov5 && cd yolov5 && pip install -r requirements.txt
$ python3.8 train.py --data /home/datasets/data.yaml --batch-size 64
```

---

## References

* https://github.com/ultralytics/yolov5

## Thank you! :dash:

You can find me on
- GitHub: https://github.com/shaung08
- Email: a2369875@gmail.com