# Create object detection model for KTP using YOLOv8 and CVAT
## What is YOLOv8
YOLOv8 is version 8 of the YOLO family of models, built by Ultralytics. YOLO (You Only Look Once), a popular object detection and image segmentation model, was developed by Joseph Redmon and Ali Farhadi at the University of Washington. Launched in 2015, YOLO quickly gained popularity for its high speed and accuracy. See: https://docs.ultralytics.com/#yolo-a-brief-history
## What is CVAT
CVAT stands for Computer Vision Annotation Tool. CVAT is a free, online, interactive video and image annotation tool for computer vision. It was developed by Intel and is used to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from professional data annotation teams. See: https://www.cvat.ai/
## Requirement
Before we begin, a brief backstory: this article is based on a Medium article, which I read in textify mode (because it is blocked by a paywall). You can read it here: https://txtify.it/https://medium.com/dkatalis/training-a-custom-object-detector-in-half-a-day-with-yolov8-5e1475fe201e
As a simple roadmap, here is how we will detect a KTP (Kartu Tanda Penduduk, the Indonesian identity card) in an image, which is an object detection task:
1. We annotate our image with CVAT
2. We export the annotation as YOLO format
3. We set up the training code in Google Colab
4. We test it
What you should prepare:
1. CVAT.ai account
2. A bunch of KTP images
3. Google account
### Step #1 - Preparing the dataset
My team collected roughly 165 KTP images, which is enough to produce a decent-quality KTP detection model.
Go to cvat.ai, register, and create a task. You can name it whatever you want, but don't forget to add **1 label**; in my case it's **id-card**. Then upload your dataset images as a zip (recommended).

### Step #2 - Start annotating
Click on your job number, in my case it's **Job #427485**. You will be directed to the main annotation tool.

Start annotating by drawing a rectangle: choose the label, then choose **By 2 Points** and **Shape**. Draw the KTP's box, give it a little padding, and you're done.
Go to the next frame/image by clicking the right arrow. You can speed up the process with keyboard shortcuts: **N** to draw a new box and **F** to go to the next frame. Once you're done annotating everything, click the **Save** button and go back to the previous page.
### Step #3 - Exporting datasets and annotations

On the three-dots menu, choose **Export annotations**, make sure the **Export format** is **YOLO format**, and click OK.
Your annotations will be inside the `obj_train_data` folder of the archive you just downloaded.
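Each exported `.txt` file contains one line per bounding box in the normalized YOLO format: `class_id x_center y_center width height`, where the four coordinates are fractions of the image size. A minimal parsing sketch (the example coordinate values are made up):

```python
def parse_yolo_label(line: str):
    """Parse one YOLO-format annotation line into (class_id, box)."""
    parts = line.split()
    class_id = int(parts[0])
    # x_center, y_center, width, height are all normalized to [0, 1]
    x_c, y_c, w, h = map(float, parts[1:])
    return class_id, (x_c, y_c, w, h)

# Example line for our single "id-card" class (class index 0):
cls, box = parse_yolo_label("0 0.512 0.43 0.64 0.38")
print(cls, box)  # → 0 (0.512, 0.43, 0.64, 0.38)
```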
### Step #4 - Setup project
Go to Google Drive and create a folder with your project name; inside it, create a folder structure like this:
```
ktp_dataset/
├── train
│   ├── images
│   └── labels
├── val
│   ├── images
│   └── labels
└── data.yaml
```
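The structure above can also be created programmatically instead of by hand; a minimal sketch using only the standard library (run it from your project root, e.g. inside the mounted Drive folder):

```python
import os

def make_dataset_dirs(root: str = "ktp_dataset"):
    """Create the train/val images+labels folder structure."""
    for split in ("train", "val"):
        for sub in ("images", "labels"):
            os.makedirs(os.path.join(root, split, sub), exist_ok=True)

make_dataset_dirs("ktp_dataset")
```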
Upload your images to the `images` folders and the annotations (`.txt`) to the `labels` folders. Split your dataset into two parts, with the majority going to the `train` folder.
In my case, I have 165 images: I uploaded 130 to the `train` folder (for training) and the rest to the `val` folder (for validation).
Each image and its label file share the same name, differing only in extension.
Don't forget to create a `data.yaml` file in this format (`elkatepe` is my project folder name in Google Drive):
```yaml
train: /content/drive/MyDrive/elkatepe/ktp_dataset/train/images
val: /content/drive/MyDrive/elkatepe/ktp_dataset/val/images
nc: 1
names:
- id_card
```
- `train` is the path to the training images folder
- `val` is the path to the validation images folder
- `nc` is the number of classes, which here is one
- `names` is the list of class names, containing only `id_card`
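If you prefer to generate `data.yaml` from code rather than typing it by hand, a minimal sketch using plain string formatting (the `elkatepe` paths mirror my project; substitute your own dataset root and classes):

```python
def make_data_yaml(root: str, class_names: list) -> str:
    """Build a YOLO data.yaml string for the given dataset root and classes."""
    names = "\n".join(f"- {n}" for n in class_names)
    return (
        f"train: {root}/train/images\n"
        f"val: {root}/val/images\n"
        f"nc: {len(class_names)}\n"
        f"names:\n{names}\n"
    )

yaml_text = make_data_yaml("/content/drive/MyDrive/elkatepe/ktp_dataset", ["id_card"])
print(yaml_text)
```

You would then write `yaml_text` to `data.yaml` in your project folder.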
### Step #5 - Training time
Go to Google Colab and create a new notebook.
#### 5.1 Connecting to google drive
```py=
from google.colab import drive
drive.mount('/content/drive')
```
Authorize Google Colab to connect to your Google Drive. Note that the mount point must be `/content/drive` so it matches the paths we use later.
#### 5.2 Installing YOLOv8
It's simple: create a code cell and run
```bash
!pip install ultralytics
```
#### 5.3 Load model and train the data
```py=
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.train(data="/content/drive/MyDrive/elkatepe/data.yaml", epochs=20)
metrics = model.val()
```
We load the nano model of YOLOv8 for object detection and point it at the YAML file. It took approximately 0.791 hours to complete 20 epochs, and the resulting model is 6.2 MB.
#### 5.4 Test the model
Create a `test` folder inside your project folder and place some testing images in it.
```py=
import glob

import cv2
from matplotlib import pyplot as plt
from ultralytics import YOLO

model = YOLO("runs/detect/train3/weights/best.pt")
for file in glob.glob("/content/drive/MyDrive/elkatepe/test/**"):
    result = model(cv2.imread(file))
    # plot() returns a BGR array; convert to RGB for matplotlib
    res_plotted = cv2.cvtColor(result[0].plot(), cv2.COLOR_BGR2RGB)
    fig, ax = plt.subplots(1, 1, figsize=(10, 10))
    ax.imshow(res_plotted)
    plt.title("Object Detection Result")
    plt.show()
```
The result:

#### 5.5 Exporting the model
```py=
model.export(format='tfjs')
```
I'm exporting it as a tfjs model so I can use it with TensorFlow.js. You can export it to various formats. See: https://docs.ultralytics.com/modes/export/#arguments
#### 5.6 Download the model
Now it's time to copy your model from the Colab runtime to your Google Drive and download it.
```bash
!pip install google-colab-shell
```
After installing google-colab-shell, run the following:
```py=
from google_colab_shell import getshell
getshell()
```
And then:
```bash
cp -r runs /content/drive/MyDrive/elkatepe
```
And now you can access and download your model from your project folder; in my case it's located at `/content/drive/MyDrive/elkatepe/runs/detect/train3/best_web_model`, where `train3` is the best run from my training.
Now you can use your model in the format of your choice; refer to the docs.