# Loïc Caille / Gabriel Bailly - Darknet

## Summary

The goal of this module was to get a better understanding of what is necessary to **train a neural network using Darknet YOLOv3**. Along the way, several points stood out, either because of their difficulty or because of their time cost. This document describes the project as it unfolded, aiming to give a useful insight into the topic while sharing some of our **results**. It follows this plan:

- *Opening battle* - **Darknet YOLOv3**
- *Opening battle* - **Trying to install CUDA on WSL**
- *A savior* - **Google Colab**
- *Experimenting* - **Using a webcam on Colab**
- *Experimenting* - **Webcam with a pre-trained model**
- *Core of the project* - **Labeling images**
- *Core of the project* - **Training a model**
- *Reflecting upon our work* - **Limits of the model**
- *Epilogue* - **Conclusion & Feedback**
- *Epilogue* - **More...**

## Opening battle - Darknet YOLOv3

**Darknet YOLO** *(You Only Look Once)* is a system for real-time object detection using neural networks. It provides its users with a set of C and Python interfaces for training and running neural networks. Installing and setting up Darknet YOLOv3 is not difficult in itself; the real difficulty was interfacing this software with the OS and CUDA.

## Opening battle - Trying to install CUDA on WSL

Trying to install **CUDA** on **WSL** was a real chore, and is definitely not worth describing in detail here. After about 10 hours lost on the topic, going through the Windows Insider program, installing multiple drivers, and fumbling with WSL and VM host hardware-sharing solutions, the whole episode can be summed up in one simple statement: if you wish to use **CUDA**, either use **Google Colab** or set up a **Linux Docker** container / **dual boot**.

## A savior - Google Colab

Google Colab lets us work on our AI project in the cloud.
It provides a **CUDA**-capable environment and a Python interface, running on a virtual machine. Given **the tremendous time cost of setting up our own local environment**, there is real value in using such a platform.

## Experimenting - Using a webcam on Colab

To use **a webcam within Google Colab**, we needed to call a script written in **JavaScript**: Python cannot access the webcam directly, because from Google Colab the webcam can only be reached through the web browser. To do so, we used one of the numerous snippets offered by Google Colab. This code snippet let us access our webcam directly from the web browser, meaning no further interfacing was required.

## Experimenting - Webcam with a pre-trained model

Looking around the world of open-source projects, we stumbled upon [**vindruid**](https://github.com/vindruid), a GitHub user who put together a live image-recognition AI using the YOLOv3 sources and Google Colab's code snippets.

![](https://i.imgur.com/1sA7e5l.png)

*nice*

## Core of the project - Labeling images

To adapt Darknet YOLOv3 to our own needs, we had to train a model, and for that we needed to create a **dataset**. A dataset is composed of labelled images that the AI uses to train and test itself. We decided to use 3 classes:

- Bottle
- Shoe
- Cat

We then took pictures of these objects with various sizes, colors, focuses, orientations, and backgrounds. In total, we took around **250** pictures with an average of 5 objects per image. Before labelling these images, they were all resized to a uniform, lower resolution of **1024x576**. The images were then labelled using the **Yolo-Annotation-Tool-New** tool, which runs on Python *(making it Windows-compatible)*.

![](https://i.imgur.com/bSYieJT.jpg)

To artificially increase the size of our dataset, we also applied Gaussian noise to all of our already-labelled images.
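For context, annotation tools in the YOLO family write one plain-text label file per image, with one line per object. A hypothetical label file for a picture containing a bottle and a cat might look like this (assuming the class order Bottle = 0, Shoe = 1, Cat = 2; the exact values here are illustrative, not taken from our dataset):

```
0 0.512 0.430 0.210 0.350
2 0.180 0.760 0.300 0.280
```

Each line has the form `<class_id> <x_center> <y_center> <width> <height>`, with all coordinates normalised to [0, 1] relative to the image dimensions.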
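The augmentation step can be sketched as follows. This is a minimal illustration, not the exact script we ran, and the noise level `sigma` is an assumed value. Since the noise does not move any object, the existing label files can be reused unchanged for the noisy copies.

```python
# Sketch of the augmentation step: add zero-mean Gaussian noise to a
# labelled image. The object positions are unchanged, so the original
# YOLO label file still applies to the noisy copy.
import numpy as np
from PIL import Image


def add_gaussian_noise(image: Image.Image, sigma: float = 15.0) -> Image.Image:
    """Return a copy of `image` with zero-mean Gaussian noise of std `sigma`."""
    arr = np.asarray(image).astype(np.float32)
    noise = np.random.normal(0.0, sigma, size=arr.shape)
    noisy = np.clip(arr + noise, 0, 255).astype(np.uint8)
    return Image.fromarray(noisy)
```

Running this over every 1024x576 picture effectively doubles the dataset for free.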
![](https://i.imgur.com/OUf8gt2.jpg)

We could also have applied other filters, such as dimming the luminosity or changing the contrast.

## Core of the project - Training a model

The model was trained for about 3000 iterations, with the weights saved every 250.

Example of an inconclusive result with only **500** iterations and the wrong labels:

![](https://i.imgur.com/cp5bwkQ.jpg)

After about **2500** iterations, the results started to make more sense. Example after **3000** iterations:

![](https://i.imgur.com/Ar43QdN.jpg)

## Reflecting upon our work - Limits of the model

We can also see that, as the number of iterations grows, more and more objects get detected. For example, with only 1000 to 2000 iterations our network does not seem to detect anything in the following pictures, but after 3000 it finds a few objects!

Something we probably could have done is increase the number of pictures we took, and use images featuring bottles closer to each other; the same goes for shoes. Going through this module really helped us understand the importance of the dataset: **we have to feed the network images related to what we want it to recognize.**

![](https://i.imgur.com/7S4Qwyy.jpg)
![](https://i.imgur.com/VtjzQ2g.jpg)
![](https://i.imgur.com/rrkx2vd.jpg)
![](https://i.imgur.com/owi0qNz.jpg)

## Epilogue - Conclusion & Feedback

This class was an interesting first dive into the world of artificial intelligence. It helped us understand three major topics:

- **How neural networks work in the real world** (the basis of the AI, the importance of the dataset)
- **How to use them in the real world** (first steps with Darknet YOLO, making a dataset, training an AI)
- The impressive list of traps one can fall into along the way

Improvements:

- **More time**? Setting up the webcam with our own trained model could have been really satisfying, but with little assistance and not enough time we did not manage to do it.
- If possible, make it clear that students should expect to lose time setting up the environment: our main time sink was the whole day lost trying to install CUDA.
- **TensorFlow** / **PyTorch** might be a better option than YOLOv3/v4.

## Epilogue - More...

> To see the implementation and try it yourself: https://colab.research.google.com/drive/1PPv-I3CkMAgAkRhKU7jDqSXZhQieb25e#scrollTo=bv478fyt6Cfm
>
> To access the Google Drive where you can find the custom dataset (labelled images): https://drive.google.com/drive/folders/10xH5Ffg9HnHdsd66NT3DVUCLFW4FCA3S?usp=sharing
>
> *The dataset can be found in the "customdataset" directory.*
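For readers who want to reproduce the training step, the Darknet configuration side can be sketched as follows. The file names and paths below are assumptions chosen for illustration; the exact ones are in the notebook linked above.

```python
# Sketch: generate the two small metadata files Darknet expects for a
# custom 3-class dataset. Paths (data/train.txt, backup/, etc.) are
# hypothetical examples, not necessarily the ones from our notebook.
from pathlib import Path

classes = ["Bottle", "Shoe", "Cat"]

# obj.names: one class name per line, in class-id order.
Path("obj.names").write_text("\n".join(classes) + "\n")

# obj.data: tells Darknet where the image lists, names file and
# checkpoint directory live.
obj_data = (
    f"classes = {len(classes)}\n"
    "train = data/train.txt\n"
    "valid = data/test.txt\n"
    "names = obj.names\n"
    "backup = backup/\n"
)
Path("obj.data").write_text(obj_data)

# Training is then launched with the darknet binary, along the lines of:
#   ./darknet detector train obj.data cfg/yolov3-custom.cfg darknet53.conv.74
```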