Authors:
Wouter Looijenga & Michall Hu
In this blog post, we showcase our implementation of football highlight detection using YOLOv7. This project is part of the CS4245 Seminar Computer Vision by Deep Learning course for the academic year 2022/2023 at TU Delft. Our implementation is done in Python and uses a dataset from Kaggle.
Deep neural networks have made significant strides in computer vision, particularly in tasks such as object detection and segmentation where humans can easily label objects. However, not all industries have embraced them. Take, for instance, highlight detection in football matches: most highlights are still annotated manually by professionals. Ideally, this task should be automated. In our work, we trained a custom YOLOv7 object detection model to recognize footballs and players, and used the positions of the goalkeeper and football within an image to identify potential goals.
Highlight detection in football matches is mostly done manually by professionals, an approach that can result in overlooked highlights. An automated highlight detector could improve both the efficiency and the accuracy of highlight detection while reducing the workload of professional football analysts. In this blog, we focus on identifying potential goals based on the pixel positions of the goalkeeper and football, as detected by a custom-trained YOLOv7 model.
For our implementation, we used a Kaggle dataset of top-view football match videos. The annotations were provided in a .csv format, which we converted to the YOLOv7 annotation format. We trained two custom models: a five-class model that detects the players of each team, both goalkeepers, and the football, and a single-class model that detects only the football. Figure 1 shows the YOLOv7 architecture.
Figure 1: YOLOv7 architecture
Before training our models, we performed several preprocessing steps. First, we used the OpenCV library in Python to split the videos into individual frames. Next, we converted the .csv annotations into the YOLOv7 annotation format required for custom object detection training. We randomly selected frames for our training and validation sets: 400 images for training and 100 for validation, chosen to cover a diverse range of scenarios and game situations.
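As a sketch, converting one pixel-space bounding box from the .csv annotations into a YOLOv7 label line looks like the following. The column layout assumed here (a class id plus corner coordinates) is illustrative; the actual Kaggle schema may differ.

```python
# Sketch of the .csv -> YOLOv7 label conversion, assuming each CSV row
# holds a class id and a pixel-space bounding box (x_min, y_min, x_max, y_max).
# YOLOv7 expects one line per object: "<class> <x_center> <y_center> <w> <h>",
# with all four box values normalized by the image dimensions.

def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel bounding box to a normalized YOLO label line."""
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

One such line is written per annotated object, into a .txt file with the same base name as its frame.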
Our models were trained on an Nvidia RTX 3060 laptop GPU. We created a custom_data.yaml file where we defined the number of classes and the location of the training and validation data, and modified the yolov7.yaml file to set the number of classes to train.
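For reference, our custom_data.yaml looked roughly like the sketch below. The paths are illustrative assumptions that depend on where the extracted frames and labels are stored; the class names match the five classes we trained on.

```yaml
# Sketch of custom_data.yaml for the five-class model; paths are illustrative.
train: ./data/train/images   # folder with the 400 training frames
val: ./data/val/images       # folder with the 100 validation frames

nc: 5                        # number of classes
names: ['Team1', 'Team2', 'Goalkeeper1', 'Goalkeeper2', 'Ball']
```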
We used the following training command for both the football player and football detection models:
```shell
python train.py --workers 1 --device 0 --batch-size 1 --epochs 100 --img 1280 1280 --data data/custom_data.yaml --hyp data/hyp.scratch.custom.yaml --cfg cfg/training/yolov7_custom.yaml --name yolov7-custom --weights yolov7.pt
```
Each argument in the command represents a specific training setting: --workers sets the number of data-loading workers, --device selects the GPU, --batch-size and --epochs control the training schedule, --img sets the training image resolution, --data points to our dataset definition, --hyp to the hyperparameter file, --cfg to the model configuration, --name labels the training run, and --weights specifies the pretrained weights to fine-tune from. Both models were thus fine-tuned for 100 epochs from the pretrained yolov7.pt weights at an input resolution of 1280 by 1280 pixels.
We detected potential goals by saving the bounding box coordinates of the goalkeeper and football to a .txt file. The criterion for a potential goal differed between the two goalkeepers depending on whether their goal was on the left or the right side of the pitch. We determined this by comparing the x-coordinates of the two goalkeepers' bounding boxes: the goalkeeper with the lower x-coordinate defends the left goal, and the one with the higher x-coordinate defends the right goal. A potential goal was flagged when the football was detected behind the goalkeeper, within 100 pixels of the goalkeeper's bounding box coordinates.
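The logic above can be sketched as follows. Bounding boxes are assumed to be (x_min, y_min, x_max, y_max) tuples in pixels, and the helper names are illustrative rather than our exact code:

```python
# Sketch of the potential-goal check; box format and helper names are
# illustrative assumptions, not the exact implementation.

def assign_goal_sides(keeper_a, keeper_b):
    """The goalkeeper with the lower x-coordinate defends the left goal."""
    if keeper_a[0] < keeper_b[0]:
        return keeper_a, keeper_b  # (left_keeper, right_keeper)
    return keeper_b, keeper_a

def is_potential_goal(ball, keeper, side, margin=100):
    """Potential goal: the ball center sits behind the goalkeeper,
    within a `margin`-pixel band around the goalkeeper's box."""
    bx = (ball[0] + ball[2]) / 2.0
    by = (ball[1] + ball[3]) / 2.0
    # Vertically, the ball must be near the goalkeeper's bounding box.
    if not (keeper[1] - margin <= by <= keeper[3] + margin):
        return False
    if side == "left":
        # Behind the left keeper means to the left of their box.
        return keeper[0] - margin <= bx <= keeper[0]
    # Behind the right keeper means to the right of their box.
    return keeper[2] <= bx <= keeper[2] + margin
```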
To evaluate our two custom-trained YOLOv7 object detectors, we used several methods. We compared the Box, Objectness, Classification, Precision, Recall, mAP@0.5, and mAP@0.5:0.95 curves to assess the overall performance of our models, and additionally analyzed the confusion matrices to gain insight into per-class accuracy.
Figure 2 shows the training results for the Team1, Team2, Goalkeeper1, Goalkeeper2, and Ball classes. The average loss for bounding box regression decreases after each epoch, as do the objectness and classification losses. The same pattern can be observed for the val Box and val Objectness losses. The precision and recall curves increase over time, with some fluctuations where the values temporarily dip before rising again. Both the mAP@0.5 and mAP@0.5:0.95 graphs show that the model's detection performance steadily improves over training.
Figure 2: Metric graphs for the 5 classes
The confusion matrix in Figure 3 shows high accuracy for the Team1, Team2, Goalkeeper1, and Goalkeeper2 classes: our custom-trained model assigned these objects to their correct classes. However, the accuracy for the Ball class is zero, indicating that the classifier was unable to detect the ball; ball instances were instead confused with the background, producing a high background false-positive rate.
Figure 3: Confusion matrix 5 classes
Overall, our custom-trained model performed well in detecting and identifying players and goalkeepers on the field. However, further improvements are needed to accurately detect and track the movement of the ball.
Figure 4 shows the training results for the Ball class. The object detection performance for the football shows overall improvement in the Box graph as the loss gradually decreases. However, the Objectness and val Objectness graphs show a steep downward curve. This may be due to the small size of the ball and the presence of multiple white lines that could be mistaken for a ball. The Classification graph and val Box show no result at all, indicating that our custom-trained model was unable to correctly detect the ball. The Precision and Recall graphs also show unusual behavior with various spikes, as do the mAP@0.5 and mAP@0.5:0.95 graphs. These unusual results may be due to the small size of the ball.
Figure 4: Metric graphs for the 1 class
The confusion matrix in Figure 5 shows similar behavior. Our custom-trained classifier was unable to correctly detect the ball, with a score of 1.0 for false positives on the background.
Figure 5: Confusion matrix 1 class
Overall, the single-class model fared no better on this task: it was unable to reliably detect the football, confirming that further improvements are needed to accurately detect and track the ball.
Our application consists of several steps. First, we run our two custom object detection models, one for the football players and one for the ball, and save the resulting bounding box coordinates for further analysis. Because ball detection accuracy was low, we added a post-processing step before checking whether the ball is behind the goalkeeper: we filtered out white spots that remained in the same location across frames, which removes painted marks such as the penalty spots, since the real ball moves.
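A minimal sketch of that stationary-spot filter, assuming each ball candidate has been tracked to a list of per-frame center positions (the threshold value is an illustrative choice):

```python
# Sketch of the stationary-spot filter: ball candidates that stay in
# (almost) the same place across frames are treated as painted marks
# such as penalty dots rather than the ball. Threshold is illustrative.

def filter_stationary(candidate_tracks, min_travel=15.0):
    """candidate_tracks maps a candidate id to its list of (x, y)
    centers, one per frame. Keep only candidates that actually move."""
    moving = {}
    for cid, centers in candidate_tracks.items():
        xs = [c[0] for c in centers]
        ys = [c[1] for c in centers]
        # Total extent of the track; a painted dot barely moves at all.
        travel = (max(xs) - min(xs)) + (max(ys) - min(ys))
        if travel >= min_travel:
            moving[cid] = centers
    return moving
```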
After applying these filters, our application checks whether a remaining ball candidate is behind the goalkeeper. If a potential goal is detected, the result is printed and the football is marked blue, as shown in Figure 6.
Figure 6: Highlight detection of possible goal in blue
We have successfully developed a highlight detection system for identifying potential goals using custom-trained YOLOv7 weights. The Kaggle dataset we used only contained the locations of players and the ball, so we wrote our own Python code to detect potential goal-scoring events. In future extensions, we plan to add additional Python code to detect other highlights such as attacking formations or fouls. Our system has the potential to greatly improve the efficiency and accuracy of highlight detection in football matches and provide valuable insights into game situations.
Our work has several limitations, primarily related to detecting the football. Due to its small size, our code had to filter candidate detections to determine which one was the true football. The input image resolution also needed to be at least 1024 by 1024 pixels to identify objects reliably; lower-quality images lose too many features. Because of the low identification rate for the football, our system may also incorrectly flag a potential goal when a white line and a goalkeeper appear together in the frame. Finally, not all players were always detectable: players walking on white lines were sometimes missed because their small silhouettes blend into the lines. Despite these limitations, our system has the potential to greatly improve the efficiency and accuracy of highlight detection in football matches.
In this blog post, we presented our implementation of football highlight detection using YOLOv7. Our goal was to automate highlight detection in football matches to improve efficiency and accuracy. We trained custom object detection models to identify players and the ball, and developed an algorithm to detect potential goals based on the positions of the goalkeeper and ball. Our models performed well in detecting players and goalkeepers but showed limitations in detecting the ball. Despite these limitations, our system has the potential to enhance highlight detection in football matches and provide valuable insights into game situations.
Wouter: Paper research and dataset search. Creating the dataset by using video frames. Writing on the blog and storyline.
Michall: Setting up training and testing environment and running all training and testing. Tuning parameters to achieve best performance.