Authors:
Wouter Looijenga & Michall Hu
In this blog post, we showcase our implementation of football highlight detection using YOLOv7. This project is part of the CS4245 Seminar Computer Vision by Deep Learning course for the academic year 2022/2023 at TU Delft. Our implementation is done in Python and uses a dataset from Kaggle.
Deep neural networks have made significant strides in computer vision, particularly in tasks such as object detection and segmentation where humans can easily label objects. However, not all industries have embraced them. Take, for instance, highlight detection in football matches: most highlights are still annotated manually by professionals. Ideally, this task should be automated. In our work, we trained a custom YOLOv7 object detection model to recognize footballs and players, and used the positions of the goalkeeper and football within an image to identify potential goals.
Highlight detection in football matches is mostly done manually by professionals, an approach that can result in overlooked highlights. An automated highlight detector could improve both the efficiency and the accuracy of highlight detection while reducing the workload of professional football analysts. In this blog, we focus on identifying potential goals based on the pixel positions of the goalkeeper and football, as detected by a custom-trained YOLOv7 model.
For our implementation, we used a Kaggle dataset of top-view football match videos. The annotations were provided in a .csv format, which we converted to the YOLOv7 annotation format. We trained two custom models: a five-class model that detects the players of each team, both goalkeepers, and the football, and a single-class model that detects only the football. Figure 1 shows the YOLOv7 architecture.
Figure 1: YOLOv7 architecture
Before training our models, we performed several preprocessing steps. First, we used the OpenCV library in Python to split the videos into individual frames. Next, we converted the .csv annotations into the YOLOv7 annotation format required for custom object detection training. We randomly selected frames for our training and validation sets: 400 images for training and 100 for validation, chosen to cover a diverse range of scenarios and game situations.
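As a sketch, converting one pixel-space bounding box from the .csv annotations into a YOLOv7 label line looks like the following. The column layout assumed here (a class id plus corner coordinates) is illustrative; the actual Kaggle schema may differ.

```python
# Sketch of the .csv -> YOLOv7 label conversion, assuming each CSV row
# holds a class id and a pixel-space bounding box (x_min, y_min, x_max, y_max).
# YOLOv7 expects one line per object: "<class> <x_center> <y_center> <w> <h>",
# with all four box values normalized by the image dimensions.

def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel bounding box to a normalized YOLO label line."""
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

One such line is written per annotated object, into a .txt file with the same base name as its frame.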
Our models were trained on an Nvidia RTX 3060 laptop GPU. We created a custom_data.yaml file where we defined the number of classes and the location of the training and validation data, and modified the yolov7.yaml file to set the number of classes to train.
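For reference, our custom_data.yaml looked roughly like the sketch below. The paths are illustrative assumptions that depend on where the extracted frames and labels are stored; the class names match the five classes we trained on.

```yaml
# Sketch of custom_data.yaml for the five-class model; paths are illustrative.
train: ./data/train/images   # folder with the 400 training frames
val: ./data/val/images       # folder with the 100 validation frames

nc: 5                        # number of classes
names: ['Team1', 'Team2', 'Goalkeeper1', 'Goalkeeper2', 'Ball']
```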
We used the following training command for both the football player and football detection models:
```shell
python train.py --workers 1 --device 0 --batch-size 1 --epochs 100 --img 1280 1280 --data data/custom_data.yaml --hyp data/hyp.scratch.custom.yaml --cfg cfg/training/yolov7_custom.yaml --name yolov7-custom --weights yolov7.pt
```
Each argument in the command represents a specific training setting: --workers sets the number of data-loading workers, --device selects the GPU, --batch-size and --epochs control the training schedule, --img sets the training image resolution, --data points to our dataset definition, --hyp to the hyperparameter file, --cfg to the model configuration, --name labels the training run, and --weights specifies the pretrained weights to fine-tune from. Both models were thus fine-tuned for 100 epochs from the pretrained yolov7.pt weights at an input resolution of 1280 by 1280 pixels.
We detected potential goals by saving the bounding box coordinates of the goalkeeper and football to a .txt file. The criterion for a potential goal differed between the two goalkeepers depending on whether their goal was on the left or the right side of the pitch. We determined this by comparing the x-coordinates of the two goalkeepers' bounding boxes: the goalkeeper with the lower x-coordinate defends the left goal, and the one with the higher x-coordinate defends the right goal. A potential goal was flagged when the football was detected behind the goalkeeper, within 100 pixels of the goalkeeper's bounding box coordinates.
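The logic above can be sketched as follows. Bounding boxes are assumed to be (x_min, y_min, x_max, y_max) tuples in pixels, and the helper names are illustrative rather than our exact code:

```python
# Sketch of the potential-goal check; box format and helper names are
# illustrative assumptions, not the exact implementation.

def assign_goal_sides(keeper_a, keeper_b):
    """The goalkeeper with the lower x-coordinate defends the left goal."""
    if keeper_a[0] < keeper_b[0]:
        return keeper_a, keeper_b  # (left_keeper, right_keeper)
    return keeper_b, keeper_a

def is_potential_goal(ball, keeper, side, margin=100):
    """Potential goal: the ball center sits behind the goalkeeper,
    within a `margin`-pixel band around the goalkeeper's box."""
    bx = (ball[0] + ball[2]) / 2.0
    by = (ball[1] + ball[3]) / 2.0
    # Vertically, the ball must be near the goalkeeper's bounding box.
    if not (keeper[1] - margin <= by <= keeper[3] + margin):
        return False
    if side == "left":
        # Behind the left keeper means to the left of their box.
        return keeper[0] - margin <= bx <= keeper[0]
    # Behind the right keeper means to the right of their box.
    return keeper[2] <= bx <= keeper[2] + margin
```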
To evaluate our two custom-trained YOLOv7 object detectors, we used several methods. We compared the Box, Objectness, Classification, Precision, Recall, mAP@0.5, and mAP@0.5:0.95 curves to assess the overall performance of our models, and additionally analyzed the confusion matrices to gain insight into per-class accuracy.
Figure 2 shows the training results for the Team1, Team2, Goalkeeper1, Goalkeeper2, and Ball classes. The average loss for bounding box regression decreases after each epoch, as do the objectness and classification losses. The same pattern can be observed for the val Box and val Objectness losses. The precision and recall curves increase over time, with some fluctuations where the values temporarily dip before rising again. Both the mAP@0.5 and mAP@0.5:0.95 graphs show that the model's detection performance steadily improves over training.
Figure 2: Metric graphs for the 5 classes
The confusion matrix in Figure 3 shows high accuracy for the Team1, Team2, Goalkeeper1, and Goalkeeper2 classes: our custom-trained model assigned these objects to their correct classes. However, the accuracy for the Ball class is zero, indicating that the classifier was unable to detect the ball; ball instances were instead confused with the background, producing a high background false-positive rate.
Figure 3: Confusion matrix 5 classes
Overall, our custom-trained model performed well in detecting and identifying players and goalkeepers on the field. However, further improvements are needed to accurately detect and track the movement of the ball.
Figure 4 shows the training results for the Ball class. The object detection performance for the football shows overall improvement in the Box graph as the loss gradually decreases. However, the Objectness and val Objectness graphs show a steep downward curve. This may be due to the small size of the ball and the presence of multiple white lines that could be mistaken for a ball. The Classification graph and val Box show no result at all, indicating that our custom-trained model was unable to correctly detect the ball. The Precision and Recall graphs also show unusual behavior with various spikes, as do the mAP@0.5 and mAP@0.5:0.95 graphs. These unusual results may be due to the small size of the ball.
Figure 4: Metric graphs for the 1 class
The confusion matrix in Figure 5 shows similar behavior. Our custom-trained classifier was unable to correctly detect the ball, with a score of 1.0 for false positives on the background.
Figure 5: Confusion matrix 1 class
Overall, the single-class model fared no better on this task: it was unable to reliably detect the football, confirming that further improvements are needed to accurately detect and track the ball.
Our application consists of several steps. First, we run our two custom object detection models, one for the football players and one for the ball, and save the resulting bounding box coordinates for further analysis. Because ball detection accuracy was low, we added a post-processing step before checking whether the ball is behind the goalkeeper: we filtered out white spots that remained in the same location across frames, which removes painted marks such as the penalty spots, since the real ball moves.
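A minimal sketch of that stationary-spot filter, assuming each ball candidate has been tracked to a list of per-frame center positions (the threshold value is an illustrative choice):

```python
# Sketch of the stationary-spot filter: ball candidates that stay in
# (almost) the same place across frames are treated as painted marks
# such as penalty dots rather than the ball. Threshold is illustrative.

def filter_stationary(candidate_tracks, min_travel=15.0):
    """candidate_tracks maps a candidate id to its list of (x, y)
    centers, one per frame. Keep only candidates that actually move."""
    moving = {}
    for cid, centers in candidate_tracks.items():
        xs = [c[0] for c in centers]
        ys = [c[1] for c in centers]
        # Total extent of the track; a painted dot barely moves at all.
        travel = (max(xs) - min(xs)) + (max(ys) - min(ys))
        if travel >= min_travel:
            moving[cid] = centers
    return moving
```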
After applying these filters, our application checks whether a remaining ball candidate is behind the goalkeeper. If a potential goal is detected, the result is printed and the football is marked blue, as shown in Figure 6.
Figure 6: Highlight detection of possible goal in blue
We have successfully developed a highlight detection system for identifying potential goals using custom-trained YOLOv7 weights. The Kaggle dataset we used only contained the locations of players and the ball, so we wrote our own Python code to detect potential goal-scoring events. In future extensions, we plan to add additional Python code to detect other highlights such as attacking formations or fouls. Our system has the potential to greatly improve the efficiency and accuracy of highlight detection in football matches and provide valuable insights into game situations.
Our work has several limitations, primarily related to detecting the football. Due to its small size, our code had to filter candidate detections to determine which one was the true football. The input image resolution also needed to be at least 1024 by 1024 pixels to identify objects reliably; lower-quality images lose too many features. Because of the low identification rate for the football, our system may also incorrectly flag a potential goal when a white line and a goalkeeper appear together in the frame. Finally, not all players were always detectable: players walking on white lines were sometimes missed because their small silhouettes blend into the lines. Despite these limitations, our system has the potential to greatly improve the efficiency and accuracy of highlight detection in football matches.
In this blog post, we presented our implementation of football highlight detection using YOLOv7. Our goal was to automate highlight detection in football matches to improve efficiency and accuracy. We trained custom object detection models to identify players and the ball, and developed an algorithm to detect potential goals based on the positions of the goalkeeper and ball. Our models performed well in detecting players and goalkeepers but showed limitations in detecting the ball. Despite these limitations, our system has the potential to enhance highlight detection in football matches and provide valuable insights into game situations.
Wouter: Paper research and dataset search. Creating the dataset by using video frames. Writing on the blog and storyline.
Michall: Setting up training and testing environment and running all training and testing. Tuning parameters to achieve best performance.