# 112-2 ComputerVision Final Report

## Title and Authors

#### Driving Monitor System Based on YOLOv7

陳禹勳 312581014

廖永誠 312581029

## 6 sections:

### Sec 1. Introduction: introduce the problem you want to solve; explain why it is important to solve it; and indicate the method you used to solve it.

Road accidents occur frequently in today's society. Besides driver fatigue, distraction, and discomfort, factors such as drunk driving, failure to wear seat belts, phone usage, and speeding also jeopardize driving safety. Our Driver Monitoring System (DMS) currently addresses fatigue, phone usage, and seat belt detection, and will be expanded to detect the remaining hazardous behaviors so that it covers driving safety comprehensively.

To address these issues, the European Union has established regulations related to DMS, and various countries have launched products that comply with these regulations. The following figure introduces the regulations:

![image](https://hackmd.io/_uploads/Hyi4M-QVR.png)

The figure originates from the Euro NCAP (New Car Assessment Programme) regulations concerning Driver Monitoring Systems in vehicles. The purpose of these regulations is to enhance road safety by mitigating the risks associated with driver distraction, and they are motivated by the increasing number of accidents caused by distracted driving. By categorizing and detailing the various types of distraction, the regulations give automotive manufacturers a framework for developing and implementing effective DMS technologies. These systems monitor drivers and alert them to keep their attention on the road, reducing the likelihood of accidents and improving overall traffic safety. The regulations emphasize accurate detection of both non-driving-related and driving-related distractions, as well as phone usage, so that potential distraction scenarios are covered comprehensively.

Most existing DMS solutions rely on rule-based methods or a single sensor, and these approaches face challenges in accuracy, robustness, and scalability. Deep learning-based methods, in contrast, can learn more complex features and patterns from large datasets, which gives them higher accuracy and robustness. YOLOv7, an advanced object detection and classification model, combines fast processing with strong accuracy, making it a good fit for our solution. By combining YOLOv7 with appropriate data preprocessing techniques, we can achieve real-time, accurate driver monitoring and reduce the risk of traffic accidents.

In summary, our DMS leverages advanced deep learning techniques to address current challenges in driving safety, providing drivers with a safer driving environment while reducing the casualties and property losses caused by traffic accidents.

### Sec 2. Review of previous work; explain why your method is better than previous work; summarize the key main contributions of your work.

Meeting the requirements of DMS regulations involves several tasks, including driver gaze estimation and object detection. Several solutions currently exist for driver gaze estimation, such as MCGaze and L2CS-Net.

L2CS-Net (Look to Center of Symmetry Network) is a deep learning model designed for gaze estimation. It predicts where a person is looking by estimating the gaze direction from images of the eyes or face. The network uses convolutional neural networks (CNNs) to extract features from these images and applies regression techniques to estimate the gaze direction in 3D space. L2CS-Net is known for its accuracy and efficiency in real-time gaze tracking, which makes it useful in fields such as human-computer interaction, virtual reality, and driver monitoring systems.
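
As a rough illustration of what "gaze direction in 3D space" can mean in practice, the sketch below converts a pair of gaze angles (pitch and yaw) into a 3D unit vector and checks whether the gaze stays close to straight ahead. The axis convention, the function names, and the 20 degree tolerance are assumptions made for this example; they are not taken from L2CS-Net or from our system.

```python
import math
from typing import Tuple


def gaze_angles_to_vector(pitch_rad: float, yaw_rad: float) -> Tuple[float, float, float]:
    """Convert gaze pitch/yaw in radians to a 3D unit vector.

    Assumed convention (hypothetical, for illustration only):
    pitch = yaw = 0 maps to (0, 0, -1), i.e. looking straight ahead.
    """
    x = -math.cos(pitch_rad) * math.sin(yaw_rad)
    y = -math.sin(pitch_rad)
    z = -math.cos(pitch_rad) * math.cos(yaw_rad)
    return (x, y, z)


def is_looking_forward(pitch_rad: float, yaw_rad: float,
                       max_angle_deg: float = 20.0) -> bool:
    """Return True if the gaze is within max_angle_deg of straight ahead.

    The 20 degree default is an illustrative value, not a regulatory threshold.
    """
    _, _, z = gaze_angles_to_vector(pitch_rad, yaw_rad)
    # The cosine of the angle between the gaze and (0, 0, -1) is -z.
    cos_angle = max(-1.0, min(1.0, -z))
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg
```

A gaze estimator that outputs angles per frame could feed `is_looking_forward` to produce the per-frame "distracted or not" signal that a monitoring system needs.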

The detailed architecture of L2CS-Net is shown in the following figure:

![Screenshot 2024-05-28 3.26.34 PM](https://hackmd.io/_uploads/SJYEvb74A.png)

For object detection, previous research relied primarily on rule-based methods or single-sensor technologies, both of which have limitations. Rule-based methods may not adequately handle variations in driving behavior, while a single sensor may not provide enough information to assess the driver's state accurately.

In contrast, our approach uses YOLOv7, an advanced deep learning technique. YOLOv7 is an object detection algorithm and one of the latest versions in the YOLO (You Only Look Once) series. It was developed by Alexey Bochkovskiy and Chien-Yao Wang, among others. Compared to previous versions, YOLOv7 improves both performance and accuracy. It can learn more complex features and patterns from large datasets, offering higher accuracy and robustness.

Key technologies of YOLOv7 include:

- **Backbone Network:** YOLOv7 uses a backbone network called E-ELAN as its feature extractor. This network structure is an improvement over Darknet, enhancing efficiency and performance by introducing the Cross-Stage Partial Network.
- **Model Optimization:** YOLOv7 optimizes the model structure through techniques such as model pruning, quantization, and distillation, improving detection performance and speed.
- **Data Augmentation:** Various data augmentation techniques, such as random scaling, random cropping, and random distortion, enhance the model's robustness and generalization ability.
- **Multi-Scale Training:** YOLOv7 uses a multi-scale training strategy to detect objects of different sizes, improving the model's adaptability.
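
To make the multi-scale training idea more concrete, here is a minimal sketch of how a training loop might pick a different input resolution for each batch. The base size of 640, the stride of 32, and the 0.5x to 1.5x range are assumptions chosen for illustration; they are not settings reported for our training runs.

```python
import random


def pick_train_size(base_size: int = 640, stride: int = 32,
                    low: float = 0.5, high: float = 1.5,
                    rng: random.Random = None) -> int:
    """Pick a random square input size that is a multiple of `stride`.

    All default values are hypothetical and serve only as an example.
    """
    rng = rng or random.Random()
    smallest = int(base_size * low)
    largest = int(base_size * high)
    size = rng.randint(smallest, largest)
    # Round down to a multiple of the stride, but never below one stride.
    return max(stride, (size // stride) * stride)


if __name__ == "__main__":
    rng = random.Random(0)
    # Each batch would be resized to the chosen resolution before the forward pass.
    print([pick_train_size(rng=rng) for _ in range(5)])
```

Training on varying resolutions exposes the detector to the same objects at different apparent sizes, which is the intuition behind the adaptability claim in the list above.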

Compared to previous methods, our approach has several advantages:

- **Enhanced Accuracy and Robustness:** YOLOv7, powered by deep learning, detects the driver's state (including fatigue, distraction, and discomfort) more accurately and adapts to various driving environments and behaviors.
- **Real-time Monitoring:** Our method monitors the driver's state in real time and issues a warning as soon as hazardous behavior is detected, helping to prevent accidents promptly.
- **Comprehensive Coverage:** Our method detects not only driver fatigue and distraction but also other risky behaviors such as drunk driving and mobile phone use, providing more comprehensive safety assurance for drivers.

In summary, our method leverages advanced deep learning techniques to overcome the limitations of previous approaches, enhancing the accuracy, robustness, and real-time capabilities of driver monitoring systems and thereby providing a safer driving environment for motorists.

### Sec 3. Summary of the technical solution; details of the technical solution; you may want to decompose this section into several subsections; add figures to help your explanation.

***Technical Solution Summary:***

- Background: The post-processing algorithm operates on the object bounding boxes predicted by the model and is used to reduce false alarms in the system.
- Method: Post-processing uses a FIFO buffer. A different buffer size is set for each hazardous behavior to decide whether to trigger an alert.

***Details of the Technical Solution:***

1. Buffer Configuration:
    - Seatbelt Detection: Use a longer buffer (20 seconds) and trigger an alert after accumulating 10 seconds without a seatbelt.
    - Fatigue Driving Detection: Use a shorter buffer (4 seconds) and trigger an alert after accumulating 2 seconds of fatigue.
    - Phone Usage Detection: Use a medium-length buffer (8 seconds) and trigger an alert after accumulating 4 seconds of phone usage.
2. Post-processing Workflow:
    - Seatbelt Alert: Trigger an alert when the accumulated time without a seatbelt in the buffer exceeds a specific threshold, while preventing false alarms when the driver's seat is vacant.
    - Phone Usage Alert: Trigger an alert when the accumulated phone usage time in the buffer exceeds a specific threshold.
    - Other Scenarios: Set the buffer size to 1 under normal conditions.
3. Advantages Comparison:
    - Adjustment for Platform Frame Rate: The buffer size is adjusted to the platform's frame rate so that the alert decision time stays consistent.
    - Avoiding False Negatives: All frames within the most recent seconds are considered, reducing the probability of false negatives.
    - Consistency in Alarm Duration: Alarm duration stays consistent across different platforms, enhancing system stability.

***Illustration of the Technical Solution:***

- With this post-processing algorithm, we can further improve the accuracy and reliability of the system on top of the model's predictions and better handle various hazardous behaviors.

Q-Box DMS Post-processing Diagram

![image_3](https://hackmd.io/_uploads/BkJEi4j4A.png)

FIFO Buffer for DMS warning messages

![image](https://hackmd.io/_uploads/BJluk0vW0.png)

### Sec 4: Experiments: present here experimental results of the method you have implemented with plots, graphs, images and visualizations.

DMS Model Training Results

![image](https://hackmd.io/_uploads/r1lSgRDZ0.png)

DMS System On-road Test Function Validation Results

![image](https://hackmd.io/_uploads/H1wYkRwZ0.png)

Inference Video

![image](https://hackmd.io/_uploads/rkUpFpO-R.png)

![image](https://hackmd.io/_uploads/BkJ9t6O-A.png)

### Sec 5: Conclusions

In this project, we have successfully developed a driver monitoring system based on YOLOv7, utilizing advanced deep learning techniques to enhance driver safety.
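
As a concrete recap of the FIFO post-processing from Sec 3, the following is a minimal Python sketch rather than our production code. The window and trigger durations are the ones listed in Sec 3; the class name, the function name, the dictionary keys, and the use of "greater than or equal" for the trigger comparison are assumptions made for this example. Vacant-seat handling is not modeled.

```python
from collections import deque

# (window seconds, trigger seconds) per behavior, as listed in Sec 3.
BUFFER_CONFIG = {
    "no_seatbelt": (20.0, 10.0),
    "fatigue": (4.0, 2.0),
    "phone_usage": (8.0, 4.0),
}


class AlertBuffer:
    """Fixed-length FIFO of per-frame detections for one behavior (hypothetical helper)."""

    def __init__(self, window_s: float, trigger_s: float, fps: float):
        # Scale both lengths by the frame rate so the decision time in
        # seconds stays the same on platforms with different frame rates.
        self.size = max(1, round(window_s * fps))
        self.trigger_frames = max(1, round(trigger_s * fps))
        self.frames = deque(maxlen=self.size)  # the oldest frame drops out automatically

    def update(self, detected: bool) -> bool:
        """Push one frame result; return True if the alert should fire."""
        self.frames.append(1 if detected else 0)
        return sum(self.frames) >= self.trigger_frames


def make_buffers(fps: float) -> dict:
    """Create one buffer per behavior for a platform running at `fps`."""
    return {name: AlertBuffer(w, t, fps) for name, (w, t) in BUFFER_CONFIG.items()}
```

For example, on a platform running at 10 frames per second, `make_buffers(10)["phone_usage"]` holds 80 frames and fires once 40 of them contain a phone detection, even if those frames are not consecutive. This is how the buffer tolerates occasional missed detections.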

By continuously monitoring the driver's state in real time, we can promptly detect fatigue, distraction, and other hazardous behaviors and issue timely warnings to mitigate the risk of traffic accidents. Compared to traditional rule-based or single-sensor methods, our approach offers higher accuracy, robustness, and real-time capability, providing comprehensive driver safety assurance.

In the future, we will focus on further optimizing system performance, including improving accuracy, reducing computational cost, and expanding system functionality to cover a wider range of driving scenarios. We will also explore applying this system to different types of vehicles and driving environments to further improve traffic safety.

### Sec 6: References

1. [Euro NCAP](https://www.euroncap.com/en)
2. [L2CS-Net Paper](https://arxiv.org/pdf/2203.03339)
3. [YOLOv7 Paper](https://arxiv.org/abs/2207.02696)
4. [YOLOv7 Source Code](https://github.com/WongKinYiu/yolov7)
5. [Driver Monitoring System](https://en.wikipedia.org/wiki/Driver_monitoring_system)