###### tags: `archive` `thesis` `draft` `jptw`
# (Archive) Chapter 1 Introduction
> Ref:
> [1] [A Human-in-the-Loop System for Sound Event Detection and Annotation](https://dl.acm.org/doi/10.1145/3214366)
## 1.1 Background and Motivation
:::info
Keywords | `data mining` `data annotation` `audio retrieval system`
:::
With the rapid development of Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) methods are applied more widely than ever before. Developing ML- or DL-based models requires a massive amount of labeled data, so data annotation plays an essential role in the process. There are two ways of labeling data -- manual labeling by humans, or automated labeling powered by machine learning. Although the manual approach is tedious and time-consuming, it generally yields higher-quality labels than an automatic labeler. Thus, an assistive labeling tool that accelerates the annotation process would be a good solution.
> Ref:
> [1] [為什麼在AI時代下,你需要更聰明的方法協助你做Data Labeling](https://medium.com/avalanche-computing/%E7%82%BA%E4%BB%80%E9%BA%BC%E5%9C%A8ai%E6%99%82%E4%BB%A3%E4%B8%8B-%E4%BD%A0%E9%9C%80%E8%A6%81%E6%9B%B4%E8%81%B0%E6%98%8E%E7%9A%84%E6%96%B9%E6%B3%95%E5%8D%94%E5%8A%A9%E4%BD%A0%E5%81%9Adata-labeling-51f0691f1c86)
> [2] [Why Data Annotation is Important for Machine Learning and AI?](https://medium.com/vsinghbisen/why-data-annotation-is-important-for-machine-learning-and-ai-5e647637c621)
> [3] [Audio Data Mining Using Multi-perceptron Artificial Neural Network](https://www.researchgate.net/publication/239929742_Audio_Data_Mining_Using_Multi-perceptron_Artificial_Neural_Network)
> [4] [Manual Data Labeling for Vision-Based Machine Learning and AI](https://acubed.airbus.com/blog/wayfinder/manual-data-labeling-for-vision-based-machine-learning-and-ai/)
> [5] [Automated Data Labeling With Machine Learning](https://azati.ai/automated-data-labeling-with-machine-learning/)
> [6] [Automatic vs. Manual Data Labeling](https://www.diva-portal.org/smash/get/diva2:1460858/FULLTEXT01.pdf)
## 1.2 Problem Statement
In the domain of audio analysis, sound event detection (SED), which identifies the sound events occurring in an audio clip [su2017weakly], can be applied to many tasks, including automated audio annotation. However, building an annotation system on top of SED still requires a large amount of labeled data. This may work when the target sound classes are covered by a general-purpose sound event database. For specific target sound classes, by contrast, it is not feasible when users have too few pre-labeled examples and cannot manually label enough data in such a long-duration recording database.
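As a purely illustrative aside on what SED output looks like, the sketch below post-processes frame-level class probabilities into (onset, offset, label) event segments by thresholding each class and merging consecutive active frames. The class names, the frame hop, and the random probabilities standing in for a real classifier's output are assumptions made only for this sketch; it is not the system proposed in this thesis.

```python
# Minimal sketch: turn frame-level SED probabilities into event segments.
# CLASSES, FRAME_HOP, and the random "fake_probs" are illustrative assumptions.
import numpy as np

CLASSES = ["dog_bark", "siren", "speech"]   # hypothetical sound classes
FRAME_HOP = 0.02                            # assumed seconds per frame

def probs_to_events(probs, threshold=0.5, frame_hop=FRAME_HOP, classes=CLASSES):
    """Convert a (num_frames, num_classes) probability matrix into events."""
    events = []
    for c, name in enumerate(classes):
        active = probs[:, c] >= threshold                  # per-frame activity mask
        padded = np.concatenate(([False], active, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])  # rising/falling edges
        for onset, offset in zip(edges[::2], edges[1::2]):
            events.append((onset * frame_hop, offset * frame_hop, name))
    return sorted(events)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_probs = rng.random((500, len(CLASSES)))  # stand-in for model output
    for onset, offset, label in probs_to_events(fake_probs, threshold=0.9):
        print(f"{onset:6.2f}s - {offset:6.2f}s  {label}")
```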
We aim to develop a user-friendly sound event labeling tool for the middle ground where there are too many audio clips to tag manually yet too few training examples to train an accurate statistical model. User experience should also be considered in this setting. The main purpose is to speed up hand labeling under the premise that sufficient training data is not available. Hence, an annotation system is proposed, and a pilot test gives early indications of its potential.
## 1.3 Thesis Overview
The rest of the thesis is organized as follows. Chapter 2 discusses related work on sound event detection and annotation. Chapter 3 describes the design of the proposed system and algorithm. Chapter 4 presents the experimental analysis, and Chapter 5 concludes and gives directions for future research.
::: spoiler *rough draft (translated from zh-TW)*
>With the continuous development of artificial intelligence, machine learning and deep learning methods are applied more than ever before. Developing ML- or DL-based models requires a large amount of labeled data, so data annotation plays a role in the process. There are two ways to label data -- manual labeling by humans, or automated data labeling powered by machine learning. Although the manual approach is tedious and time-consuming, its quality is better than that of an automatic labeler. Therefore, an assistive labeling tool to accelerate the annotation process would be a good solution.
>In the domain of audio analysis, sound event detection, which aims to identify the elements in an audio clip, can be used for many tasks, including automated audio annotation. An annotation system built with SED still requires a large amount of labeled data. It may work if the sound classes are within a general sound event database. By contrast, for specific target sound classes, this is not feasible when users do not have enough pre-labeled examples or cannot manually provide enough labeled data for such a long-duration recording database.
>We hope to develop a user-friendly sound event labeling tool to solve the in-between problem where there are too many audio clips to tag manually or too few training examples to train an accurate statistical model. In addition, user experience should be considered in this case. The main purpose is to speed up the hand-labeling process under the premise that sufficient training data is not provided. Therefore, an annotation system is proposed, and a pilot test gives early signs of its possibilities.
>The rest of this thesis is organized as follows. Chapter 2 discusses related work on sound event detection and annotation. Chapter 3 describes the proposed system and algorithm design. Chapter 4 conducts the experimental analysis, and Chapter 5 concludes and gives directions for future research.
:::