# [Time Series] Deep Learning with Time Series data
###### tags: `Literature Reading` `Time Series` `Self-supervised` `Unsupervised` `deep learning` `Signal Process`

### [Entry page for AI / ML study notes](https://hackmd.io/@YungHuiHsu/BySsb5dfp)

---

A roundup, as of the end of 2022, of resources, trends, and SOTA models for applying deep learning methods to time series data.
- For details, see the [TS2Vec (Towards Universal Representation of Time Series) paper notes](https://hackmd.io/@YungHuiHsu/SJbzRpIYs)

## What is Time Series Data
### Property
:::spoiler
- Structured data, in contrast to speech, text, images, and video, which are unstructured data
- Forecasting tasks, e.g. stock prices, weather, electricity consumption
- Classification tasks, e.g.
  - Sound recognition (bird calls, musical instruments)
  - Sensor data, e.g. recognizing activity patterns from wearable, phone, or in-vehicle sensors
  - Anomalous signal detection, e.g. factory noise
- Successive timestamps may exhibit **temporal autocorrelation and periodicity**
- Characteristics of time series data
:::spoiler
![Time and Frequency Domain Analysis](https://i.imgur.com/BAGec1N.png =300x)
- [Time and Frequency Domain Analysis](https://www.zseries.in/mathematics%20lab)

![Time Domain Versus the Frequency Domain](https://i.imgur.com/0IkFLJ5.png =300x)
- [Teach Tough Concepts: Frequency Domain in Measurements](https://knowledge.ni.com/KnowledgeArticleDetails?id=kA03q000000YGJ7CAO&l=en-US)
:::
:::

### Signal Process
:::spoiler
Garbage in, garbage out: understanding traditional ways of processing, interpreting, and cleaning signals is still necessary.
- Feature extraction methods and their evolution
  - Statistical domain features
:::spoiler
- Maximum, Minimum, Mean, Median, Skewness, Kurtosis, Histogram, Interquartile Range, Mean Absolute Deviation, Median Absolute Deviation, Root Mean Square, Standard Deviation, Variance, Empirical Distribution Function (ECDF) Percentile Count, ECDF Slope, etc.
:::
  - Time/Temporal domain features
:::spoiler
- Autocorrelation, Centroid, Mean Differences, Mean Absolute Differences, Median Differences, Median Absolute Differences, Sum of Absolute Differences, Entropy, Peak to Peak Distance, Area Under the Curve, Number of Maximum Peaks, Number of Minimum Peaks, Zero Crossing Rate, etc.
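
Several of the time-domain features listed above can be computed directly with NumPy. The following is a minimal sketch rather than a reference implementation: the function name `time_domain_features`, the biased lag-`k` autocorrelation estimator, and the sign-change definition of zero crossing rate are assumptions chosen for illustration, and feature libraries may define these quantities slightly differently. It expects a 1-D signal with at least two samples.

```python
import numpy as np


def time_domain_features(x, lag=1):
    """Compute a few time-domain features for a 1-D signal (hypothetical helper)."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    x = np.asarray(x, dtype=float)
    diffs = np.diff(x)            # first-order differences between neighbors
    centered = x - x.mean()
    denom = np.sum(centered ** 2)

    # Biased lag-k autocorrelation; defined here as 0 for a constant signal
    if denom > 0:
        autocorr = float(np.sum(centered[:-lag] * centered[lag:]) / denom)
    else:
        autocorr = 0.0

    # Zero crossing rate: fraction of consecutive pairs whose signs differ
    signs = np.signbit(x)
    zcr = float(np.mean(signs[1:] != signs[:-1]))

    return {
        "mean_abs_diff": float(np.mean(np.abs(diffs))),
        "sum_abs_diff": float(np.sum(np.abs(diffs))),
        "peak_to_peak": float(x.max() - x.min()),
        "autocorr": autocorr,
        "zero_crossing_rate": zcr,
    }


# Usage: a sine wave with a period of 20 samples, autocorrelation at one period
signal = np.sin(2 * np.pi * np.arange(200) / 20)
features = time_domain_features(signal, lag=20)
```

Hand-crafted features like these can serve as a baseline to compare against features learned by a deep model.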
:::
  - Frequency/Spectral domain features
:::spoiler
- Fourier Transform, FFT Mean Coefficient, Wavelet Transform, Wavelet Absolute Mean, Wavelet Standard Deviation, Wavelet Variance, Spectral Distance, Spectral Fundamental Frequency, Spectral Maximum Frequency, Spectral Median Frequency, Spectral Maximum Peaks, etc.
:::
  - Joint time-frequency domain features
    ![](https://i.imgur.com/sOh7zlB.png =500x)
    - [Analyze signals and images in the wavelet domain](https://www.mathworks.com/discovery/wavelet-transforms.html)

    ![](https://i.imgur.com/M7rhL7p.png =200x)
    - [Psychoacoustic Impacts Estimation in Manufacturing based on Accelerometer Measurement using Artificial Neural Networks](https://www.researchgate.net/publication/308607454_Psychoacoustic_Impacts_Estimation_in_Manufacturing_based_on_Accelerometer_Measurement_using_Artificial_Neural_Networks)
  - Deep (learning) features
    - Features extracted with deep learning models
  - Evolution of feature engineering methods
    ![](https://i.imgur.com/ZVCgTPL.png =300x)
    - [Trends in audio signal feature extraction methods](https://www.sciencedirect.com/science/article/pii/S0003682X19308795)
- References
  - [Time Domain Analysis vs Frequency Domain Analysis: A Guide and Comparison](https://resources.pcb.cadence.com/blog/2020-time-domain-analysis-vs-frequency-domain-analysis-a-guide-and-comparison)
  - [Practical notes: time series feature engineering](https://www.gushiciku.cn/pl/ggzu/zh-tw)
  - [Which frequency-domain features can be extracted from time series data](https://www.zhihu.com/question/24021704/answer/2245867156)
  - course
    - [Advanced Machine Learning and Signal Processing](https://www.coursera.org/learn/advanced-machine-learning-signal-processing#syllabus)
:::

## Resources
### Open Libraries
#### tsai
Currently the most complete collection of deep learning models for time series, including data preprocessing and visualization
##### Official documentation [tsai](https://timeseriesai.github.io)
:::spoiler
- Official documentation
  - Data preparation:
    - [Time Series data preparation](https://colab.research.google.com/github/timeseriesAI/tsai/blob/master/tutorial_nbs/00c_Time_Series_data_preparation.ipynb): this shows how you can classify both
univariate and multivariate time series.
    - [How to work with (very) large numpy arrays in tsai?](https://colab.research.google.com/github/timeseriesAI/tsai/blob/master/tutorial_nbs/00_How_to_efficiently_work_with_very_large_numpy_arrays.ipynb)
    - [How to use numpy arrays in tsai?](https://colab.research.google.com/github/timeseriesAI/tsai/blob/master/tutorial_nbs/00b_How_to_use_numpy_arrays_in_fastai.ipynb)
  - Visualization
    - [PredictionDynamics](https://github.com/timeseriesAI/tsai/blob/master/tutorial_nbs/09_PredictionDynamics.ipynb)
- SOTA
  - The models that have consistently delivered the best results in recent benchmark studies are InceptionTime (Fawaz, 2019) and **ROCKET** (Dempster, 2019). Transformers, like **TST** (Zerveas, 2020), also show a lot of promise, but the application to time series data is so new that they have not been benchmarked against other architectures.
- Format alignment in tsai
  - array: (sample, n_var, t_step)
  - df format:
    - cols = [sample, feature, t_step]
    - `df2xy()`
:::

#### tslearn
Supports some classical ML methods; has seen little maintenance in recent years
##### Official documentation [tslearn](https://tslearn.readthedocs.io/en/stable/index.html)

### Open Datasets
:::spoiler
- [Benchmark time series data sets for PyTorch](https://philipdarke.com/torchtime/)
- Univariate time series (UTS)
  - [128 UCR datasets](https://www.cs.ucr.edu/~eamonn/time_series_data_2018)
- Multivariate time series (MTS)
  - [30 UEA datasets](http://www.timeseriesclassification.com)
- ref
  - [ts2vec](https://github.com/yuezhihan/ts2vec)
:::

### Paper and Code survey
#### Journal and conference rankings
:::spoiler
- [Resource roundup: keeping up with frontier AI knowledge](https://www.ycc.idv.tw/latest_ai_info.html)
  - arXiv
    - Nearly every important AI paper can be found on arXiv, often first-hand before formal publication. However, arXiv has no strict review process, so treat the content with skepticism until it has been reviewed by a conference or journal.
  - h-index
    - The largest h such that at least h of the published papers have each been cited at least h times.
  - h-median
    - The median citation count among the h most-cited papers (h given by the h-index). Example: a conference has five papers cited 17, 9, 6, 3, and 2 times. Its h-index is 3, so its h most influential papers have citation counts 17, 9, and 6, and the median, 9, is the h-median.
:::

#### [Papers with Code](https://paperswithcode.com/)
:::spoiler
- Handy for finding related papers and GitHub repositories
- Some papers on [arXiv](https://arxiv.org) do not come with official code
:::

## Deep learning with time series data
### Trends
#### Trends in Kaggle competitions
- Observations on trends in recent (2021) time series competitions
  - [Key takeaways from Kaggle’s most recent time series competition](https://towardsdatascience.com/key-takeaways-from-kaggles-most-recent-time-series-competition-ventilator-pressure-prediction-7a1d2e4e0131)
    - Pay attention to the trends in data processing and model choice
    - LSTMs dominate, and Transformer architectures are starting to appear
    - Hand-crafted features are used to shorten model convergence time (with seemingly no significant effect on performance)
  - [Time Series Forecast: A comprehensive Guide](https://www.kaggle.com/code/ankumagawa/time-series-forecast-a-comprehensive-guide)

#### Paradigm Shift
- Supervised -> Self-Supervised

Following the success of self-supervised methods in NLP and CV, time series (TS) data also began adopting self-supervised learning around 2020.

#### Architecture
- Sequence (RNN, LSTM), CNN -> Transformer
:::spoiler
- Self-Attention and the Transformer architecture are first proposed, then go on to shine in NLP
  - [2017。NeurIPS。Attention Is All You Need](https://arxiv.org/abs/1706.03762)
- A pure Transformer achieves its first breakthrough in CV
  - [2021。ICLR。An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929)
- A simple, plain generative self-supervised training method succeeds in CV
  - Borrows the idea of BERT-style masked token generation from NLP
  - [2022。CVPR。Masked Autoencoders Are Scalable Vision Learners](https://arxiv.org/abs/2111.06377)
- Transformer for time series
  - [Are Transformers Effective for Time Series Forecasting?](https://arxiv.org/abs/2205.13504)
  - [Transformers in Time Series: A Survey](https://arxiv.org/abs/2202.07125)
:::

### Self-Supervised Learning
#### SOTA(2022/2023)
:::info
- On classification tasks, Ti-MAE and TS2Vec perform similarly
- Ti-MAE (generative training) had not released code as of the end of 2022, so TS2Vec (contrastive training) is currently used for feature extraction

![](https://i.imgur.com/uIJKSBs.png =400x)
:::

#### Contrast method
:::spoiler
##### ==[2022。AAAI。TS2Vec: Towards Universal Representation of Time Series](https://arxiv.org/abs/2106.10466)==
See the [TS2Vec paper notes](https://hackmd.io/@YungHuiHsu/SJbzRpIYs) for details

![](https://i.imgur.com/DWk54ru.png =500x)
- comment
  - Lightweight, quite fast to run, and the learned features are easy to obtain
  - Supports multivariate time series
  - Contrasts features hierarchically at multiple scales, providing semantics at different levels and improving feature learning and generalization
    - Temporal and instance-wise
  - But lacks feature extraction on the frequency side?
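- performance
  - ![](https://i.imgur.com/bpCcKYE.png =600x)
  - ![](https://i.imgur.com/0qIcqqK.png)
- [github](https://github.com/yuezhihan/ts2vec)

##### [2022。ICLR。Unsupervised Time-Series Representation Learning with Iterative Bilinear Temporal-Spectral Fusion(BTSF)]
![](https://i.imgur.com/NkmzdhB.png =600x)
- comment
  - Through bidirectional fusion, combines time and frequency features (bilinear temporal-spectral fusion) to better capture global features in both frequency and time at once
  - Shortcomings of prior work
    - Mostly captures only the time domain or only the frequency domain; positive and negative samples are mostly drawn along the time axis; long-term forecasting performance is poor
- performance
  - A nice idea, but in practice it does not seem to pull away from recent SOTA
  - No code is available for reference, and it is not compared against the 2021 SOTA (perhaps too late to include?)

![](https://i.imgur.com/uYXpfsZ.png =400x)
![](https://i.imgur.com/Mf7iKd8.png =400x)

##### [2021。ICLR。Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding(TNC)](https://arxiv.org/abs/2106.00750)
![](https://i.imgur.com/p8u9nma.png =500x)
- comment
  - TNC exploits the local smoothness of the signal's generative process to learn generalizable representations of time series windows: in the representation space (latent vectors), the distribution of signals from a temporal neighborhood is made distinguishable from that of non-neighboring signals

##### [2019。NeurIPS。Unsupervised Scalable Representation Learning for Multivariate Time Series(SRL/T-Loss)](https://arxiv.org/abs/1901.10738)
- Still outperforms models published in 2021 on some tasks
- triplet loss

![](https://i.imgur.com/a5jhLpQ.png =300x)
- Encoder architecture: a multi-layer stack of dilated causal convolutions (stacked dilated causal CNN)

![](https://i.imgur.com/9hijQvi.png =500x)
- comment
  - By comparing neighboring (positive) and randomly chosen (negative) subseries, the learned representation should be as close as possible to neighboring samples and as dissimilar as possible from (pushed away from) randomly chosen samples
:::

#### Generative method
:::spoiler
##### ==[2023。ICLR。Ti-MAE: Self-Supervised Masked Time Series Autoencoders](https://openreview.net/forum?id=9AuIMiZhkL2)==
![](https://i.imgur.com/V6W1H7G.png =400x)
- comment
  - Masking is applied in a time-synchronized manner
- performance
  - (ICLR 2023 v3, under revision; pdf on openreview.net. Code is not yet released but is worth looking forward to)

  ![](https://i.imgur.com/uIJKSBs.png =200x)
  - Forecasting and classification results appear to be the current SOTA
  - No significant gap from TS2Vec on classification tasks

##### [2022。_。Time Series Generation with Masked Autoencoder](https://arxiv.org/abs/2201.07006)
![](https://i.imgur.com/5QNhl3Y.png)
- comment
  - Applied to time series data synthesis; has no intermediate abstract features (latent vectors)
- [github](https://github.com/Dolores2333/ExtraMAE)

##### [2022。_。MTSMAE: Masked Autoencoders for Multivariate Time-Series Forecasting](https://arxiv.org/abs/2210.02199)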
![](https://i.imgur.com/vFmrQGo.png =400x)
- comment
  - Handles multivariate data
  - Used for forecasting, but with a modified masking scheme it could potentially apply to various tasks
  - No code

##### ==[2021。KDD。A Transformer-based Framework for Multivariate Time Series Representation Learning(MVP/TST/TSBERT)](https://arxiv.org/abs/2010.02803)==
![](https://i.imgur.com/ugQA550.png =400x)
- Comment
  - Conceptually a time series version of BERT, trained with a generative method (masking the time series data)
- Paper Link & Resource
  - [the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining](https://dl.acm.org/doi/10.1145/3447548.3467401)
  - COLAB(TSAI)
    - [08_Self_Supervised_TSBERT.ipynb](https://colab.research.google.com/github/timeseriesAI/tsai/blob/master/tutorial_nbs/08_Self_Supervised_TSBERT.ipynb#scrollTo=iZkX2--eMbZP)
      - The architecture can be flexibly customized
      - Data augmentation can be user-defined
:::

### Supervised Learning (to be added)
#### Forecast task
#### Regression task
#### Classification task

### Unsupervised Learning
:::spoiler
The current trend is toward cross-modal (cross-domain) methods that combine the time and frequency domains, self-supervision, and Transformer architectures

[ML Together: Unsupervised time series clustering (part 1)](https://www.youtube.com/watch?v=N4dvAtV8V0M)
![](https://i.imgur.com/VYNTuIM.png =300x)

##### [2019。ICLR。SOM-VAE: Interpretable Discrete Representation Learning on Time Series]()
![](https://i.imgur.com/NVZUm1b.png =400x)
- Comment
  - Combines self-organizing maps (SOM) with a generative method (VAE)

##### [2018。ICLR。Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features](https://arxiv.org/abs/1802.01059)
![](https://i.imgur.com/KrexlP3.png =400x)
- Comment
  - Uses time-domain features only
  - Feature extraction follows the autoencoder idea
:::

### EXAI for Time Series
- [What went wrong and when? 
Instance-wise Feature Importance for Time-series Models](https://arxiv.org/abs/2003.02821)

---

## Related Deep Learning notes
### Self-supervised Learning
- [[Self-supervised] Self-supervised Learning and Vision Transformer: key notes and recent developments](https://hackmd.io/7t35ALztT56STzItxo3UiA)
- [[Time Series] - TS2Vec(Towards Universal Representation of Time Series) paper notes](https://hackmd.io/OE9u1T9ETbSdiSzM1eMkqA)

### Object Detection
- [[Object Detection_YOLO] YOLOv7 paper notes](https://hackmd.io/xhLeIsoSToW0jL61QRWDcQ)

### ViT and Transformer
- [[Transformer_CV] Vision Transformer(ViT) key notes](https://hackmd.io/tMw0oZM6T860zHJ2jkmLAA)
- [[Transformer] Self-Attention and Transformer](https://hackmd.io/fmJx3K4ySAO-zA0GEr0Clw)
- [[Explainable AI] Transformer Interpretability Beyond Attention Visualization: Transformer interpretability and visualization](https://hackmd.io/SdKCrj2RTySHxLevJkIrZQ)
- [[Transformer_CV] Masked Autoencoders(MAE) paper notes](https://hackmd.io/lTqNcOmQQLiwzkAwVySh8Q)

### Autoencoder
- [[Autoencoder] Variational Sparse Coding (VSC) paper notes](https://hackmd.io/MXxa8zesRhym4ahu7OJEfQ)
- [[Transformer_CV] Masked Autoencoders(MAE) paper notes](https://hackmd.io/lTqNcOmQQLiwzkAwVySh8Q)