
# part2 week1

## Sequences, Time Series and Prediction week1

### What exactly is a time series?

> It's typically defined as an ordered sequence of values that are usually equally spaced over time.

> In other words: a sequence of values ordered in time and spaced at fixed intervals.

### Univariate vs. Multivariate Time Series

> A univariate series lets you observe a trend; multivariate series may be correlated with one another, so they can provide more information.

![](https://i.imgur.com/ncXfByv.png)

### Time series analysis

* Forecasting future values
* Filling in missing past values (imputation)
* Anomaly detection
* Speech recognition (speech-to-text, STT)

### Common patterns in time series

* Trend
* Seasonality
![](https://i.imgur.com/b7s1Vq7.png)
* White noise
* Autocorrelation
![](https://i.imgur.com/JvHQiLB.png)

> Often a time series like this is described as having memory as steps are dependent on previous ones.

> The spikes which are unpredictable are often called Innovations.

### Stationary time series

Machine learning is about discovering patterns, but those patterns are only useful if they also show up in the future. This also means that a longer data window is not always better; what matters is that the same stable patterns link the past to the future.

![](https://i.imgur.com/B2fr7JV.jpg)

> If this were a stock price, then maybe it was a big financial crisis or a big scandal or perhaps a disruptive technological breakthrough causing a massive change. After that the time series started to trend downward without any clear seasonality. We'll typically call this a non-stationary time series.

### Train, validation and test sets

* Check that the train, validation and test sets all cover the same patterns (trend, seasonality,...)
* Roll-forward partitioning vs. fixed partition

Fixed partition: a fixed time window (the method mainly used in the rest of these notes)
![](https://i.imgur.com/6ku4HgV.jpg)

Roll-forward partitioning: a time window that is adjusted dynamically
![](https://i.imgur.com/dFv7bUP.png)

### Metrics for evaluating performance

* Metrics for measuring model performance (errors, mse, rmse, mae, mape)
* MAPE: mean absolute percentage error (the P stands for percentage)
  Note: MAPE cannot be computed when the actual values contain zeros, because each error is divided by the actual value.

![](https://i.imgur.com/m19PD1d.png)

### Moving average and differencing

* Comparing the performance (MAE) of the naive, MA and differencing methods

naive: 5.9
![](https://i.imgur.com/ld50DbF.jpg)

MA(30): 7.14
![](https://i.imgur.com/3oEojFV.jpg)

differencing: 5.8
![](https://i.imgur.com/6tlDbeQ.jpg)

MA(30)+differencing: 4.5
![](https://i.imgur.com/PMb1iN7.jpg)

* Remember to try these statistical methods before moving on to deep learning

### Trailing versus centered window

* Centered window: uses both past and future values to estimate the current value (so it cannot be used to forecast future values)
* Trailing window: uses only past data to estimate the current value
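
To make the trailing vs. centered distinction concrete, here is a minimal sketch in plain Python. The function names and the `half` parameter are made up for illustration; they are not from the course material.

```python
def trailing_average(series, t, window):
    """Mean of the `window` values strictly before index t (past data only)."""
    if t < window:
        raise ValueError("not enough past values for this window")
    return sum(series[t - window:t]) / window


def centered_average(series, t, half):
    """Mean of the values from t - half to t + half (needs future values)."""
    if t - half < 0 or t + half >= len(series):
        raise ValueError("window runs past the available data")
    return sum(series[t - half:t + half + 1]) / (2 * half + 1)


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5]
    print(trailing_average(series, 4, 2))   # uses series[2] and series[3]
    print(centered_average(series, 2, 1))   # uses series[1], series[2], series[3]
```

At the last index the centered version raises an error, because the values it needs after `t` do not exist yet. That is why only the trailing window can be used for forecasting.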
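
The baselines compared in "Moving average and differencing" can be sketched roughly as below. This is an illustration on a synthetic series, not the course's code: the series generator, the split point of 300, the period of 50 and all function names are assumptions, so the MAE values it prints will not match the numbers quoted in these notes.

```python
import math
import random


def make_series(n=400, period=50, slope=0.05, noise=1.0, seed=0):
    """Hypothetical test series: linear trend + sine seasonality + noise."""
    rng = random.Random(seed)
    return [
        slope * t + 10 * math.sin(2 * math.pi * t / period) + rng.gauss(0, noise)
        for t in range(n)
    ]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def trailing_mean(values, t, window):
    """Mean of the `window` values strictly before index t."""
    return sum(values[t - window:t]) / window


def naive_forecast(series, split):
    """Predict each value as the previous observed value."""
    return [series[t - 1] for t in range(split, len(series))]


def moving_average_forecast(series, split, window):
    """Predict each value as the trailing mean of the last `window` values."""
    return [trailing_mean(series, t, window) for t in range(split, len(series))]


def diff_ma_forecast(series, split, period, window):
    """Moving average on the seasonally differenced series, then add back
    the value from one period earlier."""
    # diff[i] corresponds to time i + period
    diff = [series[t] - series[t - period] for t in range(period, len(series))]
    return [
        series[t - period] + trailing_mean(diff, t - period, window)
        for t in range(split, len(series))
    ]


if __name__ == "__main__":
    series = make_series()
    split = 300                      # fixed partition: train before, validate after
    valid = series[split:]
    forecasts = {
        "naive": naive_forecast(series, split),
        "MA(30)": moving_average_forecast(series, split, 30),
        "MA(30)+differencing": diff_ma_forecast(series, split, 50, 30),
    }
    for name, preds in forecasts.items():
        print(f"{name}: MAE = {mae(valid, preds):.2f}")
```

Differencing against the value one period earlier removes both the trend and the seasonality, so the moving average only has to smooth what is left. That is the likely reason the combined method scores best in the comparison above.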
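
For the metrics listed under "Metrics for evaluating performance", a small sketch in plain Python may help. The function name and the returned dictionary are illustrative choices. MAPE is multiplied by 100 here to read as a percentage, which is a convention choice, and it is returned as `None` when any actual value is zero.

```python
import math


def forecast_metrics(actual, forecast):
    """Return mse, rmse, mae and mape for two equal-length sequences."""
    errors = [f - a for f, a in zip(forecast, actual)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    if any(a == 0 for a in actual):
        mape = None  # undefined: each error would be divided by zero
    else:
        mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "mape": mape}


if __name__ == "__main__":
    print(forecast_metrics([2, 4], [1, 6]))
    print(forecast_metrics([0, 4], [1, 6]))  # mape is None because of the zero
```

MSE and RMSE square the errors, so large misses are penalised more heavily than in MAE, which weights all errors in proportion to their size.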
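
The two partitioning schemes from "Train, validation and test sets" can be sketched as follows. This assumes roll-forward means a training window that starts short and grows by a fixed step, each time validating on the next chunk. The function names and parameters are hypothetical.

```python
def fixed_partition(series, train_end, valid_end):
    """Split once, in time order, into train / validation / test."""
    return series[:train_end], series[train_end:valid_end], series[valid_end:]


def roll_forward_splits(n, initial_train, step):
    """Yield (train_indices, valid_indices) with a growing training window."""
    end = initial_train
    while end < n:
        yield range(0, end), range(end, min(end + step, n))
        end += step


if __name__ == "__main__":
    data = list(range(10))
    print(fixed_partition(data, 6, 8))
    for train_idx, valid_idx in roll_forward_splits(len(data), 4, 3):
        print(list(train_idx), "->", list(valid_idx))
```

Neither scheme shuffles the data: every validation index comes after every training index, which keeps the evaluation free of information from the future.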