# Time Series
:::success
[Back to notes index](https://hackmd.io/68GUUX4MQXGXroA5TlsZ-Q)
Edited: 2023/04/19, Chapter 1 under construction
[github](https://github.com/Iofting1023/2024-Spring--Introduction-to-stochastic-calculus)
:::
[TOC]
# 1. Characteristics of Time Series
:::info
Time series analysis deals with serially (time-)correlated data.
:::
- Time Domain: lagged relationships.
- Frequency Domain: Investigation of cycles.
- Importing the textbook's data in Python:
```python=+
import astsadata
data = astsadata.jj  # replace jj with the desired dataset name
```
## 1.1 The Nature of Time Series Data
- Time series with a trend: temperature

- Time series with periodicity: audio data

- Series with changing volatility: Dow Jones returns

## 1.2 Time Series Statistical Models
- Time series can be defined as a collection of random variables indexed according to the order they are obtained in time.
- For discrete time we write $$\{x_t\},\quad t=0,\pm1,\pm2,\ldots$$
called a realization of the stochastic process, with $t$ an integer.
:::info
Time indices are continuous in principle; we discretize them for convenience of analysis.
:::
### White noise
- Uncorrelated random variables, with or without a distributional assumption:
$$w_t \sim wn(0,\sigma_w^2)$$
$$w_t \sim N(0,\sigma_w^2)$$
### Moving Averages
- Smooth a series by averaging neighboring samples:
$$v_t=\sum_{j= -h}^h a_jw_{t-j}$$
- Smoothing in this way does not change the mean (a simulation sketch follows).
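A minimal simulation sketch (assuming only `numpy`; the series length is illustrative) of white noise and its three-point moving average:
```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 1, 500)                 # white noise w_t ~ N(0, 1)

# three-point symmetric moving average: v_t = (w_{t-1} + w_t + w_{t+1}) / 3
v = np.convolve(w, np.ones(3) / 3, mode="valid")

print(w.mean().round(3), v.mean().round(3))  # the mean (~0) is unchanged by smoothing
```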
### Autoregressions
- The current value is correlated with past values, e.g.
$$x_t = x_{t-1}-0.9x_{t-2}+w_t$$
- A model with $p$ lags requires $p$ initial values to construct the series (see the sketch below).
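A quick sketch (assuming `numpy`) of the AR(2) example above, with the two required initial values set to zero for illustration:
```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
w = rng.normal(0, 1, n)

# x_t = x_{t-1} - 0.9 x_{t-2} + w_t: two lags, so two initial values x_0 = x_1 = 0
x = np.zeros(n)
for t in range(2, n):
    x[t] = x[t - 1] - 0.9 * x[t - 2] + w[t]
```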
### Random Walk with Drift
$$x_t = \delta +x_{t-1}+w_t=\delta t +\sum_{j=1}^t w_j,\quad x_0=0$$
- $\delta$: drift term.
- $E(x_t)=\delta t$
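A sketch (assuming `numpy`; the drift value is an arbitrary illustration) using the cumulative-sum form $x_t=\delta t+\sum_{j=1}^t w_j$:
```python
import numpy as np

rng = np.random.default_rng(2)
n, delta = 200, 0.2                       # delta is an illustrative drift value
w = rng.normal(0, 1, n)

t = np.arange(1, n + 1)
x = delta * t + np.cumsum(w)              # x_t = delta * t + w_1 + ... + w_t
print(x[-1] / n, delta)                   # E(x_t) = delta * t, so x_n / n ~ delta
```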
### Signal in Noise
$$x_t = 2\cos(2\pi\frac{t+15}{50})+w_t$$
- A periodic signal can be written as $A\cos(2\pi\omega t+\phi)$
- $A$: the amplitude
- $\omega$: the frequency of oscillation
- $\phi$: the phase shift
- $\frac{A}{\sigma_w}$: signal-to-noise ratio
- $E(x_t)=2 \cos(2\pi \frac{t+15}{50})$
Later chapters cover detecting regular or periodic signals in models of the form
$$x_t =s_t +v_t$$
- $s_t$: some unknown signal
- $v_t$: time series that may be white or correlated over time
## 1.3 Measures of Dependence
- The collection $\{x_t\}$ can be described by a joint probability distribution.
:::info
Although the joint distribution function describes the data completely, it is an unwieldy tool for displaying and analyzing time series data.
:::
- The most useful descriptive measures are those expressed in terms of covariance and correlation functions.
### Define autocovariance function
$$\gamma_x(s,t)=cov(x_s,x_t)=E[(x_s-\mu_s)(x_t-\mu_t)]$$
- measures the linear dependence between two points on the same series observed at different times
- When there is no ambiguity we simply write $\gamma_x(s,t)=\gamma(s,t)$.
- $\gamma(t,t)=E[(x_t-\mu_t)^2]=var(x_t)$
#### Example 1.17 Autocovariance of a Moving Average (p. 17)
\begin{align}
\gamma_V(s,t)&=cov(v_s,v_t)=cov\{\frac{1}{3}(w_{s-1}+w_s+w_{s+1}),\frac{1}{3}(w_{t-1}+w_t+w_{t+1})\}\\
&=\begin{cases}\frac{3}{9}\sigma_w^2,\quad &s=t,\\
\frac{2}{9}\sigma_w^2,\quad &|s-t|=1,\\
\frac{1}{9}\sigma_w^2,\quad &|s-t|=2,\\
0,\quad &|s-t|>2 .
\end{cases}
\end{align}
- It only depends on the time separation or lag and not on the absolute location of the points along the series
- The value depends on how many white-noise terms the two averaging windows share.
#### Example 1.18 Autocovariance of a Random Walk
$$\gamma_x(s,t)=cov\big(\sum_{j=1}^s w_j,\sum_{k=1}^t w_k\big)=\min\{s,t\}\sigma_w^2$$
- autocovariance function of a random walk depends on the particular time values $s$ and $t$, and not on the time separation or lag.
### Define autocorrelation function (ACF)
$$\rho(s,t)=\frac{\gamma(s,t)}{\sqrt{\gamma(s,s)\gamma(t,t)}}$$
- The ACF measures the linear predictability of the series at time $t$, say $x_t$, using only the value $x_s$.
- $-1\leq\rho(s,t)\leq1$
### Define cross-covariance function and **CCF**
Consider two time series $\{x_t\}$ and $\{y_t\}$.
- cross-covariance function
$$\gamma_{xy}(s,t)=cov(x_s,y_t)=E[(x_s-\mu_{xs})(y_t-\mu_{yt})]$$
- cross-correlation function (CCF)
$$\rho_{xy}(s,t)=\frac{\gamma_{xy}(s,t)}{\sqrt{\gamma_x(s,s)\gamma_y(t,t)}}$$
## 1.4 Stationary Time Series
### Define strictly stationary
Consider a time series $\{x_t\}$, with
- a collection of values:
$$\{x_{t_1},x_{t_2},\ldots, x_{t_k}\}$$
- and the time-shifted set
$$\{x_{t_1+h},x_{t_2+h},\ldots, x_{t_k+h}\}.$$
The series is strictly stationary if
\begin{align}
&\mathcal{P}\{x_{t_1}\leq c_1,\ldots, x_{t_k}\leq c_k\}=\mathcal{P}\{x_{t_1+h}\leq c_1,\ldots, x_{t_k+h}\leq c_k\},\\
&\forall \ k=1,2,\ldots,\ \text{all time points}\ t_1,\ldots,t_k,\ \text{all constants}\ c_1,\ldots, c_k,\ \text{and all shifts}\ h=0,\pm1,\pm2,\ldots.
\end{align}
- Then all of the multivariate distribution functions for subsets of variables must agree with their counterparts in the shifted set for all values of the shift parameter $h$.
- Probability distributions over equal time separations coincide: they depend only on the lag, not on the absolute time points.
- This implies $\gamma(s,t)=\gamma(s+h,t+h)$.
- The autocovariance function of the process depends only on the time difference between $s$ and $t$, not on the actual times.
- This condition is too strong and hard to verify in practice, so a weaker notion is defined.
### Define weak stationary
Consider a time series $\{x_t\}$ satisfying conditions 1 and 2:
1. the mean value function $\mu_t$ is constant and does not depend on time $t$;
2. the autocovariance function, $\gamma(s, t)$ depends on $s$ and $t$ only through their difference $|s − t|$.
- The unqualified term ***stationary*** usually means ***weakly stationary***.
- Strict stationarity implies weak stationarity; the converse does not hold in general (for Gaussian processes it does).
- Since $\mu_t$ is constant under stationarity, we denote it by $\mu$.
- Since $\gamma(t+h,t)=cov(x_{t+h},x_t)=cov(x_h,x_0)=\gamma(h,0)$, we denote it by $\gamma(h)$.
- Define $\rho(h)=\frac{\gamma(h)}{\gamma(0)}.$
#### Example1.19 Stationarity of White Noise
The mean and autocovariance functions of the white noise series are $\mu_{wt}=0$ and
$$
\gamma_w(h)= cov(w_{t+h}, w_t)=\left\{
\begin{aligned}
\sigma_w^2 \ ; & \ h=0.\\
0 \ ; & \ h \neq 0. \\
\end{aligned}
\right.
$$
- White noise is weakly stationary or stationary
- If the white noise variates are also normally distributed, the series is also strictly stationary.
#### Example 1.20 Stationarity of a Moving Average
- The three-point moving average process is stationary (independent of time t)
- $\mu_{vt} = 0$, and
$$
\gamma_v(h)=\left\{
\begin{aligned}
\frac{3}{9}\sigma_w^2 \ ; & \ h=0.\\
\frac{2}{9}\sigma_w^2 \ ; & \ h=\pm1.\\
\frac{1}{9}\sigma_w^2 \ ; & \ h=\pm2.\\
0 \ ; & \ |h| > 2. \\
\end{aligned}
\right.
$$
- autocorrelation function
$$
\rho_v(h)=\left\{
\begin{aligned}
1 \ ; &\ h=0 \\
\frac{2}{3} \ ; &\ h=\pm1\\
\frac{1}{3} \ ; &\ h=\pm2\\
0 \ ; &\ |h|>2\\
\end{aligned}
\right.
$$

#### Example 1.21 A Random Walk is Not Stationary
- A random walk is not stationary because
- its autocovariance function $\gamma_x(s,t)=\min\{s,t\}\sigma_w^2$
- and the mean of a random walk with drift, $\mu_{xt}=\delta t$,
- are both functions of time $t$
#### Example 1.22 Trend Stationarity
model: $x_t=\alpha+\beta t+y_t$ where $y_t$ is white noise (stationary)
- Mean function: $\mu_{x, t}=E(x_t)=\alpha +\beta t+\mu_y$ (not free of time $t$, so the mean is not stationary)
- Autocovariance function: $\gamma_x(h)=cov(x_{t+h}, x_t)=E[(x_{t+h}-\mu_{x, t+h})(x_{t}-\mu_{x, t})]=E[(y_{t+h}-\mu_{y})(y_{t}-\mu_{y})]=\gamma_y(h)$
- the model may be considered as having stationary behavior around a linear trend (called **trend stationarity**)

### Special properties (ACF of a stationary process)
- $\gamma(h)$ is non-negative definite ( Problem 1.25 [(a)](https://stats.stackexchange.com/questions/431429/show-that-the-autocovariance-function-of-stationary-process-x-t-is-positiv))
- This guarantees that the variance of any linear combination of the $x_t$ is non-negative, i.e.:
$$
0 \leq var(a_1x_1+\dots+a_nx_n)=\sum_{j=1}^{n}\sum_{k=1}^n a_ja_{k}\gamma(j-k)
$$
- Cauchy-Schwarz inequality: $|\gamma(h)| \leq \gamma(0)$
- $\gamma(h)=\gamma(-h)$
proof: $\gamma((t+h)-t)=cov(x_{t+h}, x_t)=cov(x_t, x_{t+h})=\gamma(t-(t+h))$
### Definition 1.10: cross-covariance function
Two time series $x_t$ and $y_t$
$$
\gamma_{xy}(h)=cov(x_{t+h}, y_t)=E[(x_{t+h}-\mu_x)(y_t-\mu_y)]
$$
### Definition 1.11: cross-correlation function
Two time series $x_t$ and $y_t$
$$
\rho_{xy}(h)=\frac{\gamma_{xy}(h)}{\sqrt{\gamma_x(0)\gamma_y(0)}}
$$
- properties:
- $\gamma_{xy}((t+h)-t)=cov(x_{t+h}, y_t)=cov(y_{t}, x_{t+h})=\gamma_{yx}(-h)$
- $\gamma_{xy}(h)=\gamma_{yx}(-h)$ (symmetric across the two series, not in $h$ alone)
- $\rho_{xy}(h)=\rho_{yx}(-h)$
#### Example 1.24 Prediction Using Cross-Correlation
model: $y_t=Ax_{t-l}+w_t$
property: the series $x_t$ is said to lead $y_t$ for $l>0$ and to lag $y_t$ for $l<0$
- $\gamma_{yx}(h)=cov(y_{t+h}, x_t)=cov(Ax_{t+h-l}+w_{t+h},\ x_t)=cov(Ax_{t+h-l},\ x_t)=A\gamma_x(h-l)$
- By the Cauchy–Schwarz inequality, $|\gamma_x(h-l)| \leq \gamma_x(0)$, so $\gamma_{yx}(h)$ attains its maximum at $h=l$.
- The figure below shows a simulated cross-covariance with $l=5$; the largest value indeed occurs at lag 5.

### Definition 1.12 linear process:
A linear process $x_t$ is defined to be a linear combination of white noise variates $w_t$, given by
$$
x_t=\mu+\sum_{j=-\infty}^{\infty}\psi_jw_{t-j}, \ \sum_{j=-\infty}^{\infty}|\psi_j|<\infty
$$
- Problem 1.11 $\gamma_x(h)=\sigma_w^2\sum_{j=-\infty}^{\infty}\psi_{j+h}\psi_j$
- The condition for the linear process to have finite variance is $\sum_{j=-\infty}^{\infty}\psi_j^2<\infty$.
### Definition 1.13 Gaussian process
$\{x_t\}$ is a Gaussian process if, for every collection of distinct time points $t_1, t_2, \ldots, t_n$ and every positive integer $n$, the $n$-dimensional vector $x=(x_{t_1}, \ldots, x_{t_n})'$ has a multivariate normal distribution.
- mean vector: $E(x)=\mu=(\mu_{t_1}, \mu_{t_2}, ..., \mu_{t_n})'$
- covariance matrix: $var(x)=\Gamma=\{\gamma(t_i, t_j); i, j=1, ..., n\}$, which is assumed to be positive definite
- density function:
$f(x)=\frac{1}{(2\pi)^{n/2}}|\Gamma|^{-1/2}\exp\{-\frac{1}{2}(x-\mu)'\Gamma^{-1}(x-\mu)\}$
Important property:
- If $\{x_t\}$ is weakly stationary, then $\mu_t$ is constant and $\gamma(t_i, t_j)=\gamma(|t_i-t_j|)$, so a weakly stationary Gaussian process is also strictly stationary.
## 1.5 Estimation of Correlation
If the time series is stationary, then:
- The mean is a constant $\mu_t=\mu$ and can be estimated by the sample mean $\bar{x}$:
$$
\bar{x}=\frac{1}{n}\sum_{t=1}^{n} x_t
$$
- The variance of the sample mean involves all the autocovariances:
$$
var(\bar{x})=var\Big(\frac{1}{n}\sum_{t=1}^n x_t\Big)=\frac{1}{n^2}cov\Big(\sum_{t=1}^{n} x_t, \sum_{s=1}^{n} x_s\Big)=\frac{1}{n^2}\big(n\gamma_x(0)+2(n-1)\gamma_x(1)+\cdots+2\gamma_x(n-1)\big)=\frac{1}{n}\sum_{h=-n}^{n}\Big(1-\frac{|h|}{n}\Big)\gamma_x(h)
$$
### Definition 1.14 The sample autocovariance function
$$
\hat{\gamma}(h)=\frac{1}{n}\sum_{t=1}^{n-h}(x_{t+h}-\bar{x})(x_t-\bar{x})
$$
with $\hat{\gamma}(-h)=\hat{\gamma}(h)$ for $h=0, 1, \ldots, n-1$.
[Problem 1.25 (b)](https://www.stat.berkeley.edu/~bartlett/courses/153-fall2010/lectures/4.pdf)
### Definition 1.15 The sample autocorrelation function
$$
\hat{\rho}(h)=\frac{\hat{\gamma}(h)}{\hat\gamma(0)}
$$
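A minimal implementation sketch of Definitions 1.14 and 1.15 (assuming `numpy`; the function name `sample_acf` is ours):
```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocovariance gamma_hat(h) and autocorrelation rho_hat(h)."""
    x = np.asarray(x, dtype=float)
    n, xd = len(x), x - x.mean()
    gamma = np.array([np.sum(xd[h:] * xd[:n - h]) / n for h in range(max_lag + 1)])
    return gamma, gamma / gamma[0]

gamma_hat, rho_hat = sample_acf(np.random.default_rng(0).normal(size=500), 10)
print(rho_hat)   # for white noise, rho_hat(h) ~ 0 for all h >= 1
```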
## 1.6 Vector-Valued and Multidimensional Series
- Consider $x_t =(x_{t1},\ldots,x_{tp})^\top$, a vector time series that contains as its components $p$ univariate time series.
**For the stationary case,**
- $p\times1$ mean vector:
$$\mu = E(x_t)=(\mu_{t1},\ldots,\mu_{tp})^\top$$
- $p\times p$ autocovariance matrix:
$$\Gamma(h)=E[(x_{t+h}-\mu)(x_t-\mu)^\top]$$
# 2. Time Series Regression and Exploratory Data Analysis
- Multiple linear regression in a time series context
- Model selection
- Exploratory data analysis
- Preprocessing nonstationary time series
## 2.1 Classical Regression in the Time Series Context
Model a time series using linear regression:
$$x_t =\beta_0+\beta_1z_{t1}+\ldots +\beta_qz_{tq}+w_t$$
- $\{w_t\}$ is a random error, often assumed to be iid $N(0,\sigma_w^2)$.
- For time series regression, it is rarely the case that the noise is white, and we will need to eventually relax that assumption.
- $\beta = (\beta_0,\beta_1,\ldots,\beta_q)'$
- The OLS estimate $\hat{\beta}$ is obtained by minimizing $\text{SSE}=\sum_{t=1}^n(x_t-\beta'z_t)^2$ over $\beta$, where $z_t=(1,z_{t1},\ldots,z_{tq})'$.
### Model Selection
- Select the best subset of independent variables.
Consider a subset of the variables, $z_{t,1:r}=\{z_{t1},\ldots ,z_{tr}\},\ r<q$; the reduced model is:
$$x_t =\beta_0+\beta_1z_{t1}+\ldots +\beta_rz_{tr}+w_t$$
Compare the reduced model with the full model by testing $H_0:\beta_{r+1}=\ldots= \beta_q=0$.
- Test statistic:
$$F=\frac{(\text{SSE}_r-\text{SSE})/(q-r)}{\text{SSE}/(n-q-1)}$$
If $H_0$ holds, $\text{SSE}_r\approx \text{SSE}$, because $\hat\beta_{r+1},\ldots, \hat\beta_q$ will be close to 0.
- Variables can also be selected one at a time, a procedure called *stepwise multiple regression*.
Sometimes, instead of a stepwise search, we compare several specific models directly using the criteria below.
- Consider a model with $k$ regressors and the MLE of the variance:
$$\hat{\sigma}_k^2 = \frac{\text{SSE}(k)}{n}$$
### AIC
$$\text{AIC}=\log\hat{\sigma}^2_k+\frac{n+2k}{n}$$
- The $k$ minimizing AIC indicates the best model.
### AICc
$$\text{AICc}=\log\hat{\sigma}_k^2+\frac{n+k}{n-k-2}$$
- Preferred over AIC in small samples.
### BIC
$$\text{BIC}=\log\hat{\sigma}_k^2+\frac{k\log n }{n}$$
- BIC penalizes parameters much more heavily than AIC, so it tends to select smaller models (see the sketch below).
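A small sketch (assuming `numpy`; the function name `info_criteria` is ours) computing the three criteria exactly as defined above, from a model's SSE, sample size $n$, and number of regressors $k$:
```python
import numpy as np

def info_criteria(sse, n, k):
    """AIC, AICc, and BIC from the formulas above, with sigma_hat_k^2 = SSE(k)/n."""
    log_s2 = np.log(sse / n)
    aic = log_s2 + (n + 2 * k) / n
    aicc = log_s2 + (n + k) / (n - k - 2)
    bic = log_s2 + k * np.log(n) / n
    return aic, aicc, bic
```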
## 2.2 Exploratory Data Analysis
The simplest model for a trend stationary series is:
\begin{align}
x_t=\mu_t+y_t
\end{align}
where $x_t$: observations, $\mu_t$: trend, and $y_t$: stationary process.
A strong trend often obscures the stationary process $y_t$, so the first step of exploratory analysis is to remove the trend: form a reasonable estimate $\hat \mu_t$ of the trend component and take the detrended series
$$
\hat y_t=x_t-\hat \mu_t
$$
#### Example 2.4: Detrending Chicken Prices
This example assumes the model $x_t=\mu_t+y_t$ for the data.
**OLS:**
Remove the trend using the regression fitted to the chicken data in Example 2.1; the trend model is:
$$
\mu_t = \beta_0+\beta_1t
$$
Fitting by OLS gives $\hat \mu_t=-7131.02+3.59t$, so the detrended series is
$$
\hat y_t=x_t+7131.02-3.59t
$$
**Differencing:**
From Chapter 1, if the trend $\mu_t$ is a random walk with drift, then $x_t-x_{t-1}$ is stationary, so differencing can also render the data stationary:
\begin{aligned}
x_t-x_{t-1}&=(\mu_t+y_t)-(\mu_{t-1}+y_{t-1})\\
&=\delta+w_t+y_t-y_{t-1}
\end{aligned}
The figure below shows the series detrended by each of these two methods:

### difference
1. An advantage of differencing is that no parameters need be estimated; the drawback is that it provides no estimate of $y_t$. If the goal is only to make the series stationary, differencing is appropriate (see the sketch after the operator definitions below).
2. first difference: $\nabla x_t=x_t-x_{t-1}$
3. the first difference removes a linear trend; a second difference removes a quadratic trend.
### Definition: backshift operator
* backshift operator: $Bx_t=x_{t-1}$ (extended: $B^kx_t=x_{t-k}$)
* forward-shift operator: $B^{-1}x_t=x_{t+1}$, so that $x_t=B^{-1}Bx_t=B^{-1}x_{t-1}$
* $\nabla x_t=(1-B)x_t$
* $\nabla^d=(1-B)^d$
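In code, differencing needs no operator machinery; a sketch with `numpy` (the simulated random walk is illustrative):
```python
import numpy as np

x = np.cumsum(np.random.default_rng(4).normal(0, 1, 100))  # a random walk

dx = np.diff(x)        # first difference:  (1 - B) x_t = x_t - x_{t-1}
d2x = np.diff(x, n=2)  # second difference: (1 - B)^2 x_t, removes a quadratic trend
```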
#### Example 2.5: Differencing Chicken Prices
The figure above shows that after first differencing, the trend visible in the detrended series is gone; the ACF below shows an annual cycle in the first-differenced series.

#### Example 2.6: Differencing Global Temperature
The top panel of the figure below suggests the data behave more like a random walk than a trend stationary series, so differencing is the more appropriate choice.

### fractional differencing
Handles difference orders $-0.5<d<0.5$; long-memory series (e.g., hydrological data) often exhibit $0<d<0.5$.
* Log transform: $y_t=\log x_t$
* Box-Cox family:
$$
y_t=\left\{
\begin{aligned}
(x_t^\lambda-1)/\lambda \ ; & \ \lambda \neq 0 \\
\log x_t \ ; & \ \lambda=0 \\
\end{aligned}
\right.
$$
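A direct sketch of the Box-Cox family (assuming `numpy`; the function name is ours — `scipy.stats.boxcox` offers the same transform with automatic $\lambda$ selection):
```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform: (x^lam - 1)/lam for lam != 0, log(x) for lam == 0."""
    x = np.asarray(x, dtype=float)            # requires x > 0
    return np.log(x) if lam == 0 else (x ** lam - 1) / lam
```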
### Scatterplot Matrices
The ACF tells us about the linear relationship between $x_t$ and $x_{t-h}$ but can miss nonlinear relationships, so scatterplots are also needed.
#### Example 2.8: Scatterplot Matrices, SOI and Recruitment
The red lines are locally weighted scatterplot smoothing (lowess) lines, which help reveal nonlinear relationships.
- $S_t$ has strong positive linear relationships with $S_{t-1}, S_{t-2}, S_{t-11}, S_{t-12}$

- $R_t$ has strong nonlinear relationships with $S_{t-5}, S_{t-6},S_{t-7},S_{t-8}$

#### Example 2.9: Regression with Lagged Variables
Consider adding a dummy variable:
$$
R_t=\beta_0+\beta_1S_{t-6}+\beta_2D_{t-6}+\beta_3D_{t-6}S_{t-6}+w_t
$$
where $D_t=0$ if $S_t<0$ and $D_t=1$ otherwise.
The figures below show that the regression fit is close to the lowess fit, but the bottom two panels show the residuals are still not white noise.


### periodic behavior
#### Example 2.10: Using Regression to Discover a Signal in Noise
Generate 500 observations from the model:
$$
x_t=A\cos(2\pi \omega t+\phi)+w_t
$$
where $\omega=1/50,\ A=2,\ \phi=0.6 \pi,\ \sigma_w=5$.
Assume $\omega=1/50$ is known while $A$ and $\phi$ are unknown. The identity $A\cos(2\pi\omega t+\phi)=A\cos(\phi)\cos(2\pi\omega t)-A\sin(\phi)\sin(2\pi\omega t)$, with $\beta_1=A\cos\phi$ and $\beta_2=-A\sin\phi$, converts the right-hand side to:
$$
x_t=\beta_1\cos(2\pi t/50)+\beta_2\sin(2\pi t/50)+w_t
$$
The regression estimates $\hat \beta_1$ and $\hat \beta_2$ can then be used to detect cyclic or periodic signals (see the sketch below).
The fit gives $\hat \beta_1=-0.74$ and $\hat \beta_2=-1.99$, versus the true values $\beta_1=-0.62$ and $\beta_2=-1.9$; related discussion continues in Chapter 4.

## 2.3 Smoothing in the Time Series Context
Smoothing a noisy series with a moving average can reveal its long-term trend and seasonal components.
$$
m_t=\sum_{j=-k}^k a_jx_{t-j}
$$
where $a_j=a_{-j}\geq 0$ and $\sum_{j=-k}^k a_j=1$ (a symmetric moving average)
#### Example 2.11 Moving Average Smoother
The curve produced by this method is still relatively rough.

#### Example 2.12 Kernel Smoothing
This is also a moving average smoother; the weights for averaging the observations come from a kernel function.
$$
m_t=\sum_{i=1}^n w_i(t)x_i
$$
where $w_i(t)=K(\frac{t-i}{b})/\sum_{j=1}^n K(\frac{t-j}{b})$ and $K(\cdot)$ is a kernel function, typically the normal density; a larger bandwidth $b$ yields a smoother curve (see the sketch below).

#### Example 2.13 Lowess
- Another method is nearest neighbor regression, based on $k$-nearest-neighbors regression: the basic idea is to predict $x_t$ by a regression using only $\{x_{t-k/2}, \dots, x_t, \dots, x_{t+k/2}\}$ and to set $m_t=\hat x_t$; lowess as actually implemented is more elaborate.
- One must therefore choose how much nearby data to use in predicting $m_t$: in the figure below, the El Niño cycle is captured using 5% of the data (blue line), while the overall trend uses the default 2/3 (red line).

#### Example 2.14 Smoothing Splines
- An intuitive approach is to fit a polynomial regression in time $t$, for example a cubic polynomial:
$$
x_t=\beta_0+\beta_1t+\beta_2t^2+\beta_3t^3+e_t=m_t+e_t
$$
- A more refined method divides time $t=1, \dots, n$ into $k$ intervals $[t_0=1, t_1], [t_1+1, t_2], \dots, [t_{k-1}+1, t_k=n]$; the endpoints $t_0, t_1, \dots, t_k$ are called knots. A polynomial in $t$ is then fit within each interval; with cubic polynomials, the fit is called a cubic spline.
- This approach, called **smoothing splines**, minimizes the expression below; the choice of $\lambda$ trades off the regression fit (completely smooth) against the data themselves (no smoothness): larger $\lambda$ yields a smoother fit (a `scipy` sketch follows the figure).
$$
\sum_{t=1}^n [x_t-m_t]^2+\lambda\int(m_t'')^2dt
$$

#### Example 2.15 Smoothing One Series as a Function of Another
The lowess line suggests mortality is lowest when the temperature is around 83°F.
