# Time Series Forecasting with an LSTM Model

## 1. Read the data

Read the data from a file; the resulting sequence is `raw_n`.

```
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# One number per line; skip blank lines so float() does not fail on them
with open('seq_to_train.txt') as f:
    data = [line for line in f.read().split('\n') if line.strip()]
print('data len:', len(data))
raw_n = list(map(float, data))
```

## 2. Plot the raw data

```
import matplotlib.pyplot as plt

plt.plot(raw_n)
plt.show()
```

## 3. Define the data preparation method

Input vectors are obtained by slicing the raw data. The `n_steps` parameter is the length of the input vector that each prediction depends on. Judging from the data, a length of 100 appears suitable here.

```
def split_sequence(sequence, n_steps):
    # Slice the sequence into (input vector, label) pairs
    X = []
    y = []
    for i in range(len(sequence)):
        end_ix = i + n_steps
        if end_ix > len(sequence) - 1:
            break
        # seq_x is the input vector, seq_y is its label (the next value)
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

raw_seq = raw_n
n_steps = 100
x_train, y_train = split_sequence(raw_seq, n_steps)
n_features = 1
# LSTM layers expect input shaped (samples, timesteps, features)
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], n_features))
```

## 4. Build the model

The model uses three LSTM layers, which can capture the features of the data fairly well. The output layer is a single Dense layer.

```
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features), return_sequences=True))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(LSTM(50))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.summary()
```

Model summary:

![](https://i.imgur.com/Wj8QrzA.jpg)

## 5. Train the model

Pass the training data and labels to the model and train for 20 epochs.

```
model.fit(x_train, y_train, epochs=20, batch_size=50, verbose=1, validation_split=0.1)
```

## 6. Use the model

Once the model is trained, use it to predict the next 500 values. `x_result` stores these 500 values.

```
x_input = array(raw_seq[-n_steps:])
x_input = x_input.reshape((1, n_steps, n_features))
x_input_r = raw_seq[-(n_steps-1):]
x_result = []
for i in range(500):
    yhat = model.predict(x_input, verbose=0)
    x_result.append(yhat[0][0])
    # Feed the prediction back in so the next window includes it
    x_input_r.append(yhat[0][0])
    x_input = array(x_input_r[-n_steps:])
    x_input = x_input.reshape((1, n_steps, n_features))
print(x_result)
```

## 7. Plot the results

```
import matplotlib.pyplot as plt

# Place the forecast right after the original series
# (indices 1000 to 1499 when the series has 1000 points)
x_index = list(range(len(raw_n), len(raw_n) + len(x_result)))
plt.plot(raw_n)
plt.plot(x_index, x_result)
plt.show()
```

![](https://i.imgur.com/tJoAObo.jpg)

## 8. Summary

RNNs can handle data that arrives as a sequence. Increasing the number of training epochs and the amount of training data can make the predictions more accurate. When the dataset is small, many training epochs are needed, and the model design should also take the small amount of data into account.
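
The slicing in step 3 and the feed-back loop in step 6 are easier to follow on a toy sequence. The sketch below is only an illustration: `make_windows`, `predict_next`, and the six-value series are hypothetical names and data, and `predict_next` is a simple stand-in that continues the last trend rather than the trained LSTM.

```python
import numpy as np

def make_windows(sequence, n_steps):
    """Slice a 1-D sequence into (window, next value) pairs."""
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])   # n_steps consecutive values
        y.append(sequence[i + n_steps])     # the value right after the window
    return np.array(X), np.array(y)

toy = [10, 20, 30, 40, 50, 60]              # made-up data for illustration
X, y = make_windows(toy, n_steps=3)
print(X.tolist())  # [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
print(y.tolist())  # [40, 50, 60]

# Keras LSTM layers expect input shaped (samples, timesteps, features).
X = X.reshape((X.shape[0], X.shape[1], 1))
print(X.shape)     # (3, 3, 1)

def predict_next(window):
    # Hypothetical stand-in for model.predict: extend the last step's trend.
    return window[-1] + (window[-1] - window[-2])

# Rolling forecast: each prediction is appended to the history,
# so the next window already contains it.
history = list(toy)
forecast = []
for _ in range(3):
    nxt = predict_next(history[-3:])
    forecast.append(nxt)
    history.append(nxt)
print(forecast)    # [70, 80, 90]
```

Because every predicted value becomes input for the next step, any error in an early prediction carries into later ones, which is worth keeping in mind when forecasting many steps ahead.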