---
title: 'Stock prediction with Multiple Stock data and CNN-Xception'
disqus: hackmd
---

Stock prediction with Multiple Stock data and CNN-Xception
===

## Table of Contents

[TOC]

## Resource-Reference

**RL Overview in Trading (2018)**
https://www.youtube.com/watch?v=c0gpgCyjTM8&t=1147s
https://github.com/rodler/quantcon2018/blob/master/Quantcon-RL-Example.ipynb

**Infrastructure of ML in Multivariate Analysis (2017)**
https://medium.com/@alexrachnog/neural-networks-for-algorithmic-trading-2-1-multivariate-time-series-ab016ce70f57

**Recommended CNN Model**
https://colab.research.google.com/drive/1Lu-oPmcM_3nB9PfEmD9puClyAdKaz-_1?usp=sharing
https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d
https://towardsdatascience.com/understand-the-architecture-of-cnn-90a25e244c7

**Introduction to XceptionCNN Implementation (2020)**
https://towardsdatascience.com/xception-from-scratch-using-tensorflow-even-better-than-inception-940fb231ced9

## Introduction

CNNs tend to outperform other time-series prediction NNs (per Udacity; specifically, when taking ordered blocks of data each iteration). I think this is due to the spatial interpretations made only by CNNs: one can transform samples into meaningful representations rather than relying on step-by-step representations.

A 2018 QuantCon talk presented SVMs as efficient in the trading space, and claimed that technical-analysis indicators are not good predictors/features because of their lagged characteristics. However, transforming raw data into higher-order representations can make it easier and more efficient for a machine learning model to predict. My hypothesis is that supervised learning models require a broader representation of the data in order to make better predictions. However, I have not investigated the space of unsupervised learning models, so feature engineering may still be useful compared to the TA cohorts. **I'm also trying to figure out whether CNNs are better than SVMs.**

For now, I would like to use a universe of stocks as features for trading. These provide more data and are readily available. The open issues are how to make a reward/PnL function to drive model updates, how to determine long/short strategies based on the model, and then how to build the risk management system (maybe after knowing the stochastic process of the PnL). The goal as of now **is to make a prediction model to predict stock data**. The next steps are to see how well the model performs, and **how to trade** based **on model output**.

Information gathered so far:

* **CNNs are better performers** at time-series prediction (empirical evidence from Udacity vs. RNNs and LSTMs).
* **WaveNets**, built on dilated convolutions, empirically **outperform 1D ConvNets**.
* **Xception** is a form of CNN that reduces compute resources. Salient information is extracted efficiently, and it avoids overfitting. Better than **Inception**.
* **TA indicators are lagged features** and shouldn't be used.
* It's a good idea to **use multivariable data** (more than one stock) to make predictions.
* **Higher-order representations** of data (e.g. raw open/close data to smoothed 5-day-average data) help **ML learn better**.
* **ARIMA models outperform vanilla GRU models.** (ARIMA models are still poor, as are RNNs; it's time to move on to CNNs and SVMs.)
* Improved performance comes from **adding predictive factors** and **removing noise (unexplained returns)**.
* **Reward/PnL functions** to train models are **difficult**.

Data Preparation
---

Better performance comes from being able to distinguish trends more easily. We need more data, and feature sets that correlate well with each other (thus, a feature selector can be useful). We can use a smoothed version of the price series to get rid of noise: according to Dr. Tom Starke, we can use a 5-day SMA transformation of our price-series data. For machine learning input, the data must then be transformed; refer to this document to prepare the data, and see the sketch below.
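As a minimal sketch of that preparation (not from the referenced article): assuming a DataFrame of daily closes with one column per ticker (the tickers, the random walk stand-in data, and the 60-day look-back window are all hypothetical), one could smooth with a 5-day SMA and then slice the multivariate series into 2D blocks for a CNN:

```python=
import numpy as np
import pandas as pd

# hypothetical universe of four tickers; replace with real downloaded closes
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    100 + rng.standard_normal((500, 4)).cumsum(axis=0),
    columns=['AAPL', 'MSFT', 'GOOG', 'AMZN'],
)

# 5-day SMA transformation to smooth out noise
smoothed = prices.rolling(window=5).mean().dropna()

# slice the (time x stocks) matrix into rolling 2D blocks for a CNN
window = 60  # look-back length in days; an assumption, tune to the holding period
X = np.stack([
    smoothed.iloc[i:i + window].to_numpy()
    for i in range(len(smoothed) - window)
])
X = X[..., np.newaxis]  # add a channel axis -> (samples, window, n_stocks, 1)
print(X.shape)  # (436, 60, 4, 1)
```

Each sample is then an image-like block (time steps x stocks x 1 channel) that a 2D CNN can consume directly.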
## [CNNs/Xception: Keras set-up w/ Multivariables](https://towardsdatascience.com/xception-from-scratch-using-tensorflow-even-better-than-inception-940fb231ced9)

***CNN Summary***

Block 1: the feature extractor, repeated several times on the inputs; outputs a feature-map vector.
Block 2: the end of the neural network, used for classification; returns the probability of each class (a possible Markovian process) via a logistic function (binary) or a softmax function (multi-class). In terms of stocks, softmax works well.

***CNN Layers***

Convolutional layer: the first layer; detects a set of features (feature = filter). (*In terms of trading, filter windows should be at least as long as the holding period of the stock [e.g. holding a stock for 1 min needs a window size of 1 min or above].*) Filters are determined by the weighting function, similarities, and backpropagation.

Pooling layer: reduces the size of the image while preserving its characteristics; avoids overfitting. The pattern-identification stage.

ReLU correction layer: a non-linear function; an activation function.

Fully-connected layer: the last layer; produces the output vector by applying a linear combination and an activation function to its input. Classifies the image, returning a vector of size N (# of classes), where each element = the probability of the input image belonging to that class (an opportunity to search for a Markovian process).

***Xception Summary***

Learns richer representations with fewer parameters, and computes faster with fewer resources. Separable convolution is better than normal convolution because the computational cost is lower and it doesn't have to transform multiple times. Validation accuracy is typically higher.

***Xception Layers***

Entry flow: two blocks of convolutional layers, each followed by a ReLU activation, then separable convolutional layers (w/ ReLU + max pooling).
![](https://i.imgur.com/ASo02mU.png)

Middle flow: *read more*
![](https://i.imgur.com/XTzeIKL.png)

Exit flow: *read more*
![](https://i.imgur.com/aiIeMLH.png)

**Importing Libraries**

```python=
# libraries required
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Conv2D, Add
from tensorflow.keras.layers import SeparableConv2D, ReLU
from tensorflow.keras.layers import BatchNormalization, MaxPool2D
from tensorflow.keras.layers import GlobalAvgPool2D
from tensorflow.keras import Model
```

**Creating Conv-BatchNorm Block**

```python=
# creating the Conv-Batch Norm block
def conv_bn(x, filters, kernel_size, strides=1):
    # convolution without a bias term; BatchNorm supplies the shift
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=strides,
               padding='same',
               use_bias=False)(x)
    x = BatchNormalization()(x)
    return x
```

:::info
The Conv-Batch Norm block takes as inputs:
1) tensor - x
2) \# of filters - filters
3) kernel size of the conv layer - kernel_size
4) strides of the layer - strides

It applies a convolution layer to x and then applies **Batch Normalization**, which standardizes the inputs.
:::
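As a quick sanity check (not part of the original article), one can run `conv_bn` on a dummy input and confirm the output shape:

```python=
# smoke test for conv_bn: 'same' padding with stride 2 halves height and width
inp = Input(shape=(299, 299, 3))
out = conv_bn(inp, filters=32, kernel_size=3, strides=2)
print(out.shape)  # (None, 150, 150, 32)
```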
**SeparableConv - BatchNorm Block**

```python=
# creating the SepConv-Batch Norm block
def sep_bn(x, filters, kernel_size, strides=1):
    # depthwise-separable convolution, again without a bias term
    x = SeparableConv2D(filters=filters,
                        kernel_size=kernel_size,
                        strides=strides,
                        padding='same',
                        use_bias=False)(x)
    x = BatchNormalization()(x)
    return x
```

:::info
Identical to the Conv-Batch Norm block, except that SeparableConv2D is used instead of a regular Conv2D.
:::

**Functions for Entry, Middle, and Exit Flow**

```python=
# entry flow
def entry_flow(x):
    # two plain conv blocks to start
    x = conv_bn(x, filters=32, kernel_size=3, strides=2)
    x = ReLU()(x)
    x = conv_bn(x, filters=64, kernel_size=3, strides=1)
    tensor = ReLU()(x)

    # three residual modules of separable convs with 1x1 conv skip connections
    x = sep_bn(tensor, filters=128, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=128, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)
    tensor = conv_bn(tensor, filters=128, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = ReLU()(x)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)
    tensor = conv_bn(tensor, filters=256, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = ReLU()(x)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)
    tensor = conv_bn(tensor, filters=728, kernel_size=1, strides=2)
    x = Add()([tensor, x])
    return x

# middle flow
def middle_flow(tensor):
    # eight identical residual modules of three separable convs each
    for _ in range(8):
        x = ReLU()(tensor)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        tensor = Add()([tensor, x])
    return tensor

# exit flow
def exit_flow(tensor):
    # final residual module, then two wide separable convs and the classifier head
    x = ReLU()(tensor)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=1024, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)
    tensor = conv_bn(tensor, filters=1024, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = sep_bn(x, filters=1536, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=2048, kernel_size=3)
    x = GlobalAvgPool2D()(x)
    x = Dense(units=1000, activation='softmax')(x)
    return x
```

**Xception Model Creation**

```python=
# model code
input = Input(shape=(299, 299, 3))
x = entry_flow(input)
x = middle_flow(x)
output = exit_flow(x)

model = Model(inputs=input, outputs=output)
model.summary()
```

**Model Inspection**

A short sketch for inspecting the assembled model follows below.
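As a minimal sketch (the `xception.png` filename is arbitrary, and `plot_model` requires `pydot` and `graphviz` to be installed), the assembled model can be inspected like so:

```python=
# total parameter count (~22-23M for the standard 1000-class Xception)
print(model.count_params())

# confirm the classifier head's output shape: (None, 1000)
print(model.output_shape)

# optional: render the architecture diagram to an image file
tf.keras.utils.plot_model(model, to_file='xception.png', show_shapes=True)
```

###### tags: `CNN` `Documentation`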