---
title: 'Stock prediction with Multiple Stock data and CNN-Xception'
disqus: hackmd
---
Stock prediction with Multiple Stock data and CNN-Xception
===
## Table of Contents
[TOC]
## Resource-Reference
**RL Overview in Trading (2018)**
https://www.youtube.com/watch?v=c0gpgCyjTM8&t=1147s
https://github.com/rodler/quantcon2018/blob/master/Quantcon-RL-Example.ipynb
**Infrastructure of ML in Multivariate Analysis (2017)**
https://medium.com/@alexrachnog/neural-networks-for-algorithmic-trading-2-1-multivariate-time-series-ab016ce70f57
**Recommended CNN Model**
https://colab.research.google.com/drive/1Lu-oPmcM_3nB9PfEmD9puClyAdKaz-_1?usp=sharing
https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d
https://towardsdatascience.com/understand-the-architecture-of-cnn-90a25e244c7
**Introduction to XceptionCNN Implementation (2020)**
https://towardsdatascience.com/xception-from-scratch-using-tensorflow-even-better-than-inception-940fb231ced9
## Introduction
CNNs tend to outperform other neural networks at time-series prediction (per Udacity's empirical comparisons, specifically when taking ordered blocks of data each iteration). I think this is due to the spatial interpretations that only CNNs make: samples can be transformed into meaningful representations rather than relying on step-by-step representations.
A 2018 QuantCon talk presented SVMs as efficient in the trading space, and claimed that technical-analysis indicators aren't good predictors/features because of their lagged characteristics. However, transforming raw data into higher-order representations can make prediction easier and more efficient for a machine learning model.
My hypothesis is that supervised learning models require a broader representation of the data in order to make better predictions. However, I have not investigated the space of unsupervised learning models, so feature engineering may still prove useful compared to the TA cohorts. **I'm also trying to figure out whether CNNs are better than SVMs**.
For now, I would like to use a universe of stocks as features for trading. These provide more data and are readily available.
The open issues now are how to build a reward/PnL function to drive model updates, then how to determine long/short strategies based on the model, and then how to design the risk management system (perhaps after learning the stochastic process of the PnL).
The goal as of now **is to build a prediction model for stock data**. The next steps are to see how well the model performs, and **how to trade** based **on model output**.
Information Gathered so far:
* **CNNs are better performers** at time-series prediction (empirical evidence from Udacity, versus RNNs and LSTMs).
* **WaveNets**, which use dilated convolutions, empirically **outperform 1D ConvNets**.
* **Xception** is a form of CNN that reduces compute requirements; salient information is extracted efficiently, it avoids overfitting, and it is better than **Inception**.
* **TA indicators are lagged features** and shouldn't be used.
* It's a good idea to **use multivariate data** (more than one stock) to make predictions.
* **Higher-order representations** of data (e.g., raw open/close data transformed into smoothed 5-day averages) help **ML learn better**.
* **ARIMA models outperform vanilla GRU models.**
* (Both ARIMA models and RNNs perform poorly; it's time to move on to CNNs and SVMs.)
* Improved performance comes from **adding predictive factors** and **removing noise (unexplained returns)**.
* **Reward/PnL functions** to train models are **difficult**.
Data Preparation
---
Better performance comes from being able to distinguish trends more easily. We need more data, and feature sets that correlate well with each other (thus, a feature selector can be useful). We can use a smoothed version of the price series to get rid of noise; according to Dr. Tom Starke, a 5-day SMA transformation of the price series data works well.
But for machine learning input, the data must be transformed.
Refer to this document to prepare the data.
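As a minimal sketch of that preparation (assuming a `pandas` DataFrame of daily closes, one column per stock; the tickers, the random stand-in data, and the 30-step lookback are placeholders, not from the source):
```python=
import numpy as np
import pandas as pd

# stand-in data: a DataFrame of daily closes, one column per stock
prices = pd.DataFrame({'AAPL': 100 + np.random.randn(300).cumsum(),
                       'MSFT': 100 + np.random.randn(300).cumsum()})

# 5-day SMA to smooth away noise, per Dr. Starke's suggestion
smoothed = prices.rolling(window=5).mean().dropna()

# slice the smoothed multi-stock series into (samples, lookback, n_stocks)
# windows that a convolutional model can consume
lookback = 30
windows = np.stack([smoothed.values[i:i + lookback]
                    for i in range(len(smoothed) - lookback)])
print(windows.shape)  # (266, 30, 2)
```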
## [CNNs/Xception: Keras set-up w/ Multivariables](https://towardsdatascience.com/xception-from-scratch-using-tensorflow-even-better-than-inception-940fb231ced9)
***CNN Summary***
Block 1: the feature extractor; repeated several times on the inputs, outputting a feature-map vector.
Block 2: the end of the network, used for classification; returns the probability of each class (a possible Markovian process) via a logistic function (binary) or a softmax function (multi-class). In terms of stocks, softmax works well.
***CNN Layers***
Convolutional layer: the first layer; detects a set of features (feature = filter). (*In trading terms, the filter window should span a length equal to the holding period or longer, e.g. holding a stock for 1 min needs a window size of 1 min or above.*) Filters are determined by a weighting function, similarities, and backpropagation.
Pooling layer: reduces the size of the image while preserving its characteristics, which avoids overfitting; the pattern-identification stage.
ReLU correction layer: applies a non-linear activation function.
Fully-connected layer: the last layer; produces the output vector by applying a linear combination and an activation function to its input. It classifies the image, returning a vector of size N (the number of classes), where each element is the probability that the input belongs to that class (an opportunity to search for a Markovian process).
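As a minimal sketch of the two-block structure above (the input shape and the three-class output are illustrative assumptions, not from the source):
```python=
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(64, 64, 1))                      # e.g. a 64x64 'image' of price features
x = layers.Conv2D(32, kernel_size=3, padding='same')(inputs)  # convolutional layer: feature detection
x = layers.ReLU()(x)                                          # ReLU correction layer
x = layers.MaxPool2D(pool_size=2)(x)                          # pooling layer: shrink while keeping characteristics
x = layers.Flatten()(x)
outputs = layers.Dense(3, activation='softmax')(x)            # fully-connected layer: N=3 class probabilities
model = Model(inputs, outputs)
```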
***Xception Summary***
Learns richer representations with fewer parameters, and computes faster with fewer resources.
Separable convolution is better than normal convolution because its computational cost is lower, and it doesn't have to transform the data multiple times.
Validation accuracy is typically higher.
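To make the cost difference concrete, here is a quick parameter count comparison (the input shape and filter count are arbitrary choices for illustration):
```python=
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, SeparableConv2D

inp = Input(shape=(64, 64, 128))

# regular convolution: 3*3*128*256 = 294,912 weights
conv = Conv2D(256, kernel_size=3, padding='same', use_bias=False)(inp)
# separable convolution: depthwise 3*3*128 + pointwise 128*256 = 33,920 weights
sep = SeparableConv2D(256, kernel_size=3, padding='same', use_bias=False)(inp)

print(tf.keras.Model(inp, conv).count_params())  # 294912
print(tf.keras.Model(inp, sep).count_params())   # 33920
```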
***Xception Layers***
Entry Flow: two blocks of convolutional layers followed by a ReLU activation block, plus separable convolutional layers (w/ ReLU + max pooling).

Middle Flow: a residual block of three ReLU + separable-convolution layers (728 filters each), repeated eight times (see `middle_flow` below).

Exit Flow: separable convolutions with ReLU, max pooling, and a residual connection, followed by global average pooling and a final softmax dense layer (see `exit_flow` below).

**Importing Libraries**
```python=
# libraries required
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Conv2D, Add
from tensorflow.keras.layers import SeparableConv2D, ReLU
from tensorflow.keras.layers import BatchNormalization, MaxPool2D
from tensorflow.keras.layers import GlobalAvgPool2D
from tensorflow.keras import Model
```
**Creating Conv-BatchNorm Block**
```python=
# creating the Conv-Batch Norm block
def conv_bn(x, filters, kernel_size, strides=1):
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=strides,
               padding='same',
               use_bias=False)(x)
    x = BatchNormalization()(x)
    return x
```
:::info
The Conv-Batch Norm block takes as inputs:
1) a tensor (`x`)
2) the number of filters (`filters`)
3) the kernel size of the convolutional layer (`kernel_size`)
4) the strides of the layer (`strides`)
It applies a convolution layer to `x`, then applies **Batch Normalization**, which standardizes the layer's inputs.
:::
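Batch normalization rescales each batch to zero mean and unit variance before a learned scale and shift: $\hat{x} = \gamma \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$. A quick sanity check of the block, continuing from the imports above (the 299×299×3 input is the standard Xception shape used in the model code later):
```python=
inp = Input(shape=(299, 299, 3))
out = conv_bn(inp, filters=32, kernel_size=3, strides=2)
print(out.shape)  # (None, 150, 150, 32): 'same' padding with stride 2 halves H and W
```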
**SeparableConv - BatchNorm Block**
```python=
# create sepConv-BatchNorm
def sep_bn(x, filters, kernel_size, strides=1):
    x = SeparableConv2D(filters=filters,
                        kernel_size=kernel_size,
                        strides=strides,
                        padding='same',
                        use_bias=False)(x)
    x = BatchNormalization()(x)
    return x
```
:::info
A SeparableConv2D is used instead of a regular Conv2D.
:::
**Functions for Entry, Middle, and Exit Flow**
```python=
# entry flow
def entry_flow(x):
    x = conv_bn(x, filters=32, kernel_size=3, strides=2)
    x = ReLU()(x)
    x = conv_bn(x, filters=64, kernel_size=3, strides=1)
    tensor = ReLU()(x)

    x = sep_bn(tensor, filters=128, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=128, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=128, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = ReLU()(x)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=256, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=256, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = ReLU()(x)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=728, kernel_size=1, strides=2)
    x = Add()([tensor, x])
    return x

# middle flow
def middle_flow(tensor):
    for _ in range(8):
        x = ReLU()(tensor)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        x = sep_bn(x, filters=728, kernel_size=3)
        x = ReLU()(x)
        tensor = Add()([tensor, x])
    return tensor

# exit flow
def exit_flow(tensor):
    x = ReLU()(tensor)
    x = sep_bn(x, filters=728, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=1024, kernel_size=3)
    x = MaxPool2D(pool_size=3, strides=2, padding='same')(x)

    tensor = conv_bn(tensor, filters=1024, kernel_size=1, strides=2)
    x = Add()([tensor, x])

    x = sep_bn(x, filters=1536, kernel_size=3)
    x = ReLU()(x)
    x = sep_bn(x, filters=2048, kernel_size=3)
    x = GlobalAvgPool2D()(x)

    x = Dense(units=1000, activation='softmax')(x)
    return x
```
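A quick sanity check on the three flows (reusing the 299×299×3 input shape from the model code below):
```python=
x = Input(shape=(299, 299, 3))
y = entry_flow(x)
print(y.shape)               # (None, 19, 19, 728)
z = middle_flow(y)
print(z.shape)               # (None, 19, 19, 728): middle flow preserves shape
print(exit_flow(z).shape)    # (None, 1000): global pooling + softmax classifier
```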
**Xception Model Creation**
```python=
# model code
input = Input(shape = (299,299,3))
x = entry_flow(input)
x = middle_flow(x)
output = exit_flow(x)
model = Model(inputs=input, outputs=output)
model.summary()
```
**Model Inspection**
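A minimal sketch of how to inspect the model with standard Keras utilities (the output file name is a placeholder; `plot_model` additionally requires `pydot` and Graphviz to be installed):
```python=
# layer-by-layer output shapes and parameter counts
model.summary()

# total parameter count
print(model.count_params())

# render the architecture to an image file
tf.keras.utils.plot_model(model, to_file='xception.png', show_shapes=True)
```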
###### tags: `CNN` `Documentation`