# SPO+ Framework on Cryptocurrency Trading

LIN Ruihan 20547872

IEDA4500 Final project

## 1. Introduction

Among financial applications, high-frequency trading has recently become one of the fastest-growing and most popular fields. At its core, high-frequency trading is a mixture of technology and finance, and signal processing and trading philosophy are two of its central topics. In this project, I first describe the signal processing, then the machine learning method, and finally a special optimization problem that addresses the fact that, most of the time, we care about decision-making error rather than prediction accuracy.

### Preliminaries

Data before resampling:

1. time: the market execution time (UTC)
2. price: the executed price
3. side: True represents sell & bid, False represents buy & ask.

![](https://i.imgur.com/5AIABOZ.png =400x400)

### Resample and Add Features

During resampling, sum_of_size and sum_of_vol (the sum of side×size) are computed as additional features.

![](https://i.imgur.com/JA9HQEs.png)

Extra features are then added. Here add_sarmacd_feature is a pre-defined function that adds SAR- and MACD-related features to the dataframe.

```
df = add_sarmacd_feature(df, param)

# 'inc' flag as defined in this project: 1 when open > close, otherwise 0
df.loc[df.open > df.close, 'inc'] = 1
df.loc[df.open <= df.close, 'inc'] = 0

# One-bar relative changes of the close, open, high and low prices
df['return_c'] = df.close.diff()/df.close.shift()
df['return_o'] = df.open.diff()/df.open.shift()
df['return_h'] = df.high.diff()/df.high.shift()
df['return_l'] = df.low.diff()/df.low.shift()
```

After this step there are 24 features in total.

![](https://i.imgur.com/YJM2wDF.png =800x50)

Next, we add information from previous time bars to the current one, e.g., the open price one minute earlier.
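As a rough illustration of the two steps so far (resampling, then lagging), the sketch below runs them on a handful of made-up trades. It is not the project's code: the toy data, the `size` column name, the 1-minute bar length, and the literal reading of side×size (with `side` treated as 0/1) are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw trades; 'size' (traded quantity) is an assumed column name.
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2021-01-01 00:00:05",
                "2021-01-01 00:00:40",
                "2021-01-01 00:01:10",
                "2021-01-01 00:01:50",
            ]
        ),
        "price": [100.0, 101.0, 102.0, 101.5],
        "size": [1.0, 2.0, 1.5, 0.5],
        "side": [False, True, False, False],  # True = sell, False = buy
    }
).set_index("time")

# side x size taken literally, with side as 0/1 (an assumption;
# a signed +1/-1 convention would be another possible reading).
trades["side_x_size"] = trades["side"].astype(float) * trades["size"]

# Resample the trades into 1-minute bars (bar length assumed).
grouped = trades.resample("1min")
bars = grouped["price"].ohlc()  # columns: open, high, low, close
bars["sum_of_size"] = grouped["size"].sum()
bars["sum_of_vol"] = grouped["side_x_size"].sum()

# Lag feature: the open price of the previous bar.
bars["open_pre1"] = bars["open"].shift(periods=1)

print(bars)
```

The first bar has no predecessor, so its `open_pre1` is NaN; rows like this are the ones a later `dropna` call would remove.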
**Only the features listed in `columns` receive lagged copies.**

```
def add_column(df_processed):
    # Features for which values from previous time bars are added
    columns = ['open', 'high', 'low', 'close', 'volume', 'vwap', 'sar', 'macd', 'macdsignal', 'sum_of_size', 'sum_of_vol','return','inc']
    # hp_param is the hyperparameter dict loaded elsewhere in the project
    for i in range(1,hp_param['num_prev_timebar']+1):
        for col in columns:
            df_processed[col+'_pre'+str(i)] = df_processed[col].shift(periods=i)

add_column(df)
df.dropna(inplace=True)
```

The final dataframe has 105 columns/features in total. **Because trading is rolling-based, the tags (output variables) are defined during rolling.**

![](https://i.imgur.com/xy29VRq.png)

## Rolling-Based Trading Algorithm

### Rolling

**Step 1: Load initial data.**

![](https://i.imgur.com/oOymP5e.png =400x250)

**Step 2: Create tags (output variables).**

1. max_up: the maximum return in the coming 0.5 hour.
2. max_down: the maximum drop in the coming 0.5 hour.

![](https://i.imgur.com/iYTg6IQ.png)

**Step 3: Train the XGBoost models on the training data.**

Two models predict max_up and max_down. Both targets are continuous, so regressors with the squared-error objective are used:

```
import xgboost as xgb
clf_up = xgb.XGBRegressor(objective='reg:squarederror', random_state=1, max_depth=10, n_estimators=30)
clf_down = xgb.XGBRegressor(objective='reg:squarederror', random_state=1, max_depth=10, n_estimators=30)
```

**Step 4: Use the models for prediction over the coming hour.**

![](https://i.imgur.com/CnjZhun.png)

**Step 5: After 1 hour, update the models and repeat the steps above.**

![](https://i.imgur.com/K4BAHth.png)

### Generate Signal

#### Choosing an appropriate portfolio

Suppose our portfolio pool contains $N$ different cryptocurrencies, so that we can construct a weight vector $w \in\mathbb R^N$. We denote the predicted maximum growth and maximum drop by $r_{up}$ and $r_{down}$ respectively. We need to solve the following optimization problem:

\begin{equation*}
\begin{aligned}
& \underset{w\in\mathbb R^N}{\text{minimize}} & & -\lambda_1 w^Tr_{up}+\lambda_2w^Tr_{down}+w^T\Sigma w \ \ \ \ (1)\\
& \text{subject to} & & w\geq 0, \\
&&& w^Te = 1.
\end{aligned}
\end{equation*}

Here, $\lambda_1$ and $\lambda_2$ are user-defined coefficients, $e$ is the all-ones vector in $\mathbb R^N$ (so the weights sum to one), and $\Sigma$ is the semi-covariance matrix (we obtain $\Sigma$ from historical data and treat it as a constant matrix; *details can be seen in the code*). After choosing an appropriate $w$, we check whether the resulting portfolio satisfies our trading conditions.

#### Buy Signal

A buy signal is sent if both of the following conditions are met:

1. $w^Tr_{up}$ > upper threshold
2. $w^Tr_{down}$ > lower threshold

The graph below shows a buy signal at t=0.

![](https://i.imgur.com/bk2RFDc.png)

After a buy signal is generated, we hold the position until the sell conditions are met. During the holding period, we **never** send an additional buy signal.

#### Sell Signal

During the 1-hour holding period, a sell signal is generated as soon as the price reaches the stopprofit line or the stoploss line shown below.

![](https://i.imgur.com/DH3yky7.png)

Note: the upper and lower thresholds, as well as the starting points of the stopprofit and stoploss lines, are **determined by the stopprofit and stoploss levels**. More specifically, they are obtained by multiplying the stopprofit and stoploss levels by constants. (All hyperparameters should be included in the yaml file.)

### Inspiration

The SPO+ framework provides a quick and convenient way to minimize **decision error** instead of **model error**. This makes it possible to apply the technique to quantitative trading by specifying:

1. an underlying optimization problem whose decision variable is the buy decision;
2. an appropriate predictor that fits the SPO+ framework;
3. a regular predictor to quantify our decision.

For simplicity, we do not consider the portfolio problem formulated in $(1)$; instead, we focus on a single coin.

## 2. Review on SPO+ Framework

The SPO+ framework considers a nominal optimization problem of the form

\begin{equation*}
\begin{aligned}
& \underset{w}{\text{minimize}} & & c^Tw \ \ \ \ (2)\\
& \text{subject to} & & w\in\mathbb S.
\end{aligned}
\end{equation*}

Here, $w$ is the decision variable, $c$ is the cost vector, and $\mathbb S$ is the feasible set. We denote the optimal $w$ given $c$ by $w^*(c)$, and the optimal value $c^Tw^*(c)$ by $z^*(c)$. Under the SPO+ framework, $c$ is not known when the decision is made, and a predictor $P$ is used to produce a predicted cost $\hat{c}$. **(Notice that the hypothesis class should be invariant under scalar multiples, i.e., $P \in \mathbb H$ implies $\alpha P \in \mathbb H$ for any $\alpha$, where $\mathbb H$ is the hypothesis class.)**

Under the SPO+ framework, the loss function of $P$ is chosen to be

$$L_{SPO} := \underset{w\in\mathbb W^*(\hat{c})}{\max}\ \{ c^Tw\} - z^*(c)\ \ \ \ (3)$$

where $\mathbb W^*(\hat{c})$ is the set of optimal solutions of $(2)$ under the predicted cost $\hat{c}$. In words, it is the excess true cost incurred by acting on $\hat{c}$ instead of $c$. It has the convex approximation

$$L_{SPO}^+ := \underset{w\in\mathbb S}{\max}\ \{ c^Tw - 2\hat{c}^Tw\} +2\hat{c}^Tw^*(c)-z^*(c)\ \ \ \ (4)$$

with subgradient

$$G = 2(w^*(c)-w^*(2\hat{c}-c))\ \ \ \ (5)$$

For more details, see *"Smart 'Predict, then Optimize'"* by Elmachtoub and Grigas.

## 3. Define Underlying Optimization Problem

In our trading algorithm, we define $w\in \{1,0\}$, **where $1$ represents buy and $0$ represents do nothing.** The general principle for choosing $c$ is that it must satisfy

$$ \begin{cases} c\geq0& \text{buy condition not satisfied}\\ c<0& \text{buy condition satisfied} \end{cases}$$

so that minimizing $cw$ over $w \in \{0,1\}$ gives $w^*(c) = 1$ exactly when the buy condition holds. One possible choice, which will be used later, is

$$c := 5\times \text{maxdrop} - \text{maxup}\ \ \ \ (6)$$

**This means that a buy signal can be sent only when the predicted maximum rise is more than 5 times the predicted maximum drop.** One concern is that this definition ignores magnitude: for example, $(\text{maxup}, \text{maxdrop}) = (0.001,0.00019)$ is treated the same as $(\text{maxup}, \text{maxdrop}) = (0.1,0.019)$.
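This concern can be checked with a few lines of Python. The sketch below evaluates $(6)$ and the resulting decision $w^*(c)$ for the two pairs; the function names and the `ratio` parameter are hypothetical and introduced only for this illustration.

```python
def cost(max_up: float, max_drop: float, ratio: float = 5.0) -> float:
    """Cost c from equation (6): c = ratio * maxdrop - maxup."""
    return ratio * max_drop - max_up


def w_star(c: float) -> int:
    """Minimizer of c * w over w in {0, 1}: buy (1) only when c < 0."""
    return 1 if c < 0 else 0


small = cost(max_up=0.001, max_drop=0.00019)
large = cost(max_up=0.1, max_drop=0.019)

# Both costs are negative, so both pairs lead to the same decision (buy),
# even though the predicted moves differ by a factor of 100.
print(w_star(small), w_star(large))
```

The decision depends only on the sign of $c$, which is why the size of the expected move is lost.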
To address this issue, and because the SPO+ model itself has poor predictive ability, we use a general predictor (e.g., XGBoost) to quantify our decision. **This means that we make the buy decision only when $w=1$ and the maximum return predicted by XGBoost is higher than a desired level, e.g., 0.01.** Also, with this definition of $w$ and $c$ (taking $w^*(c) = 0$ when $c = 0$, as in the cases above), the solution $w^*(c)$ is unique, so the $\max$ in $(3)$ can be dropped without concern.

## 4. Choose Appropriate Predictor

After defining the underlying optimization problem, we need to choose a suitable predictor. As mentioned above, the chosen predictors should satisfy the multiplier-invariance property. In this part, several predictors will be included. For illustration, an explicit example on ETH trading is given in part 5.

### Linear Regression Method

It can be shown that every linear hypothesis class satisfies the multiplier-invariance property, so we use the simplest linear regression model for our trading algorithm. As noted in the paper, even such a simple model can achieve very low decision error, even when the linearity assumption on $P$ is incorrect. We define

$$\hat{c}_i := P(x_i)= \beta^T x_i,\quad x_i,\beta\in\mathbb R^n\ \ \ \ (7)$$

One can easily verify that the SPO+ loss becomes

$$L_{SPO}^+ :=\underset{w\in\ \{0,1\}}{\max}\ \{ (c_i-2\hat{c}_i)w \} +2\hat{c}_iw^*(c_i)-z^*(c_i)\ \ \ \ (8)$$

with subgradient

$$G(\hat{c}_i) = 2(w^*(c_i)-w^*(2\hat{c}_i-c_i))\ \ \ \ (9)$$

Then, by convexity and the chain rule, a subgradient with respect to $\beta$ is

$$G(\beta) = 2(w^*(c_i)-w^*(2\beta^T x_i-c_i))\,x_i\ \ \ \ (10)$$

With these settings, we can apply stochastic gradient descent to obtain the optimal $\beta$.

## 5. Implementation on ETH Market

### Settings

Trading principles:

1. A buy signal is generated only when the SPO+ framework gives $w^*(\hat{c}) = 1$ and the maximum growth predicted by XGBoost is greater than 0.0125.
2. Stoploss and stopprofit follow the "Generate Signal / Sell Signal" part, which specifies a triangular holding region.

### Result analysis

![](https://i.imgur.com/jSmrE3r.png)

![](https://i.imgur.com/y1nl2ch.png)

![](https://i.imgur.com/XPh7tjj.png)

The backtest suggests that the strategy is applicable in a real market; however, we observed a significant drawdown. We believe it can be mitigated by restricting the trading frequency to avoid consecutive losses.

## Reference

Adam N. Elmachtoub, Paul Grigas (2017), Smart "Predict, then Optimize", https://arxiv.org/abs/1710.08005