# Identifying arbitrage signals

Before discussing the aim of this work, some relations and terms need to be introduced.

### Price alignment, Arbitrage and Impermanent Loss

During the price alignment of different markets, Impermanent Loss can occur: the value of the market's pool is reduced by the arbitrage trades that re-align the prices. This can be understood in the following way. Assume the market's pool contains assets $A_1, A_2, \dots, A_n$ with corresponding internal prices $p_i \propto \frac{1}{A_i}$, that is, the smaller the supply of an asset within the market's pool, the higher its price. This need not be the case in general, but it allows to understand the origin of *Impermanent Loss*. If, during a *swap*, the market takes in an amount $\delta A_1$ of $A_1$ and gives back an amount $\delta A_2$ of $A_2$, the internal price of $A_2$ increases whilst the internal price of $A_1$ decreases. If outside the considered market it is possible to attain different prices $p^\text{external}_1 > p_1$ and $p^\text{external}_2 < p_2$, one can profit from a swap within the market from $A_2$ to $A_1$ followed by a swap outside the market (that is, at a different market) back from $A_1$ to $A_2$. Such an *arbitrage trade* is profitable until the prices are aligned, that is, $p^\text{external}_1 = p_1$ and $p^\text{external}_2 = p_2$. During this price alignment the market gives away the asset that has risen in price and accumulates the asset that has fallen in price. Consequently, the overall value of the market's pool decreases during price alignment. This loss, called *Impermanent Loss*, is a burden to the liquidity providers.

### The aim of this work

By identifying and reacting to the aforementioned arbitrage trades, the price alignment may be sped up, for instance via trading bots, a dynamic fee or a modification of the invariant. That is, the price alignment of the AMM may be achieved with less trade volume.
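The loss can be made concrete in a minimal sketch (Python, for illustration only): a two-asset pool with a constant-product invariant $x \cdot y = k$, a common special case consistent with $p_i \propto \frac{1}{A_i}$. After an external price move, arbitrage re-aligns the internal price, and the pool ends up worth less than simply holding the reserves. The function name is illustrative, not part of this work.

```python
import math

def pool_value_after_realignment(x, y, p_ext):
    """Re-align a constant-product pool x*y = k to the external price p_ext
    (price of asset X in units of Y); return (value if held, value of pool)."""
    k = x * y
    # The internal price of X in the constant-product model is y/x.
    # Arbitrage trades until y'/x' = p_ext while keeping x'*y' = k:
    x_new = math.sqrt(k / p_ext)
    y_new = math.sqrt(k * p_ext)
    value_hold = x * p_ext + y          # value if the reserves had been held
    value_pool = x_new * p_ext + y_new  # pool value after arbitrage realignment
    return value_hold, value_pool

# Pool starts balanced at internal price 1 (x = y = 100); external price moves to 4.
hold, pool = pool_value_after_realignment(100.0, 100.0, 4.0)
print(hold, pool)  # pool < hold: the gap is the Impermanent Loss
```

The gap between the two values is the Impermanent Loss; the less re-balancing volume the alignment requires, the smaller this gap.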
This reduction in trade volume directly reduces the Impermanent Loss: the change in the market's internal supplies required to achieve price alignment shrinks, and the market is not forced to give away as much of the asset that has risen in price.

The aim of this work is *not* to identify arbitrage opportunities with profitable trade direction, that is, price gaps and the direction of profitable trades. It aims only at identifying potential arbitrage signals: we call any hint at ongoing price movements or present price differences an *arbitrage signal*, that is, a hint at the possible presence of an arbitrage opportunity. This information alone may already be enough to allow faster price alignment. One can suspect that the identification of such arbitrage signals is faster and/or easier than the identification of profitable trades. This would enable AMMs to perform a kind of arbitrage front-running: dynamically readjust AMM-internal parameters so that the price alignment is done with less trading volume.

The work is based on the analysis of prices and volumes of different trading pairs at different markets. Due to the sparse and noisy data, some tricks have to be brought to use and the results should be interpreted with care. The central question to be answered is:

**Can an AMM acquire more information about the external market so that a faster price alignment becomes possible?**

Here, we consider "faster" a synonym for a reduced requirement of trading volume until price alignment is achieved.

First, the data used for the analysis is presented. This is followed by a short discussion of the statistical methods brought to use and leads to the numerical results. In the conclusion, possible usage scenarios are discussed.

## Formalism and Method

We collect data on the prices and volumes of different trading pairs with fixed left side for many different market places.
That is, we look at the average price and volume of trades $A_0 \leftrightarrow A_i$ with $i \neq 0$. To compare different right sides of the trade, we refer to the US-Dollar as numeraire, that is, we look at $A_0 \leftrightarrow (A_i \text{ in } \text{USD})$. For the fixed left sides $i \in \{\text{ETH}, \text{BTC}, \text{DOT}, \text{KSM}, \text{AAVE}, \text{ADA}\}$ a time series of data is collected. Within this time series we look for correlations between price, price changes, volume and volume changes, as well as measures for the corresponding spreads: we compare the standard deviation to an error estimate based on the [Jackknife](https://en.wikipedia.org/wiki/Jackknife_resampling).

### Data acquisition

The data was acquired by parsing the market section of [coinranking.com](https://coinranking.com/de/markets?search=ETH) for the different assets in intervals of 10 minutes over 3 days. An overview is given in the following.

![](https://i.imgur.com/8X1XZ4L.png)

Due to broken internet connections and the need to revise the script, the data contains holes and is far from dense. The prices are calculated as volume-weighted averages (shown in blue, with error bars given by the standard deviation) and with equal weights (shown in orange, with error bars calculated via the [Jackknife](https://en.wikipedia.org/wiki/Jackknife_resampling)). For each point in time we store the data as

```Mathematica
{
  {
    DateObject[{2021, 10, 4, 11, 15, 51.696385`9.466035025075556}, "Instant", "Gregorian", 2.],
    {
      {2.18, 3.2976*^8, "ADA Binance/ USDT"},
      {2.18, 1.2504*^8, "ADA Coinbase Pro/USD"},
      ...
    }
  },
  {
    DateObject[{2021, 10, 4, 11, 27, 24.7374115`9.145929177194422}, "Instant", "Gregorian", 2.],
    {
      {2.18, 3.2626*^8, "ADA Binance/ USDT"},
      {2.19, 1.2577*^8, "ADA Coinbase Pro/USD"},
      ...
    }
  },
  ...
}
```

Each "DateObject" is of the form

```Mathematica
DateObject[{Year, Month, Day, Hour, Minute, Seconds}, "Instant", "Gregorian", 2.]
```

with the date and time information corresponding to the moment at which our data source was accessed. A single data element is comprised of

```
{Price, 24h-Volume, TradingPairString}
```

with the "TradingPairString" collecting the symbols of the two sides of the trading pair and the market to which the data refers. The data was collected using a *Wolfram Language Script* that parses the HTML response of an HTTP request to https://coinranking.com/de/markets?search=ETH, with "ETH" running over the different fixed left sides.

```Mathematica
this = "C:\\...\\tecOMNIPool";
assets = {"ETH", "DOT", "BTC", "KSM", "AAVE", "ADA"};

(********LOAD PRICE DATA FROM COINRANKING********)
prices[token_, maxpages_] := Module[{dat, prices, pages, page, dt, t = token, mpages = maxpages},
  dt = Now;
  dat = Import["https://coinranking.com/de/markets?search=" <> ToString[token], "Data"];
  prices = {#[[2]], #[[3]], If[Length[#[[1]]] > 0, StringReplace[ToString@Select[#[[1]], \[Not]NumericQ[#]&], {"-\n"->"/", "\n"->"", "-"->"", " "->"", "/ "->"/", "/ "->"/"}], "UNKNOWN"]} & /@ (ToExpression[StringReplace[StringDrop[#, 2], {"."->"", ","->".", "Millionen"->"*10^6", "Milliarden"->"*10^9"}]] & /@ (Select[dat[[1, 2, 2, 2, ;;, {1, 2, 3}]], StringContainsQ[#[[1]], " " <> ToString[token] <> "/" ~~ __] &][[;;, {1, 2, 3}]]));
  pages = Min[Round[ToExpression[StringReplace[ToString[dat[[1, 2, 3, 1, 2]]], {"."->""}]]/Length@dat[[1, 2, 2, 2]], 1], mpages];
  If[pages > 1,
    prices = Flatten[Append[Table[
      {#[[2]], #[[3]], If[Length[#[[1]]] > 0, StringReplace[ToString@Select[#[[1]], \[Not]NumericQ[#]&], {"-\n"->"/", "\n"->"", "-"->"", " "->"", "/ "->"/", "/ "->"/"}], "UNKNOWN"]} & /@ (ToExpression[StringReplace[StringDrop[#, 2], {"."->"", ","->".", "Millionen"->"*10^6", "Milliarden"->"*10^9"}]] & /@ (Select[dat[[1, 2, 2, 2, ;;, {1, 2, 3}]], StringContainsQ[#[[1]], " " <> ToString[token] <> "/" ~~ __] &][[;;, {1, 2, 3}]]))
    , {page, 2, pages}], prices], 1],
    Nothing];
  {dt, Select[prices, ToString[#] != "String[]" &]}
]

(********ACCUMULATE THE DATA********)
Table[
  dat = {};
  (*LOAD EXISTING*)
  If[FileExistsQ[FileNameJoin[{this, assets[[i]] <> ".json"}]], Get[FileNameJoin[{this, assets[[i]] <> ".json"}]], Nothing];
  AppendTo[dat, prices[assets[[i]], 40]];
  (*DELETE BACKUP*)
  DeleteFile[FileNameJoin[{this, assets[[i]] <> "_OLD.json"}]];
  (*CREATE NEW BACKUP*)
  RenameFile[FileNameJoin[{this, assets[[i]] <> ".json"}], FileNameJoin[{this, assets[[i]] <> "_OLD.json"}]];
  (*SAVE DATA*)
  Save[FileNameJoin[{this, assets[[i]] <> ".json"}], dat];
, {i, 1, Length[assets]}];
```

The script accumulates the data in corresponding *.json files, though the file ending is misleading: the files contain Wolfram Language expressions written via `Save`, not JSON. The script was run via a bash script that in turn was executed as a *scheduled task* on Windows every 10 minutes.

### Relevant observables

The following observables are of relevance:

| Symbol | Calculation |
| -------- | -------- |
| $p_\text{equal}$ | Equally weighted average of all prices of different trading pairs with fixed left side, denoted in USD, for a given moment in time. |
| $p_\text{volume}$ | Volume-weighted average of all prices of different trading pairs with fixed left side, denoted in USD, for a given moment in time. |
| $\langle\text{vol}\rangle$ | Average over the volumes of all trading pairs with fixed left side, denoted in USD, for a given moment in time. |
| $\%\Delta_\text{std}p_\text{equal}$ | Relative standard deviation (in per cent) of the price $p_\text{equal}$ of different trading pairs with fixed left side for a given moment in time. |
| $\%\Delta_\text{jack}p_\text{equal}$ | Relative Jackknife deviation (in per cent) of the price $p_\text{equal}$ of different trading pairs with fixed left side for a given moment in time. |
| $\%\Delta_\text{std}p_\text{volume}$ | Relative standard deviation (in per cent) of the price $p_\text{volume}$ of different trading pairs with fixed left side for a given moment in time. |
| $\%\Delta_\text{jack}p_\text{volume}$ | Relative Jackknife deviation (in per cent) of the price $p_\text{volume}$ of different trading pairs with fixed left side for a given moment in time. |
| $\Delta_\text{std}\text{vol}$ | Standard deviation of the volumes of different trading pairs with fixed left side for a given moment in time. |
| $\Delta_\text{jack}\text{vol}$ | Jackknife deviation of the volumes of different trading pairs with fixed left side for a given moment in time. |
| $\mid d_t p_\text{equal} \mid$ | Measure of the change of $p_\text{equal}$ during one minute, obtained by comparing one moment in time to the next. |
| $\mid d_t p_\text{volume} \mid$ | Measure of the change of $p_\text{volume}$ during one minute, obtained by comparing one moment in time to the next. |

In addition to these, the changes of all variables that refer to a single moment in time have also been taken into account. To compactify the notation, consider the following scheme:

* $p_i$ denotes an average price. With $i=\text{volume}$ it is volume-weighted, with $i=\text{equal}$ it is equally weighted.
* $\Delta_i$ denotes a measure of spread of whatever follows the $\Delta_i$. With $i=\text{std}$ it is the standard deviation, with $i=\text{jack}$ it is based on the Jackknife method.
* A $\%$ denotes that the observable that follows is given in per cent of a respective base observable. This base observable should be clear from the context.
* $d_t$ denotes the change of the observable that follows the $d_t$. It is calculated by a difference quotient and normalized to a duration of one minute.

One can assume that a change in prices as well as the spread of the actual prices are related to arbitrage opportunities. The work is based on the following assumptions:

* The spread of the actual prices reflects different prices for the same asset, which can allow to make profit by buying the asset at a low price at one market and selling it at a higher price at another market.
* A change in prices causes a momentary misalignment of different prices at different markets. The prices are usually re-aligned nearly instantaneously by arbitrage trading.

We will look for correlations with those observables and consider any non-vanishing correlation an *arbitrage signal*, that is, a signal for the presence of an arbitrage opportunity or for ongoing arbitrage trades.

### Correlation analysis

Using the [*CorrelationTest*](https://reference.wolfram.com/language/ref/CorrelationTest.html) of the [*Wolfram Mathematica*](https://www.wolfram.com/mathematica/) language, we searched for correlations by testing different $H_0$-hypotheses. The tests were based on a [$\chi^2$-test](https://en.wikipedia.org/wiki/Chi-squared_test) with respect to the [Pearson correlation coefficient](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient). Due to the strongly fluctuating data, some tricks were brought to use: we checked different $H_0$-hypotheses to estimate the probability density for the correlation coefficient, that is, we determined the p-value for correlations of value $-0.95$, $-0.75$, $-0.5$, $-0.25$, $0$, $0.25$, $0.5$, $0.75$ and $0.95$.
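This scan over $H_0$ values can be sketched as follows — a minimal re-implementation in Python rather than the Wolfram Language used for the actual analysis, with the textbook Fisher z-transform test for $H_0: \rho = \rho_0$ standing in for *CorrelationTest* (all function names here are illustrative):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

def p_value(r, rho0, n):
    """Two-sided p-value for H0: rho = rho0 via the Fisher z-transform."""
    z = math.atanh(r) - math.atanh(rho0)
    stat = abs(z) * math.sqrt(n - 3)
    return math.erfc(stat / math.sqrt(2))  # = 2 * (1 - Phi(stat))

GRID = (-0.95, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 0.95)

def weighted_correlation_estimate(xs, ys, grid=GRID):
    """Take the p-values over the H0 grid as weights and return the
    weighted average and spread of the correlation coefficient."""
    r, n = pearson(xs, ys), len(xs)
    weights = [p_value(r, rho0, n) for rho0 in grid]
    total = sum(weights)
    mean = sum(w * rho0 for w, rho0 in zip(weights, grid)) / total
    var = sum(w * (rho0 - mean) ** 2 for w, rho0 in zip(weights, grid)) / total
    return mean, math.sqrt(var)
```

A non-vanishing signal is then an estimate whose weighted mean lies more than one spread away from zero.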
The resulting p-values of these tests can be taken as weights for the respective values of the correlation and allow to estimate an average value and a standard deviation for the correlation coefficient: the higher the p-value, the higher the probability that the $H_0$-hypothesis is compatible with the data. We then looked for non-vanishing correlations with respect to the aforementioned average value and standard deviation, that is, we looked for average correlations that are at least one standard deviation away from zero. The respective tests were repeated for the different fixed left sides of the trading pairs, so that the resulting error bar allows to check the stability of the estimated probability density for the different values of the correlation. Due to the strong noise in the data, the values presented in the following should not be over-interpreted. The only stable information may be the sign of the non-vanishing correlations.

## Results

The following graphics show the average value of the correlation as a green line, with the standard deviation as an opaque rectangle. The p-values of the corresponding correlation tests are shown as blue points joined by a blue line; they can be considered an estimate of a probability density. The vertical axis is logarithmic and the ticks correspond to the p-values of the tests. An asymmetric structure hints at non-vanishing correlations.

One finds a weak signal for a correlation between the absolute value of a price change and the change of the standard deviation of the volumes of the corresponding trading pairs:

![](https://i.imgur.com/WZi08iL.png)

This correlation can be understood intuitively in the following way: assume that a price change between assets A and B occurs, but the prices between A and all other assets are unchanged. This causes increased trading between the assets A and B, but no increase in other trading pairs.
In the ensemble of volumes from which the standard deviation of volumes is calculated, the respective trading pair will acquire more volume and become an outlier. This has a stronger influence on the standard deviation than on the Jackknife deviation, because the Jackknife is less sensitive to outliers. From this we can extract our first arbitrage signal:

**If the volume of a trading pair is higher than those of other trading pairs with identical left side, it may hint at ongoing price changes.**

This signal is quite weak, but it is strengthened by the following two arguments:

**Argument 1:** When the changes of two observables are correlated, one can expect that the correlation between the observables themselves is also non-vanishing. Since we looked at the absolute value of the change in price, there is no information about the direction in which the price changes, and the following result should be taken with care: the standard deviation of the volumes can correlate with the price

![](https://i.imgur.com/R0y0wNp.png)

and the volume itself can correlate with the price.

![](https://i.imgur.com/TgGjhEA.png)

This should be taken with care, as it is prone to over-interpretation and misleading conclusions: one can expect that the sign of the correlation may change for different market phases and that the correlator vanishes if a longer time period is taken into account.

**Argument 2:** Arbitrage trading is a process that counteracts price deviations. From that we can expect negative correlations between any arbitrage signal and the change of the same signal. This is strongly related to a *mean reversion tendency*. We find such a negative correlation for the Jackknife deviations of the volumes:

![](https://i.imgur.com/PDMULgL.png)

and also the following negative correlation is found:

![](https://i.imgur.com/7BE0uaD.png)

One can suspect that a similar signal for a negative correlation between the standard deviation of the volumes and the change of the standard deviation of the volumes is lost due to noise.
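The mean-reversion expectation behind Argument 2 can be illustrated on synthetic data (a Python sketch, not part of the original analysis): for any mean-reverting, AR(1)-like series — the assumed behaviour of an arbitrage signal — the correlation between the signal and its own subsequent change is negative.

```python
import math
import random

def level_change_correlation(series):
    """Pearson correlation between x_t and its change d_t x = x_{t+1} - x_t."""
    levels = series[:-1]
    changes = [b - a for a, b in zip(series, series[1:])]
    n = len(levels)
    ml, mc = sum(levels) / n, sum(changes) / n
    cov = sum((l - ml) * (c - mc) for l, c in zip(levels, changes))
    sl = math.sqrt(sum((l - ml) ** 2 for l in levels))
    sc = math.sqrt(sum((c - mc) ** 2 for c in changes))
    return cov / (sl * sc)

# Mean-reverting toy "arbitrage signal": x_{t+1} = 0.7 * x_t + noise.
rng = random.Random(7)
xs = [0.0]
for _ in range(2000):
    xs.append(0.7 * xs[-1] + rng.gauss(0.0, 1.0))
print(level_change_correlation(xs))  # negative, as mean reversion implies
```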
Besides ongoing price changes, momentary price differences can also give rise to arbitrage opportunities. One can estimate such price differences via the standard deviation or Jackknife deviation of the prices. Here, too, a relation to the volumes can be found:

![](https://i.imgur.com/ebKYsDa.png)

This correlation can in fact be considered a causal effect of the correlation between price changes and the standard deviation of volumes: a price change may induce an increased momentary standard or Jackknife deviation of the prices. The resulting arbitrage opportunities induce arbitrage trades between the respective trading pairs, which in turn cause an increased volume of these trading pairs. This in turn causes increased standard or Jackknife deviations of the volumes. Hence one can suspect:

**A change in the standard or Jackknife deviation of the volumes can be related to momentary price differences.**

One finds the expected negative correlation of the respective deviations with their changes:

![](https://i.imgur.com/sw7tIgj.png)

![](https://i.imgur.com/UuMZs9u.png)

## Conclusion

By looking at the volumes of different trading pairs, an AMM could gain additional information about external market prices that may allow for a faster price alignment, if this information is allowed to impact the internal price definition: **we can answer the research question with a "yes".** There are different possibilities for doing so: trading bots, dynamic fees or a more complex invariant, to list some.

### Possible usage scenarios

A very rough approach towards different usages of arbitrage signals shall be presented in the following.

#### Trading Bots

Bots could be funded by the AMM to buy both sides of a trading pair for which arbitrage is signalled. The amount that the bot temporarily holds would reduce the internal liquidity of this trading pair and cause an increased slippage.
This may result in faster price alignment: less trading volume causes a larger price movement than without the bot's activity. Trading bots may be the easiest way of speeding up price alignment and can be added to existing AMMs. The price movement within the AMM can be related to slippage: the smaller the supplies the AMM has of a specific token, the more sensitive the price of this token is to volume. For the assumed proportionality $p_i \propto \frac{1}{A_i}$, the sensitivity can be estimated as $d p_i \propto -\frac{1}{A_i^2} dA_i$ with $dA_i$ corresponding to the trade size. The factor $\frac{1}{A_i^2}$ governs the sensitivity of the price with respect to trade sizes, that is, volume. By temporarily removing liquidity from a specific trading pair it may thus be possible to make the price more sensitive to volume: a larger price movement can be caused by less volume. Once the prices are aligned, the liquidity would be added again with respect to the new prices. When re-adding the liquidity, the bot will give back all of one asset, but only a part of the other asset due to the change in price. That is, the bot keeps a part of an asset. This part would be lost to the arbitrageurs without the bot, but is preserved with the bot. It could be put into a side pool that acts as an IL insurance for the liquidity providers and allows for a self-funding of the bot.

#### Dynamic Fees

Once an arbitrage signal is detected for a trading pair, the fee for the respective swaps between the two assets could be increased. The idea is to target the arbitrageurs with increased fees while keeping a low fee for non-arbitrage trades. The arbitrage signals are not specific to single trades, but only to trading pairs. The clearer the signal becomes, the more specifically the arbitrageurs could be targeted with the fee increase. This would not speed up price alignment, but might counterweight the Impermanent Loss caused during alignment.
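One hypothetical shape for such a fee is sketched below (Python; the function, parameter names and numbers are illustrative assumptions, not a proposal of this work): the pair-specific fee interpolates between a base fee and a capped maximum as the arbitrage signal strengthens.

```python
def dynamic_fee(signal_strength, base_fee=0.003, max_fee=0.01):
    """Hypothetical pair-specific swap fee: interpolate linearly from
    base_fee to max_fee as the normalized arbitrage-signal strength
    (0 = no signal, 1 = clear signal) grows."""
    s = min(max(signal_strength, 0.0), 1.0)  # clamp the signal to [0, 1]
    return base_fee + s * (max_fee - base_fee)

print(dynamic_fee(0.0))  # base fee when no signal is present
print(dynamic_fee(1.0))  # capped fee on a clear signal
```

Because the signal is specific to the pair, not to single trades, regular trades of that pair would temporarily pay the higher fee as well.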
Dynamic fees allow for a direct compensation of Impermanent Loss without artificially increasing slippage. But they do require to be placed within the AMM's dynamics and cannot simply be added on top of an existing AMM.

#### Invariant

The invariant is fundamental for the internal price definition and governs the behaviour of the market place: the bonding curve can be derived from the invariant. The bonding curve directly tells how much of one asset is obtained by inserting another asset into the market. It defines not only the slippage, but can also model fees. A modified invariant can achieve the same effect as trading bots and a dynamic fee. It may do so in a far more efficient way, even combining the advantages of bots and dynamic fees, although this method cannot be added to an existing AMM. It may be the most promising usage of arbitrage signals, but it is also the most complex and complicated one to implement.
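As a sketch of how an invariant determines the bonding curve, consider the weighted constant-product family $x^w \, y^{1-w} = \text{const}$ — a known generalization of the constant product, used here purely for illustration and not the modified invariant suggested above. Solving the invariant for the output amount yields the bonding curve directly, and the weight $w$ reshapes the slippage:

```python
def swap_output(x, y, dx, w=0.5):
    """Output dy of asset Y for an input dx of asset X under the invariant
    x^w * y^(1-w) = const (w = 0.5 is the plain constant product).
    Derived by solving (x+dx)^w * (y-dy)^(1-w) = x^w * y^(1-w) for dy."""
    return y * (1.0 - (x / (x + dx)) ** (w / (1.0 - w)))

# With equal reserves and w = 0.5 the spot price is 1, yet the trade
# executes below spot: the shortfall is the slippage encoded in the curve.
out = swap_output(100.0, 100.0, 10.0)
print(out)  # strictly less than the 10.0 a slippage-free swap would give
```

With such a closed form, the slippage — and, by deforming the curve, effectively also fees — can be tuned through the choice of invariant.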