Christoph Schlegel: Cross-Chain Arbitrage

[**Christoph Schlegel**: Cross-Chain Arbitrage](https://www.youtube.com/live/fDAvmAc0OMM?si=d-j9N-4D0yHe5ZdA)

![Google Chrome_2025-07-03 _ 15-24@2x](https://hackmd.io/_uploads/r1WoEZESxe.png)

---

**Summary:** This talk explores cross-chain arbitrage in long-tail assets traded across nine EVM chains. It examines how price discovery for very long-tail assets happens on DEXes, using a dataset of over 240,000 trades. Most arbitrage was done by holding inventory across chains; bridging was less common due to latency and risk. A theoretical model compares the costs of bridging vs. inventory, factoring in slippage, price drift, and opportunity costs. The talk raises questions about whether this fragmented structure is a transitional phase or a lasting feature of multichain markets.

---

**Transcript:**

I will give a presentation on cross-chain arbitrage. This is a joint project involving several individuals from Munich, Flashbots, and Lisbon University. And I'll link this a bit to questions of Ethereum interoperability.

So how are prices determined in the current Ethereum design, and what is the current market structure? The general idea that people have is that we have DEXes, but price discovery is not really happening on DEXes. What is the reason? Well, there is not as much liquidity as on CEXes. And there is a delay: you do not have the price determined at each point in time. So CEXes are faster and have deeper liquidity, and the two are of course related to each other. So we expect that new information is more often priced in on a CEX rather than on a DEX. And the DEXes then sync to the CEXes, so the price update comes from the external, off-chain world.

We can imagine a different world where there is price discovery between DEXes. So we have different DEXes on different chains, or different DEXes on the same chain, which might also have different liquidity and different ways of making a market.
But somehow the collective of those DEXes might eventually incorporate information and, hopefully, price it in. That would be another way of doing it. And that's actually not a hypothetical; in some sense that is the reality in a small niche of the Ethereum ecosystem. There are some situations where we have price discovery on decentralized exchanges. And that's mostly for very long-tail assets.

So what we do in this project is look at this long tail of assets and see how people arbitrage between different DEXes on different domains, on different blockchains. And maybe that gives us an insight into how a fragmented, multichain world would look in the future. Or maybe it's just an interesting artifact. Maybe we solve interoperability, maybe there will be one chain that will rule them all, and we don't have the fragmentation and the problems of syncing between different domains. Then this would just be a beautiful exercise to study a bunch of meme coins.

So we, and by we I mean mostly Burak Öz, who is the driving force behind the dataset for this project, study trades across nine EVM chains over one year. With this data we attempt to identify situations that can be classified as cross-chain arbitrage. So one leg of the trade was completed on one chain, and the other leg on a different chain. Maybe there was bridging in between, or maybe people held inventory on both chains, but we tried to match trades to the same entity on different chains. That gives us a universe of considerable size. We had a total of 242,535 trades, with a combined volume of 868 million dollars.

Let us look at some graphs. So we studied these nine chains, from September 2023 to September 2024. One nice thing is that in between there was Dencun, which had the nice effect of decreasing fees on Ethereum L1 and made things arbitrageable that were too costly before. This led to a certain spike in cross-domain arbitrage activity involving Ethereum.
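
The matching step described above, pairing two legs on different chains that belong to the same entity, could be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the project's actual pipeline: the `Swap` record, the one-hour `max_gap` window, and the rule "a bridge transfer between the two legs means bridge-based" are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Swap:
    address: str    # trader address, used here as a stand-in for "entity"
    chain: str      # chain the swap executed on
    token: str      # long-tail token being traded
    side: str       # "buy" or "sell"
    timestamp: int  # unix seconds


def match_cross_chain(swaps, max_gap=3600):
    """Pair each buy with a sell of the same token by the same address
    on a different chain, within max_gap seconds (hypothetical window)."""
    ordered = sorted(swaps, key=lambda s: s.timestamp)
    used = set()   # indices of sells that are already matched
    pairs = []
    for buy in ordered:
        if buy.side != "buy":
            continue
        for j, sell in enumerate(ordered):
            if j in used or sell.side != "sell":
                continue
            same_entity = sell.address == buy.address and sell.token == buy.token
            gap = abs(sell.timestamp - buy.timestamp)
            if same_entity and sell.chain != buy.chain and gap <= max_gap:
                used.add(j)
                pairs.append((buy, sell))
                break
    return pairs


def classify(buy, sell, bridge_times):
    """Label a matched pair. bridge_times holds timestamps of bridge
    transfers of this token by this address (assumed to be known)."""
    lo, hi = sorted((buy.timestamp, sell.timestamp))
    bridged = any(lo <= t <= hi for t in bridge_times)
    return "bridge" if bridged else "inventory"
```

Using the address as the entity is the crudest possible choice. A real study would need a more careful notion of "same entity", for example one that accounts for a searcher operating several contracts.
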
Okay, so that's the dataset, volume of around $3 million. It might sound small in the grand scheme of things, but I think it's still a considerable-sized dataset.

So first of all, we try to correlate it with volatility. And I think it is true for any kind of MEV extraction that it's more profitable on average if there is higher volatility. There is a correlation between arbitrage volume and volatility. It's dampened a bit because these things are really, really long-tail.

What do I mean by long-tail? Why is price discovery happening onchain rather than off-chain? Either these tokens are not even listed on CEXes, or they don't have good liquidity or execution on CEXes, and instead people trade them primarily on DEXes. So there are basically two classes: meme-coin-like or very long-tail experimental projects with low market cap, and low market cap synthetic assets, like synthetic versions of the dollar, synthetic versions of Bitcoin, and so on.

So, how should we think about arbitrage extraction? How is arbitrage extraction different if you have long-tail assets? First of all, there's a liquidity effect. I'm not sure if you've heard the discussion around LVR, which is somewhat prominent among professors but perhaps less so in the community. So people have studied the effect of slippage on arbitrage. You can adjust the standard theory of arbitrage for this slippage effect, the effect that there is limited liquidity on those DEXes. And so you basically get a smaller arbitrage size once you account for these trading factors. The relative fraction of the liquidity on the two venues matters for the amount of arbitrage. First of all, there's less of it, and it depends on the imbalance in liquidity. So that's how we think about it, that's the liquidity cost. The cost is that there's limited liquidity, which makes the arbitrage more costly.

There are other costs. And primarily in our dataset, the cost is from bridging or maintaining inventory.
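
To make the liquidity effect concrete, here is a minimal sketch of how limited depth caps the size of an arbitrage between two venues. It assumes both venues are fee-less constant-product pools (reserves `x` of the token and `y` of a numeraire, marginal price `y / x`). That pool type is an assumption of this example and is not stated in the talk, and the function name `optimal_arbitrage` is made up for illustration.

```python
import math


def optimal_arbitrage(x_a, y_a, x_b, y_b):
    """Return (dx, profit) for buying dx tokens on pool A and selling on pool B.

    Assumes fee-less constant-product pools. Setting the derivative of
    profit(dx) = y_b*dx/(x_b+dx) - y_a*dx/(x_a-dx) to zero gives the
    closed form below, with k = sqrt(x*y) for each pool.
    """
    k_a = math.sqrt(x_a * y_a)
    k_b = math.sqrt(x_b * y_b)
    dx = (k_b * x_a - k_a * x_b) / (k_a + k_b)
    if dx <= 0:
        return 0.0, 0.0  # token is not cheaper on A; nothing to do in this direction
    paid = y_a * dx / (x_a - dx)      # numeraire paid into pool A
    received = y_b * dx / (x_b + dx)  # numeraire received from pool B
    return dx, received - paid


# Same 10% price gap, but pool B is ten times shallower in the second case.
balanced = optimal_arbitrage(1_000, 1_000, 1_000, 1_100)
imbalanced = optimal_arbitrage(1_000, 1_000, 100, 110)
print(balanced, imbalanced)
```

With the same price gap, the shallower venue yields a smaller optimal trade and a smaller profit. That matches the point made above: there is less arbitrage to extract, and how much depends on the liquidity imbalance between the two venues.
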
If you have to maintain inventory on different domains for very long-tail, risky assets, that's very costly to you. So the one thing that we did on the theory side is a very stylized model of this, to have some kind of comparison between when it pays off to bridge and when it pays off to hold inventory.

Inventory trading is just maintaining some token on a DEX on, for example, Arbitrum, and maintaining the same token on a DEX on Ethereum. You sell on Arbitrum and you buy it on Ethereum. That is almost instant settlement; there is just a difference in how fast the chains settle, but you have to maintain inventory. So it tends to be more costly for that reason.

If you do bridging, then basically you have the token and trade it on Arbitrum, then you have to bridge it back to Ethereum, and then you trade it back. And while that time passes, it's a very long time to settlement. And you have price uncertainty, so it might be risky to do this kind of thing. So that's the basic thing you have to think about.

We have done a stylized model where you basically assume that sometimes, all of a sudden, arbitrage opportunities arise, through noise trading or for whatever reason, and the assets don't compensate for the risk, in the sense that the price tends to drift negatively. So that is our model assumption. It applies more to these long-tail assets than it does to synthetic assets. If you have that, then you can assume some arrival rate of those arbitrage opportunities. And then you can perform some math based on that. So if those arbitrage opportunities arrive more frequently, maybe it makes sense to maintain an inventory, but if these opportunities don't arrive that frequently, you might think about bridging, because the price doesn't move too much and there are fewer opportunities.

The benchmark looks something like this.
If you have the frictionless ideal world that I talked about before, also abstracting away from the liquidity effect, you have some profit function. That is this nicely convex-shaped profit of arbitrage. But now we have a cost that comes in two ways.

Let me first of all talk about the bridging costs. The bridging cost is first of all the fee for using that bridge, but you also have the price uncertainty. Quite naturally, the price might move against you while you are bridging. It might possibly move in your favor, but it might move against you. And that introduces an implicit cost. The cost comes from the fact that once you have completed the bridge, the arbitrage opportunity might have disappeared. That is a risk that you have to factor in. So you basically get a difference of two quadratics: a cost that is marginally increasing in the time that you need to bridge the asset. That's a plausible model of bridging.

So let's compare that to inventory. What is the cost of inventory? It is basically the risk cost and the opportunity cost of holding the token. So you have an opportunity cost, and you have a risk of holding this asset, as you might not be able to sell it, or it might depreciate in value, and so on. So basically, you have a constant flow cost of holding this asset. You keep it in inventory, and in each period it is costly to hold the token in inventory.

And then you have a nice trade-off. You have this formula where you basically compare the expected price depreciation of the token against the frequency of arrivals of these arbitrage opportunities. And depending on this, you get this μ divided by λ formula. So the price drift relative to the frequency of opportunities matters for the value of the Q. And then you have an inequality to compare. So in principle, we have a theory of inventory which tells you: the longer it takes to bridge, the more costly it is to bridge, and the more you go for inventory.
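
As a rough numerical illustration of this trade-off, the sketch below compares a per-opportunity cost of bridging with a per-opportunity cost of holding inventory. It is a deliberate simplification and not the model from the paper. It keeps the talk's symbols μ (expected price drift against the holder, `mu`) and λ (arrival rate of opportunities, `lam`), but the linear cost forms, the position size `q`, and all the numbers are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Params:
    q: float            # position value needed per arbitrage, in USD (assumed)
    mu: float           # expected depreciation rate of the token, per second
    lam: float          # arrival rate of arbitrage opportunities, per second
    bridge_fee: float   # explicit bridge fee per transfer, in USD (assumed)
    bridge_time: float  # seconds until the bridged tokens arrive


def inventory_cost(p: Params) -> float:
    """Expected holding cost between two opportunities: the flow cost
    q * mu is paid for an expected waiting time of 1 / lam."""
    return p.q * p.mu / p.lam


def bridging_cost(p: Params) -> float:
    """Fee plus expected drift against the position while it is in transit."""
    return p.bridge_fee + p.q * p.mu * p.bridge_time


def preferred_mode(p: Params) -> str:
    return "inventory" if inventory_cost(p) <= bridging_cost(p) else "bridge"


# Illustrative numbers only: one opportunity every 10 minutes vs. one a month.
frequent = Params(q=10_000, mu=1e-7, lam=1 / 600, bridge_fee=5.0, bridge_time=242)
rare = Params(q=10_000, mu=1e-7, lam=1 / (30 * 86_400), bridge_fee=5.0, bridge_time=242)
print(preferred_mode(frequent), preferred_mode(rare))
```

In this toy version the inventory cost scales with μ / λ, so frequent opportunities favor inventory and rare ones favor bridging. A longer `bridge_time` raises the bridging cost and tilts the comparison back toward inventory.
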
If inventory is very costly to hold because those things are volatile, depreciating in expectation, or they have very low liquidity so you might not get rid of them, then you refrain from holding inventory.

Okay, that's the theory. How does it relate to the empirics? And we've also looked into other datasets. So first of all, there is a substantial amount of arbitrage where people use bridges and arbitrage between these very long-tail tokens. So they really take this very long time to complete the arbitrage, with a median of 242 seconds. And inventory arbitrage is by definition almost immediate, with a median settlement time of nine seconds. The majority is inventory: approximately two-thirds in our dataset is inventory-based arbitrage, and the rest have a bridging transaction in between.

There is heterogeneity in bridging times and in bridging costs, and that also gives heterogeneity in what people do. There's a difference in bridging time between native bridges and these other bridging solutions, like Wormhole, etc., where we usually have shorter bridging times. The majority used non-native bridging solutions. But some were done via the native bridge. And that also reflects on the way people arbitrage these things.

Okay. What does the market look like? It's generally true that searching, or any form of arbitrage, has this Pareto distribution. So few people make all of the gains, sort of a winner-takes-all game, because of the scale factor involved. We also have this kind of stylized fact of few searchers extracting most of these arbitrage opportunities. So there are these top addresses that account for almost 50%, and then during Dencun, almost 75% of the total arbitrage volume.

Okay. The final thing I wanted to do is link these effects and theoretical findings to the broader picture. So this is a niche, as I said. It's sort of a beautiful niche of beautiful, weird long-tail tokens with funny names. And by the standards of finance, it's not a large market.
But there could be a potential world where we have fragmented liquidity, we have multiple chains, and cross-domain arbitrage becomes increasingly important. So there could be such a world.

On the other hand, the tech tries to solve some of the frictions that I talked about. We should maybe have shorter bridging times. We should have other interoperability solutions. We should have clever ways to route liquidity across different venues on different chains. So maybe we can reduce the friction and the cost in these cross-chain markets.

So those are the two forces. And I think one takeaway of the paper is that if we are in this hypothetical world where we see a lot of growth and a lot of fragmentation of liquidity, and the technology doesn't follow, then you would have scale economies in searching. That is because there is a premium on having capital, so that you can maintain inventory. There's a premium on being clever and managing risk, and all of these factors should be drivers of centralization. But you can also take it as a documentation of a weird, beautiful world that might disappear in the future.

Okay, some quick advertisements I promised to do. There are jobs at Flashbots, there's a preprint, and there's another preprint for the theory folks in the audience who care about arbitrage calculations with Brownian motion.