# Open Bundle Flow

For the purposes of research, and for gauging the ever-looming curse of Moloch in builder centralization, having access to data about orderflow is incredibly important. Broadly, there are three types of orderflow from the perspective of a block builder: *public*, *private*, and *exclusive*.

*Public orderflow* consists of transactions that are publicly broadcast, usually (almost always?) through the "mempool". *Private orderflow* consists of transactions that are sent privately to `n` builders but not publicly broadcast. *Exclusive orderflow* is the subset of private orderflow that is sent to only `n = 1` builder.

Public orderflow is the easiest to study, as anyone with the resources to run a couple of Ethereum nodes can have fairly high visibility. [1](https://arxiv.org/pdf/2208.02858.pdf) [more links l8r] We also have good reason to believe that most searchers are willing to send their orderflow to `n = 5` builders on average. [victoria's findings] As an intuition, if the probability of a builder leaking a bundle is low, there is only upside in sending orderflow to more builders.

![](https://i.imgur.com/oMmBIdH.jpg)

But one area that is hard for us to analyze is exclusive orderflow, which is inconveniently the worst type of orderflow. To get more visibility into this type of orderflow, this document proposes a data sharing standard for builders.

The goals of this document are:

1. identify how this API ***should not*** be designed, that is, identify which patterns of searcher behavior we do not want to make publicly available
2. give an initial high-level API design

## How to Not Design This API

(This is all conjecture, as I'm not an active searcher.)

Winning bids by searchers, expressed as bundles, land on-chain. Therefore the internal operations of their extraction methods are available if you can reverse EVM bytecode. But their bidding strategies for winning the block inclusion auction are not on-chain.
There are three pieces of data that could be exposed related to this:

1. `Searcher signature` - unique searcher and payload identifier
2. `Bid timing` - timing of bundle submission for each slot
3. `Bundle information` - transaction details, e.g. `to`, `from`, `data`

In the worst case, the combination of all three allows someone to understand which ETH bot addresses are linked together and how they update their bids for specific state access throughout the slot. Ideally, therefore, the information we reveal would not include any of this.

Currently my best idea is for builders to publish a list of all bundle hashes they received during a slot. Researchers could then compare bundle hash lists across builders to determine intersection and non-intersection.

### Purposeful Obfuscation and Incentives

Searchers can easily change their bundle hash in various ways (e.g. a different gas price). Does this defeat the purpose of this API? Or can we assume that the cost of obfuscating their flow outweighs the benefit a searcher gets from it?

Similarly, what is the incentive for the builder here? Maybe there doesn't need to be one other than it being a social norm. But what is the tangible value add to the ecosystem of enforcing a norm that imposes a cost on builders? Right now it's something like "it helps with research and gauging whether or not we recreated Citadel". But this type of data also tends to incite mob-like anger.

## API Design

The main concern here is additional constraints on builders. As seen with the Data API in relays, most choose to run a separate instance to serve relay data. As historical data grows, so do maintenance costs.
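
To make the hash-list idea from earlier concrete, here is a minimal sketch of the comparison a researcher might run on one slot's worth of published data. Everything in it is an assumption for illustration: the function name `classify_bundle_hashes`, the builder identifiers, the hash strings, and the input shape (a mapping from builder to the hashes it published) are hypothetical, since no endpoint or response format has been specified yet.

```python
from collections import defaultdict


def classify_bundle_hashes(published):
    """Split one slot's bundle hashes into exclusive and shared.

    `published` is a hypothetical input shape: a dict mapping a builder
    identifier to the iterable of bundle hashes that builder published
    for a single slot.
    """
    # Invert the mapping: for each hash, which builders reported seeing it?
    seen_by = defaultdict(set)
    for builder, hashes in published.items():
        for bundle_hash in hashes:
            seen_by[bundle_hash].add(builder)

    # Non-intersection: hash reported by exactly one builder (n = 1).
    exclusive = {
        h: next(iter(builders))
        for h, builders in seen_by.items()
        if len(builders) == 1
    }
    # Intersection: hash reported by more than one builder (n > 1).
    shared = {
        h: sorted(builders)
        for h, builders in seen_by.items()
        if len(builders) > 1
    }
    return exclusive, shared


# Made-up example data for a single slot.
example = {
    "builder_a": ["0xaa", "0xbb"],
    "builder_b": ["0xbb", "0xcc"],
    "builder_c": ["0xbb"],
}
exclusive, shared = classify_bundle_hashes(example)
```

With the made-up data above, `0xaa` and `0xcc` come out as exclusive (to `builder_a` and `builder_b` respectively) and `0xbb` as shared by all three builders. Two caveats follow from the discussion above: a hash only looks exclusive relative to the builders that actually publish, and a searcher who varies the bundle per builder (for example with a different gas price) would make shared flow appear exclusive.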