tap3 - HackMD

## who's token price is it anyways? ### an implementation plan for the price quoter project TAP-3 > A fast, trustless price oracle library for all tokens. The most important insight is that we can use tycho's live blockchain data to build a graph of all liquidity pools and then pathfind through it to find the best price for any token. this is cool because you get the actual executable price. We calculate prices by simulating swaps through multiple paths, accounting for gas costs, and taking the midpoint between buy and sell prices. We use binary search for depth at various defined levels and we cache everything aggressively and only recalculate when pools actually change. On every block we save the price history for any token the user specifies. Thats basically it. The quoter uses rust because its fast and has good libraries for graph algorithms. we chose pathfinding specifically because finding the best route through liquidity is a graph problem. ``` Tycho Indexer │ ▼ State Ingestor │ ▼ ┌──────────────┐ ┌───────────────────┐ │ Path Finder │ ───▶ │ Simulation Engine │ │ & Router │ └───────────────────┘ └──────────────┘ │ │ ▼ └──▶ Cache Layer (path + token) │ ▼ API Layer (gRPC / REST / Python) │ ▼ Clients (Rust, Python, JS, Solvers) ``` Proposed modules: ``` src/ ├─ graph.rs // build & update token graph ├─ path_finder.rs // hop-limited path enumeration ├─ simulator.rs // swap simulation + gas integration ├─ cache.rs // per-path & per-token caches + invalidation ├─ history.rs // file-based price history logger ├─ lib.rs // Quoter struct + public API └─ python_bindings/ // pyo3 glue ``` ## More notes on pools If there's a new pool we haven't seen yet, we add it automatically and tokens using it (if not a new token pool) will have their cache invalidated. This way we don't have to keep track of new pools, the Tycho indexer does that for us automatically. ### the importance of depth Depth is critical, mid-price means very little without the effective spread and depth of the liquidity pool. Users define an arbitrary depth at which to probe the orderbook to define the mid-price. Denominated in the numeraire. We can binsearch to get this. ## the basic algorithm *Graph construction & updates* We keep an in-memory graph where tokens are nodes and pools are edges (with reserves, fees, and swap math). Every block, Tycho gives us a diff of updated pools, so we patch only those edges—no full rebuild. *Path enumeration* ~~We turn each edge’s tiny‑trade mid-price into a weight w = -ln(price). Running a bidirectional Dijkstra (or A* with a mid-price heuristic) gives us the best single path for that metric, but we need more candidates. We could then wrap it in Yen’s k-shortest-paths (k=3–5) to grab the top handful of low‑cost routes.~~ After feedback note: you can't naively route with dijkstra (compares too many different pools). >How Dijkstra actually works: > - It maintains a priority queue of vertices with their current best distance from the source > - It compares partial path costs at every step when deciding which vertex to explore next > - When it extracts vertex u from the queue, it relaxes edges to neighbors, comparing d[u] + w(u,v) vs d[v] > The problem is: > In token routing, when Dijkstra compares partial paths: > - d[ETH→USDC] = -ln(0.0005) (cost to reach USDC) > - d[ETH→WBTC] = -ln(20) (cost to reach WBTC) > These ARE being compared in the priority queue to decide which vertex to explore next, but they represent costs in different token units! **Quoter Overview** **simple 3-hop quoter** 1. **Pre-compute all 3-hop paths** * On startup (and whenever our TVL filter changes), enumerate every simple path of length ≤ 3 from our numeraire to each token. * Build a reverse index: `pool → [paths…]`. 2. **Block-driven updates** * On each new block, pull the set of pools that had any state change. * Look up all paths touching those pools via our reverse index. * **Re-quote** only those paths (simulate swapping our fixed M tokens down each path, subtract gas). 3. **Maintain best-price cache** * For each token, keep its current “best path → net amount.” * When we re-quote a path, compare it against our cached winner and replace if better. 4. **Spot-price answers** * When asked “price of X in numeraire,” we simply look up our cached best-price for X. Behind the scenes we actually run a two‑layer caching system: Path‑level cache caches individual simulation results keyed by (path_id, depth, gas_price_bucket). Each entry also records the last block numbers when each pool in the path was updated. Token‑level cache aggregates the best quote per (token, numeraire, depth) from those path results and records the maximum of its paths’ last-updated blocks. *Invalidation & refresh process:* On each new block, we receive a list of pool IDs that changed. We scan the path cache index (a reverse‐map from pool to cached paths) to find all path entries containing those dirty pools and evict them. For each evicted path, we also mark its owning token entry as stale. We then re-run the simulation only for paths whose cache was evicted, recompute each affected token’s best quote, and overwrite just those token‐level entries. All other token prices remain untouched. This targeted eviction means that if 5 pools out of 100 k update in a block, we only recalc ~O(#paths referencing those 5) instead of everything—keeping end‑to‑end update time sub‑500 ms. We estimate gas per swap (UniV2 vs V3 vs Curve), multiply by the current gas price (using the route's own price), convert it into numeraire, and subtract from the gross rate to get net_price. We also compute effective spread in basis points. *Two-layer caching* Path cache holds raw simulation results keyed by (path, depth, gas_bucket) and invalidates only when a pool in that path changes. Token cache picks the best quote from those path results and invalidates only when any of its underlying paths change. *Serve* Once cached, get_price(token, numeraire, opts) is a fast lookup. **Appendix** 2. **A★-Driven Pathfinding** OPTIONAL (if wanted/needed) * Build a token–pool graph (tokens as nodes, pools as edges). * Run A★ up to N hops, using a 1-hop spot-price heuristic (we made this in the previous step) to rank frontier states by estimated numeraire value. * if no heuristic path we can use common bridge 2 hop (from WETH, USDC, DAI) with extra penalty for gas * if absolutely nothing then use 0 as heuristic value * Auto-bridge rare tokens through top hubs (WETH, USDC, DAI) and, if no route is found, locally bump N without affecting other searches. (weird rare tokens)