# Data Services Hiring

## Introduction

At Osmosis, we frequently need to operate nodes to extract data. This data can be used for various purposes such as analytics, monitoring, or serving off-chain web data services to clients.

In this task, we ask you to implement a data pipeline that retrieves pool data from a real Osmosis full-node and to create a web server that serves ready-for-display pools to the frontend app.

We will provide the necessary infrastructure to support your development.

## Problem Description

Osmosis DEX has liquidity pools that users trade against. The higher the liquidity in a pool, the smaller the price impact. As a result, users always pay attention to this metric. In fact, in our application, we have [a pools page](https://app.osmosis.zone/pools) where we display the list of pools as well as their USD-denominated liquidity.

Currently, this data is provided by external services. However, we want a more reliable and cost-effective solution that serves the data from our own infrastructure.

We want to create a data pipeline that retrieves the pool data from the Osmosis full-node, transforms it into a ready-to-use format, and serves it through a web server.

For scope reasons, we only focus on liquidity while ignoring volume, fees and APR from [the linked page](https://app.osmosis.zone/pools).

We would like to keep the external services as a fallback solution. However, this fallback must be used only when on-chain data is not available.

## Requirements

1. The data pipeline should retrieve the pool data from the mainnet Osmosis full-node.
   * We will give you access to a running full node.
2. The data pipeline should transform the data into a ready-to-use format.
   * List of `[{ pool_id: uint64, liquidity: string, type: uint64 }]`
3. The data should be persisted into a storage of your choice.
4. The web server should serve the data from the off-chain storage.
   * The web server should serve the data in the following format:
     * `GET /pool-liquidity`
     * Response: `[{ pool_id: uint64, liquidity: string, type: uint64 }]`

## Hints

Note that you are free to choose the exact direction for building out your data pipeline. You may use any tool that you think will accomplish the task in the most optimal way given the constraints. Below, we only give suggestions or ideas that you are free to discard in favor of your own vision.

### Data Ingestion

There are several ways to ingest data from the node. We give hints for two options and leave you free to select your preferred approach.

#### Push Model

There is an [ingest](https://github.com/osmosis-labs/osmosis/tree/main/ingest) package where you can find existing logic for piping data from the Osmosis full-node at the end of the block into various data sinks.

We suggest implementing your own [Ingester](https://github.com/osmosis-labs/osmosis/blob/06bb6b40cfd698caaabbb2872f9259d1ef00420e/ingest/ingest_manager.go#L19-L27) and wiring it up in the app similar to [this](https://github.com/osmosis-labs/osmosis/blob/06bb6b40cfd698caaabbb2872f9259d1ef00420e/app/app.go#L285).

##### Extracting Pools

You can use [this method](https://github.com/osmosis-labs/osmosis/blob/06bb6b40cfd698caaabbb2872f9259d1ef00420e/x/poolmanager/types/pool.go#L17) of the x/poolmanager keeper for retrieving the pool data.

Note that each pool has [an address](https://github.com/osmosis-labs/osmosis/blob/06bb6b40cfd698caaabbb2872f9259d1ef00420e/x/poolmanager/types/pool.go#L17) and a [type](https://github.com/osmosis-labs/osmosis/blob/06bb6b40cfd698caaabbb2872f9259d1ef00420e/x/poolmanager/types/pool.go#L37).

##### Extracting Balances

By using a pool address, we can retrieve its balance denominated in the tokens that a pool consists of.
This can be achieved by the [GetAllBalances](https://github.com/osmosis-labs/osmosis/blob/06bb6b40cfd698caaabbb2872f9259d1ef00420e/ingest/sqs/pools/ingester/pool_ingester.go#L332) method defined on the x/bank keeper.

With these helpers, we should be able to extract all of the core pieces of data needed in the pipeline:

- Pool ID
- Pool type
- Pool balance in each token

#### Pull Model

Notice that each Cosmos SDK module has proto definitions of requests, responses and query servers. The REST port for interacting with the server is `1317`.

Below, we leave the links to query APIs that you might find useful in this assignment.

##### Extracting Pools

- [Reference](https://github.com/osmosis-labs/osmosis/blob/c73bcf8518c11b4ef15395561c3004a1b7d32889/proto/osmosis/poolmanager/v1beta1/query.proto#L92)

##### Extracting Balances

- [Reference](https://github.com/osmosis-labs/cosmos-sdk/blob/190963ec9c5fa2ca566e1f72d59c00b3f55e55e3/proto/cosmos/bank/v1beta1/query.proto#L29)

### Data Transformation

#### Filtering Pools

For simplicity, we only focus on the following pool IDs:

- 1263 (OSMO/USDC)
- 1400 (OSMO/ATOM)
- 1265 (OSMO/ATOM)
- 1281 (ETH/OSMO)
- 1335 (milkTIA/TIA)

Assume that these are the only pools that exist in the network. Explicitly filter out the pools that are not in the list to enable this assumption.

#### Computing USD-denominated balances

Note that our end goal is to display the liquidity in USD. However, we have liquidity data denominated in each of the pool's tokens. To compute USD-denominated liquidity, we must determine the price of each token in terms of USDC.

To start, note that we have the price of each asset with OSMO as base. Then, we can use pool 1263 to determine the USD value of 1 OSMO. Finally, we can calculate the USD value of each pool.

As an example, assume pool 1281 has a spot price of 0.0003333 ETH for 1 OSMO and pool 1263 has a spot price of 2 USDC for 1 OSMO. In pool 1281, we have 9000 OSMO and 3 ETH.
To determine its liquidity in USD terms, we perform the following calculation:

- 3 * 1 / 0.0003333 + 9000 ≈ 18000 OSMO
- 18000 OSMO * 2 = 36000 USDC ($36000)

The only exception is pool 1335 (milkTIA/TIA). We don't have the prices of these assets in USDC or OSMO based on the filtered on-chain pools. As a result, we must rely on an external service to get the USD value of this pool.

Assume that [there is an endpoint](http://sqs.stage.osmosis.zone/tokens/usd-price-test?denoms=factory/osmo1f5vfcph2dvfeqcqkhetwv75fda69z7e5c2dldm3kvgj23crkv6wqcn47a0/umilkTIA,ibc/D79E7D83AB399BFFF93433E54FAA480C191248FC556924A2A8351AE2638B3877) that returns the price of an asset in USD. Use this endpoint to get the USD value of the milkTIA and TIA assets.

Note that the on-chain representation of these tokens is the following:

- `milkTIA` - `factory/osmo1f5vfcph2dvfeqcqkhetwv75fda69z7e5c2dldm3kvgj23crkv6wqcn47a0/umilkTIA`
- `TIA` - `ibc/D79E7D83AB399BFFF93433E54FAA480C191248FC556924A2A8351AE2638B3877`

#### Token Precision

Note that on-chain liquidity values are scaled amounts according to the [token precision](https://ethereum.stackexchange.com/questions/19673/decimals-on-erc20-tokens). To compute realistic and correct values, you will have to descale them according to their exponent.

We track each token's metadata via [this list](https://github.com/osmosis-labs/assetlists/blob/main/osmosis-1/osmosis-1.assetlist.json). During the transform step, make sure to query, parse and, potentially, persist this list in order to retrieve the correct exponents for balance descaling.

As an example, consider the "uosmo" coin. It has an exponent of 6. If the x/bank `GetAllBalances` returned `5_000_000_000`, then the true OSMO balance is `5_000_000_000 / 10**6 = 5_000`.

## Data Load

As a final pipeline step, make sure to load your data in the following format to the storage of your choice:

```go
type PoolData struct {
	// The ID of the pool
	ID uint64
	// The USD-denominated liquidity of the pool.
	USDLiquidity string
	// The type of the pool
	Type uint64
}
```

## Web Server

Now, please expose a simple web server endpoint.

* `GET /pool-liquidity`
* Response: `[{ pool_id: uint64, liquidity: string, type: uint64 }]`

This endpoint serves the end result of the ETL pipeline developed in the previous steps.

## Evaluation Criteria

You will be evaluated on the following areas:

- Requirement gathering - we understand that there might be a lot of new context, so questions are encouraged. You're free to use us as a resource to achieve the end goal.
- Completion - does the system work end-to-end?
- Architecture - can you justify the choice of one tool over another in your pipeline?
- Simplicity and Readability - is your code easy to follow?
- Quality - does your pipeline work consistently, and is it tested in the most critical parts?

## Deliverables

- GitHub Repository with the crux of the work
  * Make sure @niccoloraspa and @p0mvn have access
- A web service deployment with the `GET /pool-liquidity` API exposed that we can query

### Bonus Points

- Architecture diagram and/or documentation to help us understand your decision-making process

## Running a Node

You will be given access to an Osmosis node to get data from. It is running in a tmux session called `main`. The binary can be restarted with `./build/osmosisd start`.

## Helpful Resources

- [Node Installation Guide](https://polkachu.com/installation/osmosis)
- [Osmosis Docs](https://docs.osmosis.zone/)
- [Osmosis Repository](https://github.com/osmosis-labs/osmosis)