# Introducing Que Pasa, a contract indexer for Tezos by [Rick Klomp](https://github.com/RKlompUU) and [John Newby](https://github.com/johnsnewby) ###### tags: que-pasa, indexer, tezos ## Introduction [Que Pasa](https://github.com/tzConnectBerlin/que-pasa) is a new indexer for Tezos which aims to fulfil the needs of dApp developers who want a quick, small, fast indexer for their contracts' storages. It is written in Rust and currently uses PostgreSQL as its storage layer. As an example, Que Pasa translates the marketplace contract in: ``` rklomp@Monke que-pasa % cat HEN.yaml contracts: - name: "hdao" address: "KT1QxLqukyfohPV5kPkw97Rs6cw1DDDvYgbB" - name: "marketplace" address: "KT1HbQepzV1nVGg8QVznG7z4RcHseD5kwqBn" ``` into a database schema with following tables: ``` tezos=# \dt "marketplace".* List of relations Schema | Name | Type | Owner -------------+------------------+-------+--------- marketplace | bigmap_clears | table | quepasa marketplace | storage | table | quepasa marketplace | storage.metadata | table | quepasa marketplace | storage.swaps | table | quepasa (4 rows) ``` And, for example, the table "storage" here has the following columns: ``` tezos=# \d "marketplace"."storage" Table "marketplace.storage" Column | Type | Collation | Nullable | Default ---------------+------------------------+-----------+----------+------------------------------------------------- tx_context_id | bigint | | not null | id | bigint | | not null | nextval('marketplace.storage_id_seq'::regclass) counter | numeric | | | fee | numeric | | | manager | character varying(127) | | | objkt | character varying(127) | | | ``` As is standard for a Tezos tool, we have chosen a name related to Mexican food for our indexer. Que Pasa is the worst Mexican restaurant chain in Berlin, hence possibly in the world. It also translates to "what's happening," which is quite fitting. ## Motivation and history All dApps of any complexity require a storage and caching layer in order to give a reasonable user experience. Of course this has to be easy to setup or else it becomes a source of centralization, which is meant to be what the blockchain frees us from. This is the dirty secret of blockchain development, and Que Pasa attempts to mitigate this. There are already excellent indexers for Tezos which index the whole chain. And these are great for building explorer websites (see e.g. tzkt.io, tzstats.com, [better-call.dev](https://better-call.dev)). Although they have open sourced their backends, setting these up and maintaining them requires significant effort and are therefore, in our opinion, not best-suited to serve as backends of dApps. For most dApps, all that's required is the indexing of its own (set of) contract(s). Que Pasa has been built to address specifically indexing for dApp backends. With it we are aiming for simplicity in configuration and maintainence of dApp backends. The project's genesis came when one of us (Rick) did a coding exercise as part of his interview at TZ Connect, and wrote a simple parser in Haskell for the contract storage as it's expressed in Michelson in the contract definition. John continued this work in Rust to make a relational representation of the contract storage and an output stage into PostgreSQL which conforms to the ActiveRecord standard for ORM sytems. The idea was to be output layer agnostic to enable as many systems which automatically configure themselves to use the generated schema. We have been using Postgraphile up to now, which makes a GraphQL interface to the contract, with essentially zero configuration. The indexer has been in use for the last several months to support the user interface for the [Prediction Market](https://github.com/tzConnectBerlin/prediction-market-contracts) developed by TZ Connect, but it was awaiting Rick joining us to attain the full flowering of its beauty. ## Philosophy We wanted to make an indexer which was as small as possible and required nothing apart from information which is available on-chain. It should be possible to index a contract and be querying its contents within seconds for a contract that is in development mode and has few calls. Currently this is true for most contracts, and we are able to index the heavily used [HEN](https://www.hicetnunc.xyz/) set of contracts (the HEN dApp encompasses 6 contracts) in about half a day. We model each contracts' storage as a table called `storage` which lives in a schema named after the contract's identifier. Where we have annotations on variables, this is the name of the corresponding column. Otherwise we generate one based on the variable's type. Maps, big maps, lists and sets become new tables which reference the structure from which they're declared. Sadly, the keys of (big) maps never have annotations, which leads to suboptimal naming. But such is life. An advantage of using a SQL database and a standard naming scheme in the schema is that up to this point we've been able to avoid the sordid business of actually answering queries. In fact it occurs to us that calling Que Pasa an indexer is to some extent fraudulent--that glory belongs to Postgres. Rather it's a translator, from Michelson storage representations to a generic relational one, and then trivially to a PostgreSQL schema and inserts. ## In use The first time that Que Pasa encounters a contract it must generate a SQL definition corresponding to the storage. It queries the code for the storage, converts it to SQL and emits this into the database in a schema with the same name as the contract's identifier. The first invocation which inserts data from the contract's storage into the database is a special case. It would be possible, although boring, to simply read the blockchain from the most recent block to the one in which the contract is originated, but this would take too much time for contracts that are mostly inactive. So Que Pasa offers a fast sync option wherein it asks Better Call Dev for all of the blocks which have transactions for the contract, fills these in, and marks the rest done. After this point Que Pasa simply polls the chain every second, reading in blocks, dealing with any chain reorganisations and inserting data when needed. With the exception of big maps, the storage is saved in the database in its entirety on each update. This means there can be multiple storages even within the same operation group. By joining on the `tx_context` table you can determine the state of the (non-big map) storage at each level. Big map changes are only delivered as updates and deletions. For this reason the big map tables have a deleted column. It's up to the the querier to assemble the collection of values at the level needed. Per table, there is a `_live` view which gives the most recent set that the indexer has been made aware of, and an `_ordered` view that gives the updates and deletions in chronological order of execution. ## Example run & querying the generated data We'll run the same configuration as shown in the intro (indexing HEN's hDAO contract, and HEN's marketplace contract). For sake of simplicity we'll only go into the resulting marketplace schema. $NODE_URL and $DATABASE_URL are set as follows: ``` rklomp@Monke que-pasa % cat .env export NODE_URL=https://mainnet-tezos.giganode.io export DATABASE_URL="host=localhost dbname=tezos user=quepasa password=... port=5432" rklomp@Monke que-pasa % source .env ``` We're ready now to index our HEN contracts of interest (for simplicity, indexing only 1 level): ``` rklomp@Monke que-pasa % que-pasa --contract-settings HEN.yaml -l 1690865 [2021-09-29T14:50:45Z INFO que_pasa] Common tables set up in db [2021-09-29T14:50:46Z INFO que_pasa::sql::db] creating schema for contract hdao [2021-09-29T14:50:46Z INFO que_pasa::sql::db] creating schema for contract marketplace [2021-09-29T14:50:47Z INFO que_pasa::highlevel] processing level 1690865: (baked at 2021-09-13 20:59:32 UTC) [2021-09-29T14:50:47Z INFO que_pasa::highlevel] level 1690865: 1 contract call for hdao, 21 contract calls for marketplace rklomp@Monke que-pasa % ``` Lets look at the resulting populated database. This is the first time we start Que Pasa on this database, therefore Que Pasa starts with setting up global tables that are relevant to all contracts: ``` tezos=# \d List of relations Schema | Name | Type | Owner --------+-----------------+-------+--------- public | bigmap_keys | table | quepasa public | contract_deps | table | quepasa public | contract_levels | table | quepasa public | contracts | table | quepasa public | levels | table | quepasa public | max_id | table | quepasa public | tx_contexts | table | quepasa (7 rows) ``` Because this is the first level we're indexing for these HEN contracts, Que Pasa additionally derives on startup the table schemas from the contract storage types and adds them to the database. Then the level we asked in `-l` argument is processed. Apparently there were 21 contract calls to the marketplace and 1 contract call to the hDAO in this level. Let's have a look at the data that has been populated into marketplace's tables. As a reminder, these are the tables in `marketplace` schema: ``` tezos=# \dt "marketplace".* List of relations Schema | Name | Type | Owner -------------+------------------+-------+--------- marketplace | bigmap_clears | table | quepasa marketplace | storage | table | quepasa marketplace | storage.metadata | table | quepasa marketplace | storage.swaps | table | quepasa (4 rows) ``` Table `storage` contains, per contract call, a row with all the top-level datafields' values as they are after the contract call. In our example, there were 21 contract calls in the processed level, thus there are now 21 rows in table `storage`. For example, showing 3 of those at random: ``` tezos=# select * from "marketplace"."storage" limit 3; tx_context_id | id | counter | fee | manager | objkt ---------------+----+---------+-----+--------------------------------------+-------------------------------------- 28 | 97 | 985215 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton 27 | 93 | 985215 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton 26 | 89 | 985215 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton (3 rows) ``` The `tx_context_id` tells us exactly the location where the contract call was performed in the level. It's an identifier linking to a row in the global table `tx_contexts`. By joining the storage table with this table, it's possible to figure out for example what the latest state is of `storage` table. However, it's much easier to use the view `storage_live` that Que Pasa has already automatically generated for us: ``` tezos=# select * from "marketplace"."storage_live"; level | level_timestamp | id | counter | fee | manager | objkt ---------+------------------------+-----+---------+-----+--------------------------------------+-------------------------------------- 1690865 | 2021-09-13 20:59:32+00 | 113 | 985215 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton (1 row) ``` This `_live` view is generated for every table (with exception to descendants of bigmap tables). Besides this view, there's also an `_ordered` view generated. This gives us all the storage values ordered by order of execution on the Tezos chain: ``` tezos=# select * from "marketplace"."storage_ordered" limit 3; ordering | level | level_timestamp | id | counter | fee | manager | objkt ----------+---------+------------------------+----+---------+-----+--------------------------------------+-------------------------------------- 1 | 1690865 | 2021-09-13 20:59:32+00 | 33 | 985208 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton 2 | 1690865 | 2021-09-13 20:59:32+00 | 37 | 985209 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton 3 | 1690865 | 2021-09-13 20:59:32+00 | 41 | 985210 | 25 | tz1UBZUkXpKGhYsP5KtzDNqLLchwF4uHrGjw | KT1RJ6PbjHpwc3M5rw5s2Nbmefwbuwbdxton (3 rows) ``` # Further work We would like our tool to be useful to as many people as possible. Partly this means we should be able to deliver the data to frontends without requiring additional configuration setup. To an extent, with Postgraphile, we've already reached this. However, there is more to it if we want to provide a complete solution. For example, as it stands creating reactive frontends using websockets is not really possible yet. On the other hand, we realize that we need to better understand what the community finds lacking about the current version. We could really benefit from feedback from teams that are building on Tezos. ## Further information The most recent version of the indexer will always be in the repository at https://github.com/tzConnectBerlin/que-pasa