Permanode data layout discussion

# Permanode data layout discussion ## Needed API endpoints #### GET `messages/${messageId}` Should return data from the `messages` table for a given message ID. ```sql= select message from messages where message_id = ${messageId} ``` #### GET `messages/${messageId}/metadata` Should return only the metadata from the `messages` table for a given message ID. ```sql= select metadata from messages where message_id = ${messageId} ``` #### GET `messages/${messageId}/children` Should return all children's IDs from the `messages` table for a given parent message ID. ```sql= select m.* from edges e left join messages m on e.children = m.id where e.parent = ${messageId} and e.partition_id = ? ``` #### GET `messages?index=${indexationKey}` Should return paged message IDs from the `messages` table for a given hashed index. ```sql= select m.* from index_lookup i left join messages m on i.message_id = m.id where i.hashed_index = ${indexationKey} and i.partition_id = ? ``` #### GET `addresses/${bech32Address}/outputs` Should return paged output IDs from the `outputs` table for a given address. ```sql= select * from address_lookup a left join outputs o on a.output_id = o.id where a.address = ${bech32Address} and i.partition_id = ? ``` #### GET `outputs/${outputId}` Should return an output structure from the `outputs` table for a given output ID. ```sql= select * from outputs where id = ${outputId} ``` #### GET `milestones/${index}` Should return paged message IDs from the `messages` table for a given milestone index. ```sql= select ms.* from milestone_lookup mi left join messages ms on mi.message_id = ms.id where mi.index = ${index} and mi.partition_id = ? ``` ## Partitioning Lookup tables must be partitioned effectively to remove possibility of overflowing in the case where lots of records exist for a particular key. We discussed partitioning them by defining a `max_partitions` constant and uniformly distributing the insertions; however, this presents an issue with retrieving and displaying the records for a user: They will not be in any particular order and partition IDs mean very little. We decided it would be better to insert records in the earliest possible partition until it is full and define a `partition_size` constant instead. We should be able to make use of the ScyllaDB `system.large_partitions` table to manage this, as well as atomic data in the application. ## Tables * `messages` * `addresses` * `indexes` * `milestones` * `parents` * `transactions` * `sync` ## Schema Cql statements * Create `mainnet` Keyspace ```sql= CREATE KEYSPACE IF NOT EXISTS permanode WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 1} AND durable_writes = true; ``` * Create `messages` Table ```sql= CREATE TABLE IF NOT EXISTS permanode.messages ( message_id blob PRIMARY KEY, message blob, metadata blob, proof blob, ); ``` * Create `addresses` table ```sql= CREATE TABLE IF NOT EXISTS permanode.addresses ( address text, partition_id smallint, milestone_index int, output_type tinyint, transaction_id text, idx smallint, amount bigint, address_type tinyint, inclusion_state blob, updated_at timestamp, PRIMARY KEY ((address, partition_id), milestone_index, output_type, transaction_id, idx) ) WITH CLUSTERING ORDER BY (milestone_index DESC, output_type DESC, transaction_id DESC, idx DESC); ``` * Create `indexes` table ```sql= CREATE TABLE IF NOT EXISTS permanode.indexes ( indexation text, partition_id smallint, milestone_index int, message_id text, inclusion_state blob, updated_at timestamp, PRIMARY KEY ((indexation, partition_id), milestone_index, message_id) ) WITH CLUSTERING ORDER BY (milestone_index DESC); ``` * Create `parents` table ```sql= CREATE TABLE IF NOT EXISTS permanode.parents ( parent_id text, partition_id smallint, milestone_index int, message_id text, inclusion_state blob, updated_at timestamp, PRIMARY KEY ((parent_id, partition_id), milestone_index, message_id) ) WITH CLUSTERING ORDER BY (milestone_index DESC); ``` * Create `transactions` table ```sql= CREATE TABLE IF NOT EXISTS permanode.transactions ( transaction_id text, idx smallint, variant text, message_id text, data blob, inclusion_state blob, milestone_index int, PRIMARY KEY (transaction_id, idx, variant, message_id, data) ); ``` * Create `hints` table ```sql= CREATE TABLE IF NOT EXISTS permanode.hints ( hint text, variant text, partition_id smallint, milestone_index int, PRIMARY KEY (hint, variant, partition_id) ) WITH CLUSTERING ORDER BY (variant DESC, partition_id DESC); ``` * Create `milestones` table ```sql= CREATE TABLE IF NOT EXISTS permanode.milestones ( milestone_index int, message_id text, timestamp bigint, payload blob, PRIMARY KEY (milestone_index, message_id) ); ``` * Create `sync` table ```sql= CREATE TABLE IF NOT EXISTS permanode.sync ( key text, milestone_index int, synced_by tinyint, logged_by tinyint, PRIMARY KEY (key, milestone_index) ) WITH CLUSTERING ORDER BY (milestone_index DESC); ``` # Access cql statements ## Select queries - [x] Select `message` from `messages` table ```sql= SELECT message from mainnet.messages WHERE messsage_id = ?; ``` - [x] Select `metadata` from `messages` table ```sql= SELECT metadata from mainnet.messages WHERE messsage_id = ?; ``` - [x] Select `message` and `metadata` from `messages` table ```sql= SELECT message, metadata from mainnet.messages WHERE messsage_id = ?; ``` - [x] Select `output` from `transactions` table ```sql= SELECT data from mainnet.transactions WHERE transaction_id = ? AND index = ? and variant = 'output'; ``` - [x] Select `milestone` from `milestones` table ```sql= SELECT milestone from mainnet.milestones WHERE milestone_index = ?; ``` - [x] Select `children` from `parents` table ```sql= SELECT message_id from mainnet.parents WHERE parent_id = ? AND partition_id = ?; ``` - [x] Select `outputs_ids` from `addresses` table ```sql= SELECT address, address_type, transaction_id, index from mainnet.addresses WHERE address = ? AND partition_id = ?; ``` - [x] Select `messages` from `indexes` table ```sql= SELECT hashed_index, message_id from mainnet.indexes WHERE hashed_index = ? AND partition_id = ?; ``` ## Insert queries - [x] Insert `(message, metadata)` into `messages` table ```sql= INSERT INTO mainnet.messages (message_id, message, metadata) VALUES (?, ?, ?); ``` - [x] Insert `address` into `addresses` table ```sql= INSERT INTO mainnet.addresses (address, partition_id, transaction_id, index, amount, address_type) VALUES (?, ?, ?, ?, ?, ?); ``` - [x] Insert `hashed_index` into `indexes` table ```sql= INSERT INTO mainnet.indexes (hashed_index, partition_id, message_id) VALUES (?, ?, ?); ``` - [x] Insert `parent` into `parents` table ```sql= INSERT INTO mainnet.parents (parent_id, partition_id, parent_index, message_id) VALUES (?, ?, ?, ?); ``` - [x] Insert `(transactionId, Batch<InputsOutputs> )` into `transactions` table * `OUTPUT`,`INPUT` and `UTXOINPUT` variants ```sql= INSERT INTO mainnet.transactions (transaction_id, index, variant, tx_id, idx, data) VALUES (?, ?, ?, ?, ?); ``` 1. Query for UTXOINPUT records 2. Batch for INPUT and OUTPUT records - [x] Insert `milestone` into `milestones` table ```sql= INSERT INTO mainnet.milestones (milestone_index, message_id, timestamp) VALUES (?, ?, ?); ``` # Open Questions * What is the best method to partition the data? * Predefined partition max number; * Dynamic partition size; * Partitioning is not required;