# Torii Specification

**Torii** is a modular, extensible, event-driven indexer for Starknet, capable of indexing:

- Contracts emitting **Torii Beacon** events (e.g., `ModelRegistered`, `StoreSetRecord`)
- ERC721 / ERC20 / ERC1155 token standards
- Offchain signed messages submitted via gRPC

Torii is designed for **high concurrency**, **pluggable backends**, and **read/write separation**. It supports multiple runtime interfaces, including **gRPC**, **REST**, and **GraphQL**, with more planned for the future.

---

## 📦 Architecture Overview

```
        [ Starknet Node / Stream ]
                   ↓
               [ Fetcher ]
                   ↓
               [ Decoder ]
                   ↓
          [ Event Dispatcher ]
                   ↓
    [ Task Queues (by ordering key) ]
                   ↓
            [ CompositeSink ]
  ┌──────────────┬───────────────┬──────────────────┐
  ↓              ↓               ↓                  ↓
[ SqliteSink ] [ GrpcSink ]  [ MqttSink ]  [ CustomSink... ]
  ↓
[ Torii Store Engine ]  ←──────→  [ Torii Query Engine ]
                                            ↓
                                [ gRPC / REST / GraphQL ]
```

---

## ⚙️ Core Components

| Module | Responsibility |
|--------|----------------|
| **Fetcher** | Fetch raw events from Starknet (RPC or gRPC). |
| **Decoder** | Decode raw events into structured domain events; this may include additional fetches from the chain. |
| **Dispatcher** | Route events into task queues by `ordering_key`. |
| **Sink** | Handle decoded events and optionally write them. |
| **Store Engine** | Shared write abstraction for consistent storage. |
| **Query Engine** | Read-only abstraction to query indexed data. |
| **gRPC/REST/GraphQL** | Client interfaces for data access & messaging. |

---

## 🔄 Event Processing Flow

```
  [ Starknet Block ]
          ↓
      [ Fetcher ]
          ↓
      [ Decoder ]
          ↓
[ Dispatcher (by ordering_key) ]
          ↓
     [ TaskQueue ]
          ↓
   [ CompositeSink ]
          ↓
[ StoreEngine (e.g. SQLite) ]
```

---

## ✍️ Write Path: Store Engine

The **Store Engine** is a unified abstraction that defines how to persist structured data such as:

- Registered models
- Store records
- Token metadata
- Event logs

Sinks may implement the Store Engine to delegate data storage consistently, but this is not mandatory; for example, a gRPC sink may only broadcast to subscribed clients.

---

## 🔎 Read Path: Query Engine

The **Query Engine** is used by all read interfaces (gRPC, REST, CLI, etc.) to retrieve indexed data in a backend-agnostic way.

Examples of queries the Query Engine supports:

- List all models
- Fetch data by model and key
- Retrieve the latest token metadata
- Retrieve a user's balance for a token

---

## 🌐 Interfaces

Torii exposes multiple interfaces to users and clients:

| Interface | Role |
|-----------|------|
| **gRPC** | Subscriptions, queries, signed messages |
| **REST** | Lightweight HTTP access, custom SQL queries, etc. |
| **GraphQL** | Type-safe, expressive querying |

Each interface uses the **Query Engine** under the hood to remain consistent.

---

## 🪝 Sinks

Sinks are event consumers. They may:

- Persist data to DBs (`SqliteSink`, `PsqlSink`, etc.)
- Broadcast events to clients (`GrpcSink`)
- Trigger side effects (`MetadataSink`)
- Be chained together using a `CompositeSink`

Sinks are **independent** and **must not rely on each other**.

---

## 🧠 Ordering Key Logic

To maintain data consistency where required:

- Each decoded event is assigned an **`ordering_key`**
- Events with the same key are **processed in order**
- Events with different keys are **processed in parallel**

This is already implemented in Torii and increases its capacity to process events; a sketch of the dispatch pattern is shown below.
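The following is a minimal sketch of this per-key dispatch pattern, assuming the `tokio` crate and a hypothetical `DecodedEvent` type; it illustrates the idea rather than Torii's actual implementation.

```rust
use std::collections::HashMap;

use tokio::sync::mpsc;

/// Hypothetical decoded event carrying the key used for ordering.
#[derive(Debug, Clone)]
struct DecodedEvent {
    ordering_key: String,
    payload: Vec<u8>,
}

/// Routes events to one worker per ordering key: events sharing a key are
/// handled sequentially by that worker, while distinct keys run in parallel.
struct Dispatcher {
    queues: HashMap<String, mpsc::Sender<DecodedEvent>>,
}

impl Dispatcher {
    fn new() -> Self {
        Self { queues: HashMap::new() }
    }

    async fn dispatch(&mut self, event: DecodedEvent) {
        let sender = self
            .queues
            .entry(event.ordering_key.clone())
            .or_insert_with(|| {
                let (tx, mut rx) = mpsc::channel::<DecodedEvent>(1024);
                // One task per ordering key preserves per-key ordering.
                tokio::spawn(async move {
                    while let Some(ev) = rx.recv().await {
                        handle_event(ev).await; // sequential within this key
                    }
                });
                tx
            });
        // Bounded channel: awaiting here applies back-pressure per key.
        let _ = sender.send(event).await;
    }
}

/// Placeholder for handing the event to the CompositeSink / StoreEngine.
async fn handle_event(ev: DecodedEvent) {
    println!("processed {} ({} bytes)", ev.ordering_key, ev.payload.len());
}

#[tokio::main]
async fn main() {
    let mut dispatcher = Dispatcher::new();
    for i in 0..4 {
        dispatcher
            .dispatch(DecodedEvent {
                ordering_key: format!("entity-{}", i % 2),
                payload: vec![0u8; 8],
            })
            .await;
    }
    // Give the spawned workers a moment to drain before this sketch exits.
    tokio::time::sleep(std::time::Duration::from_millis(50)).await;
}
```

Bounded per-key channels give natural back-pressure; a production dispatcher would additionally need to retire idle workers and surface sink errors.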
---

## 🧩 Extensibility

| Extension Point | Description |
|-----------------|-------------|
| **Fetchers** | Swap between RPC, gRPC, or other sources |
| **Decoders** | Add support for new contract types |
| **Sinks** | Add custom sinks for storage or streaming |
| **Query Engine** | Swap SQLite for Postgres or others |
| **Interfaces** | Add CLI, Web UI, etc., using the QueryEngine |

---

## 💡 Component Responsibilities Summary

| Component | Direction | Role |
|-----------|-----------|------|
| **Fetcher** | Input | Pulls events from chain or stream |
| **Decoder** | Transform | Decodes into structured Torii events |
| **Dispatcher** | Control | Routes by ordering key to queues |
| **Sink** | Output | Consumes events, may persist or publish |
| **Store Engine** | Write | Persists decoded data in structured form |
| **Query Engine** | Read | Exposes typed queries over stored data |
| **gRPC/REST** | Interface | Client-facing protocols using QueryEngine |

---

## ✅ Design Goals

- 🔁 **Event-driven, parallel architecture**
- 🧩 **Pluggable sinks, fetchers, and storage**
- 🔐 **Separation of concerns between write and read**
- 🌍 **Simultaneous indexing of multiple contracts**
- 🔍 **Queryable by clients over gRPC, REST, or GraphQL**
- 🪶 **Lightweight default (SQLite) with room to grow (Postgres, NoSQL)**

---

## 📚 Example Runtime Configuration (`torii.toml`)

```toml
[contracts]
beacon = ["0xabc...", "0xdef..."]
erc721 = ["0x123...", "0x456..."]

[fetcher]
type = "jsonrpc"
url = "https://api.cartridge.gg/x/starknet/mainnet"
# configuration for the event fetching.

[sinks.sqlite]
path = "torii.db"

# ...
```

---

This architecture ensures **performance**, **scalability**, and **developer ergonomics** while staying open to evolving protocol and application needs.

## Torii-Torii Syncing

### ⚠️ Core Challenges in Syncing Torii from Torii

| Challenge | Description |
|-----------|-------------|
| ⏱ Missing Block Boundaries | Most stored records (e.g., model data, token metadata) lack block info |
| 🔄 Overwritten State | Store and token data are updated in place, not versioned |
| 📤 Metadata is Expensive | IPFS/HTTP metadata must be re-fetched and is slow or unreliable |
| 🧩 Schema Mutability | Models evolve: fields can be added and types widened |
| 🔀 Race Conditions | New messages (on/offchain) can arrive during sync |
| 🧠 No Chain Guarantees | Offchain signed messages aren't replayable from the chain and require coordination |

Currently, the simplest path is to rely on a database snapshot. If the `SqliteSink` is used, relying on the database engine's export mechanism is the most reliable way to ensure data integrity. However, this doesn't solve the offchain message problem, even if the Torii instance syncing from a snapshot is connected to the peer-to-peer network from the beginning.
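For the snapshot itself, here is a minimal sketch of one way to export a consistent copy using SQLite's `VACUUM INTO`, assuming the `rusqlite` crate; the crate choice and file names are assumptions for illustration, not part of Torii.

```rust
use rusqlite::Connection;

/// Export a consistent snapshot of an SQLite database file.
/// `VACUUM INTO` (SQLite 3.27+) writes a compacted, self-contained copy
/// that another Torii instance could start from.
fn export_snapshot(db_path: &str, snapshot_path: &str) -> rusqlite::Result<()> {
    let conn = Connection::open(db_path)?;
    conn.execute_batch(&format!("VACUUM INTO '{snapshot_path}';"))?;
    Ok(())
}

fn main() -> rusqlite::Result<()> {
    // Hypothetical file names; the source path comes from [sinks.sqlite] in torii.toml.
    export_snapshot("torii.db", "torii-snapshot.db")
}
```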
## 🧭 Alternative Design: Unified MQTT Sink

An alternative approach considered is to **replace all individual sinks** with a single, centralized **`MqttSink`**. This design would have Torii **broadcast all decoded events via MQTT**, and downstream systems (e.g., database writers, GraphQL APIs, metadata processors) would subscribe as independent MQTT clients.

### 🔁 MQTT-Centric Architecture

```
        [ Starknet Node / Stream ]
                   ↓
               [ Fetcher ]
                   ↓
               [ Decoder ]
                   ↓
          [ Event Dispatcher ]
                   ↓
    [ Task Queues (by ordering key) ]
                   ↓
              [ MQTT Sink ]
   ┌──────────────┬───────────────┬──────────────────┐
   ↓              ↓               ↓                  ↓
[ Sqlite Client ] [ Grpc Client ] [ Metadata ] [ CustomSink... ]
```

### ✅ Pros

- **Unified interface**: All consumers subscribe via MQTT, reducing coupling between core logic and extensions.
- **Loose coupling**: Each consumer is completely decoupled from Torii internals and can scale independently.
- **Network ready**: MQTT works well across distributed systems and can integrate with external components easily.

### ❌ Cons

- **Latency overhead**: Events must be serialized/deserialized even for local consumption (e.g., DB writing), introducing unnecessary latency and CPU cost.
- **Performance bottlenecks**: MQTT guarantees like *exactly-once delivery* are not trivial and often require stateful tracking or QoS level 2, which can degrade throughput significantly. QoS 2 would be required to avoid processing messages twice; otherwise, consumers would bear the overhead of filtering duplicate incoming events (see the publish-side sketch below).

![image](https://hackmd.io/_uploads/ByoooVONxe.png)

---
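To make the serialization and QoS trade-off above concrete, here is a minimal publish-side sketch of what a unified `MqttSink` could look like, assuming the `rumqttc` and `tokio` crates, a JSON-ish payload, and a hypothetical topic scheme; this illustrates the considered design, not an implemented component.

```rust
use std::time::Duration;

use rumqttc::{AsyncClient, MqttOptions, QoS};

/// Hypothetical serialization of a decoded Torii event; a real sink would
/// likely use a typed schema (e.g. protobuf) shared with consumers.
fn encode_event(ordering_key: &str, payload: &[u8]) -> String {
    format!(
        "{{\"ordering_key\":\"{}\",\"payload_len\":{}}}",
        ordering_key,
        payload.len()
    )
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical broker address and client id.
    let options = MqttOptions::new("torii-mqtt-sink", "localhost", 1883);
    let (client, mut eventloop) = AsyncClient::new(options, 64);

    // The event loop must be polled for publishes (and acks) to make progress.
    tokio::spawn(async move {
        loop {
            if eventloop.poll().await.is_err() {
                break;
            }
        }
    });

    // QoS 2 (ExactlyOnce) prevents duplicate processing downstream, at the
    // cost of a four-packet handshake per message.
    client
        .publish(
            "torii/events/store_set_record", // hypothetical topic scheme
            QoS::ExactlyOnce,
            false, // retain
            encode_event("entity-42", b"..."),
        )
        .await?;

    // Let the event loop flush the publish before this sketch exits.
    tokio::time::sleep(Duration::from_millis(100)).await;
    Ok(())
}
```

Publishing at `QoS::ExactlyOnce` removes duplicates at the protocol level, but the extra acknowledgement round-trips per message are exactly the throughput cost called out in the cons above.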