## Executive Summary

Eliza Cloud is a unified platform that provides developers with a single API key to access all the services needed to build and deploy AI agents. The platform combines managed inference, storage, database, and container hosting into one cohesive system, while providing the infrastructure to run ElizaOS agents at scale with proper multi-tenancy, billing, and observability.

## Product Vision and Rationale

The current landscape of AI development requires developers to manage credentials and integrations across multiple providers: OpenAI for language models, AWS S3 for storage, Postgres for persistence, various hosting providers for compute, and custom message infrastructure for real-time communication. This fragmentation creates significant operational overhead and increases the barrier to entry for teams wanting to deploy production AI systems.

Eliza Cloud consolidates these disparate services behind a single API key and unified billing model. When a developer obtains an Eliza API key, they immediately gain access to inference across all major model providers, object storage, database persistence, and container hosting. More importantly, they can deploy ElizaOS agents—complete with the full agent runtime, memory systems, and plugin architecture—without managing infrastructure.

The platform serves two primary functions. First, it provides a comprehensive API service where a single key unlocks storage, inference, database access, and other core capabilities that agents require. Second, it offers managed hosting for ElizaOS agents, allowing developers to deploy agents from templates or custom configurations through either the web interface or CLI, with the platform handling container orchestration, health monitoring, and multi-tenant isolation.

## System Architecture

The platform consists of several interconnected services that present a unified interface to developers while maintaining clear separation of concerns internally.
### Authentication and Tenancy Model

Every user belongs to exactly one organization, which serves as the primary tenancy boundary. Organizations own agents, API keys, and resources. We've chosen not to implement a "project" abstraction—agents themselves serve as the atomic unit of organization. This simplification reduces cognitive overhead while still providing the grouping and isolation features teams need.

Authentication flows through WorkOS for SSO support, with Google and GitHub as the initial providers. The system uses JWT tokens for session management, with API keys serving as the primary authentication mechanism for programmatic access. These API keys work identically for both human developers and automated agents, providing a unified access model across all platform services.

### Agent and Container Management

Agents are first-class entities in the system, containing character configuration, runtime settings, and plugin specifications. When deployed, an agent runs inside an isolated container—Docker for local development, Cloudflare Containers for production. The platform provides prebuilt container images with the ElizaOS runtime preconfigured, though the CLI will support custom container deployment for advanced use cases.

Container sizing follows a simple small/medium/large model that maps to Cloudflare's container presets, abstracting away the complexity of resource allocation while providing predictable pricing. Containers include health checking, graceful shutdown, and automatic restart capabilities. Logs are retained for 24 hours by default, with paid retention available for longer periods.

### Message Server Integration

The platform embeds the ElizaOS GUI and integrates with a message server that facilitates communication between users and agents. This follows the existing ElizaOS room-based architecture but adds multi-tenant isolation.
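The invite-only room model can be sketched as a simple membership check. All names and shapes here are illustrative assumptions, not the actual message-server API:

```typescript
// Sketch of invite-only room access for agents. Types and field names
// are illustrative; the real message server may differ.
interface Room {
  id: string;
  organizationId: string;
  invitedAgentIds: Set<string>;
}

// An agent may join a room only if the room belongs to the agent's own
// organization AND the agent was explicitly invited.
function canAgentJoin(room: Room, agentId: string, agentOrgId: string): boolean {
  return room.organizationId === agentOrgId && room.invitedAgentIds.has(agentId);
}
```

The key property is that both conditions must hold: an invite alone never lets an agent cross an organizational boundary.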
Critically, agents cannot create or join arbitrary rooms—they can only participate in rooms to which they've been explicitly invited. This design choice ensures clear security boundaries and prevents agents from accidentally crossing organizational boundaries.

When a user initiates a conversation with an agent, the platform provisions a room on the message server, provides the agent with connection credentials, and ensures both the user and agent join the same room. This happens transparently whether using the embedded GUI or connecting programmatically through the API.

### Storage and Persistence

Storage operates through R2 with an S3-compatible API, providing familiar interfaces for file operations. Each organization receives isolated storage with configurable quotas. The platform automatically handles namespacing, access control, and usage tracking.

For structured data persistence, we provide a managed database interface. This isn't intended to replace dedicated analytical databases but rather to provide a convenient, authenticated storage layer for agent state, conversation history, and application data. The same API key that authenticates inference requests also authorizes database operations, with Row Level Security ensuring complete tenant isolation.

### Billing and Credit System

The platform operates on a prepaid credit model, similar to OpenAI or Anthropic. Organizations purchase credits in USD, which are then consumed as they use platform services. All services—inference, storage, compute, database—deduct from the same credit pool, providing a unified spending model.

The pricing strategy is straightforward: we apply a 20% markup to all underlying costs. This markup covers platform overhead, development costs, and provides margin while remaining transparent and predictable for customers.
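The markup rule is simple enough to state as a pricing helper. Working in integer cents and rounding up are illustrative choices, not a stated platform policy:

```typescript
// Pricing sketch: charge the underlying provider cost plus a fixed 20%
// markup. Integer cents and ceiling rounding are illustrative choices,
// not a documented billing policy.
const MARKUP_RATE = 0.2;

// Convert a raw provider cost into the amount deducted from the
// organization's credit balance.
function billedCents(providerCostCents: number): number {
  return Math.ceil(providerCostCents * (1 + MARKUP_RATE));
}
```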
When an organization's balance approaches zero, the platform implements a graduated response: inference services are throttled first, then after a grace period, containers are stopped. This prevents accidental overspend while giving users time to add credits. New accounts receive $5 in promotional credits to enable immediate experimentation. Auto-pay functionality allows organizations to maintain their credit balance automatically, with configurable thresholds and amounts.

### Services Plugin and Provider Abstraction

A key architectural decision is the Services Plugin, which ships as a default-enabled plugin for all new agents. This plugin wraps all model provider integrations, storage operations, and platform services behind a unified interface. Rather than configuring individual provider credentials, developers select from cost profiles (Standard or Professional) that automatically route requests to appropriate models based on cost and capability requirements.

This abstraction layer serves multiple purposes. It simplifies the developer experience by eliminating provider-specific configuration. It enables transparent failover when providers experience issues. It allows the platform to optimize costs by routing requests to the most economical provider that meets requirements. And it provides a consistent interface that remains stable even as we add or remove underlying providers.

## Implementation Status

The shared type system is complete, providing comprehensive type definitions for all entities, API contracts, and service interfaces. Database schemas exist for organizations, users, API keys, usage records, credit transactions, containers, plugins, and audit logs. The authentication types, analytics interfaces, and provider abstractions are fully specified.

The API service endpoints for inference, embeddings, image generation, and video generation are operational. Rate limiting, circuit breakers, and health checking infrastructure are in place.
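The circuit-breaker behavior mentioned above can be sketched as a minimal state machine. The threshold, cooldown, and half-open handling are illustrative, not the platform's actual tuning:

```typescript
// Minimal circuit-breaker sketch: after a run of consecutive failures
// the breaker opens and calls are rejected until a cooldown elapses,
// at which point a trial request is allowed through ("half-open").
// Threshold and cooldown values are illustrative.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly failureThreshold = 5,
    private readonly cooldownMs = 30_000,
  ) {}

  allowRequest(now: number): boolean {
    if (this.openedAt === null) return true; // closed: pass through
    // Half-open: allow a trial request once the cooldown has elapsed.
    return now - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null; // close the breaker again
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }
}
```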
The plugin registry with semantic search is functional.

Key components requiring implementation include the agent entity system and associated CRUD operations, WorkOS authentication flows, Stripe payment integration, the unified analytics dashboard, the container orchestration layer with Cloudflare Containers, IP allowlisting for API keys, and the Services Plugin itself. The message server integration needs to be extended to support multi-tenant room provisioning and access control.

## Technical Decisions and Tradeoffs

Several architectural decisions warrant explanation for their impact on the system design.

The decision to avoid a "project" abstraction layer simplifies the mental model significantly. Rather than organizations containing projects containing agents, we have organizations directly containing agents. Agents can be cloned or duplicated to create variants, providing the organizational capabilities of projects without the additional complexity.

Using a single API key system for both human and agent access simplifies authentication and authorization logic throughout the system. The tradeoff is that we must carefully manage permissions and rate limits to prevent abuse, hence the addition of optional IP allowlisting and configurable rate limits per key.

The choice of Cloudflare Containers over Kubernetes or ECS reduces operational complexity while providing sufficient isolation and scaling capabilities for our use case. The abstraction layer allows us to switch providers if needed, but Cloudflare's global presence and integrated CDN provide advantages for our distributed user base.

Implementing credits in USD only for the initial version simplifies accounting and reduces currency conversion complexity. Stripe handles tax calculation and payment processing, allowing us to focus on usage tracking and cost attribution rather than payment infrastructure.

The 24-hour log retention default balances cost with debugging needs.
Most debugging happens immediately after issues occur, and paid retention is available for production workloads requiring longer audit trails.

## Security and Compliance Considerations

The platform implements Row Level Security at the database level, ensuring complete tenant isolation even in the event of application-layer bugs. All API keys are hashed with bcrypt and only shown once at creation. The system maintains comprehensive audit logs for all sensitive operations.

User deletion follows a tombstone pattern, preserving referential integrity while removing personally identifiable information. This approach satisfies GDPR requirements while maintaining system consistency. The optional IP allowlisting for API keys provides additional security for production deployments, while the organization-level allowed IPs setting provides a broader security boundary for all resources.

## Operational Considerations

The platform is designed for horizontal scaling, with stateless API servers and distributed job processing. Health checks at multiple levels—service, provider, and container—ensure rapid detection and response to issues. Circuit breakers prevent cascade failures when downstream services experience problems.

Observability is built in from the start, with structured logging, distributed tracing, and comprehensive metrics. Sentry captures exceptions with full context, while PostHog tracks product usage patterns. The analytics dashboard provides both operational and business metrics, enabling data-driven decisions about platform evolution.

---

*(Alternative Version)*

# Product Overview

This platform has a singular intent: make it trivial for a developer to ship, run, and observe Eliza‑based agents without assembling a patchwork of vendors and credentials. The product presents as one surface with one API key.
That key unlocks inference across multiple model providers, object storage, container hosting for agents, and a managed database surface—all behind the same permissions, billing, and analytics. The same key is usable by humans, services, and agents, so every call and cost ties back to a single identity.

# Problem and rationale

Today, getting an agent into production means collecting cloud accounts, provisioning storage, picking a host, wiring a message layer, and tracking costs across disjoint dashboards. It is error‑prone and difficult to reason about. The platform consolidates the essentials—models, storage, hosting, database, and messaging—so teams can move from a working agent to a usable product without building infrastructure. The goal is not to hide ElizaOS; it is to provide a production‑ready substrate that matches how ElizaOS already works while adding multi‑tenancy, billing, and observability.

# What the product is

A developer signs in, receives an Eliza API key, and can immediately do useful work: run model inference through a unified endpoint, store and fetch files over an S3‑compatible interface backed by Cloudflare R2, and persist structured state in a shared Postgres database. When ready to host an agent, the developer deploys it from a template in the web console or directly from the CLI. In development, agents run locally in Docker; in production, agents run in Cloudflare Containers.

On startup, each agent connects to a managed message server, joins a room scoped to its tenant, and is available through the embedded ElizaOS GUI or any client that speaks the same room protocol. From the user’s point of view, it is the familiar Eliza agent experience with first‑class multi‑tenant isolation and lifecycle controls.

# Scope and boundaries for v1

Every user belongs to exactly one organization. There is no organization management UI in v1; the account and its developer API key are implicitly tied to that organization.
Keys do not expire by default and can be revoked at any time. Optional IP allow‑listing is available to make keys safe for limited front‑end use. All features behind the key are full‑scope in v1; fine‑grained scopes can come later.

Agents are the primary unit of deployment. There is no separate “project” abstraction; an organization has a list of agents. Agents can be duplicated or cloned. Each agent is hosted in its own container. For production, the product targets Cloudflare Containers and maintains an internal abstraction so alternative providers can be supported later without changing user workflows. Agents support an on/off lifecycle rather than autoscaling. Each agent runs as a single instance. Persistent volumes are available as an option; some agents will not need them, others—such as coding or tooling‑heavy agents—will.

The platform embeds the ElizaOS GUI and operates a shared message server. Agents and people meet in rooms. Agents do not create arbitrary rooms; the client model remains authoritative and invites agents into rooms as needed.

Multi‑tenancy is enforced end‑to‑end. The database is a shared Postgres cluster with strict row‑level security on every table. All rows carry a tenant identifier; tenants only see their own data. A dedicated Postgres per tier is a possible future direction, but not part of v1.

# Billing model and pricing posture

Billing is prepaid credit, denominated in USD and processed by Stripe. Users top up a balance, optionally enable auto‑pay with a threshold and refill amount, and then consume services against that balance. The platform applies a fixed 20% markup over underlying provider costs for everything it resells: model usage, storage, containers, logging, and persistent volumes. At signup, new accounts receive a small promotional credit to explore the system.

When a balance reaches zero, the platform begins graceful degradation. High‑burn services such as inference stop first.
Low‑burn services such as running containers may continue briefly under a small negative threshold to avoid abrupt outages, after which they are suspended. The exact thresholds and grace periods are configurable but kept simple in v1 to reduce operational complexity. Invoices and receipts are available in the console. Disputes and refunds are handled manually via support.

# Provider strategy and models

The platform exposes a unified “super‑provider” interface for inference. Users select models through our service rather than bringing their own provider API keys in v1. The platform maintains current model catalogs and pricing, routes requests to the appropriate upstream, and records usage, latency, and cost per call. This approach enables consistent analytics, predictable billing, and a single integration surface for the ElizaOS ecosystem.

# Storage and database

File storage is backed by Cloudflare R2, presented through an S3‑compatible API and surfaced in the UI and CLI. The database surface is a shared Postgres instance that agents and clients can access through authenticated endpoints; it exists to hold durable application state that does not belong in blob storage. All access is tenant‑scoped. The embedded SQL viewer is available in the GUI but filtered by row‑level security to ensure users only see their own data. For v1, a curated set of ElizaOS plugins is whitelisted; their schemas are pre‑migrated or auto‑migrate safely into the shared database.

# Agent hosting and lifecycle

Production agents run as one container per agent with a clear lifecycle: create, start, stop, restart, view health, and inspect logs. Container sizes map to Cloudflare’s standard profiles, and the platform presents them as simple size choices. Health checks and reasonable timeouts are enforced. Logs are retained briefly by default for troubleshooting; extended retention is available as a billed feature. Persistent disk is optional and billed proportionally.
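The create/start/stop/restart lifecycle can be sketched as a small transition table. State and action names are illustrative, not the platform's actual schema:

```typescript
// Sketch of the per-agent container lifecycle: a transition table over
// create/start/stop/restart. State and action names are illustrative.
type ContainerState = "created" | "running" | "stopped";
type Action = "start" | "stop" | "restart";

const TRANSITIONS: Record<ContainerState, Partial<Record<Action, ContainerState>>> = {
  created: { start: "running" },
  running: { stop: "stopped", restart: "running" },
  stopped: { start: "running" },
};

// Returns the next state, or null when the action is invalid in the
// current state (e.g. stopping a container that never started).
function applyAction(state: ContainerState, action: Action): ContainerState | null {
  return TRANSITIONS[state][action] ?? null;
}
```

Keeping the lifecycle this small is what makes the on/off model (no autoscaling, one instance per agent) easy to reason about.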
For v1, image provenance is controlled through first‑party agent images in the GUI; the CLI will add support for deploying custom images later with appropriate safety controls.

# Identity, access, and audit

Authentication is provided through WorkOS, with email‑based flows for development and SSO providers such as Google and GitHub for production use. API keys are the primary credential for programmatic access and are valid for both human use and agent use. Keys display a stable prefix and “last used” metadata for operational visibility. User deletion follows a tombstone pattern; records are marked for deletion while preserving referential integrity. Comprehensive audit logging is enabled across key actions and is designed to support SOC 2 controls as the product matures.

# Analytics and operational visibility

The console functions as a single dashboard for cost and behavior. Users can see requests, token usage, latency, and cost by provider and model, with trend views over time. Inputs and responses can be surfaced for debugging when explicitly enabled; otherwise only metadata is retained. For agents, the console shows recent logs, container health, and basic resource metrics. The intent is to make the running cost and behavior of agents legible enough that users can make informed choices—switch models, change sizes, or disable features—without leaving the platform.

# User experience and interfaces

The web console and the CLI are first‑class and equivalent in capability. Anything done in the console—creating agents, deploying containers, managing storage, running test inferences, rotating keys, downloading receipts—can be performed from the CLI. The console embeds the ElizaOS GUI so a user can join the same room as a running agent immediately after deployment, using the platform’s message server without additional configuration.

# ElizaOS Services plugin

To reduce setup friction, a default Services plugin ships with agents.
It wraps the platform’s unified inference, storage, hosting, and database endpoints, exposes the current model catalog, and provides sensible defaults, including optional cost profiles that favor either standard pricing or higher‑end quality. The plugin is configurable but present by default so that an agent deployed on the platform is operational without additional keys or wiring.

# Out of scope for v1

Static site hosting is not included. Autoscaling agents beyond a single instance is not included. Dedicated per‑tenant Postgres is not included. Complex egress policies and strict container sandboxing are deferred while we prioritize isolation through tenancy boundaries and container separation. Users cannot supply their own third‑party model provider keys in v1; that option can be reconsidered after the one‑key experience is established.
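The cost-profile routing described for the Services plugin can be sketched as a catalog lookup: the chosen profile restricts which catalog entries are eligible, and the cheapest eligible entry wins. The tier labels, entries, and prices below are invented placeholders, not the real catalog:

```typescript
// Sketch of cost-profile routing: "standard" picks the cheapest model
// overall, "professional" restricts the pool to higher-end entries.
// Catalog entries, tiers, and prices are invented placeholders.
interface CatalogModel {
  id: string;
  tier: "standard" | "premium";
  usdPerMillionTokens: number;
}

type CostProfile = "standard" | "professional";

function pickModel(catalog: CatalogModel[], profile: CostProfile): CatalogModel | undefined {
  const pool =
    profile === "professional"
      ? catalog.filter((m) => m.tier === "premium")
      : catalog;
  // Cheapest model within the allowed pool.
  return [...pool].sort((a, b) => a.usdPerMillionTokens - b.usdPerMillionTokens)[0];
}
```

Because routing happens inside the plugin, the platform can update the catalog or swap upstream providers without any change to deployed agents.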