# Asimov Humanoid Robot Platform - Deep Dive Tutorial

Created with Claude Code on 10-Sep-2025

This tutorial will teach you the Asimov codebase from the ground up. It's a sophisticated robotics platform with real-time WebRTC streaming, gRPC communication, and a multi-component architecture.

## Table of Contents

1. [System Architecture Overview](#system-architecture-overview)
2. [Robot Components Deep Dive](#robot-components-deep-dive)
3. [Communication Patterns](#communication-patterns)
4. [WebRTC Media Streaming](#webrtc-media-streaming)
5. [Frontend Web Interface](#frontend-web-interface)
6. [Build Systems and Development](#build-systems-and-development)
7. [Data Flow Examples](#data-flow-examples)
8. [Getting Started Walkthrough](#getting-started-walkthrough)

## System Architecture Overview

The Asimov platform consists of four main components that work together to create a complete robotics system:

```mermaid
graph TB
    subgraph "Web Layer"
        FE[Frontend<br/>Next.js React App<br/>WebRTC Video Player]
        BE[Backend<br/>Mediasoup Server<br/>WebSocket Signaling]
    end

    subgraph "Robot Hardware Layer"
        subgraph "Robot Processes"
            N[Nucleus<br/>Media Processing<br/>WebRTC Producer]
            L[Link<br/>Hardware Interface<br/>Motor Control]
            C[Cortex<br/>High-Level AI<br/>Decision Making]
        end
        subgraph "Hardware"
            CAM[Cameras<br/>RealSense/USB]
            MIC[Microphones]
            ROBOT[Unitree G1<br/>Humanoid Robot]
        end
    end

    subgraph "Development Tools"
        DOCS[Documentation<br/>Astro Starlight]
        SIM[Isaac Sim<br/>Robot Simulation]
    end

    %% Data Flow
    FE <--> BE
    BE <--> N
    N <--> L
    L <--> ROBOT
    N --> CAM
    N --> MIC
    C <--> N
    C <--> L

    %% Styling
    classDef robotProcess fill:#ff6b6b,color:#fff
    classDef webLayer fill:#4ecdc4,color:#fff
    classDef hardware fill:#45b7d1,color:#fff
    classDef tools fill:#96ceb4,color:#fff

    class N,L,C robotProcess
    class FE,BE webLayer
    class CAM,MIC,ROBOT hardware
    class DOCS,SIM tools
```

### Key Design Principles

1. **Separation of Concerns**: Each component has a specific responsibility
2. **Real-time Communication**: WebRTC for low-latency video, gRPC for commands
3. **Modular Architecture**: Components can be developed and tested independently
4. **Safety First**: Multiple communication layers with fallbacks

## Robot Components Deep Dive

### Nucleus - Media Processing Core

**Location**: `robot/nucleus/`
**Language**: C with GStreamer
**Purpose**: Handles video/audio capture and WebRTC streaming

```mermaid
graph LR
    subgraph "Nucleus Process (main.c)"
        INIT[Initialize Modules]
        LOOP[Processing Loop<br/>100Hz]

        subgraph "Core Modules"
            MS[Mediasoup<br/>WebRTC Transport]
            TEL[Telemetry<br/>Metrics Export]
            TO[Teleop<br/>gRPC Client]
            IPC[IPC<br/>Shared Memory]
        end

        subgraph "Media Pipeline"
            CAP[Video Capture<br/>GStreamer]
            ENC[Encoding<br/>VP8/Opus]
            SEND[Frame Sender]
        end
    end

    INIT --> LOOP
    LOOP --> MS
    LOOP --> TEL
    LOOP --> TO
    LOOP --> IPC
    MS --> CAP
    CAP --> ENC
    ENC --> SEND

    classDef process fill:#ff6b6b,color:#fff
    classDef module fill:#ffa726,color:#fff
    classDef media fill:#66bb6a,color:#fff

    class INIT,LOOP process
    class MS,TEL,TO,IPC module
    class CAP,ENC,SEND media
```

**Key Files**:
- `main.c`: Entry point with initialization and main loop
- `mediasoup/`: WebRTC transport using the Mediasoup library
- `teleop/`: gRPC client for receiving commands
- `config.yaml`: Runtime configuration

**Configuration Example**:

```yaml
# WebRTC streaming
server_url: "ws://localhost:8080"
room_name: "test-room"

# Camera settings
width: 848
height: 480
fps: 30
device_path: "/dev/video4"

# Teleoperation
teleop_enabled: true
teleop_server: "localhost:50051"
robot_id: "nucleus-robot"
```
### Link - Hardware Interface

**Location**: `robot/link/`
**Language**: C
**Purpose**: Direct communication with robot hardware (Unitree G1)

```mermaid
graph TD
    subgraph "Link Process (main.c)"
        LINIT[Initialize IPC + Robot Interface]
        LLOOP[Processing Loop<br/>100Hz]

        subgraph "Communication"
            RECV[Receive Commands<br/>from Nucleus]
            EXEC[Execute on Robot<br/>Joint Control]
            STATE[Read Robot State<br/>Sensors/Position]
            SEND[Send State<br/>to Nucleus]
        end

        subgraph "Robot Interface"
            G1[Unitree G1 SDK<br/>Motor Commands]
            SAFETY[Safety Checks<br/>Limits/Emergency]
        end
    end

    LINIT --> LLOOP
    LLOOP --> RECV
    RECV --> EXEC
    EXEC --> G1
    G1 --> SAFETY
    LLOOP --> STATE
    STATE --> SEND

    classDef process fill:#ff6b6b,color:#fff
    classDef comm fill:#4ecdc4,color:#fff
    classDef robot fill:#45b7d1,color:#fff

    class LINIT,LLOOP process
    class RECV,EXEC,STATE,SEND comm
    class G1,SAFETY robot
```

**Key Files**:
- `main.c`: Main processing loop
- `robot_interface/unitree_g1/`: Hardware-specific drivers
- Inter-process communication with Nucleus via shared memory

### Cortex - AI Decision Making

**Location**: `robot/cortex/`
**Status**: Planned/future component
**Purpose**: High-level AI, path planning, behavior control

## Communication Patterns

The system uses multiple communication protocols, each optimized for a different use case:

```mermaid
sequenceDiagram
    participant F as Frontend<br/>(Browser)
    participant B as Backend<br/>(Mediasoup)
    participant N as Nucleus<br/>(Robot)
    participant L as Link<br/>(Hardware)
    participant R as Robot<br/>(G1)

    Note over F,R: WebRTC Video Streaming Setup
    F->>B: WebSocket Connect
    B->>N: WebSocket Connect
    F->>B: Join Room Request
    B->>F: Available Producers
    F->>B: Create WebRTC Transport
    B->>F: Transport Parameters
    F->>B: Start Consuming Video
    N->>B: Produce Video Stream
    B->>F: Video Frames (WebRTC)

    Note over N,R: Robot Command Flow
    N->>L: Command via Shared Memory IPC
    L->>R: Motor Commands (Unitree SDK)
    R->>L: State/Sensor Data
    L->>N: Robot State via IPC
    N->>B: Telemetry Data (optional)

    Note over F,R: User Control Example
    F->>B: User Input (WebSocket)
    B->>N: gRPC Command
    N->>L: Motion Command (IPC)
    L->>R: Joint Movements
```

### Protocol Breakdown

| Protocol | Use Case | Latency | Reliability |
|----------|----------|---------|-------------|
| **WebRTC** | Video streaming | ~50-100ms | Best-effort |
| **WebSocket** | Signaling, UI commands | ~10-50ms | Reliable |
| **gRPC** | Robot commands | ~1-10ms | Reliable |
| **Shared Memory** | Inter-process (Nucleus↔Link) | ~0.1ms | Reliable |
| **Unitree SDK** | Hardware control | ~0.1ms | Critical |

## WebRTC Media Streaming

The video streaming system uses a sophisticated WebRTC implementation:

```mermaid
graph TB
    subgraph "Robot Side (Nucleus)"
        CAM[Camera Capture<br/>RealSense/USB]
        GST[GStreamer Pipeline<br/>Capture → Encode]
        MS_PROD[Mediasoup Producer<br/>WebRTC Sender]
    end

    subgraph "Network Transport"
        INET[Internet/Network<br/>UDP/RTP packets]
    end

    subgraph "Server Side (Backend)"
        MS_SFU[Mediasoup SFU<br/>Selective Forwarding]
        WS_SIG[WebSocket Signaling<br/>SDP/ICE Exchange]
    end

    subgraph "Browser Side (Frontend)"
        MS_CONS[Mediasoup Consumer<br/>WebRTC Receiver]
        VIDEO[HTML Video Element<br/>Display]
        STATS[Statistics<br/>Bitrate/Latency]
    end

    CAM --> GST
    GST --> MS_PROD
    MS_PROD --> INET
    INET --> MS_SFU
    MS_SFU --> INET
    INET --> MS_CONS
    MS_CONS --> VIDEO
    MS_CONS --> STATS
    WS_SIG <--> MS_SFU
    WS_SIG <--> MS_CONS

    classDef robot fill:#ff6b6b,color:#fff
    classDef server fill:#4ecdc4,color:#fff
    classDef browser fill:#45b7d1,color:#fff
    classDef network fill:#96ceb4,color:#fff

    class CAM,GST,MS_PROD robot
    class MS_SFU,WS_SIG server
    class MS_CONS,VIDEO,STATS browser
    class INET network
```

### Video Pipeline Details

1. **Capture**: Camera frames captured via GStreamer
2. **Encoding**: VP8 video codec for WebRTC compatibility
3. **Transport**: RTP packets over UDP with SRTP encryption
4. **SFU Routing**: The Mediasoup server routes streams to multiple viewers
5. **Display**: Browser receives and displays video with minimal latency
## Frontend Web Interface

The React/Next.js frontend provides a modern web interface:

```mermaid
graph TD
    subgraph "Next.js Application (frontend/)"
        subgraph "Pages"
            HOME[page.tsx<br/>Main Video Player]
        end

        subgraph "Components"
            VP[VideoPlayer<br/>WebRTC Display]
            STATS[StatsDisplay<br/>Metrics/Latency]
            UI[UI Components<br/>Radix + Tailwind]
        end

        subgraph "Services"
            WS_CLIENT[WebSocketClient<br/>Mediasoup Integration]
            MS_CLIENT[Mediasoup Client<br/>WebRTC Consumer]
        end

        subgraph "State Management"
            HOOKS[React Hooks<br/>useState/useEffect]
            STREAM_STATE[Stream State<br/>Producers/Consumers]
        end
    end

    HOME --> VP
    HOME --> STATS
    VP --> WS_CLIENT
    WS_CLIENT --> MS_CLIENT
    MS_CLIENT --> STREAM_STATE
    STREAM_STATE --> HOOKS
    HOOKS --> HOME

    classDef page fill:#4ecdc4,color:#fff
    classDef component fill:#45b7d1,color:#fff
    classDef service fill:#96ceb4,color:#fff
    classDef state fill:#ffa726,color:#fff

    class HOME page
    class VP,STATS,UI component
    class WS_CLIENT,MS_CLIENT service
    class HOOKS,STREAM_STATE state
```

### Key Frontend Features

- **Real-time Video**: WebRTC video consumption with the MediaStream API
- **Statistics**: Live bitrate, latency, and connection metrics
- **Responsive Design**: Works on desktop and mobile devices
- **Room Management**: Multi-user rooms with participant tracking

## Build Systems and Development

The project uses multiple build systems for different components:

```mermaid
graph LR
    subgraph "Robot Code (C/C++)"
        BAZEL[Bazel Build<br/>BUILD.bazel files]
        CC[GCC Compiler<br/>System Libraries]
        DEPS[Dependencies<br/>GStreamer, gRPC]
    end

    subgraph "Backend (TypeScript)"
        BUN_BE[Bun Runtime<br/>Fast TypeScript]
        NPM_BE[npm packages<br/>mediasoup, ws]
    end

    subgraph "Frontend (React)"
        NEXT[Next.js Build<br/>Turbo Mode]
        NPM_FE[npm packages<br/>React, Tailwind]
    end

    subgraph "Documentation"
        ASTRO[Astro Build<br/>Static Site]
        STAR[Starlight Theme<br/>Documentation]
    end

    BAZEL --> CC
    CC --> DEPS
    BUN_BE --> NPM_BE
    NEXT --> NPM_FE
    ASTRO --> STAR

    classDef robot fill:#ff6b6b,color:#fff
    classDef web fill:#4ecdc4,color:#fff
    classDef docs fill:#96ceb4,color:#fff

    class BAZEL,CC,DEPS robot
    class BUN_BE,NPM_BE,NEXT,NPM_FE web
    class ASTRO,STAR docs
```

### Development Workflow

```bash
# Robot code development
bazel build //robot/nucleus:nucleus_gst
bazel test //robot/nucleus/test:all

# Backend development
cd backend
bun install
bun start

# Frontend development
cd frontend
npm run dev

# Documentation
cd docs
bun run dev
```

## Data Flow Examples

### Example 1: Video Streaming Session

```mermaid
sequenceDiagram
    participant User as User<br/>(Browser)
    participant FE as Frontend
    participant BE as Backend<br/>(Mediasoup)
    participant N as Nucleus<br/>(Robot)
    participant Cam as Camera

    User->>FE: Opens robot page
    FE->>BE: WebSocket connect
    N->>BE: Register as producer
    N->>Cam: Start video capture
    Cam->>N: Video frames
    N->>BE: Produce video stream

    User->>FE: Click "Watch Stream"
    FE->>BE: Request to consume video
    BE->>FE: WebRTC transport setup
    FE->>BE: Connect transport
    BE->>FE: Start consuming
    N->>FE: Video frames (via BE)
    FE->>User: Display video

    loop Every 2 seconds
        FE->>FE: Collect WebRTC stats
        FE->>User: Update bitrate/latency display
    end
```

### Example 2: Robot Command Execution

```mermaid
sequenceDiagram
    participant UI as Web UI
    participant gRPC as gRPC Server<br/>(External)
    participant N as Nucleus
    participant L as Link
    participant Robot as Unitree G1

    UI->>gRPC: Send robot command
    gRPC->>N: gRPC NucleusControl.SendCommand
    Note over N: Command validation & processing
    N->>L: Forward command (IPC)
    Note over L: Safety checks & conversion
    L->>Robot: Unitree SDK commands
    Robot->>L: Joint feedback
    L->>N: Robot state update (IPC)
    Note over N: State processing
    N->>gRPC: Command response
    gRPC->>UI: Success/failure

    loop Every 10ms (100Hz)
        L->>Robot: Control loop
        Robot->>L: Sensor data
    end
```
## Getting Started Walkthrough

### 1. Environment Setup

First, understand the development environment:

```bash
# Check repository structure
ls -la
# You'll see: robot/ backend/ frontend/ docs/

# Install system dependencies for robot code
sudo apt-get install gstreamer1.0-tools libgstreamer1.0-dev

# Install Bazel for robot builds
# (Follow the Bazel installation guide)
```

### 2. Backend Setup (Mediasoup Server)

```bash
cd backend
bun install   # Install dependencies
bun start     # Start signaling server on port 9095
```

The backend handles:
- WebSocket signaling for WebRTC
- Mediasoup SFU (Selective Forwarding Unit)
- Room management for multiple viewers

### 3. Frontend Setup (React App)

```bash
cd frontend
npm install
npm run dev   # Starts on http://localhost:3000
```

The frontend provides:
- Video player interface
- WebRTC consumer implementation
- Real-time statistics display

### 4. Robot Code (Nucleus)

```bash
# Build the main nucleus binary
bazel build //robot/nucleus:nucleus

# Configure camera and streaming settings
nano robot/nucleus/config.yaml

# Run nucleus (requires camera hardware)
./bazel-bin/robot/nucleus/nucleus
```

### 5. Testing the Full System

1. **Start Backend**: `cd backend && bun start`
2. **Start Frontend**: `cd frontend && npm run dev`
3. **Start Robot**: `./bazel-bin/robot/nucleus/nucleus`
4. **Open Browser**: Navigate to `http://localhost:3000`
5. **Join Room**: Enter a username and room name
6. **Watch Stream**: Click "Watch Stream" to see the robot camera

### 6. Development Tips

- **Robot Logs**: Check `journalctl -f | grep NUCLEUS` for robot logs
- **WebRTC Debug**: Use the browser DevTools Network tab for WebRTC stats
- **Hot Reload**: The frontend has hot reload; the backend restarts automatically
- **Configuration**: Modify `config.yaml` files for different settings

## Understanding the Codebase Structure

```
asimov/
├── robot/              # C/C++ robot code (Bazel)
│   ├── nucleus/        # Media processing & WebRTC
│   ├── link/           # Hardware interface
│   ├── cortex/         # AI/high-level control
│   ├── common/         # Shared libraries
│   │   ├── protos/     # gRPC protocol definitions
│   │   ├── ipc/        # Inter-process communication
│   │   └── utils/      # Common utilities
│   └── simulation/     # Isaac Sim integration
├── backend/            # TypeScript WebRTC server (Bun)
│   ├── index.ts        # Main server entry point
│   └── package.json    # Dependencies
├── frontend/           # Next.js React app (npm)
│   ├── app/            # Next.js App Router
│   ├── components/     # React components
│   └── lib/            # Utility functions
└── docs/               # Astro documentation site
```

This architecture enables:

- **Modular Development**: Each component can be developed independently
- **Language Optimization**: C for performance-critical robot code, TypeScript for web services
- **Scalable Streaming**: Multiple viewers can watch robot feeds simultaneously
- **Real-time Control**: Low-latency command execution and feedback

The system is designed for robotics research, telepresence applications, and remote robot operation, with an emphasis on safety, performance, and user experience.

---

*This tutorial covers the essential concepts needed to understand and contribute to the Asimov robotics platform. Each component builds upon the others to create a complete system for humanoid robot control and monitoring.*