---
title: Sequential Flow Agent Architecture
---

# Sequential Flow Agent Architecture

## 1. Overview

This document outlines the architecture for a new **Sequential Flow Agent**, a specialized conversational agent designed to follow structured, stateful conversation paths. The system is designed as a state machine, driven by a backend **Flow Engine**. Conversation flows are designed visually in a user interface and published to the backend as a single **Flow Configuration** JSON object.

This modular approach allows complex conversational logic (loops, branches, validation) to be built and maintained without writing new code for each agent.

## 2. Core Architectural Design

The system is built upon four primary components:

1. **New Agent View (UI):** The current settings page, including the TTS, STT, Response to Silence, and LLM settings. The Communication Instructions and Glossary from the Welcome section will also be included. All of this belongs in the Settings tab. The view also needs a Workflow tab, where the Visual Flow Builder will be implemented, and a Tools tab (tool definition will not be supported in v1).
2. **The Visual Flow Builder (UI):** This is the new user-facing interface for designing and managing sequential flows. When a user creates a new agent, they will choose between the "Existing" type and the new "Sequential" type. Selecting "Sequential" will launch this builder, which consists of:
   - **Node Toolbox:** A sidebar containing all available node types (Initialize, Say, Ask & Extract, etc.) that users can drag onto the canvas.
   - **Canvas:** The main workspace where users place nodes and visually connect them to map out the conversation flow.
   - **Properties Panel:** A context-sensitive panel that displays the configuration options for the currently selected node. This is where users will write messages, define variables to extract, and set up validation rules.
   - ***TBD*** **Publish & Validate:** A mechanism to check the flow for errors (e.g., disconnected nodes) before compiling the visual layout into the Flow Configuration JSON and sending it to the backend.
3. **Flow Configuration:** A JSON object that acts as the blueprint for a conversation. It defines every node (step), its specific configuration, and the connections between them. It is generated by the UI.
4. **Flow Engine:** The backend service that interprets the Flow Configuration. It manages the conversation's state, executes the logic for the current node, processes user input, and transitions to the next node.
   - **Conversation State:** A runtime JSON object that tracks the live status of a single conversation. It holds the `currentNodeId`, all collected `variables`, and a `history` of the path taken.

## 3. Node Types

Nodes are the fundamental building blocks of a conversation flow. Each node has a specific `type`, a `config` object for its settings, and `connections` that define the possible paths leading out of it.

### 3.1. Initialize Conversation Node - We are going to use the Welcome configuration

* **Purpose:** The mandatory starting point of every flow.
* **Type:** `INITIALIZE`
* **Configuration (`config`):**

```json
{
  "message": "Welcome! How can I help you today?",
  "uninterruptible": false,
  "provideDateTimeToLLM": true,
  "systemPromptXml": "<Key_Context_Points>Make sure to ask for approval before making an appointment</Key_Context_Points><General_Instructions>Adapt approach based on customer experience level (Beginner/Experienced/Some Experience) Maintain professional yet urgent tone throughout Use decision tree logic for conversation navigation Be persistent but respectful. </General_Instructions><Communication_Instructions>Be concise and natural. Do not use symbols or lists</Communication_Instructions><Glossary>No</Glossary>",
  "userPromptXml": "<Conversation_Stages>Greeting\nTell availability\nGather information\nConfirm appointment\nBook appointment</Conversation_Stages><Role_and_Responsibility>You're an AI agent that helps in the health industry</Role_and_Responsibility>",
  "topP": 1,
  "maxTokens": 2000,
  "temperature": 0.3,
  "llmServiceToUse": "AzureOpenAI",
  "presencePenalty": 0,
  "frequencyPenalty": 0,
  "llmServiceConfig": {
    "apiKey": "<AZURE_OPENAI_API_KEY>",
    "deploymentName": "gpt-4o",
    "endpoint": "https://kalimeradevazureopenai.openai.azure.com/"
  }
}
```

* **Outputs:** `start`

### 3.2. Dialog Conversation Node - We are going to use the Dialog configuration

* **Purpose:** The mandatory starting point of every flow.
* **Type:** `DIALOG`
* **Configuration (`config`):**

```json
{
  "message": "Welcome! How can I help you today?",
  "uninterruptible": false,
  "topP": 1,
  "maxTokens": 2000,
  "temperature": 0.3,
  "llmServiceToUse": "AzureOpenAI",
  "presencePenalty": 0,
  "frequencyPenalty": 0,
  "llmServiceConfig": {
    "apiKey": "<AZURE_OPENAI_API_KEY>",
    "deploymentName": "gpt-4o",
    "endpoint": "https://kalimeradevazureopenai.openai.azure.com/"
  },
  "provideDateTimeToLLM": true,
  "userPromptXml": "<Identity_Verification_Stage>Stage 1</Identity_Verification_Stage><Identification_URl>https://dev-kalimera.kalimera.ai/</Identification_URl>",
  "systemPromptXml": "<Introduction_Stage>test</Introduction_Stage><stages><_1>one</_1><_2>two</_2></stages>"
}
```

* **Outputs:** `start`

### 3.3. Presentation Node (`Say`)

* **Purpose:** To deliver a message to the user. Can be configured to be interruptible or not.
* **Type:** `PRESENTATION`
* **Configuration (`config`):**

```json
{
  "message": "We offer services in appointments, billing, and visiting hours.",
  "uninterruptible": false
}
```

* **Outputs:** `next`

### 3.4. Information Extraction Node (`Ask & Extract`)

* **Purpose:** To ask a question and use an LLM to extract structured data from the user's response.
* **Type:** `EXTRACTION`
* **Configuration (`config`):**

```json
{
  "variablesToExtract": [
    {
      "name": "intent",
      "type": "enum",
      "options": ["appointment", "billing", "other"],
      "description": "The caller's primary goal."
    }
  ]
}
```

* **Outputs:** `success`, `failure` (as used in the example flow in Section 6)

### 3.5. Set Variable Node

* **Purpose:** To directly create or modify variables in the **Conversation State** without user interaction. It can assign static values, copy other variables, or evaluate simple expressions to generate dynamic values.
* **Type:** `SET_VARIABLE`
* **Configuration (`config`):** The configuration treats the `value` field as an expression to be evaluated by the **Flow Engine**.

```json
{
  "assignments": [
    {
      "variable": "retry_counter",
      "value": "{{retry_counter}} + 1"
    },
    {
      "variable": "welcome_message",
      "value": "RANDOM_CHOICE(['Hello!', 'Hi there!', 'Greetings!'])"
    },
    {
      "variable": "user_id",
      "value": "RANDOM_INT(1000, 9999)"
    }
  ]
}
```

* **Outputs:** `next`

#### Expression Engine Syntax

The `value` field will be processed according to these rules:

1. **Literals:** A value without any special syntax is treated as a literal string or number (e.g., `"pending"` or `42`).
2. **Variable Access:** Variables from the conversation state are accessed using the `{{variableName}}` syntax.
3. **Arithmetic Operators:** Simple operators like `+` and `-` are supported for numeric operations. As the concatenation example below shows, `+` also joins strings.
4. **Functions:** The engine will support a library of built-in functions called with `FUNCTION_NAME(...)`.

#### Example Expressions

| Use Case | Example Expression | Result |
| :--- | :--- | :--- |
| **Increment a counter** | `{{retry_counter}} + 1` | If `retry_counter` was 2, the new value is 3. |
| **Set a static string** | `"in_progress"` | The value is the string "in_progress". |
| **Pick a random integer** | `RANDOM_INT(1, 100)` | A random whole number between 1 and 100. |
| **Select a random greeting** | `RANDOM_CHOICE(['Hello', 'Hi'])` | Either the string "Hello" or "Hi". |
| **Concatenate variables** | `{{firstName}} + ' ' + {{lastName}}` | If `firstName` is "Jane" and `lastName` is "Doe", the value is "Jane Doe". |

### 3.6. Validation Node (`Validate & Confirm`)

* **Purpose:** A two-step node that first performs programmatic validation on data and then optionally asks the user for confirmation.
* **Type:** `VALIDATION`
* **Configuration (`config`):**

```json
{
  "validations": [
    {
      "variable": "phoneNumber",
      "rejectionPrompt": "That doesn't sound like a valid 10-digit phone number. Please try again.",
      "rules": [
        { "function": "isNumeric" },
        { "function": "hasLength", "params": { "exact": 10 } }
      ]
    }
  ],
  "confirmation": {
    "enabled": true,
    "prompt": "I have your number as {{phoneNumber}}. Is that correct?",
    "maxAttempts": 2
  }
}
```

* **Outputs:** `validation_failed`, `success`, `denied`, `max_attempts_reached`

### 3.7. Decision Making Node (`Switch`)

* **Purpose:** To route the conversation down different paths based on a variable's value, enabling `if/then/else` or `switch` logic.
* **Type:** `DECISION`
* **Configuration (`config`):**

```json
{
  "variableToCheck": "intent",
  "conditions": [
    {
      "operator": "equals",
      "value": "appointment",
      "targetNodeId": "ask_for_name_node"
    },
    {
      "operator": "equals",
      "value": "billing",
      "targetNodeId": "billing_info_node"
    }
  ],
  "defaultTargetNodeId": "unsupported_intent_node"
}
```

* **Outputs:** The outputs are defined dynamically by the `targetNodeId` properties within the configuration.

### 3.8. Finish Conversation Node - We are going to use the Farewell configuration

* **Purpose:** The mandatory end point of a flow that delivers a final message and terminates the call.
* **Type:** `FINISH`
* **Configuration (`config`):**

```json
{
  "message": "Thank you for calling. Goodbye!",
  "topP": 1,
  "maxTokens": 2000,
  "temperature": 0.3,
  "llmServiceToUse": "AzureOpenAI",
  "presencePenalty": 0,
  "frequencyPenalty": 0,
  "llmServiceConfig": {
    "apiKey": "<AZURE_OPENAI_API_KEY>",
    "deploymentName": "gpt-4o",
    "endpoint": "https://kalimeradevazureopenai.openai.azure.com/"
  },
  "provideDateTimeToLLM": true,
  "userPromptXml": "",
  "systemPromptXml": ""
}
```

* **Outputs:** None.

## 4. Control Flow Patterns (`goto` and `while`)

Control-flow constructs such as unconditional jumps (`goto`) and loops (`while`) are supported through connection patterns rather than dedicated nodes. This provides greater flexibility and keeps the visual design intuitive.

**4.1 Supporting "goto" through Connections:** The "goto" functionality is an inherent feature of the connection system. Connecting any node's output directly to the input of any other, non-sequential node is a "goto". For example, after a `Presentation Node` informs the user that their intent is unsupported, its `next` output can be connected directly back to the initial `Information Extraction Node`. This creates a clean, unconditional jump to re-ask the user's intent without restarting the entire call.

**4.2 Supporting "while" through Looping Patterns:** A `while` loop is not a single node but a design pattern you create by combining a condition-checking node (`Validation` or `Decision`) with a backward "goto" connection. This pattern has two parts:

1. **The Action:** A node that performs a task (e.g., `Ask & Extract` a phone number).
2. **The Condition:** A `Validation` node that checks if the data is valid.
   - **Exit Condition:** If the validation passes (`success`), the flow proceeds forward.
   - **Loop Condition:** If the validation fails (`validation_failed`), its output is connected back to the Action node, restarting the loop.

This visually creates a `do-while` loop: keep asking for the information while the validation fails.

## 5. Dynamic Jumps (Global Intent Router)

To handle out-of-sequence user requests (e.g., correcting information from a previous step), a **Global Intent Router** is used.

1. **Process:** Before the current node processes the user's input, the input is first sent to the router (an LLM call).
2. **Intent Detection:** The router determines whether the user is answering the last question (`CONTINUE_FLOW`) or has a different intent (e.g., `CORRECT_INFORMATION`).
3. **Dynamic Jump:** If a different intent is detected, the **Flow Engine** pauses the current path, stores its location in a `returnContext` within the state, and jumps to the relevant node to handle the new intent. After handling it, the engine can use the `returnContext` to jump back and resume the original flow.

-----

## 6. Example: Hospital Welcome Flow Configuration

This JSON represents a complete flow for a hospital, demonstrating how the nodes connect to create a full conversation.

```json
{
  "flowId": "hospital-welcome-flow-v2",
  "startNodeId": "init_1",
  "nodes": {
    "init_1": {
      "type": "INITIALIZE",
      "config": {
        "message": "Welcome to St. Gemini Hospital. How can I help you today?"
      },
      "connections": { "start": "presentation_1" }
    },
    "presentation_1": {
      "type": "PRESENTATION",
      "config": {
        "message": "You can say things like 'I want to book an appointment' or 'I have a question about billing'."
      },
      "connections": { "next": "extract_intent_2" }
    },
    "extract_intent_2": {
      "type": "EXTRACTION",
      "config": {
        "variablesToExtract": [
          {
            "name": "intent",
            "type": "enum",
            "options": ["appointment", "billing", "visiting_hours", "other"],
            "description": "The caller's primary goal."
          }
        ]
      },
      "connections": {
        "success": "intent_decision_5",
        "failure": "unsupported_intent_4"
      }
    },
    "unsupported_intent_4": {
      "type": "PRESENTATION",
      "config": {
        "message": "I'm sorry, I don't think I can help with that. Is there anything else you need?"
      },
      "connections": { "next": "extract_intent_2" }
    },
    "intent_decision_5": {
      "type": "DECISION",
      "config": {
        "variableToCheck": "intent",
        "conditions": [
          {
            "operator": "equals",
            "value": "appointment",
            "targetNodeId": "extract_name_and_phone_6"
          }
        ],
        "defaultTargetNodeId": "unsupported_intent_4"
      },
      "connections": {}
    },
    "extract_name_and_phone_6": {
      "type": "EXTRACTION",
      "config": {
        "prompt": "Great. To book an appointment, I'll need your first name, last name, and a 10-digit phone number.",
        "variablesToExtract": [
          { "name": "firstName", "type": "string", "description": "The user's first name." },
          { "name": "lastName", "type": "string", "description": "The user's last name." },
          { "name": "phoneNumber", "type": "string", "description": "The user's 10-digit phone number." }
        ]
      },
      "connections": {
        "success": "validate_phone_7",
        "failure": "finish_goodbye_99"
      }
    },
    "validate_phone_7": {
      "type": "VALIDATION",
      "config": {
        "validations": [
          {
            "variable": "phoneNumber",
            "rejectionPrompt": "That doesn't seem to be a valid 10-digit phone number. Please provide just the 10-digit number.",
            "rules": [
              { "function": "isNumeric" },
              { "function": "hasLength", "params": { "exact": 10 } }
            ]
          }
        ],
        "confirmation": {
          "enabled": true,
          "prompt": "Got it. I have the name {{firstName}} {{lastName}} and the number {{phoneNumber}}. Is that all correct?",
          "maxAttempts": 2
        }
      },
      "connections": {
        "validation_failed": "extract_name_and_phone_6",
        "success": "finish_goodbye_99",
        "denied": "extract_name_and_phone_6",
        "max_attempts_reached": "finish_goodbye_99"
      }
    },
    "finish_goodbye_99": {
      "type": "FINISH",
      "config": {
        "message": "Thank you. Your request is being processed. We will be in touch shortly. Goodbye!"
      },
      "connections": {}
    }
  }
}
```

## 7. Backend Execution & State Management

The **Flow Engine** manages the conversation by maintaining a `Conversation State` object.

**Example Conversation State:**

```json
{
  "currentNodeId": "validate_phone_7",
  "variables": {
    "intent": "appointment",
    "firstName": "Jane",
    "lastName": "Doe",
    "phoneNumber": "5551234567"
  },
  "history": [
    { "from": "init_1", "to": "presentation_1", "reason": "start" },
    { "from": "presentation_1", "to": "extract_intent_2", "reason": "next" },
    { "from": "extract_intent_2", "to": "intent_decision_5", "reason": "success" },
    { "from": "intent_decision_5", "to": "extract_name_and_phone_6", "reason": "condition_match" },
    { "from": "extract_name_and_phone_6", "to": "validate_phone_7", "reason": "success" }
  ]
}
```

**Execution Loop:**

1. The engine starts at the `startNodeId` defined in the flow.
2. It looks up the node data for the `currentNodeId`.
3. It executes the node's logic, which may involve speaking to the user, listening for a response, or evaluating variables.
4. The node's execution returns an output name (e.g., `success`).
5. The engine updates the `variables` and `history` in the state object.
6. It finds the next node's ID by looking up the output name in the current node's `connections`. For a `DECISION` node, whose `connections` are empty, the next node is the matching `targetNodeId` (or the `defaultTargetNodeId`) from its configuration.
7. It updates the `currentNodeId` in the state.
8. The loop repeats until it reaches a `FINISH` node.
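
The execution loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual Flow Engine: the handler contract (`Handler`), the function names `run_flow` and `resolve_decision`, the `max_steps` guard, the stub handlers, the small `DEMO_FLOW`, and the `"default"` history reason are all hypothetical. Only the state fields (`currentNodeId`, `variables`, `history`), the node fields, and the `condition_match` reason come from this document.

```python
from typing import Any, Callable, Dict, Optional, Tuple

State = Dict[str, Any]
Node = Dict[str, Any]
# Hypothetical handler contract: run the node's logic, update state["variables"]
# if needed, and return the name of the output to follow (None for FINISH).
Handler = Callable[[Node, State], Optional[str]]


def resolve_decision(node: Node, state: State) -> Tuple[str, str]:
    """Pick a DECISION target: first matching condition, else the default."""
    cfg = node["config"]
    actual = state["variables"].get(cfg["variableToCheck"])
    for cond in cfg["conditions"]:
        # Only the "equals" operator appears in this document.
        if cond["operator"] == "equals" and actual == cond["value"]:
            return cond["targetNodeId"], "condition_match"
    return cfg["defaultTargetNodeId"], "default"  # "default" is an assumed label


def run_flow(flow: Dict[str, Any], handlers: Dict[str, Handler],
             max_steps: int = 100) -> State:
    """Walk the flow from startNodeId until a FINISH node has executed."""
    state: State = {"currentNodeId": flow["startNodeId"],
                    "variables": {}, "history": []}
    for _ in range(max_steps):  # assumed guard against endless loops
        node_id = state["currentNodeId"]
        node = flow["nodes"][node_id]
        if node["type"] == "DECISION":
            next_id, reason = resolve_decision(node, state)
        else:
            output = handlers[node["type"]](node, state)
            if node["type"] == "FINISH":
                return state
            next_id, reason = node["connections"][output], output
        state["history"].append({"from": node_id, "to": next_id, "reason": reason})
        state["currentNodeId"] = next_id
    raise RuntimeError("step limit reached before a FINISH node")


def extract_stub(node: Node, state: State) -> str:
    """Stand-in for the LLM extraction: pretend the caller chose the first option."""
    for var in node["config"]["variablesToExtract"]:
        state["variables"][var["name"]] = var["options"][0]
    return "success"


# Stub handlers replacing real speech and LLM logic (assumptions for the demo).
HANDLERS: Dict[str, Handler] = {
    "INITIALIZE": lambda node, state: "start",
    "PRESENTATION": lambda node, state: "next",
    "EXTRACTION": extract_stub,
    "FINISH": lambda node, state: None,
}

DEMO_FLOW = {
    "startNodeId": "init",
    "nodes": {
        "init": {"type": "INITIALIZE", "config": {}, "connections": {"start": "ask"}},
        "ask": {"type": "EXTRACTION",
                "config": {"variablesToExtract": [
                    {"name": "intent", "type": "enum",
                     "options": ["appointment", "billing"]}]},
                "connections": {"success": "route", "failure": "bye"}},
        "route": {"type": "DECISION",
                  "config": {"variableToCheck": "intent",
                             "conditions": [{"operator": "equals",
                                             "value": "appointment",
                                             "targetNodeId": "bye"}],
                             "defaultTargetNodeId": "ask"},
                  "connections": {}},
        "bye": {"type": "FINISH", "config": {}, "connections": {}},
    },
}

final_state = run_flow(DEMO_FLOW, HANDLERS)
```

In this sketch the `DECISION` node bypasses `connections` and resolves its target from the configuration, while every other node type maps its returned output name through `connections`. A production engine would also need to persist the state between user turns and run the Global Intent Router before each handler, both of which are left out here.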