# Memory Management: Redis vs. ChatHistoryAgentThread
In the early stages of development, Redis was used to store conversation history. The purpose was to maintain context, enabling the AI assistant to provide consistent responses across multi-turn conversations. However, as system requirements grew, we discovered structural limitations of Redis that made it inadequate for increasingly complex task reasoning and toolchain integration.
This report covers the following key topics:
* **Problem Analysis** – A deep dive into Redis’s structural limitations, including Plugin Tools context loss and lack of personalized context.
* **Solution** – A detailed introduction to the design philosophy of ChatHistoryAgentThread, including structured content storage and thread-based memory management.
* **Implementation** – A complete ThreadManager implementation example showing dynamic user thread management and message flow integration.
* **Technical Advantages** – Explanation of improvements in tool call tracking, personalization, memory optimization, toolchain management, parameter collection, and developer-friendliness.
Through this architecture shift, the system not only “remembers what it did,” but also provides each user with a personalized, efficient, and intelligent conversational experience.
---
## Problem Analysis: Limitations of Redis
Redis stores only the **plain text messages** exchanged between user and AI assistant. While sufficient for basic chat, it falls short in frameworks like Semantic Kernel that rely on Plugin function calling. The limitations include:
* **Cannot store AI reasoning steps and intermediate decisions (Plugin Tools)**
* **Cannot provide personalized context for different users**
### 1. Plugin Tools Context Loss
Redis cannot store:
* `FunctionCallContent` – tool call details
* `FunctionResultContent` – tool execution results
* `AssistantMessageContent` – AI reasoning steps
* Cross-turn parameter collection state
Example: user requests sales trend of “Rice Crackers” but omits the time range:
```python
# Redis only stores plain messages:
user_message: "Query sales trend of Rice Crackers"
ai_response: "Please provide the time range"
user_message: "May"
```
* With Redis, “May” arrives as an independent message, so the model must re-interpret the context from scratch.
* The system cannot track which tool required this field, whether the other parameters have already been collected, or whether the query duplicates an earlier one.
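To make the difference concrete, here is a minimal sketch using plain dictionaries (hypothetical shapes, not the actual Semantic Kernel classes) contrasting what Redis keeps with what a structured history would need to keep to resolve “May” as parameter completion:

```python
# What Redis stores: plain text only, with no link back to the pending tool call.
redis_history = [
    {"role": "user", "text": "Query sales trend of Rice Crackers"},
    {"role": "assistant", "text": "Please provide the time range"},
    {"role": "user", "text": "May"},
]

# What a structured history records: the failed tool call and its missing arguments.
structured_history = [
    {"role": "user", "text": "Query sales trend of Rice Crackers"},
    {
        "role": "assistant",
        "function_call": {
            "name": "get_product_sales_data",
            "arguments": {"product_name": "Rice Crackers",
                          "start_day": None, "end_day": None},
        },
        "function_result": "Error: Missing required parameters",
    },
    {"role": "assistant", "text": "Please provide the time range"},
]

def pending_parameters(history):
    """Return the missing arguments of the most recent recorded tool call."""
    for entry in reversed(history):
        call = entry.get("function_call")
        if call:
            return {k for k, v in call["arguments"].items() if v is None}
    return set()

print(pending_parameters(structured_history))  # e.g. {'start_day', 'end_day'}
print(pending_parameters(redis_history))       # set() – nothing to complete
```

With the structured history, “May” can be matched against the pending `start_day`/`end_day` slots; with the plain-text history there is nothing to match against.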
### 2. Lack of Personalized Context
The Redis-based approach shares the same generic context across all users. It cannot adapt to user identity, role, or permissions to deliver a personalized experience.
---
## Solution: Introducing ChatHistoryAgentThread
To solve these issues, we adopt **ChatHistoryAgentThread** from Semantic Kernel. This combines **ChatHistory()** (structured content storage) and **Thread()** (memory management).
### ChatHistory() – Structured Content Storage
Each conversation includes more than just plain text:
* User input (`UserMessageContent`)
* AI responses (`AssistantMessageContent`)
* Tool calls (`FunctionCallContent`)
* Tool parameters (Function Arguments)
* Tool results (`FunctionResultContent`)
All of these are **automatically stored in the conversation history**, so the model can reference them directly instead of re-inferring context.
### Thread() – Memory Management
While ChatHistory handles structured memory, Thread manages:
* **Conversation lifecycle**: start, pause, resume, end
* **Intelligent summarization**: the built-in `reduce()` method triggers summarization once history grows past a threshold
* **Multi-agent collaboration support**: unified interface for multiple agents
* **State tracking**: monitor status (active, waiting, completed)
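The lifecycle and state-tracking responsibilities above can be illustrated with a small sketch (a plain enum and class for illustration only, not Semantic Kernel's internal types):

```python
from enum import Enum

class ThreadState(Enum):
    ACTIVE = "active"
    WAITING = "waiting"      # e.g. waiting for a missing tool parameter
    COMPLETED = "completed"

class TrackedThread:
    """Toy thread with the lifecycle transitions described above."""
    def __init__(self):
        self.state = ThreadState.ACTIVE

    def pause(self):
        self.state = ThreadState.WAITING

    def resume(self):
        self.state = ThreadState.ACTIVE

    def end(self):
        self.state = ThreadState.COMPLETED

thread = TrackedThread()
thread.pause()                 # conversation waits on user input
print(thread.state.value)      # waiting
thread.resume()
print(thread.state.value)      # active
```

The point is that the thread, not the raw message store, owns this state, which is what lets an agent pause mid-task and pick up exactly where it left off.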
**In short: ChatHistory manages “what to remember,” Thread manages “how to remember.”**
**ChatHistoryAgentThread combines both into a stable memory system for agents.**
---
## Implementation
### Core Components
```python
from semantic_kernel.contents import ChatHistorySummarizationReducer
from semantic_kernel.agents.chat_completion.chat_completion_agent import ChatHistoryAgentThread
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
```
### Thread Manager Implementation
We implement a `ThreadManager` to dynamically generate and manage threads per user, with summarization:
```python
class ThreadManager:
    def __init__(self, service):
        self.service = service
        self.user_threads = {}  # Key: user_id, Value: ChatHistoryAgentThread
        self.system_prompt = ""
        self.user_prompt_defined = set()

    # Define a personalized system prompt for a user's thread (once per user)
    def define_user_thread(self, user_name: str, user_role: str, user_id: str):
        if user_id in self.user_prompt_defined:
            return
        self.system_prompt = f"""
        User name: {user_name}, role: {user_role}.
        Provide assistance or restrict features based on role.
        - Respond politely, and use the user's name where appropriate.
        """
        thread = self.get_thread(user_id)  # create the thread if it doesn't exist yet
        thread._chat_history.add_system_message(self.system_prompt)
        self.user_prompt_defined.add(user_id)

    # Retrieve an existing user thread or create a new one
    def get_thread(self, user_id: str):
        if user_id not in self.user_threads:
            summarizer_chat_history = ChatHistorySummarizationReducer(
                service=self.service,
                target_count=15,    # keep the 15 most recent messages verbatim
                threshold_count=5,  # tolerate 5 extra messages before reducing
                summarization_instructions="""
                Instructions for summarizing chat history
                """,
            )
            thread = ChatHistoryAgentThread()
            thread._chat_history = summarizer_chat_history
            self.user_threads[user_id] = thread
        return self.user_threads[user_id]
```
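The key pattern in `get_thread` is per-user caching: the same user always gets back the same thread object, so context accumulates across turns. A dependency-free sketch of just that pattern (with a list standing in for `ChatHistoryAgentThread`):

```python
class SimpleThreadManager:
    """Simplified per-user thread cache, mirroring get_thread() above."""
    def __init__(self):
        self.user_threads = {}

    def get_thread(self, user_id):
        if user_id not in self.user_threads:
            self.user_threads[user_id] = []  # stand-in for ChatHistoryAgentThread
        return self.user_threads[user_id]

mgr = SimpleThreadManager()
a1 = mgr.get_thread("user-a")
a2 = mgr.get_thread("user-a")
b = mgr.get_thread("user-b")
print(a1 is a2)  # True – the same thread is reused across turns
print(a1 is b)   # False – each user has an isolated thread
```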
### Message Flow Integration
```python
async def handle_user_message(event):
    user_id = event.source.user_id
    user_message = event.message.text
    thread = thread_manager.get_thread(user_id)

    # Authentication & RBAC check
    if not AuthPlugin.line_id_in_db(user_id):
        verification_response = await auth_agent.get_response(
            messages=user_message + f" LINE ID: {user_id}",
            thread=thread,
        )
        verification_response = verification_response.items[0].text
        line_text_response(event, verification_response)
        return

    # Define personalized Thread
    user_info = AuthPlugin.get_user_info_by_line_id(user_id)
    user_name = user_info.get("user_name", "Unknown")
    user_role = user_info.get("role", "Unknown")
    thread_manager.define_user_thread(user_name, user_role, user_id)

    # Process message with full context
    response = await agent.get_response(messages=user_message, thread=thread)
    line_text_response(event, response.items[0].text)

    # Summarize history once it grows past the configured threshold
    await thread.reduce()
```
---
## Technical Advantages: The Model “Remembers What It Did”
Key benefits of ChatHistoryAgentThread:
* Tool call tracking: trace entire execution flow
* Personalization: role-based permissions & customized replies
* Memory optimization: smart summarization & token saving
* Toolchain management: prevent duplicate queries
* Parameter collection: maintain state across turns
* Developer-friendly: easier debugging & traceability
### 1. Tool Call Tracking
Failed tool calls (e.g., missing parameters) are recorded:
```json
{
"turn_1": {
"user": "Query sales trend of Rice Crackers",
"function_call": {
"name": "get_product_sales_data",
"arguments": {
"product_name": "Rice Crackers",
"start_day": null,
"end_day": null
}
},
"function_result": "Error: Missing required parameters",
"assistant": "Please provide the time range"
},
"turn_2": {
"user": "May",
"context_resolution": "Identified as parameter completion for previous query",
"function_call": {
"name": "get_product_sales_data",
"arguments": {
"product_name": "Rice Crackers",
"start_day": "2024-05-01",
"end_day": "2024-05-31"
}
}
}
}
```
If the user says “May,” the model links it as a parameter completion, not a new query.
---
### 2. Personalization: Role Permissions & Customized Responses
Using `define_user_thread()`, the system personalizes conversations:
* Role recognition: restrict or allow features by role (manager, employee, customer)
* Personalized replies: use user’s real name for natural interaction
* Permission control: enforce role-based feature restrictions
Redis cannot achieve this because it stores only generic text, without system-level user-specific context.
---
### 3. Memory Optimization: Smart Summarization & Token Savings
Summarization strategy:
* **Short-term memory**: keep recent messages in detail
* **Long-term memory compression**: older messages summarized to save tokens
* **Traceable logic**: every tool call and output recorded for later reference
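The short-term/long-term split can be sketched as follows. The real `ChatHistorySummarizationReducer` calls an LLM to write the summary; here the summary is a simple placeholder so the triggering logic (mirroring the `target_count`/`threshold_count` parameters used earlier) stays self-contained:

```python
def reduce_history(messages, target_count=15, threshold_count=5):
    """Summarize older messages once history exceeds target + threshold."""
    if len(messages) <= target_count + threshold_count:
        return messages  # below the threshold: keep everything verbatim
    older, recent = messages[:-target_count], messages[-target_count:]
    summary = f"[summary of {len(older)} earlier messages]"  # LLM call in practice
    return [summary] + recent

history = [f"msg-{i}" for i in range(25)]
reduced = reduce_history(history)
print(len(reduced))  # 16: one summary line + the 15 most recent messages
print(reduced[0])    # [summary of 10 earlier messages]
```

Only the older portion is compressed, so recent turns stay fully detailed while the total token cost stays bounded.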
---
### 4. Toolchain Management: Prevent Duplicate Queries
If the model needs product info first, then sales data, full history ensures:
* Check whether intermediate results are already available
* Avoid redundant queries
* Dynamically decide next step (fallback or continue)
```python
# Example: Product Analysis Workflow
# Step 1: Get product info
# Step 2: Get sales data
# Step 3: Generate trend analysis

# Example 2: Product Suggestions
# Reuse intermediate results to avoid repeating tool calls
```
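A hypothetical sketch of the duplicate-prevention check (plain dicts, not Semantic Kernel types): before re-issuing a tool call, scan the structured history for an identical call that already has a result.

```python
def find_cached_result(history, name, arguments):
    """Return the stored result of an identical earlier tool call, if any."""
    for entry in history:
        if (entry.get("function_call") == {"name": name, "arguments": arguments}
                and "function_result" in entry):
            return entry["function_result"]
    return None

# Step 1 of the workflow already ran and its result is in the history:
history = [{
    "function_call": {"name": "get_product_info",
                      "arguments": {"product_name": "Rice Crackers"}},
    "function_result": {"category": "snacks", "unit_price": 30},
}]

cached = find_cached_result(history, "get_product_info",
                            {"product_name": "Rice Crackers"})
print(cached is not None)  # True – step 2 can reuse the step-1 result
```

With Redis, the tool call and its result were never stored, so this check is impossible and the query would run again.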
---
### 5. Parameter Collection: Cross-Turn State Maintenance
Users may provide parameters over multiple turns (e.g., product name, time, store). Full history allows:
* Building complete queries
* Merging parameters automatically
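The merging step can be sketched as folding each turn's partial arguments into the pending call until nothing is missing (illustrative helper, not a Semantic Kernel API):

```python
def merge_parameters(pending, new_args):
    """Fold newly supplied arguments into the pending tool-call arguments."""
    merged = dict(pending)
    merged.update({k: v for k, v in new_args.items() if v is not None})
    return merged

# Turn 1 left two slots empty; turn 2 ("May") fills them in:
pending = {"product_name": "Rice Crackers", "start_day": None, "end_day": None}
pending = merge_parameters(pending, {"start_day": "2024-05-01",
                                     "end_day": "2024-05-31"})

missing = [k for k, v in pending.items() if v is None]
print(missing)  # [] – the query is now complete and the tool can be called
```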
---
### 6. Developer-Friendly: Debugging & Traceability
Retaining `FunctionCallContent` and `FunctionResultContent` benefits developers:
* Quickly identify incorrect parameters or logic issues
* Visualize conversation flows and behavior records
---
## Comparison: Redis vs ChatHistoryAgentThread
| Metric | Redis | ChatHistoryAgentThread |
| -------------------- | -------------------- | ----------------------- |
| Context retention | Plain text only | Full structured context |
| Plugin Tool tracking | No | Yes |
| Personalization | No | Yes |
| Parameter collection | Re-parsed every turn | Maintained state |
| Token usage | High (re-parsed) | Low (context reuse) |
| Error rate | High | Low |
| Debugging | Difficult | Full traceability |