# Threads API
The Threads API provides a stateful conversation experience with persistent message history and advanced RAG-powered run management.
## Overview
The Threads API enables persistent, context-aware conversations with powerful knowledge retrieval:
1. **Create Threads** - Start new conversations with optional initial messages
2. **Manage Messages** - Add user/assistant messages with rich metadata
3. **Execute Runs** - Generate AI responses with full RAG pipeline integration
4. **Track Progress** - Monitor run status, usage, and performance
5. **Retrieve History** - Access complete conversation and run history
### Key Features
- **🧠 RAG Integration**: Automatic knowledge retrieval from your vector database
- **🔄 Provider Agnostic**: Works with both OpenAI and Anthropic models
- **📡 Real-time Streaming**: Server-sent events for live response delivery
- **📊 Rich Metadata**: Track models, providers, settings, and usage
- **🎯 Advanced Retrieval**: Multiple strategies (semantic, hybrid, source-prioritized)
- **💬 OpenAI Compatible**: Follows the OpenAI Threads API pattern, with RAG extensions
### Typical Chat Application Flow
#### **Recommended: Single-Call Thread Creation**
```
1. User: "How do I create a Stellar account?"
2. POST /threads
Body: { "messages": [{ "role": "user", "content": "How do I create a Stellar account?" }] }
→ Creates thread + adds message in one call
3. POST /threads/{thread_id}/runs
Body: { "model": "gpt-4o-mini", "temperature": 0.7 }
→ Triggers RAG-powered response generation
→ Returns streaming SSE response
→ Automatically creates assistant message with metadata
```
#### **Multi-turn Conversation**
```
4. User: "What about multi-signature accounts?"
5. POST /threads/{thread_id}/messages
Body: { "role": "user", "content": "What about multi-signature accounts?" }
6. POST /threads/{thread_id}/runs
Body: { "model": "claude-3-5-sonnet-20241022" }
→ AI uses full conversation history + retrieves relevant context
→ Generates contextual response building on previous discussion
```
### Benefits for Developers
- **🚀 Simplified Integration**: No need to manage conversation state
- **⚡ Performance**: Optimized RAG pipeline with intelligent caching
- **🔧 Flexible Configuration**: Fine-tune retrieval, models, and generation
- **📈 Observability**: Detailed run tracking and usage analytics
- **🛡️ Reliability**: Built-in error handling and fallback strategies
## Authentication
All requests require an API key in the `x-api-key` header:
```bash
x-api-key: YOUR_API_KEY
```
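As a sketch, authenticated requests can be built with Python's standard library. The `build_request` helper below is illustrative (not part of any official SDK), and the base URL is taken from the curl examples later in this document.

```python
import json
import urllib.request

BASE_URL = "https://api.stellabot.app/v1"  # base URL used by the curl examples

def build_request(path, api_key, payload=None):
    """Build a Request carrying the required x-api-key header.

    POSTs JSON when a payload is given, otherwise issues a GET.
    """
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST" if payload is not None else "GET",
    )
```

Pass the returned request to `urllib.request.urlopen` (or adapt the same headers for `requests`/`httpx`).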
## Thread Management
### Create a Thread
Create a new conversation thread, optionally with initial messages.
**Endpoint:** `POST /threads`
#### Request Body
```json
{
"title": "Stellar Smart Contracts Help",
"metadata": {
"user_id": "user_123",
"session": "web_chat",
"source": "website"
},
"messages": [
{
"role": "user",
"content": "I need help with Stellar smart contracts"
}
]
}
```
#### Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `title` | string | No | Thread title (max 60 characters). If omitted, a title is auto-generated after the first assistant response |
| `metadata` | object | No | Custom metadata for the thread |
| `messages` | array | No | Initial messages to add to the thread |
| `messages[].role` | string | Yes | Message role: `"user"` or `"assistant"` |
| `messages[].content` | string | Yes | Message content |
#### Response
```json
{
"id": "thread_abc123def456",
"object": "thread",
"created_at": 1699014083,
"title": "Stellar Smart Contracts Help",
"metadata": {
"user_id": "user_123",
"session": "web_chat",
"source": "website"
}
}
```
### Retrieve a Thread
Get details about a specific thread.
**Endpoint:** `GET /threads/{thread_id}`
#### Response
```json
{
"id": "thread_abc123def456",
"object": "thread",
"created_at": 1699014083,
"title": "Stellar Smart Contracts Help",
"metadata": {
"user_id": "user_123",
"session": "web_chat",
"source": "website"
}
}
```
### Retrieve All Threads
Get all threads, optionally with pagination.
**Endpoint:** `GET /threads`
#### Query Parameters
| Parameter | Type | Required | Description | Default |
|-----------|------|----------|-------------|---------|
| `limit` | number | No | Maximum number of threads to return | `10` |
| `offset` | number | No | Number of threads to skip | `0` |
#### Response
```json
{
"object": "list",
"data": [
{
"id": "thread_abc123def456",
"object": "thread",
"created_at": 1699014083,
"title": "Stellar Smart Contracts Help",
"metadata": {
"user_id": "user_123",
"session": "web_chat",
"source": "website"
}
}
],
"first_id": "thread_abc123def456",
"last_id": "thread_xyz789abc123",
"total_count": 100
}
```
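The `limit`/`offset` parameters support a simple pagination loop. A sketch follows, where `fetch_page` is a hypothetical stand-in for any wrapper around `GET /threads` that returns the list response shown above (with `data` and `total_count` keys):

```python
def iter_threads(fetch_page, limit=10):
    """Yield every thread by advancing `offset` until total_count is reached.

    `fetch_page(limit=..., offset=...)` is any callable returning the
    list response shape documented above.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page["data"]
        offset += len(page["data"])
        # Stop once every thread has been seen, or the server returns
        # an empty page (defensive guard against a stale total_count).
        if offset >= page["total_count"] or not page["data"]:
            break
```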
### Update Thread Title
Update the title of an existing thread. Useful for manually setting or correcting auto-generated titles.
**Endpoint:** `PATCH /threads/{thread_id}`
#### Request Body
```json
{
"title": "Updated Thread Title"
}
```
#### Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `title` | string | Yes | New thread title (1-60 characters) |
#### Response
```json
{
"id": "thread_abc123def456",
"object": "thread",
"created_at": 1699014083,
"title": "Updated Thread Title",
"metadata": {
"user_id": "user_123",
"session": "web_chat",
"source": "website"
}
}
```
### Delete a Thread
Delete a thread and all its associated messages and runs.
**Endpoint:** `DELETE /threads/{thread_id}`
#### Response
```json
{
"deleted": true
}
```
## Automatic Title Generation
Threads automatically generate descriptive titles after the first assistant response is completed. This feature:
- **Triggers automatically** when a thread has no title and receives its first assistant response
- **Uses lightweight AI** (GPT-4o-mini) to generate concise, descriptive titles (max 60 characters)
- **Analyzes conversation context** from both the user's question and assistant's response
- **Provides intelligent fallbacks** if generation fails (uses first sentence or truncated user message)
- **Fails gracefully** without affecting the main conversation flow
#### Title Generation Behavior
| Scenario | Behavior |
|----------|----------|
| Thread created with title | No auto-generation, uses provided title |
| Thread created without title | Auto-generates after first assistant response |
| Very short messages | Skips generation, uses fallback |
| Error responses | Skips generation to avoid unhelpful titles |
| Generation failure | Uses intelligent fallback based on user message |
#### Manual Override
You can always override auto-generated titles using the `PATCH /threads/{thread_id}` endpoint.
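The exact fallback heuristics are server-side, but a rough client-side equivalent of "first sentence or truncated user message, capped at 60 characters" might look like this (the sentence-splitting regex and ellipsis handling are assumptions, not the service's actual logic):

```python
import re

def fallback_title(user_message, max_len=60):
    """Derive a thread title from the first user message: prefer the
    first sentence if it fits, otherwise truncate to the length cap."""
    text = " ".join(user_message.split())  # collapse whitespace
    first_sentence = re.split(r"(?<=[.!?])\s+", text)[0].rstrip(".!?")
    if 0 < len(first_sentence) <= max_len:
        return first_sentence
    return text[: max_len - 1].rstrip() + "…"
```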
## Message Management
### Add a Message
Add a new message to an existing thread.
**Endpoint:** `POST /threads/{thread_id}/messages`
#### Request Body
```json
{
"role": "user",
"content": "How do I create a Stellar account?",
"title": "Stellar Account Creation"
}
```
#### Parameters
| Parameter | Type | Required | Description | Default |
|-----------|------|----------|-------------|---------|
| `role` | string | No | Message role: `"user"` or `"assistant"` | `"user"` |
| `content` | string | Yes | Message content (minimum 1 character) | - |
| `title` | string | No | Optional title for the thread (max 60 characters) | - |
#### Response
```json
{
"id": "msg_xyz789abc123",
"object": "thread.message",
"created_at": 1699014083,
"thread_id": "thread_abc123def456",
"role": "user",
"content": "How do I create a Stellar account?",
"title": "Stellar Account Creation"
}
```
### List Messages
Retrieve all messages in a thread, ordered chronologically.
**Endpoint:** `GET /threads/{thread_id}/messages`
#### Response
```json
{
"object": "list",
"data": [
{
"id": "msg_xyz789abc123",
"object": "thread.message",
"created_at": 1699014083,
"thread_id": "thread_abc123def456",
"role": "user",
"content": "How do I create a Stellar account?",
"metadata": {
"timestamp": "2024-01-15T10:30:00Z"
}
},
{
"id": "msg_def456ghi789",
"object": "thread.message",
"created_at": 1699014143,
"thread_id": "thread_abc123def456",
"role": "assistant",
"content": "To create a Stellar account, you'll need to generate a keypair...",
"title": "Stellar Account Creation",
"metadata": {
"runId": "run_abc123def456",
"model": "gpt-4o-mini",
"provider": "openai",
"temperature": 0.7,
"retrievalStrategy": "SEMANTIC",
"created_at": 1699014143
}
}
],
"first_id": "msg_xyz789abc123",
"last_id": "msg_def456ghi789",
"has_more": false
}
```
### Add Message Feedback
Provide feedback on an assistant message to improve model performance.
**Endpoint:** `POST /threads/{thread_id}/messages/{message_id}/feedback`
#### Request Body
```json
{
"rating": "thumbs_up",
"comment": "Very helpful explanation with clear examples!"
}
```
#### Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `rating` | string | Yes | `"thumbs_up"` or `"thumbs_down"` |
| `comment` | string | No | Optional feedback comment |
#### Response
```json
{
"success": true
}
```
## Run Management
### Create a Run
Execute the RAG-powered AI system to generate a response for the thread. This automatically creates an assistant message and streams the response in real-time.
**Endpoint:** `POST /threads/{thread_id}/runs`
#### Request Body
```json
{
"provider": "openai",
"model": "gpt-4o-mini",
"temperature": 0.7,
"max_tokens": 1500,
"topK": 5,
"retrievalStrategy": "HYBRID",
"structuredOutput": false,
"includeCitations": true,
"filter": {
"source_type": "documentation"
},
"promptTemplate": "system-api-base",
"userTemplate": "user-api-base",
"promptVariables": {
"expertise_level": "intermediate"
}
}
```
#### Parameters
| Parameter | Type | Required | Description | Default |
|-----------|------|----------|-------------|---------|
| **Provider Configuration** | | | | |
| `provider` | string | No | `"openai"` or `"anthropic"` | `"openai"` |
| `model` | string | No | Model name (e.g., `"gpt-4o-mini"`, `"claude-3-5-sonnet-20241022"`) | `"gpt-4o-mini"` |
| `temperature` | number | No | Sampling temperature (0.0-2.0) | `0.5` |
| `max_tokens` | number | No | Maximum tokens to generate | `1500` |
| **Retrieval Configuration** | | | | |
| `topK` | number | No | Number of context chunks to retrieve (1-20) | `5` |
| `retrievalStrategy` | string | No | `"SEMANTIC"`, `"KEYWORD"`, `"HYBRID"`, `"SOURCE_PRIORITIZED"` | `"SEMANTIC"` |
| `filter` | object | No | Filter criteria for knowledge retrieval | `{}` |
| `sourcePriorities` | array | No | Source prioritization for `SOURCE_PRIORITIZED` strategy | `[]` |
| `includeAllSources` | boolean | No | Include all available sources | `false` |
| **Output Configuration** | | | | |
| `structuredOutput` | boolean | No | Enable structured JSON output | `false` |
| `includeCitations` | boolean | No | Include source citations in response | `false` |
| `response_format` | object | No | OpenAI-compatible response format | `{"type": "text"}` |
| **Prompt Configuration** | | | | |
| `promptTemplate` | string | No | System prompt template to use | `"system-api-base"` |
| `userTemplate` | string | No | User message template to use | `"user-api-base"` |
| `promptVariables` | object | No | Variables to inject into prompt templates | `{}` |
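The documented ranges can be checked client-side before creating a run. The following sketch uses the parameter names from the table above; the validation rules are inferred from the documented ranges, and `validate_run_params` is a hypothetical helper, not part of the API:

```python
VALID_STRATEGIES = {"SEMANTIC", "KEYWORD", "HYBRID", "SOURCE_PRIORITIZED"}

def validate_run_params(params):
    """Return a list of validation errors for a run request body,
    checking the documented ranges (temperature 0.0-2.0, topK 1-20)."""
    errors = []
    temperature = params.get("temperature")
    if temperature is not None and not (0.0 <= temperature <= 2.0):
        errors.append("temperature must be between 0.0 and 2.0")
    top_k = params.get("topK")
    if top_k is not None and not (1 <= top_k <= 20):
        errors.append("topK must be between 1 and 20")
    strategy = params.get("retrievalStrategy")
    if strategy is not None and strategy not in VALID_STRATEGIES:
        errors.append(f"unknown retrievalStrategy: {strategy}")
    if params.get("provider") not in (None, "openai", "anthropic"):
        errors.append("provider must be 'openai' or 'anthropic'")
    return errors
```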
#### Retrieval Strategies
| Strategy | Description | Use Case |
|----------|-------------|----------|
| `SEMANTIC` | Vector similarity search using embeddings | Best for conceptual queries |
| `KEYWORD` | Traditional keyword/term matching | Best for exact term searches |
| `HYBRID` | Combined semantic and keyword search | **Recommended** - balanced approach |
| `SOURCE_PRIORITIZED` | Prioritize specific source types | When you need specific source types |
#### Source Prioritization
When using `SOURCE_PRIORITIZED` strategy, specify which sources to prioritize:
```json
{
"retrievalStrategy": "SOURCE_PRIORITIZED",
"sourcePriorities": [
{
"sourceType": "documentation",
"weight": 0.7,
"topK": 5
},
{
"sourceType": "github-issues",
"weight": 0.3,
"topK": 3
}
],
"includeAllSources": false
}
```
**Note**: If `sourcePriorities` is not provided with `SOURCE_PRIORITIZED`, the system automatically falls back to `SEMANTIC` strategy.
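A client can mirror this fallback explicitly when building the run body, so the strategy actually sent always matches what will be executed. `build_retrieval_config` below is a hypothetical helper:

```python
def build_retrieval_config(source_priorities=None):
    """Build the retrieval portion of a run request body.

    Mirrors the documented fallback: without sourcePriorities,
    SOURCE_PRIORITIZED degrades to SEMANTIC, so we send SEMANTIC
    explicitly in that case.
    """
    if source_priorities:
        return {
            "retrievalStrategy": "SOURCE_PRIORITIZED",
            "sourcePriorities": source_priorities,
            "includeAllSources": False,
        }
    return {"retrievalStrategy": "SEMANTIC"}
```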
#### Response Headers
```
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Run-ID: run_def456ghi789
X-Message-ID: msg_jkl012mno345
```
#### Streaming Response
The response uses Server-Sent Events (SSE) format. Each event contains JSON data:
```
data: {"type": "context", "context": [{"id": "doc_123", "score": 0.95, "text": "Stellar accounts are..."}]}
data: {"type": "content", "content": "To create a Stellar account"}
data: {"type": "content", "content": ", you'll need to follow these steps:\n\n1. "}
data: {"type": "content", "content": "Generate a keypair using the Stellar SDK..."}
data: {"type": "done", "messageId": "msg_jkl012mno345", "runId": "run_def456ghi789", "usage": {"prompt_tokens": 150, "completion_tokens": 75, "total_tokens": 225}}
data: [DONE]
```
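A minimal consumer of this stream filters for `data:` lines, stops at the `[DONE]` sentinel, and concatenates `content` events into the final answer. This sketch assumes you already have the stream as an iterable of lines (e.g. from `response.iter_lines()` in a typical HTTP client):

```python
import json

def parse_sse_stream(lines):
    """Parse SSE `data:` lines into (events, full_text).

    Stops at the `[DONE]` sentinel and joins `content` events
    into the complete assistant message text.
    """
    events, chunks = [], []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        events.append(event)
        if event.get("type") == "content":
            chunks.append(event["content"])
    return events, "".join(chunks)
```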
#### Event Types
| Type | Description | When Sent |
|------|-------------|-----------|
| `context` | Retrieved knowledge base context | Once, when context is ready |
| `content` | Incremental text content from AI | Multiple times during generation |
| `done` | Final completion with metadata | Once, when generation completes |
#### Event Data Structure
**Context Event:**
```json
{
"type": "context",
"context": [
{
"id": "doc_123",
"score": 0.95,
"text": "Stellar accounts are identified by...",
"metadata": {
"source": "stellar-docs",
"url": "https://developers.stellar.org/docs/accounts"
}
}
]
}
```
**Content Event:**
```json
{
"type": "content",
"content": "To create a Stellar account"
}
```
**Done Event:**
```json
{
"type": "done",
"messageId": "msg_jkl012mno345",
"runId": "run_def456ghi789",
"usage": {
"prompt_tokens": 150,
"completion_tokens": 75,
"total_tokens": 225
}
}
```
### Retrieve a Run
Get details about a specific run.
**Endpoint:** `GET /threads/{thread_id}/runs/{run_id}`
#### Response
```json
{
"id": "run_def456ghi789",
"object": "thread.run",
"created_at": 1699000000,
"thread_id": "thread_abc123def456",
"status": "completed",
"started_at": 1699000001,
"completed_at": 1699000015,
"model": "claude-3-opus-20240229",
"metadata": {
"retrieval_strategy": "source_prioritized"
},
"usage": {
"prompt_tokens": 150,
"completion_tokens": 75,
"total_tokens": 225
}
}
```
### List Runs
Get all runs for a thread.
**Endpoint:** `GET /threads/{thread_id}/runs`
#### Response
```json
{
"object": "list",
"data": [
{
"id": "run_def456ghi789",
"object": "thread.run",
"created_at": 1699000000,
"thread_id": "thread_abc123def456",
"status": "completed",
"model": "claude-3-opus-20240229"
}
],
"first_id": "run_def456ghi789",
"last_id": "run_def456ghi789",
"has_more": false
}
```
### Cancel a Run
Cancel a running operation.
**Endpoint:** `POST /threads/{thread_id}/runs/{run_id}/cancel`
#### Response
```json
{
"id": "run_def456ghi789",
"object": "thread.run",
"created_at": 1699000000,
"thread_id": "thread_abc123def456",
"status": "cancelled",
"cancelled_at": 1699000010
}
```
## Run Status Values
| Status | Description |
|--------|-------------|
| `queued` | Run is waiting to be processed |
| `in_progress` | Run is currently executing |
| `requires_action` | Run requires user action (future use) |
| `cancelling` | Run is being cancelled |
| `cancelled` | Run was cancelled |
| `failed` | Run failed with an error |
| `completed` | Run completed successfully |
| `expired` | Run expired before completion |
## Examples
### Complete Conversation Flow
#### 1. Create Thread with Initial Message
```bash
# Option A: Let the system auto-generate a title after first response
curl -X POST https://api.stellabot.app/v1/threads \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"metadata": {"user_id": "user_123"},
"messages": [
{
"role": "user",
"content": "I want to learn about Stellar smart contracts"
}
]
}'
# Option B: Provide a custom title upfront
curl -X POST https://api.stellabot.app/v1/threads \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"title": "Learning Stellar Smart Contracts",
"metadata": {"user_id": "user_123"},
"messages": [
{
"role": "user",
"content": "I want to learn about Stellar smart contracts"
}
]
}'
```
#### 2. Create Run to Generate Response
```bash
curl -X POST https://api.stellabot.app/v1/threads/thread_abc123/runs \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-opus-20240229",
"temperature": 0.7,
"retrieval_strategy": "hybrid",
"include_citations": true
}'
```
#### 3. Add Follow-up Message
```bash
curl -X POST https://api.stellabot.app/v1/threads/thread_abc123/messages \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"role": "user",
"content": "Can you show me a code example?"
}'
```
#### 4. Create Another Run
```bash
curl -X POST https://api.stellabot.app/v1/threads/thread_abc123/runs \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"response_format": {"type": "json_object"},
"retrieval_strategy": "source_prioritized"
}'
```
#### 5. Update Thread Title (Optional)
```bash
# Update the auto-generated or existing title
curl -X PATCH https://api.stellabot.app/v1/threads/thread_abc123 \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"title": "Stellar Smart Contracts: Complete Guide"
}'
```
### Advanced Run Configuration
```bash
curl -X POST https://api.stellabot.app/v1/threads/thread_abc123/runs \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"provider": "anthropic",
"model": "claude-3-sonnet-20240229",
"temperature": 0.3,
"maxTokens": 2000,
"retrievalStrategy": "HYBRID",
"includeCitations": true,
"filter": {
"source_type": "documentation",
"category": "smart_contracts"
},
"promptTemplate": "code-examples",
"promptVariables": {
"language": "javascript",
"complexity": "intermediate"
}
}'
```
### Monitor Run Progress
```bash
# Check run status
curl -X GET https://api.stellabot.app/v1/threads/thread_abc123/runs/run_def456 \
-H "x-api-key: YOUR_API_KEY"
# Cancel if needed
curl -X POST https://api.stellabot.app/v1/threads/thread_abc123/runs/run_def456/cancel \
-H "x-api-key: YOUR_API_KEY"
```
## Error Responses
### 400 Bad Request
```json
{
"error": "Thread has no messages"
}
```
### 404 Not Found
```json
{
"error": "Thread not found"
}
```
### 404 Run Not Found
```json
{
"error": "Run not found or cannot be cancelled"
}
```
## Best Practices
### Thread Management
1. **Use meaningful metadata** - Store user IDs, session info, etc.
2. **Let titles auto-generate** - System creates descriptive titles after first response
3. **Override when needed** - Use PATCH to update auto-generated titles if necessary
4. **Keep titles concise** - Maximum 60 characters for optimal display
5. **Clean up old threads** - Delete threads when conversations end
6. **Monitor thread size** - Very long threads may impact performance
### Message Management
1. **Provide context up front** - Include relevant background in messages or prompt templates
2. **Use metadata for tracking** - Store timestamps, sources, etc.
3. **Collect feedback** - Use thumbs up/down for model improvement
### Run Management
1. **Monitor run status** - Check for failures and handle appropriately
2. **Use appropriate models** - Balance cost, speed, and quality
3. **Cancel long-running operations** - Prevent resource waste
4. **Track token usage** - Monitor costs and optimize prompts
### Performance Optimization
1. **Choose optimal retrieval strategies** - `hybrid` for best results
2. **Use filters effectively** - Narrow search scope when possible
3. **Set reasonable token limits** - Prevent excessive generation
4. **Stream responses** - Better user experience for long responses
## Model Support
### OpenAI Models
- `gpt-4o` - Latest GPT-4 Omni model
- `gpt-4o-mini` - Faster, cost-effective GPT-4 Omni
- `gpt-4-turbo` - GPT-4 Turbo with 128K context
- `gpt-4` - Standard GPT-4 model
### Anthropic Models
- `claude-3-5-sonnet-20241022` - Latest Claude 3.5 Sonnet (8K output)
- `claude-3-5-haiku-20241022` - Latest Claude 3.5 Haiku (fast)
- `claude-3-opus-20240229` - Most capable Claude 3 model
- `claude-3-sonnet-20240229` - Balanced performance and speed
- `claude-3-haiku-20240307` - Fastest Claude 3 model
### Automatic Provider Detection
The API automatically detects the provider based on the model name:
- Models containing `claude` → Anthropic provider
- Models containing `gpt` → OpenAI provider
- No need to specify provider explicitly
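This detection rule is easy to replicate client-side. A sketch follows; the final fallback to `openai` is an assumption based on the default `provider` in the parameters table:

```python
def infer_provider(model):
    """Mirror the documented detection: models containing 'claude'
    route to Anthropic, models containing 'gpt' route to OpenAI."""
    name = model.lower()
    if "claude" in name:
        return "anthropic"
    if "gpt" in name:
        return "openai"
    return "openai"  # assumed: fall back to the documented default provider
```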
## OpenAI Compatibility
The Threads API is designed to be compatible with OpenAI's Assistants API pattern while adding powerful RAG capabilities:
### Key Similarities:
- Thread and message management
- Run-based execution model
- Streaming responses
- Status tracking
### Key Differences:
- **RAG Integration** - Automatic knowledge retrieval
- **Enhanced Parameters** - Support for retrieval strategies
- **Flexible Naming** - Accepts both snake_case and camelCase parameter names
- **Source Citations** - Optional citation of retrieved sources
- **Advanced Filtering** - Filter knowledge base searches
## Rate Limits
Rate limits are applied per API key:
- **Thread operations**: 1,000/hour
- **Message operations**: 5,000/hour
- **Run operations**: 500/hour
- **Enterprise**: Custom limits available