# Building Your Own AI Companion: Combining Gaia Node and Jina Embeddings v4 with Late Chunking
Ever wanted to build an AI assistant that actually knows about your specific data? Today, we'll walk through creating a powerful RAG (Retrieval-Augmented Generation) system that combines **Gaia Node's decentralized AI** with **Jina's state-of-the-art embeddings** to build an intelligent companion that can answer questions about your personal knowledge base.
## What We're Building
Our AI companion will:
- Convert your text data into high-quality vector embeddings using Jina AI
- Store these embeddings locally in a Qdrant vector database
- Let you search your knowledge base with natural-language queries
- Generate contextual responses via a Gaia Node
**Why This Stack?**
- **Gaia Node**: Decentralized, privacy-focused AI inference
- **Jina Embeddings v4**: Superior multilingual embeddings with late chunking
- **Qdrant**: Fast, local vector database
- **Complete Privacy**: Everything runs locally except embedding generation
## Prerequisites
```bash
pip install qdrant-client requests openai
```
You'll also need:
- A running Qdrant instance (local or Docker)
- Access to a Gaia Node. Run your own by following [this tutorial](https://docs.gaianet.ai/getting-started/quick-start/)
- A Jina AI API key (free tier available). Get one [here](https://jina.ai/embeddings/).
Start Qdrant locally:
```bash
docker run -p 6333:6333 qdrant/qdrant
```
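With the container running, a quick sanity check confirms the Python client can reach it (this assumes the default port 6333):
```python
from qdrant_client import QdrantClient

# Connect to the local instance started above
client = QdrantClient(host="localhost", port=6333)

# A fresh instance should report an empty list of collections
print(client.get_collections())
```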
## Step 1: Prepare Your Data
First, organize your data in a simple JSON format:
```json
[
{"text": "Your first piece of knowledge"},
{"text": "Another important fact"},
{"text": "More information about your domain"}
]
```
Save this as `your_data.json`.
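If your notes currently live in a plain-text file rather than JSON, a small helper like this produces the expected shape. It assumes a hypothetical `notes.txt` with one fact per line:
```python
import json

# Convert a plain-text file (one fact per line) into the JSON shape above
with open("notes.txt", "r", encoding="utf-8") as f:
    items = [{"text": line.strip()} for line in f if line.strip()]

with open("your_data.json", "w", encoding="utf-8") as f:
    json.dump(items, f, ensure_ascii=False, indent=2)
```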
## Step 2: Generate Embeddings with Jina
Here's the embedding pipeline. It batches requests to stay within Jina's limits and stores each embedding alongside its original text:
```python
import json
import requests
import time
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
import uuid
from typing import List, Dict, Any
class JinaQdrantEmbedder:
def __init__(self, jina_api_key: str, qdrant_host: str = "localhost"):
self.jina_api_key = jina_api_key
self.jina_url = 'https://api.jina.ai/v1/embeddings'
self.qdrant_client = QdrantClient(host=qdrant_host, port=6333)
self.headers = {
'Content-Type': 'application/json',
'Authorization': f'Bearer {jina_api_key}'
}
def load_json_data(self, file_path: str) -> List[Dict[str, str]]:
"""Load data from JSON file"""
with open(file_path, 'r', encoding='utf-8') as f:
data = json.load(f)
print(f"✓ Loaded {len(data)} items from {file_path}")
return data
def create_embeddings_batch(self, batch_data: List[Dict[str, str]], batch_num: int):
"""Create embeddings for a batch using Jina API with late chunking"""
jina_input = [{"text": item['text']} for item in batch_data]
data = {
"model": "jina-embeddings-v4",
"task": "text-matching",
"late_chunking": True, # This is the magic sauce!
"input": jina_input
}
response = requests.post(self.jina_url, headers=self.headers, json=data)
if response.status_code == 200:
result = response.json()
embeddings = result.get('data', [])
            combined_results = []
            for i, (original_item, embedding_data) in enumerate(zip(batch_data, embeddings)):
                combined_results.append({
                    'embedding': embedding_data['embedding'],
                    'original_data': original_item,
                    'batch_index': i  # index within this batch; the global index is assigned at storage time
                })
print(f" ✓ Generated {len(combined_results)} embeddings for batch {batch_num}")
return combined_results
else:
print(f" Error {response.status_code}: {response.text}")
return []
def store_in_qdrant(self, batch_results: List[Dict], collection_name: str, global_offset: int):
"""Store embeddings with original text in Qdrant"""
points = []
for i, item in enumerate(batch_results):
payload = {
'text': item['original_data']['text'],
'global_index': global_offset + i,
'type': 'text',
'source': 'user_data'
}
point = PointStruct(
id=str(uuid.uuid4()),
vector=item['embedding'],
payload=payload
)
points.append(point)
self.qdrant_client.upsert(collection_name=collection_name, points=points)
print(f" ✓ Stored {len(points)} points in Qdrant")
def embed_and_store(self, json_file_path: str, collection_name: str = "my_knowledge_base"):
"""Complete pipeline: JSON → Embeddings → Qdrant"""
print("🚀 Starting embedding pipeline...")
# Load data
data = self.load_json_data(json_file_path)
batch_size = 512 # Jina's limit
total_batches = (len(data) + batch_size - 1) // batch_size
# Process first batch to get vector dimensions
first_batch = data[:min(batch_size, len(data))]
first_results = self.create_embeddings_batch(first_batch, 1)
if not first_results:
print("❌ Failed to process first batch!")
return
# Create Qdrant collection
vector_size = len(first_results[0]['embedding'])
        # Recreate the collection from scratch; ignore the error if it doesn't exist yet
        try:
            self.qdrant_client.delete_collection(collection_name)
        except Exception:
            pass
self.qdrant_client.create_collection(
collection_name=collection_name,
vectors_config=VectorParams(size=vector_size, distance=Distance.COSINE)
)
# Store first batch
self.store_in_qdrant(first_results, collection_name, 0)
processed_items = len(first_batch)
# Process remaining batches
for batch_num in range(2, total_batches + 1):
start_idx = (batch_num - 1) * batch_size
end_idx = min(start_idx + batch_size, len(data))
batch_data = data[start_idx:end_idx]
print(f"Processing batch {batch_num}/{total_batches}...")
time.sleep(1) # Rate limiting
batch_results = self.create_embeddings_batch(batch_data, batch_num)
if batch_results:
self.store_in_qdrant(batch_results, collection_name, start_idx)
processed_items += len(batch_results)
print(f"🎉 Success! Processed {processed_items} items into '{collection_name}'")
# Usage
embedder = JinaQdrantEmbedder(jina_api_key="your_jina_api_key")
embedder.embed_and_store("your_data.json")
```
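Once the pipeline finishes, it's worth confirming everything actually landed in Qdrant. A quick exact count (same client settings and collection name as above) should match the number of items in your JSON file:
```python
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)

# Should equal the number of items in your_data.json
result = client.count(collection_name="my_knowledge_base", exact=True)
print(f"Points stored: {result.count}")
```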
## Step 3: Build the RAG System
Now let's create the retrieval system that connects everything:
```python
import requests
from typing import Dict, List

from openai import OpenAI
from qdrant_client import QdrantClient
class GaiaQdrantRAG:
def __init__(self, gaia_base_url: str, jina_api_key: str,
collection_name: str = "my_knowledge_base"):
# Initialize Gaia Node client
self.gaia_client = OpenAI(
base_url=gaia_base_url,
api_key="gaia" # Most Gaia nodes don't require real API keys
)
# Initialize Qdrant client
self.qdrant_client = QdrantClient(host="localhost", port=6333)
self.collection_name = collection_name
# Jina setup for query embeddings
self.jina_api_key = jina_api_key
self.jina_url = 'https://api.jina.ai/v1/embeddings'
def generate_query_embedding(self, query: str) -> List[float]:
"""Convert user question to embedding using same Jina model"""
headers = {
'Content-Type': 'application/json',
'Authorization': f'Bearer {self.jina_api_key}'
}
data = {
"model": "jina-embeddings-v4",
"task": "text-matching",
"input": [{"text": query}]
}
        response = requests.post(self.jina_url, headers=headers, json=data)
        response.raise_for_status()  # surface embedding API errors instead of failing later
        result = response.json()
        return result['data'][0]['embedding']
def search_knowledge_base(self, query_embedding: List[float], top_k: int = 3):
"""Find most relevant content from knowledge base"""
search_results = self.qdrant_client.search(
collection_name=self.collection_name,
query_vector=query_embedding,
limit=top_k,
score_threshold=0.6,
with_payload=True
)
return [
{
'text': result.payload['text'],
'score': result.score
}
for result in search_results
]
def generate_response(self, user_query: str, context_results: List[Dict]):
"""Generate response using Gaia Node with retrieved context"""
# Format context from search results
context = "\n".join([
f"[Source {i+1}] {result['text']}"
for i, result in enumerate(context_results)
])
# Create prompt for Gaia Node
system_prompt = """You are a helpful AI assistant. Use the provided context to answer the user's question accurately. If the context doesn't contain relevant information, say so clearly."""
user_prompt = f"""Context from knowledge base:
{context}
User Question: {user_query}
Please provide a helpful answer based on the context above."""
# Query Gaia Node
response = self.gaia_client.chat.completions.create(
model="gpt-3.5-turbo", # Use whatever model your Gaia node provides
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
],
max_tokens=500,
temperature=0.7
)
return response.choices[0].message.content
def ask(self, query: str) -> str:
"""Complete RAG pipeline: question → embedding → search → generate"""
print(f"🔍 Processing: {query}")
# Step 1: Convert question to embedding
query_embedding = self.generate_query_embedding(query)
# Step 2: Search knowledge base
relevant_content = self.search_knowledge_base(query_embedding)
if not relevant_content:
return "I couldn't find relevant information in the knowledge base."
# Step 3: Generate response with Gaia Node
response = self.generate_response(query, relevant_content)
return response
# Usage
rag = GaiaQdrantRAG(
gaia_base_url="https://your-gaia-node-url/v1",
jina_api_key="your_jina_api_key"
)
# Ask questions naturally!
answer = rag.ask("What do you know about machine learning?")
print(answer)
```
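If answers look off, inspect what retrieval is returning before blaming the model. Since the class exposes both halves of the pipeline, you can check the retrieved chunks and their scores directly:
```python
# Debug retrieval on its own, without generation
query = "What do you know about machine learning?"
query_embedding = rag.generate_query_embedding(query)

for hit in rag.search_knowledge_base(query_embedding, top_k=5):
    print(f"[{hit['score']:.3f}] {hit['text'][:100]}")
```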
## Step 4: Interactive AI Companion
Let's create a simple chat interface:
```python
def main():
"""Interactive chat with your AI companion"""
rag = GaiaQdrantRAG(
gaia_base_url="https://your-gaia-node-url/v1",
jina_api_key="your_jina_api_key"
)
print("🤖 AI Companion Ready! (Type 'quit' to exit)")
print("Ask me anything about your knowledge base...\n")
while True:
user_input = input("You: ").strip()
if user_input.lower() in ['quit', 'exit', 'q']:
print("Goodbye! 👋")
break
if not user_input:
continue
try:
response = rag.ask(user_input)
print(f"🤖 Assistant: {response}\n")
except Exception as e:
print(f"❌ Error: {str(e)}\n")
if __name__ == "__main__":
main()
```
## Why This Stack Rocks
**Jina Embeddings v4 with Late Chunking** provides:
- Superior multilingual understanding
- Better semantic search quality
- Efficient processing of long documents
**Gaia Node** offers:
- Decentralized AI inference
- Privacy-focused processing
- No vendor lock-in
**Local Qdrant** ensures:
- Fast vector searches
- Complete data privacy
- No external dependencies for retrieval
## Example Interaction
```
You: What are the main benefits of renewable energy?
🤖 Assistant: Based on your knowledge base, renewable energy offers several key benefits:
1. Environmental Impact: Significantly reduces carbon emissions and helps combat climate change
2. Economic Advantages: Creates jobs and reduces long-term energy costs
3. Energy Independence: Reduces reliance on fossil fuel imports
4. Sustainability: Provides an inexhaustible energy source for future generations
The context shows that solar and wind technologies have become increasingly cost-competitive with traditional energy sources.
```
## Performance Tips
1. **Batch Size**: Keep batches at 512 items for Jina API efficiency
2. **Vector Dimensions**: Jina v4 returns 2048-dimensional vectors, which are information-rich but correspondingly larger to store and search
3. **Search Threshold**: Start with a 0.6 similarity threshold and adjust for your data (see the tuning sketch after this list)
4. **Late Chunking**: Always enable this for better semantic understanding
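For the threshold tip, here's a minimal tuning loop, assuming the `rag` instance from Step 3: run a representative query at several cutoffs and watch how many results survive. Too few hits means the cutoff is too strict; lots of marginal hits means it's too loose.
```python
# Compare how many results survive at different score thresholds
query_embedding = rag.generate_query_embedding("renewable energy benefits")

for threshold in (0.4, 0.5, 0.6, 0.7, 0.8):
    hits = rag.qdrant_client.search(
        collection_name=rag.collection_name,
        query_vector=query_embedding,
        limit=10,
        score_threshold=threshold,
        with_payload=False
    )
    print(f"threshold={threshold}: {len(hits)} hits")
```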
## Next Steps
- [ ] Add document parsing (PDFs, Word docs)
- [ ] Implement conversation memory (a minimal sketch follows this list)
- [ ] Create a web interface with FastAPI
- [ ] Add real-time data updates
- [ ] Integrate with more Gaia nodes for redundancy
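As a starting point for the conversation-memory item, here's a minimal sketch rather than something built earlier in this post: it subclasses `GaiaQdrantRAG` and carries the last few chat turns into each Gaia Node call alongside the retrieved context.
```python
class ConversationalRAG(GaiaQdrantRAG):
    """Hypothetical extension: keeps a short rolling history of the chat."""

    def __init__(self, *args, history_limit: int = 6, **kwargs):
        super().__init__(*args, **kwargs)
        self.history = []  # alternating user/assistant messages
        self.history_limit = history_limit

    def ask(self, query: str) -> str:
        # Retrieve context exactly as the parent class does
        query_embedding = self.generate_query_embedding(query)
        context_results = self.search_knowledge_base(query_embedding)
        context = "\n".join(r['text'] for r in context_results)

        # Prepend retrieved context, then replay the most recent turns
        messages = [{"role": "system",
                     "content": f"Answer using this context:\n{context}"}]
        messages += self.history[-self.history_limit:]
        messages.append({"role": "user", "content": query})

        response = self.gaia_client.chat.completions.create(
            model="gpt-3.5-turbo",  # use whatever model your Gaia node serves
            messages=messages,
            max_tokens=500
        )
        answer = response.choices[0].message.content

        # Remember this exchange for the next turn
        self.history += [{"role": "user", "content": query},
                         {"role": "assistant", "content": answer}]
        return answer
```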
## Conclusion
You now have a powerful, privacy-focused AI companion that can understand and reason about your specific knowledge base. The combination of Jina's advanced embeddings with Gaia's decentralized inference creates a system that's both intelligent and respects your data privacy.
The best part? Everything runs locally except for the initial embedding generation, giving you complete control over your AI assistant.
**Ready to build your own AI companion?** Start with a small dataset, get the pipeline working, then scale up with your full knowledge base.
> The future of personal AI is decentralized, and you just built it! 🚀