# Building Your Own AI Companion: Combining Gaia Node with Jina Embeddings v4 with Late Chunking Ever wanted to build an AI assistant that actually knows about your specific data? Today, we'll walk through creating a powerful RAG (Retrieval-Augmented Generation) system that combines **Gaia Node's decentralized AI** with **Jina's state-of-the-art embeddings** to build an intelligent companion that can answer questions about your personal knowledge base. ## What We're Building Our AI companion will: - Convert your text data into high-quality vector embeddings using Jina AI - Store these embeddings locally in a Qdrant vector database - Use natural language to search through your knowledge base - Generate contextual responses via a Gaia Node **Why This Stack?** - **Gaia Node**: Decentralized, privacy-focused AI inference - **Jina Embeddings v4**: Superior multilingual embeddings with late chunking - **Qdrant**: Fast, local vector database - **Complete Privacy**: Everything runs locally except embedding generation ## Prerequisites ```bash pip install qdrant-client requests openai ``` You'll also need: - A running Qdrant instance (local or Docker) - Access to a Gaia Node. Run your own by following [this tutorial](https://docs.gaianet.ai/getting-started/quick-start/) - Jina AI API key (free tier available). Get one [here](https://jina.ai/embeddings/). Start Qdrant locally: ```bash docker run -p 6333:6333 qdrant/qdrant ``` ## Step 1: Prepare Your Data First, organize your data in a simple JSON format: ```json [ {"text": "Your first piece of knowledge"}, {"text": "Another important fact"}, {"text": "More information about your domain"} ] ``` Save this as `your_data.json`. ## Step 2: Generate Embeddings with Jina Here's our embedding pipeline that handles Jina's batch limits and stores everything with the original text: ```python import json import requests import time from qdrant_client import QdrantClient from qdrant_client.models import Distance, VectorParams, PointStruct import uuid from typing import List, Dict, Any class JinaQdrantEmbedder: def __init__(self, jina_api_key: str, qdrant_host: str = "localhost"): self.jina_api_key = jina_api_key self.jina_url = 'https://api.jina.ai/v1/embeddings' self.qdrant_client = QdrantClient(host=qdrant_host, port=6333) self.headers = { 'Content-Type': 'application/json', 'Authorization': f'Bearer {jina_api_key}' } def load_json_data(self, file_path: str) -> List[Dict[str, str]]: """Load data from JSON file""" with open(file_path, 'r', encoding='utf-8') as f: data = json.load(f) print(f"✓ Loaded {len(data)} items from {file_path}") return data def create_embeddings_batch(self, batch_data: List[Dict[str, str]], batch_num: int): """Create embeddings for a batch using Jina API with late chunking""" jina_input = [{"text": item['text']} for item in batch_data] data = { "model": "jina-embeddings-v4", "task": "text-matching", "late_chunking": True, # This is the magic sauce! "input": jina_input } response = requests.post(self.jina_url, headers=self.headers, json=data) if response.status_code == 200: result = response.json() embeddings = result.get('data', []) combined_results = [] for i, (original_item, embedding_data) in enumerate(zip(batch_data, embeddings)): combined_results.append({ 'embedding': embedding_data['embedding'], 'original_data': original_item, 'global_index': len(combined_results) }) print(f" ✓ Generated {len(combined_results)} embeddings for batch {batch_num}") return combined_results else: print(f" Error {response.status_code}: {response.text}") return [] def store_in_qdrant(self, batch_results: List[Dict], collection_name: str, global_offset: int): """Store embeddings with original text in Qdrant""" points = [] for i, item in enumerate(batch_results): payload = { 'text': item['original_data']['text'], 'global_index': global_offset + i, 'type': 'text', 'source': 'user_data' } point = PointStruct( id=str(uuid.uuid4()), vector=item['embedding'], payload=payload ) points.append(point) self.qdrant_client.upsert(collection_name=collection_name, points=points) print(f" ✓ Stored {len(points)} points in Qdrant") def embed_and_store(self, json_file_path: str, collection_name: str = "my_knowledge_base"): """Complete pipeline: JSON → Embeddings → Qdrant""" print("🚀 Starting embedding pipeline...") # Load data data = self.load_json_data(json_file_path) batch_size = 512 # Jina's limit total_batches = (len(data) + batch_size - 1) // batch_size # Process first batch to get vector dimensions first_batch = data[:min(batch_size, len(data))] first_results = self.create_embeddings_batch(first_batch, 1) if not first_results: print("❌ Failed to process first batch!") return # Create Qdrant collection vector_size = len(first_results[0]['embedding']) try: self.qdrant_client.delete_collection(collection_name) except: pass self.qdrant_client.create_collection( collection_name=collection_name, vectors_config=VectorParams(size=vector_size, distance=Distance.COSINE) ) # Store first batch self.store_in_qdrant(first_results, collection_name, 0) processed_items = len(first_batch) # Process remaining batches for batch_num in range(2, total_batches + 1): start_idx = (batch_num - 1) * batch_size end_idx = min(start_idx + batch_size, len(data)) batch_data = data[start_idx:end_idx] print(f"Processing batch {batch_num}/{total_batches}...") time.sleep(1) # Rate limiting batch_results = self.create_embeddings_batch(batch_data, batch_num) if batch_results: self.store_in_qdrant(batch_results, collection_name, start_idx) processed_items += len(batch_results) print(f"🎉 Success! Processed {processed_items} items into '{collection_name}'") # Usage embedder = JinaQdrantEmbedder(jina_api_key="your_jina_api_key") embedder.embed_and_store("your_data.json") ``` ## Step 3: Build the RAG System Now let's create the retrieval system that connects everything: ```python import openai from openai import OpenAI class GaiaQdrantRAG: def __init__(self, gaia_base_url: str, jina_api_key: str, collection_name: str = "my_knowledge_base"): # Initialize Gaia Node client self.gaia_client = OpenAI( base_url=gaia_base_url, api_key="gaia" # Most Gaia nodes don't require real API keys ) # Initialize Qdrant client self.qdrant_client = QdrantClient(host="localhost", port=6333) self.collection_name = collection_name # Jina setup for query embeddings self.jina_api_key = jina_api_key self.jina_url = 'https://api.jina.ai/v1/embeddings' def generate_query_embedding(self, query: str) -> List[float]: """Convert user question to embedding using same Jina model""" headers = { 'Content-Type': 'application/json', 'Authorization': f'Bearer {self.jina_api_key}' } data = { "model": "jina-embeddings-v4", "task": "text-matching", "input": [{"text": query}] } response = requests.post(self.jina_url, headers=headers, json=data) result = response.json() return result['data'][0]['embedding'] def search_knowledge_base(self, query_embedding: List[float], top_k: int = 3): """Find most relevant content from knowledge base""" search_results = self.qdrant_client.search( collection_name=self.collection_name, query_vector=query_embedding, limit=top_k, score_threshold=0.6, with_payload=True ) return [ { 'text': result.payload['text'], 'score': result.score } for result in search_results ] def generate_response(self, user_query: str, context_results: List[Dict]): """Generate response using Gaia Node with retrieved context""" # Format context from search results context = "\n".join([ f"[Source {i+1}] {result['text']}" for i, result in enumerate(context_results) ]) # Create prompt for Gaia Node system_prompt = """You are a helpful AI assistant. Use the provided context to answer the user's question accurately. If the context doesn't contain relevant information, say so clearly.""" user_prompt = f"""Context from knowledge base: {context} User Question: {user_query} Please provide a helpful answer based on the context above.""" # Query Gaia Node response = self.gaia_client.chat.completions.create( model="gpt-3.5-turbo", # Use whatever model your Gaia node provides messages=[ {"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt} ], max_tokens=500, temperature=0.7 ) return response.choices[0].message.content def ask(self, query: str) -> str: """Complete RAG pipeline: question → embedding → search → generate""" print(f"🔍 Processing: {query}") # Step 1: Convert question to embedding query_embedding = self.generate_query_embedding(query) # Step 2: Search knowledge base relevant_content = self.search_knowledge_base(query_embedding) if not relevant_content: return "I couldn't find relevant information in the knowledge base." # Step 3: Generate response with Gaia Node response = self.generate_response(query, relevant_content) return response # Usage rag = GaiaQdrantRAG( gaia_base_url="https://your-gaia-node-url/v1", jina_api_key="your_jina_api_key" ) # Ask questions naturally! answer = rag.ask("What do you know about machine learning?") print(answer) ``` ## Step 4: Interactive AI Companion Let's create a simple chat interface: ```python def main(): """Interactive chat with your AI companion""" rag = GaiaQdrantRAG( gaia_base_url="https://your-gaia-node-url/v1", jina_api_key="your_jina_api_key" ) print("🤖 AI Companion Ready! (Type 'quit' to exit)") print("Ask me anything about your knowledge base...\n") while True: user_input = input("You: ").strip() if user_input.lower() in ['quit', 'exit', 'q']: print("Goodbye! 👋") break if not user_input: continue try: response = rag.ask(user_input) print(f"🤖 Assistant: {response}\n") except Exception as e: print(f"❌ Error: {str(e)}\n") if __name__ == "__main__": main() ``` ## Why This Stack Rocks **Jina Embeddings v4 with Late Chunking** provides: - Superior multilingual understanding - Better semantic search quality - Efficient processing of long documents **Gaia Node** offers: - Decentralized AI inference - Privacy-focused processing - No vendor lock-in **Local Qdrant** ensures: - Fast vector searches - Complete data privacy - No external dependencies for retrieval ## Example Interaction ``` You: What are the main benefits of renewable energy? 🤖 Assistant: Based on your knowledge base, renewable energy offers several key benefits: 1. Environmental Impact: Significantly reduces carbon emissions and helps combat climate change 2. Economic Advantages: Creates jobs and reduces long-term energy costs 3. Energy Independence: Reduces reliance on fossil fuel imports 4. Sustainability: Provides an inexhaustible energy source for future generations The context shows that solar and wind technologies have become increasingly cost-competitive with traditional energy sources. ``` ## Performance Tips 1. **Batch Size**: Keep batches at 512 items for Jina API efficiency 2. **Vector Dimensions**: Jina v4 uses 2048 dimensions - very information-rich 3. **Search Threshold**: Start with 0.6 similarity threshold, adjust based on your data 4. **Late Chunking**: Always enable this for better semantic understanding ## Next Steps - [ ] Add document parsing (PDFs, Word docs) - [ ] Implement conversation memory - [ ] Create a web interface with FastAPI - [ ] Add real-time data updates - [ ] Integrate with more Gaia nodes for redundancy ## Conclusion You now have a powerful, privacy-focused AI companion that can understand and reason about your specific knowledge base. The combination of Jina's advanced embeddings with Gaia's decentralized inference creates a system that's both intelligent and respects your data privacy. The best part? Everything runs locally except for the initial embedding generation, giving you complete control over your AI assistant. **Ready to build your own AI companion?** Start with a small dataset, get the pipeline working, then scale up with your full knowledge base. > The future of personal AI is decentralized, and you just built it! 🚀