# Retrieval Augmented Generation (RAG) Technical Plan / Design

# Overview

Retrieval-augmented generation (RAG) is a technique that combines a pre-trained language model with a retrieval system to generate responses grounded in a large corpus of documents. The retrieval system first fetches the documents or passages relevant to the input query; the language model then generates a response using both the query and the retrieved documents as context.

# Flow

![RAG Basic-2024-01-15-084414](https://hackmd.io/_uploads/SJVOldftT.png)

1. The process starts when a user sends a prompt to a class called *DocumentRetriever*.
2. The *DocumentRetriever* enters a loop in which it retrieves relevant external information. The loop indicates that retrieval may take multiple iterations to gather all the necessary documents. For now we use PostgreSQL text search.
3. Once the *DocumentRetriever* has gathered all relevant documents, it sends them to another class called *PromptContextGenerator*.
4. The *PromptContextGenerator* takes the documents and creates a context that includes both the user's original prompt and the relevant document information. It then sends this combined context to *ChatFlux* as the prompt.
5. At the same time, the *PromptContextGenerator* sends metadata about the documents to a component named *ReferenceBuilder*. This metadata likely includes details such as document titles, URLs, or other information needed to reference the documents properly.
6. *ChatFlux* processes the combined prompt and produces an answer, which is then sent to the *ReferenceBuilder*.
7. The *ReferenceBuilder* takes the answer from *ChatFlux* and combines it with the document metadata to construct a complete, well-referenced final answer.
8. Finally, the *ReferenceBuilder* sends this final answer back to the user, concluding the process.

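
The steps above can be sketched as a single orchestration function. This is only an illustration under stated assumptions, not the planned implementation: `answer_with_references` is a hypothetical name, the retriever is an in-memory word match standing in for PostgreSQL text search, the chat lambda fakes the call to *ChatFlux*, and the example URL is a placeholder.

```ruby
# Hypothetical orchestration of the flow. Collaborators are injected as
# callables so the order of the steps is visible without real services.
def answer_with_references(user_prompt, retriever:, chat:)
  # Steps 1-3: retrieve the documents relevant to the user's prompt.
  documents = retriever.call(user_prompt)

  # Step 4: build one prompt carrying both the context and the question.
  context = documents.map { |doc| doc[:content] }.join("\n")
  prompt = {
    role: 'user',
    content: "Context: #{context}, Question: #{user_prompt}. " \
             'Answer the question based on the provided context.'
  }

  # Step 5: set the document metadata aside for referencing.
  metadata = documents.map { |doc| doc[:metadata] }

  # Step 6: the chat model answers from the combined prompt.
  answer = chat.call(prompt)

  # Steps 7-8: combine the answer with the metadata and return it.
  { answer: answer, references: metadata }
end

# Stand-ins (assumptions): a tiny in-memory corpus and a fake chat model.
corpus = [
  { id: 1, metadata: { 'title' => 'RAG intro', 'url' => 'https://example.com/rag' },
    content: 'RAG combines retrieval with generation.' },
  { id: 2, metadata: { 'title' => 'Cooking', 'url' => 'https://example.com/cook' },
    content: 'Boil pasta for ten minutes.' }
]

# Keeps the documents that share at least one word with the query.
word_retriever = lambda do |text|
  query_words = text.downcase.scan(/\w+/)
  corpus.select { |doc| (doc[:content].downcase.scan(/\w+/) & query_words).any? }
end

fake_chat = ->(message) { "Stubbed answer for a #{message[:content].length}-character prompt" }

result = answer_with_references('What is RAG?', retriever: word_retriever, chat: fake_chat)
puts result[:references].map { |meta| meta['title'] }.inspect
```

Passing the retriever and chat model in as arguments keeps the sketch testable; a real version would presumably replace them with the PostgreSQL query and the *ChatFlux* client.
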
# New Models

- **Documents**:
  - id: int
  - metadata: jsonb
  - content: text

# New Classes

- **DocumentRetriever**
  - search(text:) --> documents
- **PromptContextGenerator**
  - prompt(documents:) --> {role: 'user', content: 'Context: #context, Question: #user_prompt. Answer the question based on the provided context.'}
  - metadata(documents:) --> {}
- **ReferenceBuilder**
  - citation(response: {}, metadata: {}) --> {}
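
One possible skeleton for the model and classes listed above is shown below. Only the method names and keyword arguments come from the plan. Everything else is an assumption made for illustration: the in-memory array replacing the PostgreSQL table, the word-overlap search, passing the user's prompt to the `PromptContextGenerator` constructor (the listed `prompt(documents:)` takes only documents), keying metadata by document id, and reading the answer from `response[:content]`.

```ruby
# Documents model: id (int), metadata (jsonb, a Hash here), content (text).
Document = Struct.new(:id, :metadata, :content, keyword_init: true)

class DocumentRetriever
  # Assumption: an in-memory array replaces the PostgreSQL-backed table.
  def initialize(documents)
    @documents = documents
  end

  # search(text:) --> documents
  # Naive stand-in for text search: keep documents sharing a word with the query.
  def search(text:)
    query_words = text.downcase.scan(/\w+/)
    @documents.select do |doc|
      (doc.content.downcase.scan(/\w+/) & query_words).any?
    end
  end
end

class PromptContextGenerator
  # Assumption: the user's prompt is supplied at construction time.
  def initialize(user_prompt)
    @user_prompt = user_prompt
  end

  # prompt(documents:) --> message hash for the chat model
  def prompt(documents:)
    context = documents.map(&:content).join("\n")
    {
      role: 'user',
      content: "Context: #{context}, Question: #{@user_prompt}. " \
               'Answer the question based on the provided context.'
    }
  end

  # metadata(documents:) --> {} (keyed by document id, an assumed shape)
  def metadata(documents:)
    documents.to_h { |doc| [doc.id, doc.metadata] }
  end
end

class ReferenceBuilder
  # citation(response: {}, metadata: {}) --> {}
  # Assumption: the response hash carries the answer text under :content.
  def citation(response:, metadata:)
    { answer: response[:content], references: metadata.values }
  end
end

# Usage with a stubbed chat response in place of ChatFlux.
sample_docs = [
  Document.new(id: 1, metadata: { 'title' => 'RAG intro' }, content: 'RAG combines retrieval with generation.'),
  Document.new(id: 2, metadata: { 'title' => 'Cooking' }, content: 'Boil pasta for ten minutes.')
]
question = 'What is RAG?'
found = DocumentRetriever.new(sample_docs).search(text: question)
generator = PromptContextGenerator.new(question)
message = generator.prompt(documents: found)
final = ReferenceBuilder.new.citation(
  response: { content: "stubbed reply to: #{message[:content]}" },
  metadata: generator.metadata(documents: found)
)
puts final[:references].inspect
```
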