RAG and Query analysis

## RAG

![rag_landscape-627f1d0fd46b92bc2db0af8f99ec3724](https://hackmd.io/_uploads/HyJ4tEAKkl.png)

## Routing

* Logical routing

  ![download (2)](https://hackmd.io/_uploads/SJkxUnh5ke.png)

  We can define each data source as a tool and bind the tools to the LLM, so that the LLM decides which source to use.

* Semantic routing

  ![download (1)](https://hackmd.io/_uploads/ByjCS3h9yl.png)

  We embed the question and the candidate prompts, compute their similarity, and pick the prompt (and its source) closest to the question.

## Indexing

* Multi-representation indexing
* Hierarchical Indexing
* Specialized Embeddings

## Query analysis

When a user interacts with a RAG LLM, the relevant context may depend on specific keywords or query methods. Query analysis serves as the bridge between the user's raw input and the model's best output. Its main applications at present are:

1. **Query Re-writing**: Queries can be re-written or expanded to improve semantic or lexical searches.
2. **Query Construction**: Search indexes may require structured queries (e.g. SQL for databases).

### Query re-writing

User queries range from rough descriptions to detailed problem statements, and from simple to multi-faceted questions. A popular way to handle this range is to use an LLM to rewrite the query into a more effective one. The benefits include:

* Query Clarification: Models can rephrase ambiguous or poorly worded queries for clarity.
* Semantic Understanding: They can capture the intent behind a query, going beyond literal keyword matching.
* Query Expansion: Models can generate related terms or concepts to broaden the search scope.
* Complex Query Handling: They can break down multi-part questions into simpler sub-queries.

Several techniques have been developed:

| Name | When to use | Description |
| ----------------- | ----------- | ----------- |
| **Multi-query** | When you want to ensure high recall in retrieval by providing multiple phrasings of a question. | Rewrite the user question with multiple phrasings, retrieve documents for each rewritten question, and return the unique documents across all queries. |
| **Step-back** | When a higher-level conceptual understanding is required. | First prompt the LLM to ask a generic step-back question about higher-level concepts or principles, and retrieve relevant facts about them. |
| **HyDE** | If you have challenges retrieving relevant documents using the raw user inputs. | Use an LLM to convert questions into hypothetical documents that answer the question. |
| **Decomposition** | When a question can be broken down into smaller subproblems. | Decompose a question into a set of subproblems / questions, which can be solved either sequentially or in parallel. |

# Advanced RAG

## Agentic RAG

Agentic RAG uses an agent to decide whether retrieval is needed and how to retrieve the most relevant data to answer the user's question.

![image](https://hackmd.io/_uploads/Hy72Mwp9Je.png)

## Adaptive RAG

[Adaptive RAG](https://arxiv.org/abs/2403.14403) combines a query analysis component with active / self-corrective RAG to improve the robustness of LLM-based RAG.

![image](https://hackmd.io/_uploads/HJq7rP69Jl.png)

![image](https://hackmd.io/_uploads/SkA2Ewpcyl.png)

## Corrective RAG (CRAG)

[CRAG (Corrective Retrieval Augmented Generation)](https://arxiv.org/abs/2401.15884) is a method for improving the robustness of language model generation. It has two main components: a Retrieval Evaluator and Knowledge Refinement.

![image](https://hackmd.io/_uploads/rJXObwp5Jg.png)

![download (3)](https://hackmd.io/_uploads/HJRWZva9kl.png)

**Retrieval Evaluator**: Assesses the quality of the documents retrieved for a given question and classifies the result as correct, ambiguous, or incorrect. When the result is correct, the documents go straight to knowledge refinement and the extracted key information is passed to the LLM. When it is incorrect, web search is used to supplement the knowledge. When it is ambiguous, both operations are combined to improve the accuracy and reliability of the answer.

**Knowledge Refinement**: Decomposes the documents and then recomposes them, in order to dig out the core knowledge they contain. Documents can be split with custom rules, and the retrieval evaluator scores the relevance of each piece. The remaining relevant pieces are then recombined.

## Self-RAG

[Self-RAG](https://arxiv.org/abs/2310.11511) is a RAG method that improves retrieval and generation quality through self-reflection and self-grading.

![image](https://hackmd.io/_uploads/ryoNFw6q1g.png)

![download (5)](https://hackmd.io/_uploads/HkMeKDacJg.png)

## SQL Agent

Build a SQL agent that can answer questions about a SQL database.
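
One way to picture the SQL agent mentioned above is as a short pipeline: read the schema, let a model write a query, check the query, then run it. The sketch below is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module, `generate_sql` is a hypothetical stand-in for the LLM call (it returns a fixed query), and the `tracks` table and the `is_safe_select` guard are invented for the example.

```python
import sqlite3


def list_schema(conn):
    """Return the CREATE TABLE statements so a model could see the table layouts."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def generate_sql(question, schema):
    """Hypothetical stand-in for the LLM call that writes a query.

    A real agent would prompt a model with the question and the schema;
    here a fixed query is returned so the sketch runs on its own.
    """
    return "SELECT COUNT(*) FROM tracks WHERE genre = 'Jazz'"


def is_safe_select(sql):
    """Coarse guard (an assumption of this sketch): allow one SELECT statement only."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(conn, question):
    """Schema -> query generation -> query check -> execution."""
    schema = list_schema(conn)
    sql = generate_sql(question, schema)
    if not is_safe_select(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


# Tiny in-memory database, invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (title TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [("So What", "Jazz"), ("Blue in Green", "Jazz"), ("Hey Jude", "Rock")],
)
result = answer(conn, "How many jazz tracks are there?")
print(result)
```

Keeping the query check as a separate step means a badly generated statement is rejected before it reaches the database; a real agent would typically feed the error back to the model and retry.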
# Memory

In LangGraph, memory comes in two kinds.

![short-vs-long](https://hackmd.io/_uploads/SJUBu66c1l.png)

**Short-term memory**: Thread-scoped memory. It records user information within a single conversation thread. LangGraph keeps it in the state. Short-term memory is updated when the model is invoked or when a step completes.

**Long-term memory**: Shared across conversation threads; it can be recalled at any time and in any thread. It is recorded under a user's namespace. LangGraph uses stores to save and recall long-term memories, so we can use them to keep information across conversations and sessions. LangGraph stores long-term memories as JSON documents.

**Memory types**

| Memory Type | What is Stored | Human Example | Agent Example |
| ----------- | -------------- | -------------------------- | ------------------- |
| Semantic | Facts | Things I learned in school | Facts about a user |
| Episodic | Experiences | Things I did | Past agent actions |
| Procedural | Instructions | Instincts or motor skills | Agent system prompt |

**Semantic Memory** involves the retention of specific facts and concepts. Semantic memory can be implemented as a single profile of a user, an organization, or any other entity.

![update-profile](https://hackmd.io/_uploads/ry3QiTaqkg.png)

It can also be a collection of many documents, a list that is updated and extended over time. Each item in the list can be narrowly scoped and is easy to generate. The drawback is that it can be hard to reconcile whether to update, merge, or add items.

![update-list](https://hackmd.io/_uploads/rkl40aaq1l.png)

**Episodic Memory** involves recalling past events or actions. As the [CoALA paper](https://arxiv.org/pdf/2309.02427) frames it, facts can be written to semantic memory, whereas experiences can be written to episodic memory. For AI agents, episodic memory is typically used to help the agent recall how to accomplish a task. In practice, it is usually implemented with few-shot example prompting.

**Procedural Memory** involves remembering the rules used to perform tasks. For AI agents, procedural memory is more like a combination of model weights, agent code, and the agent's prompt. In practice we rarely touch the model weights, but we can modify the prompts, using methods such as reflection or meta-prompting. For example, the system prompt can be revised based on the conversation and explicit user feedback.

![update-instructions](https://hackmd.io/_uploads/HJvOZ0ac1e.png)

When to write memories is also an open question: on the hot path (online) or in the background?

![hot_path_vs_background](https://hackmd.io/_uploads/Skwhb0p5kx.png)
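
The difference between a profile and a collection for semantic memory, and the idea of namespaced JSON documents for long-term memory, can be sketched in plain Python. This is a rough illustration, not the LangGraph store API: the `NamespacedStore` class, the helper functions `update_profile` and `add_memory`, and the namespace layout `(user_id, "profile")` / `(user_id, "memories")` are all assumptions made up for the example.

```python
import json


class NamespacedStore:
    """Illustrative long-term memory store (hypothetical, not the LangGraph API)."""

    def __init__(self):
        # Maps (namespace tuple, key) -> JSON string.
        self._data = {}

    def put(self, namespace, key, value):
        # Serialising to JSON means only JSON-compatible memories are accepted.
        self._data[(tuple(namespace), key)] = json.dumps(value)

    def get(self, namespace, key):
        raw = self._data.get((tuple(namespace), key))
        return None if raw is None else json.loads(raw)

    def search(self, namespace):
        # Return every document stored under exactly this namespace.
        ns = tuple(namespace)
        return {k: json.loads(v) for (n, k), v in self._data.items() if n == ns}


def update_profile(store, user_id, new_facts):
    """Profile style: merge new facts into one document per user."""
    ns = (user_id, "profile")
    profile = store.get(ns, "profile") or {}
    profile.update(new_facts)  # newer facts overwrite older ones
    store.put(ns, "profile", profile)
    return profile


def add_memory(store, user_id, key, fact):
    """Collection style: add a small independent document to a growing list."""
    store.put((user_id, "memories"), key, {"fact": fact})


store = NamespacedStore()

# Profile: a single document that is rewritten on every update.
update_profile(store, "user-1", {"name": "Ada", "language": "Python"})
profile = update_profile(store, "user-1", {"language": "Rust"})

# Collection: many narrow documents that accumulate over time.
add_memory(store, "user-1", "m1", "prefers short answers")
add_memory(store, "user-1", "m2", "works in UTC+8")
memories = store.search(("user-1", "memories"))

print(profile)
print(sorted(memories))
```

The profile stays compact but every update must rewrite the whole document, while the collection is easy to append to but leaves the update-versus-add decision to the caller. Because the store is keyed by namespace rather than by thread, the same memories are visible from any conversation.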