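RAG has five key stages, which will be part of any larger application you build. These stages are:
- Loading
  - Getting data from where it lives (text files, PDFs, other websites, databases, or APIs) into the pipeline. LlamaHub provides hundreds of connectors to choose from
- Indexing
  - Creating a data structure that allows the data to be queried
  - For LLMs this nearly always means creating vector embeddings, which are numerical representations of the meaning of the data, along with many other metadata strategies that make it easy to accurately find contextually relevant data
- Storing
  - Once data is indexed, you will almost always want to store the index, along with other metadata, to avoid re-indexing
- Querying
  - For any given indexing strategy there are many ways to query using LLMs and LlamaIndex data structures, including sub-queries, multi-step queries, and hybrid strategies
- Evaluation
  - A critical step in any pipeline is checking how effective it is relative to other strategies, or after you make changes
  - Evaluation provides objective measures of how accurate, faithful, and fast the responses to queries are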

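Course summary

The concept of RAG and why it matters
- Definition: Retrieval Augmented Generation (RAG) combines conventional language-model generation with a data retrieval step, so the model can consult relevant external information before generating an answer
- Importance: this approach improves answer accuracy and also adds breadth and depth to answers, especially for domain-specific or rarely covered topics
Challenges in building a high-quality RAG system
- Cost: effective retrieval involves hardware and software costs as well as the work of collecting and processing data
- Iteration and maintenance: continually improving the system takes expertise and time, and the system has to be adjusted as data and use cases change
Advanced retrieval methods
The course covers two advanced retrieval methods that give the LLM better context than simpler methods
- Sentence window retrieval:
  - Retrieves not only the most relevant sentence but also the window of sentences before and after it, providing fuller context.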
  - Helps the model understand and use the context in the text, especially when a single sentence does not carry enough information
- Auto-merging retrieval:
  - Organizes the document as a tree; when child nodes are relevant to the question, the parent node's full text is used as context.
  - Dynamically provides more complete text passages, which helps with complex queries
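Metrics for evaluating LLM question-answering systems
Evaluation uses the RAG triad: three metrics designed specifically for RAG systems, covering the three main steps of a RAG run:
- Context Relevance:
  - Assesses how relevant the retrieved text chunks are to the user's question. It helps determine how efficiently and accurately the system supplies context to the LLM
  - Checks whether the most useful parts were selected from a large body of information
- Groundedness:
  - Assesses whether the answer rests on a solid basis of facts or data
  - It focuses on whether the answer reflects accurate and relevant information sources
- Answer Relevance:
  - Assesses not only whether the answer addresses the question
  - but also its appropriateness and the user's specific needs
Together, these three metrics give a comprehensive view for evaluating LLM question-answering systems and help developers and researchers understand and improve RAG system performance.
Hands-on practice and applying the evaluation methods
- Systematic iteration: how to adjust and optimize for specific use cases and user groups
- Experiment tracking: experiment tracking not only speeds up improvement but is also an important way to understand model performance and user feedback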

–

Advanced RAG Pipeline

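This section is a quick overview; the next three sections go into more depth

Course overview

A comprehensive overview of how to build basic and advanced RAG (Retrieval Augmented Generation) pipelines, mainly using LlamaIndex together with TruLens for performance evaluation, including:

Sentence window retrieval:
- After embedding and retrieving individual sentences, takes a larger window of surrounding sentences centered on the originally retrieved sentence, giving the LLM (large language model) more context to improve answer quality
Auto-merging retrieval:
- Builds a hierarchy of large nodes (parent nodes) and small nodes (child nodes), with child nodes linked to their parent
- At retrieval time, if a majority of a parent node's child nodes are retrieved, those child nodes are replaced by the parent node, merging nodes up the hierarchy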

Basic RAG pipeline

As the figure below shows (source), the basic RAG pipeline uses the same text chunks for indexing/embedding and for output synthesis.

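Advantages:
- Simple and efficient: the approach is direct and effective; using the same text chunks for embedding and synthesis simplifies the retrieval process
- Consistent data handling: it keeps data usage consistent across the retrieval and synthesis stages
Disadvantages:
- Limited contextual understanding: LLMs may need a larger window during synthesis to generate better responses, and this approach may not provide enough
- Suboptimal responses: with limited context, the LLM may not have enough information to generate the most relevant and accurate response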

Lab Lesson 1: Advanced RAG Pipeline

Environment setup

import utils

import os
import openai
openai.api_key = utils.get_openai_api_key()

from llama_index import SimpleDirectoryReader

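Loading documents (Ingestion)
This stage consists of the following steps:
- Documents: the input document data
- Chunks: split the documents into smaller pieces for processing
- Embeddings: convert the chunks into numeric vectors
- Index: build a vector index for retrieval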
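
The four ingestion steps above can be sketched in plain Python without any library. This is a toy illustration, not how LlamaIndex implements them: the bag-of-words `embed` function stands in for a neural embedding model, and every name here (`chunk_text`, `embed`, `build_index`, `retrieve`) is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8):
    """Chunks: split a document into pieces of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Embeddings (toy stand-in): a bag-of-words count vector, not a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents, chunk_size=8):
    """Index: a flat list of (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk_text(doc, chunk_size)]


def retrieve(index, query, top_k=1):
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Documents: two made-up one-sentence documents.
docs = [
    "Networking helps you find mentors and job leads in AI.",
    "Start with small projects and grow their scope over time.",
]
index = build_index(docs, chunk_size=6)
top = retrieve(index, "How should I pick projects?", top_k=1)
print(top)
```

A real pipeline replaces `embed` with a trained embedding model and the list with a vector store, but the data flow (documents, chunks, embeddings, index) is the same.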








documents = SimpleDirectoryReader(
    input_files=["./eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

results

<class 'list'> 

41 

<class 'llama_index.schema.Document'>
Doc ID: dd31cb7a-d550-48a5-8af6-254a05b9c5a9
Text: PAGE 1Founder, DeepLearning.AICollected Insights from Andrew Ng
How to  Build Your Career in AIA Simple Guide

Basic RAG pipeline

Specify the LLM and build the service_context for the retrieval service

from llama_index import VectorStoreIndex
from llama_index import ServiceContext
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model="local:BAAI/bge-small-en-v1.5"
)

Merge documents: combine the text of multiple documents into one long text

from llama_index import Document
document = Document(text="\n\n".join([doc.text for doc in documents]))

Build the index

index = VectorStoreIndex.from_documents([document],
                                        service_context=service_context)

# <llama_index.indices.vector_store.base.VectorStoreIndex at 0x7fdcf6f6d450>

Build the query engine

query_engine = index.as_query_engine()

User query

response = query_engine.query(
    "What are steps to take when finding projects to build your experience?"
)
print(str(response))

Advanced RAG pipeline

1. Sentence Window retrieval

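For the detailed workflow, implementation, and experiments, see Sentence-window retrieval

The sentence window retrieval method breaks documents into smaller units, such as single sentences or small groups of sentences. It decouples the embeddings used for retrieval (these smaller chunks are stored in the vector DB) from the text used for synthesis (the generation stage), where the context surrounding the retrieved chunk is added back

During retrieval, the sentences most relevant to the query are found by similarity search, and each is replaced with its full surrounding context (a static sentence window, built by fetching the sentences around the originally retrieved sentence), as shown in the figure below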
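
The retrieve-then-expand idea can be illustrated with a small self-contained sketch. It is not the `SentenceWindowNodeParser` implementation: the word-overlap score is a toy stand-in for embedding similarity, and `split_sentences`, `overlap_score`, and `sentence_window_retrieve` are hypothetical names.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def overlap_score(query, sentence):
    """Toy relevance: number of shared lowercase word tokens."""
    q = Counter(re.findall(r"[a-z0-9]+", query.lower()))
    s = Counter(re.findall(r"[a-z0-9]+", sentence.lower()))
    return sum((q & s).values())


def sentence_window_retrieve(text, query, window_size=1):
    """Match on a single sentence, then return it with `window_size` neighbors on each side."""
    sentences = split_sentences(text)
    best = max(range(len(sentences)), key=lambda i: overlap_score(query, sentences[i]))
    start = max(0, best - window_size)
    end = min(len(sentences), best + window_size + 1)
    return sentences[best], " ".join(sentences[start:end])


text = (
    "Choose a project that fits your goals. "
    "Scope it so it can be finished in weeks. "
    "Then share the results with others. "
    "Feedback will guide your next step."
)
hit, context = sentence_window_retrieve(text, "How do I scope my work?", window_size=1)
print(hit)
print(context)
```

Only the single sentence is scored against the query, while the wider window is what would be passed to the LLM for synthesis.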

2. Auto-merging retrieval / Hierarchical Retriever

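For the detailed workflow, implementation, and experiments, see Auto-merging retrieval

The figure below shows how auto-merging retrieval works: it avoids retrieving a pile of overly fragmented text chunks the way the basic approach does. When the basic approach splits text too finely, the result is a set of fragmented text chunks with incomplete information

Auto-merging retrieval aims to merge (or fuse) information from multiple sources or text passages to produce a response that is more comprehensive and more relevant to the context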
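
The merge rule (replace retrieved child nodes with their parent once enough of that parent's children were retrieved) can be sketched as follows. This is a minimal illustration under stated assumptions, not LlamaIndex's retriever: the node ids, the two lookup dictionaries, and the `threshold` value of 0.5 are all made up for the example.

```python
from collections import defaultdict


def auto_merge(retrieved_child_ids, parent_of, children_of, threshold=0.5):
    """Replace children with their parent when more than `threshold`
    of that parent's children were retrieved; otherwise keep the children."""
    hits_per_parent = defaultdict(list)
    for child_id in retrieved_child_ids:
        hits_per_parent[parent_of[child_id]].append(child_id)

    merged = []
    for parent_id, hits in hits_per_parent.items():
        ratio = len(hits) / len(children_of[parent_id])
        if ratio > threshold:
            merged.append(parent_id)  # enough children matched: use the parent's full text
        else:
            merged.extend(hits)       # too few matched: keep the individual children
    return merged


# Hypothetical two-level hierarchy: P1 has three children, P2 has four.
children_of = {"P1": ["c1", "c2", "c3"], "P2": ["c4", "c5", "c6", "c7"]}
parent_of = {c: p for p, cs in children_of.items() for c in cs}

result = auto_merge(["c1", "c2", "c4"], parent_of, children_of, threshold=0.5)
print(result)
```

Here two of P1's three children were retrieved, so they are merged into P1, while the single hit under P2 stays as a child node.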

lab

Sentence Window retrieval



from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

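build_sentence_window_index()
- Implemented with LlamaIndex's SentenceWindowNodeParser
  - In the official documentation example, the combination of Metadata Replacement + Node Sentence Window outperforms a basic VectorStoreIndex
  - The problem of losing key information in long contexts: 2307. Lost in the Middle: How Language Models Use Long Contexts. Other supporting techniques are therefore needed to help retrieve useful information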

Build the sentence window index

from utils import build_sentence_window_index

sentence_index = build_sentence_window_index(
    document,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="sentence_index"
)







from utils import get_sentence_window_query_engine

sentence_window_engine = get_sentence_window_query_engine(sentence_index)

window_response = sentence_window_engine.query(
    "how do I get started on a personal project in AI?"
)
print(str(window_response))

To get started on a personal project in AI, it is important to first identify and scope the project. Consider your career goals and choose a project that complements them. Ensure that the project is responsible, ethical, and beneficial to people. As you progress in your career, aim for projects that grow in scope, complexity, and impact over time. Building a portfolio of projects that shows skill progression can also be helpful. Additionally, there are resources available in the book that provide guidance on starting your AI job search and finding the right AI job for you.

Auto-merging retrieval

from utils import build_automerging_index

automerging_index = build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index"
)

from utils import get_automerging_query_engine

automerging_query_engine = get_automerging_query_engine(
    automerging_index,
)

auto_merging_response = automerging_query_engine.query(
    "How do I build a portfolio of AI projects?"
)
print(str(auto_merging_response))

Inspecting the response

> Merging 1 nodes into parent node.
> Parent node id: 40e80a95-972f-484b-ab69-f4309b819c7b.
> Parent node text: PAGE 21Building a Portfolio of 
Projects that Shows 
Skill Progression CHAPTER 6
PROJECTS

> Merging 1 nodes into parent node.
> Parent node id: f83e0a3d-87d4-49da-935f-97cfcef45c6f.
> Parent node text: PAGE 21Building a Portfolio of 
Projects that Shows 
Skill Progression CHAPTER 6
PROJECTS

To build a portfolio of AI projects, it is important to start with simple undertakings and gradually progress to more complex ones. This progression over time will demonstrate your growth and development in the field. Additionally, effective communication is crucial. You should be able to explain your thought process and the value of your work to others. This will help others see the potential in your projects and trust you with resources for larger endeavors.

Reviewing the evaluation results

Evaluation setup using `TruLens`

See the RAG Triad of metrics section for details

Build the question set used to evaluate the RAG pipeline's performance

eval_questions = []
with open('eval_questions.txt', 'r') as file:
    for line in file:
        # Strip the trailing newline and surrounding whitespace
        item = line.strip()
        print(item)
        eval_questions.append(item)
        
# What are the keys to building a career in AI?
# How can teamwork contribute to success in AI?
# What is the importance of networking in AI?
# What are some good habits to develop for a successful career?
# How can altruism be beneficial in building a career?
# What is imposter syndrome and how does it relate to AI?
# Who are some accomplished individuals who have experienced imposter syndrome?
# What is the first step to becoming good at AI?
# What are some common challenges in AI?
# Is it normal to find parts of AI challenging?

Initialize TruLens and reset the database

from trulens_eval import Tru
tru = Tru()

tru.reset_database()

For the classroom, we've written some of the code in helper functions inside a utils.py file.

You can view the utils.py file in the file directory by clicking on the "Jupyter" logo at the top of the notebook.

In later lessons, you'll get to work directly with the code that's currently wrapped inside these helper functions, to give you more options to customize your RAG pipeline.

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `Tru` to prevent this.




from utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(query_engine,
                                             app_id="Direct Query Engine")

Reviewing the evaluation results

The Context Relevance score obtained with the basic approach is low

with tru_recorder as recording:
    for question in eval_questions:
        response = query_engine.query(question)

records, feedback = tru.get_records_and_feedback(app_ids=[])

records.head()

Launch the TruLens dashboard

# launches on http://localhost:8501/
tru.run_dashboard()

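Comparing the two advanced RAG methods with the base method
- Context Relevance scores improve considerably over the direct query engine, and total cost also drops
- This indicates that they are more effective retrieval methods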

Reference

Supplements

2024.02.19. Leonie Monigatti. Advanced Retrieval-Augmented Generation: From Theory to LlamaIndex Implementation

Difference between Naive and Advanced RAG (Image by the author, inspired by [1])

The article walks through example implementations of optimizations at three stages of retrieval

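Pre-Retrieval Optimization

Sliding window
- Splits the text into parts with overlap between chunks, so that retrieved chunks include the relevant surrounding context
Enhancing data granularity
- Applies data cleaning techniques to improve the granularity of the text data, such as removing irrelevant information, confirming factual accuracy, and updating outdated information
Adding metadata
- Information such as dates, purposes, or chapters that can be used for filtering at retrieval time, helping to locate and retrieve text of a specific type or period more precisely
Optimizing index structures
- Uses different strategies to index the text data, such as adjusting chunk sizes or using multi-indexing strategies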
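
As a concrete illustration of the sliding window item above, here is a minimal chunker that overlaps consecutive chunks. The function name `sliding_window_chunks` and the word-level granularity are assumptions made for this sketch; real pipelines usually count tokens or characters.

```python
def sliding_window_chunks(words, chunk_size=4, overlap=2):
    """Split a list of words into chunks of `chunk_size`,
    where each chunk repeats the last `overlap` words of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


words = "a b c d e f g".split()
chunks = sliding_window_chunks(words, chunk_size=4, overlap=2)
for chunk in chunks:
    print(" ".join(chunk))
```

Because neighboring chunks share `overlap` words, a sentence that straddles a chunk boundary still appears whole in at least one chunk when the overlap is large enough.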

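In addition, pre-retrieval techniques are not limited to data indexing; they can also cover query-side optimizations at inference time, such as query routing, query rewriting, and query expansion.

Query routing
- In the pre-retrieval stage, the system may route a query to different retrieval models or retrieval systems based on specific conditions or needs
- Routing can be based on the content of the user's query, its context, or other factors, to get the best retrieval results.
Query rewriting
- A user's query is sometimes vague or unclear and needs to be rewritten to better match what the retrieval system requires
- Query rewriting can restructure or adjust the query using prior experience or the model's knowledge to improve accuracy and relevance
Query expansion
- The system automatically expands the user's query to include more related terms or concepts, improving the coverage and relevance of retrieval
- This is usually done by analyzing the keywords in the query and then dynamically adding related terms or concepts during retrieval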

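Retrieval Optimization

Refers to improving the efficiency and accuracy of information retrieval through different methods and techniques. This includes tuning the retrieval model, optimizing how queries are processed, and improving the individual steps of the retrieval process.

Hybrid search
- Combines different types of search techniques, typically vector search and keyword search
- Takes advantage of the strengths of vector retrieval (for example, an effective measure of similarity) and the flexibility of keyword search (for example, handling queries for specific terms or phrases).
Fine-tuning embedding models
- Fine-tunes the embedding model to make it better suited to a specific domain or task.
- General-purpose embedding models may fail to capture the concepts or semantic relationships of certain domains, so the model is fine-tuned to improve its performance
Dynamic embedding
- Unlike static embeddings, dynamic embeddings adjust the embedding of a word or phrase according to its context
- This captures the different meanings and usages of a word in different contexts, improving model performance and accuracy
  - For example, OpenAI's text-embedding-ada-002 is one such dynamic embedding model, able to capture contextual understanding.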
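
One common way to combine vector and keyword results in hybrid search is a weighted sum of normalized scores. The sketch below assumes that approach; the `alpha` weight, the min-max normalization, and the example scores are all illustrative choices, not taken from the article or from any specific vector database.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so the two score types are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(vector_scores, keyword_scores, alpha=0.5):
    """Rank documents by alpha * vector score + (1 - alpha) * keyword score.
    A document missing from one result list contributes 0 for that part."""
    v = min_max_normalize(vector_scores)
    k = min_max_normalize(keyword_scores)
    docs = set(v) | set(k)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0) for d in docs}
    # Sort by fused score (descending); break ties by document id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Made-up scores: cosine similarities and keyword (for example BM25-style) scores.
vector_scores = {"doc_a": 0.9, "doc_b": 0.5, "doc_c": 0.1}
keyword_scores = {"doc_b": 8.0, "doc_c": 2.0}

ranking = hybrid_rank(vector_scores, keyword_scores, alpha=0.5)
print(ranking)
```

With `alpha=1.0` the ranking reduces to pure vector search, and with `alpha=0.0` to pure keyword search, so the weight can be tuned per use case.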

Sentence window retrieval

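Post-Retrieval Optimization

Further processing of the retrieved context can help with problems such as exceeding the context window limit or noise, and helps focus on the key information

Prompt compression
- Shortens the overall prompt by removing irrelevant or unimportant context and highlighting the important content, sharpening the model's focus on key information
Re-ranking
- Uses a machine learning model to recompute the relevance scores of the retrieved contexts, assessing the importance of each one more accurately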

2023.12. Ivan Ilin. Advanced RAG Techniques: an Illustrated Overview

Simplified Chinese version: 2023.12. Arron. LLM之RAG理论（三）| 高级RAG技术全面汇总

An accessible overview of advanced retrieval techniques

[source: Advanced RAG Techniques: an Illustrated Overview]

2020. Vinija's AI Notes. NLP • Retrieval Augmented Generation

Fairly comprehensive notes; recommended reading
