YH Hsu
### [Entry page for AI / ML study notes](https://hackmd.io/@YungHuiHsu/BySsb5dfp)

### GenAI RAG Note Series

![image](https://hackmd.io/_uploads/r1g1kaXgC.png =600x)
> Modified from [2023.12。IVAN ILIN。Advanced RAG Techniques: an Illustrated Overview](https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6)
>

## Indexing/Chunking Module Series
- [[RAG for GenAI] LLM Agentic Chunking](https://hackmd.io/@YungHuiHsu/Hk1O6n7x0)

### Chunking Strategies and Considerations

Recommended reading:

#### [2023.01。Pinecone。Chunking Strategies for LLM Applications](https://www.pinecone.io/learn/chunking-strategies/)

The key considerations are listed below.

- Chunking Considerations
    * **Nature of Content**
        * Consider if the content is long (e.g., articles, books) or short (tweets, messages)
    * **Embedding Model and Optimal Chunk Sizes**
        * Different models have preferences for chunk sizes.
            * sentence-transformers for sentences
            * text-embedding-ada-002 for 256 or 512 tokens
    * **User Query Expectations**
        * Match query and content embeddings
            * short and specific vs. long and complex
    * **Application of Results**
        * Application's purpose
            * semantic search, question answering, etc.
        * Token limitation
- Determining the optimal chunk size
    - **Preprocessing Data**
        * Ensure data quality by preprocessing, such as removing HTML tags from web-sourced data to reduce noise
    - **Selecting a Range of Chunk Sizes**
        * After preprocessing, explore various chunk sizes considering content nature and model capabilities
        * Balance between context preservation and accuracy.
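    - **Evaluating Performance of Chunk Sizes**
        * Test different sizes using a dataset to create embeddings
        * Run queries to compare performance
        * Iteratively find the best size for accuracy and context relevance

#### [2024.01。Anurag Mishra。Five Levels of Chunking Strategies in RAG| Notes from Greg’s Video](https://medium.com/@anuragmishra_27746/five-levels-of-chunking-strategies-in-rag-notes-from-gregs-video-7b735895694d)
[tutorials](https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb)
- Level 1: Fixed Size Chunking
- Level 2: Recursive Chunking
- Level 3: Document Based Chunking
- Level 4: Semantic Chunking
- Level 5: Agentic Chunking

#### [2024.0213。Pratik Bhavsar,Galileo Labs。Mastering RAG: Advanced Chunking Techniques for LLM Applications](https://www.rungalileo.io/blog/mastering-rag-advanced-chunking-techniques-for-llm-applications?utm_medium=email&_hsenc=p2ANqtz-8maBXx7vmFHU6cHv5jZV16O8KYbSg2CfHBdeeOYdV-pSeS_GDOXkj8HFgvczEebMKOl5pUhKM9tsehpFxWEDnde9--Xg&_hsmi=301760418&utm_content=301761114&utm_source=hs_email)

:::info
- :pencil2: Vector database cost and LLM generation cost pull in opposite directions. The smaller and more numerous the chunks, the higher the vector storage cost and the longer the lookup time; on the other hand, fewer tokens and less noise are likely to be sent to the LLM for generation.
- More complex questions may need more chunks and more complete context.
:::

- Impact of chunking
    * Retrieval Quality: chunk quality directly affects the accuracy of retrieval results
    * Vector database cost: chunk size affects the storage cost of the vector database
    * Vector database query latency: the chunking strategy may increase query latency
    * LLM latency and cost: appropriate chunking can reduce LLM processing latency and cost
    * LLM hallucinations: good chunking helps keep the model from producing answers that are not grounded in facts
- Factors influencing chunking
    * Text structure: how structured the text is affects how it should be chunked
        * concise short sentences and paragraphs vs. low-information chat logs and meeting notes
    * Embedding model: the choice of embedding model has a decisive effect on chunking results
    * LLM context length: the model's context length limit bounds the chunk size
    * Type of questions: the type of question guides the choice of chunking strategy
        * question complexity determines whether multiple chunks are needed and how complete the context inside each chunk must be
- Types of chunking
![image](https://hackmd.io/_uploads/r1f-4A7eR.png =800x)

---

## LLM Agentic Chunking Notes

The notes below cover the key points of the paper behind Agentic Chunking. The concept comes mainly from the paper that follows; in the half year since its publication, no improved methods appear to have been proposed.
So far the only implementation is in Langchain (Llamaindex has not released one yet?).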
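
For contrast with the agentic approach, here is a minimal sketch of Level 1 (fixed size chunking) from the five levels listed above, written in plain Python. The function name `fixed_size_chunks`, the character-based sizing, and the sample text are assumptions made for illustration; real pipelines usually count tokens rather than characters. It also makes the storage trade-off from the info box visible: the smaller the chunk size, the more chunks (and vectors) the same text produces.

```python
def fixed_size_chunks(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one neighbouring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical sample text, repeated to get something long enough to split.
sample = "The Eiffel Tower is located in Paris. " * 20

for size in (64, 256):
    chunks = fixed_size_chunks(sample, chunk_size=size, overlap=size // 8)
    print(f"chunk_size={size}: {len(chunks)} chunks")
```

Running the loop over a few candidate sizes is the starting point for the "Selecting a Range of Chunk Sizes" step above; the evaluation step would then embed each set of chunks and compare retrieval quality on a fixed set of queries.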
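### Paper
[arXiv:2312.06648。Dense X Retrieval: What Retrieval Granularity Should We Use?](https://chentong0.github.io/factoid-wiki/)

#### Main Results
![image](https://hackmd.io/_uploads/SJGtpGIlC.png =400x)
![image](https://hackmd.io/_uploads/r1wIpfIeA.png =400x)

![image](https://hackmd.io/_uploads/SyrT6fLlA.png)
> The flow in the figure shows dense retrieval with propositions as the retrieval unit:
> * A. **Content conversion**: Wikipedia content is converted by the "Propositionizer" into propositions, which are concise, self-contained factual statements.
> * B. **Database preparation**: the propositions are compiled into a structured database called FactoidWiki, which serves as the segmented and indexed retrieval corpus.
> * C. **Retrieval**: given a query, the retriever searches this database and identifies the propositions most similar to the query vector.
> * D. **Answer generation**: the question-answering model then uses the retrieved propositions to generate an accurate and relevant answer.

#### Key Insight
:pencil2: Proposition as a Retrieval Unit

:::info
1. Each proposition should correspond to a distinct piece of meaning in the text, and all propositions together should represent the semantics of the whole text.
2. A proposition should be minimal, i.e., it cannot be further split into separate propositions.
3. A proposition should be contextualized and self-contained. It should include all the context from the text (e.g., resolved coreferences) needed to interpret its meaning.
:::

- Concrete example
    `"The Eiffel Tower is located in Paris, France, stands 300 meters tall, and was the main exhibition structure of the 1889 World's Fair. The tower was designed by Gustave Eiffel and is a symbol of Paris."`
    Following the definition of a proposition as a retrieval unit, this text decomposes into the following propositions. Each one restates its subject and carries a single fact, as rules 2 and 3 require:
    - `The Eiffel Tower is located in Paris, France.` (location)
    - `The Eiffel Tower is 300 meters tall.` (height)
    - `The Eiffel Tower was the main exhibition structure of the 1889 World's Fair.` (historical significance)
    - `The Eiffel Tower was designed by Gustave Eiffel.` (designer)
    - `The Eiffel Tower is a symbol of Paris.` (cultural significance)

    Each proposition is independent and self-contained: even detached from the original text, it expresses one complete fact. Because each proposition carries enough context, a reader can understand its specific meaning without consulting the full source text.

#### When the Paper's Method Applies
:pencil: Using propositions as the retrieval unit can improve retrieval precision, but an individual proposition may lack sufficient context, and the extra processing takes more time.
- :pencil: Scenarios and document types where it may fit:
    - Tasks that need highly precise information retrieval, such as common question-answering systems, or finding specific facts in legal and technical documents. In these cases each proposition provides independent, unambiguous information that answers the query directly.

    Conversely, scenarios where it may not fit:
    - Highly narrative texts or texts that depend on wide-ranging context, such as novels or long-form articles, because the information in these genres often needs broader context to be interpreted and understood.

---

# Supplement

### RAG Architecture and Modularization
For the overall state of RAG architecture as of 2024, the following are recommended:
- [2023.12。IVAN ILIN。Advanced RAG Techniques: an Illustrated Overview](https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6)
- [arXiv:2312.10997。Retrieval-Augmented Generation for Large Language Models: A Survey](https://arxiv.org/abs/2312.10997)
    A relatively accessible introduction, especially when read together with the blog by its first author, Yunfan Gao.
    - [2024.0325。Prompt Engineering Guide。Retrieval Augmented Generation (RAG) for LLMs](https://www.promptingguide.ai/research/rag)
        - A condensed summary of the survey from the Prompt Engineering Guide
        ![image](https://hackmd.io/_uploads/r1L6pt47A.png =600x)
    - [2024.01。Yunfan Gao。Modular RAG and RAG Flow: Part Ⅰ](https://medium.com/@yufan1602/modular-rag-and-rag-flow-part-%E2%85%B0-e69b32dc13a3)
        - Blog by Yunfan Gao, first author of the survey; it adds many more detailed figures on the individual modules

### Agent + RAG
- [2024.05。Yantraka.ai。Deep Dive into Agentic Retrieval Augmented Generation (A-RAG)](https://www.linkedin.com/pulse/deep-dive-agentic-retrieval-augmented-generation-a-rag-sai-panyam-22dlc/)
    How agent design can extend the capabilities of RAG

    ![image](https://hackmd.io/_uploads/r1A10u47A.png =600x)
    > Plan And Execute Agent: Langchain Experimental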
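
The plan-and-execute pattern named in the caption above can be sketched without any framework. This is a toy illustration under stated assumptions, not the Langchain implementation: `plan_and_execute`, `toy_planner`, `toy_retriever`, `toy_synthesizer`, and `DOCS` are all hypothetical names, the planner is a string split standing in for an LLM call, and the retriever is a keyword-overlap match standing in for a vector search.

```python
from typing import Callable

Planner = Callable[[str], list[str]]
Retriever = Callable[[str], list[str]]
Synthesizer = Callable[[str, list[str]], str]


def plan_and_execute(question: str, planner: Planner,
                     retriever: Retriever, synthesizer: Synthesizer) -> str:
    """Plan sub-queries first, run retrieval for each, then synthesize once."""
    sub_queries = planner(question)           # plan step
    evidence: list[str] = []
    for sub_query in sub_queries:             # execute step, one retrieval per sub-query
        for passage in retriever(sub_query):
            if passage not in evidence:       # keep each passage only once
                evidence.append(passage)
    return synthesizer(question, evidence)    # final answer from pooled evidence


# Hypothetical two-document corpus.
DOCS = [
    "Fixed size chunking splits text into equal windows.",
    "Semantic chunking groups sentences by embedding similarity.",
]


def toy_planner(question: str) -> list[str]:
    # Stand-in for an LLM planner: split a compound question on " and ".
    parts = question.rstrip("?").split(" and ")
    return [part.strip() for part in parts if part.strip()]


def toy_retriever(query: str) -> list[str]:
    # Stand-in for vector search: require at least two shared words.
    words = set(query.lower().split())
    return [doc for doc in DOCS
            if len(words & set(doc.lower().rstrip(".").split())) >= 2]


def toy_synthesizer(question: str, evidence: list[str]) -> str:
    # Stand-in for LLM generation: concatenate the retrieved passages.
    return " ".join(evidence) if evidence else "No evidence found."


answer = plan_and_execute(
    "What is fixed size chunking and what is semantic chunking?",
    toy_planner, toy_retriever, toy_synthesizer,
)
print(answer)
```

Compared with a single retrieval call, the plan step lets a compound question pull evidence for each part separately, which is one way an agent can extend what a plain RAG pipeline retrieves.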
