Differentiable open-ended commonsense reasoning (reading notes)
===
[Original paper](https://arxiv.org/abs/2010.14439)

Abstract
===
- OpenCSR is a task setting for evaluating a model's reasoning ability (?)
- DRFACT performs multi-hop reasoning over knowledge facts
- To evaluate OpenCSR methods, the authors adapt three popular multiple-choice datasets and collect multiple new answers for each test question via crowd-sourcing.

---

1 Introduction
===
- Existing commonsense reasoning models usually work by scoring question–candidate-answer pairs.
- In OpenCSR, the model instead has to produce a ranked answer list from a large, question-independent set of candidate concepts extracted offline from a corpus of generic commonsense facts.

<big>**The OpenCSR problem**</big>
- the relevant facts must be retrieved from the corpus using the question itself

<big>**Multi-hop reasoning**</big>
- reason over several facts, instead of relying on only one or a few facts to answer the question
- OpenCSR has no annotations that tell the model which facts belong in the reasoning chain; the only supervision signal is a set of questions and answers
- most commonsense questions require reasoning across concepts, but relations among concepts such as "tree", "carbon dioxide", "atmosphere", and "photosynthesis" are hard to represent in a simple knowledge graph (KG)
- knowledge graph (KG)
  - remedy: a commonsense corpus (e.g. GenericsKB)
    - a large collection of natural-language sentences stating generic commonsense facts, e.g. "trees remove carbon dioxide from the atmosphere through photosynthesis"; facts can be extracted directly from such a corpus
    → how can we conduct multi-hop reasoning over such a knowledge corpus, similar to the way multi-hop reasoning methods traverse a KG?
    → can we achieve this in a differentiable way, to support end-to-end learning?
- differentiable way
  When a model is "differentiable", we can train it with gradient-based optimization to find the parameters that maximize its performance. In other words, we can compute gradients of the loss with respect to the model parameters and adjust them to better fit the training data. Ex.: a linear-regression model is differentiable.
- end-to-end learning
  In end-to-end learning the whole model is treated as a black box: we only provide the inputs and the desired outputs, and the model learns by itself how to extract features and make predictions.
- Multi-hop reasoning
  Reasoning over the relations between concepts in a knowledge graph across several steps. Each concept is a node and the relations between concepts are edges; by traversing multiple nodes and edges, complex relations between concepts can be inferred.

**DRFACT**
- formulates multi-hop reasoning over a corpus as an iterative process of differentiable fact-following operations over a hypergraph
- every sentence in the corpus is encoded as a dense vector; together they form a neural fact index, so relevant facts can be retrieved quickly with maximum inner product search (MIPS)
- neural fact index: the encoded fact vectors form an index for fast retrieval of relevant facts. For example, to find the capital of the UK, compute the inner product between each fact vector and the vector for "what is the capital of the UK" and pick the fact with the largest inner product (see the sketch below).
- MIPS (to study in more detail later)
  - a naive linear scan computes the inner product against every vector one by one; MIPS methods speed up maximum inner product search, e.g. with an inverted-index style structure that groups the vectors so that each vector's inner product is approximated by the score between its group and the group of the input vector $q$, greatly reducing the number of vectors that must be scored
- fact-to-fact matrix
  - stores symbolic links between facts (e.g. two facts are connected if they share a common concept)
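A minimal sketch of the "neural fact index + MIPS" idea from the bullets above, under toy assumptions: `encode()` is a stand-in for the paper's trained BERT bi-encoder (here it only returns random unit vectors), and the search is a brute-force inner-product scan rather than a real approximate-MIPS index.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(texts, dim=128):
    # Placeholder for a trained sentence encoder; random unit vectors only.
    vecs = rng.normal(size=(len(texts), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

facts = [
    "trees remove carbon dioxide from the atmosphere through photosynthesis",
    "carbon dioxide is the major greenhouse gas contributing to global warming",
    "renewable energy reduces greenhouse gas emissions",
]
D = encode(facts)                       # the |F| x d dense "neural fact index"

def mips(query_vec, index, k=2):
    scores = index @ query_vec          # inner product against every fact vector
    top = np.argsort(-scores)[:k]       # naive linear scan; real MIPS is approximate
    return [(facts[i], float(scores[i])) for i in top]

q = encode(["what can help alleviate global warming?"])[0]
print(mips(q, D))
```

With a real encoder the top-scoring facts would be the ones semantically related to the question; with the random vectors above the ranking is meaningless, and the point is only the shape of the computation.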
<big>**Evaluating OpenCSR methods**</big>
- OpenCSR datasets are created to evaluate DRFACT, with crowd-sourced annotation collecting multiple new answers for each test question

---

2 Related Work
===

Commonsense Reasoning
---
- Most recent commonsense reasoning (CSR) methods focus on multiple-choice QA; however, they are hard to use in practical applications, where answer candidates are usually not available.
- UnifiedQA and other closed-book QA models can generate answers to questions, but they do not provide supporting evidence for their answers, which makes them less trustworthy.
- Closed-book models augmented with an additional retrieval module exist, but they mainly target single-hop reasoning.

QA over KGs or Text
---
- Triple-based symbolic commonsense knowledge graphs (CSKGs), e.g. ("object 1", "attribute", "object 2"), describe only relations between two entities, so they can be limited in representing complex commonsense knowledge.
- Example of a relation that triple-based symbolic CSKGs cannot handle: "Xiao Ming is the tallest student in this class" involves three elements, "Xiao Ming", "this class", and "tallest"; "Xiao Ming" and "this class" are entities, while "tallest" is a descriptive term.
- GenericsKB (corpus of generic sentences about commonsense facts):
  - text can represent more complex commonsense knowledge, including facts that relate three or more concepts
- OpenCSR may need iterative retrieval
- OpenCSR questions provide fewer surface hints than factoid questions
- surface hints: for example, the open-domain QA question "Who was the first president of the United States?" already contains keywords such as "the United States" and "first president" that hint at what is needed, so the answer can be reached in a single reasoning step. In contrast, OpenCSR questions usually require more complex commonsense reasoning and carry less surface information: "Why do cats like to catch mice?" gives no direct clue to the answer, and answering it requires multi-step reasoning over several concepts (cats, mice, hunting, ...), which makes OpenCSR harder.

Multi-Hop Reasoning
---
- Recent models that do multi-hop reasoning through iterative retrieval (GRAFT-Net (Sun et al., 2018), MUPPET (Feldman and El-Yaniv, 2019), PullNet (Sun et al., 2019), and GoldEn (Qi et al., 2019)) are not end-to-end differentiable, so they are slow.
- Neural Query Language (Cohen et al., 2020):
  - differentiable multi-hop
  - entity-following templates for reasoning over a compactly stored symbolic KG
  - but a KG can only handle binary relations

<big>**DrKIT**</big>

| DrKIT | DRFACT |
| --- | --- |
| multi-hop between entities | multi-hop between facts |
| 1) finding mentions of new entities x' that are related to some entity in X, guided by the indices, and then 2) aggregating these mentions to produce a new weighted set of entities | |
| differentiable | differentiable |
| only named entities | |

| entity | fact |
| --- | --- |
| concrete things or concepts | relations among these entities |
| node | edge |
| ex. people, places, organizations, events | ex. "John is the CEO of some organization", "Paris is the capital of France" |

---

3 Open-Ended Commonsense Reasoning
===

Task Formulation
---
- F: corpus of knowledge facts
  - sentences that describe generic commonsense knowledge
  - ex. "trees remove carbon dioxide from the atmosphere through photosynthesis."
- V: a vocabulary of concepts
  - nouns or base noun phrases mentioned frequently in these facts
  - ex. 'tree' and 'carbon dioxide'
- q: question
  - answered by returning a weighted set of concepts, such as {(a1='renewable energy', w1), (a2='tree', w2), ...}, where wi ∈ R is the weight of the predicted concept ai ∈ V
- to be interpretable, trustworthy reasoning models, models are expected to output intermediate results that justify the reasoning process
  - i.e., the supporting facts from F. E.g., an explanation for 'tree' being an answer to the question above can be the combination of two facts: f1 = "carbon dioxide is the major ..." and f2 = "trees remove ..."

Implicit Multi-Hop Structures
---

| Commonsense questions | Multi-hop factoid QA |
| --- | --- |
| more implicit and relatively unclear | focus on querying evident relations between named entities |
| the reasoning process can be implicit and relatively unclear | the reasoning process can be decomposed into more specific sub-questions |
| ex.<br>question: "what can help alleviate global warming?"<br>→<br>q1 = "what contributes to global warming"<br>q2 = "what removes q1's answer from the atmosphere"<br>==but many other decompositions are also plausible== | ex.<br>question: "which team does the player named 2015 Diamond Head Classic's MVP play for?"<br>→<br>q1 = "the player named 2015 DHC's MVP"<br>q2 = "which team does q1's answer play for" |
| questions that need general human commonsense knowledge to be answered | questions answered by reasoning repeatedly over multiple facts |

- unlike HotpotQA, there are no gold reasoning chains or supporting facts to use as training data

---

4 DrFact: An Efficient Approach for Differentiable Reasoning over Facts
===

![](https://i.imgur.com/RqkVH1F.png)
Figure 3 <small>The overall workflow of DRFACT. We encode the hypergraph (Fig. 2) with a concept-to-fact sparse matrix E and a fact-to-fact sparse matrix S. The dense fact index D is pre-computed with a pre-trained bi-encoder. A weighted set of facts is represented as a sparse vector F. The workflow (left) of DRFACT starts by mapping a question to a set of initial facts that share concepts with it. Then it recursively performs Fact-Follow operations (right) to compute Ft and At. Finally, it uses learnable hop-weights αt to aggregate the answers.</small>

**Summary of the procedure**
1. Pre-processing: build the **Sparse Fact-to-Fact Index (S)** and the **Dense Neural Fact Index (D)** (produced by the bi-encoder), used to measure how facts relate to each other
2. Fact-following
   1. sparse retrieval: multiply Ft-1 by the S matrix to obtain the candidate facts Fst: Fst = Ft-1 S
   2. dense retrieval
      1. pass Ft-1 through D to get zt-1: zt-1 = Ft-1 D
      2. feed zt-1 and qt into an MLP to get **ht−1**: ht-1 = g(zt-1, qt)
      3. use MIPS (maximum inner product search) over D with ht−1 to retrieve the top-K next-hop facts Fdt: Fdt = MIPS_k(ht-1, D)
   3. element-wise multiplication: Ft = Fst ⊙ Fdt

4.1 Overview
---
![](https://i.imgur.com/9rgTIwq.png)
- the reasoning model traverses a hypergraph
- each hyperedge corresponds to a fact in F and connects the concepts in V
- because a hyperedge (a fact) connects the concepts it mentions, this representation preserves the context of the original natural-language statement
1. Traversal starts from the question concepts and, after several hyperedges (facts), reaches a set of concept nodes.
2. We want P(c|q), the likelihood of each concept c ∈ V being an answer to question q.
3. Ft denotes the weighted set of facts retrieved at hop t, and F0 the initial facts.
4. We iteratively retrieve the facts for the next hop, and finally score the concepts with the retrieved facts.
5. (At each hop, the newly retrieved facts are added to the facts found so far, which expands the known fact set, helps the model understand the question better, and gives more precise answers; in the last step the retrieved facts are used to score each candidate concept in V and decide which answer is most likely.) Not sure where ChatGPT got this from.

4.2 Pre-computed Indices
---
- **Dense Neural Fact Index D**
  - pre-train a bi-encoder architecture over BERT
  - bi-encoder
    [BERT to bi-encoder](https://www.notion.so/BERT-to-bi-encoder-b6d4254e10ef4b2c9ba732ca873cc106)
    A bi-encoder is a way of computing the similarity between two texts: one text is the "query" (the question), the other the "candidate" (a possible answer). Both are embedded into the same vector space and a similarity score is computed between them.
  - trained to maximize the scores of the facts that contain a correct answer to the question
  - dense search over the facts is done with MIPS
  - after pre-training, every fact in F is embedded into a dense vector (represented by the [CLS] token)
  - D is a |F| × d dense matrix
- **Sparse Fact-to-Fact Index S**
  - sparse links between facts are pre-computed with a set of connection rules
  - ex. fi→fj when
    - fi and fj share at least one common concept, and
    - fj introduces at least two new concepts that are not in fi
  - S is a binary sparse tensor with dense shape |F|×|F|
- **Sparse Index of Concept-to-Fact Links E**
  - a concept can appear in multiple facts
  - a fact usually mentions multiple concepts
  - ⇒ the co-occurrences between each concept and each fact are encoded as a sparse matrix with dense shape |V|×|F| (the concept-to-fact index)
  - i.e. an index that stores, as a sparse matrix, which facts each concept is associated with (a toy construction is sketched below)
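To make the two symbolic indices concrete, here is a toy sketch that builds E and S for a three-sentence corpus. The concept vocabulary and the per-fact mention sets are hand-written (a real system would use a noun-phrase extractor), and the linking rule follows the note's description above (share at least one concept, introduce at least two new concepts); the paper's exact rule may differ.

```python
import numpy as np

facts = [
    "trees remove carbon dioxide from the atmosphere through photosynthesis",
    "carbon dioxide is the major greenhouse gas contributing to global warming",
    "renewable energy reduces greenhouse gas emissions",
]
concepts = ["tree", "carbon dioxide", "atmosphere", "photosynthesis",
            "greenhouse gas", "global warming", "renewable energy"]

# Concepts mentioned by each fact (hand-written; normally produced by an extractor).
mentions = [
    {"tree", "carbon dioxide", "atmosphere", "photosynthesis"},
    {"carbon dioxide", "greenhouse gas", "global warming"},
    {"renewable energy", "greenhouse gas"},
]

# E: concept-to-fact index, dense shape |V| x |F| (stored sparsely in practice).
E = np.zeros((len(concepts), len(facts)))
for j, ment in enumerate(mentions):
    for c in ment:
        E[concepts.index(c), j] = 1.0

# S: fact-to-fact links, dense shape |F| x |F|, following the note's rule:
# link fi -> fj if they share a concept and fj brings in >= 2 new concepts.
S = np.zeros((len(facts), len(facts)))
for i, mi in enumerate(mentions):
    for j, mj in enumerate(mentions):
        if i != j and (mi & mj) and len(mj - mi) >= 2:
            S[i, j] = 1.0

print("E =\n", E)
print("S =\n", S)
```

In the real model both matrices are stored sparsely (the note later mentions tf.RaggedTensor) because |F| and |V| are large.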
4.3 Differentiable Fact-Following Operation
---

<big>**Fact-following framework**</big>
- **P(Ft | Ft−1, q)**: the model for moving from one fact to the next, conditioned on the question q
- **S (Sparse Fact-to-Fact Index)**: Sij; intuitively, if we can traverse from fi to fj, these facts should mention some common concept
  - concretely, each row of the sparse matrix S corresponds to a fact; multiplying the vector representation of Ft-1 with S yields a vector in which each entry is a relatedness score between Ft-1 and the corresponding fact, and ranking these scores gives the possible next facts
- **D (Dense Neural Fact Index)**: contains the semantic information of each fact, and can be used to measure how plausible a fact is for a given question
- implemented with TensorFlow's tf.RaggedTensor structure

<big>**Fact-Follow operation**</big>

**Sparse retrieval**
- uses the fact-to-fact sparse matrix to obtain possible next-hop facts
- the authors store the S matrix with tf.RaggedTensor, which saves a lot of storage and computation and lets the next-hop facts Fst (a set) be computed efficiently

$F_t^s = F_{t-1}S$

**Dense retrieval**
1. pass Ft-1 through D to get a dense vector zt-1
   - **zt-1**: the aggregation of the vector representations of all facts in Ft-1, summarizing the information of the facts retrieved so far
   $z_{t-1} = F_{t-1}D$
2. feed zt-1 and qt into the model
   $h_{t-1} = g(z_{t-1},q_t)$
   - **qt**: the query vector for the current step, computed from the input question and the previous query vector; in short, qt is the current step's query
   - **g()**: an MLP (also called the fact-translating function) that combines the previous step's fact vector zt-1 with the current query vector qt
   - **ht−1**: the resulting hop query vector, a dense vector with the same dimensionality as the fact vectors (zt-1)
   $F_t^d = \mathrm{MIPS}_k(h_{t-1},D)$
   - use maximum inner product search (MIPS) over D with ht−1 to retrieve the top-K next-hop facts, i.e. Fdt
3. element-wise multiplication (sketched below)
   - to get the best of both the symbolic and the neural world
   - Ft = Fst ⊙ Fdt
   $\begin{split}F_t&=\text{Fact-Follow}(F_{t-1},q)\\&=(F_{t-1}S)\odot \mathrm{MIPS}_k(g(F_{t-1}D,q_t),D)\end{split}$
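A hedged sketch of one Fact-Follow hop under toy assumptions: S and D are small random matrices, `g()` is a random linear map standing in for the learned fact-translating MLP, and the MIPS step is a brute-force top-K scan.

```python
import numpy as np

rng = np.random.default_rng(0)
num_facts, dim, K = 6, 16, 3

D = rng.normal(size=(num_facts, dim))                         # dense neural fact index (|F| x d)
S = (rng.random((num_facts, num_facts)) < 0.4).astype(float)  # toy fact-to-fact links (sparse in practice)
W = rng.normal(size=(2 * dim, dim)) / np.sqrt(2 * dim)        # toy weights of the fact-translating MLP g()

def g(z, q):
    # fact-translating function: mix the aggregated fact vector with the hop query
    return np.tanh(np.concatenate([z, q]) @ W)

def fact_follow(F_prev, q_t):
    F_s = F_prev @ S                            # sparse retrieval:  F_t^s = F_{t-1} S
    z = F_prev @ D                              # aggregate facts:   z_{t-1} = F_{t-1} D
    h = g(z, q_t)                               # hop query vector:  h_{t-1} = g(z_{t-1}, q_t)
    scores = D @ h                              # brute-force stand-in for MIPS_k(h_{t-1}, D)
    F_d = np.zeros(num_facts)
    top_k = np.argsort(-scores)[:K]
    F_d[top_k] = scores[top_k]                  # keep only the top-K next-hop facts
    return F_s * F_d                            # F_t = F_t^s ⊙ F_t^d

F0 = np.zeros(num_facts); F0[[0, 2]] = 1.0      # initial facts sharing concepts with the question
F1 = fact_follow(F0, rng.normal(size=dim))
print(F1)
```

The element-wise product keeps only facts that are both symbolically reachable (through S) and semantically close to the hop query (through D), which is the "best of both worlds" point above.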
4. concept predictions
   1. **At** (a set of concept predictions): multiply **Ft** with the precomputed **fact-to-concept matrix E** → At
      - fact-to-concept matrix E (not studied yet): Appendix B
   2. **concept scores**:
      - the score of a concept c seems to be aggregated by taking the maximum score among the facts in At/Ft that mention c
      - this also seems to be used to update the S and D matrices of the pre-computed indices
   3. $A = \sum_{t=1}^T \alpha_t A_t$
      - αt is a learnable parameter: Appendix B
- Ft = Fact-Follow(Ft−1, q):
  - a random-walk process on the hypergraph associated with the corpus
- self-following (see the sketch below)
  - improves performance
  - augments Ft with the entries of Ft−1 whose weight is above a threshold τ: Ft = Fact-Follow(Ft−1, q) + Filter(Ft−1, τ)
  - Ft then also keeps highly relevant facts reached at an earlier distance t' < t, which helps because different questions may need different numbers of reasoning hops

---
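To tie the pieces together, a hedged sketch of the final scoring step. Assumptions not in the note: `fact_follow()` is a dummy stand-in for the real hop, the hop weights α are fixed toy values rather than learned parameters, and the max-aggregation of concept scores follows the note's tentative reading of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_facts, num_concepts, T, tau = 6, 5, 2, 0.5

E = (rng.random((num_concepts, num_facts)) < 0.5).astype(float)  # concept-to-fact co-occurrence (|V| x |F|)
alpha = np.array([0.3, 0.7])                                      # hop weights (learnable in the real model)

def fact_follow(F_prev, q_t):
    # Stand-in for the real Fact-Follow hop (see the previous sketch);
    # here it just spreads non-negative weight over random facts.
    return np.maximum(rng.normal(size=num_facts), 0) * (F_prev.sum() + 1e-6)

def concept_scores(F_t):
    # A_t: each concept takes the max score among the facts that mention it
    return (E * F_t[None, :]).max(axis=1)

F = np.zeros(num_facts); F[0] = 1.0               # initial fact weights F_0
A = np.zeros(num_concepts)                        # final answer scores over concepts
hop_queries = [rng.normal(size=4) for _ in range(T)]
for t in range(T):
    keep = np.where(F > tau, F, 0.0)              # Filter(F_{t-1}, tau): self-following
    F = fact_follow(F, hop_queries[t]) + keep     # F_t = Fact-Follow(F_{t-1}, q) + Filter(F_{t-1}, tau)
    A += alpha[t] * concept_scores(F)             # A = sum_t alpha_t * A_t
print(A)
```

The top entries of A would be returned as the weighted concept answers, with the high-weight facts in each Ft serving as supporting evidence.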
