# Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations

[paper](https://arxiv.org/pdf/2310.03951) Microsoft, 2023

## Introduction

<center>

![image](https://hackmd.io/_uploads/H1kwAo7QR.png)

</center>

## Methodology

<center>

![image](https://hackmd.io/_uploads/Sym9AsXXR.png)

</center>

### Detection agent

#### Step 1: Split and select

**Raw Response Segmentation**
- **Raw responses** are segmented into **individual sentences** using the NLTK sentence splitter.

**Noise Removal**
- Sentences **lacking factual information** or **considered noise** are **purged** from the segmented responses.

**Exception for Short Generated Responses**
- **Short generated responses** that can be directly formulated as hypotheses are **exempt** from the purging process, for benchmark comparison purposes.

**Future Work**
- The authors plan to develop an **advanced hypothesis selector** as future work.

**Resulting Set**
- After the **segmentation** and **filtering** steps, a set of selected hypotheses, referred to as **$H_{selected}$**, is obtained.

#### Step 2: Sentence-level detection

**Judge each hypothesis independently against the corresponding premise**
- $X$ denotes the input source text
- $Y_{raw}$ denotes the raw output response
- $hyp_{i}$ denotes a selected hypothesis from $Y_{raw}$
- $\text{Entailment}: X \Rightarrow hyp_{i}$
- $\text{Contradiction}: X \Rightarrow \neg hyp_{i}$
- $\text{Neutral}: X \nRightarrow hyp_{i}$

They use **CoT prompting**, **guiding** the LLM to locate relevant passages in the **source text $X$** and letting it **reason** before drawing a **conclusion**.

**Detection agent prompt**
- **System instruction**
    - Definitions of terms
        - e.g., hypothesis, Entailment
    - How to think step by step for each judgment
        - e.g., repeat the hypothesis you are judging
    - Rules
        - e.g., you may assume that today is March 24th, 2023
- **First few-shot example**
    - User
        - Single premise
        - Multiple hypotheses
    - Assistant
        - Answer with a **reason** and **conclusion** for each hypothesis
- **Second few-shot example**
    - User
        - Single premise
        - Multiple hypotheses with **tagged words**
    - Assistant
        - Answer with a **reason** and **conclusion** for each hypothesis
- **Current request**
    - Single premise (**source text**)
    - Multiple hypotheses (**$hyp_{i}$**)

#### Step 3: Entity-level detection

- Leverage a **named entity recognition (NER) model** to find entities in the hypotheses judged non-hallucinated at the sentence level.
- This is because, if a hypothesis contains **abundant factual details** or **details that require complex reasoning** against the source text, sentence-level detection may reach a false-negative conclusion.

##### Entity category definitions

- **PERSON**: Names of people.
- **PERSONTYPE**: Job types or roles held by a person.
- **LOCATION**: Natural and human-made landmarks, structures, geographical features, and geopolitical entities.
- **EVENT**: Historical, social, and naturally occurring events.
- **SKILL**: A capability, skill, or expertise.
- **DATETIME-DATERANGE**: Date ranges.
- **DATETIME-DURATION**: Durations.
- **QUANTITY-NUMBER**: Numbers.
- **QUANTITY-CURRENCY**: Currencies.

#### Step 4: Merging

- A hypothesis is judged **non-hallucinated** only if the overall **sentence judgment** and **all tagged-entity judgments** vote for non-hallucination.

### Mitigation agent

- At this point we have the **reason** and **conclusion** for each **hallucinated hypothesis**, along with the **source text** and the **raw response**.
- These are used as **instructions** to **rewrite the raw response** (see the sketch below).
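Putting the detection agent together end to end, below is a minimal Python sketch of Steps 1–4. It is illustrative only: `llm_nli` (the CoT detection-agent call), `ner_entities` (the NER model), and the `[focus: ...]` tagging format are hypothetical stand-ins, not the paper's code; only `nltk.sent_tokenize` is a real API, and the length filter is a placeholder for the paper's noise-removal step.

```python
import nltk

# Hypothetical hooks: plug in the CoT detection-agent prompt (Step 2)
# and an NER service (Step 3) here. Neither is from the paper's code.
def llm_nli(premise: str, hypothesis: str) -> tuple[str, str]:
    """Return (reason, label), label in {"Entailment", "Contradiction", "Neutral"}."""
    raise NotImplementedError("call your LLM with the detection-agent prompt")

def ner_entities(text: str) -> list[str]:
    """Return entity mentions (PERSON, LOCATION, QUANTITY-NUMBER, ...)."""
    raise NotImplementedError("call your NER model")

def split_and_select(raw_response: str) -> list[str]:
    """Step 1: segment with the NLTK sentence splitter, then drop noise.
    The length filter is a stand-in for the paper's noise removal; an
    advanced hypothesis selector is left as future work."""
    return [s for s in nltk.sent_tokenize(raw_response) if len(s.split()) > 3]

def detect(source_text: str, raw_response: str) -> list[dict]:
    """Steps 2-4: sentence-level NLI, entity-level NLI, then merge the votes."""
    findings = []
    for hyp in split_and_select(raw_response):
        reason, label = llm_nli(source_text, hyp)        # Step 2
        votes = [label]
        if label == "Entailment":                        # Step 3: re-check only
            for entity in ner_entities(hyp):             # non-hallucinated ones
                _, ent_label = llm_nli(source_text, f"{hyp} [focus: {entity}]")
                votes.append(ent_label)
        findings.append({
            "hypothesis": hyp,
            "reason": reason,
            # Step 4: non-hallucination only if every vote says Entailment
            "hallucination": any(v != "Entailment" for v in votes),
        })
    return findings
```

Note that entity-level detection runs only on hypotheses that pass the sentence-level check, matching the paper's design of using it to catch sentence-level false negatives.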
#### Mitigation agent prompt

- **System instruction**
    - Explains the task
- **Current request**
    - Source text
    - Raw response
    - Reason and conclusion for each detected hypothesis
    - Explanation of the required writing details

(A prompt-assembly sketch appears at the end of this note.)

## Experiment

### Hallucination detection experiments

#### Datasets

<center>

![image](https://hackmd.io/_uploads/rJ53v1NXC.png)

</center>

- Datasets with **synthetic hallucinations** generated on **ground-truth response text**
- Datasets with hallucinations annotated **manually** on real state-of-the-art (SOTA) **NLG model output** response text

#### Results

##### Synthetic hallucination dataset results

<center>

![image](https://hackmd.io/_uploads/Hy9UwyVQR.png)

</center>

- Dark green: No. 1
- Green: No. 2
- We can see that CoNLI achieves strong results, frequently ranked first or second

##### Annotated hallucination dataset results

<center>

![image](https://hackmd.io/_uploads/ryh1OyVXA.png)

</center>

- Dark green: No. 1
- Green: No. 2
- We can see that CoNLI again achieves strong results, frequently ranked first or second

##### Ablation study

<center>

![image](https://hackmd.io/_uploads/rJMNuJ47C.png)

</center>

- Using both **sentence-level detection** and **entity-level detection** gives better results on most datasets

### Hallucination reduction experiments

#### Results

<center>

![image](https://hackmd.io/_uploads/rk03_1V70.png)

</center>

- The **refined response is better** than the raw response in most settings

## Conclusion

- **Detects** and **reduces** ungrounded hallucinations in a **plug-and-play** manner
- Proposes a simple yet effective **LLM-based** framework that formulates **hallucination detection** as a chain of **NLI tasks**
- Importantly, its **interpretable output** can also be **leveraged for hallucination reduction**
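To make the plug-and-play loop concrete, here is a minimal sketch of the mitigation step, reusing `detect` from the detection sketch above. The prompt layout and the `llm_rewrite` call are assumptions about the mitigation-agent request (system instruction, source text, raw response, reasons and conclusions), not the paper's exact template.

```python
def llm_rewrite(prompt: str) -> str:
    # Hypothetical hook: call your LLM with the mitigation-agent prompt.
    raise NotImplementedError

def build_mitigation_prompt(source_text: str, raw_response: str,
                            findings: list[dict]) -> str:
    """Assemble the mitigation-agent request from the detection agent's
    interpretable output. The field wording is a guess at the layout,
    not the paper's exact template."""
    issues = "\n".join(
        f"- {f['hypothesis']}\n  Reason: {f['reason']}"
        for f in findings if f["hallucination"]
    )
    return (
        "Rewrite the raw response so every claim is grounded in the source text.\n\n"
        f"Source Text:\n{source_text}\n\n"
        f"Raw Response:\n{raw_response}\n\n"
        f"Detected hallucinations (reason and conclusion):\n{issues}"
    )

def conli(source_text: str, raw_response: str) -> str:
    """Plug-and-play loop: detect first, rewrite only when needed."""
    findings = detect(source_text, raw_response)   # from the detection sketch
    if not any(f["hallucination"] for f in findings):
        return raw_response                        # nothing to mitigate
    return llm_rewrite(build_mitigation_prompt(source_text, raw_response, findings))
```

Because detection runs first, responses with no flagged hypotheses are returned unchanged, which is what lets the framework sit on top of any NLG system in a plug-and-play manner.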