<style>
img {
    display: block;
    margin-left: auto;
    margin-right: auto;
}
</style>

> [Paper link](https://arxiv.org/abs/2309.15427) | [Note link](https://blog.csdn.net/qq_44426403/article/details/136398475) | [Code link](https://github.com/meettyj/GNP) | AAAI 2024

:::success
**Thoughts**
They propose Graph Neural Prompting (GNP), a novel plug-and-play method that assists pre-trained LLMs in learning beneficial knowledge from KGs.
:::

## Abstract

To reduce the limitations of large language models, existing work enhances pre-trained LLMs with grounded knowledge, e.g., via retrieval-augmented generation. However, how to best incorporate knowledge graphs (KGs) into pre-trained LLMs remains an open question. In this paper, they propose Graph Neural Prompting (GNP), a plug-and-play method that helps pre-trained LLMs gain beneficial knowledge from KGs.

## Background

Knowledge graphs (KGs) store enormous numbers of facts and serve as a systematic way of presenting knowledge. Existing methods incorporate KGs into language models by designing customized model architectures that accommodate both textual data and KGs. However, such joint training becomes challenging as the parameter size of language models grows. A direct way to combine the benefits of KGs and language models is to feed the KG triples into LLMs. However, this method can introduce substantial noise, since KGs may contain various extraneous contexts.

Can we learn beneficial knowledge from KGs and integrate it into pre-trained LLMs?

## Method

They propose a method that retrieves and encodes the pertinent grounded knowledge to derive a Graph Neural Prompt. The prompt is an embedding vector that can be sent to LLMs to provide guidance and instructions.

Given a question $Q$, a set of answer options $A = \{ a_k \}_{k=1}^K$, and an optional context $C$, the ground-truth label $y \in A$ is the correct answer. The goal is to design a model $\mathcal{F}_{\Theta}$ that selects the best option to answer the question. In addition, a knowledge graph $\mathcal{G}$ provides external knowledge to help the model answer the question.

![image](https://hackmd.io/_uploads/ryWfknns0.png)

Below are the steps (minimal code sketches of this setup are given at the end of this note):

1. Tokenize the concatenation of $C$, $Q$, and $A$ into a sequence of input text tokens $X$.
2. Design a series of prompt tokens $P$ and prepend them to the input tokens $X$; the LLM then generates the prediction $y^\prime = f([P, X])$.
3. Train the model for downstream task adaptation with standard maximum likelihood, using teacher forcing and the cross-entropy loss $\mathcal{L}_{llm} = - \log p(y \mid X, \Theta)$.

## Experiment

### Knowledge Graphs and Datasets

- General domain (commonsense reasoning)
- Biomedical domain (biomedical reasoning)

### Two Settings

- LLM Frozen
- LLM Tuned

### Baselines

* LLM-only
* Hard prompts (three prompt design methods)
* KG Flattening
    * One-hop (OH) and two-hop (TH)
* Prompt tuning (soft prompts)

The table below shows the overall experimental results on commonsense reasoning and biomedical reasoning tasks.

![image](https://hackmd.io/_uploads/HJuxk3nsR.png)

The figure below compares the results across the LLM Frozen and LLM Tuned settings.

![image](https://hackmd.io/_uploads/SJOEy22sR.png)
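## Code Sketches

For concreteness, here is a minimal sketch (not from the paper) of the KG-flattening baseline mentioned in the Background and Baselines: one-hop (OH) triples around the question entities are verbalized into plain text and fed to the LLM together with the question. The example triples and the verbalization format are illustrative assumptions.

```python
# Minimal sketch of the KG-flattening baseline: verbalize (head, relation, tail)
# triples into text and concatenate them with the question before sending to the LLM.
def flatten_triples(triples):
    """Turn (head, relation, tail) triples into a plain-text string."""
    return " ".join(f"{h} {r.replace('_', ' ')} {t}." for h, r, t in triples)

# Hypothetical one-hop triples retrieved for the question entities.
one_hop = [
    ("fish", "at_location", "water"),
    ("fish", "is_a", "animal"),
]
question = "Where do fish live?"
llm_input = f"{flatten_triples(one_hop)} {question}"
print(llm_input)
# fish at location water. fish is a animal. Where do fish live?
```

As the Background notes, flattening like this can introduce substantial noise when the retrieved subgraph contains extraneous context, which motivates encoding the graph into a compact prompt instead.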
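Next, a minimal sketch of how a Graph Neural Prompt could be derived: a small GNN encodes the retrieved subgraph, and a pooled readout is projected into the LLM embedding space to form a handful of prompt vectors. The layer choices, pooling, projection, and dimensions here are illustrative assumptions, not the exact GNP architecture.

```python
# Minimal sketch: a small GNN encodes the retrieved subgraph and its pooled
# output is projected to the LLM embedding dimension to serve as soft prompts.
import torch
import torch.nn as nn

class GraphPromptEncoder(nn.Module):
    def __init__(self, node_dim, hidden_dim, llm_dim, num_prompt_tokens=10):
        super().__init__()
        self.num_prompt_tokens = num_prompt_tokens
        self.gnn1 = nn.Linear(node_dim, hidden_dim)
        self.gnn2 = nn.Linear(hidden_dim, hidden_dim)
        # Projector maps the pooled graph representation into the LLM embedding space.
        self.projector = nn.Linear(hidden_dim, llm_dim * num_prompt_tokens)

    def forward(self, node_feats, adj):
        """node_feats: (N, node_dim); adj: (N, N) normalized adjacency matrix."""
        h = torch.relu(self.gnn1(adj @ node_feats))   # first round of neighbor aggregation
        h = torch.relu(self.gnn2(adj @ h))            # second round
        pooled = h.mean(dim=0)                        # graph-level readout
        prompt = self.projector(pooled)               # (llm_dim * num_prompt_tokens,)
        return prompt.view(1, self.num_prompt_tokens, -1)  # (1, P, llm_dim)

# Dummy usage: 5 retrieved KG entities with random features and a placeholder adjacency.
encoder = GraphPromptEncoder(node_dim=128, hidden_dim=256, llm_dim=768)
node_feats = torch.randn(5, 128)
adj = torch.eye(5)
graph_prompt = encoder(node_feats, adj)   # (1, 10, 768), ready to prepend to the LLM input
```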
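Finally, a minimal sketch of steps 1–3 in the Method section, assuming a T5-style encoder-decoder LLM from Hugging Face `transformers` in the LLM Frozen setting. The model name, prompt length, and the helper function `prepend_prompt_and_compute_loss` are illustrative assumptions.

```python
# Minimal sketch: prepend a soft prompt (e.g., a Graph Neural Prompt) to a frozen
# encoder-decoder LLM and train with the cross-entropy loss -log p(y | X, Theta).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "google/flan-t5-base"   # assumption: any T5-style LLM works for this sketch
tokenizer = T5Tokenizer.from_pretrained(model_name)
llm = T5ForConditionalGeneration.from_pretrained(model_name)

# "LLM Frozen" setting: keep the pre-trained weights fixed.
for p in llm.parameters():
    p.requires_grad = False

def prepend_prompt_and_compute_loss(prompt_embeds, context, question, options, answer):
    """Builds [P, X] and returns the token-level cross-entropy loss under teacher forcing."""
    # Step 1: tokenize the concatenation of C, Q, A into input tokens X.
    text = f"{context} {question} Options: {' '.join(options)}"
    enc = tokenizer(text, return_tensors="pt")
    token_embeds = llm.get_input_embeddings()(enc.input_ids)          # (1, L, d)

    # Step 2: prepend the prompt embeddings P to the token embeddings X.
    inputs_embeds = torch.cat([prompt_embeds, token_embeds], dim=1)   # (1, P+L, d)
    attn = torch.cat(
        [torch.ones(1, prompt_embeds.size(1), dtype=torch.long), enc.attention_mask],
        dim=1,
    )

    # Step 3: teacher forcing on the ground-truth answer y; the built-in loss
    # is the cross entropy over the decoder outputs.
    labels = tokenizer(answer, return_tensors="pt").input_ids
    out = llm(inputs_embeds=inputs_embeds, attention_mask=attn, labels=labels)
    return out.loss

# Example: 10 trainable prompt vectors standing in for the Graph Neural Prompt.
prompt = torch.nn.Parameter(torch.randn(1, 10, llm.config.d_model) * 0.02)
loss = prepend_prompt_and_compute_loss(
    prompt, context="", question="Where do fish live?",
    options=["(A) desert", "(B) water", "(C) sky"], answer="(B) water",
)
loss.backward()   # gradients flow only into the prompt (or the GNN that produces it)
```

In the LLM Tuned setting, the loop over `llm.parameters()` would simply be dropped so that the LLM weights are updated together with the prompt.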