<style>
img {
display: block;
margin-left: auto;
margin-right: auto;
}
</style>
> [Paper link](https://arxiv.org/abs/2212.10560) | [Note link](https://zhuanlan.zhihu.com/p/617965285) | [Code link](https://github.com/yizhongw/self-instruct) | ACL 2023
:::success
**Thoughts**
- Self-Instruct's effectiveness likely depends on the capabilities of the LM the instructions are generated from.
- But the quality of the generated instructions still needs to be checked by humans.
- Performance might get close to $\text{InstructGPT}_{\text{001}}$ if the dataset grows bigger.
:::
## Abstract
This paper introduces **SELF-INSTRUCT, a framework for improving the instruction-following capabilities of pre-trained language models by bootstrapping off their own generations.**
Their pipeline generates instruction, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model.
SELF-INSTRUCT provides an **almost annotation-free method** for aligning pre-trained language models with instructions, and they release their large synthetic dataset to facilitate future studies on instruction tuning.
## Introduction
Collecting such instruction data is costly and often suffers from limited diversity, since most human-written tasks tend to be popular NLP tasks.
In this work, they introduce SELF-INSTRUCT, a semi-automated process for instruction-tuning a pretrained LM using instructional signals from the model itself.

In summary, their contributions are:
1. They introduce SELF-INSTRUCT, a method for inducing instruction-following capabilities with minimal human-labeled data
2. They demonstrate its effectiveness via extensive instruction-tuning experiments
3. They release a large synthetic dataset of 52K instructions and a set of manually written novel tasks for building and evaluating future instruction-following models
## Method
SELF-INSTRUCT refers to the pipeline of generating tasks with a vanilla pretrained language model itself, filtering the generated data, and then conducting instruction tuning with this generated data in order to align the LM to follow instructions better.
### Defining Instruction Data
The instruction data is a set of instructions $\{I_t\}$, where each $I_t$ defines a task $t$ in natural language.
Task $t$ has $n_t \ge 1$ input-output instances $\{(X_{t,i}, Y_{t,i})\}_{i=1}^{n_t}$.
A model $M$ is expected to produce the output, given the task instruction and the corresponding input: $M(I_t, X_{t,i}) = Y_{t,i}$, for $i \in \{1, \dots, n_t\}$.
:::info
Note that the instruction and instance input do not have a strict boundary in many cases.
To encourage diversity in the data format, instructions that do not require additional input are also allowed (i.e., $X$ is empty).
:::
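To make the notation concrete, here is a minimal sketch of one way to represent a task record; the class and field names are illustrative, not from the paper's released code.
```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """One task t: an instruction I_t plus n_t >= 1 input-output instances."""
    instruction: str                               # I_t, the task description in natural language
    instances: list = field(default_factory=list)  # list of (X, Y) pairs

# A task whose instances take an input X ...
grammar = Task(
    instruction="Correct the grammatical errors in the given sentence.",
    instances=[("She no went to the market.", "She didn't go to the market.")],
)

# ... and a task with no additional input (X is empty), which the format allows.
joke = Task(
    instruction="Write a joke about programmers.",
    instances=[("", "Why do programmers prefer dark mode? Because light attracts bugs.")],
)
```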
### Automatic Instruction Data Generation
The pipeline for data generation consists of four steps (an end-to-end sketch follows the list):
1. Generating task instructions
2. Determining if the instruction represents a classification task
3. Instance generation with either an input-first or output-first approach
4. Filtering low-quality data
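Putting the four steps together, the generation loop might look roughly like this. All helper functions are hypothetical stand-ins for the steps described below (a few are sketched in the following paragraphs):
```python
def self_instruct(task_pool, lm_complete, target_size=52000):
    """Grow the task pool by bootstrapping off the LM's own generations."""
    while len(task_pool) < target_size:
        # Step 1: prompt the LM with in-context examples to propose new instructions.
        for instruction in generate_instructions(task_pool, lm_complete):
            # Step 4 acts as a gate: drop near-duplicates and unusable instructions.
            if not passes_filters(instruction, task_pool):
                continue
            # Step 2: decide which instance-generation strategy applies.
            clf = is_classification_task(instruction, lm_complete)
            # Step 3: output-first for classification tasks, input-first otherwise.
            instances = generate_instances(instruction, output_first=clf,
                                           lm_complete=lm_complete)
            task_pool.append({"instruction": instruction, "instances": instances})
    return task_pool
```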
> **Instruction Generation.**
SELF-INSTRUCT generates new instructions from a task pool initialized with 175 seed tasks (1 instruction and 1 instance for each task).
At every step, they sample 8 task instructions from this pool as in-context examples. Of the 8 instructions, 6 are from the human-written tasks and 2 are from the model-generated tasks of previous steps, to promote diversity.
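A sketch of the 6 + 2 sampling scheme, assuming tasks are stored as dicts with an `instruction` field; the prompt wording only approximates the paper's template:
```python
import random

def build_instruction_prompt(seed_tasks, generated_tasks, n_seed=6, n_generated=2):
    """Sample 8 in-context instructions: 6 human-written, 2 model-generated."""
    demos = random.sample(seed_tasks, n_seed)
    # Early iterations may have fewer than 2 model-generated tasks available.
    demos += random.sample(generated_tasks, min(n_generated, len(generated_tasks)))
    random.shuffle(demos)
    lines = ["Come up with a series of tasks:"]
    lines += [f"Task {i + 1}: {t['instruction']}" for i, t in enumerate(demos)]
    lines.append(f"Task {len(demos) + 1}:")  # the LM continues from here
    return "\n".join(lines)
```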
> **Classification Task Identification.**
They prepare two different approaches for classification and non-classification tasks, so the method needs to identify whether a generated instruction represents a classification task or not.
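The identification itself can be done few-shot with the same LM. A hedged sketch, where `lm_complete` is a hypothetical completion function and the demonstrations only approximate the paper's prompt:
```python
def is_classification_task(instruction, lm_complete):
    """Ask the LM few-shot whether an instruction describes a classification task."""
    prompt = (
        "Can the following task be regarded as a classification task "
        "with finite output labels?\n\n"
        "Task: Given my personality and the job, tell me if I would be suitable.\n"
        "Is it classification? Yes\n\n"
        "Task: Give me an example of a time when you had to use your sense of humor.\n"
        "Is it classification? No\n\n"
        f"Task: {instruction}\n"
        "Is it classification?"
    )
    return lm_complete(prompt).strip().lower().startswith("yes")
```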
> **Instance Generation.**
Given the instructions and their task type, they generate instances for each instruction independently.
They found that pretrained LMs can achieve this to a large extent when prompted with instruction-input-output in-context examples from other tasks.
:::info
The **input-first approach** asks the LM to come up with the input fields first, based on the instruction, and then produce the corresponding output.
:::
However, they found that this approach can generate inputs biased toward one label, especially for classification tasks. Therefore, they additionally propose an **Output-first Approach** for classification tasks, where they first generate the possible class labels, and then condition the input generation on each class label.
They apply (as sketched after this list):
1. The output-first approach to the classification tasks identified in the previous step
2. The input-first approach to the remaining non-classification tasks
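A sketch of this routing; `propose_class_labels`, `generate_input_for`, `generate_input`, and `generate_output` are hypothetical prompting helpers:
```python
def generate_instances(instruction, output_first, lm_complete):
    """Create (input, output) instances for one instruction."""
    if output_first:
        # Classification: generate the possible class labels first, then an input
        # conditioned on each label, so instances are not biased toward one label.
        labels = propose_class_labels(instruction, lm_complete)       # hypothetical
        return [(generate_input_for(instruction, y, lm_complete), y)  # hypothetical
                for y in labels]
    # Non-classification: generate the input first, then its output.
    x = generate_input(instruction, lm_complete)                      # hypothetical
    y = generate_output(instruction, x, lm_complete)                  # hypothetical
    return [(x, y)]
```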
> **Filtering and Postprocessing.**
A new instruction is added to the task pool only when its ROUGE-L similarity with every existing instruction is less than 0.7.
They also exclude instructions that contain certain keywords (e.g., image, picture, graph) that usually cannot be processed by LMs.
When generating new instances for each instruction, they filter out instances that are exactly the same, or those with the same input but different outputs.
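A minimal sketch of these filters using Google's `rouge_score` package; the keyword list contains only the examples mentioned above, and the instance dedup reflects one possible reading of the rule:
```python
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)
BAD_KEYWORDS = ("image", "picture", "graph")  # the examples given in the paper

def passes_filters(instruction, task_pool, threshold=0.7):
    """Keep an instruction only if it is novel and processable by a text-only LM."""
    if any(kw in instruction.lower() for kw in BAD_KEYWORDS):
        return False
    return all(
        scorer.score(task["instruction"], instruction)["rougeL"].fmeasure < threshold
        for task in task_pool
    )

def dedup_instances(instances):
    """Drop exact duplicates and instances whose input already maps to another output."""
    seen_outputs, kept = {}, []
    for x, y in instances:
        if x in seen_outputs and seen_outputs[x] != y:
            continue  # same input, different output: keep the first (one interpretation)
        if (x, y) not in kept:  # exact duplicates are dropped
            seen_outputs[x] = y
            kept.append((x, y))
    return kept
```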
### Finetuning the LM to Follow Instructions
They concatenate the instruction and instance input as a prompt and train the model to generate the instance output in a standard supervised way.
To make the model robust to different formats, they use multiple templates to encode the instruction and instance input together.
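One such template might be encoded as follows; the exact wording of the released templates differs, so this is illustrative:
```python
def encode_example(instruction, instance_input, instance_output):
    """Concatenate instruction and instance input into a prompt; the output is the target."""
    if instance_input:
        prompt = f"Task: {instruction}\nInput: {instance_input}\nOutput:"
    else:
        prompt = f"Task: {instruction}\nOutput:"  # instructions with empty X
    return {"prompt": prompt, "completion": " " + instance_output}
```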
## SELF-INSTRUCT Data from GPT3
They use the largest GPT3 LM (“davinci” engine) accessed through the OpenAI API.
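For reference, the pre-v1 `openai` Python SDK of that era exposed base models via a completions endpoint; a sketch with a placeholder key and decoding parameters (not the paper's exact settings):
```python
import openai  # the pre-v1 SDK, as available when the paper was written

openai.api_key = "sk-..."  # placeholder

def lm_complete(prompt, max_tokens=512):
    """Query the base "davinci" engine (a vanilla LM, not instruction-tuned)."""
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0.7,  # illustrative decoding parameters
        stop=["\n\n"],
    )
    return response["choices"][0]["text"]
```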


Regarding quality, they found that most of the generated instructions are meaningful, while the generated instances **may contain more noise** (to a reasonable extent). However, even though the generations may contain errors, most of them are **still in the correct format or partially correct**, which can provide useful guidance for training models to follow instructions.

## Experimental Results
Their method, $\text{GPT}_{\text{SELF-INST}}$, finetunes GPT3 on its own instruction data.
Baseline models:
1. T5-LM, GPT3
2. T0 and T$k$-INSTRUCT (both finetuned from T5)
3. InstructGPT
Human-written instruction data: SUPERNI
**Zero-Shot Generalization on SUPERNI benchmark**

**Generalization to User-oriented Instructions on Novel Tasks**

**Effect of Data Size and Quality**

## Conclusion
This paper introduces SELF-INSTRUCT, a method to improve the instruction-following ability of LMs via their own generation of instruction data.
Human evaluation on this set shows that tuning GPT3 with SELF-INSTRUCT outperforms using existing public instruction datasets by a large margin and performs closely to $\text{InstructGPT}_{\text{001}}$.
## Limitations
**Tail phenomena**
SELF-INSTRUCT depends on LMs, so it inherits all of their limitations.
In other words, LMs' largest gains correspond to frequent uses of language (the head of the language-use distribution), and there might be minimal gains in low-frequency contexts.
**Dependence on large models**
Because of SELF-INSTRUCT’s dependence on the inductive biases extracted from LMs, it might work best for larger models. If true, this may create barriers to access for those who may not have large computing resources.
**Reinforcing LM biases**
A point of concern for the authors is the unintended consequences of this iterative algorithm, such as the amplification of problematic social biases.