Finetuning Large Language Models

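Instruction Tuning

Course Overview

This lesson introduces instruction finetuning, the technique that turned GPT-3 into ChatGPT and gave it its chat capabilities.
Instruction finetuning makes a model behave more like a chatbot, and it greatly increased AI adoption.
Prompt templates can be used to transform data and to build a question-answer format.
The main steps of finetuning are data preparation, training, and evaluation; data preparation is the most distinctive of the three.
ChatGPT is a large model, but smaller models can also answer questions more accurately once they are finetuned.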

What Is Instruction Finetuning?

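Finetuning

Definition
- Finetuning is a process in which a pretrained model, such as a large language model (LLM), is trained further on a specific dataset so that it specializes in a particular task or domain.
Purpose
- It adapts a general-purpose model to specific needs and improves its performance on specialized tasks, without training from scratch.
- It can be applied to many tasks, such as reasoning, routing, copilot (code writing), chat, and agent-based interaction.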

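Instruction Finetuning

A finetuning variant that can turn a model like GPT-3 into a chat-oriented version like ChatGPT.
Models trained this way are also called "instruction-tuned" or "instruction-following" LLMs.
It aims to teach the model to behave more like a chatbot.
It provides a better user interface for interacting with the model.
It played a major role in increasing AI adoption, expanding the user base from thousands of researchers to millions of people.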

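Instruction-Following Datasets

Datasets designed specifically to train a model to follow instructions or to respond to prompts in a structured way.
These datasets played a key role in turning models like GPT-3 into more interactive versions like ChatGPT.
The goal is for the model to understand and follow instructions, more like a chatbot.

Some existing data is readily available online:

FAQs
- Frequently asked questions come in a structured question-and-answer format, which makes them well suited to training a model to respond to user queries.
Customer support conversations
- These are real interactions between customers and support representatives. They offer rich dialogue and problem-solving scenarios that help a model understand and respond to user problems.
Slack messages
- Conversations from platforms like Slack represent informal, real-time interaction between users, which helps train a model to understand and take part in casual conversation.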
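
As a rough illustration of why FAQs are convenient, the sketch below maps question-answer entries onto the instruction/input/output fields that the Alpaca dataset in the lab uses. The FAQ entries and the helper name `faq_to_record` are made up for this example.

```python
# Hypothetical FAQ entries; in practice these would come from your own data.
faq_entries = [
    {"question": "How do I reset my password?",
     "answer": "Open Settings, choose Security, then click Reset password."},
    {"question": "Which payment methods are supported?",
     "answer": "Credit cards and bank transfers."},
]


def faq_to_record(entry):
    """Map one FAQ entry to an Alpaca-style record (instruction/input/output)."""
    return {
        "instruction": entry["question"].strip(),
        "input": "",  # FAQs carry no extra context, so the input stays empty
        "output": entry["answer"].strip(),
    }


records = [faq_to_record(e) for e in faq_entries]
print(records[0])
```

Customer support transcripts and Slack threads would need more cleanup (speaker turns, redaction) before they fit the same shape.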

LLM Data Generation

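Data that is not in question-answer form can be converted into it
- With prompt templates
  - Converting unstructured content such as instructions and documents into a structured question-answer format helps an LLM learn that content more effectively
  - e.g. pass a README file through a prompt template and use another LLM to turn it into structured question-answer pairs
  - Other conversion methods include writing question-answer pairs by hand and extracting questions and answers with rules
- With another LLM that generates the data
  - Alpaca used an OpenAI LLM to generate its structured instruction-following data
  - Build an LLM pipeline: a first LLM generates the data, and a second LLM is finetuned on it
- With open-source models
  - This avoids the high cost of a custom LLM, and the open-source model can be finetuned further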
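
The README example above might look roughly like the sketch below: a template wraps the document, an LLM is asked for question-answer pairs, and the reply is parsed. The template wording, the helper names, and the sample README text are all assumptions for illustration, and the LLM call is replaced by a hard-coded reply so the sketch runs offline.

```python
import json

# Hypothetical template; the wording is an assumption, not the course's prompt.
QA_GENERATION_TEMPLATE = """Read the document below and write {n} question-answer pairs about it.
Return a JSON list of objects with the keys "question" and "answer".

### Document:
{document}

### Question-answer pairs:"""


def build_qa_prompt(document, n=3):
    """Fill the template with the document text and the number of pairs wanted."""
    return QA_GENERATION_TEMPLATE.format(document=document.strip(), n=n)


def parse_qa_pairs(llm_response):
    """Parse the LLM's JSON reply, keeping only well-formed pairs."""
    try:
        items = json.loads(llm_response)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    return [
        {"question": item["question"], "answer": item["answer"]}
        for item in items
        if isinstance(item, dict) and "question" in item and "answer" in item
    ]


readme_text = "MyTool is a CLI for resizing images. Install it with pip."
prompt = build_qa_prompt(readme_text, n=2)

# Stand-in for a real LLM call; replace with whichever client you use.
fake_response = '[{"question": "What is MyTool?", "answer": "A CLI for resizing images."}]'
pairs = parse_qa_pairs(fake_response)
print(prompt)
print(pairs)
```

Real model replies are often not clean JSON, so a production pipeline would likely need retries or a more forgiving parser.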

Instruction Finetuning Generalization

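The model can access its pre-existing knowledge
- Finetuning is not just memorizing question-answer pairs; the model can also draw on knowledge it already has
- The model can answer questions that do not appear explicitly in the finetuning data
Instruction following generalizes to other data, not only the finetuning dataset
- Finetuning teaches the model a new behavior, which then applies to a much broader range of data
- For example, the paper behind ChatGPT reports that the model could answer coding questions even though coding questions were not in the finetuning data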

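Overview of Finetuning

Definition: Finetuning is a process in which a pretrained model, such as a large language model (LLM), is trained further on a specific dataset so that it specializes in a particular task or domain.
Purpose: It adapts a general-purpose model to specific needs and improves its performance on specialized tasks, without training from scratch.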

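Types of finetuning
- Instruction finetuning: as described above, this type teaches the model to follow instructions, making it more like a chatbot.
- Task-specific finetuning: this trains the model for a particular task, such as reasoning, routing, copilot (code writing), chat, or agent-based interaction.
- Domain-specific finetuning: here the model is finetuned for a particular domain or industry, such as finance, healthcare, or entertainment, so that it understands and generates content relevant to that domain.

The types of finetuning differ mainly in data preprocessing
- The training data is adapted to the task
- The training and evaluation procedures are similar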

Lab - Instruction-tuning

Load instruction tuned dataset

Environment setup

import itertools
import jsonlines

from datasets import load_dataset
from pprint import pprint

from llama import BasicModelRunner
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer
import pandas as pd

# Stream the dataset so examples are fetched lazily instead of downloaded in full
instruction_tuned_dataset = load_dataset("tatsu-lab/alpaca", split="train", streaming=True)

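Display the dataset

m = 5
print("Instruction-tuned dataset:")
# Take the first m examples from the streamed dataset
top_m = list(itertools.islice(instruction_tuned_dataset, m))
with pd.option_context('display.max_colwidth', None):
    display(pd.DataFrame(top_m))  # display() is provided by Jupyter notebooks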

Two prompt templates (with and without input)
A prompt template is a predefined format or structure that guides users to provide input or information in a specific way.

prompt_template_with_input
prompt_template_without_input

prompt_template_with_input = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:"""

prompt_template_without_input = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""

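Hydrate prompts (add data to prompts)

Hydrating a prompt means inserting actual data into a predefined prompt template to produce a complete prompt.

This step turns an abstract template into a concrete prompt and gives the model real-world context.
Adding data to the template helps the LLM learn real knowledge during training, rather than only the generic wording of the template.
The data goes into specific placeholders in the template, which must be set up to match each predefined template.

processed_data = []
for j in top_m:
  # Examples with an empty "input" field use the template without an input section
  if not j["input"]:
    processed_prompt = prompt_template_without_input.format(instruction=j["instruction"])
  else:
    processed_prompt = prompt_template_with_input.format(instruction=j["instruction"], input=j["input"])

  processed_data.append({"input": processed_prompt, "output": j["output"]})

pprint(processed_data[0])
with pd.option_context('display.max_colwidth', None):
    display(pd.DataFrame(processed_data))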

{'input': 'Below is an instruction that describes a task. Write a response '
          'that appropriately completes the request.\n'
          '\n'
          '### Instruction:\n'
          'Give three tips for staying healthy.\n'
          '\n'
          '### Response:',
 'output': '1.Eat a balanced diet and make sure to include plenty of fruits '
           'and vegetables. \n'
           '2. Exercise regularly to keep your body active and strong. \n'
           '3. Get enough sleep and maintain a consistent sleep schedule.'}
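
Once prompts are hydrated, a common next step is to save them one JSON object per line (JSON Lines) so a training script can read them back. The sketch below uses only the standard library; the two records are shortened, made-up stand-ins for `processed_data`, and the file name is hypothetical.

```python
import json
import os
import tempfile

# Two shortened, made-up records in the same shape as processed_data.
example_records = [
    {"input": "### Instruction:\nGive three tips for staying healthy.\n\n### Response:",
     "output": "1. Eat a balanced diet."},
    {"input": "### Instruction:\nName the three primary colors.\n\n### Response:",
     "output": "Red, blue, and yellow."},
]


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# The file name is hypothetical; a temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    jsonl_path = os.path.join(tmp_dir, "alpaca_processed.jsonl")
    write_jsonl(example_records, jsonl_path)
    reloaded = read_jsonl(jsonl_path)

print(reloaded == example_records)
```

Newlines inside the prompt text are escaped by `json.dumps`, so each record still occupies exactly one line of the file.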

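Compare non-instruction-tuned vs. instruction-tuned models

Run an A/B test that compares LLMs of the same size before and after instruction finetuning.
Evaluation criteria include how accurately the model follows the instruction and how relevant the answer is to the context.
Instruction finetuning clearly improves the model's ability to understand and carry out instructions, whereas a model without it is more likely to produce answers unrelated to the question.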
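
A comparison like this can be organized as a small harness that sends the same prompts to each model and scores the outputs. The sketch below is only one possible setup: the repetition score is a crude heuristic chosen for illustration (base models tend to loop), not a metric from the course, and the two stub functions stand in for real models so the code runs offline.

```python
import re


def repetition_ratio(text):
    """Fraction of sentences that repeat an earlier sentence (0.0 means no repeats)."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return 1.0 - len(set(sentences)) / len(sentences)


def compare_models(models, prompts):
    """Run every prompt through every model and score each output."""
    rows = []
    for prompt in prompts:
        for name, generate in models.items():
            output = generate(prompt)
            rows.append({"prompt": prompt, "model": name,
                         "repetition": round(repetition_ratio(output), 2),
                         "output": output})
    return rows


# Stub callables standing in for real models, so the sketch runs offline.
def base_stub(prompt):
    return prompt + ". I have tried the treat method. I have tried the treat method."


def tuned_stub(prompt):
    return "Hold a treat above the dog's nose. Say sit. Reward the dog."


results = compare_models({"base": base_stub, "instruct": tuned_stub},
                         ["Tell me how to train my dog to sit"])
for row in results:
    print(row["model"], row["repetition"])
```

Any callable that maps a prompt string to an output string can be dropped into the `models` dictionary in place of the stubs.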

Not instruction-tuned

non_instruct_model = BasicModelRunner("meta-llama/Llama-2-7b-hf")
non_instruct_output = non_instruct_model("Tell me how to train my dog to sit")
print("Not instruction-tuned output (Llama 2 Base):", non_instruct_output)

Base model output

Not instruction-tuned output (Llama 2 Base): 

Tell me how to train my dog to sit. I have a 10 month old puppy and I want to train him to sit. I have tried the treat method and he just sits there and looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. ...

Instruction-tuned

instruct_model = BasicModelRunner("meta-llama/Llama-2-7b-chat-hf")
instruct_output = instruct_model("Tell me how to train my dog to sit")
print("Instruction-tuned output (Llama 2): ", instruct_output)

The model follows the prompt as an instruction and produces a better, structured output

Instruction-tuned output (Llama 2):   on command.
How to Train Your Dog to Sit on Command
Training your dog to sit on command is a basic obedience command that can be achieved with patience, consistency, and positive reinforcement. Here's a step-by-step guide on how to train your dog to sit on command:
1. Choose a Quiet and Distraction-Free Area: Find a quiet area with minimal distractions where your dog can focus on you.
2. Have Treats Ready: Choose your dog's favorite treats and have them ready to use as rewards.
3. Stand in Front of Your Dog: Stand in front of your dog and hold a treat close to their nose.
4. Move the Treat Above Your Dog's Head: Slowly move the treat above your dog's head, towards their tail. As your dog follows the treat with their nose, their bottom will naturally lower into a sitting position.
5. Say "Sit" and Reward: As soon as your dog's butt touches the ground, say "Sit" and give them the treat. It's important to say the command word as they're performing

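Try smaller models
Besides instruction-finetuning large pretrained language models such as GPT-3, you can also try smaller open-source language models.
- Some smaller language models include:
  - GPT-2, whose smallest version has on the order of a hundred million parameters
  - EleutherAI's open-source Pythia models, which start at tens of millions of parameters
- Advantages of small models:
  - Lower compute requirements and faster training
  - Cheaper, more efficient testing of different model architectures and hyperparameters
  - After instruction finetuning, performance can still reach a usable level
- Disadvantages of small models:
  - With fewer parameters, they are less expressive than large models
  - Their language understanding is comparatively weaker
  - They need carefully designed training data to get good results from instruction finetuning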
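
To put the lower compute requirement in numbers, the sketch below estimates the memory needed just to hold a model's weights. The parameter counts are round figures chosen for illustration, and the estimate ignores activations, gradients, and optimizer state, which add considerably more during training.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="fp16"):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Round figures: a hypothetical 70-million-parameter model vs. a 7-billion-parameter one.
for name, params in [("70M model", 70_000_000), ("7B model", 7_000_000_000)]:
    print(f"{name}: {weight_memory_gb(params, 'fp16'):.2f} GB in fp16, "
          f"{weight_memory_gb(params, 'fp32'):.2f} GB in fp32")
```

Under these assumptions the small model's weights fit comfortably on a laptop, while the 7B model already needs a sizeable GPU in half precision.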
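Compare to finetuned small model
Comparing against a large model shows how well instruction finetuning works on a small model and provides a basis for model selection and hyperparameter tuning. It also tests whether the benefits of instruction finetuning carry over to small models.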
