(https://learn.deeplearning.ai/)
Via prompt templates
Using another LLM to generate data (see the sketch after this list)
Using open-source models
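For the second option, here is a minimal sketch of having an already instruction-tuned LLM draft new training examples. The generator model, prompt wording, and parsing are illustrative assumptions, not code from the lesson.
# Sketch: use an instruction-tuned LLM to draft new training instructions.
# Model choice, prompt wording, and parsing are assumptions for illustration.
from llama import BasicModelRunner

generator = BasicModelRunner("meta-llama/Llama-2-7b-chat-hf")
seed_instruction = "Give three tips for staying healthy."
generation_prompt = (
    "Write five new task instructions in the same style as this example, "
    "one per line:\n" + seed_instruction
)
generated_text = generator(generation_prompt)
candidate_instructions = [line.strip() for line in generated_text.split("\n") if line.strip()]
print(candidate_instructions)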
Definition: Finetuning is a process in which a pretrained model, such as a large language model (LLM), is further trained on a specific dataset so that it specializes in a particular task or domain.
Purpose: It adapts a general-purpose model to specific needs, improving its performance on specialized tasks without training from scratch.
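To make this concrete, here is a minimal sketch of further training a small pretrained model on a toy task-specific dataset with the Hugging Face Trainer. The model name, toy data, and hyperparameters are illustrative assumptions, not the setup used in this lesson.
# Minimal finetuning sketch (model, data, and hyperparameters are illustrative assumptions).
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model_name = "EleutherAI/pythia-70m"  # small pretrained model, chosen for illustration
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Toy task-specific corpus standing in for a real finetuning dataset.
texts = ["### Instruction:\nGive three tips for staying healthy.\n### Response:\nEat well, exercise, and sleep enough."]
train_ds = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune_demo", num_train_epochs=1,
                           per_device_train_batch_size=1, report_to="none"),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()  # continues training the pretrained weights on the task-specific data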
Environment setup
import itertools
import jsonlines
from datasets import load_dataset
from pprint import pprint
from llama import BasicModelRunner
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSeq2SeqLM
import pandas as pd
instruction_tuned_dataset = load_dataset("tatsu-lab/alpaca", split="train", streaming=True)
Display the dataset
m = 5
print("Instruction-tuned dataset:")
top_m = list(itertools.islice(instruction_tuned_dataset, m))
with pd.option_context('display.max_colwidth', None):
    display(pd.DataFrame(top_m))
Two prompt templates (with and without an input field)
A prompt template is a predefined format or structure that standardizes how instructions and inputs are presented to the model.
prompt_template_with_input
prompt_template_without_input
prompt_template_with_input = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Response:"""
prompt_template_without_input = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:"""
Hydrate prompts (add data to the prompts)
processed_data = []
for j in top_m:
    if not j["input"]:
        processed_prompt = prompt_template_without_input.format(instruction=j["instruction"])
    else:
        processed_prompt = prompt_template_with_input.format(instruction=j["instruction"], input=j["input"])
    processed_data.append({"input": processed_prompt, "output": j["output"]})
pprint(processed_data[0])
with pd.option_context('display.max_colwidth', None):
    display(pd.DataFrame(processed_data))
{'input': 'Below is an instruction that describes a task. Write a response '
'that appropriately completes the request.\n'
'\n'
'### Instruction:\n'
'Give three tips for staying healthy.\n'
'\n'
'### Response:',
'output': '1.Eat a balanced diet and make sure to include plenty of fruits '
'and vegetables. \n'
'2. Exercise regularly to keep your body active and strong. \n'
'3. Get enough sleep and maintain a consistent sleep schedule.'}
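The hydrated examples can then be written out as JSONL for finetuning, using the jsonlines import above (the file name is an illustrative choice).
# Save the hydrated prompt/response pairs to a JSONL file for finetuning.
with jsonlines.open('alpaca_processed.jsonl', 'w') as writer:
    writer.write_all(processed_data)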
Compare non-instruction-tuned vs. instruction-tuned models
Run an A/B comparison of LLMs of the same size before and after instruction tuning.
Evaluation criteria include how precisely the model follows the instruction and how relevant its answer is to the question and context.
Instruction tuning clearly improves the model's ability to understand and carry out instructions, whereas the non-instruction-tuned model is more likely to produce answers unrelated to the question.
Non-instruction-tuned model
non_instruct_model = BasicModelRunner("meta-llama/Llama-2-7b-hf")
non_instruct_output = non_instruct_model("Tell me how to train my dog to sit")
print("Not instruction-tuned output (Llama 2 Base):", non_instruct
Base model output
Not instruction-tuned output (Llama 2 Base):
Tell me how to train my dog to sit. I have a 10 month old puppy and I want to train him to sit. I have tried the treat method and he just sits there and looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. I have tried the "sit" command and he just looks at me like I am crazy. ...
Instruction-tuned model
instruct_model = BasicModelRunner("meta-llama/Llama-2-7b-chat-hf")
instruct_output = instruct_model("Tell me how to train my dog to sit")
print("Instruction-tuned output (Llama 2): ", instruct_output)
Following the prompt and instruction, the instruction-tuned model produces a better, more structured output.
Instruction-tuned output (Llama 2): on command.
How to Train Your Dog to Sit on Command
Training your dog to sit on command is a basic obedience command that can be achieved with patience, consistency, and positive reinforcement. Here's a step-by-step guide on how to train your dog to sit on command:
1. Choose a Quiet and Distraction-Free Area: Find a quiet area with minimal distractions where your dog can focus on you.
2. Have Treats Ready: Choose your dog's favorite treats and have them ready to use as rewards.
3. Stand in Front of Your Dog: Stand in front of your dog and hold a treat close to their nose.
4. Move the Treat Above Your Dog's Head: Slowly move the treat above your dog's head, towards their tail. As your dog follows the treat with their nose, their bottom will naturally lower into a sitting position.
5. Say "Sit" and Reward: As soon as your dog's butt touches the ground, say "Sit" and give them the treat. It's important to say the command word as they're performing
Try smaller models
Besides instruction tuning large pretrained language models such as GPT-3, you can also try smaller open-source language models.
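A minimal sketch of running the same prompt through a small open-source model with the transformers imports above; the choice of EleutherAI/pythia-70m and the generation settings are illustrative assumptions.
# Load a small open-source model and run the same prompt (illustrative setup).
small_model_name = "EleutherAI/pythia-70m"  # assumed small model for illustration
small_tokenizer = AutoTokenizer.from_pretrained(small_model_name)
small_model = AutoModelForCausalLM.from_pretrained(small_model_name)

def generate(prompt, model, tokenizer, max_new_tokens=100):
    # Tokenize the prompt, generate a continuation, and return only the new text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    full_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return full_text[len(prompt):]

print("Small model output:", generate("Tell me how to train my dog to sit", small_model, small_tokenizer))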
Compare to finetuned small model
Comparing against the larger model lets you evaluate how well instruction tuning works on a small model, providing a basis for model selection and hyperparameter tuning. It also checks whether the benefits of instruction tuning carry over to smaller models.
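A sketch of that comparison, reusing the generate helper and tokenizer from the previous snippet; the finetuned checkpoint name is a hypothetical placeholder for whichever small finetuned model you want to evaluate.
# Compare against an instruction-finetuned small model (hypothetical checkpoint name).
finetuned_small_name = "your-org/pythia-70m-instruction-finetuned"  # placeholder, not a real repo
finetuned_small_model = AutoModelForCausalLM.from_pretrained(finetuned_small_name)
print("Finetuned small model output:",
      generate("Tell me how to train my dog to sit", finetuned_small_model, small_tokenizer))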