# FMOps: Large Language Model Training and Deployment
How to choose the right AWS tool for model training and deployment:
Using Llama 3 70B as the example model (more ✓ = easier to operate; ✗ = not applicable):
| Service | Host | Train (Full) | Train (Adapter) |
|-----------|------|-------|-------|
| Bedrock | ✓✓✓✓ <br>(limited model support) | ✗ | ✗ |
| SageMaker JumpStart | ✓✓✓ | ✗ | ✓✓✓ <br>(many supported models) |
| SageMaker Training | ✗ | ✓✓ <br>(customizable) | ✓✓ <br>(customizable) |
| SageMaker Hosting | ✓✓ | ✗ | ✗ |
| SageMaker HyperPod | ✓ | ✓ | ✓ |
| EC2 instances | ✓ | ✓ | ✓ |
Notes:
- The models available for fine-tuning differ by region. Nova can only be fine-tuned in us-east-1, while us-west-2 supports fine-tuning of the Llama 3.1 8B/70B Instruct models.
- Models fine-tuned through Bedrock must be deployed with Provisioned Throughput (PT).
---
## Training Models on AWS
:::success
AWS blog: [Generative AI foundation model training on Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/generative-ai-foundation-model-training-on-amazon-sagemaker/)
Updated: 2024-10-22
:::
### Data Formats for Training Text Models with Bedrock
:::info
See the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html)
:::
#### Unsupervised training: Domain adaptation fine-tuning
Only the following format is supported:
```python
# .jsonl
{"input": "<input text>"}
{"input": "<input text>"}
{"input": "<input text>"}
```
#### Supervised training: Instruction tuning format
```python
# .jsonl
{"prompt": "<prompt1>", "completion": "<expected generated text>"}
{"prompt": "<prompt2>", "completion": "<expected generated text>"}
```
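As a quick illustration, a short script like the one below can write these records as a `.jsonl` file and upload it to S3 for a customization job. This is only a sketch; the bucket name, key prefix, and sample records are placeholders, not part of the original note.
```python
# Minimal sketch: write prompt/completion records as JSON Lines and upload to S3.
# Bucket name, key prefix, and records are illustrative placeholders.
import json
import boto3

records = [
    {"prompt": "Summarize the following text: ...", "completion": "..."},
    {"prompt": "Translate the following sentence to French: ...", "completion": "..."},
]

# One JSON object per line, as Bedrock expects for .jsonl training data
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Upload the file so the customization job can reference it by S3 URI
s3 = boto3.client("s3")
s3.upload_file("train.jsonl", "my-training-bucket", "bedrock/train/train.jsonl")
```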
### Training Models with Bedrock
:::info
Only a small set of models is supported; see the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/custom-model-supported.html) for the list.
:::
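The fine-tuning job itself can be submitted with the `bedrock` boto3 client. The sketch below is hedged: the job name, custom model name, role ARN, base model identifier, S3 URIs, and hyperparameters are all assumptions to verify against the supported-models page and your own account.
```python
# Minimal sketch: submit a Bedrock fine-tuning (model customization) job.
# All names, ARNs, S3 URIs, and hyperparameters are illustrative assumptions.
import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

response = bedrock.create_model_customization_job(
    jobName="llama3-finetune-demo",                            # assumed job name
    customModelName="llama3-70b-custom",                       # assumed custom model name
    roleArn="arn:aws:iam::111122223333:role/BedrockFtRole",    # placeholder IAM role
    baseModelIdentifier="meta.llama3-1-70b-instruct-v1:0",     # check the supported-models page
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://my-training-bucket/bedrock/train/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-training-bucket/bedrock/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
)
print(response["jobArn"])
```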
### Data Formats for Training Text Models with SageMaker
#### Unsupervised training: Domain adaptation fine-tuning
```text
# .jsonl
{"text": "<input text>"}
{"text": "<input text>"}
{"text": "<input text>"}
```
#### Chat completion fine-tuning format
[Sample data](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot/viewer/default/train?p=1&row=101)
```text
[
{"role": "user", "content": "{Human conversation}"},
{"role": "assistant", "content": "{AI conversation}"}
]
```
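A rough sketch of converting the linked Hugging Face dialogue dataset into this role/content message format is shown below. The column names `Patient` and `Doctor` are assumptions based on the sample data, and some training flows wrap the message list in an extra key; verify both against the actual schema before use.
```python
# Minimal sketch: turn a dialogue dataset into role/content message lists, one per line.
# Column names "Patient" and "Doctor" are assumptions; adjust to the dataset's schema.
import json
from datasets import load_dataset

dataset = load_dataset("ruslanmv/ai-medical-chatbot", split="train")

with open("chat_train.jsonl", "w", encoding="utf-8") as f:
    for row in dataset:
        messages = [
            {"role": "user", "content": row["Patient"]},
            {"role": "assistant", "content": row["Doctor"]},
        ]
        # Some trainers expect a wrapper key (e.g. "dialog"); check the target format.
        f.write(json.dumps(messages, ensure_ascii=False) + "\n")
```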
#### Supervised training: Instruction tuning format
The training data must be in JSON Lines (.jsonl) format, where each line is a JSON object representing a single sample. All training data must live in a single folder, although it may be split across multiple .jsonl files; the .jsonl extension is required. The training folder may also contain a template.json file that describes the input and output formats.
If no template file is provided, the following default template using `prompt` and `completion` is applied:
```text
# default
{
"prompt": "{prompt}",
"completion": "{completion}"
}
```
You can customize the names of the dynamically substituted parameters in the prompt, such as `instruction`, `context`, and `response`:
```python!
# Prepare template.json
{
"prompt": "Below is an instruction that describes a task."
"Write a response that appropriately completes the request.\n\n"
"### Instruction:\n{instruction}\n\n### Response:\n",
"completion": "{output}"
}
# or
{
"prompt": "Below is an instruction that describes a task, paired with an input that provides further context. "
"Write a response that appropriately completes the request.\n\n"
"### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n### Response:\n",
"completion": " {response}",
}
# Prepare train.jsonl
{
"instruction": "Who painted the Two Monkeys",
"context": "Two Monkeys or Two Chained Monkeys is a 1562 painting by Dutch and Flemish Renaissance artist Pieter Bruegel the Elder. The work is now in the Gemäldegalerie (Painting Gallery) of the Berlin State Museums.",
"response": "The two Monkeys or Two Chained Monkeys is a 1562 painting by Dutch and Flemish Renaissance artist Pieter Bruegel the Elder. The work is now in the Gemaeldegalerie (Painting Gallery) of the Berlin State Museums."
}
```
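To tie the pieces together, the sketch below writes template.json and train.jsonl into the same folder (as required above) and uploads both under one S3 prefix. The bucket name, prefix, and sample record are placeholders.
```python
# Minimal sketch: put template.json and train.jsonl in one folder and upload it to S3.
# Bucket, prefix, and the sample record are illustrative placeholders.
import json
import pathlib
import boto3

train_dir = pathlib.Path("finetune_data")
train_dir.mkdir(exist_ok=True)

template = {
    "prompt": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n### Response:\n"
    ),
    "completion": " {response}",
}
(train_dir / "template.json").write_text(json.dumps(template), encoding="utf-8")

sample = {
    "instruction": "Who painted the Two Monkeys",
    "context": "Two Monkeys or Two Chained Monkeys is a 1562 painting by Pieter Bruegel the Elder.",
    "response": "It was painted by Pieter Bruegel the Elder in 1562.",
}
with (train_dir / "train.jsonl").open("w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# Upload both files so they share the same S3 prefix
s3 = boto3.client("s3")
for path in train_dir.iterdir():
    s3.upload_file(str(path), "my-training-bucket", f"jumpstart/train/{path.name}")
```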
### Training Llama Models with SageMaker JumpStart
:::info
Many models are supported; training mainly uses LoRA/QLoRA.
:::
:::success
AWS blog: [Use Llama 3.1 405B for synthetic data generation and distillation to fine-tune smaller models](https://aws.amazon.com/blogs/machine-learning/use-llama-3-1-405b-to-generate-synthetic-data-for-fine-tuning-tasks/)
Updated: 2024-07-23
AWS blog: [Fine-tune Meta Llama 3.1 models for generative AI inference using Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/fine-tune-meta-llama-3-1-models-for-generative-ai-inference-using-amazon-sagemaker-jumpstart/)
Updated: 2024-08-21
AWS blog: [Fine-tune Llama 3 for text generation on Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/fine-tune-llama-3-for-text-generation-on-amazon-sagemaker-jumpstart/)
Updated: 2024-09-06
:::
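A rough sketch of launching a JumpStart fine-tuning job with the SageMaker Python SDK is shown below. The `model_id`, instance type, hyperparameters, and S3 path are assumptions to adapt to your own setup; the estimator points at the S3 prefix that holds train.jsonl (and template.json, if any).
```python
# Minimal sketch using the SageMaker Python SDK's JumpStartEstimator.
# model_id, instance type, hyperparameters, and S3 paths are illustrative assumptions.
from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-3-70b",   # verify the exact JumpStart model id
    environment={"accept_eula": "true"},          # Llama models require accepting the EULA
    instance_type="ml.p4d.24xlarge",
    hyperparameters={
        "instruction_tuned": "True",
        "epoch": "1",
    },
)

# The "training" channel points at the folder containing the .jsonl data
estimator.fit({"training": "s3://my-training-bucket/jumpstart/train/"})
```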

### Training Models with SageMaker Bring Your Own Script (BYOS)
[AWS GitHub BYOS](https://github.com/aws/amazon-sagemaker-examples/tree/default/generative_ai) | [script for encoder-decoder](https://github.com/aws/amazon-sagemaker-examples/blob/default/generative_ai/sm-finetuning_huggingface_with_your_own_scripts_and_data/source/train.py) | [script for mistral](https://github.com/aws/amazon-sagemaker-examples/blob/default/generative_ai/sm-mixtral_8x7b_fine_tune_and_deploy/scripts/run_clm.py)
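A hedged BYOS sketch with the SageMaker HuggingFace estimator follows; the entry point must be your own training script, and the framework versions, instance type, and hyperparameters are assumptions that should be checked against the available Deep Learning Containers.
```python
# Minimal BYOS sketch with the SageMaker HuggingFace estimator.
# entry_point/source_dir must contain your own training script; versions,
# instance type, and hyperparameters below are assumptions to adjust.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()

estimator = HuggingFace(
    entry_point="run_clm.py",            # your own training script
    source_dir="scripts",
    role=role,
    instance_type="ml.g5.12xlarge",
    instance_count=1,
    transformers_version="4.36",         # check available DLC version combinations
    pytorch_version="2.1",
    py_version="py310",
    hyperparameters={
        "model_id": "meta-llama/Meta-Llama-3-8B",  # assumed smaller model for illustration
        "epochs": 1,
        "per_device_train_batch_size": 1,
    },
)

estimator.fit({"training": "s3://my-training-bucket/byos/train/"})
```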
### Fine-Tuning Libraries and Tools
[torchtune](https://github.com/pytorch/torchtune)
[transformers](https://github.com/huggingface/transformers) | [guide](https://www.philschmid.de/sagemaker-train-deploy-llama3)
[autotrain-advanced](https://github.com/huggingface/autotrain-advanced/tree/main)
[meta/llama-recipes](https://github.com/meta-llama/llama-recipes/tree/main)
[TRL - Transformer Reinforcement Learning](https://huggingface.co/docs/trl/main/en/index) (see the sketch after this list)
[Weights & Biases (W&B)](https://wandb.ai/site)
---
## Deploying Models
### Deploying and Invoking Models with Bedrock Imported Models
:::info
- Billing is on-demand; you must first submit a quota increase request for `Imported models per account`. See the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html) for details.
- Note: for the models supported by Bedrock Imported Models, see the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html)
- [Pricing](https://aws.amazon.com/bedrock/pricing/)
:::
:::success
AWS blog: [Amazon Bedrock Custom Model Import now generally available](https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-custom-model-import-now-generally-available/)
Updated: 2024-10-21
:::
You can download model artifacts from Hugging Face into S3 and then deploy and invoke the model through Bedrock Imported Models.

Alternatively, you can deploy and invoke a model trained with SageMaker JumpStart through Bedrock; you only need to locate where its model artifacts are stored (as shown in the figure below).

A Bedrock Imported Models job takes roughly 25 minutes to deploy a Llama 3 70B model.

After submitting the import job, if you choose to create a new IAM role, you must stay on the page and wait for the resource to be created.

After the import job completes, <font color="#f00">you still need to wait a few minutes for the model to deploy</font> before you can use it in Playgrounds; otherwise you will see the red warning box shown below.
 
Currently you <font color="#f00">cannot</font> further fine-tune a Bedrock imported model, nor deploy it with Provisioned Throughput.
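For reference, a rough sketch of submitting an import job and then invoking the imported model with boto3 is shown below. The job name, model name, role ARN, S3 URI, model ARN, and request body schema are assumptions to verify against the Bedrock documentation for your model family.
```python
# Minimal sketch: import model artifacts from S3, then invoke the imported model.
# Names, ARNs, S3 URIs, and the request body schema are illustrative assumptions.
import json
import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

job = bedrock.create_model_import_job(
    jobName="llama3-70b-import-demo",
    importedModelName="llama3-70b-imported",
    roleArn="arn:aws:iam::111122223333:role/BedrockImportRole",  # placeholder IAM role
    modelDataSource={"s3DataSource": {"s3Uri": "s3://my-training-bucket/llama3-70b/artifacts/"}},
)
print(job["jobArn"])

# Once the job finishes and the model has deployed, invoke it by its model ARN
runtime = boto3.client("bedrock-runtime", region_name="us-west-2")
response = runtime.invoke_model(
    modelId="arn:aws:bedrock:us-west-2:111122223333:imported-model/abc123",  # placeholder ARN
    body=json.dumps({"prompt": "What is FMOps?", "max_gen_len": 256}),  # body follows the model's native format
)
print(json.loads(response["body"].read()))
```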

---
## Evaluating Model Outputs
:::success
AWS samples: [Large Language model and prompt evaluation](https://github.com/aws-samples/build-an-automated-large-language-model-evaluation-pipeline-on-aws)
:::
GitHub: [aws/fmeval](https://github.com/aws/fmeval)
GitHub: [Evaluate Bedrock Imported Models](https://github.com/aws-samples/amazon-bedrock-samples/blob/main/custom-models/import_models/fmeval_imported_models.ipynb)
---
## Inference Performance Testing
[GitHub AWS samples](https://github.com/aws-samples/amazon-bedrock-samples/blob/main/custom-models/import_models/perf-test-imported_model.ipynb)
## Automating the Fine-Tuning and Deployment Pipeline
:::success
AWS blog: [Automate fine-tuning of Llama 3.x models with the new visual designer for Amazon SageMaker Pipelines](https://aws.amazon.com/blogs/machine-learning/automate-fine-tuning-of-llama-3-x-models-with-the-new-visual-designer-for-amazon-sagemaker-pipelines/)
Updated: 2024-10-22
:::
---
## More Learning Resources
AWS samples: [Fine-tune Foundation Models on Amazon SageMaker using @remote decorator](https://github.com/aws-samples/amazon-sagemaker-llm-fine-tuning-remote-decorator/tree/main)
[Meta Fine-tune](https://www.llama.com/docs/how-to-guides/Fine-tuning)
[Meta Quantization](https://www.llama.com/docs/how-to-guides/quantization)
YouTube: [Chat Fine tuning](https://www.youtube.com/watch?v=71x8EMrB0Gc)
Medium: [LLM domain adaptation using continued pre-training](https://medium.com/@aris.tsakpinis/llm-domain-adaptation-using-continued-pre-training-part-4-4-e4fc3acffac7)
[Transformers API](https://github.com/huggingface/transformers)