# FMOps: Large Language Model Training and Deployment
How to choose the right AWS tool for model training and deployment:
Using Llama 3 70B as the example model (more ✓ = easier to operate; ✗ = not applicable):
| Service | Host | Train (Full) | Train (Adapter) |
|-----------|------|-------|-------|
| Bedrock | ✓✓✓✓ <br>(limited model support) | ✗ | ✗ |
| SageMaker JumpStart | ✓✓✓ | ✗ | ✓✓✓ <br>(many supported models) |
| SageMaker Training | ✗ | ✓✓ <br>(customizable) | ✓✓ <br>(customizable) |
| SageMaker Hosting | ✓✓ | ✗ | ✗ |
| SageMaker HyperPod | ✓ | ✓ | ✓ |
| EC2 instances | ✓ | ✓ | ✓ |
Notes:
- The models available for fine-tuning differ by region. Nova can only be fine-tuned in us-east-1, while us-west-2 supports fine-tuning of the Llama 3.1 8B/70B Instruct models.
- Models fine-tuned through Bedrock must be deployed with Provisioned Throughput (PT).
---
## Training Models on AWS
:::success
AWS blog: [Generative AI foundation model training on Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/generative-ai-foundation-model-training-on-amazon-sagemaker/)
Updated: 2024-10-22
:::
### Data Formats for Training Text Models with Bedrock
:::info
See the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html)
:::
#### Unsupervised training: Domain adaptation fine-tuning
Only the following format is supported:
```python
# .jsonl
{"input": "<input text>"}
{"input": "<input text>"}
{"input": "<input text>"}
```
#### Supervised training: Instruction tuning format
```python
# .jsonl
{"prompt": "<prompt1>", "completion": "<expected generated text>"}
{"prompt": "<prompt2>", "completion": "<expected generated text>"}
```
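As a quick illustration, a short script like the one below can write these records as a `.jsonl` file and upload it to S3 for a customization job. This is only a sketch; the bucket name, key prefix, and sample records are placeholders, not part of the original note.
```python
# Minimal sketch: write prompt/completion records as JSON Lines and upload to S3.
# Bucket name, key prefix, and records are illustrative placeholders.
import json
import boto3

records = [
    {"prompt": "Summarize the following text: ...", "completion": "..."},
    {"prompt": "Translate the following sentence to French: ...", "completion": "..."},
]

# One JSON object per line, as Bedrock expects for .jsonl training data
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Upload the file so the customization job can reference it by S3 URI
s3 = boto3.client("s3")
s3.upload_file("train.jsonl", "my-training-bucket", "bedrock/train/train.jsonl")
```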
### Training Models with Bedrock
:::info
Only a small set of models is supported; see the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/custom-model-supported.html) for the list.
:::
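The fine-tuning job itself can be submitted with the `bedrock` boto3 client. The sketch below is hedged: the job name, custom model name, role ARN, base model identifier, S3 URIs, and hyperparameters are all assumptions to verify against the supported-models page and your own account.
```python
# Minimal sketch: submit a Bedrock fine-tuning (model customization) job.
# All names, ARNs, S3 URIs, and hyperparameters are illustrative assumptions.
import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

response = bedrock.create_model_customization_job(
    jobName="llama3-finetune-demo",                            # assumed job name
    customModelName="llama3-70b-custom",                       # assumed custom model name
    roleArn="arn:aws:iam::111122223333:role/BedrockFtRole",    # placeholder IAM role
    baseModelIdentifier="meta.llama3-1-70b-instruct-v1:0",     # check the supported-models page
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://my-training-bucket/bedrock/train/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-training-bucket/bedrock/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
)
print(response["jobArn"])
```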
### Data Formats for Training Text Models with SageMaker
#### Unsupervised training: Domain adaptation fine-tuning
```text
# .jsonl
{"text": "<input text>"}
{"text": "<input text>"}
{"text": "<input text>"}
```
#### Chat completion fine-tuning format
[Sample data](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot/viewer/default/train?p=1&row=101)
```text
[
{"role": "user", "content": "{Human conversation}"},
{"role": "assistant", "content": "{AI conversation}"}
]
```
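A rough sketch of converting the linked Hugging Face dialogue dataset into this role/content message format is shown below. The column names `Patient` and `Doctor` are assumptions based on the sample data, and some training flows wrap the message list in an extra key; verify both against the actual schema before use.
```python
# Minimal sketch: turn a dialogue dataset into role/content message lists, one per line.
# Column names "Patient" and "Doctor" are assumptions; adjust to the dataset's schema.
import json
from datasets import load_dataset

dataset = load_dataset("ruslanmv/ai-medical-chatbot", split="train")

with open("chat_train.jsonl", "w", encoding="utf-8") as f:
    for row in dataset:
        messages = [
            {"role": "user", "content": row["Patient"]},
            {"role": "assistant", "content": row["Doctor"]},
        ]
        # Some trainers expect a wrapper key (e.g. "dialog"); check the target format.
        f.write(json.dumps(messages, ensure_ascii=False) + "\n")
```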
#### Supervised training: Instruction tuning format
The training data must be in JSON Lines (.jsonl) format, where each line is a JSON object representing a single sample. All training data must live in a single folder, although it may be split across multiple .jsonl files; the .jsonl extension is required. The training folder may also contain a template.json file that describes the input and output formats.
If no template file is provided, the following default template using `prompt` and `completion` is applied:
```text
# default
{
"prompt": "{prompt}",
"completion": "{completion}"
}
```
You can customize the names of the dynamically substituted parameters in the prompt, such as `instruction`, `context`, and `response`:
```python!
# Prepare template.json
{
"prompt": "Below is an instruction that describes a task."
"Write a response that appropriately completes the request.\n\n"
"### Instruction:\n{instruction}\n\n### Response:\n",
"completion": "{output}"
}
# or
{
"prompt": "Below is an instruction that describes a task, paired with an input that provides further context. "
"Write a response that appropriately completes the request.\n\n"
"### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n### Response:\n",
"completion": " {response}",
}
# Prepare train.jsonl
{
"instruction": "Who painted the Two Monkeys",
"context": "Two Monkeys or Two Chained Monkeys is a 1562 painting by Dutch and Flemish Renaissance artist Pieter Bruegel the Elder. The work is now in the Gemäldegalerie (Painting Gallery) of the Berlin State Museums.",
"response": "The two Monkeys or Two Chained Monkeys is a 1562 painting by Dutch and Flemish Renaissance artist Pieter Bruegel the Elder. The work is now in the Gemaeldegalerie (Painting Gallery) of the Berlin State Museums."
}
```
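To tie the pieces together, the sketch below writes template.json and train.jsonl into the same folder (as required above) and uploads both under one S3 prefix. The bucket name, prefix, and sample record are placeholders.
```python
# Minimal sketch: put template.json and train.jsonl in one folder and upload it to S3.
# Bucket, prefix, and the sample record are illustrative placeholders.
import json
import pathlib
import boto3

train_dir = pathlib.Path("finetune_data")
train_dir.mkdir(exist_ok=True)

template = {
    "prompt": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n### Response:\n"
    ),
    "completion": " {response}",
}
(train_dir / "template.json").write_text(json.dumps(template), encoding="utf-8")

sample = {
    "instruction": "Who painted the Two Monkeys",
    "context": "Two Monkeys or Two Chained Monkeys is a 1562 painting by Pieter Bruegel the Elder.",
    "response": "It was painted by Pieter Bruegel the Elder in 1562.",
}
with (train_dir / "train.jsonl").open("w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# Upload both files so they share the same S3 prefix
s3 = boto3.client("s3")
for path in train_dir.iterdir():
    s3.upload_file(str(path), "my-training-bucket", f"jumpstart/train/{path.name}")
```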
### Training Llama Models with SageMaker JumpStart
:::info
Many models are supported; training mainly uses LoRA/QLoRA.
:::
:::success
AWS blog: [Use Llama 3.1 405B for synthetic data generation and distillation to fine-tune smaller models](https://aws.amazon.com/blogs/machine-learning/use-llama-3-1-405b-to-generate-synthetic-data-for-fine-tuning-tasks/)
Updated: 2024-07-23
AWS blog: [Fine-tune Meta Llama 3.1 models for generative AI inference using Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/fine-tune-meta-llama-3-1-models-for-generative-ai-inference-using-amazon-sagemaker-jumpstart/)
Updated: 2024-08-21
AWS blog: [Fine-tune Llama 3 for text generation on Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/fine-tune-llama-3-for-text-generation-on-amazon-sagemaker-jumpstart/)
Updated: 2024-09-06
:::
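A rough sketch of launching a JumpStart fine-tuning job with the SageMaker Python SDK is shown below. The `model_id`, instance type, hyperparameters, and S3 path are assumptions to adapt to your own setup; the estimator points at the S3 prefix that holds train.jsonl (and template.json, if any).
```python
# Minimal sketch using the SageMaker Python SDK's JumpStartEstimator.
# model_id, instance type, hyperparameters, and S3 paths are illustrative assumptions.
from sagemaker.jumpstart.estimator import JumpStartEstimator

estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-3-70b",   # verify the exact JumpStart model id
    environment={"accept_eula": "true"},          # Llama models require accepting the EULA
    instance_type="ml.p4d.24xlarge",
    hyperparameters={
        "instruction_tuned": "True",
        "epoch": "1",
    },
)

# The "training" channel points at the folder containing the .jsonl data
estimator.fit({"training": "s3://my-training-bucket/jumpstart/train/"})
```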

### Training Models with SageMaker Bring Your Own Script (BYOS)
[AWS GitHub BYOS](https://github.com/aws/amazon-sagemaker-examples/tree/default/generative_ai) | [script for encoder-decoder](https://github.com/aws/amazon-sagemaker-examples/blob/default/generative_ai/sm-finetuning_huggingface_with_your_own_scripts_and_data/source/train.py) | [script for mistral](https://github.com/aws/amazon-sagemaker-examples/blob/default/generative_ai/sm-mixtral_8x7b_fine_tune_and_deploy/scripts/run_clm.py)
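A hedged BYOS sketch with the SageMaker HuggingFace estimator follows; the entry point must be your own training script, and the framework versions, instance type, and hyperparameters are assumptions that should be checked against the available Deep Learning Containers.
```python
# Minimal BYOS sketch with the SageMaker HuggingFace estimator.
# entry_point/source_dir must contain your own training script; versions,
# instance type, and hyperparameters below are assumptions to adjust.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()

estimator = HuggingFace(
    entry_point="run_clm.py",            # your own training script
    source_dir="scripts",
    role=role,
    instance_type="ml.g5.12xlarge",
    instance_count=1,
    transformers_version="4.36",         # check available DLC version combinations
    pytorch_version="2.1",
    py_version="py310",
    hyperparameters={
        "model_id": "meta-llama/Meta-Llama-3-8B",  # assumed smaller model for illustration
        "epochs": 1,
        "per_device_train_batch_size": 1,
    },
)

estimator.fit({"training": "s3://my-training-bucket/byos/train/"})
```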
### Fine-Tuning Libraries and Tools
[torchtune](https://github.com/pytorch/torchtune)
[transformers](https://github.com/huggingface/transformers) | [guide](https://www.philschmid.de/sagemaker-train-deploy-llama3)
[autotrain-advanced](https://github.com/huggingface/autotrain-advanced/tree/main)
[meta/llama-recipes](https://github.com/meta-llama/llama-recipes/tree/main)
[TRL - Transformer Reinforcement Learning](https://huggingface.co/docs/trl/main/en/index) (see the sketch after this list)
[Weights & Biases (W&B)](https://wandb.ai/site)
---
## Deploying Models
### Deploying and Invoking Models with Bedrock Imported Models
:::info
- Billing is on-demand; you must first submit a quota increase request for `Imported models per account`. See the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html) for details.
- Note: for the models supported by Bedrock Imported Models, see the [official documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-import-model.html)
- [Pricing](https://aws.amazon.com/bedrock/pricing/)
:::
:::success
AWS blog: [Amazon Bedrock Custom Model Import now generally available](https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-custom-model-import-now-generally-available/)
Updated: 2024-10-21
:::
You can download model artifacts from Hugging Face into S3 and then deploy and invoke the model through Bedrock Imported Models.

Alternatively, you can deploy and invoke a model trained with SageMaker JumpStart through Bedrock; you only need to locate where its model artifacts are stored (as shown in the figure below).

A Bedrock Imported Models job takes roughly 25 minutes to deploy a Llama 3 70B model.

After submitting the import job, if you choose to create a new IAM role, you must stay on the page and wait for the resource to be created.

After the import job completes, <font color="#f00">you still need to wait a few minutes for the model to deploy</font> before you can use it in Playgrounds; otherwise you will see the red warning box shown below.
 
Currently you <font color="#f00">cannot</font> further fine-tune a Bedrock imported model, nor deploy it with Provisioned Throughput.
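For reference, a rough sketch of submitting an import job and then invoking the imported model with boto3 is shown below. The job name, model name, role ARN, S3 URI, model ARN, and request body schema are assumptions to verify against the Bedrock documentation for your model family.
```python
# Minimal sketch: import model artifacts from S3, then invoke the imported model.
# Names, ARNs, S3 URIs, and the request body schema are illustrative assumptions.
import json
import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

job = bedrock.create_model_import_job(
    jobName="llama3-70b-import-demo",
    importedModelName="llama3-70b-imported",
    roleArn="arn:aws:iam::111122223333:role/BedrockImportRole",  # placeholder IAM role
    modelDataSource={"s3DataSource": {"s3Uri": "s3://my-training-bucket/llama3-70b/artifacts/"}},
)
print(job["jobArn"])

# Once the job finishes and the model has deployed, invoke it by its model ARN
runtime = boto3.client("bedrock-runtime", region_name="us-west-2")
response = runtime.invoke_model(
    modelId="arn:aws:bedrock:us-west-2:111122223333:imported-model/abc123",  # placeholder ARN
    body=json.dumps({"prompt": "What is FMOps?", "max_gen_len": 256}),  # body follows the model's native format
)
print(json.loads(response["body"].read()))
```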

---
## Evaluating Model Outputs
:::success
AWS samples: [Large Language model and prompt evaluation](https://github.com/aws-samples/build-an-automated-large-language-model-evaluation-pipeline-on-aws)
:::
GitHub: [aws/fmeval](https://github.com/aws/fmeval)
GitHub: [Evaluate Bedrock Imported Models](https://github.com/aws-samples/amazon-bedrock-samples/blob/main/custom-models/import_models/fmeval_imported_models.ipynb)
---
## Inference Performance Testing
[GitHub AWS samples](https://github.com/aws-samples/amazon-bedrock-samples/blob/main/custom-models/import_models/perf-test-imported_model.ipynb)
## Automating the Fine-Tuning and Deployment Pipeline
:::success
AWS blog: [Automate fine-tuning of Llama 3.x models with the new visual designer for Amazon SageMaker Pipelines](https://aws.amazon.com/blogs/machine-learning/automate-fine-tuning-of-llama-3-x-models-with-the-new-visual-designer-for-amazon-sagemaker-pipelines/)
Updated: 2024-10-22
:::
---
## More Learning Resources
AWS samples: [Fine-tune Foundation Models on Amazon SageMaker using @remote decorator](https://github.com/aws-samples/amazon-sagemaker-llm-fine-tuning-remote-decorator/tree/main)
[Meta Fine-tune](https://www.llama.com/docs/how-to-guides/Fine-tuning)
[Meta Quantization](https://www.llama.com/docs/how-to-guides/quantization)
YouTube: [Chat Fine tuning](https://www.youtube.com/watch?v=71x8EMrB0Gc)
Medium: [LLM domain adaptation using continued pre-training](https://medium.com/@aris.tsakpinis/llm-domain-adaptation-using-continued-pre-training-part-4-4-e4fc3acffac7)
[Transformers API](https://github.com/huggingface/transformers)