# MLOps World 2023 Workshop

**Title:** Learn Your Codebase: Fine-tuning CodeLlama with Flyte… to Learn Flyte

*This document provides a literal script to help guide a workshop runner*

```
source ~/venvs/flyte-llama/bin/activate
```

## Environment Variables

```
export PYTHONPATH=$(pwd):$PYTHONPATH
export FLYTECTL_CONFIG=~/.uctl/config-demo.yaml
```

## Orientation

Show the GitHub repo in the browser: https://github.com/unionai-oss/llm-fine-tuning

In the `llm-fine-tuning/flyte_llama` directory of the repo, show the directory tree:

```
tree . | grep -v '.pyc'
```

## Dataset Creation

Walk through `dataset.py`:

- Repo URLs, include files, extensions
- `create_dataset`
- `iter_github_documents`

Create the dataset locally:

```
python flyte_llama/dataset.py --output-path ~/datasets/flyte_llama
```

Get the file count per repo:

```
find ~/datasets/flyte_llama -type f | cut -d/ -f6 | uniq -c | sort | grep -v 'metadata'
```

Get the file extension count:

```
find ~/datasets/flyte_llama -type f -name '*.*' | sed 's|.*\.||' | sort | uniq -c | sort
```

It's often useful to see how the data is loaded when the model is trained on it. Walk through `dataloader.py`:

- `get_dataset`

```
ipython
```

In the REPL:

```python
from flyte_llama.dataloader import get_dataset
from pathlib import Path

dataset = get_dataset(Path.home() / "datasets" / "flyte_llama")
print(dataset[0]["text"])
```

Let's tokenize:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    model_max_length=1024,
    padding_side="left",
)
tokenizer(dataset[0]["text"])
```

## Fine-tuning

Walk through the `train.py` script:

- `train` function
- `TrainerConfig`
- `BitsAndBytesConfig`
- `LoraConfig` setup
- `TrainingArguments` and `Trainer`

Show the `config/flyte_llama_7b_local.json` file.

Walk through `workflows.py`:

- `train_workflow`
  - caching
  - image_spec
  - resources
  - secrets
  - environment vars
- `create_dataset`

```
pyflyte run flyte_llama/workflows.py train \
    --dataset ~/datasets \
    --config config/flyte_llama_7b_local.json
```

Now scale it up using remote:

```bash
pyflyte run --remote --copy-all \
    --project llm-fine-tuning \
    flyte_llama/workflows.py train_workflow \
    --config config/flyte_llama_7b_qlora_v0.json
```

Show the Union.ai dashboard:

- Execution view
- Inputs/outputs
- Graph view
- Timeline view
- Task-level monitoring
- Weights & Biases run

Publish the model. Go to the Union.ai execution: https://demo.hosted.unionai.cloud/console/projects/llm-fine-tuning/domains/development/executions/f6c0282eff52f4d28a8a

Get the `model_dir` file:

```
pyflyte run --remote --copy-all \
    flyte_llama/workflows.py publish_model_workflow \
    --config config/flyte_llama_7b_qlora_v0.json \
    --model_dir <s3_path>
```

## Model Serving

Export env vars:

```bash
eval $(sed 's/^/export /g' secrets.txt)
export VERSION=$(git rev-parse --short=7 HEAD)
export SERVING_SSE_IMAGE=ghcr.io/unionai-oss/modelz-flyte-llama-serving-sse:$VERSION
```

Walk through `server.py`:

- Change `model_path` to `"EleutherAI/pythia-70m-deduped-v0"` and `adapter_path` to `None`
- Add `max_gen_length=128` to the `ServerConfig`

Run the server locally:

```
python server.py --timeout 60000
```

Open a new tab:

```
source ~/venvs/flyte-llama/bin/activate
eval $(sed 's/^/export /g' secrets.txt)
```

```
curl --request POST --url 0.0.0.0:8000/inference --data "$(echo '"Flyte is a"' | json2msgpack)"
```
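For reference, the same request can be made from Python. This is a minimal sketch, assuming the `/inference` endpoint accepts a msgpack-encoded JSON string just like the `curl` call above; the content-type header is an assumption, not taken from `server.py`:

```python
# Hypothetical Python equivalent of the curl + json2msgpack call above.
import msgpack  # pip install msgpack
import requests

# Pack the prompt string as msgpack, mirroring: echo '"Flyte is a"' | json2msgpack
payload = msgpack.packb("Flyte is a")

response = requests.post(
    "http://0.0.0.0:8000/inference",
    data=payload,
    headers={"Content-Type": "application/msgpack"},  # assumed content type
)
print(response.content)
```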
Walk through `server_sse.py`

Build the image for deployment:

```
docker build . -f Dockerfile.server_sse -t $SERVING_SSE_IMAGE
docker push $SERVING_SSE_IMAGE
```

Deploy to modelz:

```
python deploy.py \
    --deployment-name flyte-llama-sse-$VERSION \
    --image $SERVING_SSE_IMAGE \
    --server-resource "nvidia-ada-l4-4-48c-192g" \
    --stream
```

Go to the modelz deployment: https://cloud.modelz.ai/

## Client CLI

Walk through `client_sse.py`

Run `client_sse.py`:

```
python client_sse.py \
    --prompt "The code snippet below shows a basic Flyte workflow" \
    --n-tokens 250 \
    --deployment-key <deployment_key>
```

Walk through the modelz deployment page: https://cloud.modelz.ai/deployments/detail/7b6b65ce-a02b-44b7-ae63-dd763d136291

While waiting for this to spin up, use an existing deployment key:

```
python client_sse.py \
    --prompt "The code snippet below shows a basic Flyte workflow" \
    --n-tokens 250 \
    --deployment-key flyte-llama-sse-0e90ccb-sknqkpilm0kyojog
```

Try the CLI with different inputs.
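If `client_sse.py` is unavailable during the demo, here is a minimal sketch of an SSE streaming client, assuming the deployment exposes a `text/event-stream` endpoint; the URL, request schema, and field names below are illustrative, not the actual `client_sse.py` or modelz API:

```python
# Minimal SSE streaming client sketch (hypothetical endpoint and payload).
import requests

def stream_completion(url: str, prompt: str, n_tokens: int = 250):
    """Yield generated text chunks from a server-sent events endpoint."""
    with requests.post(
        url,
        json={"prompt": prompt, "n_tokens": n_tokens},  # assumed request schema
        headers={"Accept": "text/event-stream"},
        stream=True,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            # SSE frames arrive as lines of the form "data: <chunk>"
            if line and line.startswith("data:"):
                yield line[len("data:"):].strip()

if __name__ == "__main__":
    for chunk in stream_completion(
        "https://<deployment-endpoint>/inference",  # hypothetical URL
        "The code snippet below shows a basic Flyte workflow",
    ):
        print(chunk, end="", flush=True)
```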