# MLOps World 2023 Workshop
**Title:** Learn Your Codebase: Fine-tuning CodeLlama with Flyte… to Learn Flyte
*This document provides a literal script to guide the workshop presenter.*
Activate the workshop virtual environment:
```
source ~/venvs/flyte-llama/bin/activate
```
## Environment Variables
```
export PYTHONPATH=$(pwd):$PYTHONPATH
export FLYTECTL_CONFIG=~/.uctl/config-demo.yaml
```
## Orientation
Show the GitHub repo in the browser:
https://github.com/unionai-oss/llm-fine-tuning
In the `llm-fine-tuning/flyte_llama` directory of the repo, show the directory contents:
```
tree . | grep -v '\.pyc'
```
## Dataset Creation
Walk through `dataset.py`
- Repo urls, include files, extensions
- `create_dataset`
- `iter_github_documents`
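The pieces above can be sketched as a minimal document iterator (the names and extension list here are illustrative; the real `iter_github_documents` in `dataset.py` also clones the repos from GitHub first):

```python
from pathlib import Path
from typing import Iterator

# Illustrative extension filter; dataset.py keeps its own include lists
EXTENSIONS = {".py", ".md", ".rst"}

def iter_documents(repo_root: Path) -> Iterator[dict]:
    """Yield one {"text", "file"} record per matching source file."""
    for path in sorted(repo_root.rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            yield {
                "text": path.read_text(errors="ignore"),
                "file": str(path.relative_to(repo_root)),
            }
```

`create_dataset` then writes each record out under the output path, one subdirectory per repo.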
Create the dataset locally:
```
python flyte_llama/dataset.py --output-path ~/datasets/flyte_llama
```
Get file count per repo:
```
find ~/datasets/flyte_llama -type f | cut -d/ -f6 | sort | uniq -c | sort -n | grep -v 'metadata'
```
Get file extension count:
```
find ~/datasets/flyte_llama -type f -name '*.*' | sed 's|.*\.||' | sort | uniq -c | sort
```
It's often useful to see how the data is loaded when the model is trained on it.
Walk through `dataloader.py`
- `get_dataset`
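Conceptually, `get_dataset` reads the files written in the previous step back into `{"text": ...}` records (a simplified stdlib-only sketch; the real loader in `dataloader.py` returns a Hugging Face dataset):

```python
from pathlib import Path

def load_records(dataset_dir: Path) -> list[dict]:
    """Read each dataset file into the {"text": ...} shape used below,
    skipping the metadata files written alongside the documents."""
    return [
        {"text": path.read_text(errors="ignore")}
        for path in sorted(dataset_dir.rglob("*"))
        if path.is_file() and "metadata" not in path.name
    ]
```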
```
ipython
```
In the REPL:
```python
from flyte_llama.dataloader import get_dataset
from pathlib import Path
dataset = get_dataset(Path.home() / "datasets" / "flyte_llama")
print(dataset[0]["text"])
```
Let's tokenize:
```python
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    model_max_length=1024,
    padding_side="left",
)
tokenizer(dataset[0]["text"])
```
## Fine-tuning
Walk through the `train.py` script
- `train` function
- `TrainerConfig`
- `BitsAndBytesConfig`
- `LoraConfig` setup
- `TrainingArguments` and `Trainer`
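Roughly, the quantization + LoRA setup covered above looks like this (the hyperparameters and target modules are illustrative, not the workshop's actual `TrainerConfig` values):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# Load the base model quantized to 4 bits (QLoRA) so it fits in GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small low-rank adapters; only these weights are trained
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a tiny fraction of the 7B weights

# TrainingArguments/Trainer then run the fine-tuning loop as in train.py
args = TrainingArguments(output_dir="./output", per_device_train_batch_size=1)
```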
Show the `config/flyte_llama_7b_local.json` file.
Walk through `workflows.py`:
- `train_workflow`
- caching
- image_spec
- resources
- secrets
- environment vars
- `create_dataset`
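The decorator arguments called out above might look roughly like this (a sketch with made-up resource sizes, secret names, and image contents, not the repo's actual values):

```python
from flytekit import Resources, Secret, task
from flytekit.image_spec import ImageSpec

# Container image built declaratively; package list is illustrative
image = ImageSpec(name="flyte-llama", packages=["transformers", "peft", "torch"])

@task(
    container_image=image,
    cache=True,
    cache_version="0.1",                      # bump to invalidate cached outputs
    requests=Resources(mem="64Gi", gpu="1"),  # sized for a 7B QLoRA run
    secret_requests=[Secret(group="huggingface", key="hf_token")],
    environment={"WANDB_PROJECT": "flyte-llama"},
)
def train(dataset_dir: str, config_path: str) -> str:
    ...  # fine-tuning logic from train.py
```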
```
pyflyte run flyte_llama/workflows.py train \
    --dataset ~/datasets/flyte_llama \
--config config/flyte_llama_7b_local.json
```
Now scale it up by running remotely:
```bash
pyflyte run --remote --copy-all \
--project llm-fine-tuning \
flyte_llama/workflows.py train_workflow \
--config config/flyte_llama_7b_qlora_v0.json
```
Show Union.ai dashboard:
- Execution view
- Inputs/outputs
- Graph view
- Timeline view
- Task-level monitoring
- Weights and biases run
Publish the model:
Go to the Union.ai execution: https://demo.hosted.unionai.cloud/console/projects/llm-fine-tuning/domains/development/executions/f6c0282eff52f4d28a8a
Copy the `model_dir` output URI (used as `<s3_path>` below):
```
pyflyte run --remote --copy-all \
flyte_llama/workflows.py publish_model_workflow \
--config config/flyte_llama_7b_qlora_v0.json \
--model_dir <s3_path>
```
## Model Serving
Export the environment variables:
```bash
eval $(sed 's/^/export /g' secrets.txt)
export VERSION=$(git rev-parse --short=7 HEAD)
export SERVING_SSE_IMAGE=ghcr.io/unionai-oss/modelz-flyte-llama-serving-sse:$VERSION
```
Walk through `server.py`
Change `model_path` to `"EleutherAI/pythia-70m-deduped-v0"` and `adapter_path` to `None`.
Add `max_gen_length=128` to the `ServerConfig`.
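For orientation, the fields being edited above might look something like this (a hypothetical sketch, not the actual definition in `server.py`):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServerConfig:
    # Small base model so the demo runs locally without a GPU
    model_path: str = "EleutherAI/pythia-70m-deduped-v0"
    # No LoRA adapter when smoke-testing locally
    adapter_path: Optional[str] = None
    # Cap on generated tokens per request
    max_gen_length: int = 128
```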
Run the server locally:
```
python server.py --timeout 60000
```
Open a new terminal tab:
```
source ~/venvs/flyte-llama/bin/activate
eval $(sed 's/^/export /g' secrets.txt)
```
```
curl --request POST --url 0.0.0.0:8000/inference --data "$(echo '"Flyte is a"'| json2msgpack)"
```
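What `json2msgpack` is doing here: the request body is the MessagePack encoding of the JSON string `"Flyte is a"`. For strings under 32 bytes that's just a one-byte `fixstr` header followed by the UTF-8 bytes, which we can reproduce in plain Python:

```python
def msgpack_fixstr(s: str) -> bytes:
    """MessagePack-encode a short string: 0xa0 | length, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    assert len(data) < 32, "fixstr only covers strings shorter than 32 bytes"
    return bytes([0xA0 | len(data)]) + data

print(msgpack_fixstr("Flyte is a").hex())  # aa466c7974652069732061
```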
Walk through `server_sse.py`
Build the image for deployment:
```
docker build . -f Dockerfile.server_sse -t $SERVING_SSE_IMAGE
docker push $SERVING_SSE_IMAGE
```
Deploy to modelz:
```
python deploy.py \
--deployment-name flyte-llama-sse-$VERSION \
--image $SERVING_SSE_IMAGE \
--server-resource "nvidia-ada-l4-4-48c-192g" \
--stream
```
Go to modelz deployment: https://cloud.modelz.ai/
## Client CLI
Walk through `client_sse.py`
Run `client_sse.py`
```
python client_sse.py \
--prompt "The code snippet below shows a basic Flyte workflow" \
--n-tokens 250 \
--deployment-key <deployment_key>
```
Walk through the modelz deployment page: https://cloud.modelz.ai/deployments/detail/7b6b65ce-a02b-44b7-ae63-dd763d136291
While waiting for the new deployment to spin up, use an existing deployment key:
```
python client_sse.py \
--prompt "The code snippet below shows a basic Flyte workflow" \
--n-tokens 250 \
--deployment-key flyte-llama-sse-0e90ccb-sknqkpilm0kyojog
```
Try the CLI with different inputs.