## Motivation

- Users need the ability to fine-tune models on their own hardware (part of `Create your own Assistant`).
- Users need a seamless experience for fine-tuning models (not only LLMs, but also embedding models for retrieval, function calling, etc.) that can be used right away within Jan.
- Users can use this component via the Studio extension across the Jan suite (Jan desktop app, Jan server).
- Jan Studio should expose an API similar to the OpenAI fine-tuning API, but it also needs to offer users more fine-tuning methods, as well as conversion to model formats that Jan can support in #783.
- The targeted users:
  - Normies: just click 1 button; everything is filled in by default.
  - Power users (cc @hahuyhoang411 - pls add).
- We have to focus on a very neat UX/UI that is better than existing Gradio-based solutions for power users, while still allowing power users to extend what they need.

## Specs

- @hiro-v thinks we need to develop Studio decoupled from the Jan app and abstract away the `training engine` and `api` in a Python runtime, in order to reuse the existing Python ecosystem.
- **High level architecture** (green exists today, yellow is to be developed)
  ![Untitled Diagram-Page-17](https://github.com/janhq/jan/assets/22463238/24a7e698-05f4-4994-bd8f-1a9395b3bc7e)
- The Jan suite interacts with the Studio component through the Jan Studio extension and an OpenAI-compatible API.
- Jan Studio runs in a Docker environment on Python 3.9 and includes one webserver and one scheduler for background tasks.
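The `training engine` abstraction mentioned above could be sketched roughly as below. This is a minimal stdlib-only sketch: the names (`FineTuneConfig`, `fine_tune`, `select_engine`, `CpuEngine`) are illustrative assumptions, not a final API, and the real backends would call into actual training code.

```python
# Sketch of an abstract training engine: each hardware backend (CPU,
# NVIDIA GPU, Apple MLX) implements the same interface so the webserver
# and scheduler never touch framework-specific code. All names here are
# hypothetical placeholders.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FineTuneConfig:
    model: str                            # base model id
    training_file: str                    # path to a .jsonl file
    validation_file: Optional[str] = None
    hyperparameters: dict = field(default_factory=dict)


class TrainingEngine(ABC):
    """Hardware-specific backends subclass this."""

    @abstractmethod
    def fine_tune(self, config: FineTuneConfig) -> str:
        """Run the job and return the path of the produced artifact."""


class CpuEngine(TrainingEngine):
    def fine_tune(self, config: FineTuneConfig) -> str:
        # A real implementation would invoke the actual training loop here;
        # this stub only shows the contract.
        return f"artifacts/{config.model}.bin"


def select_engine(backend: str) -> TrainingEngine:
    # Hypothetical registry; a real one might probe CUDA/MLX availability.
    engines = {"cpu": CpuEngine}
    return engines[backend]()
```

With this shape, the scheduler only ever calls `select_engine(...).fine_tune(config)`, and adding an MLX or unsloth backend is a new subclass rather than a change to the server.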
- The training engines are abstracted behind a `trainingEngine` class, which lets users fine-tune on CPU, NVIDIA GPU, or Apple MLX.
- **Studio comes with the following API:**
  - **Create fine-tuning job**
    - **POST** `http://studio.jan.ai/v1/fine_tuning/jobs`
    - Input
      ```
      model: string - required
      training_file: file upload - jsonl (required)
      validation: [file upload (file), GSM8K (choose), TruthfulQA (choose)]
      hyperparameters: object
      ```
    - Output (`ft_job`)
      ```json
      {
        "object": "fine_tuning.job",
        "id": "ftjob-abc123",
        "model": "gpt-3.5-turbo-0613",
        "created_at": 1614807352,
        "fine_tuned_model": null,
        "organization_id": "org-123",
        "result_files": [],
        "status": "queued",
        "validation_file": null,
        "training_file": "file-abc123"
      }
      ```
  - **List fine-tuning jobs**
    - **GET** `http://studio.jan.ai/v1/fine_tuning/jobs` -> `LIST[ft_job]`
  - **Retrieve fine-tuning job**
    - **GET** `http://studio.jan.ai/v1/fine_tuning/jobs/<:ft_id>` -> `ft_job`
  - **Cancel fine-tuning**
    - **POST** `http://studio.jan.ai/v1/fine_tuning/jobs/<:id>/cancel` -> `ft_job` (status: cancelled)
- Jan Studio will store data the same way - on your local FS:
  ```
  jan/
    assistants/
    models/
    extensions/
    logs/
    settings/
    threads/
    studio/
      jobs/
        logs.jsonl
        metrics.jsonl
        metadata.json
      files/
        training.jsonl
        validation.jsonl
        files.abc
      artifacts/
        model.bin
        model_2.bin
        model.fp6.gguf
        model.Q5.gguf
  ```

## Designs

- Mock up
  ![CleanShot 2024-02-03 at 16 18 02@2x](https://github.com/janhq/jan/assets/22463238/ce37562f-e6ba-4fb1-8416-244454107794)
  ![CleanShot 2024-02-03 at 16 18 16@2x](https://github.com/janhq/jan/assets/22463238/0b9ba77b-73ab-4690-bf69-91b0c2ff7d02)
- [Figma](link)

## Tasklist

- [ ] Discussion to address possible technical blockers to finalize the specs - @janhq/engineers
  - [ ] App Pod
  - [ ] Foundry team: you should consider this as the thing we use every day to automate our work
- [ ] Jan Studio in Python
  - [ ] Webserver (FastAPI) + scheduler (Celery) - @hiro-v
  - [ ] TrainingEngine abstraction
  - [ ] TrainingEngine -> unsloth
  - [ ] TrainingEngine -> `similar to unsloth but using MLX` (optional)
  - [ ] Converter job
    - [ ] transformer-based -> GGUF - @hiro-v
- [ ] Jan Studio extension - @louis-jan

## Not in Scope

- A general MLOps platform to fine-tune/train anything. We focus on LLM and embedding models first (i.e. NLP-based).

## Appendix

- Reference for the OpenAI API: https://platform.openai.com/docs/api-reference/fine-tuning/create
- Workflow the Jan Foundry team has today in CI/manual steps to generate data and then fine-tune models
  ![new_image](https://github.com/janhq/jan/assets/22463238/c4457017-8837-497d-a27a-f380368e57b1)
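For reference, the `ft_job` lifecycle implied by the endpoints above (jobs start as `queued` on create, and `POST .../cancel` returns the job with status `cancelled`) can be sketched as a small data model. Field names follow the example response in the spec; the `FineTuningJob` class and its `cancel()` helper are hypothetical illustrations, not an agreed design.

```python
# Sketch of the ft_job resource returned by the fine-tuning endpoints.
# Field names mirror the example response in the spec; the class itself
# and its cancel() method are hypothetical.
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FineTuningJob:
    id: str
    model: str
    object: str = "fine_tuning.job"
    created_at: int = field(default_factory=lambda: int(time.time()))
    status: str = "queued"  # queued -> running -> succeeded / cancelled
    fine_tuned_model: Optional[str] = None
    training_file: Optional[str] = None
    validation_file: Optional[str] = None
    result_files: List[str] = field(default_factory=list)

    def cancel(self) -> "FineTuningJob":
        # Mirrors POST /v1/fine_tuning/jobs/<:id>/cancel -> ft_job (status: cancelled);
        # finished jobs are left untouched.
        if self.status in ("queued", "running"):
            self.status = "cancelled"
        return self
```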