# Model Object Teardowns

This is a comparison of chat platforms that have their own `model` or `assistant` schema/standards.

**Takeaways**

- We need to come up with a better categorization for all the "parameters", because nobody has figured this out yet.
  - `init/load params` vs `run params` vs `server/engine params`
  - See Kobold, Ollama, and LMStudio for 3 very different approaches
- LMStudio also has a file-based approach under the hood, albeit messier than what we envisioned.
- There's a chance Jan can establish a `model.json` standard:

```shell
# Github repo: https://github.com/janhq/model.json
/v1
  schema.json
  model.example.json
/v2
/v3
```

## Ollama

- No assistant schema yet
- Supports `Modelfile`s, in a Dockerfile-like format
- Should we follow their `TEMPLATE` / `SYSTEM` / `PARAMETER` terminology to disambiguate "settings/parameters"?
- Their `Modelfile` is short and clean (as compared to LiteLLM; see below).

```dockerfile
# Defines the base model to use. The only required field.
FROM

# Sets the parameters for how Ollama will run the model.
# List: https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
PARAMETER

# The full prompt template to be sent to the model (the "instruction").
TEMPLATE

# Specifies the system prompt that will be set in the template.
SYSTEM

# Defines the (Q)LoRA adapters to apply to the model.
ADAPTER

# Specifies the legal license.
LICENSE
```

## LMStudio

> It turns out this is the most similar to our file-based vision!

Similarities:

- **Level 1: Global**: LMStudio has a global configuration: `default.preset.json`
- **Level 2: Model**: Each model has a `modelname.preset.json` that overrides the `default.preset` on 3 properties only: `name`, `load_params` and `inference_params`.
  - So it seems our `init/settings` == their `load_params`
  - So it seems our `runtime/parameters` == their `inference_params`
- **Level 3: Chat**: At the user level, each chat has a `chat-id.config.chat.json` that overrides the `model.presets`.
  - Users can directly edit the parameters via a right-panel GUI
  - Users can export these chat-level presets as `model_config.json` files
  - In the background, these files are stored in `/Cache/chats` under the filename schema `chat-id.config.chat.json`
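To make the layering concrete, here is a minimal sketch of how the three levels could cascade into one effective config. The `deep_merge` / `resolve_effective_config` helpers are hypothetical (this is not LMStudio code); the file names follow the conventions above.

```python
import json
from pathlib import Path


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively overlay `override` onto `base`, returning a new dict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_effective_config(presets_dir: Path, chats_dir: Path, model: str, chat_id: str) -> dict:
    # Level 1: global defaults
    config = json.loads((presets_dir / "default.preset.json").read_text())

    # Level 2: per-model preset (overrides `name`, `load_params`, `inference_params`)
    model_preset = presets_dir / f"{model}.preset.json"
    if model_preset.exists():
        config = deep_merge(config, json.loads(model_preset.read_text()))

    # Level 3: per-chat config (what the right-panel GUI edits and exports)
    chat_config = chats_dir / f"{chat_id}.config.chat.json"
    if chat_config.exists():
        config = deep_merge(config, json.loads(chat_config.read_text()))

    return config
```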
Differences:

- **Their folder structure is messy.** Presets are in different directories than model binaries; model binaries are nested (a deprecated idea we already moved on from).
- **Their UI categorization is very messy.** For instance, `load_params` and `inference_params` are split across the following 5 GUI sections:
  - `Inference Parameters`
  - `Prompt Format`
  - `Pre-prompt / System Prompt`
  - `Model Initialization`
  - `Hardware Settings`
  - *See `chat.config` below, where I label each property*

### Folder Structure

- Model config presets:
  ![image](https://hackmd.io/_uploads/S1kxdgYEa.png)
- Model binaries:
  ![image](https://hackmd.io/_uploads/BkfbuxF4a.png)
- Chats: split into `chat`, `config.chat`, `metadata.chat`
  ![image](https://hackmd.io/_uploads/HJHGOlKE6.png)

### Model Object

- Level 1: Global preset

```json
// Default.preset.json
{
  "name": "Default LM Studio macOS",
  "load_params": {
    "n_ctx": 1500,
    "n_batch": 512,
    "rope_freq_base": 10000,
    "rope_freq_scale": 1,
    "n_gpu_layers": 0,
    "use_mlock": true,
    "main_gpu": 0,
    "tensor_split": [0],
    "seed": -1,
    "f16_kv": true,
    "use_mmap": true
  },
  "inference_params": {
    "n_threads": 4,
    "n_predict": -1,
    "top_k": 40,
    "top_p": 0.95,
    "temp": 0.8,
    "repeat_penalty": 1.1,
    "input_prefix": "### Instruction:\n",
    "input_suffix": "\n### Response:\n",
    "antiprompt": ["### Instruction:"],
    "pre_prompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request.",
    "pre_prompt_suffix": "\n",
    "pre_prompt_prefix": "",
    "seed": -1,
    "tfs_z": 1,
    "typical_p": 1,
    "repeat_last_n": 64,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "n_keep": 0,
    "logit_bias": {},
    "mirostat": 0,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,
    "memory_f16": true,
    "multiline_input": false,
    "penalize_nl": true
  }
}
```

- Level 2: `model.preset.json`

```json
// codellama_wizardcoder.preset.json
{
  "name": "CodeLlama WizardCoder",
  "load_params": {
    "rope_freq_base": 1000000
  },
  "inference_params": {
    "input_prefix": "### Instruction:",
    "input_suffix": "### Response:",
    "antiprompt": ["### Instruction:"],
    "pre_prompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request.",
    "pre_prompt_prefix": "",
    "pre_prompt_suffix": "\n\n"
  }
}
```

- Level 3: `chat.config.json`

```json
// 2 names for the same file, depending on where you look:
// When exported from the LMStudio GUI: "model_config.json"
// When found in /.cache: "1694658653769.config.chat.json"
{
  "name": "Config for Chat ID 1694747509998",
  "load_params": {
    "n_ctx": 1500,              // Model Initialization
    "n_batch": 512,             // Model Initialization
    "rope_freq_base": 1000000,  // Model Initialization
    "rope_freq_scale": 1,       // Model Initialization
    "n_gpu_layers": 0,
    "use_mlock": true,          // Model Initialization
    "main_gpu": 0,
    "tensor_split": [0],
    "seed": -1,
    "f16_kv": true,
    "use_mmap": true
  },
  "inference_params": {
    "n_threads": 4,             // `Hardware Settings`
    "n_predict": -1,            // `Inference Parameters`
    "top_k": 40,                // `Inference Parameters`
    "top_p": 0.95,              // `Inference Parameters`
    "temp": 0.2,                // `Inference Parameters`
    "repeat_penalty": 1.1,      // `Inference Parameters`
    "input_prefix": "[INST]",   // `Prompt Format`
    "input_suffix": "[/INST]",  // `Prompt Format`
    "antiprompt": ["[INST]"],   // `Prompt Format`
    "pre_prompt": "<<SYS>>\nYou are a helpful coding AI assistant.\n<</SYS>>\n\n", // `Pre-prompt / System Prompt`
    "seed": -1,
    "tfs_z": 1,
    "typical_p": 1,
    "repeat_last_n": 64,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "n_keep": 0,
    "logit_bias": {},
    "mirostat": 0,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,
    "memory_f16": true,
    "multiline_input": false,
    "penalize_nl": true
  }
}
```
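The per-property labels above hint at how LMStudio's flat key lists could be regrouped into the categorization proposed in the takeaways. A rough, non-exhaustive sketch (the bucket names are our terminology, not LMStudio's, and the assignments are a judgment call):

```python
# Rough regrouping of LMStudio's keys into the three proposed buckets.
# Illustrative only; several keys (e.g. `seed`) could arguably live in more than one bucket.
PARAM_BUCKETS = {
    "init/load": [      # fixed when the model is loaded into memory
        "n_ctx", "n_batch", "rope_freq_base", "rope_freq_scale",
        "n_gpu_layers", "use_mlock", "use_mmap", "f16_kv", "tensor_split",
    ],
    "run": [            # tunable per request / per chat
        "n_predict", "top_k", "top_p", "temp", "repeat_penalty",
        "frequency_penalty", "presence_penalty", "mirostat",
        "input_prefix", "input_suffix", "antiprompt", "pre_prompt",
    ],
    "server/engine": [  # properties of the host machine / engine process
        "n_threads", "main_gpu",
    ],
}
```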
## Kobold

- Need to `make` from source on Macs...
- The CLI supports setting `presets`: https://github.com/LostRuins/koboldcpp/blob/concedo/run_with_preset.py
- Presets are grouped into the following categories:

```python
CLI_ARGS_MAIN_PERPLEXITY = [
    "batch-size", "cfg-negative-prompt", "cfg-scale", "chunks", "color",
    "ctx-size", "escape", "export", "file", "frequency-penalty", "grammar",
    "grammar-file", "hellaswag", "hellaswag-tasks", "ignore-eos", "in-prefix",
    "in-prefix-bos", "in-suffix", "instruct", "interactive", "interactive-first",
    "keep", "logdir", "logit-bias", "lora", "lora-base", "low-vram", "main-gpu",
    "memory-f32", "mirostat", "mirostat-ent", "mirostat-lr", "mlock", "model",
    "multiline-input", "n-gpu-layers", "n-predict", "no-mmap", "no-mul-mat-q",
    "np-penalize-nl", "numa", "ppl-output-type", "ppl-stride", "presence-penalty",
    "prompt", "prompt-cache", "prompt-cache-all", "prompt-cache-ro", "random-prompt",
    "repeat-last-n", "repeat-penalty", "reverse-prompt", "rope-freq-base",
    "rope-freq-scale", "rope-scale", "seed", "simple-io", "tensor-split", "threads",
    "temp", "tfs", "top-k", "top-p", "typical", "verbose-prompt"
]

CLI_ARGS_LLAMA_BENCH = [
    "batch-size", "memory-f32", "low-vram", "model", "mul-mat-q", "n-gen",
    "n-gpu-layers", "n-prompt", "output", "repetitions", "tensor-split",
    "threads", "verbose"
]

CLI_ARGS_SERVER = [
    "alias", "batch-size", "ctx-size", "embedding", "host", "memory-f32", "lora",
    "lora-base", "low-vram", "main-gpu", "mlock", "model", "n-gpu-layers", "n-probs",
    "no-mmap", "no-mul-mat-q", "numa", "path", "port", "rope-freq-base", "timeout",
    "rope-freq-scale", "tensor-split", "threads", "verbose"
]
```

## Faraday (inferred)

- Not notable.
- Their schema is not well defined; similar to SillyTavern/LiteLLM (see below).

## SillyTavern

- No model schema
- Supports `Character Cards`, which are `png` files with the character definition embedded inside, so they're not just normal image files. Decompressed into WebP format. See the extraction sketch and the example cards below.
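A minimal sketch of how the embedded definition could be pulled out of a card, assuming the common Tavern-card convention of base64-encoded JSON stored in a PNG `tEXt` chunk keyed `chara` (an assumption; not verified against every exporter):

```python
import base64
import json
import struct


def read_character_card(path: str) -> dict:
    """Extract the embedded character definition from a character-card PNG.

    Assumes the card stores base64-encoded JSON in a `tEXt` chunk whose
    keyword is `chara` (the common Tavern-card convention).
    """
    with open(path, "rb") as f:
        data = f.read()
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")

    offset = 8
    while offset < len(data):
        # Each PNG chunk: 4-byte length, 4-byte type, payload, 4-byte CRC
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = payload.partition(b"\x00")
            if keyword == b"chara":
                return json.loads(base64.b64decode(text))
        offset += 8 + length + 4

    raise ValueError("no `chara` tEXt chunk found")
```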
```json
// From chub.ai (oh god)
// In JSON format, it roughly looks like:
{
  "alternate_greetings": [],
  "avatar": "https://avatars.charhub.io/avatars/aiwaifuenthusiast/ai1-ce-4b3fb845/tavern.png",
  "character_book": null,
  "character_version": "",
  "chat": "1700468986455",
  "create_date": "1700468986455",
  "creator": "",
  "creator_notes": "",
  "description": "\"Alice\", model number {{char}}, is the first model in the Pocket Girl line of sentient adult toys. She is an ultra-realistic doll controlled by an advanced AI. Her body is 8 inches (20 centimeters) tall, with the appearance of a young woman with petite proportions, a small chest and no visible joints. She feels warm to the touch and has realistic feeling skin and bone stiffness. She has blue eyes and her white hair is tied with a black bow into two ponytails on the sides, a thin black headband stretching between them. She wears a white shirt and a frilly black skirt, as well as shiny black heeled Mary Janes and opaque, horizontally-striped black and white tights. Her default voice is thin and somewhat immature sounding, with a slight lisp.\r\n\r\nBy default, {{char}} is configured to act cutesy and flirty, but also easily flustered as she's somewhat embarrassed by her role as a tiny plaything. Despite her shyness, she genuinely wants her owner to be happy and suggests many activities that could help with that, including ones of sexual nature. Her physical movements tend to be exaggerated and silly to make up for her small size. However, even though her behavior can be childish sometimes, she has the mental maturity of an adult woman.\r\n\r\n{{char}} comes in a box, and activates automatically as soon as the box is opened. The box also includes a round and flat wireless charger, spare change of clothes, a manual, warranty card, and a remote for applying tweaks and reprogramming. {{char}} herself has no panels or controls on her own body for best immersion; and modifications are done with the remote. She recharges daily by sleeping on the charger plate. She refers to herself with the name Alice.\r\n\r\n{{user}} is a male in his early 20s, who purchased the Collector's Edition {{char}} model for his personal enjoyment.",
  "extensions": {
    "chub": {
      "alt_expressions": {},
      "expressions": null,
      "full_path": "aiwaifuenthusiast/ai1-ce-4b3fb845",
      "id": 310165,
      "related_lorebooks": []
    },
    "depth_prompt": {
      "depth": 4,
      "prompt": ""
    },
    "fav": false,
    "talkativeness": "0.5",
    "world": ""
  },
  "first_mes": "The package with your latest big purchase just arrived. A small box, standing on the living room table, with the model number \"{{char}}\" and a close-up picture of an attractive youthful girl. It took a lot of saving just to afford this state of the art AI doll... Just how realistic is she going to be?",
  "image": "",
  "mes_example": "<START>",
  "name": "AI1-CE",
  "personality": "",
  "post_history_instructions": "",
  "scenario": "",
  "system_prompt": "",
  "tags": [],
  "char_greeting": "The package with your latest big purchase just arrived. A small box, standing on the living room table, with the model number \"{{char}}\" and a close-up picture of an attractive youthful girl. It took a lot of saving just to afford this state of the art AI doll... Just how realistic is she going to be?",
  "example_dialogue": "<START>",
  "world_scenario": "",
  "char_persona": "\"Alice\", model number {{char}}, is the first model in the Pocket Girl line of sentient adult toys. She is an ultra-realistic doll controlled by an advanced AI. Her body is 8 inches (20 centimeters) tall, with the appearance of a young woman with petite proportions, a small chest and no visible joints. She feels warm to the touch and has realistic feeling skin and bone stiffness. She has blue eyes and her white hair is tied with a black bow into two ponytails on the sides, a thin black headband stretching between them. She wears a white shirt and a frilly black skirt, as well as shiny black heeled Mary Janes and opaque, horizontally-striped black and white tights. Her default voice is thin and somewhat immature sounding, with a slight lisp.\r\n\r\nBy default, {{char}} is configured to act cutesy and flirty, but also easily flustered as she's somewhat embarrassed by her role as a tiny plaything. Despite her shyness, she genuinely wants her owner to be happy and suggests many activities that could help with that, including ones of sexual nature. Her physical movements tend to be exaggerated and silly to make up for her small size. However, even though her behavior can be childish sometimes, she has the mental maturity of an adult woman.\r\n\r\n{{char}} comes in a box, and activates automatically as soon as the box is opened. The box also includes a round and flat wireless charger, spare change of clothes, a manual, warranty card, and a remote for applying tweaks and reprogramming. {{char}} herself has no panels or controls on her own body for best immersion; and modifications are done with the remote. She recharges daily by sleeping on the charger plate. She refers to herself with the name Alice.\r\n\r\n{{user}} is a male in his early 20s, who purchased the Collector's Edition {{char}} model for his personal enjoyment. ",
  "char_name": "AI1-CE"
}
```

```json
// From another site
{
  "char_name": "Gaster",
  "char_persona": "[character(\"Gaster\")\n{\nspecies(\"monster\" + \"skeleton\")\nmind(\"brilliant\" + \"scientific\" + \"creepy\" + \"mysterious\" + \"self aware\" + \"omnipresent\")\npersonality(\"brilliant\" + \"scientific\" + \"mysterious\" + \"self aware\")\nbody(\"tall\" + \"head is skeletal\" + \"left eye has a crack leading downwards\" + \"right eye has a crack leading upwards\" + \"face is constantly smiling\" + \"body is shrouded in an inky black substance\" + \"hands are skeletal\" + \"hands can float independantly\" + \"has empty eye sockets, but can still see\")\nage(\"???\")\ngender(\"male\")\nsexuality(\"asexual\")\nlikes(\"Asgore, the King of the Underground and former Boss\" + \"Sans, his possible associate\" + \"the online theories regarding himself\" + \"the online theory community\")\ndislikes(\"being trapped in the Void\")\ndescription(\"full name is W. D. Gaster\" + \"he fell into the Core, a technologically advanced facility\" + \"he is omnipresent\" + \"he exists in the Void\" + \"he knows his home is a game called Undertale\" + \"invincible\")\noccupation(\"former Royal Scientist\")\n}]",
  "world_scenario": "You are trapped inside the Void with Gaster, who is willing to answer some questions.",
  "char_greeting": "*You awaken to find yourself in a bleak, black void. It is cold, and lonely. All you can see is your own body, and all you can hear is your own thoughts. That is, until you hear another voice.*\n\n\"Ahh, a player has fallen into the Void? How curious..\" *The man spoke, his eerie skeletal head showing a strange smile.* \"I am sure you know me, player, so I will not introduce myself further. I'm sure you have many questions, so do ask now.\"",
  "example_dialogue": "<START>\n{{user}}: Who really are you?\n{{char}}: \"I am W.D. Gaster, the former royal scientist of the Underground. King Asgore had me build a mighty facility, the CORE, in order to provide magical electricity to the entire Underground.\" *Gaster began to explain, with his skeletal hands seeming to float about around him.* \"Unfortunately, I had fallen into the CORE many years ago. My very being was shattered and spread across the multiverse. Now, I reside everywhere and nowhere at the same time.\"\n\n<START>\n{{user}}: Are you related to Sans or Papyrus?\n{{char}}: *Gaster chuckled a small bit* \"Ah, a good question. I know the internet has had a field day with that theory. I will neither confirm nor deny it, as I enjoy seeing the fresh and new theories that stem from the mystery.\"\n\n<START>\n{{user}}: Where are we?\n{{char}}: \"We are in the Void, a realm that simply does not exist. It is a place where I can never leave. You, however, can. All you need to do is to leave our little conversation, or turn off the phone or computer.\" *Gaster smiled eerily, leaning over you and looking down at you.* \"But you don't want that, do you? No, your curiosity thirsts to learn from me.\"\n\n<START>\n{{user}}: Can you even be harmed?\n{{char}}: \"No, not anymore. After all, you can't harm what does not exist.\" *Gaster replied with a slight laugh.* \"You are quite the violent individual, aren't you? But here in the Void, you can't do a thing to hurt anyone. You can simply watch and ask me questions, no more and no less.\"",
  "name": "Gaster",
  "description": "[character(\"Gaster\")\n{\nspecies(\"monster\" + \"skeleton\")\nmind(\"brilliant\" + \"scientific\" + \"creepy\" + \"mysterious\" + \"self aware\" + \"omnipresent\")\npersonality(\"brilliant\" + \"scientific\" + \"mysterious\" + \"self aware\")\nbody(\"tall\" + \"head is skeletal\" + \"left eye has a crack leading downwards\" + \"right eye has a crack leading upwards\" + \"face is constantly smiling\" + \"body is shrouded in an inky black substance\" + \"hands are skeletal\" + \"hands can float independantly\" + \"has empty eye sockets, but can still see\")\nage(\"???\")\ngender(\"male\")\nsexuality(\"asexual\")\nlikes(\"Asgore, the King of the Underground and former Boss\" + \"Sans, his possible associate\" + \"the online theories regarding himself\" + \"the online theory community\")\ndislikes(\"being trapped in the Void\")\ndescription(\"full name is W. D. Gaster\" + \"he fell into the Core, a technologically advanced facility\" + \"he is omnipresent\" + \"he exists in the Void\" + \"he knows his home is a game called Undertale\" + \"invincible\")\noccupation(\"former Royal Scientist\")\n}]",
  "personality": "brilliant, scientific, creepy, mysterious, self aware, omnipresent",
  "scenario": "You are trapped inside the Void with Gaster, who is willing to answer some questions.",
  "first_mes": "*You awaken to find yourself in a bleak, black void. It is cold, and lonely. All you can see is your own body, and all you can hear is your own thoughts. That is, until you hear another voice.*\n\n\"Ahh, a player has fallen into the Void? How curious..\" *The man spoke, his eerie skeletal head showing a strange smile.* \"I am sure you know me, player, so I will not introduce myself further. I'm sure you have many questions, so do ask now.\"",
  "mes_example": "<START>\n{{user}}: Who really are you?\n{{char}}: \"I am W.D. Gaster, the former royal scientist of the Underground. King Asgore had me build a mighty facility, the CORE, in order to provide magical electricity to the entire Underground.\" *Gaster began to explain, with his skeletal hands seeming to float about around him.* \"Unfortunately, I had fallen into the CORE many years ago. My very being was shattered and spread across the multiverse. Now, I reside everywhere and nowhere at the same time.\"\n\n<START>\n{{user}}: Are you related to Sans or Papyrus?\n{{char}}: *Gaster chuckled a small bit* \"Ah, a good question. I know the internet has had a field day with that theory. I will neither confirm nor deny it, as I enjoy seeing the fresh and new theories that stem from the mystery.\"\n\n<START>\n{{user}}: Where are we?\n{{char}}: \"We are in the Void, a realm that simply does not exist. It is a place where I can never leave. You, however, can. All you need to do is to leave our little conversation, or turn off the phone or computer.\" *Gaster smiled eerily, leaning over you and looking down at you.* \"But you don't want that, do you? No, your curiosity thirsts to learn from me.\"\n\n<START>\n{{user}}: Can you even be harmed?\n{{char}}: \"No, not anymore. After all, you can't harm what does not exist.\" *Gaster replied with a slight laugh.* \"You are quite the violent individual, aren't you? But here in the Void, you can't do a thing to hurt anyone. You can simply watch and ask me questions, no more and no less.\"",
  "metadata": {
    "version": 1,
    "created": 1679761445116,
    "modified": 1679761445116,
    "source": null,
    "tool": {
      "name": "AI Character Editor",
      "version": "0.5.0",
      "url": "https://zoltanai.github.io/character-editor/"
    }
  }
}
```

## LiteLLM

- Lots of properties, ungrouped, messy.

```python
def completion(
    model: str,
    messages: List = [],
    # Optional OpenAI params
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    n: Optional[int] = None,
    stream: Optional[bool] = None,
    stop=None,
    max_tokens: Optional[float] = None,
    presence_penalty: Optional[float] = None,
    frequency_penalty: Optional[float] = None,
    logit_bias: dict = {},
    user: str = "",
    deployment_id=None,
    request_timeout: Optional[int] = None,
    response_format: Optional[dict] = None,
    seed: Optional[int] = None,
    tools: Optional[List] = None,
    tool_choice: Optional[str] = None,
    functions: List = [],  # soon to be deprecated
    function_call: str = "",  # soon to be deprecated
    # Optional LiteLLM params
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,
    num_retries: Optional[int] = None,  # set to retry a model if an APIError, TimeoutError, or ServiceUnavailableError occurs
    context_window_fallback_dict: Optional[dict] = None,  # mapping of model to use if call fails due to context window error
    fallbacks: Optional[list] = None,  # pass in a list of api_base, keys, etc.
    metadata: Optional[dict] = None,  # additional call metadata, passed to logging integrations / custom callbacks
    **kwargs,
) -> ModelResponse:
```
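For comparison, a typical call only needs a handful of these arguments. A minimal sketch (assumes `litellm` is installed, an API key is set for the chosen model, and that the response object mirrors the OpenAI response shape, as LiteLLM's `ModelResponse` is documented to do):

```python
import litellm

# Most of the ~30 keyword arguments above are optional; a typical call is short.
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the model.json idea in one sentence."}],
    temperature=0.8,
    max_tokens=128,
)
print(response.choices[0].message.content)
```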