# Parameters Table

| Parameter Name | Default Value | Min | Max | Description | UI Element Suggestion |
|----------------|---------------|-----|-----|-------------|----------------------|
| **Inference Parameters** | | | | | |
| max_tokens | 2048 | 128 | 4096 | The maximum number of tokens the model will generate in a single response. | Slider |
| stop | - | - | - | Defines specific tokens or phrases at which the model will stop generating further output. | Textbox |
| frequency_penalty | 0 | 0 | 2 | Adjusts the likelihood of the model repeating words or phrases in its output. | Slider |
| presence_penalty | 0 | 0 | 2 | Influences the generation of new and varied concepts in the model's output. | Slider |
| temperature | 0.7 | 0 | 1 | Controls the randomness of the model's output. | Slider |
| top_p | 0.95 | 0 | 1 | The cumulative probability threshold for nucleus sampling; lower values restrict output to more likely tokens. | Slider |
| **Model Parameters** | | | | | |
| pre_prompt | A chat between a curious user and an artificial intelligence assistant. The assistant follows the given rules no matter what. | - | - | The prompt used for internal configuration. | Textbox |
| system_prompt | "ASSISTANT's RULE:" | - | - | The prefix for the system prompt. | Textbox |
| user_prompt | "USER:" | - | - | The prefix for the user prompt. | Textbox |
| ai_prompt | "ASSISTANT:" | - | - | The prefix for the assistant prompt. | Textbox |
| **Engine Parameters** | | | | | |
| ngl | 100 | 0 | 100 | The number of model layers to offload to the GPU for acceleration. | Textbox |
| ctx_len | 2048 | 128 | 4096 | The context length for model operations; the maximum depends on the specific model used. | Slider |
| **Advanced** | | | | | |
| n_parallel | 1 | - | - | The number of parallel operations. Set this only when continuous batching is enabled. | - |
| cont_batching | false | - | - | Whether to use continuous batching. | Toggle Switch |
| cpu_threads | - | - | - | The number of threads for CPU-based inference. | - |
| embedding | true | - | - | Whether to enable embedding. | Toggle Switch |
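
As a point of reference, here is a minimal sketch of how the inference parameters above might be passed in a request to an OpenAI-compatible chat completions endpoint. The endpoint URL, port, and model name are assumptions for illustration, not values defined by this table:

```python
import requests

# Assumed local OpenAI-compatible endpoint; adjust host/port for your setup.
BASE_URL = "http://localhost:3928/v1/chat/completions"

payload = {
    "model": "my-local-model",  # hypothetical model identifier
    "messages": [
        # The pre_prompt from the table serves as the system message;
        # system_prompt/user_prompt/ai_prompt are the prefixes the engine
        # uses when assembling the final prompt string.
        {
            "role": "system",
            "content": (
                "A chat between a curious user and an artificial "
                "intelligence assistant. The assistant follows the "
                "given rules no matter what."
            ),
        },
        {"role": "user", "content": "Summarize nucleus sampling in one sentence."},
    ],
    # Inference parameters from the table, at their default values.
    "max_tokens": 2048,
    "temperature": 0.7,
    "top_p": 0.95,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    # Stop generating if the model begins a new user turn.
    "stop": ["USER:"],
}

response = requests.post(BASE_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Note that the engine parameters (ngl, ctx_len) and advanced options (n_parallel, cont_batching, cpu_threads, embedding) are typically set when the model is loaded rather than per request, so they do not appear in the request payload above.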