# Parameters Table
| Parameter Name | Default Value | Min | Max | Description | UI Element Suggestion |
|----------------|---------------|-----|-----|-------------|----------------------|
| **Inference Parameters** | | | | | |
| max_tokens | 2048 | 128 | 4096 | The maximum number of tokens the model will generate in a single response. | Slider |
| stop | - | - | - | Defines specific tokens or phrases at which the model will stop generating further output. | Textbox |
| frequency_penalty | 0 | 0 | 2 | Reduces repetition by penalizing tokens in proportion to how often they have already appeared in the output. | Slider |
| presence_penalty | 0 | 0 | 2 | Encourages new and varied concepts by penalizing tokens that have already appeared in the output, regardless of frequency. | Slider |
| temperature | 0.7 | 0 | 1 | Controls the randomness of the model's output. | Slider |
| top_p | 0.95 | 0 | 1 | Sets the cumulative probability threshold for nucleus sampling; only the most probable tokens within this mass are considered. | Slider |
| **Model Parameters** | | | | | |
| pre_prompt | A chat between a curious user and an artificial intelligence assistant. The assistant follows the given rules no matter what. | - | - | The text prepended to every conversation to set the assistant's behavior. | Textbox |
| system_prompt | "ASSISTANT's RULE:" | - | - | The prefix for the system prompt. | Textbox |
| user_prompt | "USER:" | - | - | The prefix for the user prompt. | Textbox |
| ai_prompt | "ASSISTANT:" | - | - | The prefix for the assistant prompt. | Textbox |
| **Engine Parameters** | | | | | |
| ngl | 100 | 0 | 100 | The number of layers to load onto the GPU for acceleration. | Textbox |
| ctx_len | 2048 | 128 | 4096 | The number of tokens in the context window; the usable maximum depends on the specific model. | Slider |
| **Advanced** | | | | | |
| n_parallel | 1 | - | - | The number of requests processed in parallel. Set this only when continuous batching (`cont_batching`) is enabled. | - |
| cont_batching | false | - | - | Whether to use continuous batching. | Toggle Switch |
| cpu_threads | - | - | - | The number of threads for CPU-based inference. | - |
| embedding | true | - | - | Whether to enable embedding. | Toggle Switch |
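
For reference, below is a minimal sketch of how the inference parameters might be passed in a chat-completion request. It assumes an OpenAI-compatible endpoint at `http://localhost:1337/v1/chat/completions` and the model name `my-local-model`; both are placeholders rather than values from this table.

```python
# Hypothetical sketch: sending the inference parameters to an
# OpenAI-compatible chat-completion endpoint. The URL and model
# name are assumptions, not part of this document.
import json
import urllib.request

payload = {
    "model": "my-local-model",   # assumed model identifier
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 2048,          # cap on generated tokens (default above)
    "stop": ["USER:"],           # stop generating when this string appears
    "frequency_penalty": 0,      # 0-2: penalize frequently repeated tokens
    "presence_penalty": 0,       # 0-2: encourage new topics
    "temperature": 0.7,          # 0-1: randomness of sampling
    "top_p": 0.95,               # 0-1: nucleus-sampling threshold
}

request = urllib.request.Request(
    "http://localhost:1337/v1/chat/completions",  # assumed local server URL
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.load(response)["choices"][0]["message"]["content"])
```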
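The model parameters define a prompt template. The sketch below shows one plausible way the prefixes combine into the final string sent to the model; the `build_prompt` helper is hypothetical, assuming a Vicuna-style layout.

```python
# Hypothetical sketch: assembling the model parameters above into a
# single prompt string, assuming a Vicuna-style chat template.

PRE_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant follows the given rules no matter what."
)
SYSTEM_PROMPT = "ASSISTANT's RULE:"   # prefix for the system prompt
USER_PROMPT = "USER:"                 # prefix for user messages
AI_PROMPT = "ASSISTANT:"              # prefix for assistant messages


def build_prompt(rules: str, user_message: str) -> str:
    """Assemble the full prompt from the configured prefixes (hypothetical helper)."""
    return (
        f"{PRE_PROMPT}\n"
        f"{SYSTEM_PROMPT} {rules}\n"
        f"{USER_PROMPT} {user_message}\n"
        f"{AI_PROMPT}"  # the model completes the text after this prefix
    )


print(build_prompt("Answer concisely.", "What is continuous batching?"))
```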
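Finally, a sketch of the engine and advanced parameters gathered into a single settings object. The field names mirror the table, but grouping them into one dictionary like this is an illustration, not a documented API.

```python
# Hypothetical sketch: engine and advanced parameters as they might
# appear in a model-load configuration. Values are the table defaults
# where given; others are example choices.
engine_settings = {
    "ngl": 100,             # layers offloaded to the GPU (0 = CPU only)
    "ctx_len": 2048,        # context window in tokens
    "cont_batching": False, # process concurrent requests with continuous batching
    "n_parallel": 1,        # parallel slots; meaningful only with cont_batching
    "cpu_threads": 8,       # example: threads for CPU-based inference
    "embedding": True,      # whether to enable embedding
}
```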