###### tags: `AFS`
# How to Use Tool Call of FFM Conversation API with Llama3 Series
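Meta released Llama 3, its new generation of large language models, on April 19 in 8-billion and 70-billion-parameter versions. After tuning by the ASUS AI core team, the models can be used together with functions, and the large language model decides whether to call one. If a request includes one or more functions, the model decides from the context of the prompt whether a function call is needed. When it determines that a function should be used, it outputs formatted data (**JSON**) containing that function's arguments.
Based on the functions provided, the model parses the user's intent and outputs the matching API and structured data. Note in particular that **the model only selects the applicable function and does not execute it; the actual function call is controlled by business logic implemented on the application side**.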
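Using functions takes three steps:
1. Call the FFM Conversation API with the function definitions and the user's question to obtain the function call information.
2. Use the function information returned by the model to call the corresponding API or function and obtain the execution result.
3. Call the FFM Conversation API again, passing the execution result from step 2 to the model inference service to obtain a summary.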
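
The three steps can be sketched as a small control loop. This is a minimal sketch rather than an official client: `send` is a hypothetical callable standing in for an HTTP POST to the Conversation API, and `fake_send` only imitates the response shapes documented in this guide so that the flow can be run offline. `registry` and `run_tool_flow` are illustrative names as well.

```python
import json


def run_tool_flow(send, messages, tools, registry):
    """Run the three-step flow; `send(payload) -> dict` is an assumed transport."""
    # Step 1: ask the model, offering the tool definitions.
    first = send({"messages": messages, "tools": tools})
    calls = first.get("tool_calls")
    if not calls:
        # The model answered with plain text; nothing to execute.
        return first.get("generated_text", "")

    # Step 2: the application, not the model, executes the chosen function.
    history = list(messages)
    history.append({"role": "assistant", "content": "", "tool_calls": calls})
    for call in calls:
        fn = registry[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        history.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })

    # Step 3: send the results back so the model can summarize them.
    second = send({"messages": history, "tools": tools})
    return second.get("generated_text", "")


def fake_send(payload):
    """Offline stand-in for the API (hypothetical, for illustration only)."""
    last = payload["messages"][-1]
    if last["role"] == "tool":
        result = json.loads(last["content"])
        return {"generated_text": f"It is {result['temperature']} degrees.",
                "finish_reason": "stop_sequence"}
    return {"tool_calls": [{"id": "call_1", "type": "function",
                            "function": {"name": "get_current_weather",
                                         "arguments": '{"location": "Boston, MA"}'}}],
            "finish_reason": "tool_calls"}


def get_current_weather(location, unit="celsius"):
    # Hard-coded sample data; a real application would query a weather service.
    return {"location": location, "temperature": "22", "unit": unit}
```

The loop checks for the presence of `tool_calls` instead of relying only on `finish_reason`, because a forced `tool_choice` returns a different `finish_reason`, as described in the sections that follow.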
::: info
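:bulb: **Tip:** Parallel function calling (not yet supported)
Parallel function calls let the model output multiple function calls so that they can be executed, and their results retrieved, in parallel. This reduces the number of API calls and improves overall performance.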
:::
## API Support
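With the release of the Llama3-FFM models, the FFM Conversation API provides a more complete format for Function Calling. The `functions` field in the request body and the `function_call` field in the response of the [**old format**](https://hackmd.io/D6hChs1HRxyOaLul_HgirQ#Conversation-API) will be deprecated in the future. The following sections describe how to use the new format.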
::: info
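:bulb: **Tip:** LLMBackend is currently backward compatible. For both FFM-Llama2 and FFM-Llama3 models, a request in the old function call format returns an old-style function call response, and a request in the new tool call format returns a new-style tool call response.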
:::
1. **Request body by calling the model with tools**
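Use the large language model to select an appropriate function and parse the corresponding arguments.
* The **`tools`** parameter is an array whose items are mainly [**JSON Schema**](https://json-schema.org/understanding-json-schema) descriptions of the functions. Each item has two required fields.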
* type
* function
| Field | Type | Required | Description |
| -------- | -------- | -------- | -------- |
| **tools** | array | Optional | List of functions in JSON format |
:::spoiler **Properties**
<br>
<table>
<tbody>
<tr>
<td><b>type</b> <font color="#808080">string</font> <font color="#FF0000">Required</font>
<p>Tool type. Currently only <code>function</code> is supported.</p>
</td>
</tr>
</tbody>
<tr>
<td><p><b>function</b> <font color="#808080">object</font> <font color="#FF0000">Required</font></p>
<p>
<blockquote><table>
<tbody>
<tr>
<td><b>description</b> <font color="#808080">string</font> <font color="#FF0000">Optional</font><br>Description of what the function does. The model uses it to decide when to call the function.
</td>
</tr>
<tr>
<td><b>name</b> <font color="#808080">string</font> <font color="#FF0000">Required</font><br>Function name. It may contain only a-z, A-Z, 0-9, underscores (_), and hyphens (-).
</td>
</tr>
<tr>
<td><b>parameters</b> <font color="#808080">object</font> <font color="#FF0000">Optional</font><br>The function's input parameters, described with JSON Schema. For usage, see the <a href="https://json-schema.org/understanding-json-schema">JSON Schema reference</a>.
</td>
</tr>
</tbody>
</table>
</p>
</td>
</tr>
</table>
:::
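
To make the field rules above concrete, here is a minimal sketch of a helper that builds one entry of the `tools` array. `make_tool` is a hypothetical helper, not part of any SDK, and the name check is only a client-side reading of the rule stated above (letters, digits, `_`, and `-`). Whether the server enforces a length limit is not covered here.

```python
import re

# Allowed characters per the description above: letters, digits, "_" and "-".
_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def make_tool(name, description=None, parameters=None):
    """Build one entry for the `tools` array (hypothetical helper)."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    function = {"name": name}  # the only required field inside `function`
    if description is not None:
        function["description"] = description
    if parameters is not None:
        # `parameters` is a JSON Schema object describing the arguments.
        function["parameters"] = parameters
    return {"type": "function", "function": function}


weather_tool = make_tool(
    "get_current_weather",
    "Get the current weather in a given location",
    {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
)
```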
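* The **`tool_choice`** parameter is a string or an object and is optional. **It is supported only on Llama3-FFM models.** It controls when function calling is used. When functions are provided, it defaults to **`"auto"`**; when no functions are provided, it defaults to **`"none"`**.
* **`"none"`**: No function call is made; the model generates text.
* **`"auto"`**: The model decides whether to output a function call or generated text.
- In this mode, use the returned **`finish_reason`** field to tell the two apart: **`"finish_reason": "tool_calls"`** indicates a function call, and any other value indicates generated text.
* **`{"type": "function", "function": {"name": "my_function"}}`**: Forces a call to the specified function.
- In this mode the function call is requested explicitly, so **`finish_reason`** is an ordinary stop indicator such as `eos_token` and is **not** **`"tool_calls"`**. The application must inspect the response content itself to identify the call.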
| Field | Type | Required | Description |
| -------- | -------- | -------- | -------- |
| **tool_choice** | string or object | Optional | Controls when function calling is used |
:::spoiler Possible Types
<br>
<table>
<tbody>
<tr>
<td><font color="#808080">string</font>
<p>- <code>none</code> No function call is made; the output is ordinary generated text. <br>- <code>auto</code> The model decides whether to output a function call or generated text.</p>
</td>
</tr>
</tbody>
<tr>
<td><font color="#808080">object</font>
<p>- <code>{"type": "function", "function": {"name": "my_function"}}</code> Specifies a function and forces the model to output a call to it.
</p>
<blockquote><p>Properties</p>
<table>
<tbody>
<tr>
<td>
<b>type</b> <font color="#808080">string</font> <font color="#FF0000">Required</font>
<p>Tool type. Currently only <code>function</code> is supported.
</p>
</td>
</tr>
</tbody>
<tbody>
<tr>
<td>
<b>function</b> <font color="#808080">object</font> <font color="#FF0000">Required</font>
<blockquote><p>Function properties</p>
<table>
<tbody>
<tr>
<td>
<b>name</b> <font color="#808080">string</font> <font color="#FF0000">Required</font>
<p>Function name</p>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</table>
:::
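
The `tool_choice` rules above can be summarized in a short sketch. Both helper names are hypothetical. The logic simply mirrors the documented behavior: the default depends on whether tools were supplied, `finish_reason` distinguishes the outputs in `"auto"` mode, and a forced function must be detected from the response content because `finish_reason` is not `"tool_calls"` in that mode.

```python
def default_tool_choice(tools):
    """Default described above: "auto" with tools, "none" without."""
    return "auto" if tools else "none"


def is_tool_call(response, tool_choice):
    """Decide whether a response should be treated as a function call."""
    if isinstance(tool_choice, dict):
        # A specific function was forced: finish_reason is an ordinary stop
        # reason here, so inspect the payload itself.
        return bool(response.get("tool_calls"))
    if tool_choice == "none":
        return False
    # "auto": finish_reason tells the two cases apart.
    return response.get("finish_reason") == "tool_calls"
```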
#### **Request examples**
:::spoiler Non-Streaming
```bash=
export API_KEY={API_KEY}
export API_URL={API_URL}
curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
"model": "Llama-3-8b",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93,
"seed": 42
},
"stream": false
}'
```
:::
:::spoiler Streaming
```bash=
export API_KEY={API_KEY}
export API_URL={API_URL}
curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
"model": "Llama-3-8b",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93,
"seed": 42
},
"stream": true
}'
```
:::
#### **Request examples using tool_choice**
:::spoiler Use auto
```bash=
export API_KEY={API_KEY}
export API_URL={API_URL}
curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
"model": "Llama-3-8b",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93,
"seed": 42
},
"tool_choice": "auto",
"stream": false
}'
```
:::
:::spoiler Use none
```bash=
export API_KEY={API_KEY}
export API_URL={API_URL}
curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
"model": "Llama-3-8b",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93,
"seed": 42
},
"tool_choice": "none",
"stream": false
}'
```
:::
:::spoiler Specifies a function
```bash=
export API_KEY={API_KEY}
export API_URL={API_URL}
curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
"model": "Llama-3-8b",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.01,
"top_k": 100,
"top_p": 0.93,
"seed": 42
},
"tool_choice": {
"type": "function",
"function": {
"name": "get_current_weather"
}
},
"stream": false
}'
```
:::
2. **Response by calling the model with functions**
The large language model returns the function call it selected.
| Field | Type |
| -------- | -------- |
| **tool_calls** | array |
:::spoiler Possible Types
<br>
<table>
<tbody>
<tr>
<td><b>id</b> <font color="#808080">string</font>
<p>Function call identifier.</p>
</td>
</tr>
</tbody>
<tbody>
<tr>
<td><b>type</b> <font color="#808080">string</font>
<p>Tool type. Currently only <code>function</code> is supported.</p>
</td>
</tr>
</tbody>
<tbody>
<tr>
<td><b>function</b> <font color="#808080">object</font>
<p>The function call content, including the function name and argument values.</p>
</td>
</tr>
</tbody>
</table>
:::
#### **Examples**
::: spoiler Response with Non-Streaming
```JSON=
{
"tool_calls": [
{
"index": 0,
"type": "function",
"id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
}
}
],
"details": null,
"total_time_taken": "1.17 sec",
"prompt_tokens": 141,
"generated_tokens": 43,
"total_tokens": 184,
"finish_reason": "tool_calls"
}
```
:::
::: spoiler Response with Streaming
```JSON=
data: {"tool_calls": [{"index": 0, "type": "function", "id": "call_afc9227158e6458798d789ab1f84c920", "function": {"name": "get_current_weather", "arguments": ""}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "{\""}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "location"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "\":"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "Boston, MA"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "\","}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "unit"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "\":"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "c"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "elsius"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": "\"}"}}], "details": null, "finish_reason": null}
data: {"tool_calls": [{"index": 0, "function": {"arguments": ""}}], "details": null, "total_time_taken": "1.17 sec", "prompt_tokens": 141, "generated_tokens": 43, "total_tokens": 184, "finish_reason": "tool_calls"}
```
:::
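
In the streaming response above, the function name and `id` arrive in the first chunk and the `arguments` string arrives in fragments. A minimal sketch of reassembling them on the client is shown below. `assemble_tool_calls` is a hypothetical helper. It assumes that every `data:` line carries a JSON object shaped like the sample and that fragments for the same call share an `index`. Other event lines, if the service emits any, are not handled.

```python
import json


def assemble_tool_calls(lines):
    """Merge streamed `data: {...}` chunks into complete tool calls."""
    calls = {}
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines between events
        chunk = json.loads(line[len("data:"):])
        for delta in chunk.get("tool_calls") or []:
            call = calls.setdefault(delta["index"], {
                "id": None,
                "type": None,
                "function": {"name": None, "arguments": ""},
            })
            # id, type and name appear once; keep the earlier value otherwise.
            call["id"] = delta.get("id", call["id"])
            call["type"] = delta.get("type", call["type"])
            fn = delta.get("function", {})
            call["function"]["name"] = fn.get("name", call["function"]["name"])
            # Argument fragments are concatenated in arrival order.
            call["function"]["arguments"] += fn.get("arguments", "")
        finish_reason = chunk.get("finish_reason") or finish_reason
    return [calls[i] for i in sorted(calls)], finish_reason
```

Once the stream ends, the concatenated `arguments` string can be parsed with `json.loads`, just like the `arguments` field of a non-streaming response.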
::: spoiler Response with tool_choice by auto
```JSON=
{
"tool_calls": [
{
"type": "function",
"id": "call_fe97cf6c20ae4b00b88b660b853d93d9",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
}
}
],
"details": null,
"total_time_taken": "1.16 sec",
"prompt_tokens": 135,
"generated_tokens": 43,
"total_tokens": 178,
"finish_reason": "tool_calls"
}
```
:::
::: spoiler Response with tool_choice by none
```JSON=
{
"generated_text": "As of my last update, the weather in Boston was quite chilly with temperatures around 40°F (4°C) and some light rain. However, it's always a good idea to check the latest weather forecast before heading out, as conditions can change quickly.",
"details": null,
"total_time_taken": "1.41 sec",
"prompt_tokens": 18,
"generated_tokens": 53,
"total_tokens": 71,
"finish_reason": "stop_sequence"
}
```
:::
::: spoiler Response with tool_choice by specifies a function
```JSON=
{
"tool_calls": [
{
"type": "function",
"id": "call_7JK8LIPTho7DffbvceTV5Oey",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
}
}
],
"details": null,
"total_time_taken": "0.82 sec",
"prompt_tokens": 159,
"generated_tokens": 18,
"total_tokens": 177,
"finish_reason": "eos_token"
}
```
:::
3. **Request body by sending the response back to the model to summarize**
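The large language model presents the result of the executed function in an easy-to-understand form. This step is a multi-turn conversation: in addition to the previous conversation history, the function's execution result must be placed in the **content** field of a message whose role is **`tool`**.
| Field | Value |
| -------- | -------- |
|**role** | tool |
|**tool_call_id** | The function call identifier referenced from tool_calls |
|**content** | The execution result of the function call |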
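
Because each `tool` message must reference an identifier from an earlier `tool_calls` entry, a client may want to check the conversation history before sending it. The following is a minimal sketch of such a check. `unmatched_tool_messages` is a hypothetical helper, and this guide does not state whether the service itself validates the identifiers.

```python
def unmatched_tool_messages(messages):
    """Return tool_call_id values that no earlier assistant message issued."""
    issued = set()
    missing = []
    for msg in messages:
        role = msg.get("role")
        if role == "assistant":
            # Collect the ids of every function call the model produced.
            issued.update(call["id"] for call in msg.get("tool_calls") or [])
        elif role == "tool" and msg.get("tool_call_id") not in issued:
            missing.append(msg.get("tool_call_id"))
    return missing
```

An empty list means every `tool` message is paired with a preceding assistant function call.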
#### **Examples**
::: spoiler Request
```bash=
export API_KEY={API_KEY}
export API_URL={API_URL}
curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
"model": "Llama-3-8b",
"messages": [
{
"role": "user",
"content": "What is the weather like in Boston?"
},
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
"type": "function",
"function": {
"name": "get_current_weather",
"arguments": "{\"location\":\"Boston, MA\", \"unit\": \"celsius\"}"
}
}
]
},
{
"role": "tool",
"tool_call_id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
"content": "{\"location\": \"Boston, MA\", \"temperature\": \"22\", \"unit\": \"celsius\"}"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"parameters": {
"show_probabilities": false,
"max_new_tokens": 350,
"frequence_penalty": 1,
"temperature": 0.5,
"top_k": 100,
"top_p": 0.93,
"seed": 42
}
}'
```
:::
::: spoiler Response
```JSON=
{
"generated_text": "The current temperature in Boston, MA is 22 degrees Celsius.",
"details": null,
"total_time_taken": "0.43 sec",
"prompt_tokens": 250,
"generated_tokens": 14,
"total_tokens": 264,
"finish_reason": "stop_sequence"
}
```
:::