tags: `AFS`

How to Use Tool Call of FFM Conversation API with Llama3 Series

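On April 19, Meta released Llama 3, its new generation of large language models, in 8-billion and 70-billion parameter versions. After tuning by the ASUS AI core team, the model can be used together with functions, with the large language model deciding whether a function should be called. If a request includes one or more functions, the model decides from the context of the prompt whether a function call is needed. When the model determines that a function should be used, it outputs formatted data (JSON) containing that function's arguments.

Based on the functions provided, the model parses the user's intent and outputs the corresponding API and structured data. Note that the model only selects the applicable function; it does not execute it. The function call itself is controlled by business logic implemented on the application side.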

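Using functions involves three steps:

1. Call the FFM Conversation API with the function definitions and the user's question to obtain the function call information.
2. Use the function information output by the model to call the corresponding API or function, and obtain the execution result.
3. Call the FFM Conversation API again, passing the execution result from step 2 to the model inference service to obtain a summary.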
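The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official client: `call_conversation_api` is a hypothetical callable standing in for an HTTP POST to the FFM Conversation API, and `fake_api` and `get_current_weather` are stubs so that the flow runs offline. The message and response shapes follow the examples later in this document.

```python
import json

def get_current_weather(location, unit="celsius"):
    # Stub for a real weather lookup; returns fixed data for illustration.
    return {"location": location, "temperature": "22", "unit": unit}

# Application-side registry mapping function names to Python callables.
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

def run_tool_flow(call_conversation_api, question, tools):
    messages = [{"role": "user", "content": question}]
    # Step 1: send the question and the function definitions to the model.
    first = call_conversation_api(messages, tools)
    tool_calls = first.get("tool_calls")
    if not tool_calls:
        # The model answered with plain text; no function is needed.
        return first.get("generated_text", "")
    messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
    # Step 2: the application, not the model, executes each selected function.
    for call in tool_calls:
        func = AVAILABLE_FUNCTIONS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**args)),
        })
    # Step 3: send the results back so that the model can summarize them.
    second = call_conversation_api(messages, tools)
    return second.get("generated_text", "")

def fake_api(messages, tools):
    # Offline stand-in (assumption): returns canned responses instead of calling the service.
    if messages[-1]["role"] == "tool":
        data = json.loads(messages[-1]["content"])
        text = f"It is {data['temperature']} degrees {data['unit']} in {data['location']}."
        return {"generated_text": text, "finish_reason": "stop_sequence"}
    arguments = json.dumps({"location": "Boston, MA", "unit": "celsius"})
    return {
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_current_weather", "arguments": arguments},
        }],
        "finish_reason": "tool_calls",
    }

answer = run_tool_flow(fake_api, "What is the weather like in Boston?", tools=[])
```

In a real application, `fake_api` would be replaced by a function that posts the messages and tools to the service and returns the decoded JSON response.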


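Note: Parallel function calling (not yet supported)
Parallel function calls allow the model to output multiple function calls, which can then be executed and have their results retrieved in parallel. This reduces the number of API calls and improves overall performance.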

API Support

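With the release of the Llama3-FFM models, the FFM Conversation API provides a more complete format for the Function Calling feature. The functions field in the legacy request body and the function_call field in the legacy response will be deprecated in the future. The following sections describe how to use the new format.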


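Note: LLMBackend is currently backward compatible. For both FFM-Llama2 and FFM-Llama3 models, a request in the legacy function call format receives a legacy function call response, and a request in the new tool call format receives a new tool call response.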
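For applications that still build legacy requests, a migration helper might look like the sketch below. It rests on an assumption this document does not confirm: that each entry in the legacy functions array has the same name, description, and parameters shape as the function object of the new format. `migrate_request` is a hypothetical helper name.

```python
import copy

def migrate_request(legacy_body):
    """Return a copy of a request body with `functions` rewritten as `tools`."""
    body = copy.deepcopy(legacy_body)
    # The legacy field is always removed from the copy.
    functions = body.pop("functions", None)
    if functions and "tools" not in body:
        # Assumption: a legacy entry can be wrapped unchanged in the new envelope.
        body["tools"] = [{"type": "function", "function": f} for f in functions]
    return body

legacy = {
    "model": "Llama-3-8b",
    "functions": [{"name": "get_current_weather"}],
}
migrated = migrate_request(legacy)
```

The input is left untouched, so the legacy body can still be sent while the new one is being verified.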

Request body by calling the model with tools
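Use the large language model to select the appropriate function and parse the corresponding arguments.

The tools parameter is an array whose items are JSON Schema descriptions of the available functions. Each item has two required fields: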

type
function

Field	Type	Required	Description
tools	array	Optional	List of functions in JSON format

Properties

type string Required

Tool type. Currently only function is supported.

function object Required

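description string Optional
A description of what the function does. The model uses it to decide when to call the function.

name string Required
The function name. It may contain only a-z, A-Z, 0-9, underscores (_), and hyphens (-).

parameters object Optional
The function's input parameters, described with JSON Schema. See the JSON Schema reference for usage.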
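As an illustration of the fields above, the following sketch builds one entry of the tools array and checks the name against the allowed character set before sending. `make_tool` is a hypothetical helper, and the client-side check is an assumption about what is convenient for an application, not a description of how the service validates names.

```python
import re

# Allowed characters per the field description: a-z, A-Z, 0-9, "_" and "-".
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

def make_tool(name, description=None, parameters=None):
    """Build one entry of the `tools` array."""
    if not _NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    function = {"name": name}
    # Optional fields are included only when provided.
    if description is not None:
        function["description"] = description
    if parameters is not None:
        function["parameters"] = parameters
    return {"type": "function", "function": function}

weather_tool = make_tool(
    "get_current_weather",
    description="Get the current weather in a given location",
    parameters={
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            },
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
)
```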

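The tool_choice parameter is a string or an object and is optional. It is supported only on Llama3-FFM versions and specifies the function-calling mode. When functions are provided, it defaults to "auto"; when no functions are provided, it defaults to "none".

"none": function calling is disabled and the model generates text.
"auto": the model decides whether to output a function call or generated text.
- In this mode, use the finish_reason field of the response to tell the two apart: "finish_reason": "tool_calls" indicates a function call, and any other value indicates text generation.
{"type": "function", "function": {"name": "my_function"}}: forces a call to the specified function.
- In this mode, because a function call has been explicitly requested, finish_reason is an ordinary value such as eos_token rather than "tool_calls". The application must inspect the content itself to determine the output type.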

Field	Type	Required	Description
tool_choice	string or object	Optional	Specifies the function-calling mode

Possible Types

string

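- none: function calling is disabled and the output is ordinary generated text.
- auto: the model decides whether to output a function call or generated text.

object

- {"type": "function", "function": {"name": "my_function"}}: names a function and forces the model to output a call to it.

Properties

type string Required
Tool type. Currently only function is supported.

function object Required
Function properties

name string Required
Function name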
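The tool_choice rules can be mirrored on the client side as in the sketch below. `force_function` and `resolve_tool_choice` are hypothetical helpers. The default handling follows the description above; the check that a forced function is actually declared in tools is an added assumption about sensible client behavior, not documented service behavior.

```python
def force_function(name):
    """Build the object form of tool_choice for a specific function."""
    return {"type": "function", "function": {"name": name}}

def resolve_tool_choice(tools, tool_choice=None):
    """Return the effective tool_choice for a request."""
    if tool_choice is None:
        # Documented defaults: "auto" with functions, "none" without.
        return "auto" if tools else "none"
    if isinstance(tool_choice, str):
        if tool_choice not in ("none", "auto"):
            raise ValueError(f"unsupported tool_choice: {tool_choice!r}")
        return tool_choice
    name = tool_choice.get("function", {}).get("name")
    if tool_choice.get("type") != "function" or not name:
        raise ValueError("object form needs type 'function' and a function name")
    # Assumption: reject a forced function that was never declared in tools.
    declared = {tool["function"]["name"] for tool in tools or []}
    if name not in declared:
        raise ValueError(f"{name!r} is not declared in tools")
    return tool_choice

tools = [{"type": "function", "function": {"name": "get_current_weather"}}]
choice = resolve_tool_choice(tools, force_function("get_current_weather"))
```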

Request examples

Non-Streaming

export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
      "model": "Llama-3-8b",
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in Boston?"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "parameters": {
        "show_probabilities": false,
        "max_new_tokens": 350,
        "frequence_penalty": 1,
        "temperature": 0.01,
        "top_k": 100,
        "top_p": 0.93,
        "seed": 42
      },
      "stream": false
}'
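The same request can be prepared from Python with the standard library. The sketch below only builds the request object and does not send it. `build_request` is a hypothetical helper, and the URL and key used at the bottom are placeholders, just as {API_URL} and {API_KEY} are in the curl command.

```python
import json
import urllib.request

def build_request(api_url, api_key, body):
    """Prepare a POST to the conversation endpoint without sending it."""
    return urllib.request.Request(
        url=f"{api_url}/models/conversation",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "accept": "application/json",
            "X-API-KEY": api_key,
            "X-API-HOST": "afs-inference",
            "content-type": "application/json",
        },
        method="POST",
    )

body = {
    "model": "Llama-3-8b",
    "messages": [{"role": "user", "content": "What is the weather like in Boston?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
    "parameters": {"max_new_tokens": 350, "temperature": 0.01},
    "stream": False,
}
# Placeholder URL and key (assumptions); substitute real values before sending.
request = build_request("https://example.invalid", "YOUR_API_KEY", body)
```

Passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would perform the call.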

Streaming

export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
      "model": "Llama-3-8b",
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in Boston?"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "parameters": {
        "show_probabilities": false,
        "max_new_tokens": 350,
        "frequence_penalty": 1,
        "temperature": 0.01,
        "top_k": 100,
        "top_p": 0.93,
        "seed": 42
      },
      "stream": true
}'

Request examples using tool_choice

Use auto

export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
      "model": "Llama-3-8b",
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in Boston?"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "parameters": {
        "show_probabilities": false,
        "max_new_tokens": 350,
        "frequence_penalty": 1,
        "temperature": 0.01,
        "top_k": 100,
        "top_p": 0.93,
        "seed": 42
      },
      "tool_choice": "auto",
      "stream": false
}'

Use none

export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
      "model": "Llama-3-8b",
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in Boston?"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "parameters": {
        "show_probabilities": false,
        "max_new_tokens": 350,
        "frequence_penalty": 1,
        "temperature": 0.01,
        "top_k": 100,
        "top_p": 0.93,
        "seed": 42
      },
      "tool_choice": "none",
      "stream": false
}'

Specify a function

export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
      "model": "Llama-3-8b",
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in Boston?"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "parameters": {
        "show_probabilities": false,
        "max_new_tokens": 350,
        "frequence_penalty": 1,
        "temperature": 0.01,
        "top_k": 100,
        "top_p": 0.93,
        "seed": 42
      },
      "tool_choice": {
        "type": "function",
        "function": {
          "name": "get_current_weather"
        }
      },
      "stream": false
}'

Response by calling the model with functions
The function call result returned by the large language model.

Field	Type
tool_calls	array

Possible Types

id string

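Function call identifier.

type string

Tool type. Currently only function is supported.

function object

The function call content, including the function name and the argument values.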

Examples

Response with Non-Streaming

{
  "tool_calls": [
    {
      "index": 0,
      "type": "function",
      "id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
      }
    }
  ],
  "details": null,
  "total_time_taken": "1.17 sec",
  "prompt_tokens": 141,
  "generated_tokens": 43,
  "total_tokens": 184,
  "finish_reason": "tool_calls"
}
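In the response above, arguments is a JSON-encoded string rather than a nested object, so the application has to decode it before calling the function. A minimal sketch, where `parse_tool_calls` is a hypothetical helper name:

```python
import json

def parse_tool_calls(response):
    """Return a list of (call_id, function_name, arguments_dict) tuples."""
    parsed = []
    for call in response.get("tool_calls") or []:
        function = call["function"]
        raw = function.get("arguments") or "{}"
        try:
            arguments = json.loads(raw)
        except json.JSONDecodeError as exc:
            # The model produced text that is not valid JSON.
            raise ValueError(f"malformed arguments for {function['name']}") from exc
        parsed.append((call["id"], function["name"], arguments))
    return parsed

response = {
    "tool_calls": [{
        "index": 0,
        "type": "function",
        "id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
        "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}",
        },
    }],
    "finish_reason": "tool_calls",
}
calls = parse_tool_calls(response)
```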

Response with Streaming

data: {"tool_calls": [{"index": 0, "type": "function", "id": "call_afc9227158e6458798d789ab1f84c920", "function": {"name": "get_current_weather", "arguments": ""}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "{\""}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "location"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "\":"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "Boston, MA"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "\","}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "unit"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "\":"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": " \""}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "c"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "elsius"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": "\"}"}}], "details": null, "finish_reason": null}

data: {"tool_calls": [{"index": 0, "function": {"arguments": ""}}], "details": null, "total_time_taken": "1.17 sec", "prompt_tokens": 141, "generated_tokens": 43, "total_tokens": 184, "finish_reason": "tool_calls"}
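In the streamed output above, the first chunk carries the call id and function name, and each later chunk appends a fragment of the arguments string. One way to reassemble the complete call is sketched below. `assemble_stream` is a hypothetical helper, and it assumes that every data: line holds a JSON object shaped like the ones shown here.

```python
import json

def assemble_stream(lines):
    """Merge streamed `data:` chunks into complete tool calls, keyed by index."""
    calls = {}
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        chunk = json.loads(line[len("data:"):])
        for delta in chunk.get("tool_calls") or []:
            call = calls.setdefault(delta["index"], {
                "id": None,
                "type": None,
                "function": {"name": None, "arguments": ""},
            })
            # id, type and name arrive once; keep the earlier value otherwise.
            call["id"] = delta.get("id", call["id"])
            call["type"] = delta.get("type", call["type"])
            function = delta.get("function", {})
            call["function"]["name"] = function.get("name", call["function"]["name"])
            # Argument fragments are concatenated in arrival order.
            call["function"]["arguments"] += function.get("arguments", "")
        finish_reason = chunk.get("finish_reason") or finish_reason
    return [calls[i] for i in sorted(calls)], finish_reason

sample = [
    r'data: {"tool_calls": [{"index": 0, "type": "function", "id": "call_1", "function": {"name": "get_current_weather", "arguments": ""}}], "finish_reason": null}',
    "",
    r'data: {"tool_calls": [{"index": 0, "function": {"arguments": "{\"location\": "}}], "finish_reason": null}',
    "",
    r'data: {"tool_calls": [{"index": 0, "function": {"arguments": "\"Boston, MA\"}"}}], "finish_reason": null}',
    "",
    r'data: {"tool_calls": [{"index": 0, "function": {"arguments": ""}}], "finish_reason": "tool_calls"}',
]
tool_calls, reason = assemble_stream(sample)
```

The arguments string is valid JSON only after the final chunk has arrived, so it should be decoded once the stream ends rather than per chunk.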

Response with tool_choice set to auto

{
  "tool_calls": [
    {
      "type": "function",
      "id": "call_fe97cf6c20ae4b00b88b660b853d93d9",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
      }
    }
  ],
  "details": null,
  "total_time_taken": "1.16 sec",
  "prompt_tokens": 135,
  "generated_tokens": 43,
  "total_tokens": 178,
  "finish_reason": "tool_calls"
}

Response with tool_choice set to none

{
  "generated_text": "As of my last update, the weather in Boston was quite chilly with temperatures around 40°F (4°C) and some light rain. However, it's always a good idea to check the latest weather forecast before heading out, as conditions can change quickly.",
  "details": null,
  "total_time_taken": "1.41 sec",
  "prompt_tokens": 18,
  "generated_tokens": 53,
  "total_tokens": 71,
  "finish_reason": "stop_sequence"
}

Response with tool_choice specifying a function

{
  "tool_calls": [
    {
      "type": "function",
      "id": "call_7JK8LIPTho7DffbvceTV5Oey",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
      }
    }
  ],
  "details": null,
  "total_time_taken": "0.82 sec",
  "prompt_tokens": 159,
  "generated_tokens": 18,
  "total_tokens": 177,
  "finish_reason": "eos_token"
}
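Because finish_reason is "eos_token" rather than "tool_calls" when a function is forced, an application that supports all three modes cannot rely on finish_reason alone. The sketch below checks for the tool_calls field instead. This relies on an observation from the example responses in this document, where tool_calls appears in every function-call response and generated_text appears otherwise; `classify_response` is a hypothetical helper name.

```python
def classify_response(response):
    """Return "tool_call" or "text" for a non-streaming response dict."""
    # Assumption: function-call responses always carry a non-empty tool_calls list.
    if response.get("tool_calls"):
        return "tool_call"
    return "text"

auto_response = {
    "tool_calls": [{"id": "call_a", "type": "function",
                    "function": {"name": "get_current_weather", "arguments": "{}"}}],
    "finish_reason": "tool_calls",
}
forced_response = {
    "tool_calls": [{"id": "call_b", "type": "function",
                    "function": {"name": "get_current_weather", "arguments": "{}"}}],
    "finish_reason": "eos_token",
}
text_response = {"generated_text": "It is chilly.", "finish_reason": "stop_sequence"}

kinds = [classify_response(r) for r in (auto_response, forced_response, text_response)]
```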

Request body by sending the response back to the model to summarize
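The large language model presents the result of the executed function in an easy-to-understand form. This step is a multi-turn conversation: in addition to the previous conversation history, the function's execution result must be placed in the content field of a message whose role is tool.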

Field	Value
role	tool
tool_call_id	References the function call identifier from tool_calls
content	Execution result of the function call
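The table above translates into message construction roughly as follows. `build_followup_messages` is a hypothetical helper, and `results` is assumed to be a dictionary that maps each tool call id to the value the application obtained by running that function.

```python
import json

def build_followup_messages(history, tool_calls, results):
    """Append the assistant tool-call turn and one tool message per result."""
    messages = list(history)  # copy so that the caller's history is not mutated
    messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
    for call in tool_calls:
        result = results[call["id"]]
        messages.append({
            "role": "tool",
            # Must match the id of the call this result answers.
            "tool_call_id": call["id"],
            # content is a string; non-string results are JSON-encoded.
            "content": result if isinstance(result, str) else json.dumps(result),
        })
    return messages

history = [{"role": "user", "content": "What is the weather like in Boston?"}]
tool_calls = [{
    "id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}",
    },
}]
results = {
    "call_8a53fdf7e96c418aaaff76d2e1bb9964": {
        "location": "Boston, MA", "temperature": "22", "unit": "celsius",
    },
}
messages = build_followup_messages(history, tool_calls, results)
```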

Examples

Request

export API_KEY={API_KEY}
export API_URL={API_URL}

curl -X POST "${API_URL}/models/conversation" \
-H "accept: application/json" \
-H "X-API-KEY:${API_KEY}" \
-H "X-API-HOST: afs-inference" \
-H "content-type: application/json" \
-d '{
      "model": "Llama-3-8b",
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in Boston?"
        },
        {
          "role": "assistant",
          "content": "",
          "tool_calls": [
            {
              "id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
              "type": "function",
              "function": {
                "name": "get_current_weather",
                "arguments": "{\"location\":\"Boston, MA\", \"unit\": \"celsius\"}"
              }
            }
          ]
        },
        {
          "role": "tool",
          "tool_call_id": "call_8a53fdf7e96c418aaaff76d2e1bb9964",
          "content": "{\"location\": \"Boston, MA\", \"temperature\": \"22\", \"unit\": \"celsius\"}"
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "parameters": {
        "show_probabilities": false,
        "max_new_tokens": 350,
        "frequence_penalty": 1,
        "temperature": 0.5,
        "top_k": 100,
        "top_p": 0.93,
        "seed": 42
      }
}'

Response

{
  "generated_text": "The current temperature in Boston, MA is 22 degrees Celsius.",
  "details": null,
  "total_time_taken": "0.43 sec",
  "prompt_tokens": 250,
  "generated_tokens": 14,
  "total_tokens": 264,
  "finish_reason": "stop_sequence"
}
