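# Build an AI Discord Chatbot?

[TOC]

:::success
**[English version available!](/@stanley2058/SJ64aM_2gg)**
{%preview https://hackmd.io/@stanley2058/SJ64aM_2gg %}

**Skip to part two?**
{%preview https://hackmd.io/@stanley2058/ByISgSFhee %}

Special thanks to [llmcord](https://github.com/jakobdylanc/llmcord)
:::

## 0. It's 2025, who still builds Discord chatbots?

It is precisely because it's 2025 that we should build a Discord chatbot! Look at what we have now:

- Large language models (LLMs)
- Simple, convenient web tooling
- A diverse tool ecosystem (tool calling, MCP)

All you need is a frontend and you can build your own ~~digital husband/wife~~ chatbot! That frontend has to handle server-side streaming without falling over and has to work everywhere, though, so the requirements are actually quite strict. What if I told you that you can get a perfect chat interface, with cross-platform desktop and mobile support, without writing a single line of frontend code? That's right: Discord!

## 1. How do you chat?

No, nobody is here to discuss conversation techniques. The first step in building a chatbot is having a bot that can chat (obviously). Older chatbots were mostly rule-based: they could only answer simple questions, or relied on vector lookups and intent guessing to decide what kind of reply to give. Now that we have LLMs, we can simply hook up to any major provider's API and get a high-quality chat experience right away.

### The bot's brain

OpenAI, for example, offers many kinds of APIs:

![gpt-5 api types](https://hackmd.io/_uploads/S1K_-hvhee.png)

For text conversations there are two main options:

1. Chat completions (<- we use this one)
    - Stateless: every request has to resend the whole history, including content, attachments, and tool call results. Practically every provider that offers an OpenAI-compatible API supports it.
2. Responses
    - A newer API that can store the previous conversation and has better tool calling support. Almost only OpenAI's own API supports it.

Since we may want to test the chatbot against models from several providers, we will build on the chat completions API.

With the API chosen, the next question is how to talk to it. If we send a request exactly as the OpenAI docs describe:

```bash
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```

we get the whole reply back in one piece:

```json
{
  "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
  "object": "chat.completion",
  "created": 1741569952,
  "model": "gpt-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"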
        // other fields
      },
      "finish_reason": "stop"
    }
  ]
  // other fields
}
```

There is a small problem, though: larger models can take tens of seconds or even minutes to finish a complete reply. If we wait until the model has generated everything and then hand it to the user in one go, it feels like that person in a chat app who leaves you on read for 10 minutes and then sends a wall of text. Not a great experience.

Fortunately, the chat completions API accepts a `stream: true` option, which turns the reply from one big block into a stream that delivers a few characters at a time:

```json
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
....
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
```

Users see the text appear one piece (or a few pieces) at a time, so it feels like the reply starts immediately. A much better experience!

For developers, however, streaming is a bit more complicated. Receiving the whole response at once works like any ordinary API call: one request, one response, done. Streaming is quite different: because we only get a little at a time, we need a buffer that tracks what the reply looks like so far, while also pushing each received piece to the frontend for the user to see. That is a lot more code to write.

### The bot's nerves

To ~~write less code~~ connect the API to Discord more easily, we can use Vercel's [AI SDK](https://ai-sdk.dev) directly.

#### AI SDK

Basic usage looks like this:

```ts
import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";

const provider = createOpenAI({
  baseURL: "https://api.openai.com/v1",
  apiKey: "<API_KEY>",
});

const { text } = await generateText({
  model: provider('gpt-5'),
  prompt: "What is love?",
});

console.log(text);
```

Since we want streaming plus multi-turn conversations, though, it gets slightly more complicated:

```ts
import { streamText, type ModelMessage } from "ai";
import { createOpenAI } from "@ai-sdk/openai";

const provider = createOpenAI({
  baseURL: "https://api.openai.com/v1",
  apiKey: "<API_KEY>",
});

// Each later turn appends its new content to this array
const messages: ModelMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  {
    role: "user",
    content: "What is love?"
  },
];

const { textStream, response } = streamText({
  model: provider('gpt-5'),
  messages,
});

let buffer = "";
for await (const chunk of textStream) {
  buffer += chunk;
  // This is where we hand the chunk to the user, so the reply feels continuous
}

// The complete reply
console.log(buffer);

// Append every message the model produced this turn,
// so the next turn of the chat has the full context
const { messages: responseMessages } = await response;
messages.push(...responseMessages);
```

The most important part is actually the last line: the model needs the complete context to give correct, high-quality replies!

:::info
More on `response.messages`

The AI SDK kindly exposes a `messages` field on the response that holds everything new from this turn. `responseMessages` is an array, which is why we use `push(...)`. It may contain model replies or tool call results, and you can tell them apart by each entry's `role` field:

```ts
// Simplified: real entries carry more fields than shown here
[
  { role: "assistant", content: [{ type: "tool-call", toolName: "fetch" }] },
  { role: "tool", content: [{ type: "tool-result", toolName: "fetch", output: {} }] },
  { role: "assistant", content: [{ type: "text", text: "The page you requested is ..." }] },
]
```
:::

#### Multiple models

The examples above all use `gpt-5`, but `gpt-5` isn't really suited to chatting: first, it's too slow, and second, its wording is stiff. No problem, just swap it out!

```ts
import { createOpenAI } from "@ai-sdk/openai";
import { createXai } from "@ai-sdk/xai";
import { createOpenRouter } from "@openrouter/ai-sdk-provider";
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";

// Any of the models below can be passed as `model` to generateText / streamText.

// OpenAI
const openai = createOpenAI({ baseURL: "https://api.openai.com/v1", apiKey: "<API_KEY>" });
const gptChat = openai('gpt-5-chat-latest');

// Switch to Grok
const xai = createXai({ apiKey: "<API_KEY>" });
const grok = xai('grok-4-fast-non-reasoning');

// Use OpenRouter
const openrouter = createOpenRouter({ apiKey: "<API_KEY>" });
const gptViaOpenRouter = openrouter('openai/gpt-5-chat');
const claude = openrouter('anthropic/claude-sonnet-4');
const gemini = openrouter('google/gemini-2.5-pro');

// Any OpenAI-compatible model
const ollama = createOpenAICompatible({ name: "ollama", baseURL: "http://localhost:11434/v1" });
const llama = ollama('llama4');
```

This is the AI SDK's most convenient feature: you can freely swap the underlying model without touching a single line of the streaming code written above!

## 2. How do you Discord?

This is actually the hardest part of the whole process: we need to start a chat when a user ==@mentions the bot==, and stream the model's reply back to Discord as a reply under the user's message.

### Creating the bot

Let's start with the easiest part: creating a bot account.

1. 
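    First, go to the [Discord Developer Portal](https://discord.com/developers/applications) and create a new application.
2. Then open the Bot page from the left sidebar and create a bot:
    ![image](https://hackmd.io/_uploads/ryodVaD3xl.png)
    Be sure to enable the Message Content Intent, or you won't receive channel messages!
    ![image](https://hackmd.io/_uploads/SJxC46vnee.png)
3. Now you can invite the bot to your Discord server from the Installation page!
    (It's normal for the bot to show as offline after joining, since we haven't told Discord that our bot is alive yet.)

### Bringing the bot online

We use `discord.js` for everything Discord-related. First, connect to Discord:

```ts
import { Client, GatewayIntentBits, Partials } from "discord.js";

const client = new Client({
  intents: [
    GatewayIntentBits.Guilds,
    GatewayIntentBits.GuildMessages,
    GatewayIntentBits.GuildMessageReactions,
    GatewayIntentBits.DirectMessages, // only if the bot should accept DMs
    GatewayIntentBits.MessageContent,
  ],
  partials: [Partials.Channel],
});

// Actually open the connection.
// DISCORD_TOKEN is an assumed environment variable holding your bot token.
await client.login(process.env.DISCORD_TOKEN);
```

Then we can bring the bot online:

```ts
import { ActivityType } from "discord.js";

client.user?.setPresence({
  status: "online",
  // Set a custom status; note the 128-character limit
  activities: [{ type: ActivityType.Custom, name: "New bot online!" }],
});
```

Discord supports many more features, but for a simple bot this is enough for now.

### Receiving messages

Next up is receiving messages! First, register a `messageCreate` event handler:

```ts
client.on("messageCreate", (msg) => console.log(msg));
```

With this we are notified of every message in any channel of any server the bot is in (or in DMs). Not all of these messages concern our bot, though, so we should only react when a user ==@mentions the bot== or replies to one of the bot's messages.

```ts
import { ChannelType, type Message } from "discord.js";

function handleMessageCreate(msg: Message) {
  // Skip messages written by a bot, unless this bot is meant to reply to other bots.
  // Removing this line can cause an infinite loop of bots replying to each other!!!
  if (msg.author.bot) return;

  // Continue only for DMs or messages that @mention this bot.
  // `mentions` also counts the ping that comes with a reply; think of how
  // you would get notified if you were the one being @mentioned.
  const isDM = msg.channel.type === ChannelType.DM;
  if (!isDM && !msg.mentions.users.has(client.user!.id)) return;

  // At this point the message is most likely meant for us!
}
```

:::info
Tip: to get the whole conversation thread, you have to walk up the reply chain one message at a time.
- If you only need the `id`: `msg.reference?.messageId` gives the `id` of the message being replied to. If it is empty (`null` or `undefined`), this message is the head of the chain.
- If you need the whole message: `await msg.fetchReference().catch(() => null)`
:::

### Replying to messages

Before plugging in the AI brain we built earlier, we need to understand how to post a reply on Discord.

:::info
Plain message content is limited to 2000 characters and an embed description to 4096 characters, so watch the length!
:::

```ts
async function handleMessageCreate(msg: Message) {
  // earlier code

  const reply = await msg.reply({
    content: "Reply from the bot!!",
    allowedMentions: {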
      parse: [],
      repliedUser: false,
    },
  });
}
```

Now the bot replies to every message that ==@mentions the bot==!

:::info
To make replies a bit more fun, use an `Embed` to add some styling.

```ts
import { Colors, EmbedBuilder, type Message } from "discord.js";

async function handleMessageCreate(msg: Message) {
  // earlier code

  const emb = new EmbedBuilder();
  emb.setDescription("Hello! This is an embed message!");
  emb.setColor(Colors.Gold);

  const reply = await msg.reply({
    embeds: [emb],
    allowedMentions: { parse: [], repliedUser: false },
  });
}
```
:::

Because the AI reply streams in chunk by chunk, we can append new content by *editing* the message!

```ts
let buffer = "Hello! ";
const emb = new EmbedBuilder();
emb.setDescription(buffer);

const reply = await msg.reply({
  embeds: [emb],
  allowedMentions: { parse: [], repliedUser: false },
});

buffer += "This is an embed message!";
emb.setDescription(buffer);

// Edit!
await reply.edit({ embeds: [emb] });
```

### Plugging in the brain!!

Finally we can hook up the AI! Multimodal models can accept images and file attachments, but handling them is more involved, so here we only demonstrate plain text.

First, we need a function that turns a Discord message into a model message:

```ts
import { ComponentType, type Message } from "discord.js";
import type { ModelMessage, TextPart } from "ai";

// `client` is the discord.js Client created earlier
async function messageToModelMessages(msg: Message) {
  try {
    // The message itself
    const content = msg.content || "";

    // (optional) Text from embeds, if any
    const embedsText = msg.embeds
      .map((e) =>
        [e.title, e.description, e.footer?.text].filter(Boolean).join("\n"),
      )
      .filter((s) => s && s.length > 0) as string[];

    // (optional) Text from message components, if any
    const componentsText: string[] = [];
    for (const row of msg.components || []) {
      if (row.type === ComponentType.ActionRow) {
        for (const comp of row.components || []) {
          if ("label" in comp && typeof comp.label === "string") {
            componentsText.push(comp.label);
          }
        }
      }
    }

    const combinedText = [content, ...embedsText, ...componentsText]
      .filter(Boolean)
      .join("\n");

    // Messages written by this bot become "assistant", everything else "user"
    const role = msg.author.id === client.user?.id ? "assistant" : "user";
    const contentArray: TextPart[] = [{ type: "text", text: combinedText }];

    // The message this one replies to, or null at the head of the chain
    const parent = await msg.fetchReference().catch(() => null);

    if (role === "user") {
      // (optional) Prefix the name so the model knows who is who
      const userId = msg.author.id;
      for (const c of contentArray) {
        c.text = `[name=${userId}]: ${c.text}`;
      }

      return {
        parent,
        message: { role, content: contentArray } satisfies ModelMessage,
      };
    } else {
      return {
        parent,
        message: { role, content: contentArray } satisfies ModelMessage,
      };
    }
  } catch (e) {
    console.error(e);
  }
  return { parent: null };
}
```

Next, handle the full flow: receive a message -> convert it to model messages -> feed the model -> stream the reply to Discord:

:::info
1. Because we walk upward from the newest Discord message, the `messages` array ends up ordered newest to oldest. Remember to reverse it before passing it to the model.
2. When handling text only, all the information lives on Discord, so there is no need to store the model's response messages. To send images or call tools, part of the data has to be stored.
:::

```ts
import { streamText, type ModelMessage } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { ChannelType, EmbedBuilder, type Message } from "discord.js";

const provider = createOpenAI({
  baseURL: "https://api.openai.com/v1",
  apiKey: "<API_KEY>",
});

async function handleMessageCreate(msg: Message) {
  if (msg.author.bot) return;

  const isDM = msg.channel.type === ChannelType.DM;
  if (!isDM && !msg.mentions.users.has(client.user!.id)) return;

  // Cap how many messages the model sees, to avoid spending too much :)
  const maxMessages = 25;
  let currMsg: Message | null = msg;
  const messages: ModelMessage[] = []; // new -> old
  while (currMsg && messages.length < maxMessages) {
    const { parent, message } = await messageToModelMessages(currMsg);
    if (message) messages.push(message);
    currMsg = parent;
  }

  // Add the system prompt that tells the model what to do.
  // It is pushed last because the array is reversed below.
  messages.push({
    role: "system",
    content:
      "You are a helpful assistant.\nUser's names are their Discord IDs and should be typed as '<@ID>'.",
  });

  // Tell Discord that we are typing
  if ("sendTyping" in msg.channel) await msg.channel.sendTyping();

  const { textStream, finishReason } = streamText({
    model: provider('gpt-5-chat-latest'),
    messages: messages.reverse(),
  });

  let lastSentAt = 0;
  let buffer = "";
  const reply = await msg.reply({
    // An embed with no content is likely to be rejected, so start with a placeholder
    embeds: [new EmbedBuilder().setDescription("⚪")],
    allowedMentions: { parse: [], repliedUser: false },
  });

  for await (const chunk of
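  textStream) {
    buffer += chunk;

    // Update at most once per second to stay clear of Discord's rate limits
    const now = Date.now();
    if (now - lastSentAt < 1000) continue;
    lastSentAt = now;

    const emb = new EmbedBuilder();
    emb.setDescription(buffer + " ⚪");
    await reply.edit({ embeds: [emb] });
  }

  // Make sure the final content is correct
  const emb = new EmbedBuilder();
  emb.setDescription(buffer);
  await reply.edit({ embeds: [emb] });

  // For debugging: this is `stop` when the stream finished normally
  console.log(await finishReason);
}
```

## 3. Is that it?

Of course not! We haven't added tools and MCP yet! But this article is already getting a bit long, so the next part covers how to wire in tools (including for models without native tool calling support), so that the model can actually do things besides chatting!

:::success
Continue to part two to add tools!
{%preview https://hackmd.io/@stanley2058/ByISgSFhee %}
:::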
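
One loose end from section 2: the streaming loop writes the whole buffer into a single embed, but an embed description is capped at 4096 characters, so a long reply will eventually be rejected. Below is a minimal sketch of one way to deal with it, assuming you are fine with spreading a long reply over several pieces. `splitForEmbeds` and `EMBED_DESCRIPTION_LIMIT` are hypothetical names made up for this example; they are not part of discord.js or the AI SDK.

```typescript
// Assumed constant: the cap for a single embed description.
const EMBED_DESCRIPTION_LIMIT = 4096;

/**
 * Split `text` into pieces of at most `limit` UTF-16 code units.
 * Prefers cutting right after the last newline inside each window and
 * falls back to a hard cut when there is no usable newline.
 * Caveat: a hard cut can land inside a surrogate pair (e.g. an emoji).
 */
function splitForEmbeds(
  text: string,
  limit: number = EMBED_DESCRIPTION_LIMIT,
): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");

  const parts: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Last newline within the first `limit` characters, or -1 if none.
    const cut = rest.lastIndexOf("\n", limit - 1);
    // Keep the newline with the earlier piece; hard cut otherwise.
    const end = cut > 0 ? cut + 1 : limit;
    parts.push(rest.slice(0, end));
    rest = rest.slice(end);
  }
  if (rest.length > 0) parts.push(rest);
  return parts;
}

// A 5000-character reply becomes two pieces.
console.log(splitForEmbeds("a".repeat(5000)).map((p) => p.length)); // [ 4096, 904 ]
```

In the streaming loop, one option would be to keep editing the first reply with `parts[0]` and send each further piece as its own follow-up message, since Discord also limits the combined size of all embeds in one message.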