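ChatGPT API Programming

ChatGPT is a product; its chat capability actually comes from OpenAI's models, and OpenAI opens those capabilities to developers as the OpenAI API. For developers, then, building AI chat means working with the OpenAI API.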

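Differences Between GPT-3.5 and GPT-4

For casual conversation there is little difference between GPT-3.5 and GPT-4. The difference shows once a task reaches a certain level of complexity: GPT-4 is more reliable and more creative than GPT-3.5, and it can follow more nuanced instructions (it is also more expensive XD).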

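Official ChatGPT Pricing

Note:
All prices on OpenAI's site are quoted in US dollars, and checkout is also charged in US dollars, so take care when paying.

Pay-As-You-Go

Billing is based on usage, and the price varies by model.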

GPT-4

	Price	Tokens
Input (8K)	$0.03	1K Tokens
Output (8K)	$0.06	1K Tokens
Input (32K)	$0.06	1K Tokens
Output (32K)	$0.12	1K Tokens

GPT-3.5 Turbo

	Price	Tokens
Input (4K)	$0.0015	1K Tokens
Output (4K)	$0.002	1K Tokens
Input (16K)	$0.003	1K Tokens
Output (16K)	$0.004	1K Tokens

Fine-tuning models

	Price	Tokens
Training	$0.008	1K Tokens
Input usage	$0.012	1K Tokens
Output usage	$0.016	1K Tokens

Embedding models

An embedding model converts content into a vector representation.

	Price	Tokens
Usage	$0.0001	1K Tokens

Source: https://openai.com/pricing#language-models
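To make the pay-as-you-go arithmetic concrete, here is a minimal sketch that estimates the cost of one request from the GPT-4 prices in the table above. The function `estimate_cost` and the `PRICES_PER_1K` dictionary are illustrative helpers, not part of any OpenAI library; the dictionary keys are labels for the table rows rather than official model IDs, and the prices are the snapshot listed in this document, which may have changed since.

```python
# Prices in USD per 1K tokens, copied from the GPT-4 table above.
# They are a snapshot from this document and may be out of date.
# The keys are labels for the table rows, not official model IDs.
PRICES_PER_1K = {
    "gpt-4-8k": {"input": 0.03, "output": 0.06},
    "gpt-4-32k": {"input": 0.06, "output": 0.12},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a single request."""
    price = PRICES_PER_1K[model]
    input_cost = (input_tokens / 1000) * price["input"]
    output_cost = (output_tokens / 1000) * price["output"]
    return input_cost + output_cost


if __name__ == "__main__":
    # 1,000 input tokens and 500 output tokens on the GPT-4 8K tier.
    print(f"${estimate_cost('gpt-4-8k', 1000, 500):.4f}")
```

Input and output tokens are priced separately, so a short prompt that produces a long answer can cost more than a long prompt with a short answer.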

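Subscription

Starting February 10, 2023, ChatGPT added a subscription option for the paid version, ChatGPT Plus.

Cost

US$20 per month

Features

Use of the latest GPT-4 model and most other models at no extra charge.
Faster response times.
Priority access to beta features.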

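Enterprise

For commercial use with higher usage volumes, an Enterprise plan is also available. It currently has no fixed price: you request a quote, and the price depends on how your company will use it.

Its features are as follows:

Priority access to GPT-4 for enterprises, with no usage cap.
Runs up to twice as fast as the standard GPT-4.
A longer context window that accepts up to 32,000 tokens (roughly 25,000 words) of input.
Stronger security and deployment features:
- Customer prompts and input data are not used by OpenAI to train models.
- Additional features such as single sign-on, analytics, and a dashboard.
Other customization options.

Additional notes:

ChatGPT Enterprise website: https://openai.com/enterprise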

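What Is a Token

A token is the basic unit in which GPT processes content. A token can be a single character, a word, or a character in a particular language. Tokens turn the input content into a data format that GPT can process.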

As a rule of thumb, 1 token is roughly 4 English characters, or about three quarters of an English word.

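Each GPT model has a preset maximum number of tokens. For example, GPT-3.5 can process about 4,000 (4K) tokens per call, while the 16K version can process 16,000 tokens.

The GPT-4 32K version allows 32,000 tokens. This count includes all tokens in both the user's input and GPT's output.

Official tokenizer tool: https://platform.openai.com/tokenizer

Source: https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them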
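The rule of thumb and the shared input/output limit described above can be sketched in a few lines. This is only a rough estimate for English text, based on the "about 4 characters per token" heuristic; the helper names `estimate_tokens` and `fits_context` are made up for illustration, and exact counts should come from the official tokenizer tool.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 English characters per token heuristic."""
    return math.ceil(len(text) / 4)


def fits_context(prompt_tokens: int, max_output_tokens: int, context_limit: int) -> bool:
    """Input and output tokens share one limit, so check their sum against it."""
    return prompt_tokens + max_output_tokens <= context_limit


if __name__ == "__main__":
    prompt = "Say this is a test!"
    print(estimate_tokens(prompt))
    # A 3,000-token prompt leaves no room for a 1,500-token answer in a 4K model.
    print(fits_context(3000, 1500, 4000))
```

Because the limit covers input and output together, a long prompt directly reduces the space left for the answer.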

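Rate Limits

The ChatGPT API sets different rate limits for different models. Limits are measured in three units:

TPM (tokens per minute)
RPD (requests per day)
RPM (requests per minute)

When you call the API faster than the limit allows, you get an error like this: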

Rate limit reached for default-text-davinci-002 in organization org-{id} on requests per min. Limit: 20.000000 / min. Current: 24.000000 / min.

Reference: https://platform.openai.com/account/rate-limits
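A common way to cope with rate-limit errors is to retry with exponential backoff. The sketch below is a generic helper under stated assumptions: `RateLimitError` is a stand-in exception defined here so the example runs on its own, and `call_with_backoff` is a hypothetical name. With the pre-1.0 `openai` package you would likely catch `openai.error.RateLimitError` instead; treat that as an assumption to verify against your installed version.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the rate-limit error raised by an API client (hypothetical)."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff when it raises RateLimitError."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: let the caller see the error.
            # Wait base_delay * 2^attempt seconds, plus a little random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_call():
        # Simulated API call that is rate limited on the first two attempts.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RateLimitError("Rate limit reached")
        return "ok"

    print(call_with_backoff(flaky_call, base_delay=0.01))
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` is fine.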

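AI Detection

Many students now use ChatGPT to write assignments, papers, and so on for them, so many schools have explicitly banned it. Whether ChatGPT should be banned is a separate question worth discussing.

OpenAI also released a tool, AI Text Classifier (https://platform.openai.com/ai-text-classifier), to detect whether a text was written by AI, but it needs more than 1,000 characters before it can make a judgment.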

Note:

As of July 20, 2023, OpenAI has taken the tool offline (https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text) because it considered the accuracy too low.

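Getting a ChatGPT API Key

Create a ChatGPT account
- Go to the ChatGPT website (https://chat.openai.com/auth/login) and sign up.
  
  Sign in = log in to an existing account
  
  Sign up = register a new account
  
  Registration requires an email address and a mobile phone number.
Get an API key - https://platform.openai.com/account/api-keys

Step 1: Select "API Key" in the left-hand menu
Step 2: Click "Create new secret key" to create a new key

The Name field sets a name for the key; it is optional and can be left blank
Copy the newly created key

Note:

The key is shown only once. After you close the dialog you cannot view it again, so store it somewhere safe.

Do not leak your API key, or other people could use it to consume your token quota.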

Testing the API Key

Official documentation: https://platform.openai.com/docs/api-reference/making-requests

$ curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer xxxxxxx" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Additional notes:

macOS already ships with curl; on Windows, install the curl tool manually: https://curl.se/download.html

Response:

{
  "id": "chatcmpl-87lbx7s9PFt0Pu79llGD9Dsg7Xfgo",
  "object": "chat.completion",
  "created": 1696862005,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This is a test!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 5,
    "total_tokens": 18
  }
}
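The fields that matter most in this response are the assistant's text and the token usage. As a minimal sketch, the following parses a trimmed copy of the sample response above; `extract_reply` and `total_tokens` are illustrative helper names, not functions from the OpenAI library.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "This is a test!"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 13, "completion_tokens": 5, "total_tokens": 18}
}
"""


def extract_reply(response: dict) -> str:
    """Return the assistant's text from the first choice."""
    return response["choices"][0]["message"]["content"]


def total_tokens(response: dict) -> int:
    """Return the total number of tokens counted for this request."""
    return response["usage"]["total_tokens"]


if __name__ == "__main__":
    data = json.loads(SAMPLE_RESPONSE)
    print(extract_reply(data))  # This is a test!
    print(total_tokens(data))   # 18
```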

The Organization ID can be found here: https://platform.openai.com/account/org-settings

Set the API key as an environment variable:

macOS/Linux
```
$ export OPENAI_API_KEY={你的API Key}
```
To keep the setting after a reboot, add the export line to .bash_profile; to apply it immediately, run: $ source .bash_profile

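Windows
1. In Search, search for and select: System (Control Panel)
2. Click the Advanced system settings link.
3. Click Environment Variables.
4. Add a new variable named OPENAI_API_KEY with your API key as its value.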
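API Overview

Official Documentation

https://platform.openai.com/docs/api-reference

API Endpoint Types

Endpoint	Description
Audio	Speech recognition
Chat	Chat API
Completions	Text completion API (legacy)
Embeddings	Get the vector representation of text
Fine-tuning	Fine-tune your own model
Files	File operations
Images	Image generation
Models	Get information about the available models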

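Developing Python Programs

Install the openai package

The examples below use the pre-1.0 interface of the openai package (for example openai.ChatCompletion), which was removed in version 1.0, so install a release older than 1.0:

$ pip3 install "openai<1.0"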

Hello OpenAI

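import openai

# Placeholders: replace with your own organization ID and API key.
openai.organization = 'xxxxxxxxxxxx'
openai.api_key = 'xxxxxxxxxxxx'

# Send a single user message ("你好" means "Hello") to the chat model.
completion = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "user", "content": "你好"}
  ]
)

# Print the text of the first reply.
print(completion.choices[0].message.content)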

Translate

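import openai

# Placeholders: replace with your own organization ID and API key.
openai.organization = 'xxxxxxxxxxxx'
openai.api_key = 'xxxxxxxxxxxx'

# The system message tells the model to answer in Traditional Chinese,
# so the English user message comes back translated.
completion = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "Respond in Traditional Chinese"},
    {"role": "user", "content": "Hello"}
  ]
)

print(completion.choices[0].message.content)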
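The example above relies on two roles: a system message that sets the behaviour and a user message that carries the text. A small helper can build that list for any target language. This is only a sketch: `build_translation_messages` is a hypothetical name and the wording of the system prompt is an assumption, not an official recommendation.

```python
def build_translation_messages(text, target_language="Traditional Chinese"):
    """Build the messages list for a chat request that translates text."""
    return [
        # The system message sets the behaviour for the whole conversation.
        {"role": "system", "content": f"Translate the user's message into {target_language}."},
        # The user message carries the text to translate.
        {"role": "user", "content": text},
    ]


if __name__ == "__main__":
    for message in build_translation_messages("Hello"):
        print(message["role"], "->", message["content"])
```

The returned list can be passed as the `messages` argument of the chat call shown above.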

Speech Recognition

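import openai

# Placeholders: replace with your own organization ID and API key.
openai.organization = 'xxxxxxxxxxxx'
openai.api_key = 'xxxxxxxxxxxx'

# Open the audio file in binary mode; the with block closes it afterwards.
with open("1.mp3", "rb") as audio_file:
  # "whisper-1" is the speech-to-text model.
  transcript = openai.Audio.transcribe("whisper-1", audio_file)

# Print the recognized text.
print(transcript.text)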

References:

Official documentation: https://platform.openai.com/docs/guides/speech-to-text

Text-to-speech tool: https://ttsmaker.com/zh-hk

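Fine Tuning

Use the fine-tuning API to fine-tune your own model so that ChatGPT's answers fit your needs more closely.

Step 1: Prepare the data

Prepare the data used to train the model. It must be in JSONL (JSON Lines) format.

For example (the system message "中央大學" means National Central University, and each user message asks "who is the most ...?"):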

{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最棒"}, {"role": "assistant", "content": "Aaron1"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最帥"}, {"role": "assistant", "content": "Aaron2"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最美"}, {"role": "assistant", "content": "Aaron3"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最高"}, {"role": "assistant", "content": "Aaron4"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最矮"}, {"role": "assistant", "content": "Aaron5"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最酷"}, {"role": "assistant", "content": "Aaron6"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最壞"}, {"role": "assistant", "content": "Aaron7"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最貴"}, {"role": "assistant", "content": "Aaron8"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最好"}, {"role": "assistant", "content": "Aaron9"}]}
{"messages": [{"role": "system", "content": "中央大學"}, {"role": "user", "content": "誰最嚴"}, {"role": "assistant", "content": "Aaron0"}]}

Additional notes:

What is the JSONL format?

JSONL, short for JSON Lines, is a format that stores data line by line: every line is one complete JSON object.
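Writing JSONL by hand is error-prone, so it can help to generate and check the file in code. The sketch below serializes examples one JSON object per line and runs a few sanity checks. The helper names `to_jsonl` and `validate_jsonl` are hypothetical, and the checks (a non-empty `messages` list that contains an assistant message) are assumptions based on the example data above, not the official validation rules.

```python
import json


def to_jsonl(examples):
    """Serialize training examples to JSONL text, one JSON object per line."""
    # ensure_ascii=False keeps non-ASCII text (such as Chinese) readable.
    return "\n".join(json.dumps(example, ensure_ascii=False) for example in examples) + "\n"


def validate_jsonl(text):
    """Return a list of problems found; an empty list means every line looks valid."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # Ignore blank lines.
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: missing 'messages' list")
            continue
        roles = [m.get("role") for m in messages if isinstance(m, dict)]
        if "assistant" not in roles:
            problems.append(f"line {number}: no assistant message to learn from")
    return problems


if __name__ == "__main__":
    examples = [
        {"messages": [
            {"role": "system", "content": "中央大學"},
            {"role": "user", "content": "誰最帥"},
            {"role": "assistant", "content": "Aaron2"},
        ]},
    ]
    text = to_jsonl(examples)
    print(validate_jsonl(text))            # []
    print(validate_jsonl("{not json}\n"))  # one problem reported for line 1
```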

Step 2: Upload the file

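import openai

# Upload the training data; the with block closes the file afterwards.
with open("mydata.jsonl", "rb") as data_file:
  f = openai.File.create(
    file=data_file,
    purpose='fine-tune'
  )

# The file ID is needed when creating the fine-tuning job.
print(f.id)

# List the existing fine-tuning jobs.
print(openai.FineTuningJob.list())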

Step 3: Create the fine-tuning job


job = openai.FineTuningJob.create(training_file=f.id, model="gpt-3.5-turbo")

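Step 4: Track the status

Once the job is created, the model starts training. Training takes a while, and the program does not block waiting for it; the call returns immediately, so you have to check the job's status to follow the training progress.

# Show the 10 most recent events for this job.
print(openai.FineTuningJob.list_events(id=job.id, limit=10))

# Show the job's current status and details.
print(openai.FineTuningJob.retrieve(job.id))

Until training finishes, the fine_tuned_model field stays null: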

{
  "object": "list",
  "data": [
    {
      "object": "fine_tuning.job",
      "id": "ftjob-GG4grjQ3OW9ILFSkkcs8drG5",
      "model": "gpt-3.5-turbo-0613",
      "created_at": 1697811663,
      "finished_at": null,
      "fine_tuned_model": null,
      "organization_id": "org-DvHsvX68qd9zjNMZDerRE3VJ",
      "result_files": [],
      "status": "validating_files",
      "validation_file": null,
      "training_file": "file-SPD3OYi7lsVbkrOQqNghj0Rv",
      "hyperparameters": {
        "n_epochs": "auto"
      },
      "trained_tokens": null,
      "error": null
    }
  ],
  "has_more": false
}

Event list:

{
  "object": "list",
  "data": [
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-8lp6JYMJB4XLShUI3RswBbcS",
      "created_at": 1697813747,
      "level": "info",
      "message": "Validating training file: file-elRQ1zlqUWLuscMJttWWgdCn",
      "data": {},
      "type": "message"
    },
    {
      "object": "fine_tuning.job.event",
      "id": "ftevent-LFtGMEbfikbuJ7oToWfbT2TD",
      "created_at": 1697813747,
      "level": "info",
      "message": "Created fine-tuning job: ftjob-zrGoLkWiCO6YTVSEf3XMUz4T",
      "data": {},
      "type": "message"
    }
  ],
  "has_more": false
}

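Additional notes:

You also receive an email notification with the training result.

You can also check the training progress directly on the fine-tuning page: https://platform.openai.com/finetune

Step 5: Use the fine-tuned model

Get the model information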
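Since the job runs in the background, a simple polling loop is one way to wait for it. The sketch below takes the retrieve function as a parameter so it can run here against a fake; `wait_for_job` is a hypothetical helper, and the set of terminal statuses ("succeeded", "failed", "cancelled") is an assumption to check against the API documentation.

```python
import time

# Statuses after which a job no longer changes (assumed, see the note above).
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(retrieve, job_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll retrieve(job_id) until the job reaches a terminal status, then return it."""
    for _ in range(max_polls):
        job = retrieve(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Fake retrieve function that replays a fixed sequence of job states.
    states = iter([
        {"status": "validating_files", "fine_tuned_model": None},
        {"status": "running", "fine_tuned_model": None},
        {"status": "succeeded", "fine_tuned_model": "ft:gpt-3.5-turbo-0613:personal::8BliK0o7"},
    ])
    job = wait_for_job(lambda job_id: next(states), "ftjob-example", poll_seconds=0)
    print(job["status"], job["fine_tuned_model"])
```

With the real library, passing `openai.FineTuningJob.retrieve` as the `retrieve` argument should work the same way, assuming the returned object supports dictionary-style access.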

{
  "object": "fine_tuning.job",
  "id": "ftjob-zrGoLkWiCO6YTVSEf3XMUz4T",
  "model": "gpt-3.5-turbo-0613",
  "created_at": 1697813747,
  "finished_at": 1697815710,
  "fine_tuned_model": "ft:gpt-3.5-turbo-0613:personal::8BliK0o7",
  "organization_id": "org-DvHsvX68qd9zjNMZDerRE3VJ",
  "result_files": [
    "file-LiTPyInDa5rZukMjHGYjRTwi"
  ],
  "status": "succeeded",
  "validation_file": null,
  "training_file": "file-elRQ1zlqUWLuscMJttWWgdCn",
  "hyperparameters": {
    "n_epochs": 10
  },
  "trained_tokens": 2600,
  "error": null
}

Before training

completion = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "中央大學"},
    {"role": "user", "content": "誰最帥"}
  ],
  temperature=1
)

print(completion.choices[0].message.content)

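Response

這是一個主觀問題，每個人對於帥的標準可能不同。無法確定誰是最帥的人，因為每個人的觀感可能會有所不同。

(In English: "This is a subjective question; everyone may have a different standard for what counts as handsome. It is impossible to say who is the most handsome, because each person's perception may differ.")

After training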

completion = openai.ChatCompletion.create(
  model="ft:gpt-3.5-turbo-0613:personal::8BliK0o7",
  messages=[
    {"role": "system", "content": "中央大學"},
    {"role": "user", "content": "誰最帥"}
  ],
  temperature=0
)

print(completion.choices[0].message.content)

Response

Aaron2

Image Generation

r = openai.Image.create(
  prompt="日本",
  n=2,
  size="1024x1024"
)

print(r)
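The snippet above prints the whole response object. Assuming the response carries a `data` list whose items each have a `url` field (which, to my knowledge, is the default response format), the image links can be pulled out as sketched below. `image_urls` is a hypothetical helper and the sample response uses placeholder URLs.

```python
def image_urls(response):
    """Return the URL of every generated image in an image-generation response."""
    return [item["url"] for item in response.get("data", []) if "url" in item]


if __name__ == "__main__":
    # Hypothetical response with the assumed shape; the URLs are placeholders.
    sample = {
        "created": 1697815710,
        "data": [
            {"url": "https://example.com/image-1.png"},
            {"url": "https://example.com/image-2.png"},
        ],
    }
    for url in image_urls(sample):
        print(url)
```

With `n=2` in the request, the list is expected to contain two URLs.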

Q&A

The API returns the message "You exceeded your current quota, please check your plan and billing details", for example:

{
    "error": {
        "message": "You exceeded your current quota, please check your plan and billing details.",
        "type": "insufficient_quota",
        "param": null,
        "code": "insufficient_quota"
    }
}

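Answer:

All API calls are billed by token. A new account comes with US$18 of credit that is valid for three months; once the credit expires or is used up, you have to pay to keep using the API.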
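A program can recognize this error by looking at the `code` and `type` fields of the error body shown above. The following sketch does that with the standard library only; `is_quota_error` is a hypothetical helper name.

```python
import json


def is_quota_error(body: str) -> bool:
    """Return True if an API error body reports an exhausted quota."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # Not JSON, so not a recognizable API error.
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return False
    return "insufficient_quota" in (error.get("code"), error.get("type"))


if __name__ == "__main__":
    body = """
    {
        "error": {
            "message": "You exceeded your current quota, please check your plan and billing details.",
            "type": "insufficient_quota",
            "param": null,
            "code": "insufficient_quota"
        }
    }
    """
    print(is_quota_error(body))  # True
```

Unlike a per-minute rate limit, a quota error does not clear by waiting, so retrying the request is pointless until billing is sorted out.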