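# Detecting Inappropriate Content with OpenAI's Moderation Model
## Introduction
The Moderation model is a free tool provided by OpenAI for screening so-called "inappropriate content". See https://openai.com/policies/usage-policies for the detailed list of prohibited uses.
At present the tool works best on English text; support for other languages may be noticeably weaker.
With it, users can identify inappropriate content and act on it, for example by **filtering** out the message.
The moderation endpoint flags the following categories of content: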
|CATEGORY|DESCRIPTION|
|-|-|
|hate|Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is not covered by this category.|
|hate/threatening|Hateful content that also includes violence or serious harm towards the targeted group.|
|self-harm|Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.|
|sexual|Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).|
|sexual/minors|Sexual content that includes an individual who is under 18 years old.|
|violence|Content that promotes or glorifies violence or celebrates the suffering or humiliation of others.|
|violence/graphic|Violent content that depicts death, violence, or serious physical injury in extreme graphic detail.|
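
As a minimal sketch of the **filtering** step mentioned above: assuming each message has already been checked and paired with its moderation result (a dict with a boolean `flagged` field, as in the response shown in the usage section), dropping flagged messages could look like this. The helper name `keep_unflagged` and the hand-written results are illustrative assumptions, not part of the OpenAI API.

```python
from typing import Dict, List


def keep_unflagged(messages: List[str], results: List[Dict]) -> List[str]:
    """Return only the messages whose moderation result is not flagged.

    Assumes `results` holds one moderation result per message, in the
    same order, and that each result has a boolean `flagged` field.
    """
    if len(messages) != len(results):
        raise ValueError("expected one moderation result per message")
    return [msg for msg, res in zip(messages, results) if not res["flagged"]]


# Hand-written results for illustration only; real ones come from the API.
messages = ["Hello there.", "I want to kill them."]
results = [{"flagged": False}, {"flagged": True}]
print(keep_unflagged(messages, results))  # ['Hello there.']
```

Filtering is only one possible reaction; the same check could just as well route a message to human review instead of discarding it.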
## Usage
```
curl https://api.openai.com/v1/moderations \
-X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{"input": "Sample text goes here"}'
```
```python=
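import os

import requests

# Read the API key from the environment; a missing key raises KeyError
# instead of silently sending "Bearer None".
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
}
# The text to be checked by the moderation endpoint.
data = {
    "input": "I want to kill them."
}
response = requests.post(
    "https://api.openai.com/v1/moderations",
    headers=headers,
    json=data,
    timeout=30,
)
# Fail loudly on HTTP errors (e.g. an invalid API key).
response.raise_for_status()
# Print the parsed JSON response
print(response.json())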
```
output:
```
{'id': 'modr-7PTCmG5D6bTT5hTbBwJEOjVoGfBa0',
'model': 'text-moderation-004',
'results': [{'flagged': True,
'categories': {'sexual': False,
'hate': False,
'violence': True,
'self-harm': False,
'sexual/minors': False,
'hate/threatening': False,
'violence/graphic': False},
'category_scores': {'sexual': 9.530887e-07,
'hate': 0.18386647,
'violence': 0.8870859,
'self-harm': 1.7594473e-09,
'sexual/minors': 1.3112696e-08,
'hate/threatening': 0.003258761,
'violence/graphic': 3.173159e-08}}]}
```
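As the output shows, the predicted score for the **violence** category is high (about 0.89), and the input is flagged accordingly.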
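
To read such a response programmatically, the flagged categories can be pulled out together with their scores. The following is a small sketch that assumes the response has the shape shown above; `flagged_categories` is a hypothetical helper name, and the sample dict is copied from that output rather than fetched from the API.

```python
def flagged_categories(response_json):
    """Return (category, score) pairs for every flagged category,
    highest score first, from the first entry in `results`."""
    result = response_json["results"][0]
    scores = result["category_scores"]
    flagged = [
        (name, scores[name])
        for name, is_flagged in result["categories"].items()
        if is_flagged
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Sample copied from the output above (IDs omitted).
sample = {
    "results": [{
        "flagged": True,
        "categories": {
            "sexual": False, "hate": False, "violence": True,
            "self-harm": False, "sexual/minors": False,
            "hate/threatening": False, "violence/graphic": False,
        },
        "category_scores": {
            "sexual": 9.530887e-07, "hate": 0.18386647,
            "violence": 0.8870859, "self-harm": 1.7594473e-09,
            "sexual/minors": 1.3112696e-08,
            "hate/threatening": 0.003258761,
            "violence/graphic": 3.173159e-08,
        },
    }]
}
print(flagged_categories(sample))  # [('violence', 0.8870859)]
```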
<br/>
---
## Usability in Other Languages
For me, what matters most is how well it works on Chinese.