[GenAI][RAG] Prompt Engineering for Multimodal Model

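2024.06. Notes on the talk "LangGPT: Beyond Text, Innovative Practice of Multimodal Prompts in Large Models" (LangGPT：超越文字，多模态提示词在大模型中的创新实践) by 云中江树, author of LangGPT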

Limitations of multimodal models (8:20)

Extracting data with gpt-4o still produces a large number of hallucinations.

Insufficient scene understanding
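Omitted information
Misaligned information (values placed in the wrong position)
Fabricated content
Infinite loops (the model keeps repeating its output)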

How to write good multimodal prompts (10:15)

OpenAI prompt engineering

Include details in your query to get more relevant answers
Ask the model to adopt a persona (role play)
Use delimiters to clearly indicate distinct parts of the input
Specify the steps required to complete a task
Provide examples
Specify the desired length of the output

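State the task accurately and clearly
- For example: "Extract all of the information"
Have the model play an expert role
Provide examples (in-context learning)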
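
The tactics above can be combined into a single request payload. The following is a minimal sketch, not the speaker's code: the function name `build_multimodal_messages`, the `<task>`/`<example>` delimiters, and the persona wording are all assumptions made for illustration. The message shape (a `text` part plus an `image_url` part holding a base64 data URL) follows the OpenAI chat completions format.

```python
import base64


def build_multimodal_messages(image_bytes, task, example=None, mime="image/png"):
    """Build chat messages for one image using a persona, delimiters, and an optional example.

    Hypothetical helper: the name, tags, and wording are illustrative only.
    """
    # Encode the image as a data URL so it can be sent inline
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"

    # Persona: ask the model to act as a domain expert
    system = "You are an expert at reading document images and extracting structured data."

    # Clear task statement, separated from the rest of the prompt by delimiters
    text = f"<task>\n{task}\nExtract all of the information.\n</task>"

    # Optional in-context example of the expected output
    if example:
        text += f"\n<example>\n{example}\n</example>"

    return [
        {"role": "system", "content": system},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        },
    ]


# Placeholder bytes standing in for a real image file
fake_png = b"\x89PNG\r\n\x1a\n"
messages = build_multimodal_messages(
    fake_png,
    task="List every line item on this invoice.",
    example='{"item": "Widget", "qty": 2}',
)
print(messages[1]["content"][0]["text"])
```

The returned list can be passed as the `messages` argument of a chat completions call; in real use, `image_bytes` would come from reading the image file in binary mode.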

Specify the output format

The OpenAI API can also enforce an output format, for example JSON mode via the response_format parameter:

import os
from openai import AzureOpenAI

# Endpoint and key are read from environment variables
client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-03-01-preview",
)

response = client.chat.completions.create(
    # Must match the deployment name you chose for your 0125-Preview model deployment
    model="gpt-4-0125-Preview",
    # JSON mode: the reply is constrained to a valid JSON object
    response_format={"type": "json_object"},
    messages=[
        # JSON mode requires the word "JSON" to appear somewhere in the messages
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "Who won the world series in 2020?"},
    ],
)

# The JSON text is in the first choice's message content
print(response.choices[0].message.content)
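
JSON mode constrains the reply to valid JSON, but it does not guarantee which keys are present, so it can help to check the reply before using it. The sketch below is an illustration only: `parse_json_answer` and the key names are hypothetical, and the sample string is hand-written rather than real model output.

```python
import json


def parse_json_answer(content, required_keys):
    """Parse a JSON reply and list any required keys that are missing.

    Hypothetical helper; the key names used below are illustrative.
    """
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    return data, missing


# Hand-written sample standing in for response.choices[0].message.content
sample = '{"winner": "Los Angeles Dodgers", "year": 2020}'
data, missing = parse_json_answer(sample, ["winner", "year", "opponent"])
print(data["winner"], missing)
```

A non-empty `missing` list is one cheap signal of the information-omission problem noted earlier, and can trigger a retry or a manual review.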

Techniques unique to multimodal prompts

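Mark prompting: reducing hallucinations
- Add visual symbols to the image
  How can this be automated over a large amount of data?
Mark prompting: improving answer precision
- Use marks to point the model at the region it should focus on and interpret
  Same question: how can this be automated over a large amount of data?
Set-of-Mark prompting
- If the marks can be produced by automatic recognition, is the LLM even needed…?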
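
Once regions are available, numbering them and referring to those numbers in the prompt is mechanical. The following sketch assumes the regions are already known as `(x0, y0, x1, y1)` pixel boxes; the function names, the `row_tolerance` bucketing, and the prompt wording are assumptions for illustration. Drawing the numbers onto the image itself would need an image library and is left out.

```python
def number_regions(boxes, row_tolerance=10):
    """Assign 1-based mark numbers to (x0, y0, x1, y1) boxes in rough reading order.

    Boxes whose top edges fall in the same row_tolerance bucket count as one row.
    """
    ordered = sorted(boxes, key=lambda b: (b[1] // row_tolerance, b[0]))
    return {number: box for number, box in enumerate(ordered, start=1)}


def build_mark_prompt(marks, focus=None):
    """Write prompt text that refers to the numbered marks drawn on the image."""
    ids = ", ".join(f"[{number}]" for number in marks)
    lines = [
        f"The image contains numbered marks: {ids}.",
        "Answer only from what is visible inside each marked region.",
    ]
    if focus is not None:
        # Precision use case: direct the model to a single region
        lines.append(f"Focus on the region marked [{focus}] and describe it in detail.")
    else:
        # Coverage use case: walk through every mark so none is skipped
        lines.append("Describe each marked region in order, one line per mark.")
    return "\n".join(lines)


# Illustrative boxes, not taken from the talk
boxes = [(200, 12, 260, 40), (10, 15, 80, 45), (10, 120, 90, 160)]
marks = number_regions(boxes)
print(build_mark_prompt(marks, focus=2))
```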

How can marking be automated and standardized?

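Use general-purpose foundation vision models such as SAM to perform object detection and semantic segmentation, then mark the results
- Once marked, the model understands small objects much better
- Regions are interpreted by index number, which makes debugging easy and omissions less likely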
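
Turning segmentation output into marks mostly means deciding where each number goes. The sketch below does not call SAM or any real model: it assumes the masks have already been produced and are given as rows of 0/1 values, and the names `mask_to_mark`, `masks_to_marks`, and `min_area` are hypothetical.

```python
def mask_to_mark(mask):
    """Return (bbox, centroid, area) for a binary mask, or None if the mask is empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, value in enumerate(row) if value]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    bbox = (min(xs), min(ys), max(xs), max(ys))
    # The centroid is a simple choice for where to draw the mark number
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    return bbox, centroid, len(points)


def masks_to_marks(masks, min_area=1):
    """Number the masks top-to-bottom, left-to-right, skipping those below min_area."""
    found = []
    for mask in masks:
        result = mask_to_mark(mask)
        if result is not None and result[2] >= min_area:
            found.append(result)
    found.sort(key=lambda item: (item[0][1], item[0][0]))
    return {
        number: {"bbox": bbox, "label_at": centroid, "area": area}
        for number, (bbox, centroid, area) in enumerate(found, start=1)
    }


# Two tiny hand-made masks standing in for segmentation output
mask_a = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
mask_b = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
marks = masks_to_marks([mask_a, mask_b], min_area=2)
print(marks)
```

`min_area` is only an optional noise filter; since the notes point out that marking helps with small objects, it is probably best kept low.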

How can this be applied to tabular data, marking every field?
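
The notes leave this question open. One possible direction, offered only as an assumption and not as something from the talk: if the row and column boundaries of the table are known (for example from a layout detector), each cell can get its own mark number, and the model's answer can then be checked for cells it skipped. The names `cell_marks` and `missing_marks` and the sample answer are hypothetical.

```python
def cell_marks(row_edges, col_edges):
    """Map mark numbers to cell boxes (left, top, right, bottom), row by row."""
    marks = {}
    number = 1
    for top, bottom in zip(row_edges, row_edges[1:]):
        for left, right in zip(col_edges, col_edges[1:]):
            marks[number] = (left, top, right, bottom)
            number += 1
    return marks


def missing_marks(marks, answer):
    """Return the mark numbers that the model's answer did not cover.

    answer is a dict keyed by mark number as a string, as parsed from JSON.
    """
    answered = {int(key) for key in answer if str(key).isdigit()}
    return sorted(set(marks) - answered)


# A 2 x 3 table described by illustrative pixel boundaries
marks = cell_marks(row_edges=[0, 30, 60], col_edges=[0, 100, 200, 300])

# Hand-written answer standing in for a parsed model reply; mark 5 is absent
answer = {"1": "Name", "2": "Qty", "3": "Price", "4": "Widget", "6": "9.50"}
print(missing_marks(marks, answer))
```

Because every cell has a number, a skipped cell shows up as a missing key rather than as a silently shifted column, which addresses the omission and misalignment failures listed earlier.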
