LangChain for LLM Application Development: Course Notes


deeplearning.ai LangChain: Memory

Outline

  • ConversationBufferMemory
  • ConversationBufferWindowMemory
  • ConversationTokenBufferMemory
  • ConversationSummaryBufferMemory
  • VectorStoreRetrieverMemory
LangChain's memory-handling mechanisms

Source: langchain/docs/modules/memory

Memory in language models

  • Why memory matters: when you interact with a language model, the model does not remember earlier turns. This is a problem for building conversational applications such as chatbots.

  • LangChain memory features: LangChain provides several ways to manage conversational memory.

    • ConversationBufferMemory: stores the chat history and lets you inspect it.
    • ConversationBufferWindowMemory: keeps only a fixed number of recent conversational exchanges.
    • ConversationTokenBufferMemory: caps the number of tokens kept, which ties directly to LLM cost.
    • ConversationSummaryBufferMemory: uses an LLM to generate a summary of the conversation and stores that as memory.
    • VectorStoreRetrieverMemory: embeds conversation records into a vector database and retrieves the top-k most relevant past exchanges.
  • Using LangChain: by specifying a memory type, LangChain offers flexible memory management. For long conversations it can generate a summary of the dialogue to save space.


ConversationBufferMemory

ConversationBufferMemory is the simplest form of memory: it just keeps the list of chat messages in a buffer and passes them into the prompt template.

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=True
)

Now send a message; with verbose=True you can see the automatically generated prompt and the model's response.

conversation.predict(input="Hi, my name is Andrew")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, my name is YH
AI:

> Finished chain.
"Hello YH! It's nice to meet you. How can I assist you today?"

Keep the conversation going. As content accumulates, you can see the earlier turns carried forward under the Current conversation: field.

conversation.predict(input="What is 1+1?")
...
'1+1 is equal to 2.'

conversation.predict(input="What is my name?")
...

Current conversation:
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI:

> Finished chain.
'Your name is YH.'

You can inspect the stored conversation history directly with memory.buffer

print(memory.buffer)
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI: Your name is YH.

Or extract the history with load_memory_variables

memory.load_memory_variables({})
{'history': "Human: Hi, my name is YH\nAI: Hello YH! It's nice to meet you. How can I assist you today?\nHuman: What is 1+1?\nAI: 1+1 is equal to 2.\nHuman: What is my name?\nAI: Your name is YH."}

You can also write exchanges into the memory directly

memory.save_context({"input": "Hi"}, 
                    {"output": "What's up"})

As the conversation accumulates and grows longer, memory usage increases and the LLM's computation cost climbs with it; keeping the entire history without limit is clearly expensive. LangChain therefore provides the following buffering strategies for storing conversation content.


ConversationBufferWindowMemory

Keeps only the most recent k rounds of conversation, set by a parameter.

With k=1, the memory in the example below stores only the most recent round.

from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=1)  

memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.load_memory_variables({})
# {'history': 'Human: Not much, just hanging\nAI: Cool'}

Another example: giving the LLM amnesia to save memory.

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=False
)

conversation.predict(input="Hi, my name is YH")
# "Hello YH! It's nice to meet you. How can I assist you today?"
conversation.predict(input="What is 1+1?")
# '1+1 is equal to 2.'
conversation.predict(input="What is my name?")
# "I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."

ConversationTokenBufferMemory

LLM usage is tied directly to token counts, so you can also limit memory directly by the number of tokens kept from the conversation.

from langchain.memory import ConversationTokenBufferMemory
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
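The line above only constructs the memory; a minimal usage sketch (the sample exchanges are illustrative) showing that once max_token_limit is exceeded, the oldest turns are trimmed first:

memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})

# Only the most recent exchanges that fit within the 30-token budget survive;
# the exact cut depends on the tokenizer of the llm passed in.
memory.load_memory_variables({})
# e.g. {'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}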

ConversationSummaryBufferMemory

The llm first generates a summary of the conversation, and that summary (rather than the raw turns) is what gets stored in memory, reducing token usage.

from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

Inspecting the variables stored in memory shows a summarized passage, which will be folded into the system prompt on subsequent questions.

print(memory.load_memory_variables({}))

# human mentions that they are not doing much. The AI informs the human about their schedule for the day, including a meeting with the product team, working on the LangChain project, and having lunch with a customer to discuss the latest in AI. The AI also reminds the human to bring their laptop to show a demo.'}
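With the memory in this state, a ConversationChain will inject the summary rather than the raw transcript into the prompt on the next turn; a minimal sketch reusing the llm and memory above:

conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
# The formatted prompt now carries the generated summary instead of the
# full turn-by-turn history.
conversation.predict(input="What would be a good demo to show?")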

VectorStoreRetrieverMemory

This part is not covered in the course, but it appears in the official documentation.

VectorStoreRetrieverMemory embeds previous conversation snippets and stores them in a vector database (VectorDB), then queries the top-K most "salient" documents on each call. In plain terms, it retrieves the most relevant past exchanges from the database by matching them against the current conversation content.

Unlike most other memory classes, it does not explicitly track the order of interactions.

Here the "documents" (context) are snippets of earlier conversation. This is useful for referring back to relevant information the AI was told earlier in the dialogue.

from datetime import datetime
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate

Choose the vector store you want to use

import faiss

from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS


embedding_size = 1536 # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})

Store some conversation records

# In actual usage, you would set `k` to be a higher value, but we use k=1 to show that
# the vector lookup still returns the semantically relevant information
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)

# When added to an agent, the memory object can save pertinent information from conversations or tools used
memory.save_context({"input": "My favorite food is pizza"}, {"output": "that's good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't the Celtics"}, {"output": "ok"}) #

Check which past exchange is retrieved as most relevant when a question is entered

# Notice the result returned is the memory about the favorite sport, which the
# language model deems most semantically relevant to the question.
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])
#  input: My favorite sport is soccer
#  output: ...

Usage example

llm = OpenAI(temperature=0) # Can be any valid LLM
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
{history}

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: {input}
AI:"""
PROMPT = PromptTemplate(
    input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
conversation_with_summary = ConversationChain(
    llm=llm, 
    prompt=PROMPT,
    # The vector-store-backed memory defined above supplies the relevant history.
    memory=memory,
    verbose=True
)
conversation_with_summary.predict(input="Hi, my name is Perry, what's up?")

If you ask about a sports-related topic, the sports-related exchange is retrieved and used as grounding for the reply; see the Relevant pieces of previous conversation: section.

# Here, the soccer-related content is surfaced
conversation_with_summary.predict(input="what's my favorite sport?")
 
    ...
    
    Relevant pieces of previous conversation:
    input: My favorite sport is soccer
    output: ...
    
    (You do not need to use these pieces of information if not relevant)
    ...
    ' You told me earlier that your favorite sport is soccer.'
    
    
conversation_with_summary.predict(input="What's my name?")
    ...
    Relevant pieces of previous conversation:
    input: Hi, my name is Perry, what's up?
    response:  Hi Perry, I'm doing well. How about you?
    ...
     ' Your name is Perry.'
  • Adding a timestamp to memories and data usually helps the agent establish temporal relevance (see the sketch below).
  • Memory from the conversation is stored automatically. By querying for the past exchanges that best match the question, the agent can "remember" earlier conversation and draw on it when answering.
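A hypothetical illustration of the timestamp idea (the bracketed prefix is an assumption for illustration, not a LangChain API):

from datetime import datetime

# Hypothetical sketch: prefix each stored exchange with a timestamp so retrieved
# snippets carry temporal context the agent can reason about.
now = datetime.now().isoformat(timespec="seconds")
memory.save_context(
    {"input": f"[{now}] My favorite band is Radiohead"},
    {"output": "Good to know"},
)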