LangChain for LLM Application Development: Course Series Notes
source:langchain/docs/modules/memory
Why memory matters: when you interact with a language model, it does not remember earlier turns on its own. This is a problem when building conversational applications such as chatbots.
LangChain memory: LangChain provides several ways to manage conversational memory.
Usage: by letting you specify different memory types, LangChain gives you flexibility in how memory is managed. For long conversations, it can also generate a summary of the dialogue to save space.
ConversationBufferMemory
This is the simplest form of memory: it just keeps the list of chat messages in a buffer and passes them into the prompt template.
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
Next, send a message; with verbose=True you can see both the auto-generated prompt and the model's reply.
conversation.predict(input="Hi, my name is YH")
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi, my name is YH
AI:
> Finished chain.
"Hello YH! It's nice to meet you. How can I assist you today?"
Continue the conversation. As it accumulates, the earlier turns are retained and fed back into the prompt; see the Current conversation: field.
conversation.predict(input="What is 1+1?")
...
'1+1 is equal to 2.'
conversation.predict(input="What is my name?")
...
Current conversation:
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI:
> Finished chain.
'Your name is YH.'
You can inspect the saved conversation history directly with memory.buffer:
print(memory.buffer)
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI: Your name is YH.
Or use load_memory_variables to retrieve it:
memory.load_memory_variables({})
{'history': "Human: Hi, my name is YH\nAI: Hello YH! It's nice to meet you. How can I assist you today?\nHuman: What is 1+1?\nAI: 1+1 is equal to 2.\nHuman: What is my name?\nAI: Your name is YH."}
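Under the hood, the raw message objects are also available via the memory's chat_memory attribute (this holds for the LangChain 0.0.x releases used here; treat the attribute path as an assumption for other versions):
# Each element is a HumanMessage or AIMessage object
for msg in memory.chat_memory.messages:
    print(type(msg).__name__, ":", msg.content)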
You can also write turns into memory directly with save_context:
memory.save_context({"input": "Hi"},
{"output": "What's up"})
As the conversation grows, so do memory usage and the LLM's per-call cost; keeping every turn forever is clearly expensive. LangChain therefore provides the following buffering strategies for storing conversation content.
ConversationBufferWindowMemory
Keeps only the most recent k turns of the conversation, set via the k parameter. With k=1, the example below shows that memory stores only the latest exchange.
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "Hi"},
{"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
{"output": "Cool"})
memory.load_memory_variables({})
# {'history': 'Human: Not much, just hanging\nAI: Cool'}
Another example of giving the LLM amnesia to save memory:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)
conversation.predict(input="Hi, my name is YH")
# "Hello YH! It's nice to meet you. How can I assist you today?"
conversation.predict(input="What is 1+1?")
# '1+1 is equal to 2.'
conversation.predict(input="What is my name?")
# "I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."
ConversationTokenBufferMemory
LLM usage correlates more directly with token counts, so memory can instead be capped by the number of tokens retained from the conversation.
from langchain.memory import ConversationTokenBufferMemory
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
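A quick demonstration of the truncation (the inputs follow the course notebook; the exact cut-off depends on the tokenizer, so the printed history may differ):
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"},
                    {"output": "Charming!"})
# Only the most recent turns that fit within max_token_limit=30 are kept,
# so the earliest exchange has been dropped.
print(memory.load_memory_variables({}))
# e.g. {'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}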
ConversationSummaryBufferMemory
Has the LLM generate a summary of the conversation so far and stores that summary in memory, reducing token usage.
from langchain.memory import ConversationSummaryBufferMemory
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
{"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
{"output": f"{schedule}"})
Inspecting the variables stored in memory shows a summarized version of the conversation; in subsequent questions this summary is folded into the system prompt.
print(memory.load_memory_variables({}))
# {'history': 'System: ...human mentions that they are not doing much. The AI informs the human about their schedule for the day, including a meeting with the product team, working on the LangChain project, and having lunch with a customer to discuss the latest in AI. The AI also reminds the human to bring their laptop to show a demo.'}
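To see the summary being used, run a follow-up question through a ConversationChain as before; in verbose mode the summary above appears in the prompt (a sketch following the course notebook):
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
# The verbose prompt shows the summary under "Current conversation:"
conversation.predict(input="What would be a good demo to show?")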
The following is not covered in the course, but it is described in the official documentation.
VectorStoreRetrieverMemory
Embeds previous conversation turns and stores them in a vector database (VectorDB); on each call it queries the top K most "salient" documents. In plain terms, it retrieves the past exchanges most relevant to the current input by semantic similarity.
This differs from most other Memory classes in that it does not explicitly track the order of interactions.
Here, the "documents" (context) are snippets of earlier conversation. This is useful for referring back to relevant information the AI was told earlier in the conversation.
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate
Choose the vector store you want to use; here, FAISS with an in-memory docstore:
import faiss
from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS
embedding_size = 1536 # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})
Save some conversation turns:
# In actual usage, you would set `k` to be a higher value, but we use k=1 to show that
# the vector lookup still returns the semantically relevant information
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)
# When added to an agent, the memory object can save pertinent information from conversations or used tools
memory.save_context({"input": "My favorite food is pizza"}, {"output": "that's good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't the Celtics"}, {"output": "ok"}) #
Check which past exchange is retrieved as most relevant when a question comes in:
# Notice that the returned result is the soccer memory, which the vector lookup
# deems most semantically relevant to a question about which sport to watch.
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])
# input: My favorite sport is soccer
# output: ...
Usage example:
llm = OpenAI(temperature=0) # Can be any valid LLM
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
{history}
(You do not need to use these pieces of information if not relevant)
Current conversation:
Human: {input}
AI:"""
PROMPT = PromptTemplate(
    input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
conversation_with_summary = ConversationChain(
    llm=llm,
    prompt=PROMPT,
    memory=memory,
    verbose=True
)
conversation_with_summary.predict(input="Hi, my name is Perry, what's up?")
If you ask about a sports-related topic, the sport-related exchange is retrieved and used to ground the reply; see the Relevant pieces of previous conversation: section.
# Here, the soccer-related content is surfaced
conversation_with_summary.predict(input="what's my favorite sport?")
...
Relevant pieces of previous conversation:
input: My favorite sport is soccer
output: ...
(You do not need to use these pieces of information if not relevant)
...
' You told me earlier that your favorite sport is soccer.'
conversation_with_summary.predict(input="What's my name?")
...
Relevant pieces of previous conversation:
input: Hi, my name is Perry, what's up?
response: Hi Perry, I'm doing well. How about you?
...
' Your name is Perry.'