### [Entry Page for AI / ML Study Notes](https://hackmd.io/@YungHuiHsu/BySsb5dfp)
### [Deeplearning.ai GenAI/LLM Course Notes](https://learn.deeplearning.ai/)
#### [Large Language Models with Semantic Search](https://hackmd.io/@YungHuiHsu/rku-vjhZT)
#### [Finetuning Large Language Models](https://hackmd.io/@YungHuiHsu/HJ6AT8XG6)
#### [LangChain for LLM Application Development](https://hackmd.io/1r4pzdfFRwOIRrhtF9iFKQ)

---

Notes for the [LangChain for LLM Application Development](https://www.youtube.com/watch?v=jFo_gDOOusk) course series
- [Models, Prompts and Output Parsers](https://hackmd.io/1r4pzdfFRwOIRrhtF9iFKQ)
- [Memory](https://hackmd.io/@YungHuiHsu/Hy120mR23)
- [Chains](https://hackmd.io/@YungHuiHsu/SJJvZ-ya2)
- [Question-and-Answer](https://hackmd.io/@YungHuiHsu/BJ10qunzp)
- [Evaluation](https://hackmd.io/@YungHuiHsu/Hkg0SgazT)
- [Agents](https://hackmd.io/@YungHuiHsu/rkBMDgRM6)

![](https://hackmd.io/_uploads/r1WTGXRhn.png =400x)
source: [LangChain.dart](https://pub.dev/packages/langchain)

---

# [deeplearning.ai.LangChain - Memory](https://learn.deeplearning.ai/langchain/lesson/3/memory)

## Outline

* ConversationBufferMemory
* ConversationBufferWindowMemory
* ConversationTokenBufferMemory
* ConversationSummaryBufferMemory
* :muscle:VectorStoreRetrieverMemory

<div style="text-align: center;">
    <figure>
        <img src="https://python.langchain.com/assets/images/memory_diagram-0627c68230aa438f9b5419064d63cbbc.png" alt="memory_diagram.png" width="1200">
        <figcaption><span style="color: BLACK">LangChain's memory-handling mechanism</span></figcaption>
    </figure>
</div>

source: [langchain/docs/modules/memory](https://python.langchain.com/docs/modules/memory/)

## Memory in Language Models

- **Why memory matters**: a language model does not remember earlier turns of a conversation on its own. This is a problem when building conversational applications such as chatbots.
- **LangChain memory utilities**: LangChain offers several ways to manage conversation memory.
    - **ConversationBufferMemory**: stores the full conversation history and lets you inspect it.
    - **ConversationBufferWindowMemory**: keeps only the most recent `k` conversational exchanges.
    - **ConversationTokenBufferMemory**: caps the number of stored tokens, which maps directly to LLM usage cost.
    - **ConversationSummaryBufferMemory**: uses an LLM to write a running summary of the conversation and stores that as memory.
    - **VectorStoreRetrieverMemory**: embeds the conversation history into a vector store and retrieves the top-k most relevant past exchanges at query time.
- **Using memory in LangChain**: you choose the memory behaviour simply by picking a memory class. For long conversations, LangChain can summarise the history to save space and tokens.

---

### ConversationBufferMemory

> `ConversationBufferMemory` is the simplest form of memory: it keeps the chat messages in a buffer as-is and passes them into the prompt template.

```python!
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
```

Now send a message. With `verbose=True` you can see the auto-generated prompt as well as the model's reply:

```python!
conversation.predict(input="Hi, my name is YH")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, my name is YH
AI:

> Finished chain.
"Hello YH! It's nice to meet you. How can I assist you today?"
```

As the conversation continues, the earlier turns accumulate and are carried along in every prompt; see the `Current conversation:` section:

```python!
conversation.predict(input="What is 1+1?")
...
'1+1 is equal to 2.'

conversation.predict(input="What is my name?")
...
Current conversation:
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI:

> Finished chain.
'Your name is YH.'
```

You can inspect the stored history directly via `memory.buffer`:

```python!
print(memory.buffer)

Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI: Your name is YH.
```

Or retrieve it with `load_memory_variables`:

```python!
memory.load_memory_variables({})

{'history': "Human: Hi, my name is YH\nAI: Hello YH! It's nice to meet you. How can I assist you today?\nHuman: What is 1+1?\nAI: 1+1 is equal to 2.\nHuman: What is my name?\nAI: Your name is YH."}
```

You can also write conversation turns into memory directly:

```python!
memory.save_context({"input": "Hi"}, {"output": "What's up"})
```

As a conversation grows, the accumulated history inflates every prompt and the LLM's computation cost keeps rising, so keeping the entire history without limit is clearly expensive. LangChain therefore provides several bounded memory variants, described below.

![](https://hackmd.io/_uploads/SJ2_3NRhn.png =400x)
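To make the cost growth concrete, here is a minimal sketch (an addition to these notes, not from the course) that counts how many tokens the accumulated history costs per turn. It assumes the `tiktoken` package is installed; the turns are made-up examples.

```python!
# Sketch (own addition): measuring how the prompt grows as
# ConversationBufferMemory accumulates turns. The turns are made up.
import tiktoken
from langchain.memory import ConversationBufferMemory

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
memory = ConversationBufferMemory()

turns = [("Hi, my name is YH", "Hello YH!"),
         ("What is 1+1?", "1+1 is equal to 2."),
         ("What is my name?", "Your name is YH.")]

for human, ai in turns:
    memory.save_context({"input": human}, {"output": ai})
    history = memory.load_memory_variables({})["history"]
    # Every new turn re-sends the whole history, so the count grows monotonically
    print(len(enc.encode(history)), "tokens in history")
```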
### ConversationBufferWindowMemory

Keeps only the most recent `k` exchanges of the conversation. With `k=1`, the memory stores just the latest exchange:

```python!
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.load_memory_variables({})
# {'history': 'Human: Not much, just hanging\nAI: Cool'}
```

Another example of giving the LLM ~~amnesia~~ a smaller memory footprint:

```python!
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

conversation.predict(input="Hi, my name is YH")
# "Hello YH! It's nice to meet you. How can I assist you today?"
conversation.predict(input="What is 1+1?")
# '1+1 is equal to 2.'
conversation.predict(input="What is my name?")
# "I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."
```

### ConversationTokenBufferMemory

LLM usage is billed by tokens rather than by turns, so the memory can also be capped directly by the number of tokens in the conversation:

```python!
from langchain.memory import ConversationTokenBufferMemory

memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
```
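The course stops at the constructor. As a short sketch in the spirit of the course notebook, filling the buffer past the limit shows the oldest turns being dropped; the exact cutoff depends on the model's tokenizer, so the output shown is indicative only.

```python!
# Sketch: filling the token-limited buffer built above. With max_token_limit=30,
# older turns are pruned once the history exceeds the limit; the llm is passed
# in so the memory can count tokens with the model's own tokenizer.
memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})

memory.load_memory_variables({})
# {'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
```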
### ConversationSummaryBufferMemory

Uses the LLM to summarise the conversation first, then stores that summary in memory, which keeps the token usage down:

```python!
from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": f"{schedule}"})
```

Inspecting the stored variable shows a summarised document, which will be folded into the system prompt on subsequent calls:

```python!
print(memory.load_memory_variables({}))
# human mentions that they are not doing much. The AI informs the human about
# their schedule for the day, including a meeting with the product team, working
# on the LangChain project, and having lunch with a customer to discuss the
# latest in AI. The AI also reminds the human to bring their laptop to show a demo.'}
```
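To see the summary memory at work inside a chain, here is a short sketch (the question is a made-up example): the chain injects the running summary into the prompt, and after each call the memory re-summarises anything that exceeds `max_token_limit`.

```python!
# Sketch: plugging the summary memory built above into a ConversationChain.
# The question is a made-up example; with verbose=True the prompt shows the
# summary under a "System:" prefix instead of the raw turns.
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
conversation.predict(input="What would be a good demo to show?")
```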
### VectorStoreRetrieverMemory

This class is not covered in the course, but it appears in the official documentation.

`VectorStoreRetrieverMemory` embeds past conversation snippets, stores the vectors in a vector database (VectorDB), and on every call retrieves the top-K most "salient" documents. In plain terms, it matches the current input against the stored history and pulls back the most relevant past exchanges.

Unlike most other memory classes, it does not explicitly track the order of interactions. Here the "documents" are snippets of previous conversation, which is useful for referring back to relevant information the AI was told earlier in the conversation.

```python!
from datetime import datetime
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate
```

Choose the vector store you want to use:

```python!
import faiss
from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS

embedding_size = 1536  # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})
```

Save some conversation history:

```python!
# In actual usage, you would set `k` to be a higher value, but we use k=1 to show that
# the vector lookup still returns the semantically relevant information
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)

# When added to an agent, the memory object can save pertinent information from conversations or used tools
memory.save_context({"input": "My favorite food is pizza"}, {"output": "that's good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't like the Celtics"}, {"output": "ok"})
```

Check which past exchange is retrieved as most relevant for a given input:

```python!
# Notice that the soccer exchange is returned as the most semantically
# relevant entry for a question about sport.
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])
# input: My favorite sport is soccer
# output: ...
```

Usage example:

```python!
llm = OpenAI(temperature=0)  # Can be any valid LLM
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Relevant pieces of previous conversation:
{history}

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: {input}
AI:"""
PROMPT = PromptTemplate(
    input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
conversation_with_summary = ConversationChain(
    llm=llm,
    prompt=PROMPT,
    memory=memory,
    verbose=True
)
conversation_with_summary.predict(input="Hi, my name is Perry, what's up?")
```

If you ask about sports, the sports-related exchange is retrieved and used as grounding for the reply; see the `Relevant pieces of previous conversation:` section:

```python!
# Here, the sports-related content is surfaced
conversation_with_summary.predict(input="what's my favorite sport?")
...
Relevant pieces of previous conversation:
input: My favorite sport is soccer
output: ...

(You do not need to use these pieces of information if not relevant)
...
' You told me earlier that your favorite sport is soccer.'

conversation_with_summary.predict(input="What's my name?")
...
Relevant pieces of previous conversation:
input: Hi, my name is Perry, what's up?
response: Hi Perry, I'm doing well. How about you?
...
' Your name is Perry.'
```

- Adding a timestamp to memories and data usually helps the agent establish temporal relevance; see the sketch below.
- Conversation turns are stored automatically. By retrieving the past exchanges that best match the current question, the agent can "remember" earlier conversation and draw on it when answering.
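A minimal sketch of the timestamp idea (my own addition; `save_with_timestamp` is a hypothetical helper, not a LangChain API): embed the time of each exchange into the stored text so that retrieved snippets carry their creation time.

```python!
# Sketch (own addition): prefix each saved exchange with a timestamp so that
# retrieved snippets carry their creation time. Reuses the
# VectorStoreRetrieverMemory `memory` object built above.
from datetime import datetime

def save_with_timestamp(memory, user_input, ai_output):
    # Hypothetical helper: bake the timestamp into the stored text itself
    stamp = datetime.now().isoformat(timespec="seconds")
    memory.save_context(
        {"input": f"[{stamp}] {user_input}"},
        {"output": ai_output},
    )

save_with_timestamp(memory, "I moved to Boston last week", "Good to know!")
print(memory.load_memory_variables({"prompt": "where do I live?"})["history"])
# The retrieved snippet now includes when it was said
```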