### [AI / ML Learning Notes Portal](https://hackmd.io/@YungHuiHsu/BySsb5dfp)
### [Deeplearning.ai GenAI/LLM Course Series Notes](https://learn.deeplearning.ai/)
#### [Large Language Models with Semantic Search](https://hackmd.io/@YungHuiHsu/rku-vjhZT)
#### [Finetuning Large Language Models](https://hackmd.io/@YungHuiHsu/HJ6AT8XG6)
#### [LangChain for LLM Application Development](https://hackmd.io/1r4pzdfFRwOIRrhtF9iFKQ)
---
Notes for the [LangChain for LLM Application Development](https://www.youtube.com/watch?v=jFo_gDOOusk) course series:
- [Models, Prompts and Output Parsers](https://hackmd.io/1r4pzdfFRwOIRrhtF9iFKQ)
- [Memory](https://hackmd.io/@YungHuiHsu/Hy120mR23)
- [Chains](https://hackmd.io/@YungHuiHsu/SJJvZ-ya2)
- [Question-and-Answer](https://hackmd.io/@YungHuiHsu/BJ10qunzp)
- [Evaluation](https://hackmd.io/@YungHuiHsu/Hkg0SgazT)
- [Agents](https://hackmd.io/@YungHuiHsu/rkBMDgRM6)

source: [LangChain.dart](https://pub.dev/packages/langchain)
---
# [deeplearning.ai LangChain - Memory](https://learn.deeplearning.ai/langchain/lesson/3/memory)
## Outline
* ConversationBufferMemory
* ConversationBufferWindowMemory
* ConversationTokenBufferMemory
* ConversationSummaryBufferMemory
* :muscle:VectorStoreRetrieverMemory
<div style="text-align: center;">
<figure>
<img src="https://python.langchain.com/assets/images/memory_diagram-0627c68230aa438f9b5419064d63cbbc.png" alt="memory_diagram.png" width="1200">
<figcaption><span style="color: BLACK">How LangChain handles conversational memory</span></figcaption>
</figure>
</div>
source:[langchain/docs/modules/memory](https://python.langchain.com/docs/modules/memory/)
## Memory in Language Models
- **Why memory matters**: A language model does not remember earlier exchanges on its own, which becomes a problem when building conversational applications such as chatbots.
- **LangChain memory**: provides several ways to manage conversation memory.
    - **ConversationBufferMemory**: stores the full conversation and lets you inspect it.
    - **ConversationBufferWindowMemory**: keeps only the most recent k exchanges.
    - **ConversationTokenBufferMemory**: caps the number of tokens kept, which maps directly to LLM cost.
    - **ConversationSummaryBufferMemory**: uses an LLM to generate a summary of the conversation and stores that as memory.
    - **VectorStoreRetrieverMemory**: embeds the conversation history into a vector database and retrieves the top-k most relevant past exchanges at query time.
- **Using LangChain**: by choosing among these memory types, LangChain gives you flexible memory management; for long conversations it can summarize the dialogue to save space.
---
### ConversationBufferMemory
> `ConversationBufferMemory` is an extremely simple form of memory: it just keeps the list of chat messages in a buffer and passes them into the prompt template.
```python!
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
```
Now send a message; with `verbose=True` you can see both the automatically generated prompt and the model's reply.
```python!
conversation.predict(input="Hi, my name is YH")
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi, my name is YH
AI:
> Finished chain.
"Hello YH! It's nice to meet you. How can I assist you today?"
```
Keep the conversation going: as it accumulates, the earlier exchanges are carried into each new prompt. See the `Current conversation:` section.
```python!
conversation.predict(input="What is 1+1?")
...
'1+1 is equal to 2.'
conversation.predict(input="What is my name?")
...
Current conversation:
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI:
> Finished chain.
'Your name is YH.'
```
You can inspect the stored conversation history directly via `memory.buffer`:
```python!
print(memory.buffer)
Human: Hi, my name is YH
AI: Hello YH! It's nice to meet you. How can I assist you today?
Human: What is 1+1?
AI: 1+1 is equal to 2.
Human: What is my name?
AI: Your name is YH.
```
or retrieve it with `load_memory_variables`:
```python!
memory.load_memory_variables({})
{'history': "Human: Hi, my name is YH\nAI: Hello YH! It's nice to meet you. How can I assist you today?\nHuman: What is 1+1?\nAI: 1+1 is equal to 2.\nHuman: What is my name?\nAI: Your name is YH."}
```
You can also write exchanges into memory directly with `save_context`:
```python!
memory.save_context({"input": "Hi"},
{"output": "What's up"})
```
As a conversation grows, the history injected into every prompt grows with it, driving up the LLM's token usage and cost; keeping every exchange forever is clearly expensive. LangChain therefore provides the bounded memory types below.
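To make that cost concrete, here is a minimal sketch (assuming the `llm` and `memory` from above; `get_num_tokens` is available on LangChain language models) that counts the tokens the buffer currently adds to each request:
```python!
# Count the tokens in the accumulated buffer; every one of them is
# re-sent to the model on each subsequent turn.
num_tokens = llm.get_num_tokens(memory.buffer)
print(f"Tokens currently carried in memory: {num_tokens}")
```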

### ConversationBufferWindowMemory
Keeps only the most recent `k` exchanges, set via a constructor parameter.
With `k=1`, the example below shows that memory retains only the latest exchange:
```python!
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "Hi"},
{"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
{"output": "Cool"})
memory.load_memory_variables({})
# {'history': 'Human: Not much, just hanging\nAI: Cool'}
```
Another example of giving the LLM ~~amnesia~~ a bounded memory:
```python!
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)
conversation.predict(input="Hi, my name is YH")
# "Hello YH! It's nice to meet you. How can I assist you today?"
conversation.predict(input="What is 1+1?")
# '1+1 is equal to 2.'
conversation.predict(input="What is my name?")
# "I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation."
```
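Inspecting the memory afterwards confirms the behaviour (a sketch; the exact reply string will vary from run to run):
```python!
# With k=1, only the most recent exchange survives
memory.load_memory_variables({})
# {'history': "Human: What is my name?\nAI: I'm sorry, but I don't have access to ..."}
```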
### ConversationTokenBufferMemory
LLM usage is billed by tokens, so memory can also be capped directly by the number of tokens kept in the buffer.
```python!
from langchain.memory import ConversationTokenBufferMemory
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
```
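A short usage sketch, following the course example: with `max_token_limit=30` the oldest turns are trimmed once the buffer exceeds the limit (where exactly the cut falls depends on the tokenizer of the `llm` passed in).
```python!
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"},
                    {"output": "Charming!"})
memory.load_memory_variables({})
# {'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
```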
### ConversationSummaryBufferMemory
Uses the LLM to generate a summary of the conversation so far and stores that summary in memory, saving tokens.
```python!
from langchain.memory import ConversationSummaryBufferMemory
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
{"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
{"output": f"{schedule}"})
```
Inspecting the stored memory variables shows a summarized document; on subsequent turns it is folded into the system prompt.
```python!
print(memory.load_memory_variables({}))
# ...human mentions that they are not doing much. The AI informs the human about their schedule for the day, including a meeting with the product team, working on the LangChain project, and having lunch with a customer to discuss the latest in AI. The AI also reminds the human to bring their laptop to show a demo.'}
```
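The summarized memory drops into a chain like any other memory type; this continuation follows the course (the model's reply will vary):
```python!
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
# The formatted prompt now carries the summary as a "System" line,
# so the model can answer from the summarized schedule.
conversation.predict(input="What would be a good demo to show?")
```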
### VectorStoreRetrieverMemory
This class is not covered in the course, but it appears in the official documentation.
`VectorStoreRetrieverMemory` embeds earlier exchanges and stores them in a vector database (VectorDB), querying the top-K most "salient" documents on every call. In plain terms, it retrieves the past exchanges most relevant to the current input by semantic similarity.
Unlike most other memory classes, it does not explicitly track the order of interactions.
Here the "documents" (context) are snippets of prior conversation, which is useful for recalling relevant information the AI was told earlier in the dialogue.
```python!
from datetime import datetime
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate
```
Choose the vector store you want to use:
```python!
import faiss
from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS
embedding_size = 1536 # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})
```
Save some exchanges:
```python!
# In actual usage, you would set `k` to be a higher value, but we use k=1 to show that
# the vector lookup still returns the semantically relevant information
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)
# When added to an agent, the memory object can save pertinent information from conversations or used tools
memory.save_context({"input": "My favorite food is pizza"}, {"output": "that's good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't the Celtics"}, {"output": "ok"}) #
```
Check which past exchange gets retrieved for a given question:
```python!
# Notice the returned result is the soccer memory, which the embedding
# model deems most semantically relevant to the question about sports.
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])
# input: My favorite sport is soccer
# output: ...
```
Usage example:
```python!
llm = OpenAI(temperature=0) # Can be any valid LLM
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
{history}
(You do not need to use these pieces of information if not relevant)
Current conversation:
Human: {input}
AI:"""
PROMPT = PromptTemplate(
    input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
conversation_with_summary = ConversationChain(
    llm=llm,
    prompt=PROMPT,
    memory=memory,
    verbose=True
)
conversation_with_summary.predict(input="Hi, my name is Perry, what's up?")
```
Asking about sports surfaces the sports-related exchange, which grounds the reply. See the `Relevant pieces of previous conversation:` section.
```python!
# Here, the soccer-related content is surfaced
conversation_with_summary.predict(input="what's my favorite sport?")
...
Relevant pieces of previous conversation:
input: My favorite sport is soccer
output: ...
(You do not need to use these pieces of information if not relevant)
...
' You told me earlier that your favorite sport is soccer.'
conversation_with_summary.predict(input="What's my name?")
...
Relevant pieces of previous conversation:
input: Hi, my name is Perry, what's up?
response: Hi Perry, I'm doing well. How about you?
...
' Your name is Perry.'
```
- Adding timestamps to memories and data usually helps the agent judge temporal relevance; see the sketch after this list.
- Conversation turns are stored automatically. By retrieving the exchanges that best match the current question, the agent can "remember" earlier conversation and draw on it when answering.
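A minimal sketch of the timestamp idea (not from the course; `save_with_timestamp` is a hypothetical helper that simply prepends the time to the saved text):
```python!
from datetime import datetime

# Hypothetical helper: tag each exchange with a timestamp so the agent
# can weigh recency when these memories are retrieved later.
def save_with_timestamp(memory, user_input, ai_output):
    ts = datetime.now().isoformat(timespec="seconds")
    memory.save_context({"input": f"[{ts}] {user_input}"},
                        {"output": ai_output})

save_with_timestamp(memory, "My favorite drink is tea", "Noted.")
```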