# 1/16 LangChain for LLM Application Development

[TOC]

---
Written by 陳正哲

## Course Outline
Main topics covered:
* LangChain is a knowledge-retrieval style of application: instead of relying only on the knowledge baked into the LLM, it calls the model with prompts plus parsed external data to generate responses.
* LLM memory: storing conversations and managing the limited context window.
* Chains: creating sequences of operations.
* Question answering over documents: applying LLMs to your own data or datasets.
* Agents: exploring the emerging use of LLMs as reasoning agents.

### Introduction
Basic tutorial reference: https://pub.aimind.so/langchain-all-you-need-to-know-beadc2c93323

DeepLearning.AI course [link](https://learn.deeplearning.ai/langchain/lesson/1/introduction)

:::info
When ChatGPT first appeared in late 2022, it offered only a simple chat interface with a tight context limit (4,000 tokens); for further customization you had to build on the OpenAI API.

LangChain is an open-source framework that lets you combine large language models (LLMs) with modular components to build applications, extending what language models can do in application development.

LangChain provides standardized modules such as Memory, Chains, Document loaders, and Agents to improve the customizability, accuracy, and relevance of model-generated output.
:::

#### Practice tip (1): Colab starter (update: 2024/01)
```
# pin a compatible openai version
! pip install openai==0.28
! pip install python-dotenv  # for local environment variables
! pip install langchain
! pip install langchain_experimental  # some agent features moved here in newer versions
```

### L1 - Basic modules: Models, Prompts and Parsers
:::info
* Learn how to prompt an LLM effectively and how to parse its responses
* LangChain-[supported LLMs](https://python.langchain.com/docs/integrations/llms/), e.g. OpenAI, HuggingFaceHub
* An application calling an LLM usually does not send the user's text directly; it composes the user's input with a template first, then sends the result to the LLM
:::
DeepLearning.AI [course link L1](https://learn.deeplearning.ai/langchain/lesson/2/models,-prompts-and-parsers)

#### API calls through LangChain
```
import os
import openai
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']
```

```
import datetime

# Pick the GPT model based on the current date
current_date = datetime.datetime.now().date()

# Date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"
```

Following the rest of the lesson will cover the basic conversation patterns.

### L2 - Memory
:::info
Explore how to store conversations with LangChain memory
Learn techniques for managing the limited context window (tokens)
:::
DeepLearning.AI [course link 
L2](https://learn.deeplearning.ai/langchain/lesson/3/memory)

#### 2-1 ConversationBufferMemory
```
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
```

```
llm = ChatOpenAI(temperature=0.0, model=llm_model)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)
```

```
conversation.predict(input="你好,我是小強")
conversation.predict(input="What is 1+1?")
```

```
print(memory.buffer)
```
Note: shows the memory contents
> Human: 你好,我是小強
> AI: 你好小強,很高興認識你。我是AI。有什麼我可以幫助你的嗎? (Translation: Hello Xiao Qiang, nice to meet you. I'm AI. How can I assist you?)
> Human: What is 1+1?
> AI: The answer to 1+1 is 2.

```
memory = ConversationBufferMemory()
```
Note: re-creating the memory object clears it

```
memory.load_memory_variables({})
```
Try it yourself.

#### 2-2 ConversationBufferWindowMemory
```
from langchain.memory import ConversationBufferWindowMemory
```

```
memory = ConversationBufferWindowMemory(k=1)
```
Note: keep only the most recent exchange (k=1)

```
# build an example
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
```

```
llm = ChatOpenAI(temperature=0.0, model=llm_model)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False
)

# first exchange
conversation.predict(input="Hi, my name is Andrew")
# second exchange
conversation.predict(input="What is 1+1?")
```

```
conversation.predict(input="What is my name?")
memory.load_memory_variables({})
```
Run it yourself and inspect the memory contents — with k=1 the model no longer remembers the name.

#### 2-3 ConversationTokenBufferMemory
```
from langchain.memory import ConversationTokenBufferMemory
from langchain.llms import OpenAI

llm = ChatOpenAI(temperature=0.0, model=llm_model)
```

```
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=50)

# build example exchanges
memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})
```
Note: max_token_limit=50 keeps only the most recent ~50 tokens of the conversation in memory

```
memory.load_memory_variables({})
```
Try it yourself.

#### 2-4 ConversationSummaryBufferMemory
```
from langchain.memory import ConversationSummaryBufferMemory
```

```
# create a long example text
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": f"{schedule}"})

# show the memory
memory.load_memory_variables({})
```

```
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

# chat
conversation.predict(input="What would be a good demo to show?")

# inspect the result
memory.load_memory_variables({})
```
Note: experiment with your own text to get a feel for it.

### L3 - Chains
:::info
Learn techniques for creating chains (sequences of operations)
Understand how to link different tools together to extend what the LLM can do; when the workflow is well defined, a Chain is a good fit
:::
DeepLearning.AI [course link L3](https://learn.deeplearning.ai/langchain/lesson/4/chains)

#### 3-1 LLMChain
```
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
```
Note: this lesson uses chains; see the course link for the rest. You can run it directly in DeepLearning.AI's hosted Jupyter environment. To practice in Colab or on your own machine, download Data.csv from that Jupyter environment first.

#### 3-2 Sequential Chains
Connect different prompt_templates with chains

##### SimpleSequentialChain
(1) Chains several prompts in series; each step takes a single input and produces a single output
![image](https://hackmd.io/_uploads/rJkC8vRd6.png)
Produces one final output

##### SequentialChain
(2) Feeds in multiple prompts at once and combines them through different paths to produce the output
![image](https://hackmd.io/_uploads/ByIfPv0OT.png)
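To build intuition for what a sequential chain does, here is a minimal pure-Python sketch of the piping shown above, with a stub function standing in for the LLM call. All names here (`stub_llm`, `make_chain`, `simple_sequential_chain`) are illustrative, not LangChain's actual API — each step formats its prompt template, "calls" the model, and feeds the single output into the next step's single input.

```python
def stub_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; returns a canned transformation
    # so the data flow is easy to trace.
    return f"[LLM output for: {prompt}]"

def make_chain(template: str):
    # Each "chain" step formats its prompt template, calls the LLM,
    # and returns the text result.
    def run(text: str) -> str:
        return stub_llm(template.format(input=text))
    return run

def simple_sequential_chain(chains, text):
    # Pipe the single output of each step into the next step's single input.
    for chain in chains:
        text = chain(text)
    return text

chain_one = make_chain("Name a company that makes {input}.")
chain_two = make_chain("Write a slogan for {input}.")
result = simple_sequential_chain([chain_one, chain_two], "queen-size sheet sets")
print(result)
```

With a real `LLMChain` the intermediate text would be actual model output, but the wiring is the same: the chain object only handles formatting, calling, and forwarding.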
#### 3-3 Router Chain
(3) Use the LLM to classify the input by keywords and route it to the appropriate prompt
![image](https://hackmd.io/_uploads/ByCFvD0_T.png)
```
from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser
from langchain.prompts import PromptTemplate
```

### L4 - Question and Answer
:::info
Use LangChain to apply an LLM to specific data of your own, instead of drawing only on the background knowledge the LLM was trained on.
LangChain splits large documents into chunks and stores them in a vector store (VectorStore), then answers questions by querying the VectorStore from a prompt.
Learn how to adapt an LLM to specific needs.
:::
DeepLearning.AI [course link L4](https://learn.deeplearning.ai/langchain/lesson/5/question-and-answer)

Course note: the examples use OutdoorClothingCatalog_1000.csv, so running them directly in the hosted environment is recommended.

(1) LLMs accept only a limited number of tokens per call
![image](https://hackmd.io/_uploads/rJs_wDR_6.png)

#### Embedding
(2) The process of converting text into vectors for a vector database
![image](https://hackmd.io/_uploads/rJ7dDDRu6.png)

#### Vector Database
(1)
![image](https://hackmd.io/_uploads/B16ivwCu6.png)
(2)
![image](https://hackmd.io/_uploads/r18RvwAda.png)

### L5 - Evaluation
:::info
LangChain provides tools for checking the quality of generated responses
:::
DeepLearning.AI [course link L5-Evaluation](https://learn.deeplearning.ai/langchain/lesson/6/evaluation)

(正哲: this part runs fine in the hosted environment, but whether these generated checks are sufficient probably depends on the case)

### L6 - Agents
:::info
Explore the concept and applications of LLMs as agents (reasoning agents)
Learn how to build agents to carry out complex tasks
Examples: querying Wikipedia, getting the current time
:::
DeepLearning.AI [course link L6](https://learn.deeplearning.ai/langchain/lesson/7/agents)

Course note: running directly in DeepLearning.AI's hosted environment works fine. In Colab, modify as follows:
```
# some features moved to langchain_experimental
! 
pip install langchain langchain_experimental
```

```
from langchain_experimental.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain_experimental.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.chat_models import ChatOpenAI
```

## Practical Applications and Case Studies

## Discussion and Reflection
> We recommend exploring the LangChain community and the LangChain documentation.

This article also gives a complete introduction to the course: [LangChain Overview](https://pub.aimind.so/langchain-all-you-need-to-know-beadc2c93323)

## Reference: OpenAI's new Assistants API
**OpenAI reference [link](https://)**

**Step 1: Create an assistant**
```
# requires openai>=1.x (not the 0.28 pin used in the Colab starter above)
from openai import OpenAI
client = OpenAI()  # reads OPENAI_API_KEY from the environment

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview"
)
```

**Step 2: Create a thread (a continuous conversation)**
```
thread = client.beta.threads.create()
```

**Step 3: Add a user message**
```
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="I need to solve the equation `3x + 11 = 14`. Can you help me?"
)
```

**Step 4: Run**
```
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="Please address the user as Jane Doe. The user has a premium account."
)
```
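The run created in Step 4 executes asynchronously: you poll its status until it leaves `queued`/`in_progress`, then read the assistant's reply (in the real SDK, via `client.beta.threads.runs.retrieve(...)` and `client.beta.threads.messages.list(...)`). The sketch below shows only the polling pattern; `StubRuns` and `wait_for_run` are illustrative names that simulate the client so the loop logic is clear, not part of the openai SDK.

```python
import time

class StubRuns:
    # Simulates an Assistants-style run that completes after a few polls.
    def __init__(self):
        self._polls = 0

    def retrieve(self, thread_id, run_id):
        # Mimics client.beta.threads.runs.retrieve(...): returns an object
        # with a .status attribute that eventually becomes "completed".
        self._polls += 1
        status = "completed" if self._polls >= 3 else "in_progress"
        return type("Run", (), {"id": run_id, "status": status})()

def wait_for_run(runs, thread_id, run_id, interval=0.01):
    # Poll until the run leaves the non-terminal states, then return it.
    while True:
        run = runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status not in ("queued", "in_progress"):
            return run
        time.sleep(interval)

run = wait_for_run(StubRuns(), thread_id="thread_abc", run_id="run_abc")
print(run.status)  # prints "completed"
```

Against the real API you would pass `client.beta.threads.runs` in place of the stub and use a longer polling interval; a terminal status other than `completed` (e.g. `failed`) should be handled as an error.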