# [Week 7] Building Your Own LLM Application
[Course Directory](https://areganti.notion.site/Applied-LLMs-Mastery-2024-562ddaa27791463e9a1286199325045c)
[Course Link](https://areganti.notion.site/Week-7-Building-Your-Own-LLM-Application-fbac298e688a42148c18a1ddc7594362)
## ETMI5: Explain to Me in 5
In the previous parts of the course, we covered techniques such as prompting, RAG, and fine-tuning. This section adopts a practical, hands-on approach to showcase how LLMs can be employed in application development. We'll start with basic examples and progressively incorporate more advanced functionalities like chaining, memory management, and tool integration. Additionally, we'll explore implementations of RAG and fine-tuning. Finally, by integrating these concepts, we'll learn how to construct LLM agents effectively.
## Introduction
As LLMs have become increasingly prevalent, there are now multiple ways to utilize them. We'll start with basic examples and gradually introduce more advanced features, allowing you to build upon your understanding step by step.
This guide is designed to cover the basics, aiming to familiarize you with the foundational elements through simple applications. These examples serve as starting points and are not intended for production environments. For insights into deploying applications at scale, including discussions on LLM tools, evaluation, and more, refer to our content from previous weeks. As we progress through each section, we'll gradually move from basic to more advanced components.
In every section, we'll not only describe the component but also provide resources where you can find code samples to help you develop your own implementations. There are several frameworks available for developing your application, with some of the most well-known being LangChain, LlamaIndex, Hugging Face, and Amazon Bedrock, among others. Our goal is to supply resources from a broad array of these frameworks, enabling you to select the one that best fits the needs of your specific application.
As you explore each section, select a few resources that help you build an app with that component, then proceed further.

## 1. Simple LLM App (Prompt + LLM)
**Prompt:** A prompt, in this context, is essentially a carefully constructed request or instruction that guides the model in generating a response. It's the initial input given to the LLM that outlines the task you want it to perform or the question you need answered. In the second week's content, we delved extensively into prompt engineering; head back to that content to learn more.
The foundational aspect of LLM application development is the interaction between a user-defined prompt and the LLM itself. This process involves crafting a prompt that clearly communicates the user's request or question, which is then processed by the LLM to generate a response. For example:
```python
# Define the prompt template with placeholders
prompt_template = "Provide expert advice on the following topic: {topic}."
# Fill in the template with the actual topic
prompt = prompt_template.format(topic=topic)
# API call to an LLM, passing the completed prompt (not the bare topic)
llm_response = call_llm_api(prompt)
```
Observe that the prompt functions as a template rather than a fixed string, improving its reusability and flexibility for modifications at run-time. The complexity of the prompt can vary; it can be crafted with simplicity or detailed intricacy depending on the requirement.
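To illustrate the template idea further, here is a minimal sketch of a reusable, multi-placeholder prompt template using Python's standard-library `string.Template` (the placeholder names `domain` and `topic` are illustrative):

```python
from string import Template

# A reusable prompt template with multiple placeholders, filled at
# run-time. Template.substitute raises a KeyError if any placeholder
# is left unfilled, which catches mistakes early.
prompt_template = Template(
    "You are an expert in $domain. Provide advice on: $topic."
)

prompt = prompt_template.substitute(domain="personal finance", topic="budgeting")
```

The same effect can be achieved with f-strings or `str.format`; `Template` is shown here only because its strict substitution makes missing placeholders fail loudly.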
### Resources/Code
1. [**Documentation/Code**] LangChain cookbook for simple LLM Application ([link](https://python.langchain.com/docs/expression_language/cookbook/prompt_llm_parser))
2. [**Video**] Hugging Face + LangChain in 5 mins by AI Jason ([link](https://www.youtube.com/watch?v=_j7JEDWuqLE))
3. [**Documentation/Code**] Using LLMs with LlamaIndex ([link](https://docs.llamaindex.ai/en/stable/understanding/using_llms/using_llms.html))
4. [**Blog**] Getting Started with LangChain by Leonie Monigatti ([link](https://towardsdatascience.com/getting-started-with-langchain-a-beginners-guide-to-building-llm-powered-applications-95fc8898732c))
5. [**Notebook**] Running an LLM on your own laptop by LearnDataWithMark ([link](https://github.com/mneedham/LearnDataWithMark/blob/main/llm-own-laptop/notebooks/LLMOwnLaptop.ipynb))
## 2. Chaining Prompts (Prompt Chains + LLM)
Although utilizing prompt templates and invoking LLMs is effective, sometimes, you might need to ask the LLM several questions, one after the other, using the answers you got before to ask the next question. Imagine this: first, you ask the LLM to figure out what topic your question is about. Then, using that information, you ask it to give you an expert answer on that topic. This step-by-step process, where one answer leads to the next question, is called "chaining." Prompt Chains are essentially this sequence of chains used for executing a series of LLM actions.
LangChain has emerged as a widely-used library for creating LLM applications, enabling the chaining of multiple questions and answers with the LLM to produce a singular final response. This approach is particularly beneficial for larger projects requiring multiple steps to achieve the desired outcome. The example discussed illustrates a basic method of chaining. LangChain's [documentation](https://js.langchain.com/docs/modules/chains/) offers guidance on more complex chaining techniques.
```python
prompt1 = "What topic is the following question about: {question}?"
prompt2 = "Provide expert advice on the following topic: {topic}."
```
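The two templates above can be chained by feeding the first call's answer into the second prompt. A minimal sketch, using a stubbed `call_llm_api` in place of a real LLM client (the stub's canned responses are purely illustrative):

```python
def call_llm_api(prompt: str) -> str:
    # Stub standing in for a real LLM call; assume the real function
    # returns the model's text completion for `prompt`.
    if "what topic" in prompt.lower():
        return "retirement planning"
    return f"Expert advice on {prompt.split(': ')[-1].rstrip('.')}"

prompt1 = "What topic is the following question about: {question}?"
prompt2 = "Provide expert advice on the following topic: {topic}."

question = "How much should I save each month to retire at 60?"
# Step 1: ask the LLM to classify the question's topic.
topic = call_llm_api(prompt1.format(question=question))
# Step 2: use the answer from step 1 to build the next prompt.
answer = call_llm_api(prompt2.format(topic=topic))
```

Frameworks like LangChain wrap exactly this pattern in reusable chain abstractions, so you rarely write the plumbing by hand.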
### Resources/Code
1. [**Article**] Prompt Chaining on the Prompt Engineering Guide ([link](https://www.promptingguide.ai/techniques/prompt_chaining))
2. [**Video**] LLM Chains using GPT 3.5 and other LLMs — LangChain #3 James Briggs ([link](https://www.youtube.com/watch?v=S8j9Tk0lZHU))
3. [**Video**] LangChain Basics Tutorial #2 Tools and Chains by Sam Witteveen ([link](https://www.youtube.com/watch?v=hI2BY7yl_Ac))
4. [**Code**] LangChain tools and Chains Colab notebook by Sam Witteveen ([link](https://colab.research.google.com/drive/1zTTPYk51WvPV8GqFRO18kDe60clKW8VV?usp=sharing))
## **3. Adding External Knowledge Base: Retrieval-Augmented Generation (RAG)**
Next, we'll explore a different type of application. If you've followed our previous discussions, you're aware that although LLMs excel at providing information, their knowledge is limited to what was available up to their training cutoff. To generate meaningful outputs beyond that point, they require access to an external knowledge base. This is the role that Retrieval-Augmented Generation (RAG) plays.
Retrieval-Augmented Generation, or RAG, is like giving your LLM a personal library to check before answering. Before the LLM comes up with something new, it looks through a bunch of information (like articles, books, or the web) to find stuff related to your question. Then, it combines what it finds with its own knowledge to give you a better answer. This is super handy when you need your app to pull in the latest information or deep dive into specific topics.
To implement RAG (Retrieval-Augmented Generation) beyond the LLM and prompts, you'll need the following technical elements:
**A knowledge base, specifically a vector database**
A comprehensive collection of documents, articles, or data entries that the system can draw upon to find information. This database isn't just a simple collection of texts; it's often transformed into a vector database. Here, each item in the knowledge base is converted into a high-dimensional vector representing the semantic meaning of the text. This transformation is done using models similar to the LLM but focused on encoding texts into vectors.
The purpose of having a vectorized knowledge base is to enable efficient similarity searches. When the system is trying to find information relevant to a user's query, it converts the query into a vector using the same encoding process. Then, it searches the vector database for vectors (i.e., pieces of information) that are closest to the query vector, often using measures like cosine similarity. This process quickly identifies the most relevant pieces of information within a vast database, something that would be impractical with traditional text search methods.
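The similarity search described above can be sketched in a few lines. This toy example hand-picks 3-dimensional vectors for illustration; in a real system the vectors come from an embedding model, have hundreds of dimensions, and are searched with an approximate-nearest-neighbor index rather than a full scan:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the
    # product of their magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "vector database": document text mapped to its (hand-picked) vector.
knowledge_base = {
    "Our store opens at 9am on weekdays.": [0.9, 0.1, 0.0],
    "The museum is closed in winter.":     [0.1, 0.8, 0.2],
    "Tickets cost 12 dollars.":            [0.0, 0.2, 0.9],
}

def retrieve(query_vector, k=1):
    # Rank every entry by similarity to the query and return the k
    # closest documents.
    ranked = sorted(
        knowledge_base.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]

# A query vector close to the "museum" entry.
top = retrieve([0.2, 0.9, 0.1], k=1)
```

The retrieved documents are then stuffed into the prompt alongside the user's question, which is the "augmented generation" half of RAG.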
**Retrieval Component**
The retrieval component is the engine that performs the actual search of the knowledge base to find information relevant to the user's query. It's responsible for several key tasks:
1. **Query Encoding:** It converts the user's query into a vector using the same model or method used to vectorize the knowledge base. This ensures that the query and the database entries are in the same vector space, making similarity comparison possible.
2. **Similarity Search:** Once the query is vectorized, the retrieval component searches the vector database for the closest vectors. This search can be based on various algorithms designed to efficiently handle high-dimensional data, ensuring that the process is both fast and accurate.
3. **Information Retrieval:** After identifying the closest vectors, the retrieval component fetches the corresponding entries from the knowledge base. These entries are the pieces of information deemed most relevant to the user's query.
4. **Aggregation (Optional):** In some implementations, the retrieval component may also aggregate or summarize the information from multiple sources to provide a consolidated response. This step is more common in advanced RAG systems that aim to synthesize information rather than citing sources directly.
In the RAG framework, the retrieval component's output (i.e., the retrieved information) is then fed into the LLM along with the original query. This enables the LLM to generate responses that are not only contextually relevant but also enriched with the specificity and accuracy of the retrieved information. The result is a hybrid model that leverages the best of both worlds: the generative flexibility of LLMs and the factual precision of dedicated knowledge bases.
By combining a vectorized knowledge base with an efficient retrieval mechanism, RAG systems can provide answers that are both highly relevant and deeply informed by a wide array of sources. This approach is particularly useful in applications requiring up-to-date information, domain-specific knowledge, or detailed explanations that go beyond the pre-existing knowledge of an LLM.
Frameworks like LangChain already provide good abstractions for building RAG pipelines.
A simple example from LangChain is shown [here](https://python.langchain.com/docs/expression_language/cookbook/retrieval).
### Resources/Code
1. [**Article**] All You Need to Know to Build Your First LLM App by Dominik Polzer ([link](https://towardsdatascience.com/all-you-need-to-know-to-build-your-first-llm-app-eb982c78ffac))
2. [**Video**] RAG from Scratch series by LangChain ([link](https://www.youtube.com/watch?v=wd7TZ4w1mSw&list=PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x))
3. [**Video**] A deep dive into Retrieval-Augmented Generation with LlamaIndex ([link](https://www.youtube.com/watch?v=Y0FL7BcSigI&t=3s))
4. [**Notebook**] RAG using LangChain with Amazon Bedrock Titan text, and embedding, using OpenSearch vector engine notebook ([link](https://github.com/aws-samples/rag-using-langchain-amazon-bedrock-and-opensearch))
5. [**Video**] LangChain - Advanced RAG Techniques for better Retrieval Performance by Coding Crashcourses ([link](https://www.youtube.com/watch?v=KQjZ68mToWo))
6. [**Video**] Chatbots with RAG: LangChain Full Walkthrough by James Briggs ([link](https://www.youtube.com/watch?v=LhnCsygAvzY&t=11s))
## **4. Adding Memory to LLMs**
We've explored chaining and incorporating knowledge. Now, consider the scenario where we need to remember past interactions in lengthy conversations with the LLM, where previous dialogues play a role.
This is where the concept of Memory comes into play as a vital component. Memory mechanisms, such as those available on platforms like LangChain, enable the storage of conversation history. For example, LangChain's ConversationBufferMemory feature allows for the preservation of messages, which can then be retrieved and used as context in subsequent interactions. You can discover more about these memory abstractions and their applications on LangChain's [documentation](https://python.langchain.com/docs/modules/memory/types/).
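The core idea behind buffer-style memory is simple enough to sketch by hand. This is a minimal, hand-rolled analogue of what LangChain's ConversationBufferMemory does, not its actual API: store every turn, then replay the transcript as context in the next prompt.

```python
class ConversationBuffer:
    """Minimal conversation memory: keep every turn and replay the
    history as context for the next prompt."""

    def __init__(self):
        self.turns = []  # list of (speaker, message) tuples

    def add(self, speaker, message):
        self.turns.append((speaker, message))

    def as_context(self):
        # Flatten the stored turns into a transcript the LLM can read.
        return "\n".join(f"{s}: {m}" for s, m in self.turns)

memory = ConversationBuffer()
memory.add("user", "My name is Alice.")
memory.add("assistant", "Nice to meet you, Alice!")
memory.add("user", "What is my name?")

# The next prompt includes the full history, so the model can answer
# questions that depend on earlier turns.
prompt = memory.as_context() + "\nassistant:"
```

Real memory implementations add refinements on top of this, such as windowing (keeping only the last N turns) or summarizing old turns to stay within the context limit.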
### Resources/Code
1. [**Article**] Conversational Memory for LLMs with LangChain by Pinecone([link](https://www.pinecone.io/learn/series/langchain/langchain-conversational-memory/))
2. [**Blog**] How to add memory to a chat LLM model by Nikolay Penkov ([link](https://medium.com/@penkow/how-to-add-memory-to-a-chat-llm-model-34e024b63e0c))
3. [**Documentation**] Memory in LlamaIndex documentation ([link](https://docs.llamaindex.ai/en/latest/api_reference/memory.html))
4. [**Video**] LangChain: Giving Memory to LLMs by Prompt Engineering ([link](https://www.youtube.com/watch?v=dxO6pzlgJiY))
5. [**Video**] Building a LangChain Custom Medical Agent with Memory ([link](https://www.youtube.com/watch?v=6UFtRwWnHws))
## **5. Using External Tools with LLMs**
Consider a scenario within an LLM application, such as a travel planner, where the availability of destinations or attractions depends on seasonal openings. Imagine we have access to an API that provides this specific information. In this case, the application must query the API to determine if a location is open. If the location is closed, the LLM should adjust its recommendations accordingly, suggesting alternative options. This illustrates a crucial instance where integrating external tools can significantly enhance the functionality of LLMs, enabling them to provide more accurate and contextually relevant responses. Such integrations are not limited to travel planning; there are numerous other situations where external data sources, APIs, and tools can enrich LLM applications. Examples include weather forecasts for event planning, stock market data for financial advice, or real-time news for content generation, each adding a layer of dynamism and specificity to the LLM's capabilities.
In frameworks like LangChain, integrating these external tools is streamlined through its chaining framework, which allows for the seamless incorporation of new elements such as APIs, data sources, and other tools.
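The travel-planner scenario above can be sketched with a stubbed tool. Here `check_opening_api` stands in for a hypothetical seasonal-openings API, and a simple template stands in for the LLM adjusting its recommendation:

```python
def check_opening_api(attraction: str, month: int) -> bool:
    # Stub for a hypothetical seasonal-openings API. A real tool call
    # would hit an HTTP endpoint here.
    closed_in_winter = {"open-air museum", "botanical garden"}
    if attraction in closed_in_winter and month in (12, 1, 2):
        return False
    return True

def plan_recommendation(attraction: str, alternative: str, month: int) -> str:
    # Query the external tool first, then let the LLM (stubbed here as
    # a template) adjust its recommendation based on the tool's answer.
    if check_opening_api(attraction, month):
        return f"Recommend visiting the {attraction}."
    return f"The {attraction} is closed; recommend the {alternative} instead."

summer_plan = plan_recommendation("open-air museum", "city gallery", 7)
winter_plan = plan_recommendation("open-air museum", "city gallery", 1)
```

In LangChain and similar frameworks, the tool call is registered once and the framework decides when to invoke it, rather than the hard-coded ordering shown here.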
### Resources/Code
1. [**Documentation/Code**] List of LLM tools by LangChain ([link](https://python.langchain.com/docs/integrations/tools))
2. [**Documentation/Code**] Tools in LlamaIndex ([link](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/tools/root.html))
3. [**Video**] Building Custom Tools and Agents with LangChain by Sam Witteveen ([link](https://www.youtube.com/watch?v=biS8G8x8DdA))
## **6. LLMs Making Decisions: Agents**
In the preceding sections, we explored complex LLM components like tools and memory. Now, let's say we want our LLM to effectively utilize these elements to make decisions on our behalf.
LLM agents do exactly this: they are systems designed to perform complex tasks by combining LLMs with other modules such as planning, memory, and tool usage. These agents leverage the capabilities of LLMs to understand and generate human-like language, enabling them to interact with users and process information effectively.
For instance, consider a scenario where we want our LLM agent to assist in financial planning. The task is to analyze an individual's spending habits over the past year and provide recommendations for budget optimization.
To accomplish this task, the agent first utilizes its memory module to access stored data regarding the individual's expenditures, income sources, and financial goals. It then employs a planning mechanism to break down the task into several steps:
1. **Data Analysis**: The agent uses external tools to process the financial data, categorizing expenses, identifying trends, and calculating key metrics such as total spending, savings rate, and expenditure distribution.
2. **Budget Evaluation**: Based on the analyzed data, the LLM agent evaluates the current budget's effectiveness in meeting the individual's financial objectives. It considers factors such as discretionary spending, essential expenses, and potential areas for cost reduction.
3. **Recommendation Generation**: Leveraging its understanding of financial principles and optimization strategies, the agent formulates personalized recommendations to improve the individual's financial health. These recommendations may include reallocating funds towards savings, reducing non-essential expenses, or exploring investment opportunities.
4. **Communication**: Finally, the LLM agent communicates the recommendations to the user in a clear and understandable manner, using natural language generation capabilities to explain the rationale behind each suggestion and potential benefits.
Throughout this process, the LLM agent seamlessly integrates its decision-making abilities with external tools, memory storage, and planning mechanisms to deliver actionable insights tailored to the user's financial situation.
Here's how LLM agents combine various components to make decisions:
1. **Language Model (LLM)**: The LLM serves as the central controller or "brain" of the agent. It interprets user queries, generates responses, and orchestrates the overall flow of operations required to complete tasks.
2. **Key Modules**:
- **Planning**: This module helps the agent break down complex tasks into manageable subparts. It formulates a plan of action to achieve the desired goal efficiently.
- **Memory**: The memory module allows the agent to store and retrieve information relevant to the task at hand. It helps maintain the state of operations, track progress, and make informed decisions based on past observations.
- **Tool Usage**: The agent may utilize external tools or APIs to gather data, perform computations, or generate outputs. Integration with these tools enhances the agent's capabilities to address a wide range of tasks.
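The modules above can be tied together in a minimal agent loop. This sketch uses a fixed plan and rule-based dispatch where a real agent would have the LLM choose the next step; the tool functions and their outputs are purely illustrative, echoing the financial-planning example:

```python
# Two "tools" the agent can call. Real tools would query data sources
# or run computations over actual records.
def analyze_spending(expenses):
    return {"total": sum(expenses.values())}

def suggest_budget(analysis, income):
    savings = income - analysis["total"]
    return f"You can save {savings} per month."

TOOLS = {"analyze": analyze_spending, "recommend": suggest_budget}

def run_agent(expenses, income):
    memory = {}                       # memory module: stores observations
    plan = ["analyze", "recommend"]   # planning module output (fixed here;
                                      # an LLM would produce this dynamically)
    for step in plan:
        if step == "analyze":
            memory["analysis"] = TOOLS["analyze"](expenses)
        elif step == "recommend":
            memory["answer"] = TOOLS["recommend"](memory["analysis"], income)
    return memory["answer"]

answer = run_agent({"rent": 1200, "food": 400}, income=2500)
```

The frameworks listed below replace the hard-coded plan and dispatch with LLM-driven decision-making, which is what makes them agents rather than fixed pipelines.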
Existing frameworks offer built-in modules and abstractions for constructing agents. Please refer to the resources provided below for implementing your own agent.
### Resources/Code
1. [**Documentation/Code**] Agents in LangChain ([link](https://python.langchain.com/docs/modules/agents/))
2. [**Documentation/Code**] Agents in LlamaIndex ([link](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/root.html))
3. [**Video**] LangChain Agents - Joining Tools and Chains with Decisions by Sam Witteveen ([link](https://www.youtube.com/watch?v=ziu87EXZVUE&t=59s))
4. [**Article**] Building Your First LLM Agent Application by Nvidia ([link](https://developer.nvidia.com/blog/building-your-first-llm-agent-application))
5. [**Video**] OpenAI Functions + LangChain : Building a Multi Tool Agent by Sam Witteveen ([link](https://www.youtube.com/watch?v=4KXK6c6TVXQ))
---
## **7. Fine-Tuning**
In earlier sections, we explored using pre-trained LLMs with additional components. However, there are scenarios where the LLM must be updated with relevant information before usage, particularly when LLMs lack specific knowledge on a subject. In such instances, it's necessary to first fine-tune the LLM before applying the strategies outlined in the earlier sections to build an application around it.
Various platforms offer fine-tuning capabilities, but it's important to note that fine-tuning demands more resources than simply eliciting responses from an LLM, as it involves training the model to understand and generate information on the desired topics.
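Before any training runs, fine-tuning requires data in a format the training stack expects. A common convention is instruction/response records serialized as JSON Lines; the field names below are illustrative, not a standard, and the exact schema depends on the fine-tuning framework you use:

```python
import json

# Illustrative instruction-tuning examples.
examples = [
    {"instruction": "Define RAG.",
     "response": "Retrieval-Augmented Generation grounds LLM answers in retrieved documents."},
    {"instruction": "What is prompt chaining?",
     "response": "Feeding one LLM call's output into the next prompt."},
]

# Serialize to JSON Lines: one JSON object per line.
jsonl = "\n".join(json.dumps(ex) for ex in examples)

# Round-trip: a training script would read the records back like this.
records = [json.loads(line) for line in jsonl.splitlines()]
```

The resources below cover the training side itself (full fine-tuning, PEFT, and LoRA), which requires GPU resources well beyond this data-preparation step.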
### Resources/Code
1. [**Article**] How to Fine-Tune LLMs in 2024 with Hugging Face by philschmid ([link](https://www.philschmid.de/fine-tune-llms-in-2024-with-trl))
2. [**Video**] Fine-tuning Large Language Models (LLMs) | w/ Example Code by Shaw Talebi ([link](https://www.youtube.com/watch?v=eC6Hd1hFvos))
3. [**Video**] Fine-tuning LLMs with PEFT and LoRA by Sam Witteveen ([link](https://www.youtube.com/watch?v=Us5ZFp16PaU&t=261s))
4. [**Video**] LLM Fine Tuning Crash Course: 1 Hour End-to-End Guide by AI Anytime ([link](https://www.youtube.com/watch?v=mrKuDK9dGlg))
5. [**Article**] How to Fine-Tune an LLM series by Weights and Biases ([link](https://wandb.ai/capecape/alpaca_ft/reports/How-to-Fine-Tune-an-LLM-Part-1-Preparing-a-Dataset-for-Instruction-Tuning--Vmlldzo1NTcxNzE2))
## Read/Watch These Resources (Optional)
1. List of LLM notebooks by aishwaryanr ([link](https://github.com/aishwaryanr/awesome-generative-ai-guide?tab=readme-ov-file#notebook-code-notebooks))
2. LangChain How to and Guides by Sam Witteveen ([link](https://www.youtube.com/watch?v=J_0qvRt4LNk&list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ))
3. LangChain Crash Course For Beginners | LangChain Tutorial by codebasics ([link](https://www.youtube.com/watch?v=nAmC7SoVLd8))
4. Build with LangChain Series ([link](https://www.youtube.com/watch?v=mmBo8nlu2j0&list=PLfaIDFEXuae06tclDATrMYY0idsTdLg9v))
5. LLM hands on course by Maxime Labonne ([link](https://github.com/mlabonne/llm-course))