PEFT and Prompt Engineering Survey
===

For this final assignment I mainly studied PEFT and prompt engineering, i.e., how to adapt a model that someone else has already trained so it better fits your own needs. Ordinary people are unlikely to have the resources to train a model from scratch (it simply costs too much), so the usual approach is to take a pre-trained model and modify it slightly.

# PEFT

PEFT (Parameter-Efficient Fine-Tuning) fine-tunes only a small number of parameters while still achieving good results. In Assignment 12 I studied the Hugging Face PEFT documentation and got a general picture of the workflow:

- First, create a PeftConfig that defines the adapter parameters
- Load the pre-trained model and use the PeftConfig to build a PeftModel
- Train the model with the Transformers Trainer, Accelerate, or any custom PyTorch training loop
- Load the trained model back with the from_pretrained method and run inference

(A minimal code sketch of this workflow appears after the Few-shot Learning section below.)

## **Integrations**

PEFT can manage adapters for both Diffusers and Transformers.

## **Prompt-based methods**

These methods give the model the task, or examples of the task, to learn from. There are three approaches: p-tuning, prefix tuning, and prompt tuning.

- **p-tuning**: adds trainable prompt embeddings to the input and optimizes them with a prompt encoder to find a better prompt, removing the need to design prompts by hand. The prompt tokens can be inserted anywhere in the input sequence, and p-tuning also introduces anchor tokens to improve performance.
- **prefix tuning**: prefix parameters are inserted into all layers of the model, whereas prompt tuning only adds the prompt parameters to the model's input embeddings.
- **prompt tuning**: trains and stores only a relatively small set of task-specific prompt parameters.

## **LoRA methods**

These fine-tune by adding a small number of trainable parameters to the model instead of retraining the whole model. The main variants are:

- **LoRA** (Low-Rank Adaptation): fine-tunes by inserting low-rank matrices into the model's attention layers. These matrices are trainable during training and can be merged with the original weights at inference time, so they add no extra inference cost.
- **LoHa** (Low-Rank Hadamard Product): combines low-rank matrices via the Hadamard product. Applying the low-rank update to the model's weight matrices through a Hadamard product helps increase the expressiveness of the model.
- **LoKr** (Low-Rank Kronecker Product): expresses the weight update as a Kronecker product of smaller matrices, which helps improve performance while keeping the number of added parameters small.
- **AdaLoRA** (Adaptive Low-Rank Adaptation): dynamically allocates the budget of trainable parameters according to the importance of each weight matrix, which distributes parameters more effectively across layers and improves fine-tuning results.

## IA3

IA3 (Infused Adapter by Inhibiting and Amplifying Inner Activations) is a PEFT method that multiplies the model's activations by three learned vectors. It introduces even fewer trainable parameters than LoRA, which adds weight matrices rather than vectors. The original model's parameters stay frozen and only these vectors are updated, so fine-tuning for a new downstream task is faster, cheaper, and more efficient.

## **Quantization**

Quantization is a technique for representing data with fewer bits, which is especially useful for reducing memory usage and speeding up inference.

# Prompt Engineering

This part draws on Assignments 9, 13, and 14.

I think prompt engineering is really just like talking to another person: you have to state things clearly and give examples so the other side can produce exactly the output you want. A good prompt should pay attention to the following:

- Clear question: the content must be specific
- Explicit goal: avoid open-ended questions
- Relevant focus: one instruction at a time, with a consistent topic

Symbols that appear in prompts:

- **Triple quotes** are used in many programming languages (like Python) to define multi-line strings. In prompt engineering, you can use them to encapsulate detailed instructions or context information, ensuring that the AI interprets them as a single block of text.
- **Triple hashes** can be used interchangeably with triple quotes.
- **Single hashes** are suitable for dropping short instructions into a long and complex prompt. They are like code comments.

(https://www.youtube.com/watch?v=nxzElEJPRWI)

## PROMPT TEMPLATES

Everyone understands things differently, so to be understood we have to follow certain patterns. The same goes for LLMs: you have to use the template the model's authors provide to communicate with it correctly.

## AUTOMATE FRAMEWORK & CO-STAR

Both methods try to describe the task as clearly as possible so the LLM understands what kind of content to produce.

### AUTOMATE FRAMEWORK

- **A**ct as a …
- **U**ser Persona & Audience
- **T**argeted Action
- **O**utput Definition
- **M**ode / Tonality / Style
- **A**typical Cases
- **T**opic Whitelisting
- **E**liminate Garbage Text

### CO-STAR

CO-STAR breaks a prompt into Context, Objective, Style, Tone, Audience, and Response format:

![image](https://hackmd.io/_uploads/B1LKh5YEC.png)

### Others

I attended a talk hosted by the school's incubation center on generating marketing copy with ChatGPT, and the speaker shared a few techniques:

- Use "quotation marks" for emphasis
- Avoid overly subjective questions

**Five elements of a conversation structure**

1. Define the role
2. State the end goal
3. Reference material: the topic of your message, and any relevant models, theories, or tools that help produce the final answer, e.g., 5W2H
4. Set the output format
5. Provide information: supply relevant background so the AI can generate a more precise and relevant response

## FEW-SHOT LEARNING

Give the LLM a few examples so it understands what is expected:

```
Input:
# Example
human: How can I prevent SQL Injection?
AI: The easiest and most effective way to prevent SQL Injection is to use Parameterized Queries or Prepared Statements. This method ensures that user input is always treated as literal values and not part of the SQL command.

How can I prevent a DDoS attack?
```
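As a concrete illustration of the pattern above, here is a minimal Python sketch that assembles a few-shot prompt from example Q/A pairs. The `ask_llm` helper mentioned in the comments is hypothetical: a stand-in for whatever chat/completions API you actually call.

```python
# Minimal few-shot prompt builder. `ask_llm(prompt) -> str` is a
# hypothetical stand-in for any real chat/completions API call.
def build_few_shot_prompt(examples, question):
    """Place worked Q/A examples before the real question so the
    model can infer the expected format, tone, and depth."""
    parts = ["# Example"]
    for q, a in examples:
        parts.append(f"human: {q}\nAI: {a}")
    parts.append(f"human: {question}\nAI:")
    return "\n\n".join(parts)

examples = [
    ("How can I prevent SQL Injection?",
     "Use Parameterized Queries or Prepared Statements, so user input is "
     "always treated as literal values and not part of the SQL command."),
]
prompt = build_few_shot_prompt(examples, "How can I prevent a DDoS attack?")
print(prompt)  # in practice: answer = ask_llm(prompt)
```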
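And, looping back to the PEFT workflow from the first half of this note, here is a minimal sketch following the Hugging Face PEFT quickstart pattern. The base model name and the LoRA hyperparameters are illustrative choices, not fixed requirements.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model, PeftModel

# 1. Create a PeftConfig defining the adapter hyperparameters.
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,              # rank of the low-rank update matrices
    lora_alpha=32,    # scaling factor for the update
    lora_dropout=0.1,
)

# 2. Load the pre-trained model and wrap it into a PeftModel.
base = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only a tiny fraction is trainable

# 3. Train with the Transformers Trainer, Accelerate, or a custom
#    PyTorch loop (omitted here), then save just the adapter weights.
model.save_pretrained("my-lora-adapter")

# 4. For inference, load the saved adapter back onto the base model.
model = PeftModel.from_pretrained(base, "my-lora-adapter")
```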
## RAG

RAG brings in external data for the LLM to draw on when answering, for example LangChain's retrieval chain example: [Quickstart | 🦜️🔗 LangChain](https://python.langchain.com/v0.1/docs/get_started/quickstart/#retrieval-chain)

## THOUGHT

Guide the LLM's reasoning.

![image](https://hackmd.io/_uploads/ryMj35YVR.png)

### CHAIN OF THOUGHT

Give the LLM the steps of the task, broken down:

```
Input:
I want to know how to learn threat hunting.
First, I will search for and list websites that teach threat hunting.
Second, I will compare how many people use each one.
Finally, I will choose the website with positive user reviews and the best ratings to learn from.

How to learn threat hunting?
```

### **Self-Consistency**

Ask the LLM the same question several times; the answer it gives most often tends to be closer to the correct one. (A minimal voting-loop sketch appears at the end of this note, just before the references.)

### TREE OF THOUGHT

Also lays out steps, but the thinking branches like a tree:

```
Step 1: I have a problem related to {input}. Could you brainstorm three distinct solutions? Please consider a variety of factors such as {perfect_factors}

Step 2: For each of the three proposed solutions, evaluate their potential. Consider their pros and cons, initial effort needed, implementation difficulty, potential challenges, and the expected outcomes. Assign a probability of success and a confidence level to each option based on these factors. {solutions}

Step 3: For each solution, deepen the thought process. Generate potential scenarios, strategies for implementation, any necessary partnerships or resources, and how potential obstacles might be overcome. Also, consider any potential unexpected outcomes and how they might be handled. {review}

Step 4: Based on the evaluations and scenarios, rank the solutions in order of promise. Provide a justification for each ranking and offer any final thoughts or considerations for each solution. {deepen_thought_process}
```

```
overall_chain({"input": "human colonization of Mars",
               "perfect_factors": "The distance between Earth and Mars is very large, making regular resupply difficult"})
```

### **Chain-of-Symbol (CoS) Prompting**

Represents information with condensed symbols so the LLM can better handle spatial reasoning problems:

```
CoT: "The email claims to be from a trusted source but contains suspicious links and requests sensitive information."
CoS: "Email(Source: Trusted) [Links: Suspicious, Request: Sensitive Info] => [Action: Mark as Phishing]"
```

## **Generated knowledge prompting**

Have the LLM generate background knowledge first, then use that knowledge to produce the final output correctly:

```
Input: Generate five strategies about how to identify and prevent phishing attacks.

Input: Use your response to generate a formal staff training manual draft.
```

## **Self-refine**

Give the LLM some content to critique, then have it produce a new version based on its own suggestions:

```
Input: Generate a draft of basic security policies for the organization. Please give me a formal draft.

Input: Evaluate whether the policies in the draft comply with best practices, regulatory requirements, and ISO standards.

Input: Update the policies based on the feedback to include the missing elements. Please give me a formal draft.
```

## **Maieutic prompting**

Similar to tree of thought: have the model answer the question and explain why, then explain those explanations in turn; inconsistent parts are discarded:

```
Input: Should antivirus software be installed on all employees' computers?

Input: Why should antivirus software be installed on all employees' computers?

Input: Should antivirus software be installed on all employees' computers?
```

## **Directional-stimulus prompting**

Give the LLM hints about the direction the output should take:

```
Initial input: "Update our security incident response plan to include the latest threat intelligence."
Directional prompt: "Consider ransomware, APT attacks, and insider threats."
```
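As promised in the Self-Consistency section above, here is a minimal sketch of the voting loop. The `ask_llm` function is again a hypothetical stand-in for a sampled (non-zero temperature) LLM call; in practice you would vote over the final answers extracted from each reasoning chain.

```python
import random
from collections import Counter

def self_consistency(ask_llm, prompt, n=5):
    """Ask the same question n times and return the most frequent
    answer as the consensus result."""
    answers = [ask_llm(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Toy stand-in for a sampled LLM call: usually right, sometimes off.
def fake_llm(prompt):
    return random.choice(["42", "42", "42", "41"])

print(self_consistency(fake_llm, "What is 6 x 7?"))
```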
# References

- https://en.wikipedia.org/wiki/Prompt_engineering
- https://promptingguide.azurewebsites.net/techniques
- https://medium.com/@astropomeai/implementing-the-tree-of-thoughts-in-langchains-chain-f2ebc5864fac
- https://medium.com/the-generator/the-perfect-prompt-prompt-engineering-cheat-sheet-d0b9c62a2bba
- The course lecture slides