## File Summarization

To create a summary of a text file (.pdf, .txt, .docx), you can use the ***summary.summarize_file*** function. For example, the code below uses the map_reduce summarization method and instructs the language model to generate a summary of roughly 500 words. Two summary types are available, ***map_reduce*** and ***refine***: ***map_reduce*** summarizes each text chunk individually, then generates the final summary from all of the chunk summaries; ***refine*** summarizes the chunks one by one, using the previous summary as part of the prompt for the next chunk, which yields a more consistent high-level summary.

### Example

```python
import akasha

sum = akasha.Summary(chunk_size=1000, chunk_overlap=100)
sum.summarize_file(
    file_path="doc/mic/5軸工具機因應市場訴求改變的發展態勢.pdf",
    summary_type="map_reduce",
    summary_len=500,
    chunk_overlap=40,
)
```

```text
### Arguments of Summary class ###
Args:
    **chunk_size (int, optional)**: chunk size of texts from documents. Defaults to 1000.
    **chunk_overlap (int, optional)**: chunk overlap of texts from documents. Defaults to 40.
    **model (str, optional)**: llm model to use. Defaults to "gpt-3.5-turbo".
    **verbose (bool, optional)**: show log texts or not. Defaults to False.
    **threshold (float, optional)**: the similarity threshold of searching. Defaults to 0.2.
    **language (str, optional)**: the language of documents and prompt, used to make sure docs won't exceed the max token size of the llm input.
    **record_exp (str, optional)**: if record_exp is not empty, use aiido to save running params and metrics to the remote mlflow, and set record_exp as the experiment name. Defaults to "".
    **system_prompt (str, optional)**: the system prompt that assigns special instructions to the llm model; it will not be used when searching relevant documents. Defaults to "".
    **max_doc_len (int, optional)**: max document length of llm input. Defaults to 1500.
    **temperature (float, optional)**: temperature of llm model, from 0.0 to 1.0. Defaults to 0.0.
    **auto_translate (bool, optional)**: translate the summary into the target language or not.
```
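The difference between the two strategies can be sketched with a toy summarizer. This is not akasha's implementation: the `summarize` stub below is a hypothetical stand-in for an LLM call (it just keeps the first few words), and the two functions only illustrate the control flow of each strategy.

```python
# Toy illustration of the map_reduce vs. refine summarization strategies.
# `summarize` is a hypothetical stand-in for an LLM call; here it simply
# keeps the first `max_words` words of its input.

def summarize(text: str, max_words: int = 5) -> str:
    """Hypothetical LLM stub: truncate to the first `max_words` words."""
    return " ".join(text.split()[:max_words])

def map_reduce_summary(chunks: list[str]) -> str:
    # "Map" step: summarize each chunk independently ...
    partials = [summarize(c) for c in chunks]
    # ... "reduce" step: summarize the concatenated partial summaries.
    return summarize(" ".join(partials), max_words=10)

def refine_summary(chunks: list[str]) -> str:
    # Summarize chunks one by one, folding the previous summary into the
    # input for the next chunk so earlier context shapes later summaries.
    summary = summarize(chunks[0])
    for chunk in chunks[1:]:
        summary = summarize(summary + " " + chunk, max_words=10)
    return summary

chunks = [
    "Five-axis machine tools allow simultaneous control of rotation and tilt.",
    "Market demand is shifting toward higher precision and shorter cycles.",
    "Vendors respond with integrated CAM software and sensor feedback.",
]
print(map_reduce_summary(chunks))
print(refine_summary(chunks))
```

Note how map_reduce can summarize the chunks in parallel, while refine is inherently sequential because each step depends on the previous summary.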