auto_create_questionset

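## Automatically Generating Questions

If you would rather not write a question set by hand to evaluate the performance of your current parameters, you can use ***eval.auto_create_questionset*** to automatically generate a question set that includes reference answers. You can then use ***eval.auto_evaluation*** to obtain evaluation metrics: Bert_score, Rouge, and LLM_score for essay question sets, and accuracy for single-choice question sets. These scores range from 0 to 1; a higher value means the generated response is closer to the reference answer.

The example below creates a question set text file named 'mic_essay.txt' containing ten questions with reference answers. Each question is randomly generated from a content passage of the documents in the 'doc/mic/' directory. You can then use this question set file to evaluate the performance of the parameters you want to test.

```python!=
import akasha.eval as eval

eva = eval.Model_Eval(
    question_style="essay",
    search_type="merge",
    model="openai:gpt-3.5-turbo",
    embeddings="openai:text-embedding-ada-002",
    record_exp="exp_mic_auto_questionset",
)

# Generate ten questions with reference answers from the documents in doc/mic/
eva.auto_create_questionset(
    doc_path="doc/mic/",
    question_num=10,
    output_file_path="questionset/mic_essay.txt",
)

# Evaluate the parameters under test against the generated question set
bert_score, rouge, llm_score, tol_tokens = eva.auto_evaluation(
    questionset_file="questionset/mic_essay.txt",
    doc_path="doc/mic/",
    question_style="essay",
    record_exp="exp_mic_auto_evaluation",
    topK=3,
    search_type="svm",
)
print("bert_score: ", bert_score, "\nrouge: ", rouge, "\nllm_score: ", llm_score)
```

```text
bert_score: 0.782
rouge: 0.81
llm_score: 0.393
```

</br>
</br>

## Using question_type to Test Different Capabilities

The question_type parameter offers four question types: ***fact***, ***summary***, ***irrelevant***, and ***compared***. The default is fact.

1. fact tests the ability to answer general factual questions
2. summary tests the model's ability to summarize
3. irrelevant tests whether the model can recognize questions whose answers do not exist in the documents
4. compared tests the model's ability to compare different things

### Example

```python!=
import akasha.eval as eval

eva = eval.Model_Eval(
    search_type="merge",
    question_type="irrelevant",
    model="openai:gpt-3.5-turbo",
    record_exp="exp_mic_auto_questionset",
)

# Generate ten "irrelevant" questions, whose answers are not in the documents
eva.auto_create_questionset(
    doc_path="doc/mic/",
    question_num=10,
    output_file_path="questionset/mic_irre.txt",
)

bert_score, rouge, llm_score, tol_tokens = eva.auto_evaluation(
    questionset_file="questionset/mic_irre.txt",
    doc_path="doc/mic/",
    question_style="essay",
    record_exp="exp_mic_auto_evaluation",
    search_type="svm",
)
```

</br>
</br>

## Specifying a Question Set Topic

If you want to generate questions on a specific topic, you can use the ***create_topic_questionset*** function. It uses the input topic to find relevant passages in the documents and generates a question set from them.

### Example

```python!=
import akasha.eval as eval

eva = eval.Model_Eval(
    search_type="merge",
    question_type="irrelevant",
    model="openai:gpt-3.5-turbo",
    record_exp="exp_mic_auto_questionset",
)

# Generate three questions from passages related to the topic "工業4.0" (Industry 4.0)
eva.create_topic_questionset(
    doc_path="doc/mic/",
    topic="工業4.0",
    question_num=3,
    output_file_path="questionset/mic_topic_irre.txt",
)

bert_score, rouge, llm_score, tol_tokens = eva.auto_evaluation(
    questionset_file="questionset/mic_topic_irre.txt",
    doc_path="doc/mic/",
    question_style="essay",
    record_exp="exp_mic_auto_evaluation",
    search_type="svm",
)
```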
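
To make the 0-to-1 score range mentioned at the start of this section more concrete, here is a minimal sketch of two metrics of that kind: a ROUGE-1-style unigram-overlap F1 for essay answers and a plain accuracy for single-choice answers. This is not akasha's implementation. The function names `unigram_f1` and `single_choice_accuracy` are hypothetical, and the whitespace tokenization is an assumption that only suits space-delimited text (Chinese text would need a proper tokenizer).

```python
from collections import Counter
from typing import Sequence


def unigram_f1(candidate: str, reference: str) -> float:
    """ROUGE-1-style F1 (illustrative only): unigram overlap of two texts."""
    # Assumption: naive lowercase + whitespace tokenization
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def single_choice_accuracy(predicted: Sequence[str], correct: Sequence[str]) -> float:
    """Fraction of single-choice questions answered correctly."""
    if not correct:
        return 0.0
    hits = sum(p == c for p, c in zip(predicted, correct))
    return hits / len(correct)


reference = "industry 4.0 connects machines through sensors"
close = "industry 4.0 connects machines with sensors"
far = "the weather is sunny today"

# A response that shares most words with the reference scores near 1,
# while an unrelated response scores 0.
print("close:", round(unigram_f1(close, reference), 3))
print("far:", unigram_f1(far, reference))
print("accuracy:", single_choice_accuracy(["1", "3", "2", "4"], ["1", "2", "2", "4"]))
```

Both functions return a value between 0 and 1, so they can be read the same way as the scores returned by ***eval.auto_evaluation***: the closer to 1, the closer the response is to the reference answer.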