# Description

Back-end API for the reasoning question-answering system, built with the FastAPI framework.

The project lives on server 147, at `~/reasoning-qa/reasoning`.

# Environment

* python 3.8

# Build

The project is managed with docker-compose.

```
cd reasoning-qa
tree -L 1
.
├── docker-compose.yaml
├── neo4j
├── reasoning
└── web
docker-compose up -d
```

# APIs

Real input and output examples are available in the [system's OpenAPI docs](http://140.116.245.147:888/docs).

## 1. Reasoning knowledge graph building (individual)

### Description

Builds a knowledge graph (KG) from an article.

### Input

* title: article title
* KG: KG name
    * Determines which KG the data is built into
* news: article content
* release_time: article publication time (optional)
    * If omitted, the current time is used as release_time
* link: article link (optional)

### Output

* cypher: result of the knowledge graph build
    * The front end can run this cypher to visualize the resulting knowledge graph
* news_id: article id, used by single-article question answering
    * The front end can pass this id to single-article question answering to ask questions within this article (knowledge graph)
* KG: target KG

## 2. Single-article question answering

### Description

Answers a question based on a given article. The article must first be built into the knowledge graph through `Knowledge graph building`.

### Input

* news_id: article id
    * Returned by Reasoning knowledge graph building
* KG: the KG in which to answer the question
    * Returned by Reasoning knowledge graph building
* question: the user's question

### Output

* answers: answers to the question
    * Each answer contains the answer text, the source link, and the news outlet
* cypher: the knowledge graph corresponding to the answers
    * The front end can use this cypher to show which knowledge graph nodes the answers were derived from
* error: whether an error occurred (returned as a string in the examples below)
    * false: normal
    * true: error (the front end uses this flag to display error information)

<details>
<summary>Example output (normal)</summary>

```json
{
    "answers": [
        {
            "answer": "端午節將至,許多人採購準備大快朵頤,腸胃肝膽科醫師提醒民眾,糯米並非平常的主食,相較白米飯消化器官得要多0.5至1小時才能消化,一般人1天最好不要吃超過2顆粽子的量,避免「粽」傷害",
            "link": "https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv",
            "company": "中時新聞網"
        },
        {
            "answer": "1天食用不要超過2顆,粽子吃太多也容易造成體重負擔",
            "link": "https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv",
            "company": "中時新聞網"
        }
    ],
    "cypher": "match p = (n) - [*] -> (m) where (n.event = \"端午節將至\" and m.event = \"避免「粽」傷害\" and \"https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv\" in m.links) or (n.event = \"1天食用不要超過2顆\" and m.event = \"粽子吃太多也容易造成體重負擔\" and \"https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv\" in m.links) return p",
    "error": "false"
}
```

</details>

<details>
<summary>Example output (error)</summary>

```json
{
    "answers": [],
    "cypher": "",
    "error": "true"
}
```

</details>

## 3. Multiple articles question answering

### Description

Answers a question based on the data in the knowledge graph (1,560 articles).

### Input

* question: the user's question
* KG: the knowledge graph to query

### Output

* answers: answers to the question
    * Each answer contains the answer text, the source link, and the news outlet
* cypher: the knowledge graph corresponding to the answers
    * The front end can use this cypher to show which knowledge graph nodes the answers were derived from
* error: whether an error occurred (returned as a string in the examples below)
    * false: normal
    * true: error (the front end uses this flag to display error information)

<details>
<summary>Example output (normal)</summary>

```json
{
    "answers": [
        {
            "answer": "衛福部豐原醫院提醒,患有糖尿病的患者不能吃太多,因為一顆粽子的熱量等同一碗飯,吃太多容易造成血糖飆高",
            "link": "https://www.ettoday.net/news/20200601/1727260.htm",
            "company": "ETtoday"
        },
        {
            "answer": "趙函穎提醒,三高患者不可餐餐吃粽子,淺嚐即可,吃粽子時,應該細嚼慢嚥,補充水份,以及配合一些含有豐富膳食纖維、高營養價值的蔬果",
            "link": "https://www.setn.com/News.aspx?NewsID=1124040",
            "company": "三立新聞網"
        },
        {
            "answer": "端午節將至,許多人採購準備大快朵頤,腸胃肝膽科醫師提醒民眾,糯米並非平常的主食,相較白米飯消化器官得要多0.5至1小時才能消化,一般人1天最好不要吃超過2顆粽子的量,避免「粽」傷害",
            "link": "https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv",
            "company": "中時新聞網"
        },
        {
            "answer": "1天食用不要超過2顆,粽子吃太多也容易造成體重負擔",
            "link": "https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv",
            "company": "中時新聞網"
        }
    ],
    "cypher": "match p = (n) - [*] -> (m) where (n.event = \"衛福部豐原醫院提醒\" and m.event = \"吃太多容易造成血糖飆高\" and \"https://www.ettoday.net/news/20200601/1727260.htm\" in m.links) or (n.event = \"趙函穎提醒\" and m.event = \"以及配合一些含有豐富膳食纖維、高營養價值的蔬果\" and \"https://www.setn.com/News.aspx?NewsID=1124040\" in m.links) or (n.event = \"端午節將至\" and m.event = \"避免「粽」傷害\" and \"https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv\" in m.links) or (n.event = \"1天食用不要超過2顆\" and m.event = \"粽子吃太多也容易造成體重負擔\" and \"https://www.chinatimes.com/realtimenews/20220531004310-260418?chdtv\" in m.links) return p",
    "error": "false"
}
```

</details>

<details>
<summary>Example output (error)</summary>

```json
{
    "answers": [],
    "cypher": "",
    "error": "true"
}
```

</details>

# Architecture

![](https://hackmd.io/_uploads/r1-971zn2.png)

* `build_kg.py`: main program for building the knowledge graph
* `tools.py`: utilities for preprocessing, relation extraction, and so on
* `reasoning.py`: main program for the reasoning process (TSRC Structural Reasoning + Answer Generation)
* `concept.py`: Question Analysis
* `neo4japi.py`: Causal Issue Matching

# Neo4j performance

## Constraint

```cypher
create constraint event_constraint_on_causation for (n: Causation) require n.event is unique
```

> Speeds up lookups (equality check). Make good use of labels when writing Cypher queries :)

## Index

```cypher
create text index event_index_on_node for (n: Event) on (n.event)
```

> Speeds up lookups (CONTAINS)

## Reference

* [neo4j constraint](https://neo4j.com/docs/cypher-manual/current/constraints/)
* [neo4j index](https://neo4j.com/docs/cypher-manual/current/query-tuning/indexes/#_index_preference)
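
The constraint and text index in the Neo4j performance section are defined per label and property, so a query generally needs to name that label for the planner to consider them. The following is a minimal sketch of two labelled, parameterized queries, one per access pattern. The helper names `exact_event_query` and `contains_event_query` are hypothetical and not part of this project; the labels `Causation` and `Event` and the property `event` are taken from the statements above.

```python
from typing import Any, Dict, Tuple

# A Cypher query string plus its parameter map.
Query = Tuple[str, Dict[str, Any]]


def exact_event_query(event: str) -> Query:
    """Equality lookup on Causation.event.

    Naming the :Causation label lets the planner consider the index that
    backs the uniqueness constraint on (Causation, event).
    """
    cypher = "MATCH (n:Causation) WHERE n.event = $event RETURN n"
    return cypher, {"event": event}


def contains_event_query(fragment: str) -> Query:
    """Substring lookup on Event.event.

    Naming the :Event label lets the planner consider the text index,
    which targets string predicates such as CONTAINS.
    """
    cypher = "MATCH (n:Event) WHERE n.event CONTAINS $fragment RETURN n"
    return cypher, {"fragment": fragment}


if __name__ == "__main__":
    # Print the queries; executing them is left to whichever client is in use.
    for query, params in (
        exact_event_query("端午節將至"),
        contains_event_query("粽子"),
    ):
        print(query, params)
```

With the official `neo4j` Python driver, each pair could be passed to `session.run(query, params)`. Passing values as parameters rather than formatting them into the string also avoids having to escape quotes in event text.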