# RdflibEndpointAccess

# Init

```python=
from RdflibEndpointAccess import *

access = RdflibEndpointAccess(
    mention_to_entity_file='/nfs/nas-7.1/tyren/kbqa/ccdata/pkubase-mention2ent.txt',
    triple_dir='/nfs/nas-7.1/ybdiau/ccks/',
    calculate_entity_mapping=False,
    cache_path='/nfs/nas-7.1/ybdiau/ccks/rdflib-endpoint-cache/',
    lac_use_cuda=False
)
```

# Sparql

The query conventions differ a little from the usual ones (see the snippet below):

- Entities and relations can be written in either of two forms:
    1. `<file:///nfs/nas-7.1/ybdiau/ccks/微软>`
    2. `rdf:微软`
- The two forms are equivalent.
- Literals must be wrapped in `""`, for example:
    - `"柳如"`

```python=
# Entity as subject: every (property, value) pair leaving rdf:湖上草, excluding self-loops
access.sparql('select distinct ?p1 ?x where { rdf:湖上草 ?p1 ?x . FILTER (?x != rdf:湖上草) } ')

# Literal as object: every (subject, property) pair pointing at the literal "柳如"
access.sparql('select distinct ?p1 ?x where { ?x ?p1 "柳如" . }')
```

# One-Hop & Two-Hop Candidates

- `try_cache` defaults to `True` everywhere. When querying in-edges, the cache is checked first: if the query has been run before, the cached result is used directly; otherwise the query is run and its result is cached.
    - With `False`, the query is always re-run.
- Two-Hop has its own `out_in_style` option, which takes one of three values:
    - `OUT_IN_SKIP`: ignore Out-In paths entirely.
    - `OUT_IN_ONE_PASS`: query the Out-In edges with a single Sparql query (the same approach as the other three path types).
    - `OUT_IN_TWO_PASS`: use two levels of Sparql queries (first find every `Entity -> X`, then find `Y -> X`; the second-level queries can be sped up by the cache).

```python=
ones = access.one_hop_candidates('rdf:微软', try_cache=True)
twos = access.two_hop_candidates('rdf:微软', out_in_style=OUT_IN_TWO_PASS)
```

# Two-Hop (Out-Out only)

```python=
twos = two_hop_candidates_out_out_only('rdf:微软')
```

# Triple Fillers

```python=
# Leave one slot as None; the other two slots constrain the lookup.
access.get_triple_fillers(subject='rdf:光晕：斯巴达进攻', property='rdf:游戏平台', object=None)
access.get_triple_fillers(subject='rdf:光晕：斯巴达进攻', property=None, object='rdf:微软')
access.get_triple_fillers(subject=None, property='rdf:游戏平台', object='rdf:微软')
```
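
Each of the three `get_triple_fillers` calls above leaves exactly one slot as `None`. The sketch below illustrates that wildcard pattern on an in-memory list of triples. It is not the implementation of `get_triple_fillers`, whose return type is not documented here. `fill_triple`, `TOY_TRIPLES`, and the `rdf:A`-style names are hypothetical and exist only for this example.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]


def fill_triple(triples: Iterable[Triple],
                subject: Optional[str] = None,
                predicate: Optional[str] = None,
                obj: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the slots that are not None.

    Hypothetical helper: a None slot acts as a wildcard.
    """
    pattern = (subject, predicate, obj)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Toy data with placeholder names, not taken from the real knowledge base.
TOY_TRIPLES: List[Triple] = [
    ('rdf:A', 'rdf:p', 'rdf:B'),
    ('rdf:A', 'rdf:q', 'rdf:B'),
    ('rdf:C', 'rdf:p', 'rdf:B'),
    ('rdf:A', 'rdf:p', 'rdf:D'),
]

# Object left open: which objects complete (rdf:A, rdf:p, ?)
objects = [o for _, _, o in fill_triple(TOY_TRIPLES, subject='rdf:A', predicate='rdf:p')]

# Property left open: which properties connect rdf:A to rdf:B
properties = [p for _, p, _ in fill_triple(TOY_TRIPLES, subject='rdf:A', obj='rdf:B')]

# Subject left open: which subjects complete (?, rdf:p, rdf:B)
subjects = [s for s, _, _ in fill_triple(TOY_TRIPLES, predicate='rdf:p', obj='rdf:B')]

print(objects)
print(properties)
print(subjects)
```

The helper returns whole triples and lets the caller pick out the open slot, so one function covers all three call shapes.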