# State Aware Reasoning Algorithm

**Reasoning without a template**

**Like a Reasoning but unlike any Reasoning**

Team : Pranav (intern), Sudhanshu, Naresh, Krishna

## Philosophy

Compass over Map when doing reasoning!

## Objective

Build a reasoning algorithm that works given a **STATE** and a **Large Graph**. The query/question is represented as a **STATE (graph) object**, which serves as a blueprint of the path into a large real-world knowledge graph. The algorithm takes an **Action** on the **STATE** to move to the **next STATE** and stops automatically once it reaches a solution/answer. This reasoning algorithm can be plugged into one of the Knowledge Repository releases and into any of the Q&A systems built in Jio.

## Steps

1. **Question2State**: Build a module to represent the question as a STATE object.

    > Base Graph : A triplet [Subject, Predicate, Object]

    > Constraints (Type, Entity Constraint, Predicate Constraint, Type Constraint, Temporal Constraint, Ordinal Constraint etc)

    ```
    message EntityConstaint {
        Token token_type = 1;
        // Entity entity = 1;
        // ex. X node in STATE equals to Amitabh's node in KG
    }
    ```

    ```
    message TypeConstaint {
        Token token_type = 1;
        // ex. FILM, DIRECTOR, ACTOR are Entity Types
    }
    // Expression for TypeConstaint : X 'equals' token_type,
    // where 'equals' in the KG means the class/collection of X is token_type
    // ex. Type = FILM, X could be any film
    ```

    ```
    message TemporalConstaint {
        Expression expression = 1; // ex. film.release > 2000
        // AttributeToken attribute = 1;
        // QualifierToken qualifer = 2;
        // Operation = 3;
    }
    ```

    ```
    message OrdinalConstaint {
        Expression expression = 1;
        // ex. MinAtN poorest, lowest
        // ex. MaxAtN richest, highest
    }
    ```

2. Pass this STATE and the KG (a pointer to the large knowledge graph) to the reasoning algorithm. The algorithm takes all the constraints learned in step 1 and applies them to the Large Graph to generate the path:
    1. The Base Graph helps to reach the starting point (sub space).
    2. The entity constraint helps to resolve information about an unidentified node.
    3. The type constraint helps to take a path on which an instance of that type occurs.
    4. The temporal constraint helps to filter the entity instances in the path.
    5. The ordinal constraint helps to order the results by some attribute value.

---

### Graph used for approach (In memory graph)

```graphviz
digraph{
  "Rajkumar Hirani"-> "3 Idiots" [label=" directed"]
  "Rajkumar Hirani"-> "PK" [label=" directed"]
  "Ramesh Sippy"-> "Sholay" [label=" directed"]
  "Ramesh Sippy"-> "Zamana Deewana" [label=" directed"]
  "Ramesh Sippy"-> "Andaaz" [label=" directed"]
  "Aamir Khan"-> "3 Idiots" [label=" starred"]
  "Aamir Khan"-> "PK" [label=" starred"]
  "Amitabh Bachchan"-> "Agneepath" [label=" starred"]
  "Amitabh Bachchan"-> "Sholay" [label=" starred"]
  "Amitabh Bachchan"-> "Zamana Deewana" [label=" starred"]
  "3 Idiots"-> "film" [label=" type"]
  "PK"-> "film" [label=" type"]
  "Agneepath"-> "film" [label=" type"]
  "Sholay"-> "film" [label=" type"]
  "Andaaz"-> "film" [label=" type"]
  "Zamana Deewana"-> "play" [label=" type"]
  "Rajkumar Hirani"-> "director" [label=" type"]
  "Ramesh Sippy"-> "director" [label=" type"]
  "Aamir Khan"-> "actor" [label=" type"]
  "Amitabh Bachchan"-> "actor" [label=" type"]
}
```

---

### Question

1. Which film starred by Amitabh Bachchan and directed by Ramesh Sippy?
2. Which city Dhoni born in?

---

### Dry run

![](https://i.imgur.com/x5GLXr6.png)

---

### Question to STATE

#### Dividing a question in base and constraint queries

Our approach to building a Knowledge Based Question Answering (KBQA) system capable of answering complex questions rests on the observation that any complex question can be divided into a base query plus some additional constraints. We achieve this using the **POS tagging** and **dependency parsing** capabilities provided by CoreNLP. The procedure is described below.

##### Step 1: Extracting entities using POS tagging :pick:

In any question, entities are expressed as either proper or common nouns.
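
The noun filter can be sketched without a running CoreNLP server. The snippet below is a minimal illustration, not the project's `noun_extractor`: it assumes the question has already been POS-tagged (the Penn Treebank tags here are written by hand, not tagger output) and keeps the last token of each run of consecutive noun tags, so that a name such as "Amitabh Bachchan" contributes only its head noun. The helper name `extract_head_nouns` and the head-noun rule are assumptions made for illustration.

```python
# Hand-tagged tokens (Penn Treebank tags). In practice these would come from
# a POS tagger such as CoreNLP; the tags below are an assumption.
tagged_question = [
    ("Which", "WDT"), ("film", "NN"), ("starred", "VBN"), ("by", "IN"),
    ("Amitabh", "NNP"), ("Bachchan", "NNP"), ("and", "CC"),
    ("directed", "VBN"), ("by", "IN"), ("Ramesh", "NNP"), ("Sippy", "NNP"),
    ("?", "."),
]


def extract_head_nouns(tagged_tokens):
    """Return the last token of every run of consecutive noun tags (NN, NNS, NNP, NNPS)."""
    nouns = set()
    previous = None  # most recent noun token in the current run, if any
    for word, tag in tagged_tokens:
        if tag.startswith("NN"):
            previous = word
        else:
            # A non-noun token closes the current run of nouns
            if previous is not None:
                nouns.add(previous)
            previous = None
    if previous is not None:  # the question may end on a noun
        nouns.add(previous)
    return nouns


head_nouns = extract_head_nouns(tagged_question)
print(sorted(head_nouns))  # ['Bachchan', 'Sippy', 'film']
```
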
Using this knowledge, we first extract all the nouns in the question. This gives us a starting point for our graph.

> Example: Which film starred by Amitabh Bachchan and directed by Ramesh Sippy?
> [color=#0000ff]

```python
text = "Which film starred by Amitabh Bachchan and directed by Ramesh Sippy?"
# noun_extractor is the project's POS-tagging helper (not shown here)
nouns = noun_extractor(text)
print(nouns)
```

Output:
```python
nouns = {'film', 'Bachchan', 'Sippy'}
```

##### Step 2: Using dependency parsing to get relations :link:

Now that we have a list of nouns, we use dependency parsing to identify the relationships between the nouns and the end points of the graph. With this information we convert a natural-language question into a graph.

> Example: Which film starred by Amitabh Bachchan and directed by Ramesh Sippy?
> [color=#0000ff]

```python
# Assumption: `parser` is NLTK's CoreNLP dependency parser, talking to a
# CoreNLP server on localhost:9000; adjust the URL to your setup.
from nltk.parse.corenlp import CoreNLPDependencyParser

parser = CoreNLPDependencyParser(url='http://localhost:9000')
result = parser.raw_parse(text)
dependency = next(result)
# Each triple is ((head_word, head_tag), relation, (dependent_word, dependent_tag))
triples = list(dependency.triples())

# Identifying output type
requested_output = []
for head, rel, dep in triples:
    if dep[1] == 'WDT':
        requested_output.append(head[0])
    if head[1] == 'WP' and dep[1] == 'NN':
        requested_output.append(dep[0])

# Identifying relationships for all the nouns (sorted for a stable order)
relationships = []
for noun in sorted(nouns):
    for head, rel, dep in triples:
        if dep[0] == noun and noun not in requested_output and rel != "conj":
            relationships.append([dep[0], head[0]])

# Adding compound words to nouns (eg. Bachchan into Amitabh Bachchan)
for i, (n, v) in enumerate(relationships):
    for head, rel, dep in triples:
        if head[0] == n and rel == 'compound':
            relationships[i] = [dep[0] + ' ' + n, v]

# Creating question_graph as [entity, predicate] pairs;
# each requested output becomes a 'type' constraint
question_graph = relationships + [[out, 'type'] for out in requested_output]
print(question_graph)
```

Output:
```python
question_graph = [['Amitabh Bachchan', 'starred'], ['Ramesh Sippy', 'directed'], ['film', 'type']]
```

Visual Representation:
```graphviz
digraph{
  "Amitabh Bachchan" -> "x" [label="starred"]
  "Ramesh Sippy" -> "x" [label="directed"]
  "x" -> "film" [label=" type"]
}
```

### Reasoning Algorithm : Traversing through the graph :runner:

Now that we have a graph for our question, all that is left is to traverse the graph and get the output. Before we proceed, we need to choose one of the relationships as the base query; the others are treated as constraints.

```graphviz
digraph{
  subgraph cluster_0 {
    node [style=filled];
    "Amitabh Bachchan" "x";
    label = "Base Query";
    color=blue;
  }
  subgraph cluster_1 {
    node [style=filled];
    "Ramesh Sippy" "x";
    label = "Constraint 1";
    color=blue;
  }
  subgraph cluster_2 {
    node [style=filled];
    "film" "x";
    label = "Constraint 2";
    color=blue;
  }
  "Amitabh Bachchan" -> "x" [label="starred"]
  "Ramesh Sippy" -> "x" [label="directed"]
  "x" -> "film" [label=" type"]
}
```

We then retrieve all the entities that satisfy the base query and also every constraint.

:::info
The graph used to solve this is available in the Steps section, under "Graph used for approach (In memory graph)".
:::

> Example: Which film starred by Amitabh Bachchan and directed by Ramesh Sippy?
> [color=#0000ff]

Code:
```python
# `graph` is the question graph from Step 2: mutable [entity, predicate] pairs.
# `graph_tuples` is the in-memory knowledge graph as [subject, predicate, object] lists.

# Dealing with synonyms of predicates
director_synonym = ['director', 'directed']
starred_synonym = ['starred', 'actor']
print(graph)
for t in graph:
    if t[1] in director_synonym:
        t[1] = 'directed'
    elif t[1] in starred_synonym:
        t[1] = 'starred'

# Dealing with synonyms of output type
for t in graph:
    if t[1] == 'type':
        if t[0] in director_synonym:
            t[0] = 'director'
        elif t[0] in starred_synonym:
            t[0] = 'actor'

# Choosing the first entry as base query (after synonyms are normalised)
starting_entity, starting_predicate = graph[0]

# Executing base query: the starting entity may sit on either end of an edge
base_result = []
for s, p, o in graph_tuples:
    if starting_entity == s and starting_predicate == p:
        base_result.append(o)
    elif starting_predicate == p and starting_entity == o:
        base_result.append(s)

# Executing constraints: keep only candidates linked to each constraint entity.
# A new list is built each time instead of removing items while iterating.
for e, l in graph:
    if e == starting_entity and l == starting_predicate:
        continue
    base_result = [o for o in base_result
                   if [e, l, o] in graph_tuples or [o, l, e] in graph_tuples]

print(base_result)
```

Output:
```python
base_result = ['Sholay']
```

#### Limitations of this approach:

In the current version of our algorithm, we face the following limitations:

1. It cannot identify superlatives (this needs to be learned from a WordNet superlative corpus).
2. It cannot yet answer the following types of questions:
    - 2.1 Questions that have a nominal modifier for the desired output. Example: What was the name of the first film directed by Rajkumar Hirani starring Aamir Khan?
    - 2.2 Questions with implicit time constraints. Example: Who was U.S. president after the Civil War started?

## References

- https://aclanthology.org/C16-1236.pdf
- https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ACL15-STAGG.pdf