
Brain Knowledge Repository 2.0

Like a Knowledge Graph. Unlike any Knowledge Graph.

  • Reasoning-first knowledge graph
  • Knowledge = enumerate all the options, e.g. the rules of the game; knowledge constrains my forward path
  • Learning = how do I move with the knowledge, i.e. which option is better: the goodness of a state, given the action, state & goal. Where do I go?
  • Reasoning = using Learning + Knowledge along a recursive path
  • Factoid questions are not answerable via KG 1.0 APIs
  • You shall know an Entity by the company it keeps in the Knowledge Graph!

Objective : The Reasoning Brain

E2E application of the Brain Reasoner (a minimal pipeline sketch follows the steps below).

  1. English Question -> Brain Natural Language Engine (BNLE) -> KnowledgeQuestion(state = ReasoningState)
  2. KnowledgeQuestion -> Brain Knowledge Repository 2.0 -> KnowledgeAnswer
  3. KnowledgeAnswer -> Brain NLP Synthesis Engine (BNSE) -> English Answer
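
A minimal sketch of how the three stages could be wired together, assuming the engines are exposed as plain callables. The function name and the dict-shaped payloads below are placeholders, not an agreed API.

    from typing import Callable

    # Hypothetical end-to-end wiring of the three stages; the names and shapes
    # of the intermediate objects are assumptions for illustration only.
    def answer_english_question(
        question: str,
        bnle: Callable[[str], dict],   # English question  -> KnowledgeQuestion
        bkr: Callable[[dict], dict],   # KnowledgeQuestion -> KnowledgeAnswer
        bnse: Callable[[dict], str],   # KnowledgeAnswer   -> English answer
    ) -> str:
        knowledge_question = bnle(question)
        knowledge_answer = bkr(knowledge_question)
        return bnse(knowledge_answer)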

Example

  1. English Question -> Brain Natural Language Engine (BNLE) -> KnowledgeQuestion(state = ReasoningState)

    1. Finding the BrainToken
    2. Converting the English question to the BrainToken language format

    What is the population of India?
    What is BrainToken("/common/attribute/country/population") of BrainToken("common/entity/country/1")

    common/entity/country/1 represents the India node in the KG

    1. Remove stop words

    [What BrainToken("/common/attribute/country/population")] BrainToken("common/entity/country/1")

    1. Create Knowledge Question / Reasoning State
        STATE {
            GIVEN {
                 BrainToken("common/entity/country/1")  // type is available by default, as the BrainToken contains all the details
            },
            ASK {
                BrainToken("/common/attribute/country/population")
            }
        }

    KnowledgeQuestion {
        question_is_one_of one_of {
            KnowledgeReasoningQuestion {
                state = STATE
            }
        }
    }
    
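A rough sketch of how the STATE / KnowledgeQuestion above could be represented as data. The class and field names simply mirror the pseudo-structure and are assumptions; the real BrainToken carries more detail than a path and a value.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class BrainToken:
        path: str                    # e.g. "/common/attribute/country/population"
        value: Optional[str] = None  # filled in once the repository resolves it

    @dataclass
    class ReasoningState:
        given: List[BrainToken] = field(default_factory=list)
        ask: List[BrainToken] = field(default_factory=list)

    @dataclass
    class KnowledgeQuestion:
        state: ReasoningState

    # "What is the population of India?" as a KnowledgeQuestion:
    question = KnowledgeQuestion(
        state=ReasoningState(
            given=[BrainToken("common/entity/country/1")],                 # India node
            ask=[BrainToken("/common/attribute/country/population")],
        )
    )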
  2. KnowledgeQuestion -> Brain Knowledge Repository 2.0 -> KnowledgeAnswer

    1. Action: given a state, what are all the things we can do?
    Generating the Path/Action/Query

        STATE {
            GIVEN {
                 BrainToken("common/entity/country/1")  // type is available by default, as the BrainToken contains all the details
            },
            ASK {
                BrainToken("/common/attribute/country/population")
            }
        }
    

    The State.GIVEN token is an EntityInstance and there is only one instance, hence we can reach the country collection's doc/node '1' in the KG. As we are at this node and no other GIVEN token is present, the only option is to read the attributes of node 1. This new information is passed to Update State.

    One possible optimization: any token with the same entityType can be utilized as a filter; e.g., in this case ASK.BrainToken is of the same entityType and is therefore applicable as a filter. (A sketch of the action enumeration follows the 1.x cases below.)

    Action = Query(EntityID(1), Attribute(/common/attribute/country/population))

    1.1 State.GIVEN.BrainToken == EntityInstance

     1.1.1 You can read all the types of attributes (schema), e.g. country as an entity type will have population, gdp, area, etc.
     1.1.2 You can read all the attribute values, e.g. area = BrainQuantity(1234), pop = BrainQuantity(345)
     1.1.3 You can read all predicate types, e.g. country can have [has_president, belongs_to]
     1.1.4 You can move along a path using a predicate, e.g. country has_president reaches a person node
     1.1.5 You can read all qualifiers
    

    1.2 State.GIVEN.BrainToken == EntityType

     1.2.1 You can read all the types of attributes (schema), e.g. country as an entity type will have population, gdp, area, etc.
     1.2.2 You can fetch all subclasses
     1.2.3 You can list all instances of the entity type, e.g. all country instances
     1.2.4 You can read all predicate types, e.g. country can have [has_president, belongs_to]
     1.2.5 You can read all qualifier schema
    

    1.3 State.GIVEN.BrainToken == PredicateType
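
A sketch of the action enumeration for the 1.1 / 1.2 / 1.3 cases above. The token classification by path shape, the action strings, and the filter heuristic are assumptions for illustration only; the option lists simply restate the cases enumerated above.

    from typing import List

    def token_kind(token_path: str) -> str:
        # Assumed convention: ".../entity/<type>" is an EntityType and
        # ".../entity/<type>/<id>" is an EntityInstance.
        if "/entity/" in token_path:
            return "EntityInstance" if token_path.rstrip("/").split("/")[-1].isdigit() else "EntityType"
        if "/predicate/" in token_path:
            return "PredicateType"
        return "AttributeType"

    def generate_actions(given_tokens: List[str], ask_tokens: List[str]) -> List[str]:
        actions = []
        for token in given_tokens:
            kind = token_kind(token)
            if kind == "EntityInstance":
                # 1.1: attribute schema, attribute values, predicate types,
                # follow a predicate, read qualifiers
                actions += [f"read_attribute_schema({token})",
                            f"read_attribute_values({token})",
                            f"read_predicate_types({token})",
                            f"follow_predicate({token})",
                            f"read_qualifiers({token})"]
                # optimization from above: an ASK token of the same entityType becomes a filter
                for ask in ask_tokens:
                    if token.split("/")[-2] in ask:
                        actions.append(f"query({token}, attribute={ask})")
            elif kind == "EntityType":
                # 1.2: schema, subclasses, instances, predicate types, qualifier schema
                actions += [f"read_attribute_schema({token})",
                            f"read_subclasses({token})",
                            f"list_instances({token})",
                            f"read_predicate_types({token})",
                            f"read_qualifier_schema({token})"]
            # 1.3 (PredicateType): options not enumerated yet in this document
        return actions

    print(generate_actions(["common/entity/country/1"],
                           ["/common/attribute/country/population"]))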

    2. Update State: generate a reward state

    Input:

        STATE {
            GIVEN {
                 BrainToken("common/entity/country/1")
            },
            ASK {
                BrainToken("/common/attribute/country/population")
            }
        }
    

    Action = Query(EntityID(1), Attribute(/common/attribute/country/population))
    Query Response:
    /common/attribute/country/
    /common/attribute/country/1
    /common/attribute/country/population
    BrainQuantity("1234")

    Reward State = {State + Action}

        STATE {
            GIVEN {
                 BrainToken("common/entity/country/1")
                 BrainToken("/common/attribute/country/
                                /common/attribute/country/1
                                    /common/attribute/country/population
                                    BrainQuantity("1234")")
            },
            ASK {
                // Same Token Got Enriched with Value
                BrainToken("/common/attribute/country/
                                /common/attribute/country/1
                                    /common/attribute/country/population
                                    BrainQuantity("1234")")
            }
        }
    
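A sketch of the Update State step: the query response enriches the matching ASK token and is also added to GIVEN, producing the reward state. The dict-shaped state used here is an assumption for illustration.

    # Reward State = {State + Action result}. Each token is modelled as a dict
    # with a "path" and an optional "value"; this shape is assumed, not fixed.
    def update_state(state: dict, action_result: dict) -> dict:
        new_state = {"given": list(state["given"]), "ask": []}
        # the response itself becomes new GIVEN knowledge
        new_state["given"].append(dict(action_result))
        # the same ASK token gets enriched with the returned value
        for ask in state["ask"]:
            if ask["path"] == action_result["path"]:
                ask = {**ask, "value": action_result["value"]}
            new_state["ask"].append(ask)
        return new_state

    state = {
        "given": [{"path": "common/entity/country/1"}],
        "ask": [{"path": "/common/attribute/country/population"}],
    }
    reward_state = update_state(
        state,
        {"path": "/common/attribute/country/population", "value": 'BrainQuantity("1234")'},
    )
    print(reward_state)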

    3. Stop Mechanism: is the given State a GOAL State?

        GOAL STATE = (ASK is FILLED & NO_POSSIBLE_ACTION)
                            OR
                        NO_POSSIBLE_ACTION
    

    The state below becomes the Goal State, as there is no action left to take and the ASK is fulfilled.

        STATE {
            GIVEN {
                 BrainToken("common/entity/country/1")
                 BrainToken("/common/attribute/country/
                                /common/attribute/country/1
                                    /common/attribute/country/population
                                    BrainQuantity("1234")")
            },
            ASK {
                // Same Token Got Enriched with Value
                BrainToken("/common/attribute/country/
                                /common/attribute/country/1
                                    /common/attribute/country/population
                                    BrainQuantity("1234")")
            }
        }
    
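A sketch of the stop mechanism, using the same dict-shaped state as in the Update State sketch; the goal test simply encodes the condition stated above.

    def is_goal_state(state: dict, possible_actions: list) -> bool:
        # GOAL STATE = (ASK is FILLED & NO_POSSIBLE_ACTION) OR NO_POSSIBLE_ACTION
        ask_filled = all("value" in token for token in state["ask"])
        no_possible_action = len(possible_actions) == 0
        return (ask_filled and no_possible_action) or no_possible_action

    goal_state = {
        "given": [{"path": "common/entity/country/1"},
                  {"path": "/common/attribute/country/population", "value": 'BrainQuantity("1234")'}],
        "ask": [{"path": "/common/attribute/country/population", "value": 'BrainQuantity("1234")'}],
    }
    print(is_goal_state(goal_state, possible_actions=[]))  # True: ASK is filled, nothing left to do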
    1. Generate Knowledge Answer
      For now we will just pass the goal state as the KnowledgeAnswer, but this section will evolve with any logic regarding this.

    KnowledgeAnswer(state= GoalState)

  3. KnowledgeAnswer -> Brain NLP Synthesis Engine (BNSE) -> English Answer

    3.1 Template classification approach

    Question : What is the BrainToken(X) of BrainToken(Y)?
    Answer Template: BrainToken(X) of BrainToken(Y) is BrainToken(X.Value)

    3.2 Template value replacement

        STATE {
            GIVEN {
                 BrainToken("common/entity/country/1")
                 BrainToken("/common/attribute/country/
                                /common/attribute/country/1
                                    /common/attribute/country/population
                                    BrainQuantity("1234")")
            },
            ASK {
                // Same Token Got Enriched with Value
                BrainToken("/common/attribute/country/
                                /common/attribute/country/1
                                    /common/attribute/country/population
                                    BrainQuantity("1234")")
            }
        }
    

    BrainToken("/common/attribute/country/population") of BrainToken("common/entity/country/1") is BrainToken(BQ("/common/attribute/country/population/1234"))

    3.3 Converting to natural English (DL + BERT)

    Population of India is 138 Crores
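
A sketch of steps 3.1–3.2 (template classification and value replacement). The answer-template string and the display-name lookup are placeholders, and the final DL + BERT rewriting of step 3.3 is not covered here.

    # Template for the factoid pattern "What is the X of Y?" -> "X of Y is X.Value".
    ANSWER_TEMPLATE = "{attribute} of {entity} is {value}"

    # Placeholder lookup from BrainToken paths to surface names; a real system
    # would resolve these from the KG / a naming service.
    DISPLAY_NAME = {
        "common/entity/country/1": "India",
        "/common/attribute/country/population": "Population",
    }

    def synthesize(ask_token: dict, given_entity: dict) -> str:
        return ANSWER_TEMPLATE.format(
            attribute=DISPLAY_NAME[ask_token["path"]],
            entity=DISPLAY_NAME[given_entity["path"]],
            value=ask_token["value"],
        )

    print(synthesize(
        ask_token={"path": "/common/attribute/country/population", "value": "138 crores"},
        given_entity={"path": "common/entity/country/1"},
    ))
    # -> "Population of India is 138 crores"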

Scope

In this document's scope, we will focus on factoid questions first.

Factoid Question

  • What is BrainToken(Attribute) of BrainToken(EntityInstance)?

    1. What is the length of Ganga?
    2. Height of Virat Kohli?
    3. No. of centuries of MS Dhoni?

Can we answer the above with State Thinking? Yes.

  1. Height of Virat Kohli?
  • Step:1

    {
        "state": {
            "triplets": [
                {"head": "/common/entity/person/1", "relation": "", "tail": "", "context": "Height of Virat Kohli"}
            ],
            "given": {
                "Entity": [
                    {"token": "/common/entity/person/1", "context": "Virat Kohli"}
                ]
            },
            "ask": {
                "Attributes": [
                    {
                        "token": "/common/attribute/person/height",
                        "context": "Height"
                    }
                ]  // 1. visited - start from here - find any objects from given state
            }
        }
    }
    
    
    • Start from ASK, mark the "/common/attribute/person/height" VISITED=TRUE
    • get_next_context = CurrentContext("Height") + LinkedContext("Height", STATE)
    • // LinkedContext is coming from NLP layer
    • get_next_context ={"Virat Kohli", "Height of Virat Kohli"}
    • // State would not change because of the next context information.
    • query.add(getAttributeQuery(/common/attribute/person/height, /common/entity/person/1))
    • execute(query)

Number of centuries of MS Dhoni
{
    "state": {
        "triplets": [
            {"head": "/common/entity/person/2", "relation": "", "tail": "", "context": "Number of Centuries of MS Dhoni"},
        ],
        "given": {
            "Entity": [
                {"token": "/common/entity/person/2", "context": "MS Dhoni"} // MS Dhoni
            ]
        },
        "ask": {
            "Attributes": [
                {
                    "token": "/common/attribute/person/centuries",
                    "context": "Number of Centuries",
                }
            ]  // 1. visited - start from here - find any objects from given state
        }
    }
}

What is the length of Ganga
{
    "state": {
        "triplets": [
            {"head": "/common/entity/river/2", "relation": "", "tail": "", "context": "length of Ganga"},
        ],
        "given": {
            "Entity": [
                {"token": "/common/entity/river/2", "context": "Ganga"} // Ganga
            ]
        },
        "ask": {
            "Attributes": [
                {
                    "token": "/common/attribute/river/length",
                    "context": "length",
                }
            ]  // 1. visited - start from here - find any objects from given state
        }
    }
}
# We want to smartly explore only paths which are relevant to the current context.
# Context here is the keyword through which we have landed at the current object.
# We use the context and the question to do link prediction for finding the next object to explore.
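
A sketch of this context-driven exploration: candidate next objects are scored against the question, and only the best-scoring paths are explored. The word-overlap score below is a trivial stand-in for a real link-prediction / similarity model.

    from typing import List

    def score(candidate_context: str, question: str) -> float:
        # placeholder similarity: word overlap between candidate context and question
        cand = set(candidate_context.lower().split())
        ques = set(question.lower().split())
        return len(cand & ques) / max(len(cand), 1)

    def next_objects(candidates: List[dict], question: str, top_k: int = 2) -> List[dict]:
        return sorted(candidates, key=lambda c: score(c["context"], question), reverse=True)[:top_k]

    candidates = [
        {"token": "/common/attribute/person/height", "context": "height"},
        {"token": "/common/attribute/person/weight", "context": "weight"},
        {"token": "/common/predicate/person/plays_for", "context": "plays for team"},
    ]
    print(next_objects(candidates, "What is the height and weight of Sachin Tendulkar ?"))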
what is Indian tea made of ?
What is the dob of India’s Prime Minister?
who is the third male prime minister of India ?
who is the first female prime minister of India ?
Who are the actors of Don movie ?
who is the prime minister of india ?
What is the height and weight of Sachin Tendulkar ?
What is the length of yamuna and ganga?
Who is the prime minister of India?
Who is the youngest CEO in India?
Which is longest river in Brazil ?
Who are the players of Indian cricket team?
Who is the first female prime minister of India?
What is the population of India?
which physicist developed theory of relativity ?

Approach: Search problem that grows the graph through staged state-actions

# state graph
who is the first female prime minister of India ?

# state graph
who is the dob of prime minister of India ?

# state graph
which physicist developed theory of relativity ?

# state graph
What is the length of yamuna and ganga?

# state graph
Who is the youngest CEO in India?

Query Graph morphology

A query graph should have exactly one lambda variable to denote the answer, at least one grounded entity, and zero or more existential variables and aggregation functions.
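
A sketch of this morphology as a data structure. The field names are ours, but the constraints (exactly one lambda variable, at least one grounded entity, zero or more existential variables and aggregation functions) come straight from the description above.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class QueryGraph:
        lambda_variable: str                     # exactly one; denotes the answer, e.g. "x"
        grounded_entities: List[str]             # at least one, e.g. ["United States"]
        existential_variables: List[str] = field(default_factory=list)   # e.g. ["y0"]
        aggregation_functions: List[str] = field(default_factory=list)   # e.g. ["argmin"]
        edges: List[Tuple[str, str, str]] = field(default_factory=list)  # (head, predicate, tail)

        def is_well_formed(self) -> bool:
            return bool(self.lambda_variable) and len(self.grounded_entities) >= 1

    # Rough rendering of "Who was U.S. president after the Civil War started":
    g = QueryGraph(
        lambda_variable="x",
        grounded_entities=["United States", "Civil War"],
        existential_variables=["y0"],
        aggregation_functions=["argmin(start_date)"],
        edges=[("United States", "officials", "y0"), ("y0", "holder", "x")],
    )
    assert g.is_well_formed()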

Query graph parsing

Query graph generation is formulated as a search problem with staged states and actions. Each state is a candidate parse in the query graph representation, and each action defines a way to grow the graph.
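
A sketch of the staged search loop: starting from an empty parse, each stage's actions (link a topic entity, extend the core chain, attach constraints) produce new candidate states, and only the best-scoring candidates are kept. The action and scoring functions are injected placeholders, not the paper's models.

    from typing import Callable, List

    def staged_search(
        question: str,
        actions: List[Callable[[str, dict], List[dict]]],  # each stage returns new candidate states
        score: Callable[[str, dict], float],
        beam_size: int = 5,
    ) -> List[dict]:
        beam = [{}]  # start from the empty parse
        for act in actions:
            candidates = []
            for state in beam:
                candidates.extend(act(question, state))
            candidates = candidates or beam  # keep the old beam if a stage yields nothing
            beam = sorted(candidates, key=lambda s: score(question, s), reverse=True)[:beam_size]
        return beam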

Who was U.S. president after the Civil War started

Topic entity / Entity Linking

  1. Topic Entity: the score returned by the entity linking system is directly used as a feature.
  2. We use the entity linking approach proposed by (Yang and Chang, 2015) to detect entities mentioned by the given question.

Core Inferential Chain

  1. For each detected entity s, we treat it as a subject constant vertex. Based on the KB, for each unique KB 'path' from s, where a KB 'path' means a one-hop predicate p0 or two-hop predicates p1-p2, we construct a basic query graph ⟨s, p0, x⟩ or ⟨s, p1-ycvt-p2, x⟩. ycvt and x are variable vertices, and x denotes the answer. For example, the basic query graph B in Figure 2 of the paper can be represented as ⟨United States, officials-y0-holder, x⟩.

  2. We use similarity scores of different CNN models, described in Sec. 3.2.1 of the paper, to measure the quality of the core inferential chain (see the sketch after this list):

    1. PatChain compares the pattern (replacing the topic entity with an entity symbol) and the predicate sequence.
    2. QuesEP concatenates the canonical name of the topic entity and the predicate sequence, and compares it with the question.

    These two CNN models are learned using pairs of the question and the inferential chain of the parse in the training data.
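
A sketch of how the two comparison inputs described above could be constructed; the CNN scorers themselves are not shown, only the (pattern, predicate sequence) and (question, entity + predicates) pairs they compare. The `<e>` entity symbol and the helper names are our assumptions.

    # PatChain: pattern (topic entity replaced by an entity symbol) vs. predicate sequence.
    # QuesEP:   question vs. canonical entity name concatenated with the predicate sequence.
    def patchain_pair(question: str, topic_entity_mention: str, predicates: list) -> tuple:
        pattern = question.replace(topic_entity_mention, "<e>")
        return pattern, " ".join(predicates)

    def quesep_pair(question: str, canonical_entity_name: str, predicates: list) -> tuple:
        return question, canonical_entity_name + " " + " ".join(predicates)

    q = "Who was U.S. president after the Civil War started"
    print(patchain_pair(q, "U.S.", ["officials", "holder"]))
    # ('Who was <e> president after the Civil War started', 'officials holder')
    print(quesep_pair(q, "United States", ["officials", "holder"]))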

Constraint detection and binding

  1. When a constraint node is present in the graph, we use some simple features to check whether there are words in the question that can be associated with the constraint entity or property. Examples of such features include whether a mention in the question can be linked to this entity, and the percentage of the words in the name of the constraint entity that appear in the question (see the sketch after this list).
  2. To measure the similarity between the path p (e.g., director) and the 'context pattern' (e.g., "directed by e1") of constraint Ci, we adopt a convolutional neural network (CNN) model.
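
A sketch of the simple constraint features from point 1; the mention-linking check is stubbed out with a precomputed set, and the name-overlap percentage follows the description above.

    def constraint_features(question: str, constraint_entity_name: str, linked_mentions: set) -> dict:
        # (a) can a mention in the question be linked to the constraint entity?
        # (b) what fraction of the words in the constraint entity's name appear in the question?
        words = constraint_entity_name.lower().split()
        in_question = [w for w in words if w in question.lower().split()]
        return {
            "mention_linkable": constraint_entity_name.lower() in linked_mentions,
            "name_word_overlap": len(in_question) / max(len(words), 1),
        }

    print(constraint_features(
        "Who was U.S. president after the Civil War started",
        "Civil War",
        linked_mentions={"u.s.", "civil war"},
    ))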

References

  1. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ACL15-STAGG.pdf
  2. https://aclanthology.org/C16-1236.pdf

Who was U.S. president after the Civil War started?

BrainTokenization

Rough Work

Steps:

  1. Tokenize

  2. remove stop words

  3. Identify BrainTokens
    entityType
    predicateType
    AttributeType
    entityInstance
    predicateInstance
    AttributeInstance

  4. Temporal Identification
    standard lib

    https://stanfordnlp.github.io/CoreNLP/ner.html
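
A rough sketch of steps 1–4 above. The stop-word list, the BrainToken lexicon, and the year detection are all stand-ins; a real pipeline would use an annotation/brain-name service for step 3 and a standard temporal tagger such as Stanford CoreNLP's NER for step 4.

    import re

    STOP_WORDS = {"what", "is", "the", "of", "a", "an", "who", "which"}  # illustrative only

    # placeholder lexicon mapping surface forms to BrainTokens
    LEXICON = {
        "india": "common/entity/country/1",
        "population": "/common/attribute/country/population",
    }

    def brain_tokenize(question: str) -> dict:
        tokens = re.findall(r"[a-z0-9.]+", question.lower())            # 1. tokenize
        content = [t for t in tokens if t not in STOP_WORDS]            # 2. remove stop words
        brain_tokens = [LEXICON[t] for t in content if t in LEXICON]    # 3. identify BrainTokens
        temporals = [t for t in tokens if t.isdigit() and len(t) == 4]  # 4. crude year detection
        return {"brain_tokens": brain_tokens, "temporals": temporals}

    print(brain_tokenize("What is the population of India?"))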

Bag = []

What/When/Which/Where/Who & How [W5H1] - film = Y^ = Looking for instance of Y^

  1. Basic Dependency Graph Generation

    /common/actor/1 ==> Amitabh Bachhan
    (star_in)

    (context : starred by Amitabh bachhan) - ngram - AnnotationService

    [] > star_in > (Y1) -> directed_by > X1
    > produced_by > X2
    > genre > X3
    Y2

    ​​ 				Y3
    

    [/common/director/1] > instanceOf (Y1^, X1^, X2^, X3^) > X1

    [/common/director/1] > valueCheck (X1[1n]) > X1[k]

    [/common/entity/film] > typeCheck [Y1^, X1^, ] > Y1^

  2. Add Constraints
    Temporal
    1. Implicit

     // TODO :

     2. Explicit

     [https://stanfordnlp.github.io/CoreNLP/ner.html]

     after year 2000 [Before/After/Since/From]

     film -- year attributeType

     AttributeType[/common/attribute/film/year]
     BrainFilter

     /common/attribute/film/year > 2000

     screened after 2000

     screened >> film.year_of_release

     result : /common/attribute/film/year > 2000

     brainNameService : screen

     3. Ordinal

     richest, poorest, fastest, maximum, first

     maximum, richest - maxAtN
     poorest, cheapest - minAtN

     max[/common/attribute/film/revenue]
     min[/common/attribute/film/revenue]

    [A1] > star_in [Y1, Y2] > directed_by [X1]

    Answer : Y1, Y2

Dependency Graph >> Query >> ArangoDB

for f in films
f.outbound actor
actor.id = actor/1
&& f.outbound director
director.id = director/1
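
A hedged guess at how the rough traversal above could look as real AQL, issued through the python-arango driver. The database name, the vertex collection (films), the edge collections (star_in, directed_by), and the vertex IDs are assumptions taken from the sketch, not an existing schema.

    from arango import ArangoClient  # python-arango driver

    # Films with an outbound star_in edge to actor/1 and an outbound
    # directed_by edge to director/1 (collection names assumed).
    AQL = """
    FOR f IN films
      FOR a IN OUTBOUND f star_in
        FILTER a._id == 'actor/1'
        FOR d IN OUTBOUND f directed_by
          FILTER d._id == 'director/1'
          RETURN f
    """

    def run_query():
        client = ArangoClient(hosts="http://localhost:8529")   # connection details assumed
        db = client.db("brain", username="root", password="")  # database name assumed
        return list(db.aql.execute(AQL))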

BrainExpression where
BrainToken 'op' LITERAL
/common/attribute/film/year > 2000

max[BrainToken]
max[/common/attribute/film/revenue]

DSL

Looking for an instance of the Film entity which is connected to a Director and an Actor via relationships, and specifically to director/1 and actor/1.

message State {
repeated entity_type
repeated entity
repeated predicate_type
}

message BaseGraph {
subject // actor/1 [Actor], context

string relation // star_in, context
repeated string possible_two_hop_options/path/query, [director. .....]

map<type, string> context

}

message Constraint {
// /common/attribute/film/year > 2000
repeated BrainExpression = 1;
//
}

message AnswerState {
// KNOWN vs UNKNOWN
repeated brainToken // instance of film, film,
??
}

message DependencyGraph {
AnswerState
BaseGraph base_graph;
State
Constraint
}

for f in films
f.outbound actor
actor.id = actor/1
&& f.outbound director
director.id = director/1

FILTER

	f.year > 2000


for ....
	...