Working space for pql

# Working space for pql ## Proposed structs ### STMT * STMT_REF * STMT_TYPE ### STMT_REF * (This is simply the statement number) ### STMT_TYPE * statements could be of the type: (‘stmt’ | ‘read’ | ‘print’ | ‘while’ | ‘call’ | ‘if’ | ‘assign’) ### VAR * STMT_REF * VAR_NAME ### VAR_NAME * (string) ### SYNONYM * NAME (string) * TYPE (design-entity (enum)) ### SUCH_THAT * relRef (enum) ### GROUP * CLAUSE[] * isBooleanType ### PATTERN ## Example SIMPLE source code procedure main { while true { //1 hello = 1 //2 bye = 2 //3 print hello //4 print bye //5 total = hello + bye //6 print total //7 new_total = hello - bye //8 } print new_total //9 final_total = new_total + 3 //10 } ## Example PQL print p1, p2; assign a1, a2 Select a1 such that Parent(___, a1) such that Follows(p1, a1) such that Uses(a1, "V") such that Follows(p2, p1) such that Follows(p2, _) pattern a2 (total, "hello + bye") such that Uses(a2, _) such that Uses(3, 4) DFS: stack: a1 algo: find all clauses containing target synonym, for each clause found: build the first node. - add clauses to global clause[] if unselected - find and populate node's neighbours[], for each neighbour: build_node() while has_remaining_clause: pick the first clause, make that the new target * If pattern a (blah, blah) target is a * If such that Follows(s, s2) then you pick either to be the target * If such that Follows(_, _) or (3, 5) then, this is an isolated grp, return. For each clause that contains the curr target: such that Parent(___, a1) such that Follows(p1, a1) such that Follows(p2, p1) such that Follows(p2, _) Clauses ordered within group (such that, pattern) GROUP[] = [[[Parent, ___, a1], [Follows, p1, a1], [Follows, p2, p1], [Follows, p2, _]], [[Pattern, a2, total, "hello + bye"], [Uses, a2, _]]] Tables a1 p1 p2 3 4 6 7 9 Tables p1 p2 1 5 2 5 7 10 11 12 new idea: Tables a1 p1 p2 3 4 5 3 5 BOOLEAN group: also tables I need a code to run through that column and delete the row if the r/s dont hold -> SYNONYM[] = [[s, stmt], [p, print]] TARGET = [s, stmt] GROUPS[] = [&target_node, &another_root_node] hasTargetSynonym For iter 1, the pattern is now an assignEntity object For iter 3, check the boolean for each synonym first before cross product ## Test case generation (basic) Note: * Values with underline are invalid inputs. * Heuristic idea: 1. Each valid input must appear at least once in a positive test case. 2. No more than one invalid input in a test case ### Follows(param 1, param 2) & Follows*(param 1, param 2) param 1 & param 2 -> synonym | _ | INTEGER #### Equivalence Partition for param 1 & param 2 * synonym: [s1], [a1], [p1], [r1], [w1], [if1], [c1], [proc1] * \_: [_], * INTEGER: [-infinity ... -1], [0], [1 ... MAX_INT] | param 1 | param 2 | | ----------- | ----------- | | s1 | a1 | | a1 | a1 | | r1 | p1 | | w1 | if1 | | c1 | p1 | | _ | s1 | | proc1| s1 | | - 1 | s1 | | 0 | s1 | | a1 | _ | | s1 | proc1| | a1 | -1 | | a1 | 0 | | a1 | 1 | ### ModifiesP(param 1, param 2) param 1 & param 2 -> synonym | ‘\_’ | ‘"’ IDENT ‘"’ #### Equivalence Partition for param 1 * synonym: [proc1] * \_: [_], * ‘"’ IDENT ‘"’: ["9"], [""], ["valid"], ["invalid_string"] #### Equivalence Partition for param 2 * synonym: [v1] * \_: [_], * ‘"’ IDENT ‘"’: ["9"], [""], ["valid"], ["invalid_string"] | param 1 | param 2 | | ----------- | ----------- | | p1 | v1 | | p1 | _ | | p1 | "valid" | | _ | v1 | | _ | _ | | _ | "valid" | | validString | v1 | | validString | _ | | validString | "valid" | |"invalid_string"| v1 | |"invalid_string"| _ | |"invalid_string"| "valid" | | p1 |"invalid_string"| ### ModifiesS(param 1, param 2) param 1 -> synonym | ‘\_’ | INTEGER param 2 -> synonym | ‘\_’ | ‘"’ IDENT ‘"’ #### Equivalence Partition for param 1 * synonym: [s1], [a1], [p1], [r1], [w1], [if1], [c1], [proc1] * \_: [_], * INTEGER: [-infinity ... -1], [0], [1 ... MAX_INT] #### Equivalence Partition for param 2 * synonym: [v1] * \_: [_], * ‘"’ IDENT ‘"’: ["9"], [""], ["valid"], ["invalid_string"] | param 1 | param 2 | | ----------- | ----------- | | s1 | v1 | Select s1 such that Uses() | a1 | _ | | p1 | "valid" | | r1 | s1 | | w1 | s1 | | if1 | s1 | | c1 | s1 | | proc1| s1 | | - 1 | s1 | | 0 | s1 | | s1 | proc1| | s1 | "9" | | s1 | "" | | s1 | "invalid_string" | ### AffectsP(param 1, param 2) and AffectP*() param 1 & param 2 -> synonym | ‘\_’ | ‘"’ IDENT ‘"’ #### Equivalence Partition for param 1 * synonym: [proc1] * \_: [_], * ‘"’ IDENT ‘"’: ["9"], [""], ["valid"], ["invalid_string"] #### Equivalence Partition for param 2 * synonym: [v1] * \_: [_], * ‘"’ IDENT ‘"’: ["9"], [""], ["valid"], ["invalid_string"] | param 1 | param 2 | | ----------- | ----------- | | p1 | v1 | | p1 | _ | | p1 | "valid" | | _ | v1 | | _ | _ | | _ | "valid" | | validString | v1 | | validString | _ | | validString | "valid" | |"invalid_string"| v1 | |"invalid_string"| _ | |"invalid_string"| "valid" | | p1 |"invalid_string"| ### affectsS(param 1, param 2) & affectsS* param 1 -> synonym | ‘\_’ | INTEGER param 2 -> synonym | ‘\_’ | ‘"’ IDENT ‘"’ #### Equivalence Partition for param 1 * synonym: [s1], [a1], [p1], [r1], [w1], [if1], [c1], [proc1] * \_: [_], * INTEGER: [-infinity ... -1], [0], [1 ... MAX_INT] #### Equivalence Partition for param 2 * synonym: [v1] * \_: [_], * ‘"’ IDENT ‘"’: ["9"], [""], ["valid"], ["invalid_string"] | param 1 | param 2 | | ----------- | ----------- | | s1 | v1 | Select s1 such that Uses() | a1 | _ | | p1 | "valid" | | r1 | s1 | | w1 | s1 | | if1 | s1 | | c1 | s1 | | proc1| s1 | | - 1 | s1 | | 0 | s1 | | s1 | proc1| | s1 | "9" | | s1 | "" | | s1 | "invalid_string" | ### Pattern a (param1, param 2) param 1 -> synonym | ‘\_’ | ‘"’ IDENT ‘"’ param 2 -> exact match | partial match | ‘\_’ #### Equivalence partition for param 1 * synonym: [v1] * \_: [_] * ‘"’ IDENT ‘"’: ["9"], ["valid"], ["invalid_string"] #### Equivalence partition for param 2 * exact match: ["x"], ["9"], ["x + y * 3"] * partial match: [\_"x"\_], [\_"9"_], [\_"x + y * 3"\_] * \_: [_] | param 1 | param 2 | | ----------- | ----------- | | v1 | exact | | v1 | _ | | v1 | partial | | _ | exact | | _ | _ | | _ | partial | | validString | exact | | validString | \_ | | validString | partial | |invalidString| exact | |invalidString| \_ | |invalidString| partial | pattern a1 (v1, \_) pattern a1 (v1, partial) if there is assign syn in table, for each of it, check if the corresponding variable matches. if there is assign syn in table, for each of it, add the corresponding variable to the row. if assign not in table, for each variable in table-> There might be multiple assign stmt. if both not in table, query by assign and add to table. ## System test cases x - Combination of cases (one non-boolean, one boolean) stmt s1; assign a1; variable v1, v2; Select s1 such that Uses(s1, v1) pattern a1(v2, _"z"_) 1, 2, 3, 4, 5, 8, 9 5000 # Iter 2 ## documentation stuff QueryOptimizer Overview: QueryOptimizer executes extra logic on the initial query object, in order to group the clauses in strategic ways to increase performance during evaluation. API 1. static void GroupClauses(std::vector<Clause*>* clauses, std::list<Group*>* groups, Synonym* target); o Description: Groups clauses (queries) that should be evaluated together, based on existence of common synonyms. The newly formed Groups of clauses are added in-place to the Groups data structure of the query object. Note: For iteration 1, the grouping algorithm has a rudimentary implementation as a maximum of 1 such that and 1 pattern clause is expected. ## Optimisation considerations - Consider boolean first - Followed by the non-boolean - Consider repeated clauses that are identical ## Query Object QueryEvaluator(std::list[Synonym] syn_list, Synonym target, std::list[Group*] groups, PKB pkb); - Problems: - There can be multiple non-boolean groups () - Multiple query types such as boolean or query, for tuples, there can be N number of synonyms. - Need to know if the query contains the target synonyms, AND how to match each group to a synonym in the tuple - How to optimise then? Pass to another component? Or should the data structure be responsible for the ordering (optimisation). -> Across group: Boolean first, size of the grp. Non-boolean, ordering don't matter -> Within group: Prioritize clauses with one constant and one synonym Prioritize clauses with less number of results: Follows, Modifies, etc. Sort clauses such that at least one synonym has been computed in a previous clause Prioritize with-clauses – more restrictive than such that clauses Evaluating pattern-clauses – similar to any such that clause Push Affects/* clauses on the last positions in a group Such that Uses (s1, “v”) Follows (a1, 4) Follows(s1, a1) ### Design choice - soln: (Solves intergroup) Need a Clause class (abstract) which can: - SuchThat, Patter, With - Check for equivalence (Check if objects are equal) - Comparator to compare between 2 types of clauses based on the above intergroup - SHORTCOMING: Difficult to sort clauses based on synonyms already seen within the group - soln: Need another Group class which can: - Tell me if it contains the target synonym - Tell me the list of target synonyms e.g <a1, r1> - Tell me the size of the group ## optimisation we need to add shorter boolean groups first so we need a scheduler for intra-group but pq for inter-group heman: should we drop identical queries during extraction phase? - Pros: evaluator (maybe) does not need to cache, and we avoid the case where grouping logic needs to iterate over all the duplicate clauses. - Cons: extra overhead involved in removing duplicate clauses, or in maintaining a set data structure that implictly 'removes' duplicates. i suppose its just a matter of where. ### within grp ordering: Prioritize clauses with one constant and one synonym (AND 1 result e.g Parent, Follows, Modifies(3, v), with clause with stmt# and number) Prioritize clauses with one constant and one synonym (AND multiple results e.g Parent*, Uses) Prioritize clauses with less number of results: Follows, Modifies, etc. Sort clauses such that at least one synonym has been computed in a previous clause Prioritize with-clauses – more restrictive than such that clauses Evaluating pattern-clauses – similar to any such that clause Push Affects/* clauses on the last positions in a group ## Summary: Todo (query extractor) week8: - Support for exact match in pattern - Support multiple such that and pattern (no and/with) - Support grouping algorithm week9: - Support for BOOLEAN and tuple in select - Support for elem/attrRef + all other lexical tokens - Support Calls*/Next* (semantic validation) + refactoring semantic validation. - Support if and while pattern - Support and - Support with (if there is time) ### ramblings: Updates to Tokenizer, Parser, Validator - lexical tokens: prog_line, elem, line_ref, attr_name, attr_cond... (need quite a few new lexicons and recursive descent logic) - Support multiple such that & pattern: - support attribute parsing + validation. - support with - and (delimiter between same type clause) - Add support for exact match in pattern - add if and while pattern - Calls Calls* next next* (parsing + importantly, validation) - Support for BOOLEAN and tuple in select ### ramblings: Updates to grouping/optimization - Grouping algorithm - (iter 3): lowest priority since does not affect correctness: comparator for within-group and across-group ### Updates to query object enum Attribute { kStmtNumber, kProcName, kVarName, kValue }; struct QueryInfo { bool all_boolean_true; std::Vector<*Table> table_list; std::unordered_map<Synonym, Attribute> }; ### Grouping algorithm pseudocode (old algo. see revised algo in next section) - notes: the problem reduces to finding the 'connected' subcomponents of a 'graph' (ie nodes that are reachable/have common syn from each other are in a group) - we use a dfs based approach. Some auxillary ds are needed: ``` list<Clause> clauses = [.......] // already given Map<Synonym, list<int>> map_of_syn_to_clause_indices; // need this aux ds ^ this serves as an 'adjacency list' ie {syn1 -> clause[], syn2 -> clause[]...} list<Group> groups = [] Algo(): - num_groups = 0 - visited = set() - for index_of_cl in len(clauses): - if index_of_cl in visited: - continue - Dfs(index_of_cl, num_groups, visited) Dfs(index_of_cl, num_groups, visited): - stack = [] - stack.add(index_of_cl) - while stack is not empty: - c_index = stack.pop() // LIFO - if c_index in visited: - continue - visited.add(c_index) - AddClauseToGroup(c_index, num_groups) - Clause[] neighbaes = GetNeighbaes(c_index) - push all clauses in neighbaes to stack AddClauseToGroup(c_index, num_groups): - if len(groups) less than or equal to num_groups: - create a new group. - add clauses[c_index] into groups[num_groups] - Synonym[] syns = GetAllSynonymsOfClause(c_index) - update group's "has_tgt_syn" attribute if any tgt_syn is in syns. GetNeighbaes(c_index): - Synonym[] syns = GetAllSynonymsOfClause(c_index) - set_of_neighbaes = set() - for s in syns: - set_of_neighbaes.merge(map_of_syn_to_clause_indices[i]) - set_of_neighbaes.erase(c_index) - return set_of_neighbaes GetAllSynonymsOfClause(c_index): - just return all arguments that are syn, based on clause type. ----------------------------------------------------------------------- List of target syn = [a1, p1, a2] print p1, p2; assign a1, a2; read r1; variable v1 Select <a1, a2, r1, proc> such that Follows(p2, _) // 1 such that Parent(___, a1) // 2 such that Follows(p1, a1) // 3 such that Follows(p1, r1) // 4 such that Uses(a1, "V") // 5 such that Follows(p2, p1) // 6 Pattern a1("v", _) // 7 such that Modifies(a2, v1) //8 such that Uses(a2, v1) //9 visited: 2, 3, 5, 7, 4, 6, 1 syn_visited: a1, p1, p2, r1 stack: sub_group: [] group: [2, 5, 7, 3, 4,6 1] list_of_target_syn = [a1, r1] group: [] list_of_target_syn = [a2] // add something to the stack if it has not been visited // add to subgroup if clause not visisted // sub group is not sorted, sort in the subgroup then add to group, group is sorted // clear the subgroup (in for loop anyways) // Pop the stack, get the synonym if not in syn_visited // If syn is part of tuple, add to the current group's list_of_target_syn // algo repeats by creating new group for the next unseen target syn Guarantee: First clause, contains first target synonym. ``` ### Grouping algorithm pseudocode (new algo) - notes: the new algo is fairly similar to the old algo in that it groups clauses with common syns, but the issue with the old algo was that it did not ensure that within a group, we have no more than 1 unseen synonym (ie A -> B, when seeing B it could have 2 new synonyms vs A, resulting in state space explosion during table merge). - new algo can be thought of as a variant of old algo, that uses 'stepwise' dfs on each synonym, to create a subgroup. clauses within a subgroup have the common synonym, and we can apply sorting within a subgroup (but not across a subgroup). We then merge subgroups to form a group. - Due to requirements on evaluator side we also ensure the first group has the target synonym. - So after grouping for clauses with tgt syns & their neighbours, we have a logic for creating groups for unvisited clauses. There are some cases: - clause has no synonyms -> shortcircuit logic to place into its own group since it shares no common syn with any other clause - clause has 1 or 2 synonym -> choose one of the clause as the "tgt_syn" and call the same dfs method ``` groups = [] // already given clauses = [] // already given Map<Synonym, list<int>> map_of_syn_to_clause_indices; // need this aux ds ^ this serves as an 'adjacency list' ie {syn1 -> clause[], syn2 -> clause[]...} tgt_syn_list = [e.g. a1, a2, r1, proc] // already given Algo(): // create groups for clauses with tgt syns & their neighbours first - vector<int> visited_clauses = [] // can be set based on imple. - vector<Synonym> visited_tgt_syns = [] - visited_synonyms = set() - num_groups = 0 - for tgt_syn in tgt_syn_list: - if tgt_syn in visited_tgt_syns: - continue - Dfs_Syn(tgt_syn, visited_clauses, visited_tgt_syns, visited_synonyms, num_groups) // create groups for unvisited clauses - for index_of_cl in len(clauses): - if index_of_cl in visited_clauses: - continue - if clause has no syns -> add to its own group; and add to visited - else: call dfs with either arg as tgt syn. Dfs_Syn(tgt_syn, visited_clauses, visited_tgt_syns, visited_synonyms, num_groups): - create a new group with id num_groups; num_groups++ - queue (containing int) = [] - visited_tgt_syn[tgt_syn] = true - visited_synonyms.add(tgt_syn) - queue.add(tgt_syn) - while queue is not empty: - curr_tgt_syn = queue.dequeue() - vector integer[] contains_tgt_syn = map_of_syn_to_clause_indices[curr_tgt_syn] // subgroup - contains_tgt_syn = contains_tgt_syn.remove_visited_indices() - sort_with_comparator(contains_tgt_syn) // dummy/optimization - for clause_index in contains_tgt_syn: - add to visited_clauses - Synonym[] clause_syns = GetAllSynonymsOfClause(clause_index) - for clause_syn in clause_syns: - if clause_syn is alr in visited_synonyms: continue - if it is a tgt syn: visited_tgt_syn[clause_syn] = true - visited_synonyms.add(clause_syn) - queue.add(tgt_syn) - append subgroup into group we created // note: at appropriate sections, populate group metadata (e.g. has_tgt_syn attribute) - add the group we created into groups array. GetAllSynonymsOfClause(c_index): - just return all arguments that are syn, based on clause type. ``` ## New API for extractor-evaulator QueryEvaluator(PKB pkb) query_evaluator_object.EvaluatorQuery UnformattedResult EvaluateQuery(vector[Group] groups, vector[Synonym] target_synonyms) ## Todo (query evaluator) - need to add prog_line, procedure and const to DesignEntity (week 7) - consider making DesignEntity an object (week 7) - Simple extension for calls and calls* (week 8) - Advanced addition for Next and Next* (week 9) - Need to also consider the procedure for adding Affects and Affects* (week 9) - Control flow graph can be done with a graph implementation (wait for frontend should be week 9) - Pattern has while and if now (week 8) - if : syn-if ‘(’ entRef ‘,’ ‘_’ ‘,’ ‘_’ ‘)’ - while : syn-while ‘(’ entRef ‘,’ ‘_’ ‘)’ - Attributes and attribute value types: (week 8) - procedure.procName, call.procName, variable.varName, read.varName, print.varName : NAME - constant.value : INTEGER - stmt.stmt#, read.stmt#, print.stmt#, call.stmt#, while.stmt#, if.stmt#, assign.stmt#: INTEGER - Note that Select can be tuple or BOOLEAN. tuple => elem->single synonym OR synonym.attr || <elem, ...> (week 9) - Implement with ref = ref (week 9) - ref : ‘”’ IDENT ‘”’ | INTEGER | attrRef | synonym (prog_line) - Return type: (week 10) |Select |Should return| |-------|-------------| |BOOLEAN| TRUE/FALSE| |Statement (stmt / read / print / call / while / if / assign Program line (prog_line)| Statement number| |Variable| Name (no need to use “”) |Procedure| Name (no need to use “”) |Constant| Constant value |Tuple (<..., ...>)| Tuple values (e.g. 2 x, 3 z, …) - Note that BOOLEAN for type that have attrRef to an int, should check that int's attr/syn - Need to find more common procedures rs pkb api 1)affects(_,_) HasRelationship(aff) 2)a1, a2 GetRelationshipByTypes(aff, DE, DE) or HasRelationship(aff, DE, DE) Note: second case is for boolean with single clause. i.e only required if really wanna optimize to query in O(1) time. 3)a, _ GetRelationshipByType(aff, DE) 4)_, a GetRelationshipByType(affby, DE) 5)1, _ GetRelationship(aff, 1) 6)_, 10 GetRelationship(affby, 10) 7)1, a GetRelationship(aff, 1) 8)a, 10 GetRelationship(affby, 10) 9)1, 10 HasRelationship(aff, 1, 10) Note: All HasRelationship queries are expected to be in O(1), Everything else should be at most O(N), where N is the size of the vector being passed. Since the creation of the vector already takes O(N), this should be the best runtime anyways. ### Design decisions: - Problem: The tuple returned might have N number of values. Or the return value might be boolean - Soln: Create a result object which can handle this information. Provided the size of the tuple. - Problem: The table changes each time it is combined with another clause. - soln: - Create new table: Use prototype pattern for table class. - Mutate existing table: Have a table manipulator class - Problem: The query requires many different calls and types of manipulations - Soln: QueryEvaluator as the facade class - Problem: Table manipulation should be abide by SRP but each clause has complex operations which each has a different way of calling the table manipulation. - soln: The clause evaluator should be an invoker and the command pattern should be used with a table manipulator - problem: Need to compute Next* and Affects* - soln: Separate class to cache results and evaluate results, return results to calling class. - Problem: Pattern has 3 types now, maybe even more for extendability in the future. - soln: Consider implementing strategy design pattern for the various types of statements being utilized in pattern(?) Not too sure if this is the best design pattern or if 1 is even needed for this prob. - problem: Non-boolean might have multiple target synonyms. Boolean might have 1 target synonym. Boolean might only need to check if there exists the specified result. - The query to the PKB will be different for different types of clause. - Soln: create a command patter to create a command to query the PKB ### Pseudocode ``` For each boolean group: if any are false, break and return false to the query projector For each non-boolean group: evaluate and modify the table, then pass it to the query projector When evaluating boolean: If has only 1 clause and no synonym no need for table else needs table ``` ### thought process for with - ref : ‘”’ IDENT ‘”’ | INTEGER | attrRef | synonym (prog_line) - procedure.procName, call.procName, variable.varName, read.varName, print.varName : NAME - constant.value : INTEGER - stmt.stmt#, read.stmt#, print.stmt#, call.stmt#, while.stmt#, if.stmt#, assign.stmt#: INTEGER e.g: - with "x" = "y" ("IDENT") - with "x" = 1 (should be false) - with "x" = c1.procName ("IDENT", attrRef) - with "x" = c1 (invalid) - with call.procName = variable.varName NOTE: in all cases, there should not be multiple returns right? ### Return format for query - Elements of the tuple separated by space in a string - Each tuple string is an element in the list of strings returned by evaluate (TestWrapper) - Example: Select <a, p, s> … returns a list of strings where first element is 3 First 10, second element is 5 Second 4 etc. ### PKB API (To be reconfirmed with oliver) - For SuchThat with 2 synonyms e.g Uses(s1, v1): - std::vector<std::tuple<Entity*, Entity*>> GetRelationshipByTypes(PKBRelRefs, DesignEntity, DesignEntity); - For SuchThat with 1 synonym e.g Follows(s1, 4) or Parent (_, a1); - std::vector<Entity*> GetRelationship(PKBRelRefs ref, std::string entity); - For SuchThat with No synonym (but not both Wildcards) e.g Follows*(3, _) or Modifies(5, "V") - std::vector<Entity*> GetRelationship(PKBRelRefs ref, std::string entity); - For SuchThat with No synonym (but both wildcard) e.g Parent*(_,_) - bool HasRelationship(PKBRelRefs); - For Pattern with 2 synonyms e.g Pattern a1(v1, "_") or if1(v1, "_") - std::vector<Entity*> GetDesignEntities(DesignEntity de); - NOTE: query assign, variable, if or while accordingly. - For Pattern with 1 synonym e.g Pattern while1 ("x", "_") - std::vector<AssignEntity*> GetAssignEntityByVariable(std::string variable); - std::vector<WhileEntity*> GetWhileEntityByVariable(std::string variable); - std::vector<IfEntity*> GetIfEntityByVariable(std::string variable); ## sprints - refer to query extractor todo for more details on deliverable for that, but its in line with below: ### week 8 - Finish Exact match for pattern - Support multiple clases (group logic, such that and pattern for not) - Support tuple - Refactoring QUERYEvaluator - Adding in new classes for the patterns (changing the APIs called to each class) ### week 9 - Support tuple - Start to add support for Calls and Calls* - Add support for Next, Next*, Affects, Affects* (if frontend is ready) - Add suport for Pattern while and if ### week 11 - with pattern: - Attributes and attribute value types - procedure.procName, call.procName, variable.varName, read.varName, print.varName : NAME - constant.value : INTEGER - stmt.stmt#, read.stmt#, print.stmt#, call.stmt#, while.stmt#, if.stmt#, assign.stmt#: INTEGER - Note that Select can be tuple or BOOLEAN. tuple => elem->single synonym OR synonym.attr || <elem, ...> (week 9) - Implement with ref = ref (week 9) - ref : ‘”’ IDENT ‘”’ | INTEGER | attrRef | synonym (prog_line) - cases: - IDENT or INTEGER x IDENT or INTEGER - Just a simple string comparison - Value x attrRef - if not in the table, then just check, add and cross product. - Note that here should just query the DBManager. - e.g Select s1 with p1.stmt# = 4 - e.g Select s1 with p1.varName = "x" - if not it is, then run through the table and shorten. - No additional support needed here. - attrRef x attrRef - if both in the table, just run through and shorten - no addn support needed - if only one in the table, - e.g Select s1 with p1.stmt# = s1.stmt# - case 1: p1 is in the table: ADD s1 to table - case 2: s1 is in the table: CHECK THAT S1 IS A PRINT. NO NEED PKB - Neither in the table - case 1: specific stmt e.g print - std::vector<Entity*> GetDesignEntities(DesignEntity de); - case 2: no specific - std::vector<Entity*> GetDesignEntities(DesignEntity de); - Check if stmt# is a const - case 1: const not in: e.g Select s1 with const.value = p1.stmt# - case 2: print not in: e.g Select s1 with const.value = p1.stmt# - attrRef x prog_line - if both in table, just shorted. (NO need pkb) - if only 1 in table - e.g Select s1 with p1.stmt# = prog_line - case 1: p1 is in the table: ADD prog_line synonym to table, copy values over. - case 2: prog_line is in the table: CHECK THAT prog_line is PRINT. NO NEED PKB ### New APIs for optimisation FOR with clause: std::vector<Entity*> GetEntitiesWithAttributeValue(DesignEntity design_entity, AttributeEnum attribute, string value); std::vector<tuple<Entity*, Entity*>> GetStatementConstantPair(DesignEntity type_one, DesignEntity type_two) FOR such that with 2 synonyms - std::vector<std::tuple<Entity*, Entity*>> + HashMap GetRelationshipByTypes(PKBRelRefs, DesignEntity, DesignEntity, vector<Entity*>, vector<Entity*>, Enum); 1. std::vector<Entity*> GetRelationship(PKBRelRefs ref, std::string entity); 2. std::vector<Entity*> GetRelationshipByType(PKBRelRefs, DesignEntity, vector<Entity*>, Enum); - Note that its either use kNoScoped vs kLeftScoped or kRightScoped