# IBM-NTU
## Resource Constrained Pattern Matching
Paper link (NTU Template). https://www.overleaf.com/read/vpjhctznsksr
### Paper Structure
1. Introduction
* Importantce of pattern matching
* as key component of deep packer inspection
* Breif intro to pattern matching
* different approaches and their pros, cons
* resource consumption
* Deploy DPI on resource constrained is an issue
* on smart NIC, for example
* Shortcoming of prior work
* DFA-based Rule Grouping requires much more memory
* Intro and summarize Rule Balancer
* Contribution
2. Background
* Intro to Automaton-based pattern matching **(Ongoing**)
* RegEx and pcre
* DFA vs NFA, processing time and memory cost
* Related Work
* Rule grouping
* Other complementary optimization approaches
3. Methodology
* Problem Formulation
:::danger
Comment. So $k$ is the maximum number of $R_x$ ?
:::
:::success
Response. Nope, it's the number of cores we have.
:::
:::danger
Comment. It looks like $R_1 \sim R_k$ are the input parameters, but it should be the output of your optimization problem. Please check.
:::
:::success
Response. They are inputs of our objective function and the output of the optimization. There is no contradiciton.
:::
:::danger
Comment. R is previously defined as regular expression, why are we using it "r" ?
:::
:::success
Response. The captital letter R is used to represent **the set of regular expressions** while the small letter r is used to be a single regex.
:::
* Evolutionary Rule Balancer
* Overview
:::danger
Comment. It is not clear of $a_i$. Is this a group or a rule?
:::
:::success
Response. $a_i$ is the group ID for the $i$-th rule. The above sentence is the definition and the following is the explanation of the notation.
:::
:::danger
Comment. "fitness of chromosome" would be a new concept for a reader. You might include some backgroudn explanation about the GA algorithm in high level.
:::
:::success
Response. Fix the order of the sentences
:::
:::danger
Comment. Use vectorized image for 3.2 and 3.3
:::
:::success
Response. What does vectorized image mean?
:::
:::danger
Comment. Figures are not referenced in the text.
:::
:::success
Response. Add reference in the context!
:::
* Regex Embedding
* Case Study. NFA
* Minimal Total NFA Size <--> Max-k Cut
:::danger
Comment. What is N?
:::
:::success
Response. The number of states of NFA for input regular expressions, as explained in the table 3.1. Add more discriptions in the proposition.
:::
:::danger
Comment. You may want to mention that K, the number of partitions, is given.
:::
:::success
Response. Yes, the number $k$ is supposed to be given.
:::
:::danger
Comment. This notation would be wrong, Please check. I think $V_i={r_{i_1}... r_{i_k}}$ would be correct.
:::
:::success
Response. The expression is correct. $k$ is the number of cores, not the maximal possible number of regex in a group.
:::
:::danger
Comment. Can we simplify this equation? You can use references of previous equations.
:::
:::success
Response. Unfortunately, the answer is no. This is just direct corollary from the definition (3.7). We didn't make any concrete constrcution, So the expression cannot be simplified at the moment.
:::
:::danger
Comment. What do you mean by "Hope to have" ? :)
:::
:::success
Response. Notice that the those paragraphs consists of not only the proofs, but also the motivations. The following expression is equivalent to reduce the problem to Max k-cut, which is our goal. That's why I said "Hope to have" :)
:::
:::danger
Comment. You may want to use different index for diferent i? r_{i,i} is a very confusion notation.
:::
:::success
Response. As a rule will be decomposed into many sub-patterns, I would like to use such notation. Maybe there is a more proper way to do so.
:::
:::danger
Comment. This equation is very hard to follow. You may want to simplify the equation and put the details in the appendix. this comment applies to 3.13-3.15.
:::
:::success
Response. Agree, still thinking ...
:::
* Minimal Largest NFA Size <--> Multi-depot Salesman
* Algorithm design
4. Experiment
* Space-time tradeoff
* Comparison of grouping algorithms
* Evaluate NFA Size Estimator
* Evaluate RegEx Embedding
5. Conclusion