# Remove Invalid Parentheses (Hard) Remove the minimum number of invalid parentheses in order to make the input string valid. Return all possible results. Note: The input string may contain letters other than the parenthesis ( and ). Here's the [link](https://leetcode.com/problems/remove-invalid-parentheses/) to the problem on leetcode ``` Example 1: Input: "()())()" Output: ["()()()", "(())()"] Example 2: Input: "(a)())()" Output: ["(a)()()", "(a())()"] Example 3: Input: ")(" Output: [""] ``` I am going to assume that the reader understands what it means to have 'valid' parenthesis. You might want to solve a couple of similar and simple versions of this type of problem first. I've linked 2 such problems that I personally found useful solving first, to better understand this particular problem: * [Valid Parentheses](https://leetcode.com/problems/valid-parentheses/) * [Minimum Remove To Make A Valid Parentheses](https://leetcode.com/problems/minimum-remove-to-make-valid-parentheses/) ## Make use of signals from the given problem The first signal to me lies in "generate all" such valid strings. Anytime I see a problem telling you to enumerate or generate all possibilities, this screams Recursion! So even if you didn't have any specific insight about the problem, you could possibly try out eliminating or including every single character in the given string and then checking if it is valid. This falls into the classic [subset recursion problem](https://leetcode.com/problems/subsets/). Subset problem is one of the classic recursion problems (that I would urge you to master, if you haven't already). If you could code up the subset problem in your sleep, the following code snippet would look very obvious to you. Btw, this is NOT a working solution. Can you guess what would be the output of this code? ``` List<String> removeInvalid(String s){ HashSet<String> set = new HashSet<>(); //to avoid duplicate result strings char[] candidate = new char[s.length()]; recurse(s, 0, candidate, 0, set); return new ArrayList<>(set); } void recurse(String s, int index, char[] candidate, int k) { if(index == s.length()){ StringBuilder sb = new StringBuilder(); for(int i = 0; i < k; i++){ sb.append(candidate[i]); } if(isValid(sb.toString())){ set.add(sb.toString()); } return; } recurse(s, index + 1, candidate, k, set); candidate[k] = s.charAt(index); recurse(s, index + 1, candidate, k + 1, set); } boolean isValid(String s){ int count = 0; for(char c: s.toCharArray()){ if(c == '('){ count++; }else if(c == ')'){ if(count > 0){ count --; }else{ return false; } } } return count == 0; } ``` The output is this: ``` ["","()()","()","(())","()()()","(())()"] ``` As you can see, what I did was a very mindless enumeration of all possible valid parenthesis subsets, by removing one or more of the existing parentheses. What can we do to make it a working solution? This is where the fun process of pruning begins! If you stay with me throughout this article, you will see how we can take it to an optimized solution just with this technique of pruning and modifying our baseline working solution (which we will get into in a min). For that to happen, we need to learn a bit more about the problem. So let's explore further! ## Explore and get to know the problem a bit more ### Step 1 - Come up with some baseline working solution first! Is there anything we could learn about the problem that would help make the above code snippet a working solution? The clue is in the medium difficulty problem I briefly mentioned in the beginning. And that is this: There is only a fixed minimum number of parenthesis we can ever remove to make it a valid string, isn't it? The outcome of the process may result in more than one valid string but they would all share the same length i.e. you would have removed the same number of parentheses to arrive at each of these resultant strings. If you are not sure about this, try out a few examples yourself! The next thing to ask is: can we know ahead of time, what this `mincount` could be? When we can assert that there is only one `mincount` for a given input string, there should be a way to actually know what the `mincount` would be, isn't it? Turns out, there is! I encourage you to atleast come up with your own algorithm to do this, and try to code it up. You'll learn a lot more about the problem, just with this simple exercise, and you're gonna need all those insights to further optimize our solution! Here's mine: ``` int minRemovalCount(String s){ int lp = 0, rp = 0; for(char c: s.toCharArray()){ if(c == '('){ lp++; }else if(c == ')'){ if(lp > 0){ lp--; }else{ rp++; } } } return lp + rp; } ``` How can we use this information in our code to actually make it a working baseline solution? Simply keep track of the number of characters removed in each recursive call, and in the terminating condition, check if that count equates to the `mincount`. ``` //somwhere in your code, you can precompute the minCount micount = minRemovalCount(s) void recurse(String s, int i, char[] candidate, int k, int count){ if(i == s.length()){ StringBuilder sb = new StringBuilder(); for(int j = 0; j < k; j++){ sb.append(candidate[j]); } if(isValid(sb.toString()) && count == minCount){ set.add(sb.toString()); } return; } //recurse by not considering the current char, so the count increases by 1 recurse(s, i + 1, candidate, k, count + 1); //include the current character candidate[k] = s.charAt(i); //count stays the same, because we are recursing by including the current character recurse(s, i + 1, candidate, k + 1, count); } ``` At this point, you'll notice that all the test cases given in the problem pass but obviously its not efficient. So you'll get a timelimit exceed error running it on leetcode. ### Step 2 What else can we learn from this problem? The clue lies in the implementation of the isValid() method. We already know the following about the current character (c) we are inspecting * If `c == '('` then we can include it (because there might be some matching `')'` ahead in the loop. We could also exclude it (because there may not be another matching `')'` ahead in the loop. So we try both options. * If `c == ')'` then we can include it IF AND ONLY IF there was an unmatched `'('` already found so far in this recurssive path. * If c is neither (it's a regular alphanumeric char), then simply add it and move on in the recursion. We can prune some of the recursive calls, based on these findings. I added an additional parameter to the recursive call, to keep track of the number of unmatched `'('` found so far in that recursive path. Let's see how our existing code might change to these new filters. ``` //we now are keeping track of number of unmatched '(' characters found so far void recurse(String s, int i, char[] candidate, int k, int count, int currentOpen){ if(count > minCount) return; if(i == s.length()){ if(currentOpen == 0){ StringBuilder sb = new StringBuilder(); for(int j = 0; j < k; j++){ sb.append(candidate[j]); } set.add(sb.toString()); } return; } if(s.charAt(i) == '('){ recurse(s, i + 1, candidate, k, count + 1, currentOpen); candidate[k] = s.charAt(i); recurse(s, i + 1, candidate, k + 1, count, currentOpen + 1); }else if(s.charAt(i) == ')'){ if(currentOpen > 0){ candidate[k] = s.charAt(i); recurse(s, i + 1, candidate, k + 1, count, currentOpen - 1); } recurse(s, i + 1, candidate, k, count + 1, currentOpen); }else{ candidate[k] = s.charAt(i); recurse(s, i + 1, candidate, k + 1, count, currentOpen); } } ``` Notice, how I am changing the count and currentOpen parameters. The rules are straightforward: * If you include the `'('`, increment currentOpen. * If you exclude a character, increment count (because you are removing one additional char) Also, notice that we no longer need to check the validity of the string in the base case! This is because we are now keeping track of the validity with the currentOpen parameter. If currentOpen is zero, that means, all the unmatched `'('` have been matched and we are anyway pruning any unmatched `')'` cases in the recursive call. (line #26 in the snippet above) This is already a way better solution and believe it or not, this actually passes on leetcode with all test cases. Hurray! If you still have appetite to chisel this further, then read on! ## Can we do better? May be a non-recursive approach? Yes! Again, the clue is in the recursive solution :) Recursion approaches a solution deeper in one direction before exploring its sibling options. They implicitly stack up the intermediate states of the results to restore later for alternative exploration paths. Most problems (especially the "generate all" type of problems) always have its iterative counterpart that uses queue. So does this problem! In fact, it's not terribly hard to come up with the bread first search intuition right at the beginning. Why? Again, the clue lies in the question. We are asked to find all the valid strings with "minimum" removals - in other words, shortest way to go from an invalid string to a valid string. This is a classic shortest path problem that can be solved using the breadth first search algorithm. **Approach:** You start with the root node (input string), add it to the queue. Then start exploring its children level by level. In this case, every string with one character less, is an immediate child of the root. The moment you find a valid string in one of these levels, you terminate and return the results. Because we are going breadth first way, we are assured that any valid string we encounter this way has minimum number of characters removed. Here's an initial baseline working solution using BFS: ``` public List<String> removeInvalidParentheses(String s) { HashSet<String> results = new HashSet<>(); HashSet<String> candidates = new HashSet<>(); candidates.add(s); while(results.isEmpty()){ HashSet<String> next_candidates = new HashSet<>(); for(String candidate: candidates){ if(isValid(candidate)){ System.out.println("Is Valid " + candidate); results.add(candidate); }else if(results.isEmpty()){ HashSet<String> next = getNext(candidate); for(String n: next){ next_candidates.add(n); } } } candidates = next_candidates; } List<String> list = new ArrayList<>(); for(String r: results){ list.add(r); } return list; } HashSet<String> getNext(String s){ HashSet<String> nextcandidates = new HashSet<>(); for(int i = 0; i < s.length(); i++){ if(s.charAt(i) != '(' || s.charAt(i) != ')') { nextcandidates.add(s.substring(0,i) + s.substring(i + 1)); } } return nextcandidates; } boolean isValid(String s){ int count = 0; for(char c: s.toCharArray()){ if(c == '(') count++; else if(c == ')') count--; if(count < 0) return false; } return count == 0; } ``` We could apply similar pruning techniques we saw earlier, to improve this solution. In this case, the pruning would happen mostly in the getNext() method which selects the neighbors; rather than mindlessly picking all possible substrings, we add a few conditions (similar to what we saw earlier), so that our neighbor list is drastically reduced to the ones that matter most. To be continued....