# The Paw-litical Game

## Subtask 1

Do basically anything that is correct. It probably suffices to do a DFS where, for every array, we branch on merging some pair of adjacent cities.

## Subtask 2

First, we don't want to deal with exponentially large numbers. Realize that if we merge two cities with population $2^x$, we get $2^x+2^x=2 \cdot 2^x=2^{x+1}$. This means that we can redefine our merge operation to simply be $x+x=x+1$, and our numbers are now at most $10^9$.

Let $C$ be the sizes of the cities. When we say $C[l,r]$, we mean the cities $C[l], C[l+1], \dots, C[r]$. We say that an interval $[l,r]$ is *good* if it is possible to merge the cities $C[l,r]$ into a single city.

Now, let us try to actually solve the subtask. When we are done merging cities, we will end up with multiple intervals that are each merged into one city. It is unclear exactly which cities should be part of which interval. So let us guess! Let us guess that cities $i$ and $i+1$ are not part of the same interval. What happens then? The problem splits into 2 independent parts! This motivates the following DP:

$$DP[l][r]=\text{min cities left if we merge } C[l,r] \text{ optimally}$$

Then, we get:

- If $C[l,r]$ is good, then $DP[l][r]=1$: if we can merge $C[l,r]$ into a single city, the answer is obviously 1.
- Otherwise, we know there are two or more intervals in $C[l,r]$, so we guess that $k$ and $k+1$ are in different intervals for some $k$. More exactly, $DP[l][r]=\min(DP[l][k]+DP[k+1][r])$ over all $l \leq k < r$.

How do we quickly compute whether $C[l,r]$ is good? We precompute all good intervals using a new dp,

$$DP2[l][r]=\text{the value of the remaining city if } [l,r] \text{ is good, otherwise } -1$$

We have the base case $DP2[i][i]=C[i]$ for all $i$. Similarly to the last dp, we compute this one by guessing where the intervals go:

- When computing $DP2[l][r]$, we try every $l \leq k < r$, and check if $DP2[l][k]==DP2[k+1][r]$ (and that they aren't $-1$, of course).
If we find any such $k$, then $DP2[l][r]=DP2[l][k]+1$. Otherwise, it is $-1$.

These types of dps are well-known to run in $O(N^3)$: $O(N^2)$ states, and $O(N)$ transitions per state. This solves subtask 2.

## Subtask 3/4

We can rethink $DP[l][r]$ to run in $O(N^2)$ instead. We basically want to cover $[0, N-1]$ with the smallest number of good intervals. We can therefore go left to right and define

$$DP[i]=\text{min number of intervals needed to cover everything up to and including } i$$

We can then see that the transitions are:

$$DP[i]=\min(DP[j]+1) \text{ over all } j<i \text{ such that the interval } C[j+1,i] \text{ is good}$$

It should not be hard to see that this dp runs in $O(\text{number of good intervals})$ in total. As there are at most $O(N^2)$ good intervals, this is $O(N^2)$.

Now, we somehow need to optimize $DP2[l][r]$. It seems hard to remove a state, so let us try removing transitions. We can realize that for each interval $[l,r]$, $DP2[l][k]==DP2[k+1][r]$ can hold for exactly zero or one $k$: if we have a $k$ that works and move it one step to the left or right, the two new halves can't possibly merge anymore.

For subtask 3, it is easier to store the numbers in the original form $2^x$. Under this form of addition, the sum of an interval never changes when we merge two numbers. Therefore, $k$ must be the index such that $C[l]+C[l+1]+\dots+C[k]==C[k+1]+\dots+C[r]$. Using binary search over a prefix sum of the city sizes, we can find this $k$ (or determine that no such $k$ exists), giving $O(N^2\log(N))$ in total.

However, this does not work as well for larger city sizes. Can we find a new invariant under the $x+x=x+1$ operation? This seems difficult. Instead, we will compute $DP2$ bottom-up. By doing this in a smart order, it is easy to quickly find other potential intervals to merge with while staying good. This solution will be discussed more in-depth later.
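To make this concrete, here is a hedged Python sketch of the dps above, combining the simple $O(N^3)$ computation of $DP2$ with the $O(N^2)$ cover dp (function and variable names are our own, not from a reference solution):

```python
def min_cities(C):
    """C holds city sizes in exponent form, so merging is x + x -> x + 1."""
    n = len(C)
    # DP2[l][r] = value of the single city if C[l..r] is good, else -1.
    DP2 = [[-1] * n for _ in range(n)]
    for i in range(n):
        DP2[i][i] = C[i]
    for length in range(2, n + 1):
        for l in range(n - length + 1):
            r = l + length - 1
            for k in range(l, r):  # the split point is unique if it exists
                if DP2[l][k] != -1 and DP2[l][k] == DP2[k + 1][r]:
                    DP2[l][r] = DP2[l][k] + 1
                    break
    # DP[i] = min number of good intervals covering C[0..i].
    INF = float("inf")
    DP = [INF] * n
    for i in range(n):
        if DP2[0][i] != -1:
            DP[i] = 1  # the whole prefix is one good interval
        for j in range(i):
            if DP2[j + 1][i] != -1:
                DP[i] = min(DP[i], DP[j] + 1)
    return DP[n - 1]
```

For example, `min_cities([1, 1, 1, 1])` returns `1`: merge the pairs into two $2$s, then merge those into a $3$.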
## Subtask 5/6

The insight that is key to solving this problem is a bit of a leap of faith: there are only $O(N\log(N))$ good intervals.

How do you realize this? It feels very hard to solve the problem if there are $O(N^2)$ good intervals. So you try generating a worst case, but don't succeed. For example, create an array of size $N=2^m$ for some $m$. For $m=2$:

```
1 1 1 1
```

In this one, it is easy to see that there are only $O(N\log(N))$ good intervals: for every left endpoint, the only good right endpoints are the ones that are a power of 2 to the right of it. It seems unlikely that more complicated arrays have more good intervals, as it feels like it should be harder to merge lots of elements in them. Another example:

```
1 1 2 3 4 5
```

In this one, one left endpoint has lots of right endpoints, but the rest of the left endpoints can't merge to the right.

Contest tip: "prove" this by using a previous solution that finds all good intervals, together with an assert, maybe something of the style `assert(good_intervals <= 2*N*ceil(log2(N)))`. You can then try lowering/increasing the 2 based on your results. Use the online judge to "prove" your conjectures!

Now, how do we design a solution using this insight? It instantly gives us that calculating $DP$ only takes $O(N\log(N))$ time, as it runs in time proportional to the number of good intervals. Now, we have to find all good intervals. This is unfortunately harder.

We will find all good intervals by computing $DP2$ bottom-up. The base case is that every interval $[i,i]$ is good. How can we merge intervals? Interval $[l,r]$ can be merged with interval $[r+1,k]$ if $DP2[l][r]==DP2[r+1][k]$. So one possible way to compute this dp is by throwing all intervals into a priority queue, where the ones with the smallest $DP2$ come first. Among the ones with the same $DP2$, we process them right to left. This runs in $O(N\log(N)^2)$. Why? Every time an interval looks at another interval, it finds a new good interval.
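A hedged Python sketch of this priority-queue computation (identifier names are our own, not from a reference solution):

```python
import heapq

def find_good_intervals(C):
    # Each heap entry is (value, -left, right): popping yields the intervals
    # with the smallest DP2 value first, and right to left among equal values.
    heap = [(C[i], -i, i) for i in range(len(C))]
    heapq.heapify(heap)
    right_end = {}  # (value, left) -> right, for already-processed intervals
    good = []       # all good intervals found, as (left, right, value)
    while heap:
        v, neg_l, r = heapq.heappop(heap)
        l = -neg_l
        good.append((l, r, v))
        right_end[(v, l)] = r
        # A same-value interval [r+1, k] was popped earlier (its left endpoint
        # is larger), so look it up and merge into [l, k] with value v + 1.
        k = right_end.get((v, r + 1))
        if k is not None:
            heapq.heappush(heap, (v + 1, -l, k))
    return good
```

On `1 1 1 1` this finds all 8 good intervals. Combined with the contest tip above, one can assert `len(find_good_intervals(C)) <= 2*N*ceil(log2(N))` while stress testing.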
As there are at most $O(N\log(N))$ good intervals, and we do a heap operation for each one, we end up with $O(N\log(N)^2)$ in total. As this task is brutal and $N=10^6$, this will probably only give 45 points due to TLE. So we need to remove the log factor.

To optimize the solution, we will process intervals grouped by their $DP2$ value (it is impossible to merge ones with different $DP2$, after all). The goal is to be able to process all intervals with $DP2=S$, then the ones with $DP2=S+1$, etc. For the first $DP2$ group, we sort the intervals by their left endpoint. Now, our problem is: for every interval $[l,r]$, check if the interval $[r+1,k]$ exists for some $k$. By going through the intervals in reverse order (large left endpoints first), this can be done in linear time. We also need to construct the array $nextintervals$, the intervals created from this iteration. By going through the intervals right to left, we generate the new intervals for $S+1$ sorted by left endpoint, so we don't need to sort! (Although $nextintervals$ will be reversed, so we do need to reverse it back.)

However, we are still missing a detail: there may be intervals with value $S+1$ of size $1$ (the ones from the initial array). You will have to carefully insert these into $nextintervals$ at the same time as you add the newly created intervals, so that it remains reverse-sorted. If you ever need to call sort on $nextintervals$, it's probably going to be too slow.

In total, we spend $O(1)$ for every created good interval and $O(\log(N))$ for every initial interval (those of size 1). So in total, we get $O(N\log(N))$. If you implement this carefully and with a good constant factor, it is enough to solve the problem.

## Proof that there are only $O(N\log(N))$ good intervals

Let $x$ be the smallest number in $C$. Consider any interval of $x$'s of maximal size (that is to say, it can't be extended left or right). We'll call this a *block*.
**Lemma 1**: every good interval of size at least $2$ uses an even number of the elements from each block (possibly 0). There are four ways in which an interval can use a block:

- The interval is entirely within the block (then its length is a power of $2$, which is even).
- The interval is completely outside the block (it obviously uses 0 elements).
- The interval uses the entire block; if the block has an odd number of elements, this case is impossible.
- The interval uses a prefix or a suffix of the block.

Let $L$ be the size of this block. We will now do the following:

- If $L$ is even, merge all of the values; that is to say, replace the block with $\frac{L}{2}$ values of $x+1$. The new array has fewer good intervals; however, by Lemma 1, only intervals of size $1$ within the block are destroyed.
- If $L$ is odd, replace the block with $\frac{L-1}{2}$ values of $x+1$. As before, we destroy some intervals of size $1$. We also potentially create some new "fake" good intervals, as it may now be possible to use the whole block, since it is now of even length.

However, even counting the newly created ones, we can show that there are at most $O(N\log(N))$ good intervals in total. Iterate the previous process until we are left with an array containing only equal numbers. If that array has $M$ elements, it has $O(M\log(M))$ good intervals: an interval of equal values is good exactly when its length is a power of $2$, so every left endpoint has at most $\log_2(M)+1$ good right endpoints. Note that this count includes all the fake intervals created when we reduced odd blocks.

What about the intervals that we destroyed? For every block along the way, we remove some number of elements $k$; when these elements were removed, we also removed $O(k\log(k))$ good intervals (the ones within the block). Thus, we remove at most $O(N\log(N))$ intervals along the way.

This gives that the total number of good intervals is at most $O(N\log(N))$. Q.E.D.

# Rats

The solution has two parts, which are about equally painful to implement. First, we fully find the graph. Then, in the second part, we get all the rats to the correct spots.
## Finding the graph

We do this by going through all nodes and finding all of their neighbors. This procedure does not work for leaf nodes, and since we don't know from the start which ones are leaves, we will have to pretend that there are no leaves for now.

1. Let $V$ be some node that we want to find the neighbors of. There must be a rat on $V$ at the beginning of the procedure.
2. Query just $V$ until there is no longer a rat on $V$.
3. Now, this rat has moved to some unknown neighbor of $V$.
4. Query every node except for $V$. Now, a rat must move from a neighbor of $V$ to $V$ (and no other movements can have been made). Check what changes happened, and you will have found an edge of the graph!
5. Repeat steps 2-4, but with some clever thinking about whether you should now also query the already found neighbors: in step 2, you want to make sure the rat on $V$ moves to a new neighbor and not one you have already found; in step 4, you want to make sure not to make a rat move from an already found neighbor to $V$.

You do this procedure for every node in the graph. Since it is important that there is a rat on the node $V$ at the beginning, you have to think about the order in which to process the nodes.

Once you've done this, you still have the issue of leaf nodes. It turns out that step 4 of the above procedure, once you've already found the "true" neighbor of a leaf node, might give you some false neighbors as well. So we will have found the graph, with extra stuff. The way we deal with this is by first filtering out edges that seem directed (where $a$ is connected to $b$ but $b$ is not connected to $a$), and then finding all triangles (3-cycles) in the graph, cutting off the edge between the two nodes of the triangle with the highest values. This should finally give us the full graph.

## Getting the rats to the correct spots

Now that we have the full graph, we can actually simulate the queries: it is deterministic what will happen for a given query.
This idea allows us to do a divide & conquer procedure, where we split the graph into several components, decide what queries to do in each component, and then merge the queries together. This greatly reduces the number of queries needed. The algorithm works as follows:

1. Choose a center node which does not have a KATT (possibly but not necessarily the centroid; it should be some node which divides the graph a lot).
2. Count how many rats and how many KATT:s are in each component that this center splits the graph into.
3. Make queries in such a way as to move rats from components with too many rats to components with too few rats.
4. Once all components have the correct amount of rats, recurse down and do the same procedure on all smaller components. From now on, you *always* query the "old center" to make sure that no rats ever walk across it.

In this part of the code, we don't actually perform any queries; we only calculate which queries we should make in order to get all the rats to the correct spots. Then, when we're all done, we merge the queries together and perform all of them.

# Signs

## Main insight

Let's say we have a valid subset of signs. We call a subset of signs valid if no 2 signs in the subset contradict each other. Notice that we can do the following operations to the signs, and the subset of signs will still be valid:

1. Move a sign to the left or right, which is equivalent to increasing or decreasing all values on that sign by $k$.
2. Move a village to the left or right, which is equivalent to increasing or decreasing the distance to a specific village on all signs by $k$.

Using these 2 operations, we can normalize all signs in the following way:

- Pick a sign and call it $a$. We will now normalize all signs based off this sign.
- For each of the $M$ elements on $a$, subtract the $j$-th element of $a$ from the $j$-th element of every sign. Notice that some values on the signs might be negative now.
- For each sign, find the smallest element on it, and subtract that value from all other values on the same sign. Now all values on the signs will be non-negative, and the sign $a$ will consist only of $0$s.

For example, if we had the following 3 signs with 2 values each:

```
4 5
3 6
2 3
```

and we normalize the values based on the first sign, we first turn the signs into:

```
0 0
-1 1
-2 -2
```

and then subtract the smallest value on each sign:

```
0 0
0 2
0 0
```

If we assume $a$ is one of the correct signs in the optimal valid subset, then any sign containing a value larger than $1$ cannot be in this valid subset. Through this method, we eliminate most of the signs that are not valid. However, that is not sufficient to eliminate all invalid signs.

## Subtask 1, $N \leq 15$

$N$ is small here. We can try all subsets of signs and check if each subset is a valid solution. Note that if a subset is valid, we can choose any of the signs in the subset to be $a$, the sign that all other signs get normalized against. One can convince oneself that this is true.

To check if a subset is a valid solution, we normalize the signs and check if there is any value larger than $1$. If there is, then this subset is not valid. After that, we sort the signs by the sum of the values on each sign, since in a valid subset of signs, the sum of the values on the sign closest to Stockholm will be the largest. Now we can check in linear time if this subset is correct: for each pair of consecutive signs, the value at each index must be non-decreasing. For example, this subset of normalized signs is invalid:

```
0 0
0 1
1 0
```

Since from sign 2 to sign 3, the value at the second index decreases $\implies$ there is no way to place these 3 signs so that they are not contradicting.
Notice that this subset of signs is valid, though:

```
0 0
0 1
1 1
```

You can also think of it like this: if we check the signs from top to bottom, once an index has been activated, it cannot be deactivated again.

The optimal answer is the largest valid subset. Through the method above, we know the indices of the signs that make up the optimal subset. We take those indices and sort the signs by the sum of their original values. This works for a similar reason as stated above: the sum of the values of a sign in a valid subset is largest when the sign is closest to Stockholm, and the sums are in decreasing order as we get closer to Gothenburg.

This results in a solution that is $O(NM + 2^N \cdot N \log{(N)} \cdot M)$.

## Subtask 2, $N \leq 100$

Let's construct a graph. For each sign that we set as $a$ (the one we normalize against), we construct a graph where each sign is a node, and we add an edge from sign $x$ to sign $y$ if $x[i] \leq y[i]$ for all $1 \leq i \leq M$. Notice that an edge is added if and only if sign $y$ can be placed after sign $x$, when we normalize against $a$. What we want to find now is the longest path in this directed graph, which can be found in $O(N^2)$. It takes $O(NM + N^2M) = O(N^2M)$ to construct the graph for a specific sign that we normalize against. This results in an $O(N^3M)$ time complexity, since we construct a graph for every sign.

## Subtask 3, $N \leq 500$

We still want to construct the same graph, but there is a way to make each edge much faster to construct. Notice that after we have normalized all signs, all values should be either 0 or 1 (if a value is not 0 or 1, then that sign can't be valid). This means that instead of comparing all $M$ elements when checking $x[i] \leq y[i]$, we can store each sign as a bitset and compare the bitsets. This saves us a huge factor and results in an $O(N^3\frac{M}{32})$ time complexity.
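A hedged Python sketch of the normalize-then-longest-chain idea (Python's arbitrary-size integers stand in for the bitsets; all names are our own):

```python
def longest_valid_subset(signs, a):
    """Largest valid subset, assuming sign a is in the optimal subset."""
    M = len(signs[a])
    # Normalize every sign against sign a, then shift each sign so that
    # its smallest value becomes 0.
    norm = []
    for s in signs:
        shifted = [s[j] - signs[a][j] for j in range(M)]
        lo = min(shifted)
        norm.append([v - lo for v in shifted])
    # Keep only signs whose normalized values are all 0 or 1, and encode
    # each as a bitmask of its 1-positions.
    masks = []
    for v in norm:
        if all(x in (0, 1) for x in v):
            masks.append(sum(1 << j for j in range(M) if v[j] == 1))
    # Sign x can precede sign y iff x's 1-bits are a subset of y's;
    # sorting by popcount makes this a longest-chain dp.
    masks.sort(key=lambda m: bin(m).count("1"))
    best = [1] * len(masks)
    for i in range(len(masks)):
        for j in range(i):
            if masks[j] & ~masks[i] == 0:  # subset test via bit operations
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)
```

With the 3 example signs from the main insight (`4 5`, `3 6`, `2 3`) and $a$ being the first sign, the second sign is eliminated (it contains a $2$) and the best subset has size 2.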
## Subtask 4, $N \leq 1000$

We can save another factor by using the fact that at least $20\%$ of the signs are correct. As mentioned earlier, if we normalize against any sign that is in the optimal subset, we will successfully find the optimal subset of signs. This means we only need to create $f=40$ graphs, by choosing $40$ random signs to normalize against. The probability that all $40$ choices fail is $0.8^{40}\approx$ very small. This results in an $O(f \cdot (NM + N^2\frac{M}{32}))$ solution, which is sufficient to solve the problem.

We also note that there exists a deterministic solution running in $O(N^3+N^2M)$, but we leave it as an exercise for the reader.
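To see just how small the failure probability is, a quick check (the per-trial success rate of at least $0.2$ comes from the guarantee that at least $20\%$ of the signs are correct):

```python
# Each of the f trials picks a uniformly random sign to normalize against.
# A trial succeeds if the picked sign is in the optimal subset, which has
# probability at least 0.2, so all trials fail with probability at most 0.8^f.
f = 40
failure_bound = 0.8 ** f
print(f"{failure_bound:.2e}")  # about 1.33e-04
```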