# [Knapsack](https://open.kattis.com/problems/knapsack)

This is just the classic knapsack problem, where an $O(nC)$ solution suffices. The crux is that you need to output the items actually used in the optimal solution. To do this, store the full table `DP[i][weight]`. Then, at the end, start at the best solution and go backwards: using the DP table, you can check whether taking a particular item is optimal. [C++ code](https://gist.github.com/Matistjati/2df0ed5e201fd83e8d460e6ac51c95ca).

While we can't really optimize the time complexity, it is possible to get memory down to $O(\sqrt{n}C)$, or even $O(n+C)$ using Hirschberg's algorithm, all while maintaining $O(nC)$ time.

# [Arrested Development](https://open.kattis.com/problems/arresteddevelopment)

This problem seems very knapsack-like, so let's try DP. Notice that the order in which an intern does their tasks does not matter, only which tasks they do. Therefore, a first attempt would be something like

$$\text{DP}[i][t_1][t_2]=\text{can we get total times } t_1 \text{ and } t_2 \text{ for the interns after considering the first } i \text{ items?}$$

We can bound $t_1$ and $t_2$ by $n \cdot 10^5$, for a total of ~$n^3\cdot 10^{10}$ operations, which is obviously too slow. However, we can instead solve

$$\text{DP}[i][t_1]=\text{min } t_2,$$

since if we know $t_1$, we want to greedily minimize $t_2$. Thus, this becomes a knapsack-like problem: each item either increases $t_1$ or $t_2$. The number of states is bounded by $n \cdot n \cdot 10^5$, with $O(1)$ transitions per state. This is tough in Python, but doable if we compute it bottom-up. To make it run fast enough, I made sure to write predictable loops and minimize memory usage. Importantly, note how I never allocate new memory, because that is slow.
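Circling back to the Knapsack problem for a moment: the store-the-full-table-and-walk-backwards reconstruction can be sketched as below. This is a minimal standalone sketch with made-up items, not the linked solution; `knapsack_with_items` and its exact-weight convention are my own choices here.

```python
def knapsack_with_items(capacity, items):
    """items: list of (value, weight). Returns (best value, chosen item indices)."""
    n = len(items)
    NEG = float("-inf")
    # dp[i][w] = best value using the first i items with total weight exactly w
    dp = [[NEG] * (capacity + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i, (value, weight) in enumerate(items):
        for w in range(capacity + 1):
            dp[i + 1][w] = dp[i][w]  # skip item i
            if w >= weight and dp[i][w - weight] != NEG:
                dp[i + 1][w] = max(dp[i + 1][w], dp[i][w - weight] + value)
    # start at the best reachable weight and walk backwards through the table
    best_w = max(range(capacity + 1), key=lambda w: dp[n][w])
    chosen = []
    w = best_w
    for i in range(n, 0, -1):
        value, weight = items[i - 1]
        # item i-1 was taken iff the value is explained by the "take" transition
        if w >= weight and dp[i][w] == dp[i - 1][w - weight] + value:
            chosen.append(i - 1)
            w -= weight
    return dp[n][best_w], chosen[::-1]

print(knapsack_with_items(10, [(60, 5), (50, 4), (70, 6), (30, 3)]))  # (120, [1, 2])
```

The backtracking step mirrors the forward transitions exactly: at each row we ask which transition could have produced the current table value, and follow it upwards.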
```python
n=int(input())
inf=10**18
max_time=10**5+10
# dp[time for first intern] = min time for second intern
dp=[inf for i in range(n*max_time)]
dp[0]=0
newdp=[inf for i in range(n*max_time)]
for _ in range(n):
    a,b=map(int,input().split())
    # let second intern do the work
    # make sure we overwrite newdp to avoid stale values
    for i in range(len(dp)):
        newdp[i] = dp[i]+b
    # let the first intern do the work
    # iterate backwards to avoid same item being used multiple times
    for i in range(len(dp)-1, -1, -1):
        if i-a<0:
            break
        newdp[i]=min(newdp[i], dp[i-a])
    # swap the dp tables
    dp,newdp=newdp,dp
ans = inf
for i in range(len(dp)):
    ans = min(ans, max(i, dp[i]))
print(ans)
```

For even faster code, we do everything using a single array. [Code](https://gist.github.com/Matistjati/8ab2058a5b01a02d1371bdffa76d55be).

# [Chimchar Defense](https://open.kattis.com/problems/chimchardefense)

Stated concisely, the problem is the following: we have an array $H$. There is another array $D$, and each entry of it can be used at most once. If we use $D_i$ at position $i$, we decrease all $H_j$ with $j\leq i$ by $D_i$. At the end, tally up the total damage, but don't count any damage beyond what is necessary to bring an $H_i$ to $0$. Firing attack $i$ costs $C_i$. Maximize total damage minus cost.

First, why is this not just greedy? Because damage beyond what is necessary to kill a certain Chimchar doesn't count. It's obviously hopeless to try a DP that keeps track of the health of every Chimchar; that's exponential. To solve the problem, we will use a simple but important technique: reading all constraints properly, and understanding them. The part of this problem that makes it solvable is that $H_i \leq 5000$. This suggests a DP of the form

$$\text{DP}[i][\text{total damage}]=\text{max damage}-\text{cost},$$

where we never keep track of total damage beyond $5000$. To properly account for the damage, we can iterate backwards.
```python
n=int(input())
h=[int(i) for i in input().split()]
D=[int(i) for i in input().split()]
c=[int(i) for i in input().split()]
# dp[damage]=max profit
dp=[-10**18 for i in range(10001)]
dp[0] = 0
for i in range(n-1,-1,-1):
    # iterate backwards so we don't count multiple times
    for d in range(10000, -1, -1):
        dp[d] = max(
            dp[d] + min(h[i], d),                      # don't attack
            dp[max(0, d-D[i])] + min(h[i], d) - c[i]   # do attack
        )
print(max(dp))
```

Note that in this implementation, we use $10000$ as the upper bound instead of $5000$. It's pretty subtle, but the problem is that with the smaller bound, the code doesn't consider all ways of getting to $\geq 5000$. Using only $\max(H)$ breaks on this test case:

```
2
2 1
2 1
0 0
```

To solve this, it would probably be easiest to do a "push-style" DP instead of pull-style (what I did). Push-style [code](https://gist.github.com/Matistjati/183a57ff5787f896d25b394339a3efe6). If the constraints were not so tight (or if I just did solutions in C++), we could easily write the DP in a less thought-out way.

# [1's For All](https://open.kattis.com/problems/1sforall)

Conceptually, we can start at $n$. Then, suppose we split it into two parts, $a$ and $b$ with $a\cdot b=n$. What is the cost if we do this? The cheapest cost to construct $a$ and $b$, respectively! And of course, we also have to consider the other two operations. Thus, we get

$$\text{DP}[i]=\text{min cost to build } i,$$

and we can compute all $i$ in increasing order. Let's analyze the cost of transitions. We build $i$ by:

- Taking two numbers $a$ and $b$ with $a+b=i$. There are $O(N)$ such pairs.
- Taking two numbers $a$ and $b$ with $a \cdot b=i$.
- Taking two numbers $a$ and $b$ with $a@b=i$.

For the third operation, we take the decimal representation of $i$ and split it in two. Since there are $\log(N)$ digits, there are that many ways to split it. Analyzing the second operation is a bit nontrivial, but amortized over all $N$ states, it's bounded by $O(N\log(N))$. Quick proof: any $a$, $b$ with $a\cdot b=i$ are divisors of $i$.
For a given $a$, it is a divisor of $a$, $2a$, $3a$, $\dots$, so there are ~$\frac{N}{a}$ numbers up to $N$ that it divides. Summing over all $a$, we get $\frac{N}{1}+\frac{N}{2}+\frac{N}{3}+\cdots+\frac{N}{N} \approx N\log(N)$. This sum accounts for all divisors. To see why it is $N\log(N)$, search "harmonic series growth".

So in total, operations 2 and 3 take $O(N\log(N))$, but operation 1 takes $O(N^2)$. There are reasons to believe that doing operation 1 faster is essentially impossible (it is basically an online (min, +) convolution). Thus, we have two options:

- Use C++, where it's simply fast enough because it's a tight loop with $N=10^5$ and a 15 second time limit
- When doing plus, limit $a$ to at most, say, $100$. Intuitively, + is a very slow operation compared to the other ones, so using big $a$ will always be suboptimal.

```python
n=int(input())
dp=[10000000] * (n+1)
dp[1] = 1
divisors = [[] for i in range(n+1)]
for i in range(1, n+1):
    for j in range(1, n+1):
        if i*j>n:
            break
        divisors[i*j].append(i)
for i in range(2, n+1):
    for a in range(min(100,i)):
        dp[i] = min(dp[i], dp[a]+dp[i-a])
    for a in divisors[i]:
        dp[i] = min(dp[i], dp[a]+dp[i//a])
    i_dec = str(i)
    for j in range(1, len(i_dec)):
        l_half = i_dec[:j]
        r_half = i_dec[j:]
        if r_half[0] == '0':
            continue
        dp[i] = min(dp[i], dp[int(l_half)]+dp[int(r_half)])
print(dp[n])
```

[C++ code](https://gist.github.com/Matistjati/0178cd82fcb5bd8a1630b16fdc9c11e5). It turns out that 12 is the smallest limiter on $a$ that can be used. I'm not sure at all whether this limiter is $O(1)$, $O(\log(N))$, or some other slow-growing function.

# [Atomic Energy](https://open.kattis.com/problems/atomicenergy)

If you were disgusted with "hurr durr let's limit $a$" from the last one, you won't like this one. Let's rephrase the process. First, think of the process in reverse: we select a bunch of small atoms and then merge them. Now, what conditions do we have on them?
They must have neutron count $K$ in total, and be mergeable. The only extra criterion for mergeability is (subtly) that whenever we combine atoms with counts $i$ and $j$, we need $i+j>n$. After we have combined our first two atoms, their sum is $>n$ by necessity, and we can thus combine this atom with the rest, one by one. So this looks an awful lot like knapsack: given items (weight, profit), where every item can be taken multiple times, select a multiset with sum of weights equal to $K$, with minimum profit, and where any two combined items must have sum $>n$. For now, let's ignore the condition on two items having sum $>n$.

Now to solve the problem for one query. Let's argue heuristically to get around the large $K$: notice that $K$ is HUGE compared to our available weights (they are at most 100). Thus, most items we take should be the one with the best profit/weight ratio, and the rest of the items fill the gap that would've been left if we only took that one. Thus, we can set some threshold $T$, and repeatedly remove the best profit/weight item from $K$ while $K \geq T$ (in $O(1)$ using a division). Then we have a normal knapsack problem, with capacity $< T$. If we do the knapsack bottom-up and compute all answers at once, then we can answer each query in $O(1)$.

To solve this knapsack problem quickly, the easiest way is probably to forbid transitions that would combine two small atoms (both $\leq n$). To actually get the DP started, we can simply add all of the $n^2$ pairs of items that are valid. Then, just do the DP as normal.

```c++
int T = 20000;
vi dp(T, inf);
dp[0] = 0;
rep(i, n) dp[i + 1] = energy[i];
// Add initial pairs to get it started
rep(i, n) rep(j, n) if (i + j + 2 > n)
    dp[i + j + 2] = min(dp[i + j + 2], energy[i] + energy[j]);
repp(i, n + 1, T) {
    rep(j, min(i, (ll)n)) {
        if (i - (j + 1) <= n) continue;
        dp[i] = min(dp[i], energy[j] + dp[i - (j + 1)]);
    }
}
```

Now, what should $T$ be? The ugly answer is: as large as possible.
In IOI-style contests, just set it to e.g. $10000$ and submit. If you get WA, increase it; if you get TLE, decrease it and resubmit. If you don't solve the problem with this, you will have to think carefully about whether you need a significantly larger $T$, or whether you have a bug. [C++ code](https://gist.github.com/Matistjati/640df3ed9bb4cb7c62ac8c1d5b0aa949) (it's pretty long, so I didn't paste it). A more careful analysis can be found [here](https://2020.nwerc.eu/files/nwerc2020slides.pdf).

# [Not Another Constructive!](https://open.kattis.com/problems/notanotherconstructive)

Let's first think about how we would count the number of NAC occurrences in a string. We would do something like

```python
n = 0
na = 0
nac = 0
for C in s:
    if C == 'N':
        n += 1
    elif C == 'A':
        na += n
    elif C == 'C':
        nac += na
```

Now, we can imagine doing a DP where we keep track of `n`, `na` and `nac`:

$$\text{DP}[i][n][\text{na}][\text{nac}]=\text{is this state achievable?}$$

Now, how many states?

- $i \leq N$
- $n \leq N$
- $\text{na} \leq N^2$
- $\text{nac} \leq k$

With $N=40$, we get ~$10^{10}$ operations, which is too slow. We can't optimize away any dimension, since no state is strictly better than another.

## Solution 1: bitsets (0.36s)

The naive solution is only slightly too slow. Since all DP values are 0/1, we might try optimizing the DP by doing it bottom-up with bitsets. Writing out all transitions, this does indeed work. As a tiny optimization, we also need to cap `na` by $20^2$ instead of the naive $40^2$. [C++ code](https://gist.github.com/Matistjati/f21199ad353a479da6159b62d3b1c811).

## Solution 2: sparsity (0.09 or 0.03s)

Intuitively, many states are not reachable: as we saw earlier, `na` can actually be bounded by $20^2$. But if `na` is maximized, then `nac` will be 0. So one would hope to just do a top-down DP and use an unordered set to keep track of visited states. Still, just doing the DP as-is leads to too many states.
To reduce the number of states: for each state I visit, I compute an upper bound on the number of NAC that can be obtained by filling in the rest of the string. If this is less than $K$, I can safely return early. Now, how do I compute this efficiently? There's a very cute solution: the DP

$$\text{DP}[i][n][\text{na}]=\max(\text{nac})$$

can be computed in $O(N^4)$ time, which fits in the time limit! So we can start by precomputing that, and then use it to prune the DP we really want to be doing. To get it to run fast enough, I had to compute both the min and the max, and also prune if, no matter what we do, we will create too many NAC. [C++ solution](https://gist.github.com/Matistjati/21e0c2bf3edc18a9c8e47b713bb7c484).

One other possible optimization is to use a bitset instead of an unordered set to keep track of visited states. Unfortunately, this alone is not enough: you still need the other pruning. For some meta-commentary, the bitset solution was much easier and quicker to code.

# [Evading a Monster](https://open.kattis.com/problems/evadingamonster)

## Subtask 1

We can do a DP in $O(N^2M)$ by keeping track of our position and how far along its path the monster has come, and at every point trying to move to every other position. To avoid a complexity of $O(N^3M)$, either precompute distances, or do all transitions in $O(N)$ per state with a DFS.

## Subtask 2

The constraint in this subtask should be a big hint: we're aiming for $O(\max(\deg) \cdot M)$ or similar. This inspires the following strategy: we basically only need to move when we're in proximity to the monster. So after each monster movement, we take the node it moves towards and let its value spread out to adjacent nodes, and let nodes adjacent to the now-free node have the option to go there. Concretely, the algorithm is: first, do a flood-fill in the initial tree. Then, recompute the DP value for the node the monster just left, and for all nodes adjacent to the one it just went to.
It's a little non-obvious that the optimal strategy can be implemented like this, so take a moment to convince yourself that it's true. [C++ code](https://gist.github.com/Matistjati/9ab59470005ad9666d245be7dd4376bc).

## Subtask 3

We're now going to optimize the previous solution. There are 3 main operations that the previous code does:

**Pull**

```c++
repe(e, adj[monster_pos]) {
    if (e == monster[i]) continue;
    dp[monster_pos] = min(dp[monster_pos], dp[e] + 1);
}
```

**Push**

```c++
repe(e, adj[monster[i]]) {
    if (e == monster_pos) continue;
    dp[e] = min(dp[e], dp[monster[i]] + 1);
}
```

**Set**

```c++
dp[monster_pos] = inf;
```

First, let's root the tree at node 0. Next, for every node, we build a lazy segment tree over its children. The segment tree supports:

- Given $l,r,x$: set $A[i]=\min(A[i], x)$ for all $l \leq i \leq r$
- Given $i,x$: set $A[i]=x$
- Given $l,r$: return $\min(A[l..r])$

This can be done with a normal lazy segment tree. We also keep track of the normal DP array. So every node's DP value appears in two places: the normal DP, and its parent's segment tree over children. Now, here's how to perform the operations:

- **Set**: simply set $DP[i]$ and its value in the parent's segment tree
- **Pull**: take the min over all children using the segment tree, and take the min of that value and the node's value in the parent's segment tree. To exclude the node that the monster is currently at, either don't count the parent, or do two range queries to exclude it
- **Push**: first, obtain the up-to-date value of $DP[i]$ by looking at $DP[i]$ and the parent's segment tree. Then, similarly to pull, update the parent, the parent's entry in the grandparent's segment tree (easy to miss), and the children using the segment tree, taking care to exclude the node the monster came from.

There exists a linear-time solution too.
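The per-node segment tree described above can be sketched as follows: a minimal lazy segment tree supporting range `chmin` (the $\min(A[i],x)$ range update), point assignment, and range min. This is a standalone sketch, not the contest code; the class and method names are my own.

```python
class ChminSegtree:
    """Lazy segment tree: range A[i] = min(A[i], x), point assign, range min."""
    def __init__(self, n, inf=float("inf")):
        self.n, self.inf = n, inf
        size = 1
        while size < n:
            size *= 2
        self.size = size
        self.mn = [inf] * (2 * size)    # subtree minimum (includes pending lazy)
        self.lazy = [inf] * (2 * size)  # pending chmin for the whole subtree

    def _apply(self, node, x):
        self.mn[node] = min(self.mn[node], x)
        self.lazy[node] = min(self.lazy[node], x)

    def _push(self, node):
        if self.lazy[node] != self.inf:
            self._apply(2 * node, self.lazy[node])
            self._apply(2 * node + 1, self.lazy[node])
            self.lazy[node] = self.inf

    def chmin(self, l, r, x, node=1, nl=0, nr=None):
        """A[i] = min(A[i], x) for l <= i < r (half-open)."""
        if nr is None:
            nr = self.size
        if r <= nl or nr <= l:
            return
        if l <= nl and nr <= r:
            self._apply(node, x)
            return
        self._push(node)
        mid = (nl + nr) // 2
        self.chmin(l, r, x, 2 * node, nl, mid)
        self.chmin(l, r, x, 2 * node + 1, mid, nr)
        self.mn[node] = min(self.mn[2 * node], self.mn[2 * node + 1])

    def assign(self, i, x):
        """A[i] = x; may also increase a value, e.g. back to infinity for Set."""
        path = []
        node = self.size + i
        while node > 1:
            node //= 2
            path.append(node)
        for p in reversed(path):  # push pending lazies down to the leaf first
            self._push(p)
        node = self.size + i
        self.mn[node] = x
        while node > 1:
            node //= 2
            self.mn[node] = min(self.mn[2 * node], self.mn[2 * node + 1])

    def query(self, l, r, node=1, nl=0, nr=None):
        """min(A[i]) over l <= i < r (half-open)."""
        if nr is None:
            nr = self.size
        if r <= nl or nr <= l:
            return self.inf
        if l <= nl and nr <= r:
            return self.mn[node]
        self._push(node)
        mid = (nl + nr) // 2
        return min(self.query(l, r, 2 * node, nl, mid),
                   self.query(l, r, 2 * node + 1, mid, nr))
```

In the mapping to the operations above: **Push** to a node's children is one `chmin` over their index range (split into two calls to skip the excluded child), **Pull** is a `query` over the children, and **Set** is an `assign` — note that `assign` must push pending lazies down first, since it can raise a value back to infinity, which a plain `chmin` never could.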