# [When the Test Data Is Sus](https://po2punkt0.kattis.com/problems/testdatasus)

This problem is the classic [knapsack problem](https://en.wikipedia.org/wiki/Knapsack_problem), but with constraints that are too large to run the $O(NC)$ DP. There are good reasons to assume that we cannot beat $O(NC)$ in general (unless some other parameter is small). The weakness in this problem is that all item weights and profits are generated uniformly at random from the interval $[1, 3\cdot 10^5]$. The capacity is also exactly $3 \cdot 10^5$ in all test cases. I will describe more than one solution to show what works and what doesn't.

# Common building block: greedy solution

A common building block in many solutions is the greedy solution: first, sort items by profit/weight in decreasing order. Then, take items in that order and add them to the knapsack as long as they fit. One very nice property is that this is optimal if we could take fractions of items. So in some sense, it gets "close" to an optimal solution.

# DP

## $O(NC)$ DP: $N \leq 10000$

If you just write your loop in a reasonable manner, the compiler will automatically apply SIMD and make the knapsack run very quickly.

```c++
int n, t;
cin >> n >> t;
vector<p2> items(n);
repe(c, items) cin >> c.first >> c.second;

vi dp(t+1);
for (auto [profit, weight] : items) {
    for (int i = t; i - weight >= 0; i--) {
        dp[i] = max(dp[i], dp[i - weight] + profit);
    }
}
cout << *max_element(all(dp)) << '\n';
```

## $O(\sqrt{N}C)$ DP by pruning items: full points, ~0.3-0.8s runtime

Just run the previous solution, but only take the first, say, 5000 items when considering them in greedy order. This is what Erik and Olle did.
```c++
int main() {
    cin.tie(0)->sync_with_stdio(0);
    int n, t;
    cin >> n >> t;
    vector<p2> items(n);
    repe(c, items) cin >> c.first >> c.second;
    sort(all(items), [](p2 a, p2 b) {
        return a.first*b.second > b.first*a.second;
    });
    while (sz(items) > 5000) items.pop_back();

    vi dp(t+1);
    for (auto [profit, weight] : items) {
        for (int i = t; i - weight >= 0; i--) {
            dp[i] = max(dp[i], dp[i - weight] + profit);
        }
    }
    cout << *max_element(all(dp)) << '\n';
    return 0;
}
```

### Analysis

Now, how should we justify that this works? First, it turns out that the greedy solution already packs the knapsack quite well: if we run the greedy solution and break at the first item that doesn't fit (so we only take a prefix), we end up only ~1000 short of filling the full capacity. Recall that greedy is optimal if we could take fractional items. So we want to remove some items and add some other items with a similar profit/weight ratio to fill the remaining ~1000 capacity. Since everything is random, and given $N$ items there are $2^N$ subset sums (not necessarily distinct, but probably distinct enough that $2\log(k)$ items fill most of the sums in $[0,k]$), we surely don't need many more items than the greedy prefix to fully pack the knapsack. There's no point in using items with a much worse profit/weight ratio, since we are already able to fully pack the knapsack with higher profit/weight. So the intuition is that ~1000 items is roughly what greedy needs to fill the knapsack, and then we just need a few extra items to top it off (probably a logarithmic amount), so keeping 5000 leaves plenty of margin. But why does ~1000 suffice for greedy?

### Greedy takes $\sqrt{N}$ items

Here are some plots. Each item is a pair (weight, profit), and we plot it on a 2D grid. We also color all items that greedy takes.
N=100
![image](https://hackmd.io/_uploads/HywJ3A4zbx.png)

N=1000
![image](https://hackmd.io/_uploads/H1Ck3RVf-g.png)

N=10000
![image](https://hackmd.io/_uploads/ByBlh0VG-g.png)

Notice that the set of items that greedy takes is a radial sweep. For large $N$, this looks approximately like a triangle.

![image](https://hackmd.io/_uploads/HyyaSyBz-x.png)

Let the $x$-coordinate of the top-right corner be $w$. We can then find the expected $w$ at which the sum of the weights ($x$-values) of all points in the triangle equals $3 \cdot 10^5$, as a function of $n$. The number of items in our knapsack is then just the expected number of points within this triangle. I'm not a mathematician, so you can find a derivation by chatgpt [here](https://chatgpt.com/share/69376263-e1d0-8002-bbeb-e3cbdb8846d5). This results in $\sqrt{\frac{3}{2}N}$, which matches experimental results. Thus, as long as all item profits and weights are chosen uniformly at random from the interval $[1, \text{capacity}]$, it suffices to run the knapsack DP on the first $O(\sqrt{N})$ items in greedy order.

## DP over pareto front: $N \leq 10000$

Suppose that we do the normal knapsack DP, but think of the states in the DP as pairs (profit, weight). We can realize that if there's some other state with less weight and higher profit, we don't need to consider it: see the red items in the image.

![image](https://hackmd.io/_uploads/HyUk1TNMbl.png)

Thus, we could do our DP keeping a list of (profit, weight) pairs instead of a `vector<int>`. In practice, this does prune a lot of states, but we also lose SIMD, so we don't really gain a performance boost over the naive DP.

## DP over pareto front with pruning: full points, 1s

Do similar pruning for states as described in the branch and bound below: for each state, use the greedy solution to compute an upper bound in $O(\log n)$ with binary search and prefix sums, and remove the state if some state already present has greater profit than that upper bound.
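As a minimal sketch of the plain pareto-front DP (without the upper-bound pruning): each item is merged into the front, and a candidate state is kept only if no lighter state has at least as much profit. The function name `knapsack_pareto` is just illustrative; items are (profit, weight) pairs as in the snippets above.

```c++
#include <vector>
#include <utility>
#include <algorithm>
using namespace std;
typedef long long ll;

// Knapsack DP where the state list is the pareto front:
// (weight, profit) pairs sorted by weight, profits strictly increasing.
ll knapsack_pareto(const vector<pair<ll, ll>>& items, ll cap) {
    vector<pair<ll, ll>> front = {{0, 0}}; // (weight, profit)
    for (auto [profit, weight] : items) {
        vector<pair<ll, ll>> merged;
        ll bestProfit = -1;
        // Keep a state only if it fits and beats every lighter state's profit.
        auto push = [&](pair<ll, ll> s) {
            if (s.first > cap || s.second <= bestProfit) return;
            bestProfit = s.second;
            merged.push_back(s);
        };
        // Merge the old front with the front shifted by this item, by weight.
        size_t a = 0, b = 0;
        while (a < front.size() || b < front.size()) {
            pair<ll, ll> shifted = b < front.size()
                ? make_pair(front[b].first + weight, front[b].second + profit)
                : make_pair(cap + 1, ll(0)); // sentinel: never chosen
            if (a < front.size() && front[a].first <= shifted.first) {
                push(front[a++]);
            } else {
                push(shifted);
                b++;
            }
        }
        front = merged;
    }
    return front.back().second; // profits increase along the front
}
```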
Doing this, there will typically only be ~50 states on average, so the solution runs in roughly $O(50 N \log N)$.

# Simulated annealing

I didn't try very hard, but simulated annealing did not work well at all. Here's a good blog on it: https://bminaiev.github.io/simulated-annealing. Intuitively, even if we start with the greedy solution, simulated annealing considers too many bad items, and the landscape of the objective function is probably not smooth enough. My half-assed attempt that solves only $N \leq 30$: [code](https://gist.github.com/Matistjati/f6cb9f38a58c0ebc973e9482e925c910).

# Branch and bound

## Just do it: no points

Just write a recursive function trying all $2^N$ options, not branching if taking an item would exceed the capacity.

## Upper bound: $N \leq 30$

Recall that the greedy solution is optimal for fractional knapsack. Therefore, running greedy knapsack and including the last item that didn't fit gives an upper bound on the optimum. How can we use this in the recursion? Let's consider items in greedy order. Before we branch in the recursion, first compute an upper bound. If this upper bound is no better than the best solution found so far, we can safely exit early. The upper bound over the remaining items is simply found by taking the longest prefix of unconsidered items that still fits.
```c++
void rec(int i, ll profit, ll weight, vector<p2>& items) {
    ans = max(ans, profit);
    if (i >= sz(items)) return;

    ll ub = profit;
    ll weight_copy = weight; // start from the weight already used
    int j = i;
    for (; j < sz(items); j++) {
        if (weight_copy + items[j].second > cap) break;
        ub += items[j].first;
        weight_copy += items[j].second;
    }
    if (j < sz(items)) ub += items[j].first; // include the item that didn't fit
    if (ub <= ans) return;

    if (weight + items[i].second <= cap)
        rec(i + 1, profit + items[i].first, weight + items[i].second, items);
    rec(i + 1, profit, weight, items);
}
```

## Tighter upper bound: full points, 0.09s runtime

With a tiny modification, the previous solution receives full points: when computing the upper bound, take only the fraction of the first non-fitting item that you can afford. For why this works so well, you can reuse the earlier analysis: because the first branch takes an item, the recursion quickly produces a very good prefix solution, and then only makes small modifications to it. This gives a very good lower bound, and the upper bound is already tight enough that we can prune basically everything else. In fact, if you switch the order of the recursive calls (first taking the branch where you don't take the item), this solution only solves $N \leq 1000$.

```c++
void rec(int i, ll profit, ll weight, vector<p2>& items) {
    ans = max(ans, profit);
    if (i >= sz(items)) return;

    ll w = weight;
    ll ub = profit;
    int j = i;
    while (j < sz(items)) {
        if (w + items[j].second > cap) break;
        ub += items[j].first;
        w += items[j].second;
        j++;
    }
    if (j < sz(items)) {
        // better upper bound: take the affordable fraction of the next item
        ll remaining = cap - w;
        ub += items[j].first * remaining / (double)items[j].second;
    }
    if (ub <= ans) return;

    if (weight + items[i].second <= cap)
        rec(i + 1, profit + items[i].first, weight + items[i].second, items);
    rec(i + 1, profit, weight, items);
}
```
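For reference, here is a self-contained sketch of how the pieces fit together. Note that the items must be sorted in greedy order before the recursion starts, since the upper bound assumes it. The wrapper name `solve` is just for illustration; in the actual submission you would read $n$, the capacity and the items, then print the result of `solve`.

```c++
#include <vector>
#include <utility>
#include <algorithm>
using namespace std;
typedef long long ll;

ll cap, ans;

// Branch and bound with the fractional upper bound, as above.
// items must be sorted by profit/weight in decreasing order.
void rec(int i, ll profit, ll weight, vector<pair<ll, ll>>& items) {
    ans = max(ans, profit);
    if (i >= (int)items.size()) return;

    // Fractional-greedy upper bound over the remaining items.
    ll w = weight, ub = profit;
    int j = i;
    while (j < (int)items.size() && w + items[j].second <= cap) {
        ub += items[j].first;
        w += items[j].second;
        j++;
    }
    if (j < (int)items.size())
        ub += (ll)(items[j].first * (cap - w) / (double)items[j].second);
    if (ub <= ans) return;

    if (weight + items[i].second <= cap)
        rec(i + 1, profit + items[i].first, weight + items[i].second, items);
    rec(i + 1, profit, weight, items);
}

// items are (profit, weight) pairs.
ll solve(vector<pair<ll, ll>> items, ll capacity) {
    cap = capacity;
    ans = 0;
    // Greedy order: decreasing profit/weight (cross-multiplied, stays integer).
    sort(items.begin(), items.end(), [](pair<ll, ll> a, pair<ll, ll> b) {
        return a.first * b.second > b.first * a.second;
    });
    rec(0, 0, 0, items);
    return ans;
}
```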