## Pizzubestun

This problem can be solved with a greedy algorithm. Let's start with an extreme case: we have to buy the most expensive pizza and pay for it at some point. Which pizza should we buy together with it? We might as well pick the second most expensive pizza to get for free. Now we have removed the most and second most expensive pizzas, which we know is optimal, and we can apply the exact same argument to remove two more pizzas! We repeat this until all pizzas have been bought. Implemented as-is (repeatedly scanning for the maximum), this is $O(N^2)$, which is too slow. To speed it up, we first sort the pizzas.

```py
n = int(input())
prices = [int(input().split()[1]) for i in range(n)]
prices.sort(reverse=True)
ans = 0
i = 0
while i < n:
    ans += prices[i]  # pay for this pizza; the next one is free
    i += 2
print(ans)
```

## Accounting

The key to this problem is to be lazy. We can't afford to update every person on each event of type 2, as that would be too slow. Instead, we treat every restart as the start of a new "epoch". We begin at epoch $e=0$, and every time an event of type 2 happens, we advance to epoch $e+1$. For every person, we keep track of the last epoch in which they were part of an event of type 1. When we want to print, we check whether they were changed during the current epoch. If so, we print their individually set value; otherwise, we print the value set by the most recent restart.

```py
n, q = [int(i) for i in input().split()]
change_time = [0 for i in range(n)]
changed_to_event1 = [0 for i in range(n)]
event2_value = 0
time = 0
for i in range(q):
    line = input().split()
    if line[0] == "SET":
        who, value = [int(i) for i in line[1:]]
        who -= 1
        change_time[who] = time
        changed_to_event1[who] = value
    if line[0] == "RESTART":
        time += 1
        event2_value = int(line[1])
    if line[0] == "PRINT":
        who = int(line[1])
        who -= 1
        if change_time[who] == time:
            print(changed_to_event1[who])
        else:
            print(event2_value)
```

## Inverted deck

There are many approaches to this problem, some more complicated than others.
Here is the simplest one I'm aware of: sort the list, keeping track of the original indices, and find every index where the original and the sorted version differ. These indices must form an interval; however, even that is not sufficient for one flip to be enough. So instead we find the endpoints of the differing interval, reverse that whole interval, and then check whether the array is sorted. Note that if the list is already sorted, we must simply output "1 1" right away; if we don't special-case this, the list of differing indices is empty and indexing it causes a runtime error.

```py
n = int(input())
nums = [int(i) for i in input().split()]
numsc = sorted(nums)
if nums == numsc:
    print("1 1")
    exit(0)
diffs = []
for i in range(n):
    if nums[i] != numsc[i]:
        diffs.append(i)
l = diffs[0]
r = diffs[-1]
# reverse the interval
nums[l:r+1] = nums[l:r+1][::-1]
if nums == numsc:
    print(f"{l+1} {r+1}")
else:
    print("impossible")
```

## Finding lines

In general, the problem "given $N$ points, find a line that passes through 3 of them" is hard to solve in better than $O(N^2)$ time, in the sense that no one knows how and it is believed to be impossible. So what makes this problem solvable? The only difference between the general version and this version is that here, the answer must be large. How can we use the fact that the answer is large (or impossible)? If we randomly pick two points, there is a decent probability that the line through them is a good line. Suppose $p=20$, the worst case. Then the probability that a random line is good is about $0.2 \times 0.2 = 0.04$, which means that in expectation we have to try $\frac{1}{0.04}=25$ random pairs of points to find a good one. How does this help us? Once we have sampled a random line, we can count the number of points lying on it in $O(N)$ time. So our algorithm is: "select two random points, find the line between them, count the number of points on that line; repeat 200 times;
if we don't find a sufficiently good line, output impossible".

What's the probability of success? If the answer is impossible, our algorithm always outputs impossible. If it is possible and $p=20$, then a given sample fails with probability $0.96$, so the probability that we don't find a solution after $x$ samples is $0.96^x$. For $x=200$, we get $0.96^{200} \approx 0.000285$, so the probability that we do find a solution is $1-0.96^{200} \approx 99.97\%$. Note that our solution is run on about 100 testcases, so we also need the probability that all of them succeed: $0.9997^{100} \approx 97\%$, which is sufficient. If you ever submit a solution with $\geq 95\%$ probability of success and you get wrong answer, it's probably worth it to submit again (assuming you're confident in your probability calculations). Note that if a single test file contains many testcases, as on Codeforces, your solution may effectively be run on $\approx 10^5$ to $10^6$ testcases, which often makes a randomized solution infeasible.

However, we have skimmed over an important detail: how do we represent our line? If we use $y=kx+m$, how do we represent the line from $(0,0)$ to $(0,1)$? The answer is that we can't, so we handle vertical lines separately: for each $x$ coordinate, count how many points have that $x$ coordinate, and check whether any of them has sufficiently many points, in which case we are done. This also handles the edge case of $N=1$. Even then, doing the calculations with doubles (the default floating-point type in Python) gives precision errors that are too large, and using fractions is too slow in both Python and C++. One solution is to use `long double` or `__float128` in C++. To solve it in Python, we will instead rewrite the equation to use only integers. Let $(x_1, y_1)$ and $(x_2, y_2)$ be the points of our sample.
Then, to check whether a third point $(x_3, y_3)$ is on the line, we start from

$$y = kx + m$$
$$k = \frac{y_2-y_1}{x_2-x_1}$$
$$m = y_1 - x_1 \cdot k$$

and the check $x_3 \cdot k + m = y_3$ becomes

$$x_3 \cdot \frac{y_2-y_1}{x_2-x_1} + \left(y_1 - x_1 \cdot \frac{y_2-y_1}{x_2-x_1}\right) = y_3$$
$$\frac{y_2-y_1}{x_2-x_1} (x_3 - x_1) = y_3-y_1$$
$$(y_2-y_1)(x_3 - x_1) = (x_2-x_1)(y_3-y_1)$$

This check can be done using only integers! So the takeaways are:

- Never EVER use 32-bit floats. Use at least double. If that still has precision errors, try `long double` or `__float128` if you use C++. Otherwise, too bad (in Python, there is the decimal module, but it is super slow).
- If the time limit is not tight, using fractions might be good.
- If possible, doing everything with integers is always fastest and safest.

Another neat advantage of this form is that division by zero is no longer a problem. We still need to handle $N=1$ separately.

```py
import random

n = int(input())
if n == 1:
    print("possible")
    exit(0)
p = int(input())
req = (n * p + 99) // 100  # ceil(n * p / 100)
pts = []
for _ in range(n):
    x, y = map(int, input().split())
    pts.append((x, y))
for _ in range(200):
    (x1, y1), (x2, y2) = random.sample(pts, 2)
    cnt = 0
    for x3, y3 in pts:
        if (y2 - y1) * (x3 - x1) == (x2 - x1) * (y3 - y1):
            cnt += 1
    if cnt >= req:
        print("possible")
        exit(0)
print("impossible")
```

## Drunk texting

This is another problem that is hard to do in near-linear time: given strings $A$ and $B$, the best we can hope for in general is $O(|A||B|)$. Fortunately, that is sufficient here. The idea is a DP. We build our goal string character by character, keeping track of how far along each string we've come. At every step, there are two choices:

- add the next character from $A$
- add the next character from $B$

Notice that if the next characters are the same, we can advance both strings at once. Finally, we need to reconstruct the actual answer: we save the optimal choice at every DP state and then walk through the states, following the choices made in the optimal solution.
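The choices above can be written as a recurrence (my notation): let $dp(i, j)$ be the minimum number of characters still needed once the first $i$ characters of $A$ and the first $j$ characters of $B$ have been placed. Then

$$dp(|A|, |B|) = 0$$
$$dp(i, j) = \min \begin{cases} 1 + dp(i+1, j+1) & \text{if } i < |A|,\ j < |B|,\ A_i = B_j \\ 1 + dp(i+1, j) & \text{if } i < |A| \\ 1 + dp(i, j+1) & \text{if } j < |B| \end{cases}$$

and the length of the answer is $dp(0, 0)$.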
Bonus problem: show that this problem is "exactly" as hard as longest common subsequence.

```py
import sys
sys.setrecursionlimit(2050)

a = input().strip()
b = input().strip()
inf = 100000
maxn = 1010
memo = [[-1] * maxn for _ in range(maxn)]
choice = [[-1] * maxn for _ in range(maxn)]

def best(i, j, a, b):
    if memo[i][j] != -1:
        return memo[i][j]
    if i == len(a) and j == len(b):
        return 0
    ret = inf
    both = (i < len(a) and j < len(b) and a[i] == b[j])
    if both:
        ret = 1 + best(i + 1, j + 1, a, b)
        choice[i][j] = 0
    if i < len(a):
        v = 1 + best(i + 1, j, a, b)
        if v < ret:
            ret = v
            choice[i][j] = 1
    if j < len(b):
        v = 1 + best(i, j + 1, a, b)
        if v < ret:
            ret = v
            choice[i][j] = 2
    memo[i][j] = ret
    return ret

best(0, 0, a, b)
state = (0, 0)
ans = []
while state != (len(a), len(b)):
    i, j = state
    if choice[i][j] == 0:
        ans.append(a[i])
        state = (i + 1, j + 1)
    elif choice[i][j] == 1:
        ans.append(a[i])
        state = (i + 1, j)
    else:
        ans.append(b[j])
        state = (i, j + 1)
print("".join(ans))
```

## Liftkarta

Now we get to a much harder problem. First, note that the graph is undirected and weighted. Second, note that to answer a particular query, we need to find a path from $A$ to $B$ that minimizes the largest edge crossed. Given that value, we can use math to calculate the number of family members that can go in $O(1)$. We strongly recommend implementing all of the subtask approaches here that you are not familiar with; all of them are very standard and may very well be used at the finals.

### $O(MQ\log(N))$ solution number 1

The problem can be solved by a modified Dijkstra. For every edge, instead of pushing $d+w_e$ to the queue, push $\max(d, w_e)$. Read more here: https://www.geeksforgeeks.org/widest-path-problem-practical-application-of-dijkstras-algorithm/. This approach seems difficult to generalize.

### $O(MQ\log(N))$ solution number 2

Binary search for the answer.
We can check whether the answer is $\leq x$ by building a graph with only the edges of weight $\leq x$ and checking whether there is a path from $A$ to $B$ using those edges. Read more here: https://usaco.guide/silver/binary-search?lang=cpp. This approach motivates solution 3, which generalizes to the full solution.

### $O(MQ \alpha(N) + M\log(M))$, solution number 3

Sort all edges by weight. For each query, create a union-find and add edges in increasing order until there is a path from $A$ to $B$; the last edge added is then the largest weight we have to cross. Notice that this is equivalent to solution 2, but slightly faster. This algorithm also reveals something: only the edges added to the union-find are ever used, so we might as well use only the $N-1$ edges of the minimum spanning tree. This instantly gives us $O(QN\alpha(N)+M\log(M))$. You can read more about union-find and minimum spanning trees here:

- https://cp-algorithms.com/data_structures/disjoint_set_union.html
- https://usaco.guide/gold/mst?lang=cpp

### $O((N+Q)\log(N)+M\log(M))$, solution number 4

We have now reduced our problem to: given a tree, answer $Q$ queries of the form "what is the largest edge on the path from node $A$ to node $B$?". To do this, we can use binary lifting: https://cp-algorithms.com/graph/lca_binary_lifting.html. In addition to what the article does, we also compute $upmax[d][u] = $ the largest edge weight encountered when going up $2^d$ steps from $u$.

![image](https://hackmd.io/_uploads/HJxj7sBrJg.png)

Using this, we can find the path maximum as follows: find the largest edge on the path from $A \rightarrow lca(A,B)$ and the largest edge on the path from $B \rightarrow lca(A,B)$; the answer is the larger of the two.
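In symbols (my notation; $w(u)$ denotes the weight of the edge from $u$ to its parent), the extra table is filled in exactly like the ancestor table:

$$upmax[0][u] = w(u), \qquad upmax[d][u] = \max\left(upmax[d-1][u],\ upmax[d-1][\,up[d-1][u]\,]\right)$$

and a query is answered as the maximum of the two upward path maxima toward $lca(A, B)$.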
```py
import sys
sys.setrecursionlimit(10**6)

# Binary lifting
INF = int(1e18)
MAXN = int(1e5)
up = [[0] * MAXN for _ in range(18)]
upmax = [[-INF] * MAXN for _ in range(18)]
tin = [0] * MAXN
tout = [0] * MAXN
timer = 0

def dfs(u, p, pm, edges):
    global timer
    tin[u] = timer
    timer += 1
    up[0][u] = p
    upmax[0][u] = pm
    for d in range(1, 18):
        mid = up[d - 1][u]
        up[d][u] = up[d - 1][mid]
        upmax[d][u] = max(upmax[d - 1][u], upmax[d - 1][mid])
    for v, weight in edges[u]:
        if v != p:
            dfs(v, u, weight, edges)
    tout[u] = timer
    timer += 1

def is_ancestor(a, b):
    return tin[a] <= tin[b] and tout[a] >= tout[b]

def max_up(a, b):
    # largest edge on the path from a up to lca(a, b)
    if is_ancestor(a, b):
        return -INF
    ret = -INF
    for d in range(17, -1, -1):
        if not is_ancestor(up[d][a], b):
            ret = max(ret, upmax[d][a])
            a = up[d][a]
    return max(ret, upmax[0][a])

def path_max(a, b):
    return max(max_up(a, b), max_up(b, a))

n, m, q = map(int, input().split())
edges = []
for _ in range(m):
    a, b, s = map(int, input().split())
    a -= 1
    b -= 1
    edges.append((s, a, b))

# Minimum spanning tree
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def merge(self, a, b):
        root_a = self.find(a)
        root_b = self.find(b)
        if root_a != root_b:
            if self.size[root_a] < self.size[root_b]:
                root_a, root_b = root_b, root_a
            self.parent[root_b] = root_a
            self.size[root_a] += self.size[root_b]

edges.sort()
uf = UnionFind(n)
adj = [[] for i in range(n)]
for s, a, b in edges:
    if uf.find(a) != uf.find(b):
        uf.merge(a, b)
        adj[a].append((b, s))
        adj[b].append((a, s))

# Precompute binary lifting etc
dfs(0, 0, -INF, adj)

for _ in range(q):
    a, b, f, k, s = map(int, input().split())
    a -= 1
    b -= 1
    mx = path_max(a, b)
    cannot = (mx - s + k - 1) // k
    print(min(f, max(0, f - cannot)))
```

### $O(Q\log^2(N) + M\log(M))$ solution, or something, idk
C++ go brr. We can solve this offline instead: take in all the start and end destinations for each family up front, and sort the edges by weight. We'll do the union-find a bit differently now. For every node, save a pointer to a set of the IDs of all families that either start or end there. When merging two nodes, do the following in addition to the usual union-find bookkeeping:

- Find out which of the two roots has more IDs in the set it points to.
- Iterate through the smaller of the two sets; if an element is already in the bigger set, remove it from both and save the biggest edge added so far as the answer for the family with that ID.
- Elements that are not in both sets are added to the bigger set.

We can view the root of a component as holding the IDs of all families that start or end somewhere in that component. Now we iterate through the edges in increasing order of weight and merge the endpoints in the union-find as described above. Then, for each family, we figure out how many members can go through that path. This technique of merging two sets by inserting the smaller one into the bigger one is a common and powerful one called smaller-to-larger merging: since the set containing any given element at least doubles in size every time it is moved, each element can be moved only logarithmically many times. https://usaco.guide/plat/merging By doing this solution we don't have to do binary lifting on a minimum spanning tree, which could save some time in contest.

```cpp
#include <bits/stdc++.h>
#define vl vector<ll>
using namespace std;
typedef long long ll;

const int N = 1e6 + 1;
ll p[N], z[N], a[N], fs[N];
set<ll> s[N];
vector<vl> ed;
pair<ll, ll> f[N];

ll r(ll x) { return (p[x] == x ? x : p[x] = r(p[x])); }

// merge the components of x and y; returns the IDs of all families whose
// two endpoints became connected by this merge
vl c(ll x, ll y) {
    x = r(x), y = r(y);
    if (x == y) return {};
    if (s[x].size() < s[y].size()) swap(x, y);  // smaller-to-larger
    z[x] += z[y];
    p[y] = x;
    vl v;
    for (auto e : s[y])
        if (s[x].count(e)) {
            v.push_back(e);
            s[x].erase(e);
        } else
            s[x].insert(e);
    return v;
}

int main() {
    ll n, m, q;
    cin >> n >> m >> q;
    for (ll i = 0; i < m; i++) {
        ll u, v, w;
        cin >> u >> v >> w;
        ed.push_back({w, --u, --v});
    }
    iota(p, p + n, 0);
    fill(z, z + n, 1);
    fill(a, a + N, -1);  // -1 marks "endpoints never connected"
    for (ll i = 0; i < q; i++) {
        ll a, b, j, k, t;
        cin >> a >> b >> j >> k >> t;
        --a, --b;
        s[a].insert(i), s[b].insert(i);
        f[i] = {k, t};
        fs[i] = j;
    }
    sort(ed.begin(), ed.end());
    vl o;
    for (auto e : ed) {
        vl v = c(e[1], e[2]);
        for (ll i : v) a[i] = e[0];
    }
    for (ll i = 0; i < q; i++) {
        if (a[i] == -1) {
            o.push_back(-1);
            continue;
        }
        ll c = ceil((double)(a[i] - f[i].second) / f[i].first);
        o.push_back(min(fs[i], max(fs[i] - c, 0LL)));
    }
    for (ll x : o) cout << x << endl;
    return 0;
}
```
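To make the offline technique concrete, here is a minimal, self-contained Python sketch of the same idea (function and variable names are mine, not from the solution above): sort edges by weight, merge components while moving the smaller ID set into the larger, and the first time both endpoints of a query land in the same component, the current edge weight is that query's bottleneck.

```python
def bottleneck_offline(n, edges, queries):
    """edges: list of (w, u, v); queries: list of (a, b) with a != b.
    For each query, returns the minimized largest edge weight on a path
    from a to b, or None if they never become connected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    ids = [set() for _ in range(n)]  # query IDs touching each component
    for i, (qa, qb) in enumerate(queries):
        ids[qa].add(i)
        ids[qb].add(i)
    ans = [None] * len(queries)
    for w, u, v in sorted(edges):  # add edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if len(ids[ru]) < len(ids[rv]):
            ru, rv = rv, ru  # always iterate over the smaller set
        parent[rv] = ru
        for i in ids[rv]:
            if i in ids[ru]:
                ids[ru].remove(i)  # both endpoints connected: w is the answer
                ans[i] = w
            else:
                ids[ru].add(i)
        ids[rv] = set()
    return ans
```

For example, `bottleneck_offline(4, [(3, 0, 1), (1, 1, 2), (5, 0, 3)], [(0, 2), (2, 3)])` returns `[3, 5]`: the path $0$–$2$ needs the weight-3 edge, and reaching node 3 requires the weight-5 edge.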