# Ch3.1 Greedy Algorithm

:::info
A **greedy algorithm** constructs a solution by making a sequence of choices: you're trying to build some big object, and you do it by making one decision at a time.
:::

At each step, it selects the option that appears to be the most **locally optimal** (the best choice right now) without regard for future consequences. The process repeats until a complete solution is built. A greedy algorithm is only guaranteed to yield the **globally optimal solution** for problems that possess the greedy choice property and optimal substructure. For many other problems, the greedy solution will be suboptimal.

![greedy-minimum](https://hackmd.io/_uploads/H1t8g3FZ-g.jpg)

:::info
Build a solution step by step, hoping it leads to a globally optimal solution.
:::

# 3.1.1 UMPIRE

**Understand**
- Make a greedy choice
- Prove that it is a safe choice
- Reduce to a subproblem
- Solve the subproblem and estimate the runtime

**Match**
- Which sort order is convenient? (The greedy move can be faster after sorting.)

**Plan**
- Put the max digit first
- Choose the patient with the minimum treatment time
- Cover the leftmost point
- Take the item with the maximum value per unit of weight

**Evaluate**
- A choice is called safe if there is an optimal solution consistent with this first choice
- Not all first choices are safe
- Greedy choices are often unsafe

# 3.1.2 Core Concepts

### A. Find the "Safe Choice" with Iteration (Proof by Exchange)

A choice is "safe" if there is an optimal solution consistent with this first choice. To prove a greedy choice is safe, we often use an **exchange argument**.

**Example 1: Naive Iteration**

:::info
Find the minimum, remove it, and repeat.
:::

> If I claim I have the perfect schedule (optimal), is it allowed for a SLOW patient ($t_1$) to be immediately in front of a FAST patient ($t_2$)?

**The Scenario**: Let $t_1 = 60$ mins (slow) and $t_2 = 5$ mins (fast).

**Scenario A (Slow before Fast)**:
- Patient 1 waits 0 mins.
- Patient 2 waits 60 mins.
- Total cost contribution: $0 + 60 = \mathbf{60 \text{ mins}}$.

**Scenario B (Fast before Slow)**:
- Patient 2 waits 0 mins.
- Patient 1 waits 5 mins.
- Total cost contribution: $0 + 5 = \mathbf{5 \text{ mins}}$.

**The Conclusion**: No, an optimal arrangement can never place a slower patient ($t_1$) directly in front of a faster one ($t_2$).
- Why? If you ever have a slow person in front of a fast person, you can swap them to reduce the total waiting time.
- Since the schedule can be improved, the original arrangement was never "optimal" to begin with.
- Therefore: treating the minimum time first is always the safe choice.

**Example 2: Minimum Waiting Time**

$$\text{MinWait}(n \text{ patients}) = (n-1) \times t_{min} + \text{MinWait}(n-1 \text{ patients})$$

(The logic: the shortest task $t_{min}$ is picked first, causing the other $n-1$ people to wait for it. Then we solve for the rest.)
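To make the exchange argument concrete, here is a small brute-force check (my own sketch, not part of the lecture): it tries every possible schedule for three patients and confirms that treating them in ascending order of treatment time minimizes the total waiting time.

```python
from itertools import permutations

def total_waiting_time(order):
    # Each patient waits for the total treatment time of everyone before them.
    wait, elapsed = 0, 0
    for t in order:
        wait += elapsed
        elapsed += t
    return wait

times = [60, 5, 25]

# Brute force: try every ordering of the patients.
best = min(permutations(times), key=total_waiting_time)

# Greedy: shortest treatment time first.
greedy = tuple(sorted(times))

assert total_waiting_time(greedy) == total_waiting_time(best)
print(greedy, total_waiting_time(greedy))  # (5, 25, 60) 35
```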
### B. Case Studies

**Celebration Party Problem**

Organize children into the minimum possible number of groups such that the ages of any two children in the same group differ by at most two years.

Naive search runtime: O(2^n^)

Greedy search:
- Safe choice: Imagine each child as a dot on a number line representing their age. Cover the leftmost point with a segment of length 2 that starts at this point.
- Iteration: Remove all the points within the segment, then solve the same problem on the remaining points.
- Proof:
    - The setup: Imagine a hypothetical "optimal solution" that covers $x_{min}$ with a segment that starts before $x_{min}$ (e.g., at $x_{min} - 0.5$).
    - The observation: Since $x_{min}$ is the leftmost point, the space to its left is empty. There are no points there.
    - The swap (the slide): If you slide that segment to the right so it starts exactly at $x_{min}$, you lose coverage only on the empty left side (which doesn't matter) and gain coverage on the right side.
    - Conclusion: This new "slid" segment covers everything the old one did, plus potentially more. Therefore, this specific greedy choice is just as good as, if not better than, any other placement.

# 3.1.3 Optimized Approach: Sort Before Iterating

### MinTotalWaitingTime(t, n)

![image](https://hackmd.io/_uploads/BkVwpu6MZl.png)

### Pseudocode (Naive): Selection Sort

```python
# Selection-sort style: scan the entire remaining list to select the single
# smallest item, process it, and then repeat.
# Example: 3 tasks with processing times t = [15, 10, 20]

waitingTime <- 0
treated <- array of n zeros        # treatment status, initially [0, 0, 0]
schedule <- empty array

for i from 1 to n:                 # outer loop: one pass per task
    t_min <- +∞                    # infinity, so the first comparison always updates
    minIndex <- 0                  # 1-indexed; 0 means no index found yet
    for j from 1 to n:             # inner loop: j scans every candidate
        if treated[j] == 0 and t[j] < t_min:
            t_min <- t[j]
            minIndex <- j
    waitingTime <- waitingTime + (n - i) * t_min   # people left in line * time the current task takes,
                                                   # e.g. first pass: 2 * 10 = 20 mins of waiting added
    treated[minIndex] <- 1         # after the first pass: treated = [0, 1, 0]
    schedule.append(minIndex)

return waitingTime, schedule
```

Runtime: O(n^2^)

### Pseudocode (Faster): Merge Sort

```python
# Merge sort / insertion sort hybrid (Timsort)

# Step 1: create pairs of (time, original index) to preserve the IDs
tasks <- [(t[1], 1), (t[2], 2), (t[3], 3), ..., (t[n], n)]

# Step 2: sort the array by time, ascending: O(n log n)
sort tasks by time (ascending)

# Step 3: calculate the cost: O(n)
waitingTime <- 0
schedule <- array of size n
for i from 1 to n:
    current_task <- tasks[i]
    time_val <- current_task.time
    original_id <- current_task.index
    schedule[i] <- original_id
    waitingTime <- waitingTime + (n - i) * time_val

return waitingTime, schedule
```

Runtime: O(n log n)
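For reference, here is a minimal runnable Python version of the sorted approach (a sketch using Python's built-in Timsort and 0-based indexing; the pseudocode above is 1-indexed):

```python
def min_total_waiting_time(t):
    n = len(t)
    # Sort task indices by processing time, keeping the original IDs.
    order = sorted(range(n), key=lambda i: t[i])
    waiting_time = 0
    for pos, task_id in enumerate(order):
        # (n - 1 - pos) tasks are still waiting in line while this one runs.
        waiting_time += (n - 1 - pos) * t[task_id]
    return waiting_time, order

print(min_total_waiting_time([15, 10, 20]))  # (35, [1, 0, 2])
```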
# 3.1.4 PointsCoverSorted(x~1~, ..., x~n~)

Assume x~1~ ≤ x~2~ ≤ ... ≤ x~n~. Check and build segments, updating the position. What am I checking?
1. Are there any kids left standing outside who need a group?
2. Does the next kid in line fit inside the current group?

```python
# Naive, recursive: try every valid "last member" of the first group, then recurse
PointsCoverSorted_naive(start_index):
    if start_index > n:
        return 0
    min_groups = ∞
    for j from start_index to n:
        if x[j] - x[start_index] <= 2:
            cost = 1 + PointsCoverSorted_naive(j + 1)
            if cost < min_groups:
                min_groups = cost
        else:
            break
    return min_groups
```

Runtime: O(2^n^) (recursive: a branching tree at every step)

```python=
# Faster PointsCoverSorted
segments <- empty list
left <- 1
while left ≤ n:
    (l, r) <- (x[left], x[left] + 2)
    segments.append((l, r))
    left <- left + 1
    while left ≤ n and x[left] ≤ r:
        left <- left + 1
return segments
```

Runtime: O(n) for the scan (the variable `left` only moves forward, from 1 to n), plus O(n log n) if the points must be sorted first.

# 3.1.5 Maximizing Loot / Knapsack Problem

```python=
# Naive approach

# BestItem(w_1, v_1, ..., w_n, v_n)  -- O(n) per call
maxValuePerWeight <- 0
bestItem <- 0
for i from 1 to n:
    if w_i > 0:
        if v_i / w_i > maxValuePerWeight:
            maxValuePerWeight <- v_i / w_i
            bestItem <- i
return bestItem

# KnapSack(W, w_1, v_1, ..., w_n, v_n)  -- calls BestItem repeatedly
# T: quadratic, O(n^2) = n (steps) x n (work per step)
# W: remaining capacity
# a: amount taken of the current item
amounts <- [0, 0, 0, ...]
totalValue <- 0
repeat n times:
    if W = 0:
        return (totalValue, amounts)
    i <- BestItem(w_1, v_1, ..., w_n, v_n)
    a <- min(W, w_i)
    totalValue <- totalValue + a * (v_i / w_i)
    w_i <- w_i - a
    amounts[i] <- amounts[i] + a
    W <- W - a
return (totalValue, amounts)
```

```python=
# KnapsackFast(W, w_1, v_1, ..., w_n, v_n)
# Sort all items by v_i / w_i first, then do a linear scan
# T: O(n log n) [sorting] + O(n) [scanning] = O(n log n)
# Assume v_1/w_1 ≥ v_2/w_2 ≥ ... ≥ v_n/w_n
#
# amounts <- [0, 0, 0, ...]
# totalValue <- 0
# for i from 1 to n:
#     if W = 0:
#         return (totalValue, amounts)
#     a <- min(W, w_i)
#     totalValue <- totalValue + a * (v_i / w_i)
#     w_i <- w_i - a
#     amounts[i] <- amounts[i] + a
#     W <- W - a
# return (totalValue, amounts)

# class: blueprint of objects
# here the class is used as a zip tie to bind the three lists: weights, values and indices
class Item:
    def __init__(self, value, weight, original_index):
        self.value = value
        self.weight = weight
        self.ind = original_index
        self.ratio = value / weight


def knapsack_fast(capacity, weights, values):
    n = len(weights)
    items = []
    for i in range(n):
        items.append(Item(values[i], weights[i], i))

    # Sort by value per unit of weight, best first
    items.sort(key=lambda x: x.ratio, reverse=True)

    # Initialize
    amounts = [0] * n
    total_value = 0

    for item in items:
        if capacity == 0:
            return total_value, amounts
        # a: how much of this item can we take?
        a = min(capacity, item.weight)
        total_value += a * item.ratio
        amounts[item.ind] += a
        capacity -= a

    return total_value, amounts
```

# 3.1.6 Proving Correctness of Greedy Algorithms

Practice: Largest Concatenate - a New Greedy Rule

```python=
# Understand
#   Input: [2, 21]
#   Goal: maximize the concatenated string.
#   Issue: 9 > 91 is false numerically, but "991" > "919", so 9 must still come first.
#          We need a custom sort order.

# Match
#   Sorting with a custom comparator.

# Plan: the new comparison rule
#   function compare(x, y):
#       return -1 if x + y > y + x   # x goes first
#       return  1 if y + x > x + y   # y goes first
#       return  0 if x + y = y + x

# T: O(n log n * k), where k is the max length of a string
```
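A minimal runnable version of this comparator rule, using Python's `functools.cmp_to_key` (a sketch; the function name `largest_concatenate` and the leading-zero guard are my own additions):

```python
from functools import cmp_to_key


def largest_concatenate(numbers):
    # Order two digit-strings by which concatenation is larger.
    def compare(x, y):
        if x + y > y + x:
            return -1  # x should come first
        if y + x > x + y:
            return 1   # y should come first
        return 0

    digits = sorted(map(str, numbers), key=cmp_to_key(compare))
    result = "".join(digits)
    # Inputs like [0, 0] would otherwise produce "00".
    return result.lstrip("0") or "0"


print(largest_concatenate([21, 2]))          # 221
print(largest_concatenate([9, 4, 6, 1, 9]))  # 99641
```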
# 3.1.7 Practice Sets - Greedy Algorithm

## 1. Money Change

![image](https://hackmd.io/_uploads/r1efxRNqEWl.png)

```python=
def change(money):
    # UMPIRE
    # U: input: an integer; output: the minimum number of coins with denominations 1, 5, 10
    # M: greedy -> integer division and modulo, largest denomination first
    # R/E: O(1) (a constant number of divisions)
    count = 0
    count += money // 10
    money %= 10
    count += money // 5
    money %= 5
    count += money
    return count


if __name__ == '__main__':
    m = int(input())
    print(change(m))

# input: 28
# output: 6
```

## 2. Maximum Value of the Loot

![image](https://hackmd.io/_uploads/H1AVRVcVWx.png)

```python
from sys import stdin

# How to run: Get-Content input.txt | poetry run python fractional_knapsack.py

def optimal_value(capacity, weights, values):
    # U: output: maximum total value of the loot
    #    input: 1st line: number of compounds n and the knapsack capacity;
    #           each of the n following lines: value and weight of a compound
    # M: greedy, sort by v_i / w_i
    # P: create an array of items [(v_i / w_i, w_i, v_i)]
    # P: sort descending by ratio: sort(reverse=True), key=lambda x: x[0]
    # P: for each item: amount = min(weight, capacity)
    # P: read stdin as one giant string, split into tokens -> int -> store in a list
    #    "3 50 60 ..." -> (3, 50, 60, ...)
    # R/E: T: O(n log n)
    value = 0.
    items = []
    for i in range(len(weights)):
        if weights[i] != 0:
            items.append((values[i] / weights[i], weights[i], values[i]))

    # Sort by ratio, descending
    items.sort(key=lambda x: x[0], reverse=True)

    for ratio, weight, val in items:
        if capacity == 0:
            return value
        # Take as much of the current item as possible
        amount = min(weight, capacity)
        value += amount * ratio
        capacity -= amount

    return value


if __name__ == "__main__":
    data = list(map(int, stdin.read().split()))
    n, capacity = data[0:2]
    values = data[2:(2 * n + 2):2]
    weights = data[3:(2 * n + 2):2]
    opt_value = optimal_value(capacity, weights, values)
    print("{:.10f}".format(opt_value))

# input:
# 3 50
# 60 20
# 100 50
# 120 30
# output:
# 180.0000000000
```

## 3. Car Fueling

![image](https://hackmd.io/_uploads/r1U9CVqV-e.png)

```python=
from sys import stdin


def min_refills(distance, tank, stops):
    # Greedy rule: from the last refill point, drive as far as possible,
    # and refill only at the last reachable stop.
    # Add the start (0) and the destination (distance) to the stops,
    # e.g. [0] + [200, 300] + [1000]
    stops = [0] + stops + [distance]
    num_refills = 0
    current_refill = 0
    while current_refill <= len(stops) - 2:
        last_refill = current_refill
        # Go as far as possible on the current tank
        while (current_refill <= len(stops) - 2 and
               stops[current_refill + 1] - stops[last_refill] <= tank):
            current_refill += 1
        if current_refill == last_refill:
            return -1  # Impossible to reach the next stop
        if current_refill <= len(stops) - 2:
            num_refills += 1
    return num_refills


if __name__ == '__main__':
    d, m, _, *stops = map(int, stdin.read().split())
    print(min_refills(d, m, stops))

# input:
# 950
# 400
# 4
# 200 375 550 750
# output:
# 2
```

## 4. Maximum Advertisement Revenue

![image](https://hackmd.io/_uploads/B1nEkrcN-e.png)

```python
# from itertools import permutations

# U: input: n and two sequences / output: the maximum dot product
# M: greedy -> sort both sequences in ascending order and pair them up
# R/E: T: O(n log n)

# Naive: try every pairing
# def max_dot_product(first_sequence, second_sequence):
#     max_product = 0
#     for permutation in permutations(second_sequence):
#         dot_product = sum(first_sequence[i] * permutation[i] for i in range(len(first_sequence)))
#         max_product = max(max_product, dot_product)
#     return max_product


def max_dot_product(first_sequence, second_sequence):
    first_sequence.sort()
    second_sequence.sort()
    return sum(f * s for f, s in zip(first_sequence, second_sequence))


if __name__ == '__main__':
    n = int(input())
    prices = list(map(int, input().split()))
    clicks = list(map(int, input().split()))
    assert len(prices) == len(clicks) == n
    print(max_dot_product(prices, clicks))

# input:
# 3
# 2 3 9
# 7 4 2
# output:
# 79
```
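Why is sorting both sequences in the same order safe? A two-element exchange argument (my own sketch, not from the lecture): suppose $a_i < a_j$ but the smaller price $a_i$ is paired with the larger click count, $b_i > b_j$. Swapping the two click counts changes the revenue by

$$(a_i b_j + a_j b_i) - (a_i b_i + a_j b_j) = (a_j - a_i)(b_i - b_j) > 0,$$

so any pairing with such a mismatched pair can be strictly improved, and the only arrangement with no mismatched pair left is the one where both sequences are sorted in the same order.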
## 5. Covering Segments

![image](https://hackmd.io/_uploads/B1WqkBqVbx.png)

```python
from sys import stdin
from collections import namedtuple

# U: input: the number of segments, then each segment's start and end coordinates
# M: greedy - sort all segments by their right end (ascending),
#    take the right end of the first segment as the current point,
#    then, whenever the next segment starts after the current point,
#    take that segment's right end as a new point
# P: read all input as one string, split, convert to integers (map())
# P: n, *data - the first integer is n, the rest are the data
# P: segments: [(1, 3), (2, 5)]

Segment = namedtuple('Segment', 'start end')


def optimal_points(segments):
    segments.sort(key=lambda s: s.end)
    points = []
    current_point = segments[0].end
    points.append(current_point)
    for s in segments:
        if current_point < s.start:
            current_point = s.end
            points.append(current_point)
    return points


if __name__ == '__main__':
    input = stdin.read()
    n, *data = map(int, input.split())
    segments = list(map(lambda x: Segment(x[0], x[1]), zip(data[::2], data[1::2])))
    points = optimal_points(segments)
    print(len(points))
    print(*points)

# input:
# 3
# 1 3
# 2 5
# 3 6
# output:
# 1
# 3
```
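To sanity-check the greedy rule against an exhaustive search, here is a small cross-check (my own sketch, not part of the assignment). `greedy_points` re-implements the same choose-the-right-endpoint rule as `optimal_points` above in a standalone form, and `brute_force_points` tries every subset of segment endpoints, smallest first.

```python
import random
from collections import namedtuple
from itertools import combinations

Segment = namedtuple('Segment', 'start end')


def greedy_points(segments):
    # Same rule as optimal_points above: sort by right end, always pick right ends.
    segments = sorted(segments, key=lambda s: s.end)
    points = [segments[0].end]
    for s in segments:
        if points[-1] < s.start:
            points.append(s.end)
    return points


def brute_force_points(segments):
    # An optimal answer can always be built from segment endpoints only.
    candidates = sorted({c for s in segments for c in (s.start, s.end)})
    for k in range(1, len(candidates) + 1):
        for pts in combinations(candidates, k):
            if all(any(s.start <= p <= s.end for p in pts) for s in segments):
                return list(pts)
    return []


for _ in range(100):
    segs = [Segment(start, start + random.randint(0, 5))
            for start in (random.randint(0, 10) for _ in range(random.randint(1, 6)))]
    assert len(greedy_points(segs)) == len(brute_force_points(segs))
print("greedy matches brute force on 100 random tests")
```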