# Homework5 (HW5) ###### tags: `2020pdsa` Due: 12/11 21:00 (HW5-1) Due: 12/15 21:00 (HW5-2) Judge Open: 12/? [toc] ## Version Python: 3.8.5 (Note: no numpy) Java: openjdk14 (https://openjdk.java.net/projects/jdk/14/) Platform: Debian ## Our Judge https://c4lab.bime.ntu.edu.tw:13000 Grading Status: * AC: ACcepted * TLE: Time Limit Excess * WA: Wrong Answer * RE: Runtime Error (Mostly your program raise some error unexpectly) Memory: 2G Handin the file with utf-8 encoding if there are 中文 words inside your code (including comments). ### Template Download https://cool.ntu.edu.tw/courses/3501/files/folder/Homework_Template # HW5-1 Cluster In this homework, you are going to implement a clustering algorithm called `centroid hierarchical clustering algorithm` to hierarchically group N 2-dimensional points in the plane and generate a clustering tree. In the cluster function, we input those yellow points in Figure 1 and set cluster_num=1 (as demo), you should build a cluster tree as shown in Figure 2, and return the coordinates of the cluster. Figure 1 ![figure1](https://ceiba.ntu.edu.tw/course/58b7bd/hw/cluster.png) Figure 2 ![figure2](https://ceiba.ntu.edu.tw/course/58b7bd/hw/clusterTree.png) Your construction function will receive an list of array(Each array has two elements: x,y). The clustering procedures are described as follows: * Step 0: Treat each point as a cluster; * Step 1: Find the nearest pair of clusters (a, b) among the remaining N clusters * Step 2: Create a new cluster, c, of which the coordinates are the centroid of all the points it contains after merging the clusters a and b; * Step 3: Delete the two old clusters: a and b; * Step 4: N = N - 1; * Step 5: Re-calculate the distance of the new cluster, c, to the other remaining clusters; * Step 6: go to Step 1 unless N = cluster_num Output the centroid of those N cluster in the sorted order of (x,y) pairs. (Sorted x in the increaing order. If x is equal, sorted y in the increaing order) ## Hint Priority Queue ## Template ``` python from typing import List import heapq class Cluster: def cluster(self, points: List[List[int]], cluster_num: int) -> List[List[float]]: """ Cluster the points to cluster_num clusters. Output the sorted center coordination of those clusters. """ return sorted([[0,0]]) if __name__ == "__main__": print(Cluster().cluster([[0,0], [1,0], [3,0], [0,1]], 2)) # [[0.3333333333333333, 0.3333333333333333], [3, 0]] # [0,0], [1,0], [0,1] are in same cluster # [3,0] are in another cluster print(Cluster().cluster([[0,3], [3,3], [4,7], [9,0], [9,4]], 3)) # [[1.5, 3.0], [4, 7], [9.0, 2.0]] print(Cluster().cluster([[0,1], [0,2], [3,1], [3,2]], 2)) # [[0.0, 1.5], [3.0, 1.5]] ``` Template for java ``` java import java.util.List; import java.util.ArrayList; import java.util.Arrays; import java.util.PriorityQueue; import java.lang.Math; import edu.princeton.cs.algs4.Point2D; class Cluster { public List<double[]> cluster(List<int[]> points, int cluster_num) { ArrayList<Point2D> p = new ArrayList<Point2D>(); for(int[] i: points) { p.add(new Point2D(i[0], i[1])); } ArrayList<double[]> ans = new ArrayList<double[]>(); ans.add(new double[]{0, 1.5}); ans.add(new double[]{3, 1.5}); return ans; } public static void main(String[] args) { List<double[]> out = new Cluster().cluster(new ArrayList<int[]>(){{ add(new int[]{0,1}); add(new int[]{0,2}); add(new int[]{3,1}); add(new int[]{3,2}); }}, 2)); for(double[] o: out) System.out.println(Arrays.toString(o)); } } ``` ## TestCase * `0 <= x,y <= M`, x and y are integers. * We guarantee the coordinates are unique `N` is the number of points * We check your answer by `abs(output - answer) <= 1e-3` * 2 <= Cluster number <= 5 * We guarantee(?) there is only one solution. Cases * case1: 20 points: N <= 10, M <= 20 * case2: 20 points: N <= 500, M < 1000000 Special Case (Three samples) * case3: 20 points: N <= 100, M <= 100 * case4: 20 points: N <= 400, M <= 1000 * case5: 20 points: N <= 700, M <= 10000 ## Time Limit (We can extend it) Python: 4s Java: 500ms ## Modified from ... # HW5-2 Calendar Implement a calendar that allows adding events on it. Events will occupy the time interval `[start, end)` in the calendar, i.e. the time step x is unavailable in `start <= x < end`. Make sure there are no overlapping events on the calendar. If the event request is violating the rule, please output `False` to reject the request, otherwise return `True` and then book the event in the time interval. Initally, there is no events in calendar, i.e. It is free for any event. ## Hint * Segment Tree(Tree interval as node) OR * Balanced Binary Search Tree(Allow Finding the min or max index) OR * Interval Search Tree ## Note For java user, [`TreeMap`](https://docs.oracle.com/javase/7/docs/api/java/util/TreeMap.html) is a powerful data structure For pythoner, [`sortedcontainers`](http://www.grantjenks.com/docs/sortedcontainers/index.html) is allowed for this homework This problem is easy though, you can try implement your own RBTree or any methods instead of using `sortedcontainers` packages. You will get more confidence on tree structure if you writen by your own. Red-Black Tree implementation exmaples: * Java implemenatation https://algs4.cs.princeton.edu/33balanced/RedBlackBST.java.html * My implementation https://gist.github.com/linnil1/7db27e38fffa8f95c5443ce2ef608c6f The performance of python-based RBTree written by yourself is always worse than the packages(Written in C++). I extend the TLE limit because I don't want to punish you when you work hard to implement one. (My RBTree takes 4s in case2) ## Template ``` python class Calendar: def __init__(self): ... def book(self, start: int, end: int) -> bool: """ Book the event where interval started from start(included) to end(excluded) If availble: book it and return True else: return False """ return True if __name__ == "__main__": a = Calendar() print(a.book(10, 20)) print(a.book(15, 25)) print(a.book(20, 25)) print(a.book(17, 21)) print(a.book(0, 3)) print(a.book(2, 6)) print(a.book(3, 6)) """ True False #[15, 20) is unavailable True False #[17, 21] is unavailable True False #[2, 3) is unavailable True """ a = Calendar() print(a.book(5, 15)) print(a.book(0, 18)) print(a.book(24, 29)) print(a.book(13, 25)) print(a.book(18, 22)) print(a.book(15, 18)) """ True False #[5, 15) is unavailable True False #[24, 25] is unavailable True True """ ``` java template ```java // author: linnil1 import java.util.List; import java.util.ArrayList; import java.util.PriorityQueue; import java.lang.Math; import edu.princeton.cs.algs4.Point2D; class Calendar { public Calendar() {}; public boolean book(int start, int end) { System.out.println(start); System.out.println(end); return true; } public static void main(String[] args) { Calendar calendar = new Calendar(); System.out.println(calendar.book(0, 5)); } } ``` ## TestCase * `0 <= start, end <= M`, start and end are integer. `N` is the number of booking events Cases * case1: 20 points: N <= 10, M <= 50 * case2: 20 points: N <= 10001, M < 200000 Special Case(Three samples) * case3: 20 points: N <= 1000, M <= 1000 * case4: 20 points: N <= 20000, M <= 200000 * case5: 20 points: N <= 120000, M <= 1000000 ## Time Limit (We can extend it if needed) Python: 5s Java: ?s ## Modified from https://leetcode.com/problems/my-calendar-i/