# Homework5 (HW5)
###### tags: `2020pdsa`
Due: 12/11 21:00 (HW5-1)
Due: 12/15 21:00 (HW5-2)
Judge Open: 12/?
[toc]
## Version
Python: 3.8.5 (Note: no numpy)
Java: openjdk14 (https://openjdk.java.net/projects/jdk/14/)
Platform: Debian
## Our Judge
https://c4lab.bime.ntu.edu.tw:13000
Grading Status:
* AC: ACcepted
* TLE: Time Limit Excess
* WA: Wrong Answer
* RE: Runtime Error (Mostly your program raise some error unexpectly)
Memory: 2G
Handin the file with utf-8 encoding if there are 中文 words inside your code (including comments).
### Template Download
https://cool.ntu.edu.tw/courses/3501/files/folder/Homework_Template
# HW5-1 Cluster
In this homework, you are going to implement a clustering algorithm called `centroid hierarchical clustering algorithm` to hierarchically group N 2-dimensional points in the plane and generate a clustering tree.
In the cluster function, we input those yellow points in Figure 1 and set cluster_num=1 (as demo), you should build a cluster tree as shown in Figure 2, and return the coordinates of the cluster.
Figure 1

Figure 2

Your construction function will receive an list of array(Each array has two elements: x,y). The clustering procedures are described as follows:
* Step 0: Treat each point as a cluster;
* Step 1: Find the nearest pair of clusters (a, b) among the remaining N clusters
* Step 2: Create a new cluster, c, of which the coordinates are the centroid of all the points it contains after merging the clusters a and b;
* Step 3: Delete the two old clusters: a and b;
* Step 4: N = N - 1;
* Step 5: Re-calculate the distance of the new cluster, c, to the other remaining clusters;
* Step 6: go to Step 1 unless N = cluster_num
Output the centroid of those N cluster in the sorted order of (x,y) pairs. (Sorted x in the increaing order. If x is equal, sorted y in the increaing order)
## Hint
Priority Queue
## Template
``` python
from typing import List
import heapq
class Cluster:
def cluster(self, points: List[List[int]], cluster_num: int) -> List[List[float]]:
"""
Cluster the points to cluster_num clusters.
Output the sorted center coordination of those clusters.
"""
return sorted([[0,0]])
if __name__ == "__main__":
print(Cluster().cluster([[0,0], [1,0], [3,0], [0,1]], 2))
# [[0.3333333333333333, 0.3333333333333333], [3, 0]]
# [0,0], [1,0], [0,1] are in same cluster
# [3,0] are in another cluster
print(Cluster().cluster([[0,3], [3,3], [4,7], [9,0], [9,4]], 3))
# [[1.5, 3.0], [4, 7], [9.0, 2.0]]
print(Cluster().cluster([[0,1], [0,2], [3,1], [3,2]], 2))
# [[0.0, 1.5], [3.0, 1.5]]
```
Template for java
``` java
import java.util.List;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.PriorityQueue;
import java.lang.Math;
import edu.princeton.cs.algs4.Point2D;
class Cluster {
public List<double[]> cluster(List<int[]> points, int cluster_num) {
ArrayList<Point2D> p = new ArrayList<Point2D>();
for(int[] i: points) {
p.add(new Point2D(i[0], i[1]));
}
ArrayList<double[]> ans = new ArrayList<double[]>();
ans.add(new double[]{0, 1.5});
ans.add(new double[]{3, 1.5});
return ans;
}
public static void main(String[] args) {
List<double[]> out = new Cluster().cluster(new ArrayList<int[]>(){{
add(new int[]{0,1});
add(new int[]{0,2});
add(new int[]{3,1});
add(new int[]{3,2});
}}, 2));
for(double[] o: out)
System.out.println(Arrays.toString(o));
}
}
```
## TestCase
* `0 <= x,y <= M`, x and y are integers.
* We guarantee the coordinates are unique
`N` is the number of points
* We check your answer by `abs(output - answer) <= 1e-3`
* 2 <= Cluster number <= 5
* We guarantee(?) there is only one solution.
Cases
* case1: 20 points: N <= 10, M <= 20
* case2: 20 points: N <= 500, M < 1000000 Special Case (Three samples)
* case3: 20 points: N <= 100, M <= 100
* case4: 20 points: N <= 400, M <= 1000
* case5: 20 points: N <= 700, M <= 10000
## Time Limit
(We can extend it)
Python: 4s
Java: 500ms
## Modified from
...
# HW5-2 Calendar
Implement a calendar that allows adding events on it.
Events will occupy the time interval `[start, end)` in the calendar, i.e. the time step x is unavailable in `start <= x < end`.
Make sure there are no overlapping events on the calendar. If the event request is violating the rule, please output `False` to reject the request, otherwise return `True` and then book the event in the time interval.
Initally, there is no events in calendar, i.e. It is free for any event.
## Hint
* Segment Tree(Tree interval as node) OR
* Balanced Binary Search Tree(Allow Finding the min or max index) OR
* Interval Search Tree
## Note
For java user, [`TreeMap`](https://docs.oracle.com/javase/7/docs/api/java/util/TreeMap.html) is a powerful data structure
For pythoner, [`sortedcontainers`](http://www.grantjenks.com/docs/sortedcontainers/index.html) is allowed for this homework
This problem is easy though, you can try implement your own RBTree or any methods instead of using `sortedcontainers` packages. You will get more confidence on tree structure if you writen by your own.
Red-Black Tree implementation exmaples:
* Java implemenatation https://algs4.cs.princeton.edu/33balanced/RedBlackBST.java.html
* My implementation https://gist.github.com/linnil1/7db27e38fffa8f95c5443ce2ef608c6f
The performance of python-based RBTree written by yourself is always worse than the packages(Written in C++). I extend the TLE limit because I don't want to punish you when you work hard to implement one.
(My RBTree takes 4s in case2)
## Template
``` python
class Calendar:
def __init__(self):
...
def book(self, start: int, end: int) -> bool:
"""
Book the event where interval started from start(included) to end(excluded)
If availble: book it and return True
else: return False
"""
return True
if __name__ == "__main__":
a = Calendar()
print(a.book(10, 20))
print(a.book(15, 25))
print(a.book(20, 25))
print(a.book(17, 21))
print(a.book(0, 3))
print(a.book(2, 6))
print(a.book(3, 6))
"""
True
False #[15, 20) is unavailable
True
False #[17, 21] is unavailable
True
False #[2, 3) is unavailable
True
"""
a = Calendar()
print(a.book(5, 15))
print(a.book(0, 18))
print(a.book(24, 29))
print(a.book(13, 25))
print(a.book(18, 22))
print(a.book(15, 18))
"""
True
False #[5, 15) is unavailable
True
False #[24, 25] is unavailable
True
True
"""
```
java template
```java
// author: linnil1
import java.util.List;
import java.util.ArrayList;
import java.util.PriorityQueue;
import java.lang.Math;
import edu.princeton.cs.algs4.Point2D;
class Calendar {
public Calendar() {};
public boolean book(int start, int end) {
System.out.println(start);
System.out.println(end);
return true;
}
public static void main(String[] args) {
Calendar calendar = new Calendar();
System.out.println(calendar.book(0, 5));
}
}
```
## TestCase
* `0 <= start, end <= M`, start and end are integer.
`N` is the number of booking events
Cases
* case1: 20 points: N <= 10, M <= 50
* case2: 20 points: N <= 10001, M < 200000 Special Case(Three samples)
* case3: 20 points: N <= 1000, M <= 1000
* case4: 20 points: N <= 20000, M <= 200000
* case5: 20 points: N <= 120000, M <= 1000000
## Time Limit
(We can extend it if needed)
Python: 5s
Java: ?s
## Modified from
https://leetcode.com/problems/my-calendar-i/