# Week 5 - Sets ## Team Team name: **Group [497441]** Date: 01/08/2021 Members: Max Thielen | Role | Name | | - | - | | **Facilitator** keeps track of time, assigns tasks and makes sure all the group members are heard and that decisions are agreed upon. | Max Thielen | | **Spokesperson** communicates group’s questions and problems to the teacher and talks to other teams; presents the group’s findings. | Max Thielen | | **Reflector** observes and assesses the interactions and performance among team members. Provides positive feedback and intervenes with suggestions to improve groups’ processes. | Max Thielen | | **Recorder** guides consensus building in the group by recording answers to questions. Collects important information and data. | Max Thielen | ## Activities ### Activity 1: Set membership examples * checking if a person is able to access a dataset * checking whether a person belongs to a family * checking for class prerequsites * checking for membership in a club * checking for participation in a newsletter ### Activity 2: Comparing floating point numbers ```c= int main() { double a = 41.0 + 0.5 + 0.2 + 0.2 + 0.1; double b = 43.0 - 0.5 - 0.2 - 0.2 - 0.1; _Bool equal = (double a == double b); printf("a=%.20f\n", a); printf("b=%.20f\n", b); printf("a and b are %s", equal? "equal" : "not equal"); return 0; } ``` ![](https://i.imgur.com/NUGfBPL.png =333x75) Due to the tiny rounding error occuring in the first implementation, the best solution for this problem is to take the absolute value of the differance of the two double varibales and comapare this difference to 1e-9. This tiny number will ensure that the values are still very very close together (almost if not equal) however will not fail due to a rounding error. ```c= _Bool equals_dbl (double d1, double d2){ if (abs(d1 - d2) < 1e-9) return true; else return false; } ``` ### Activity 3: Comparing compound data types ```c= _Bool equals_prs(const void* obj1, const void* obj2){ const Person* p1 = (const Person*)obj1; const Person* p2 = (const Person*)obj2; if(p1->yearOfBirth==p2->yearOfBirth&&!strcmp(p1->name,p2->name)) return true; return false; } ``` ### Activity 4: Function prototypes ```c= typedef struct { double* data; int size; // number of items in this set int capacity; // size of the `data` array } Set; _Bool contains(Set* set, double value); _Bool add(Set* set, double value); _Bool sub(Set* set, double value); ``` ### Activity 5: Function implementations ```c= _Bool contains(Set* set, double value){ for(int i=0; i<set->size; i++){ if(set->data[i]==value) return true; } return false; } _Bool add(Set* set, double value){ if(contains(set, value) || set==NULL || set->size==DEFAULT_CAP) return false; if(set->size<set->capacity) set->data[set->size] = value; else{ set = realloc(set,set->capacity+1); set->data[set->size] = value; } set->size++; return true; } _Bool sub(Set* set, double value){ if(!contains(set, value)) return false; for(int i=0; i<set->size; i++){ if(set->data[i]==value){ set->size--; for (int j = i; j < set->size; j++) { set->data[j] = set->data[j+1]; } return true; } } } ``` ### Activity 6: Time complexity of remove and add * **add()**: the function adding values to the set calls on the contains() function going through the entire set one to check for a douplicate value this alone put the function at O(n). However, in order to expand the set in the case that there is not enough capacity the realloc() functions is used which is rather expensive. This is why I implemented a case where realloc adds space for five more values instead of one * **sub()**: the functions removing values from the set calls on the contains() function as well. If it turns out the number is not even in the set the function would return false and end with O(n). However, if the value is in the set the function uses a for loop to find the index of the value and then has to move all of the other values to the left with another for loop making it approximitly O(n^2) in the worst case. ### Activity 7: Binary search ```c= bool contains(Set* set, double needle){ int lo = 0, hi = set->size - 1; while (lo <= hi){ int mid = lo + (hi - lo) / 2; if (set->data[mid] < needle) lo = mid + 1; else if (set->data[mid] > needle) hi = mid - 1; else return true; } return false; } ``` ![](https://i.imgur.com/LCv0mu0.png) ![](https://i.imgur.com/bmSuxtX.png =125x25) ![](https://i.imgur.com/s653BQL.png) The binary search breaks the array into smaller chuncks, each time creating a smaller search range giving it **O(log n)** run time much like the merge sort method. ### Activity 8: Implementing indexOf ```c= int indexOf(Set* set, double needle){ // a few special cases if (set->size == 0) return 0; if (set->size == 1) return (set->data[0] > needle)? 0 : 1; if (set->data[set->size-1]<needle) return set->size; int lo = 0, hi = set->size - 1, mid = lo + (hi - lo) / 2; while (lo <= hi){ mid = lo + (hi - lo) / 2; if (set->data[mid] < needle) lo = mid + 1; else if (set->data[mid] > needle) hi = mid - 1; else return mid; } return (set->data[mid] > needle)? mid : mid+1; } ``` ### Activity 9: Functions add and remove for the sorted array ```c= _Bool add(Set* set, double value){ if(contains(set, value) || set==NULL || set->size==DEFAULT_CAP) return false; int index = indexOf(set, value); if(set->size<set->capacity){ for(int i=set->size+1; i>index; i--){ set->data[i] = set->data[i-1]; } set->data[index] = value; } else{ if(set->size+10<DEFAULT_CAP) set = realloc(set,set->capacity+10); else set = realloc(set,set->capacity+1); for(int i=set->size+1; i>index; i--){ set->data[i] = set->data[i-1]; } set->data[index] = value; } set->size++; return true; } _Bool sub(Set* set, double value) { if (!contains(set, value)) return false; set->size--; for (int i = indexOf(set, value); i < set->size; i++) { set->data[i] = set->data[i + 1]; } return true; } ``` ### Activity 10: Time complexity of the binary search implementation The time complexity it reduced to O(logn) due to the binary search help. ## Look back ### What we've learnt Using array as a set and applying binary search to improve run time. ### What were the surprises The greater complexity of add after improving time complexity. ### What problems we've encountered Returning the right index for indexOf function. ### What was or still is unclear N/A ### How did the group perform? N/A