# Week 3 - Sorting and Searching
https://code-with-me.jetbrains.com/qyZvU4NSOoKIk243hLZZsA#p=CL&fp=F2320E0BD580A949D185641B942CAD82A0E1C9ABF7D35751872FF36CEDFAEB10
https://blog.jetbrains.com/blog/2020/09/28/code-with-me-eap/
## Team 3
Team name: Team 3
Date: 05-03-2021
Members
Anton Vorderman
Esther Belinfante
Emilio Hodge
| Role | Name |
| ------------------------------------------------------------ | ---- |
| **Facilitator** keeps track of time, assigns tasks and makes sure all the group members are heard and that decisions are agreed upon. | Emilio |
| **Spokesperson** communicates group’s questions and problems to the teacher and talks to other teams; presents the group’s findings. | Anton |
| **Reflector** observes and assesses the interactions and performance among team members. Provides positive feedback and intervenes with suggestions to improve groups’ processes. | Emilio |
| **Recorder** guides consensus building in the group by recording answers to questions. Collects important information and data. | Esther |
## Activities
### Activity 1: Purpose of sorted lists
Record your answer here:
Dan kun je gerichter in de lijst zoeken omdat je weet dat deze lijst gesorteerd is.
**Voorbeelden:**
Telefoonboek
Muziek playlist
Bestanden op je computer
### Activity 2: Explanation of selection sort
Record your answer here:
https://www.youtube.com/watch?v=g-PGLbMth_g&ab_channel=MichaelSambol
### Activity 3: Performance of selection sort
Record your answer here:
Hoe meer elementen je hebt, hoe langer het gaat duren.
Time = O*n^2
### Activity 4: Implement selection sort
Record your answer here:
```c=
void selection_sort(Array *pArray) {
Data *temp = malloc(sizeof(Data) * pArray->capacity);
for (int x = 0; x < pArray->count; x++) {
int z = x;
for (int y = x+1; y < pArray->count; y++) {
if (pArray->values[y].popularity < pArray->values[z].popularity) {
z = y;
}
}
*temp = pArray->values[x];
pArray->values[x] = pArray->values[z];
pArray->values[z] = *temp;
}
free(temp);
}
```
### Activity 5: Explanation of merge sort
Record your answer here:
https://www.youtube.com/watch?v=JSceec-wEyw&ab_channel=GeeksforGeeks
### Activity 6: Merge sort - step by step
Record your answer here:

### Activity 7: Implement the merge function
Record you answer here:
```c=
// merges the two adjacent sub-arrays, which start at indices
// iA and (iA + nA), and have lengths at most nA,
// into the larger array pTarget.
// returns the number of merged elements.
int merge(Array *pArray, Data *pTarget, int iA, int nA)
{
int indexA = iA, indexB = iA + nA;
int nB = nA;
if (pArray->count < indexB + nB)
nB = pArray->count - indexB;
int target_idx = indexA;
while (indexA < iA + nA && indexB < iA + nA + nB){
// compare pA[indexA] and pB[indexB]
// if A > B
if (compare(&pArray->values[indexA], &pArray->values[indexB]) > 0) {
// write value from first array to target
pTarget[target_idx] = pArray->values[indexB];
indexB++;
}
// if A <= B
else {
// write value from second array to target
pTarget[target_idx] = pArray->values[indexA];
indexA++;
}
target_idx++;
}
// copy remaining elements of the first array
while (indexA < iA + nA)
pTarget[target_idx++] = pArray->values[indexA++];
// copy remaining elements of the second array
while (indexB < iA + nA + nB)
pTarget[target_idx++] = pArray->values[indexB++];
// return size of merged array
return nA + nB;
}
```
### Activity 8: Implement the divide-and-conquer algorithm
Record your answer here:
```c=
void merge_sort(Array *pArray) {
// initially, each element is a single array
int numArrays = pArray->count;
int arraySize = 1;
// create a temporary space for holding the merged arrays
Data *pWork = malloc(sizeof(Data) * pArray->capacity);
// merge the arrays until one sorted array is left
while (numArrays > 1)
{
// compute the number of pairs
const int numPairs = numArrays / 2;
// number of elements remaining to be sorted
int remaining = pArray->count;
// merge each pair of consecutive (sub)arrays into a single array
for (int i = 0; i < numPairs; i++)
{
// call merge(pArray, pWork, 2 * i * arraySize, arraySize) to merge the two sorted (sub)arrays into a sorted larger array
// decrease the number of remaining elements by the number of merged elements
//[complete the code!]
remaining -= merge(pArray, pWork, 2 * i * arraySize, arraySize);
//printf("remaining: %d", remaining);
}
// copy the unmerged values
merge(pArray, pWork, 2 * numPairs * arraySize, remaining);
// update numArrays: divide by 2 and round up
numArrays = (numArrays + 1) / 2;
// double the arraySize
arraySize = arraySize * 2;
// swap working space with array
Data *tmp = pArray->values;
pArray->values = pWork;
pWork = tmp;
}
// free the working space
free(pWork);
}
```
### Activity 9: Test merge sort
Gaat helemaal goed, ook de lijst van 100000
### Activity 10: Binary search - step by step

### Activity 11: Implement binary search
```c=
int binarySearch(const Array *pArray, const Data *pValue)
{
int lpos = 0;
int rpos = pArray->count;
while (lpos < rpos)
{
// search the interval [lpos, rpos)
// (lpos is included in the interval, rpos is not)
// determine mid position in [lpos, rpos)
int mid = (lpos + rpos) / 2;
// compare element at middle position against value searched for
int ordering = compare(&pArray->values[mid], pValue);
if (ordering > 0)
{
// must continue search in interval [lpos, mid)
rpos = mid;
}
else if (ordering < 0)
{
// must continue search in interval [mid + 1, rpos)
lpos = mid+1;
}
else
{
// value is found at index mid!
return mid;
}
}
// we didn't find the value, so stop and indicate
// to caller by returning -1.
return -1;
}
```
### Activity 12: Test binary search
```c=
int main()
{
Array array;
int sorted_ok;
Data *search_for;
array_init(&array, 100);
if (!parse_csv(&array, "C:\\Users\\Anton\\CLionProjects\\Algorithms & Datastructures\\untitled\\cmake-build-debug\\data_1000.csv"))
{
return 1;
}
printf("%d records read\n", array.count);
// TODO: sort the array, based on some criterion, using either selection sort or merge sort
//selection_sort(&array);
merge_sort(&array);
sorted_ok = 1;
printf("Contents after sorting:\n");
printf("[0] ");
data_print(&array.values[0]);
for (int i = 1; i < array.count; i++)
{
printf("[%d] ", i);
data_print(&array.values[i]);
if (compare(&array.values[i - 1], &array.values[i]) > 0)
sorted_ok = 0;
}
if (sorted_ok)
printf("\nData is sorted correctly!\n");
else
printf("\nData is *NOT* sorted correctly!\n");
printf("Select a popularity you want to search for (between 0 and 100):\n");
scanf("%d", &search_for->popularity);
int index_found = binarySearch(&array, search_for);
if(index_found != -1){
printf("Popularity found at index:%d\n", index_found);
}
else{
printf("popularity wasn't found\n");
}
return 0;
}
```
### Activity 13: Time complexity
findMax = O(n)
findMax2 = O(n^2)
### Activity 14: Compare merge sort and selection sort
| |Merge | Selection |
| --------------------------|---------------- | ---------------- |
|Data_10|0,13 ms|0,20 ms|
|Data_1000|0,33 ms|1,29 ms|
|Data_5000|1,22 ms|26,45 ms|
|Data_10000|2,45 ms|104,60 ms|
|Data_25000|6,13 ms|651,44 ms|
|Data_50000|12,85 ms|2594,41 ms|
|Data_100000|39,39 ms|12252,31 ms|
Merge:

Selection:

### Activity 15: Compare naive search and binary search
Linear search = O(n)
Binary search = O(log(n))
## Look back
### What we've learnt
We hebben geleerd te werken met het sorteren van lijsten en wat je hier mee kunt doen.
-Selection sort en merge sort
-Binary search
-Time complexity
### What were the surprises
Bij activity 4: De comments boven "[complete the code]" moesten andersom. Hier zijn we zelf al wel achter gekomen dat dit waarschijnlijk andersom moest en hebben direct gewerkt met het omgekeerde.
### What problems we've encountered
Activity 7, een aantal van de variabelen waren voor ons eerst niet helemaal duidelijk wat ze deden. De naam van de variabelen iA, nA en nB waren een beetje onduidelijk.
Activity 12: we wouden een scanf gebruiken om te kijken voor welk item we een binary search moesten doen. Dit werkte alleen niet goed, terwijl wanneer we het getal gewoon vaststellen van tevoren het wel goed gaat. Uiteindelijk miste er een &.
### What was or still is unclear
Fill in...
### How did the group perform?
How was the collaboration? What were the reasons for hick-ups? What worked well? What can be improved next time?
```c=
#include <stdio.h>
#include "time_utils.c"
#include "spotify.c"
int compare(const Data *pFirst, const Data *pSecond) {
return compare_popularity(pFirst, pSecond);
}
// sorts the array using selection sort
void selection_sort(Array *pArray) {
Data *temp = malloc(sizeof(Data) * pArray->capacity);
for (int x = 0; x < pArray->count; x++) {
int z = x;
for (int y = x+1; y < pArray->count; y++) {
if (pArray->values[y].popularity < pArray->values[z].popularity) {
z = y;
}
}
*temp = pArray->values[x];
pArray->values[x] = pArray->values[z];
pArray->values[z] = *temp;
}
free(temp);
}
// merges the two adjacent sub-arrays, which start at indices
// iA and (iA + nA), and have lengths at most nA,
// into the larger array pTarget.
// returns the number of merged elements.
int merge(Array *pArray, Data *pTarget, int iA, int nA) {
int indexA = iA, indexB = iA + nA;
int nB = nA;
if (pArray->count < indexB + nB)
nB = pArray->count - indexB;
int target_idx = indexA;
while (indexA < iA + nA && indexB < iA + nA + nB){
// compare pA[indexA] and pB[indexB]
// if A > B
if (compare(&pArray->values[indexA], &pArray->values[indexB]) > 0) {
// write value from first array to target
pTarget[target_idx] = pArray->values[indexB];
indexB++;
}
// if A <= B
else {
// write value from second array to target
pTarget[target_idx] = pArray->values[indexA];
indexA++;
}
target_idx++;
}
// copy remaining elements of the first array
while (indexA < iA + nA)
pTarget[target_idx++] = pArray->values[indexA++];
// copy remaining elements of the second array
while (indexB < iA + nA + nB)
pTarget[target_idx++] = pArray->values[indexB++];
// return size of merged array
return nA + nB;
}
void merge_sort(Array *pArray) {
// initially, each element is a single array
int numArrays = pArray->count;
int arraySize = 1;
// create a temporary space for holding the merged arrays
Data *pWork = malloc(sizeof(Data) * pArray->capacity);
// merge the arrays until one sorted array is left
while (numArrays > 1)
{
// compute the number of pairs
const int numPairs = numArrays / 2;
// number of elements remaining to be sorted
int remaining = pArray->count;
// merge each pair of consecutive (sub)arrays into a single array
for (int i = 0; i < numPairs; i++)
{
// call merge(pArray, pWork, 2 * i * arraySize, arraySize) to merge the two sorted (sub)arrays into a sorted larger array
// decrease the number of remaining elements by the number of merged elements
//[complete the code!]
remaining -= merge(pArray, pWork, 2 * i * arraySize, arraySize);
//printf("remaining: %d", remaining);
}
// copy the unmerged values
merge(pArray, pWork, 2 * numPairs * arraySize, remaining);
// update numArrays: divide by 2 and round up
numArrays = (numArrays + 1) / 2;
// double the arraySize
arraySize = arraySize * 2;
// swap working space with array
Data *tmp = pArray->values;
pArray->values = pWork;
pWork = tmp;
}
// free the working space
free(pWork);
}
int binarySearch(const Array *pArray, const Data *pValue)
{
int lpos = 0;
int rpos = pArray->count;
while (lpos < rpos)
{
// search the interval [lpos, rpos)
// (lpos is included in the interval, rpos is not)
// determine mid position in [lpos, rpos)
int mid = (lpos + rpos) / 2;
// compare element at middle position against value searched for
int ordering = compare(&pArray->values[mid], pValue);
if (ordering > 0)
{
// must continue search in interval [lpos, mid)
rpos = mid;
}
else if (ordering < 0)
{
// must continue search in interval [mid + 1, rpos)
lpos = mid+1;
}
else
{
// value is found at index mid!
return mid;
}
}
// we didn't find the value, so stop and indicate
// to caller by returning -1.
return -1;
}
int main()
{
Array array;
ElapsedTime elapsedTime;
array_init(&array, 100000);
if (!parse_csv(&array, "C:\\Users\\Anton\\CLionProjects\\Algorithms & Datastructures\\untitled\\cmake-build-debug\\data_100000.csv"))
{
return 1;
}
printf("%d records read\n", array.count);
// selection sort
const int num_shuffles = 500;
// Number of selection sorts (change this value if this takes too much time!)
const int num_sel_sorts = 4;
// Number of merge sorts
const int num_merge_sorts = 100;
printf("Computing time for shuffling array....");
time_lap(&elapsedTime);
for (int i = 0; i < 100; i++) {
array_shuffle(&array);
}
time_lap(&elapsedTime);
double shuffle_time = get_elapsed_time_ms(&elapsedTime) / num_shuffles;
printf("done\n");
printf("Sorting array %d times using selection sort...", num_sel_sorts);
time_lap(&elapsedTime);
for (int i = 0; i < num_sel_sorts; i++) {
array_shuffle(&array);
selection_sort(&array);
}
time_lap(&elapsedTime);
double selection_sort_time = get_elapsed_time_ms(&elapsedTime) / num_sel_sorts - shuffle_time;
printf("done in %.2f ms per sort\n", selection_sort_time);
printf("Sorting array %d times using merge sort...", num_merge_sorts);
time_lap(&elapsedTime);
for (int i = 0; i < num_merge_sorts; i++) {
array_shuffle(&array);
merge_sort(&array);
}
time_lap(&elapsedTime);
double merge_sort_time = get_elapsed_time_ms(&elapsedTime) / num_merge_sorts - shuffle_time;
printf("done in %.2f ms per sort\n", merge_sort_time);
return 0;
}
/*
int main()
{
Array array;
int sorted_ok;
Data *search_for;
array_init(&array, 100);
if (!parse_csv(&array, "C:\\Users\\Anton\\CLionProjects\\Algorithms & Datastructures\\untitled\\cmake-build-debug\\data_1000.csv"))
{
return 1;
}
printf("%d records read\n", array.count);
// TODO: sort the array, based on some criterion, using either selection sort or merge sort
//selection_sort(&array);
merge_sort(&array);
sorted_ok = 1;
printf("Contents after sorting:\n");
printf("[0] ");
data_print(&array.values[0]);
for (int i = 1; i < array.count; i++)
{
printf("[%d] ", i);
data_print(&array.values[i]);
if (compare(&array.values[i - 1], &array.values[i]) > 0)
sorted_ok = 0;
}
if (sorted_ok)
printf("\nData is sorted correctly!\n");
else
printf("\nData is *NOT* sorted correctly!\n");
printf("Select a popularity you want to search for (between 0 and 100):\n");
scanf("%d", &search_for->popularity);
int index_found = binarySearch(&array, search_for);
if(index_found != -1){
printf("Popularity found at index:%d\n", index_found);
}
else{
printf("popularity wasn't found\n");
}
return 0;
}
*/
```