# Week 3 - Sorting and Searching
## Team
Team name: Group 2
Date: 05/03/2021
Members:
Bogdan Iliescu
Yordan Ivanov
Paul-Arthur Astierd.2
| Role | Name |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
| **Facilitator** keeps track of time, assigns tasks and makes sure all the group members are heard and that decisions are agreed upon. |Bogdan Iliescu |
| **Spokesperson** communicates group’s questions and problems to the teacher and talks to other teams; presents the group’s findings. |Yordan Ivanov |
| **Reflector** observes and assesses the interactions and performance among team members. Provides positive feedback and intervenes with suggestions to improve groups’ processes. |Paul-Arthur Astier |
| **Recorder** guides consensus building in the group by recording answers to questions. Collects important information and data. | |
## Activities
### Activity 1: Purpose of sorted lists
Sorted lists, as opposed to unsorted lists, are important because they can often reduce the complexity of a problem. Sorted lists are created using sorting algorithms which have a direct application in searching algorithms, database alorithms and many more examples. A more concrete example of sorted list would be "counting sort", which allows the user to sort ,for instance, a number of objects having distinct key values.
### Activity 2: Explanation of selection sort
Selection sort consists of 3 elemental steps. First of all, when the array is unsorted we go through the "selection" phase, which consists of chosing the lowest element in the remaining array. Second of all, we will go through the "swapping" phase, where we bring the lowest value to the starting position. Last of all, we have the "counter shift" phase, where we change the counter for the unsorted array by one.
Reference: "Selection Sort | GeeksforGeeks"; link: https://www.youtube.com/watch?v=xWBP4lzkoyM
### Activity 3: Performance of selection sort
Selection sort is an in-place sorting algorithm, with the complexity of O(n^2) and is very inefficient in large sorting list.
### Activity 4: Implement selection sort
```c=
void selection_sort(Array *pArray);
{
int i, j, position, swap, array[100], element;
scanf("%d", &n);//input the number of elemnts that you want to have in the array
for(i = 0; i < element; i++)
scanf("%d", &array[i]);//input the numbers which you want to sort
for(i = 0; i < element - 1; i++)
{
position = i;
for(j = 1; j < element; j++)
if(array[position] > array[j])
position = j;
}
if(position != i)
{
swap = array[i];
array[i] = array[position];
array[position] = swap;
}
for(i = 0; i < element; i++)
printf("%d", array[i]);
}
```
### Activity 5: Explanation of merge sort
Merger sort start with breaking an n array up into n arrays of single elements. It then combines the single element arrays into pairs, sorting the numbers while doing so. This process is repeated, sorting the existing, broken up arrays into pairs, until there is only one left.
https://www.youtube.com/watch?v=4VqmGXwpLqc
### Activity 6: Merge sort - step by step
|9| |3| |0| |17| |12| |1| |1| |5| |2| |10| |8| |4| |9| |3| |2|
|3-9| |0-17| |1-12| |1-5| |2-10| |4-8| |3-9| |2-X|
|0-3-9-17| |1-1-5-12| |2-4-8-10| |2-3-9-X|
|0-1-1-3-5-9-12-17| |2-2-3-4-8-9-10-X|
|0-1-1-2-2-3-3-4-5-8-9-9-10-12-17-X|
### Activity 7: Implement the merge function
And that's how you insert code blocks:
```c=
int merge(Array *pArray, Data *pTarget, int iA, int nA)
{
int indexA = iA, indexB = iA + nA;
int nB = nA;
if (Array->count < indexB + nB)
nB = pArray->count - indexB;
int target_idx = indexA;
while (indexA < nA && indexB < iA + nA +)
{
//compare pA[indexA] and pB[indexB]
if(compare(&pArray->values[indexA], &pArray->values[indexB]) > 0)
{
pTarget[target_idx] = pArray->values[indexA++];
}
else
{
pTarget[target_idx] = pArray->values[indexB++];
}
target_idx++;
}
while(indexA < iA + nA)
pTarget[target_idx++] = pArray->values[indexA++];
while(indexB < iA + nA + nB)
pTarget[target_idx++] = pArray->values[indexB++];
return nA + nB;
}
```
### Activity 8: Implement the divide-and-conquer algorithm
```c=
void merge_sort(Array *pArray)
{
int num_arrays = pArray->count;
int array_size = 1;
Data *pWork = malloc(sizeof(Data) * pArray->capacity);
while (num_arrays > 1)
{
const int num_pairs = num_arrays / 2;
int remaining = pArray->count;
int i;
for(i = 0; i < num_pairs; i++)
{
merge(pArray, pWork, 2 * i * array_size, array_size);
remaining = array_size*2;
}
merge(pArray, pWork, 2 * num_pairs * array_size, remaining);
num_arrays = (num_array + 1) / 2;
array_size = array_size * 2;
Data *tmp = pArray->values;
pArray->values = pWork;
pWork = tmp;
}
free(pWork);
}
```
### Activity 9: Test merge sort
```c=
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <spotify.h>
void merge_sort(Array *pArray)
{
int num_arrays = pArray->count;
int array_size = 1;
Data *pWork = malloc(sizeof(Data) * pArray->capacity);
while (num_arrays > 1)
{
const int num_pairs = num_arrays / 2;
int remaining = pArray->count;
int i;
for(i = 0; i < num_pairs; i++)
{
merge(pArray, pWork, 2 * i * array_size, array_size);
remaining = array_size*2;
}
merge(pArray, pWork, 2 * num_pairs * array_size, remaining);
num_arrays = (num_array + 1) / 2;
array_size = array_size * 2;
Data *tmp = pArray->values;
pArray->values = pWork;
pWork = tmp;
}
free(pWork);
}
int merge(Array *pArray, Data *pTarget, int iA, int nA)
{
int indexA = iA, indexB = iA + nA;
int nB = nA;
if (Array->count < indexB + nB)
nB = pArray->count - indexB;
int target_idx = indexA;
while (indexA < nA && indexB < iA + nA +)
{
//compare pA[indexA] and pB[indexB]
if(compare(&pArray->values[indexA], &pArray->values[indexB]) > 0)
{
pTarget[target_idx] = pArray->values[indexA++];
}
else
{
pTarget[target_idx] = pArray->values[indexB++];
}
target_idx++;
}
while(indexA < iA + nA)
pTarget[target_idx++] = pArray->values[indexA++];
while(indexB < iA + nA + nB)
pTarget[target_idx++] = pArray->values[indexB++];
return nA + nB;
}
int main()
{
Array array;
int sorted_ok;
array_init(&array, 100);
if (!parse_csv(&array, "data_10.csv"))
{
return 1;
}
printf("%d records read\n", array.count);
merge_sort(&array);
selection_sort(&array) ;
sorted_ok = 1;
printf("Contents after sorting:\n");
printf("[0] ");
data_print(&array.values[0]);
int i ;
for ( i = 1; i < array.count; i++)
{
printf("[%d] ", i);
data_print(&array.values[i]);
if (compare(&array.values[i - 1], &array.values[i]) > 0)
sorted_ok = 0;
}
if (sorted_ok)
printf("\nData is sorted correctly!\n");
else
printf("\nData is *NOT* sorted correctly!\n");
return 0;
}
```
### Activity 10: Binary search - step by step
| step | Search range | Middle element (index/value) | Discarded range |
| -------- | -------- | -------- | ------- |
| 1 | [0..22) | 11--20 | [0..11]
| 2 | [12..22) | 17--25 | [17..22]
|3 | [12..16) | 14--41 | [12--14]
|4 | [14-16) | 15--42 | [16]
|5 | [15) | 15--42 | [NULL]
### Activity 11: Implement binary search
```c=
int binarySearch(const Array *pArray, const Data *pValue)
{
int lpos = 0;
int rpos = pArray->count;
while (lpos < rpos)
{
int mid = (lpos + rpos) / 2;
int ordering = compare(&pArray->values[mid], pValue);
if (ordering > 0)
{
rpos = mid;
}
else if (ordering < 0)
{
lpos = mid+1;
}
else
{
return mid;
}
}
return -1;
}
```
### Activity 12: Test binary search
```c=
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "spotify.h"
int compare(const Data*pFirst, const Data *pSecond)
{
return compare_popularity(pFirtst, pSecond);
}
void merge_sort(Array *pArray);
int merge(Array *pArray, Data *pTarget, int iA, int nA);
int binary_search(const Array *pArray, const Data)
```
### Activity 13: Time complexity
### Activity 14: Compare merge sort and selection sort
### Activity 15: Compare naive search and binary search
It depends on how the array is sorted and its size. For unsorted arrays, linear search will always be more efficient than binary. For small sorted arrays, linear is as efficient than binary, but in larger arrays, binary will be more efficient.
The time complexity of linear is O(n) and the time complexity for binary search is O(log n).
The time complexity for the linear search on a linked list remais the same, O(n), but the binary search changes time complexity changes to o(n).
## Look back
### What we've learnt
Our team has learned the complexity and uses of sorting arrays, how to implement it.
### What were the surprises
### What problems we've encountered
### What was or still is unclear
### How did the group perform?