# Answer Sheet I
## Week 1 - Arrays

### Activity 1: Fixed-sized arrays
Fixed-sized arrays Discuss with your group members what the advantages and disadvantages are of having an array with a fixed capacity. Provide at least one use case where arrays with a fixed capacity are useful.
* Answer:
* Advantage: simpler to use, no need to (re)allocate memory dynamically.
* Disadvantage: size is fixed, so it’s impossible to store more than its capacity. Sometimes it’s difficult to estimate the required capacity, so very pessimistic (large) capacities are used, leading to a high memory usage.
```c=
int size = 100;
int values[size];
for(int i=0; i<size; i++){
array[i] = i;
}
```
**$dza:** That's ok. A great advantage of fixed sized array is their use when memory is limited. With fixed sizes the memory usage of a program can be checked statically by a compiler - this way you are sure you never run out of memory. Moreover, all the trouble of allocating and freeing is gone.
---
---
---

### Activity 2: Random access
Can you think of a situation or use case where having random access is not necessary? Provide at least one use case.
* Answer: If we only need to access the first (or last) element, then no random access is required.
```c=
int size = 10;
int array[size];
for(int i=0; i<size; i++){
array[i] = i;
}
//grabbing both ints takes the same amount of time
int var1 = array[0];
int var2 = array[9];
```
---
---
---


### Activity 3: The sizeof operator
Find out, using the sizeof function, how many bytes are used to store a memory address (a pointer). Does the size of a pointer depend on its type? What is the reason for this?
* Answer: The size of a pointer does not depend on the type, because a pointer contains a memory address, the size of which is independent of whatever is stored at that address
```c=
int *p1;
float *p2;
double *p3;
printf("the size of p1 is: %d\n",sizeof(p1)); //prints out 4
printf("the size of p1 is: %d\n",sizeof(p2)); //also prints out 4
printf("the size of p3 is: %d\n",sizeof(p3)); //the output is still 4
```
---
---
---

### Activity 4: The sizeof operator and arrays
Use the sizeof operator to find out what the size of array is.
```c=
int array[10];
//divide sizeof(array) by sizeof(element in array)
int array_size = sizeof(array)/sizeof(array[0]);
```
---
---
---


### Activity 5: Variables in function scope
Discuss: Does this function work properly? Provide arguments supporting your answer.
* Answer: The function does not work properly: the array values is allocated on the stack, because it is a local variable. This means that the memory containing the values is discarded (reused by other functions) as soon as the function ends. The contents of the values array will thus not be stable.
```c=
typedef struct Array_T{
int capacity;
int count;
int *values;
} Array;
void initArray(Array *pArray, int capacity){
int values[capacity];
pArray->capacity = capacity;
pArray->values = values;
pArray->count = 0;
}
```
---
---
---

### Activity 6: Stack vs. heap
Search on the internet what the difference between the stack and the heap is, and explain these differences in your own words.
* Answer: The stack contains local variables and arguments passed to a function. Its lifetime is short - as soon as the scope holding the local variables is exited (e.g., end of a function call), the memory is discarded. The heap is a larger pool of memory that contains data that has a longer lifetime, which is claimed and released through dynamic memory management.
---
---
---

### Activity 7: Using malloc
Use the malloc function to:
* allocate memory to store one unsigned long number,
* allocate memory to store 256 float numbers,
* write a function that takes a count as its argument, allocates enough memory to hold count ints and returns the pointer to the allocated memory block: int* allocateMemory(int count);
---
* Answer: The **malloc()** function is able to allocate memory of a pointer specified by the arguement. This becomes useful since using malloc then allows the **realloc()** function to be used to resize the array and also deallocate memory with the **free()** function. This gives more control over the memory reserved for a certain array.
```c=
unsigned long *num1 = (unsigned long*)malloc(sizeof(unsigned long));
float *num2 = (float*)malloc(265*sizeof(float));
int* allocateMemory(int counter){
int *ptr = (int*)malloc(counter*sizeof(int));
return ptr;
}
```
**$dza:** A pointer returned by `malloc` needs not to be casted, this suffices:
```c
float* numbers = malloc(sizeof(float[10]));
```
### Activity 8: Limits of malloc
Discuss: How much memory can be allocated per call to malloc?
* Answer: The malloc() function reserves a block of storage of size bytes. The function does not initialize all elements to 0. The maximum size of malloc() is **16711568 bytes**.
### Activity 9: Creating an array with dynamic capacity
Below, the prototype for the function array\_init is given. This function initializes an empty array (i.e. count = 0) with a given capacity. Implement the function by making use of the malloc function.
```c
// creates an empty array with the given capacity
void array\_init(Array *pArray, int capacity);
```
---
* Answer: Using the structure and fucntion in Activity 5 the new array_init fucntion uses malloc instead of a newly initialized array. This way the initilized array is able to be resized if needed.
```c=
void array_init(Array *pArray, int capacity)
{
pArray->values = malloc(capacity*sizeof(Array));
pArray->capacity = capacity;
pArray->count = 0;
}
```
---
---
---

### Activity 10: Dangerous frees
Find out what happens when you try to call the free function on a pointer that was not obtained using malloc, and on a pointer, the value of which is NULL:
* Answer: The **free()** function takes a pointer as an argument and deallocates its memory without returning anything. However the pointer sent into the function has to have been initalized with **malloc()** otherwise the program will complain and **throw an error**. Simply passing a NULL pointer wont do anything though since there is no memory to be deallocated.
```c=
int* ptr = (int*)123456789;
free(ptr);
int* null_ptr = NULL;
free(null_ptr);
```
**$dza:** calling `free` on `NULL` is, as per standard, always allowed and results in no-operation
### Activity 11: Infinite memory?
Discuss: What happens when the dynamic memory that is no longer needed is not freed? How is it called when such a thing happens and what can be the consequences of not reclaiming memory in a longer run?
* Answer: Dynamic memory that is not explicitly freed is not released (until the program ends). Programs that ’forget’ to free memory have a ’memory leak’ - a consequence may be that slowly all memory is used up.
### Activity 12: Resizing an Array
Implement the resize function with a prototype given below.
```c
// resizes the array so that it can contain at most newCapacity ←- elements
void resize(Array *pArray, int newCapacity);
```
---
* Answer:
```c=
void resize(Array *pArray, int newCapacity)
{
pArray->values = realloc(pArray->values, newCapacity*sizeof(Array));
pArray->capacity = newCapacity;
// $dza: why this last line?
pArray->values[pArray->capacity-1]=0;
}
```
---
---
---

### Activity 13: Array insertion
Suppose an element is inserted into an array that contains n elements. What kind of steps are involved in this insertion? What is the worst-case number of steps that need to be performed?
* Answer: Elements must be shifted in order to make space for the new ele
* At first we need to check if the array is full and if it is, we expand it by 1. Then we need to move all the elements before the index of insertion back. Then we can finally insert the value. This is an example:
```c=
void array_insert(Array *pArray, int index, int value)
{
if(pArray->count==pArray->capacity){
resize(pArray,pArray->capacity++);
}
for (int i=(pArray->capacity); i>=index; i--){
pArray->values[i] = pArray->values[i--];
}
pArray->values[index] = value;
pArray->count++;
}
```
**$dza:** You used post-increment for `capacity`. It means the the value of `capacity` will be incremented only after the call to `resize` has returned. This, in turn, means that the capacity you passed to `resize` is your old, unchanged capacity. Either use pre-increment (`resize(pArray, ++pArray->capacity)`) or just do it explicitly (`resize(pArray,pArray->capacity + 1)`) and increment afterwards.
---
---
---

### Activity 14: Basic list operations
Implement the functions that are described above, and listed below. Make sure to use the resize function to dynamically resize the array when necessary.
```c
// append an element to the end of an array
void array\_append(Array \*pArray, int element);
// remove an element from the array at index
void array\_remove(Array \*pArray, int index);
// remove all elements
void array\_clear(Array \*pArray);
// insert an element into the array, before index
void array\_insert(Array \*pArray, int index, int value);
// swap the two elements located at indices index\_a and index\_b
void array\_swap(Array *pArray, int index\_a, int index_b);
```
---
* Answer:
```c=
void array_append(Array *pArray, int element)
{
// $dza: what it there is not enough capacity?
pArray->values[pArray->count] = element;
pArray->count++;
}
void array_remove(Array *pArray, int index)
{
// $dza: the loop contition should be `i < pArray->count`
for(int i=index; i<pArray->capacity--; i++){
pArray->values[i] = pArray->values[i+1];
}
pArray->values[pArray->capacity--] = 0;
pArray->count--;
}
void array_clear(Array *pArray)
{
for(int i=0; i<pArray->capacity; i++){
pArray->values[i] = 0;
}
pArray->count = 0;
}
void array_insert(Array *pArray, int index, int value)
{
if(pArray->count==pArray->capacity){
// $dza: again post-increment instead of pre-increment
resize(pArray,pArray->capacity++);
}
// $dza: the loop contition should be `i < pArray->count`
for (int i=(pArray->capacity); i>=index; i--){
pArray->values[i] = pArray->values[i--];
}
pArray->values[index] = value;
pArray->count++;
}
void array_swap(Array *pArray, int index_a, int index_b)
{
int temp_a=pArray->values[index_a];
pArray->values[index_a] = pArray->values[index_b];
pArray->values[index_b] = temp_a;
}
```
---
---
---

### Activity 15: Test your code
Run the following tests. If the output does not match the expected output, fix the function that contains the error.
```c
void array\_print(const Array * array)
{
printf("\[");
if (array->count > 0)
printf("%d", array->values\[0\]);
for (int i = 1; i < array->count; i++)
printf(", %d", array->values\[i\]);
printf("\] (capacity: %d)\\n", array->capacity);
}
int main()
{
Array array;
array\_init(&array, 4);
array\_append(&array, 0);
array\_print(&array); // expected result: \[0\] (capacity: 4)
array\_append(&array, -1);
array\_print(&array); // expected result: \[0, -1\] (capacity: 4)
array\_append(&array, -2);
array\_print(&array); // expected result: \[0, -1, -2\] (capacity: 4)
array\_append(&array, -3);
array\_print(&array); // expected result: \[0, -1, -2, -3\] (capacity: 4)
array\_insert(&array, 4, -4);
array\_print(&array); // expected result: \[0, -1, -2, -3, -4\] (capacity←- : X)
array\_insert(&array, 2, 42);
array\_print(&array); // expected result: \[0, -1, 42, -2, -3, -4\]
array\_remove(&array, 3);
array\_print(&array); // expected result: \[0, -1, 42, -3, -4\]
array\_swap(&array, 1, 3);
array\_print(&array); // expected result: \[0, -3, 42, -1, -4\] 34
}
```
---
* Answer:
```c=
typedef struct Array_T
{
int *values;
int capacity;
int count;
} Array;
void array_init(Array *pArray, int capacity)
{
pArray->values = malloc(capacity*sizeof(Array));
pArray->capacity = capacity;
pArray->count = 0;
}
void resize(Array *pArray, int newCapacity)
{
pArray->values = realloc(pArray->values, newCapacity*sizeof(Array));
pArray->capacity = newCapacity;
pArray->values[pArray->capacity--] = 0;
}
void array_append(Array *pArray, int element)
{
pArray->values[pArray->count] = element;
pArray->count++;
}
void array_remove(Array *pArray, int index)
{
for(int i=index; i<pArray->capacity--; i++){
pArray->values[i] = pArray->values[i+1];
}
pArray->values[pArray->capacity--] = 0;
pArray->count--;
}
void array_clear(Array *pArray)
{
for(int i=0; i<pArray->capacity; i++){
pArray->values[i] = 0;
}
pArray->count = 0;
}
void array_insert(Array *pArray, int index, int value)
{
if(pArray->count==pArray->capacity){
resize(pArray,pArray->capacity++);
}
for (int i=(pArray->capacity); i>=index; i--){
pArray->values[i] = pArray->values[i--];
}
pArray->values[index] = value;
pArray->count++;
}
void array_swap(Array *pArray, int index_a, int index_b)
{
int temp_a=pArray->values[index_a];
pArray->values[index_a] = pArray->values[index_b];
pArray->values[index_b] = temp_a;
}
void array_print(const Array * array)
{
int count;
int values;
int capacity;
printf("[");
if(array->count>0)
printf("%d", array->values[0]);
for(int i=1; i< array->count; i++)
printf(", %d",array->values[i]);
printf("] (capacity: %d)\n", array -> capacity);
}
int main()
{
Array array;
array_init(&array, 4);
array_append(&array, 0);
array_print(&array);
array_append(&array, -1);
array_print(&array);
array_append(&array, -2);
array_print(&array);
array_append(&array, -3);
array_print(&array);
array_insert(&array, 4, -4);
array_print(&array);
array_insert(&array, 2, 42);
array_print(&array);
array_remove(&array, 3);
array_print(&array);
array_swap(&array, 1, 3);
array_print(&array);
return 0;
}
```
---
---
---

### Activity 16: Solve the partitioning problem
Write the function partition that solves this problem. Use the following prototype for this function:
```c
void partition(Array *pArray);
```
After implementing the function, download the accompanying verify.c file. This file contains a main function which, when run, will verify the correctness of your partition function, and run a benchmark using arrays with different (large) sizes. It will provide the measured average runtime as output.
Let your program call the benchmark function that is contained in the verify.c file. Running your program will produce a set of numbers. Write down these numbers in the table below. Use the table to answer the following question: how does your algorithm scale as the array length increases?
---
* Answer:
```c=
void partition(Array *pArray)
{
int index;
if(pArray->values[0]>=0){index=0;}
else{index=1;}
for(int i=0; i<pArray->capacity; i++){
// $dza: please, please do read about how post- and pre-increment work
if(pArray->values[i++]<0 && pArray->values[i]>=0){
array_swap(pArray, i++, index);
index++;
while(pArray->values[index]<0){
index++;
}
//array_print(pArray);
}
}
}
```
**$dza:** your idea for partitioning was good. However, I'm afraid you applied way too much kool-aid++ to it.
## Week 2 - Linked List

### Activity 1: Linked lists versus arrays
In the courses programming 1 & 2 you have used arrays a lot, and in the previous week, you have revisited them. Common operations performed on arrays and linked lists are retrieval, appending, insertion, and deletion.
Discuss how these four operations compare for arrays and linked lists, in terms of the effort it takes to implement these operations, and in terms of the time these operations take. Can you think of a situation where you would not want to use an array, but rather a (singly or doubly) linked list?
* Answer:
* Retrieval: an array has random access, which means that lookup (by index) is O(1).
* For a linked list, the list needs to be traversed, which is O(n).
* Insertion and deletion is easier in a linked list, because no elements have to be shifted. Appending is efficient in both an array and a linked list (but in a linked list, there’s somewhat more overhead due to the use of pointer indirections).
---
---
---


### Activity 2: Linked list initialization
Implement the linkedlist\_init function, using the prototype definition listed below. Make sure that the head and tail of the initialized linked list are set to NULL.
```c
void linkedlist\_init(LinkedList *);
```
---
* Answer:
```c=
typedef struct LLNode {
int value;
struct LLNode* pPrevious;
struct LLNode* pNext;
} LinkedListNode;
typedef struct {
LinkedListNode *pHead;
LinkedListNode *pTail;
int count;
} LinkedList;
void linkedlist_init(LinkedList *pList){
*pList = (LinkedList){ .pHead = NULL, .pTail=NULL, .count=0};
}
```
------
------
------

### Activity 3: Adding nodes & memory
Discuss with your group: when adding a new value to the list, a LinkedListNode structure must be created to hold the value. Should this structure be kept on the stack or on the heap? Why?
* This structure must be kept on the heap, because its lifetime is longer than the function call invoked to add the value to the list.
### Activity 4: Appending and prepending elements
Implement the two functions to append and prepend a new integer to the list, given by the following prototypes:
```c
// appends value to the list referred to by pList
void linkedlist\_append(LinkedList \*pList, int value);
// prepends value to the list referred to by pList 5
void linkedlist\_prepend(LinkedList \*pList, int value);
```
After implementing the two functions, test your implementation using the program listed below:
```c
#include "SimpleAssert.h"
int main()
{
LinkedList list;
linkedlist\_init(&list);
linkedlist\_append(&list, 2);
ASSERT\_EQ(list.pHead, list.pTail, "one element: head must be equal to ←- tail");
linkedlist\_prepend(&list, 1);
linkedlist\_append(&list, 3);
ASSERT\_EQ(list.pHead->value, 1, "head must be 1");
ASSERT\_EQ(list.pTail->value, 3, "tail must be 3");
ASSERT\_EQ(list.pHead->pNext->value, 2, "next(head) must be 2");
ASSERT\_EQ(list.pHead->pNext->pNext->value, 3, "next(next(head)) must ←- be 3");
ASSERT\_EQ(list.pHead->pNext->pNext->pNext, NULL, "next(next(next(head)←- )) must be NULL");
ASSERT\_EQ(list.pHead->pNext->pNext->pPrevious->value, 2);
ASSERT\_EQ(list.pHead->pNext->pPrevious->value, 1, "previous(next(head)←- ) must be 1");
ASSERT_EQ(list.pHead->Previous, NULL, "previous(head) must be NULL");
}
```
---
* Answer:
```c=
void linkedlist_append(LinkedList *pList, int value){
LinkedListNode *newNode = (LinkedListNode *) malloc(sizeof(LinkedListNode));
newNode->value = value;
newNode->pNext = pList->pTail;
if(pList->count==0) {
pList->pHead = newNode;
newNode->pPrevious = NULL;
}
else {
LinkedListNode *temp = pList->pHead;
while (temp->pNext != NULL) {
temp = temp->pNext;
}
temp->pNext = newNode;
newNode->pPrevious = temp;
}
++pList->count;
}
// prepends value to the list referred to by pList
void linkedlist_prepend(LinkedList *pList, int value){
//create new node
LinkedListNode *newNode = (LinkedListNode*)malloc(sizeof(LinkedListNode));
newNode->value = value;
pList->pHead->pPrevious = newNode;
newNode->pNext = pList->pHead;
newNode->pPrevious = NULL;
pList->pHead = newNode;
++pList->count;
}
```
---
---
---

### Activity 5: Print the contents of a linked list
Below a code fragment is given for the function linkedlist\_print, which prints the values stored in a linked list. Complete the code by adding the correct statements at the ’TODO’ comments.
```c
void linkedlist\_print(const LinkedList \*pList)
{
LinkedListNode \*pCurrent = list->pHead;
printf("\[");
if (pCurrent != NULL)
{
/* TODO: 1. print the value stored in the node referred to by pCurrent.
2. make pCurrent refer to the next node. */
while (pCurrent != NULL)
{
/* TODO: 1. print a comma, followed by the value stored in the node
referred to by pCurrent
2. make pNode refer to the next node */
}
}
printf("\]\\n");
}
```
---
* Answer:
```c=
void linkedlist_print(const LinkedList *pList)
{
LinkedListNode *pCurrent = pList->pHead;
printf("[");
if (pCurrent != NULL)
{
printf("%d", pCurrent->value);
pCurrent = pCurrent->pNext;
while (pCurrent != NULL)
{
printf(", %d", pCurrent->value);
pCurrent = pCurrent->pNext;
}
}
printf("] ");
printf("count: %d\n", pList->count);
}
```
---
---
---

### Activity 6: Element retrieval
Implement the linkedlist_get function, for which the prototype is given below. Make sure that the function returns NULL in case an invalid index is passed
```c
// returns the node at index 'index' in the list referred to by pList
const LinkedListNode \*linkedlist\_get(const LinkedList \*pList, int index←- );
```
Use the code listed below to test your implementation of the linkedlist\_get function.
```c
LinkedList list;
linkedlist\_init(&list);
ASSERT\_EQ(linkedlist\_get(&list, -1), NULL, "get(-1) must be NULL");
ASSERT\_EQ(linkedlist\_get(&list, 0), NULL, "get(0) must be NULL for empty list");
linkedlist\_append(&list, 1);
linkedlist\_append(&list, 2);
ASSERT\_EQ(linkedlist_get(&list, 0), list.pHead, "get(0) must be list. pHead");
ASSERT\_EQ(linkedlist\_get(&list, 1), list.pTail, "get(1) must be list. pTail");
```
---
* Answer:
```c=
const LinkedListNode *linkedlist_get(const LinkedList *pList, int index){
//check if index is valid
if(index>=0) {
//check if list is initialized and contains values
if(pList->count==0) {return NULL;}
else{
//create temporary node
LinkedListNode *temp = pList->pHead;
//loop to desired index
for (int i = 0; i < index; i++) {
temp = temp->pNext;
}
//return LinkListNode
return temp;
}
}
return NULL;
}
```
### Activity 7: The const keyword
Discuss the implication of the const keyword used in the definition of the function linkedlist_get (a detailed explanation can be found here). What restrictions does the const keyword impose on the code in the body of the function?
* Answer: The const keyword imposes the restriction that the contents of the list can not be modified.
### Activity 8: Time complexity of retrieval
How does the worst-case time needed for retrieval scale as the list’s number of elements increase? For a list with 2N elements, how much time is needed for retrieval of an arbitrary element, in the worst case, compared to a list of N elements?
* Answer: For a list with 2N elements, retrieval takes roughly twice as long compared to a list with N elements, in the worst case.
---
---
---

### Activity 9: Implementing insertion
Implement the function linkedlist_insert, as defined below. Make sure that, when using this function to insert a value in front of the list, the list’s head is updated as well.
```c
// updates the list by inserting value before the node referred to by ←- pInsertionPoint
void linkedlist\_insert(LinkedList \*pList, LinkedListNode \*←- pInsertionPoint, int value);
```
Test your implementation, using the code listed below:
```c
LinkedList list;
linkedlist\_init(&list);
linkedlist\_append(&list, 42);
linkedlist\_insert(&list, list.pHead, 4);
ASSERT\_EQ(list.pHead->value, 4, "head must be 4");
ASSERT\_EQ(list.pHead->pNext, list.pTail, "next(head) must be tail");
ASSERT\_EQ(list.pTail->pPrevious, list.pHead, "prev(tail) must be head")←- ;
linkedlist\_insert(&list, list.pTail, 2);
ASSERT\_EQ(list.pHead->value, 4, "head must be 4");
ASSERT\_EQ(list.pHead->pNext->value, 2, "next(head) must be 2");
ASSERT\_EQ(list.pHead->pNext->pNext->value, 42, "next(next(head)) must ←- be 42");
ASSERT\_EQ(list.pTail->pPrevious->pPrevious->value, 4, "prev(prev(tail))←- must be 4");
ASSERT\_EQ(list.pTail->pPrevious->value, 2, "prev(tail) must be 2");
ASSERT\_EQ(list.pTail->value, 42, "tail must be 42");
```
---
* Answer:
```c=
// updates the list by inserting value before the node referred to by pInsertionPoint
void linkedlist_insert(LinkedList *pList, LinkedListNode *pInsertionPoint, int value){
//create new node
LinkedListNode *newNode = (LinkedListNode *) malloc(sizeof(LinkedListNode));
//check if prepend is applicable
if(pInsertionPoint == pList->pHead){
linkedlist_prepend(pList, value);
}
else{
//set value
newNode->value = value;
//set pointers
newNode->pPrevious = pInsertionPoint->pPrevious;
newNode->pNext = pInsertionPoint;
//integrate in list
pInsertionPoint->pPrevious->pNext = newNode;
pInsertionPoint->pPrevious = newNode;
}
}
```
---
---
---

### Activity 10: Linked lists and memory management
Discuss: must the linkedlist\_remove function release the memory (i.e., using the free function that you have used in the previous week’s activities) that was previously allocated for the LinkedListNode? Why (not)?
Suppose that the linkedlist\_remove function would not release the memory, whose responsibility is it then to release this memory?
* Answer: It depends - when moving linked list nodes around (this may happen while sorting the list, for example), the remove is used to ’unlink’ a node from the list, and insert is used to re-insert it somewhere else. In that case, remove should not release the memory, because this would introduce lots of unnecessary memory allocations and releases. In that case, the caller of the remove function must release the memory.
### Activity 11: Implementation of removal
Implement the linkedlist_remove function, using the definition given above. Use the following code to test your implementation:
```c
LinkedList list;
linkedlist\_init(&list);
linkedlist\_append(&list, 1);
linkedlist\_append(&list, 2);
linkedlist\_append(&list, 3);
linkedlist\_remove(&list, list.pHead->pNext);
ASSERT\_EQ(list.pHead->value, 1, "head must be 1");
ASSERT\_EQ(list.pHead->pNext->value, 3, "next(head) must be 3");
ASSERT\_EQ(list.pTail->value, 3, "tail must be 3");
ASSERT\_EQ(list.pTail->pPrevious->value, 1, "prev(tail) must be 1");
linkedlist\_remove(&list, list.pHead);
ASSERT\_EQ(list.pHead, list.pTail, "head must be equal to tail");
ASSERT\_EQ(list.pHead->pPrevious, NULL, "prev(head) must be NULL");
ASSERT\_EQ(list.pHead->pNext, NULL, "next(head) must be NULL");
ASSERT\_EQ(list.pHead->value, 3, "head must be 3");
linkedlist\_remove(&list, list.pTail);
ASSERT\_EQ(list.pHead, list.pTail, "head must be equal to tail");
ASSERT_EQ(list.pHead, NULL, "head must be NULL");
```
---
* Answer:
```c=
// updates the list by removing the node referred to by pNode from the list
void linkedlist_remove(LinkedList *pList, LinkedListNode *pNode){
if(pList->count>=3){
if(pNode==pList->pHead){
pList->pHead = pNode->pNext ;
pList->pHead->pPrevious = NULL;
}
else if(pNode==pList->pTail){
pList->pTail = pNode->pPrevious;
pList->pTail->pNext = NULL;
}
else {
pNode->pPrevious->pNext = pNode->pNext;
pNode->pNext->pPrevious = pNode->pPrevious;
}
pNode = NULL;
}
else if(pList->count==2){
if(pNode==pList->pHead){
pList->pTail->pPrevious = NULL;
pList->pHead=pList->pTail;
}
else if(pNode==pList->pTail){
pList->pHead->pNext = NULL;
pList->pTail=pList->pHead;
}
}
else{
pList->pHead = NULL;
pList->pTail = NULL;
}
free(pNode);
--pList->count;
}
```
---
---
---


### Activity 12: Implementing merge
Write a function that merges one (sorted) linked list into another one. The function has two arguments, both of which are (sorted) linked lists, and merges the second list into the first one, which will be sorted as well (Hint: there is no need to explicitly sort the list after merging). You may use the following function prototype:
```c
// merges the contents of pListB into pListA
// on return, pListB will be empty.
void linkedlist_merge(LinkedList \*pListA, const LinkedList \*pListB);
```
Test your function on different inputs, including the following ones:
* merging \[1, 3, 5\] and \[0, 2, 4\] into \[0, 1, 2, 3, 4, 5\],
* merging \[0, 1, 2\] and \[3, 4, 5\] into \[0, 1, 2, 3, 4, 5\],
* merging \[1, 2, 4\] and \[0, 3, 5, 6\] into \[0, 1, 2, 3, 4, 5, 6\],
* merging \[1, 10, 20\] and \[12, 14, 16, 42\] into \[1, 10, 12, 14, 16, 20, 42\].
--------------------------------------------------------
--------------------------------------------------------
* Answer:
```c=
// merges the contents of pListB into pListA
// on return, pListB will be empty.
void linkedlist_merge(LinkedList *pListA, LinkedList *pListB){
for(int i = 0; i<pListA->count; i++) {
while(linkedlist_get(pListA, i)->value >= pListB->pHead->value){
linkedlist_insert(pListA, linkedlist_get(pListA, i), pListB->pHead->value);
linkedlist_remove(pListB, pListB->pHead);
}
}
while(pListB->count>0) {
linkedlist_append(pListA, pListB->pHead->value);
linkedlist_remove(pListB, pListB->pHead);
}
}
```
### Activity 13: Time complexity of merge
Discuss how the time needed to merge two lists depends on their lengths. If you merge two lists of length N, how does the runtime of merging scale as N is increased?
* Answer: The worst-case time complexity of merging two lists of length N is O(N). This means that if the runtime scales linearly: if N becomes twice as large, the runtime also increase by a factor of roughly 2.
---
---
---

### Activity 14: Implementing sorted insert
Write a function named linkedlist\_sorted\_insert, which takes two arguments: a (sorted) linked list, and a value to insert, as defined by the prototype listed below.
The function must insert the value into the list, such that the list remains sorted (Note: the function may assume that the linked list is already sorted, it only needs to insert a new element such that the order is maintained).
```c
// inserts \`value\` into \`pList\`, such that pList remains sorted
void linkedlist\_sorted\_insert(LinkedList *pList, int value);
```
--------------------------------------------------------
--------------------------------------------------------
* Answer:
### Activity 15: Time complexity of sorted insert
Discuss how the time, needed to insert a value into a sorted (linked) list with N elements, scales as N increases. How would you approach this in case you had to insert a value into a sorted array, rather than a linked list? How would the insertion scale then?
* Answer:
* Sorted insert is O(N), because the point at which the new value is to be inserted needs to be determined, which requires scanning sequentially through the list. The time needed to insert a value thus scales linearly in N.
* For an array, locating the insertion point can be done more efficiently (using binary search, see next week), but insertion in an array requires that elements are shifted, which is also O(N).
## Week 3 - Sorting

### Activity 1: Purpose of sorted lists
Discuss with your group what the purpose of a sorted list (array or linked list) is over an unsorted list. Search on the internet to find a couple of concrete cases in which sorted lists are used.
* Answer:
1. makes the data more readable and comprehendable to the user
2. allows for easier traversing through the dataset for the computer (easier searching)
For example, the **merge** and **sorted_insert** functions from last weeks Linked List activities required sorted lists as inputs in order to properly complete their resepective tasks.
Another good example, is the **filing** done by a computer's OS. In order to make it easier on the user to find files stored on the hard drive the OS is able to sort the files and folders based on the users prefeance for instance, name or date last accessed.
---
---
---


### Activity 2: Explanation of selection sort
Search YouTube for a video that explains selection sort, and watch it. In your log book, include a reference to the video you’ve used.
* Answer: Selection sort splits the array into two sections, first being the sorted side which to start with is empty and second being the given unsorted array. From there a varible scannes through the array searching for the lowest number which is then swapped with the first index in the array. Next, the size of the sorted side of the array is increased by one and the size of the unsorted side of the array decreased by one. The scan is repeated until the unsorted array is all take up by the sorted one comfirming that the selection sort function is complete.
* *Source: https://www.youtube.com/watch?v=xWBP4lzkoyM&feature=youtu.be*
### Activity 3: Performance of selection sort
How does the runtime of selection sort depend on the number of elements in the array? To answer this question, indicate how the time needed by selection sort increases as the number of elements in the array, given by n, increases.
For example, suppose selection sort needs T milliseconds to sort a list of n elements. How much time (roughly) will the algorithm need to sort a list of 2n or 4n elements? If you were to use a linked list instead of an array, would the performance be any different?
* Answer: The worst-case time complexity of selection sort is O(n 2 ). This means that if the algorithm requires T milliseconds to sort n elements, then for 2n elements the algorithm needs roughly 4T milliseconds (roughly - this won’t be true for small n, but as n becomes larger, the quadratic worst-case time complexity will become increasingly more visible.)
---
---
---


### Activity 4: Implement selection sort
Implement selection sort in C, using the prototype given below. The array to be sorted contains structures as defined above. You are free to choose on which of the fields in the structure you want to sort. Also, make sure to reuse the C code that you’ve written during week 1 (Arrays).
```c
// sorts the array using selection sort
void selection\_sort(Array *pArray);
```
Test your implementation of merge sort using the test code (see the file test\_sort.c) provided on Blackboard. Note that the file depends on other header and implementation files, which are provided on Blackboard as well. Use different sorting criteria, and fix any bugs that may show up.
---
* Answer:
```c=
typedef struct Data_Spotify{
char artists[80];
char name[80];
int popularity; // 0..100
float danceability; // 0..1
float energy; // 0..1
int year; // 1921..
float tempo; // (BPM)
char id[32];
} Data;
int compare_popularity(const Data *pFirst, const Date *pSecond){
if (pFirst->popularity > pSecond->popularity){
// pFirst should come after pSecond
return 1;
}
else if (pFirst->popularity < pSecond->popularity){
// pSecond should come after pFirst
return -1;
}
else{
// no need to change the order
return 0;
}
}
// sorts the array using selection sort
void selection_sort(Array *pArray)
{
int i, j, MIN_IDX;
for(i=0; i<pArray->count-1; i++){
MIN_IDX = i;
for(j=i+1; j<pArray->count; j++){
if(compare(&pArray->values[MIN_IDX], &pArray->values[j])==-1){
MIN_IDX = j;
}
}
array_swap(pArray, MIN_IDX, i);
}
}
```
---
---
---


### Activity 5: Explanation of merge sort
Search YouTube for a video that explains selection sort, and watch it. In your log book, write down a reference to the video you have used.
* Answer: Merge sort implements a divide an conquer technice to bring order to a data structure. By splitting the array into individual pieces and then slowly putting it back together first making multible size two ordered arrays then moving to four and continuing to double the size until the original size is restored. This approach naturally produces less steps than the select sort.
* *Source: https://www.youtube.com/watch?v=4VqmGXwpLqc&feature=youtu.be*
### Activity 6: Merge sort - step by step
The image below provides a graphical view on how merge sort is applied to a list of 15 numbers. In the top row, each of the 15 elements is regarded as an array consisting of one element. In the next row, each pair of two adjacent arrays is merged into a sorted array of length 2, which are then merged into sorted arrays of length 4, etc. Fill in the blanks, row by row. Note that the bottom row must contain all values of the first row, and furthermore the bottom row must be sorted!

---
Answer:

---
---
---

### Activity 7: Implement the merge function
The merge function merges two adjacent sorted arrays into one larger array, in such a way, that the resulting array is sorted.
Part of this function is already given below. Complete the code at the indicated points.
```c
// merges the two adjacent sub-arrays, the first of which starts at index iA, and have lengths at most nA, into the larger array pTarget.
// Returns the number of merged elements.
int merge(Array \*pArray, Data \*pTarget, int iA, int nA)
{
int indexA = iA;
if (pArray->count < indexA + nA)
nA = pArray->count - indexA; // adjust size of first array
int indexB = iA + nA; // start index of second array
int nB = nA; // size of second array
if (pArray->count < indexB + nB)
nB = pArray->count - indexB; // adjust size of second array
int target\_idx = indexA;
while (indexA < iA + nA && indexB < iA + nA + nB)
{
// compare pA\[indexA\] and pB\[indexB\]
if (compare(&pArray->values\[indexA\], &pArray->values\[indexB\]) > 0){
// write value from first array to target
\TODO:[complete the code!\]
}
else {
// write value from second array to target
\TODO[complete the code!\]
}
target\_idx++;
}
// copy remaining elements of the first array
while (indexA < iA + nA)
pTarget\[target\_idx++\] = pArray->values\[indexA++\];
// copy remaining elements of the second array
while (indexB < iA + nA + nB)
pTarget\[target\_idx++\] = pArray->values\[indexB++\];
// return size of merged array
return nA + nB;
}
```
---
* Answer:
```c=
// merges the two adjacent sub-arrays, which start at indices
// iA and (iA + nA), and have lengths at most nA,
// into the larger array pTarget.
// returns the number of merged elements.
int merge(Array *pArray, Data *pSpace, int iA, int nA)
{
int indexA = iA, indexB = iA + nA;
int nB = nA;
if (pArray->count < indexB + nB){
nB = pArray->count - indexB;
}
int target_idx = indexA;
while (indexA < iA + nA && indexB < iA + nA + nB){
// compare pA[indexA] and pB[indexB]
if (compare(&pArray->values[indexA], &pArray->values[indexB]) > 0) {
// write value from first array to target
pSpace[target_idx] = pArray->values[indexA];
++indexA;
}
else {
// write value from second array to target
pSpace[target_idx] = pArray->values[indexB];
++indexB;
}
target_idx++;
}
// copy remaining elements of the first array
while (indexA < iA + nA){
pSpace[target_idx++] = pArray->values[indexA++];
}
// copy remaining elements of the second array
while (indexB < iA + nA + nB){
pSpace[target_idx++] = pArray->values[indexB++];
}
// return size of merged array
return nA + nB;
}
```
### Activity 8: Implement the divide-and-conquer algorithm
The function that completes the merge-sort algorithm is responsible for the pairwise merging of increasingly larger adjacent arrays. Part of this function is given below. Complete the code at the indicated point. Note that besides line 18 of the listing, you may need to add one or more variable(s) and / or statements to make the code work correctly! Use the video explanations you’ve found in an earlier activity
```c
void mergeSort(Array *pArray)
{
// initially, each element is a single array
int numArrays = pArray->count;
int arraySize = 1;
// create a temporary space for holding the merged arrays
Data \*pWork = malloc(sizeof(Data) \* pArray->capacity);
// merge the arrays until one sorted array is left
while (numArrays > 1)
{
// index of first sub-array to be merged with adjacent array
int index = 0;
// merge each pair of consecutive (sub)arrays into a single array
while (index < pArray->count)
{
// call merge(pArray, pWork, index, arraySize) to merge the two sorted (sub)arrays into a sorted larger array
// increase the index by the number of merged elements
\TODO:[complete the code!\]
}
// update numArrays: divide by 2 and round up
numArrays = (numArrays + 1) / 2;
// double the arraySize
arraySize = arraySize * 2;
// swap working space with array
Data *tmp = pArray->values;
pArray->values = pWork;
pWork = tmp;
}
// free the working space
free(pWork);
}
```
---
* Answer:
```c=
void merge_sort(Array *pArray) {
// initially, each element is a single array
int numArrays = pArray->count;
int arraySize = 1;
// create a temporary space for holding the merged arrays
Data *pWork = malloc(sizeof(Data) * pArray->capacity);
// merge the arrays until one sorted array is left
while (numArrays > 1){
// compute the number of pairs
const int numPairs = numArrays / 2;
// number of elements remaining to be sorted
int remaining = pArray->count;
// merge each pair of consecutive (sub)arrays into a single array
for (int i = 0; i < numPairs; i++){
// call merge(pArray, pWork, 2 * i * arraySize, arraySize) to merge the two sorted (sub)arrays into a sorted larger array
// decrease the number of remaining elements by the number of merged elements
merge(pArray, pWork, 2 * i * arraySize, arraySize);
--remaining;
}
// copy the unmerged values
merge(pArray, pWork, 2 * numPairs * arraySize, remaining);
// update numArrays: divide by 2 and round up
numArrays = (numArrays + 1) / 2;
// double the arraySize
arraySize = arraySize * 2;
// swap working space with array
Data *tmp = pArray->values;
pArray->values = pWork;
pWork = tmp;
}
// free the working space
free(pWork);
}
```
### Activity 9: Test merge sort
Test your implementation of merge sort using the test code (see the file test_sort.c) provided on Blackboard. Note that the file depends on other header and implementation files, which are provided on Blackboard as well. Use different sorting criteria, and fix any bugs that may show up.
```c=
int compareE(const Data *pFirst, const Data *pSecond) {
return compare_energy(pFirst, pSecond);
}
int compareP(const Data *pFirst, const Data *pSecond) {
return compare_popularity(pFirst, pSecond);
}
```
---
---
---


### Activity 10: Binary search - step by step
Consider the following example array of numbers, which consists of 22 values:

Fill in the table below to describe each step of the binary search algorithm, when searching for the value of 42 in the array (the first step is already given). Note that the notation \[a..b) indicates the range of values including a, but excluding b.

### Activity 11: Implement binary search
```c
int binarySearch(const Array \*pArray, const Data \*pValue)
{
int lpos = 0;
int rpos = pArray->count;
while (lpos < rpos)
{
// search the interval \[lpos, rpos)
// (lpos is included in the interval, rpos is not)
// determine mid position in \[lpos, rpos)
int mid = (lpos + rpos) / 2;
// compare element at middle position against value searched for
int ordering = compare(&pArray->values\[mid\], pValue);
if (ordering > 0)
{
// must continue search in interval \[lpos, mid)
\TODO:[complete the code!\]
}
else if (ordering < 0)
{
// must continue search in interval \[mid + 1, rpos)
\TODO:[complete the code!\]
}
else
{
// value is found at index mid!
return mid;
}
}
// we didn't find the value, so stop and indicate
// to caller by returning -1.
return -1;
}
```
---
* Answer:
```c=
int binarySearch(const Array *pArray, const Data *pValue)
{
int lpos = 0;
int rpos = pArray->count;
while (lpos < rpos)
{
// determine mid position in [lpos, rpos)
int mid = (lpos+rpos)/2;
// compare element at middle position against value searched for
int ordering = compare(&pArray->values[mid], pValue);
if (ordering < 0) {rpos=mid;}
else if (ordering > 0) {lpos=mid;}
else {return mid;}
}
// we didn't find the value, so stop and indicate
return -1;
}
```
### Activity 12: Test binary search
Test your implementation of the binary search algorithm by writing a program that:
1. loads a CSV file containing several Spotify data records into an array,
2. sorts the array using either selection sort or merge sort,
3. searches for a particular element,
4. reports the element (if it was found).
---
* Answer:
```c=
int main()
{
Array array;
array_init(&array, 10000);
if (!parse_csv(&array, "data_1000.csv"))
{
return 1;
}
printf("%d records read\n", array.count);
selection_sort(&array);
printf("Contents:\n");
for (int i = 0; i < array.count; i++)
{
printf("[%d] ", i);
data_print(&array.values[i]);
}
Data value = array.values[99];
int found = binarySearch(&array, &value);
printf("\nSearched for: %d\n", found);
return 0;
}
```
---
---
---


### Activity 13: Time complexity
Finding the greatest element in an array of 1 million elements surely takes longer than finding the greatest among 1000 elements. How much longer it takes to process the larger array depends on the algorithm that is used. Below two algorithms are given to find the maximum element in an array. Study the two algorithms, and answer the following questions:
* Which of these two algorithms is more efficient? In your explanation, describe, for each of the two algorithms, how it scales, in the worst case, as the size of the input (i.e. the number of elements in the array) increases.
* What is the time complexity of each of the two algorithms? Give the answer in "Big O-notation".
```c
// finds the max. element in array, assumes pArray->count > 0
int findMax(const Array *pArray)
{
// assume that max is at index 0
int max = pArray->values[0];
// update max by iterating over array
for (int i = 1; i < pArray->count; i++)
{
if (pArray->values\[i\] > max)
max = pArray->values[i];
}
return max;
}
// finds the max. value in array, assumes pArray->count > 0
int findMax2(const Array *pArray)
{
for (int i = 0; i < pArray->count; i++)
{
// assume that the max is at index i
int indexIContainsTheMax = 1;
for (int j = 0; j < pArray->count; j++)
{
if (pArray->values[i] < pArray->values[j])
{
// pArray->values\[i\] can't be the max
indexIContainsTheMax = 0;
break;
}
}
// the max value must indeed be at index i
if (indexIContainsTheMax)
return pArray->values[i];
}
}
```
* Answer: findMax has a single loop that iterates over the n elements of the array, so it scales linearly, and has a time complexity of O(n). The other function, findMax2, has two nested loops - both the inner and outer loop iterate over all n elements of the array, so this function scales quadratically, and has a time complexity of O(n 2 ).
---
---
---

### Activity 14: Compare merge sort and selection sort
Download the file sort_eval.c from Blackboard. The main function in this file loads records from a CSV file, and then runs either merge sort or selection sort multiple times. Note that the file depends on other header and implementation files, which are provided on Blackboard as well.
Run the main function using different input files (i.e., with different numbers of records), to measure how the runtime of merge sort and selection sort scale as the size of the input increases (try to run your code in Release mode to get better performance). Log your results (input size, runtimes of the two algorithms) in a table.
Which of the two sorting algorithms is more efficient? Based on your results, can you say anything about the time complexity of the two algorithms?
---
* Answer:

As seen in the run times above the merge sort algorithm is much more efficient than select sort. So much so that once the file size goes up to 100,000 the run time of select sort is almost **half a minute** compared to less than **half a second** with the merge sort. It turns out that due to the nested for loops in the select sort algorithm the O notation is **O(n^2)** making it radically inefficient for large sized arrays. On the other hand the merge sort turns out to be getting more and more efficient the bigger the array becomes. Very interesting but also understandibe with the divide and conquer technice. For this reason the algorithm is labled as **O(log n)**.



---
---
---

### Activity 15: Compare naive search and binary search
Compare the binary search algorithm with the "linear" search algorithm listed below.
```c=
int linearSearch(const Array \*pArray, const Data \*pValue)
{
for (int i = 0; i < pArray->count; i++)
{
if (compare(&pArray->values\[i\], pValue) == 0)
return i;
}
return -1;
}
```
Suppose you would use both functions (i.e., binary and linear search) to search an array for a particular element. Which of the two functions will be more efficient? What are the time complexities of the two functions? How do the time complexities change if the functions would be applied to a linked list rather than an array?
* Answer: Binary search will be more efficient than the linear search, because binary search halves the search space at each step. The time complexity of linear search is O(n), because in the worst case it must search the entire array. The time complexity of binary search is O(log n). In case a linked list would be used, then both linear and binary search will be O(log n), because a linked list does not offer random access.
## Week 4 - Stacks and Queues

### Activity 1: LIFO vs FIFO
Use the internet to find out what the acronyms LIFO and FIFO mean. Why is a stack sometimes called a LIFO memory buffer?
Suppose you would be able to add / remove items from both the bottom and the top of the stack. Would this give you FIFO access to the elements?
* Answer:
* LIFO = Last in, First Out.
* FIFO = First In, First Out.
* If you would have access to both ends of the stack, then you essentially have a queue, which offers FIFO (but also LIFO, it just depends on how you access the elements) access.
---
---
---

### Activity 2: Complete the stack implementation
Locate the header file stack.h and associated source file stack.c in the files downloaded from Blackboard.
These files contain an incomplete implementation of a stack data structure, which stores float elements. Lines that need to be completed are marked with "\[complete the code!\]". Complete the implementation by implementing the marked lines.
The main function (located in the main.c file) tests whether your implementation of the stack data structure is correct. Make sure that all tests pass before you continue to the next activity.
```c=
void stack_push(Stack *p_stack, float value) {
stack_ensure_capacity(p_stack, p_stack->count + 1);
// add the value, update the count
++p_stack->count;
p_stack->pValues[p_stack->count] = value;
}
float stack_peek(Stack *p_stack) {
if (p_stack->count > 0) {
// return the top value, decrement count
return p_stack->pValues[p_stack->count];
}
else {
// no values available, indicate error
return FP_NAN;
}
}
float stack_pop(Stack *p_stack) {
if (p_stack->count > 0) {
// return the top value
--p_stack->count;
return p_stack->pValues[p_stack->count+1];
}
else {
// no values available, indicate error
return FP_NAN;
}
}
```
---
---
---


### Activity 3: Reverse Polish Notation
Implement the function process_rpn (a skeleton is already given) that processes an RPN string in the form of tokens (see the accompanying source code), and uses a stack to compute the result of the expression. The function must be able to process four different operators: addition (+), subtraction (-), multiplication (*), and division (/).
The function must print the actions it performs to the console’s output, in the following way:
* whenever it explicitly (i.e. not if it is caused by the processing of an operator) pushes a value V onto the stack, the program must print "push V ".
* whenever it pops a value from the stack, the program must print "pop".
* whenever an operator is processed, the program must print add, sub, mul, or div, for the respective operators +, −, ∗, and /.
* it must print the result of the computation as the last line
For example, when given the RPN expression "3 4 * 5 +", it must give the output:
push 3
push 4
mul
push 5
add
=
result: 17
The main function in the accompanying main.c file of this activity contains code to test your function.
Make sure that all tests pass before you continue to the next activity.
In your log book, record the output of your program for the following RPN expression:
**12 3 3 5 * 3 2 - / 20 4 / + 6 - * -**
---
Answer:
```c=
void process_rpn(const RPNSequence *sequence, Stack *pStack) {
for (int i = 0; i < sequence->count; i++) {
Token token = sequence->tokens[i];
if (token.type == VALUE) {
// push to the stack
// output a "push xxx" to the console
// TODO [complete the code!]
stack_push(pStack, token.value);
printf("push %.2f\n", token.value);
}
else if (token.type == OPERATOR) {
// apply operation to top 2 operands on the stack, push result
// output a "mul", "add", "div", or "sub" to the console
// TODO [complete the code!]
float value1 = stack_pop(pStack);
float value2 = stack_pop(pStack);
float product;
if(token.operator=='+'){
product=value2+value1;
printf("add -> %.2f+%.2f = %.2f\n",value2,value1,product);
}
else if(token.operator=='-'){
product=value2-value1;
printf("sub -> %.2f-%.2f = %.2f\n",value2,value1,product);
}
else if(token.operator=='*'){
product=value2*value1;
printf("mul -> %.2f*%.2f = %.2f\n",value2,value1,product);
}
else {
product=value2/value1;
printf("div -> %.2f/%.2f = %.2f\n",value2,value1,product);
}
stack_push(pStack, product);
}
}
}
```
**Output:**

---
---
---


### Activity 4: Application of queues
Search the internet to find out what the typical applications of (double-ended) queues are. Try to come up with at least three examples, and log these in your log book.
* Answer: Queues are very helpful in cases of:
* process schedules
* message queues for communication
* data buffers for I/O devices
Most of the time having limited resources and having to share it across multible processes requires setting up a queue to manage incoming requests.
---
---
---


### Activity 5: Implement a Queue using a linked list
The files ll\_queue.h and ll\_queue.c contain a partial implementation of a linked list-based queue. Finish the implementation by completing the lines marked with \[complete the code!\].
The main function in the main.c file contains code that will test the implementation. Make sure that all tests pass before moving on to the next activity
```c=
void llq_push_back(LLQueue *p_queue, int value) {
LinkedListNode *node = (LinkedListNode*) malloc(sizeof(LinkedListNode));
node->value = value;
node->next = NULL;
node->prev = p_queue->tail;
if (p_queue->head == NULL) {
p_queue->head = node;
}
else {
p_queue->tail->next = node;
}
p_queue->tail = node;
p_queue->count++;
}
void llq_push_front(LLQueue *p_queue, int value) {
LinkedListNode *node = (LinkedListNode*) malloc(sizeof(LinkedListNode));
node->value = value;
node->prev = NULL;
node->next = p_queue->head;
if (p_queue->tail == NULL) {
p_queue->tail = node;
}
else {
p_queue->head->prev = node;
}
p_queue->head = node;
p_queue->count++;
}
int llq_back(const LLQueue *p_queue) {
// return value at end / rear of queue
LinkedListNode *back = p_queue->tail;
int value = back->value;
return value;
}
int llq_front(const LLQueue *p_queue) {
// return value at front / head of queue
LinkedListNode *back = p_queue->head;
int value = back->value;
return value;
}
int llq_pop_back(LLQueue *p_queue) {
LinkedListNode *back = p_queue->tail;
p_queue->tail = p_queue->tail->prev;
if (p_queue->tail != NULL) {
// break connection from queue's tail to back
p_queue->tail->next = NULL;
}
else {
// also set head to NULL
p_queue->head = NULL;
}
int value = back->value;
// release memory
free(back);
p_queue->count--;
return value;
}
int llq_pop_front(LLQueue *p_queue) {
LinkedListNode *front = p_queue->head;
p_queue->head = p_queue->head->next;
if (p_queue->head != NULL) {
// break connection from queue's head to front
p_queue->head->prev = NULL;
}
else {
// also set tail to NULL
p_queue->tail = NULL;
}
int value = front->value;
// release memory
free(front);
p_queue->count--;
return value;
}
```
---
---
---


### Activity 6: Implement a Queue using an array
The files array\_queue.h and array\_queue.c contain a partial implementation of a queue based on a circular array. Finish the implementation by completing the lines marked with \[complete the code!\].
Note that the partially given implementation does not store the index of the tail in a separate variable. The reason for this is that the index of the tail can be computed from the index of the head, the number of elements in the queue, and the capacity of the backing array.
The main function in the main.c file contains code that will test the implementation. Make sure that all tests pass before moving on to the next activity.
```c=
void aq_push_back(ArrayQueue *p_queue, int value) {
// make sure there's enough space in the queue
aq_ensure_capacity(p_queue, p_queue->count + 1);
// append value & update count (hint: use the modulo (%) operator!)
if(p_queue->count!=0){p_queue->idx_tail=(p_queue->idx_tail+1)%p_queue->capacity;}
p_queue->p_values[p_queue->idx_tail] = value;
p_queue->count++;
}
void aq_push_front(ArrayQueue *p_queue, int value) {
// make sure there's enough space in the queue
aq_ensure_capacity(p_queue, p_queue->count + 1);
// prepend value, update idx_head & count, make sure that idx_head does not become negative!
if(p_queue->count!=0){p_queue->idx_head=(p_queue->idx_head-1)%p_queue->capacity;}
if(p_queue->idx_head==-1){p_queue->idx_head=p_queue->capacity-1;}
p_queue->p_values[p_queue->idx_head] = value;
p_queue->count++;
}
int aq_back(const ArrayQueue *p_queue) {
// return value at end / rear of queue
return p_queue->p_values[p_queue->idx_tail];
}
int aq_front(const ArrayQueue *p_queue) {
// return value at front / head of queue
return p_queue->p_values[p_queue->idx_head];
}
int aq_pop_back(ArrayQueue *p_queue) {
int value = aq_back(p_queue);
p_queue->idx_tail=(p_queue->idx_tail-1)%p_queue->capacity;
if(p_queue->idx_tail==-1){p_queue->idx_tail=p_queue->capacity-1;}
p_queue->count--;
return value;
}
int aq_pop_front(ArrayQueue *p_queue) {
int value = aq_front(p_queue);
p_queue->idx_head=(p_queue->idx_head+1)%p_queue->capacity;
p_queue->count--;
return value;
}
```
---
---
---


### Activity 7: Algorithm analysis
What is the time complexity of the naive approach given above? That is, how will the runtime of this algorithm scale as the size of the input increases? Does it scale (roughly) linearly, quadratically, or is there are more complex scaling?
Give your answer in Big O notation.
* Answer: The naive approach contains two nested loops. The outer loop iterates all n elements of the array. The inner loop performs more or less a constant number of iterations (this depends on the values in the array). That means that the inner loop can be regarded as taking a constant amount of time.
* Consequently, the runtime scales roughly linearly, worst-case time complexity is O(n).
```c=
int find_sum_naive(const Array *array, int target, int *count) {
for (int i = 0; i < array->count; i++) {
int sum = 0;
for (int j = i; j < array->count; j++) {
sum += array->values[j];
if (sum == target) {
// found!
*count = j - i + 1;
return i;
}
else if (sum > target)
break;
}
}
return -1;
}
```
---
---
---



### Activity 8: Solve the problem using the linked list-based queue
Based on the approach sketched abouve, implement the function find\_sum\_llqueue, which uses the linked list-based queue that you’ve created earlier to solve the problem.
The provided main function will verify the correctness of your algorithm. Make sure that it works correctly before proceeding to the next activity.
```c=
int find_sum_llqueue(const Array *array, int target, int *count) {
LLQueue q;
llq_init(&q);
int sum = 0;
for (int i = 0; i < array->count; i++) {
// add element to queue, update sum, remove elements while sum greater than target
llq_push_back(&q,array->values[i]);
sum += array->values[i];
while(sum>target){
sum-=llq_pop_front(&q);
}
// if sum == target, set *count and return index
if (sum == target) {
// cleanup
llq_free(&q);
// set *count, return index of sequence
*count = q.count;
return i - q.count + 1;
}
}
// cleanup
llq_free(&q);
return -1;
}
```
### Activity 9: Solve the problem using the array-based queue
Implement the function find\_sum\_arrayqueue, which uses the array-based queue that you’ve created earlier to solve the problem.
Again, the provided main function will verify the correctness of your algorithm. Make sure that it works correctly before proceeding to the next activity.
```c=
int find_sum_arrayqueue(const Array *array, int target, int* count) {
ArrayQueue q;
aq_init(&q, 100);
int sum = 0;
for (int i = 0; i < array->count; i++) {
// add element to queue, update sum, remove elements while sum greater than target
aq_push_back(&q, array->values[i]);
sum+=array->values[i];
while(sum>target){
sum-=aq_pop_front(&q);
}
// if sum == target, set *count and return index
if (sum == target) {
// cleanup
aq_free(&q);
// set *count, return index of sequence
*count = q.count;
return i - q.count + 1;
}
}
// cleanup
aq_free(&q);
return -1;
}
```
---
---
---

### Activity 10: Benchmark the three algorithms
In the provided main function, uncomment the line in which the benchmark function is called. This will let the program perform a benchmark on a number of files with increasingly many numbers.
For each of the three algorithms, log the reported runtimes in your log book. Note that the times reported by the program are averages taken over many runs. If the program needs too much time for the larger files, it may be necessary to adjust the number of runs per
**The Daytona 500 Leaderboard:**

### Activity 11: Time complexity
What are the time complexities of the three algorithms? In other words, how do the runtimes of these algorithms scale as the size of the input increases? Do they scale (roughly) linearly, quadratically, or are there are more complex scalings? Motivate your answer using the runtime measurements provided by the main function in the main.c file.
* Answer: The three algorithms all have linear time complexity: O(n). The algorithm using a linked list will have more overhead due to the release and allocation of memory, which results in a slightly lower performance
### Activity 12: Explain the impact of the chosen data structure
Time complexity will tell you which algorithm scales best, but it does not tell you which algorithm performs best. Which algorithm needs the least time to solve the problem? Can you explain the differences in the runtimes of the three algorithms?
* Answer: See previous answer: linked list = more overhead due to the use pointers, circular array = fastest, because of a better locality of reference. Naive approach is slower because it recomputes the sum for each element in the array.
## Week 5 - Sets

### Activity 1: Find out: set membership
Name at least five other examples of membership testing occurring in real life, that follow the same principle as the examination list check (Membership is based not on some property of an object but on belonging to a set of items / a list).
* Answer:
* checking if a person is able to access a dataset
* checking whether a person belongs to a family
* checking for class prerequsites
* checking for membership in a club
* checking for participation in a newsletter
---
---
---

### Activity 2: Activity: comparing floating point numbers
Test if the double numbers a and b, initialized as shown below, are equal using the == comparison:
```c
double a = 41.0 + 0.5 + 0.2 + 0.2 + 0.1;
double b = 43.0 - 0.5 - 0.2 - 0.2 - 0.1;
```
Find on the internet what the reasons are for the results that you observe. Is there a better way to compare floating point numbers?
Implement a function equals\_dbl that can correctly compare floating points numbers, like those obtained with the expressions above.
```c
_Bool equals_dbl(double d1, double d1);
```
---
* Answer:
```c=
int main() {
double a = 41.0 + 0.5 + 0.2 + 0.2 + 0.1;
double b = 43.0 - 0.5 - 0.2 - 0.2 - 0.1;
_Bool equal = (double a == double b);
printf("a=%.20f\n", a);
printf("b=%.20f\n", b);
printf("a and b are %s", equal? "equal" : "not equal");
return 0;
}
```

Due to the tiny rounding error occuring in the first implementation, the best solution for this problem is to take the absolute value of the differance of the two double varibales and comapare this difference to 1e-9. This tiny number will ensure that the values are still very very close together (almost if not equal) however will not fail due to a rounding error.
```c=
_Bool equals_dbl (double d1, double d2){
if (abs(d1 - d2) < 1e-9) return true;
else return false;
}
```
---
---
---

### Activity 3: Comparing compound data types
Implement the function equals\_prs that compares two instances of the Person struct. Use the strcmp defined in the header string.h for comparing name’s.
```c
typedef struct{
const char* name;
const int yearOfBirth;
} Person;
_Bool equals_prs(const void* obj1, const void* obj2);
```
The function equals takes two void pointers instead of Person*. Passing void pointers is the only way in C to write generic code that can work with different types. To be able to compare two Person instances, you’ll have to first cast those pointers to their correct types:
```c
const Person* p1 = (const Person*)obj1;
```
---
* Answer:
```c=
_Bool equals_prs(const void* obj1, const void* obj2){
const Person* p1 = (const Person*)obj1;
const Person* p2 = (const Person*)obj2;
if(p1->yearOfBirth==p2->yearOfBirth&&!strcmp(p1->name,p2->name)) return true;
return false;
}
```
---
---
---

### Activity 4: Function prototypes
Write the prototypes of the add, remove and contains functions for items of type double and a set data structure defined as:
```c
typedef struct {
double* data;
int size; // number of items in this set
int capacity; // size of the \`data\` array
} Set;
```
---
* Answer:
```c=
typedef struct {
double* data;
int size; // number of items in this set
int capacity; // size of the `data` array
} Set;
_Bool contains(Set* set, double value);
_Bool add(Set* set, double value);
_Bool sub(Set* set, double value);
```
---
---
---


### Activity 5: Function implementations
Implement the Set functions add, remove and contains. There can be more items than DEFAULT_CAP in the Set, you have to monitor this and resize the data array as needed. You can add any functions that you want to facilitate this.
Use the Standard C Library functions memcpy and memmove to move and copy blocks of memory, instead of doing manual reassignment.
```c=
_Bool contains(Set* set, double value){
for(int i=0; i<set->size; i++){
if(set->data[i]==value) return true;
}
return false;
}
_Bool add(Set* set, double value){
if(contains(set, value) || set==NULL || set->size==DEFAULT_CAP) return false;
if(set->size<set->capacity) set->data[set->size] = value;
else{
set = realloc(set,set->capacity+1);
set->data[set->size] = value;
}
set->size++;
return true;
}
_Bool sub(Set* set, double value){
if(!contains(set, value)) return false;
for(int i=0; i<set->size; i++){
if(set->data[i]==value){
set->size--;
for (int j = i; j < set->size; j++)
{
set->data[j] = set->data[j+1];
}
return true;
}
}
}
```
---
---
---

### Activity 6: Time complexity of remove and add
What is the worst case time complexity of the functions add and remove in the current implementation? Support your answers with arguments.
* Answer: The add function calls the contains function, which is O(n). In case the element is not contained in the array, the add function appends the element to the array, which is O(1). Consequently, the time complexity of the add function is O(n). The remove function iterates over all the elements, and shifts elements to the left. This gives a total worst-case time complexity of O(n) as well.
---
---
---

### Activity 7: Binary search
On the schematic representation of an array that contains sixteen sorted integers, mark the values of the lo and hi variables after each iteration of the while loop in the contains function as implemented above. The starting position is indicated with the letter S. The successive iterations are numbered using Roman numerals.
Value to find: 44:

Value to find: 11:

What is the time complexity of binary search?
* Answer: Binary search has a worst-case time complexity of O(log n).
---
---
---

### Activity 8
```c
int indexOf(Set* set, double needle){
// a few special cases
if (set->size == 0)
return 0;
if (set->size == 1)
return (set->data\[0\] > needle)? 0 : 1;
// the rest of the implementation comes here:
}
// assume that set contains the following double numbers:
// {1, 5, 8, 9, 14, 15, 16, 18, 27, 35, 36, 43, 44, 46, 59, 64}
#include <assert.h>
assert(0 == indexOf(set, 1.0));
assert(0 == indexOf(set, 0.0));
assert(16 == indexOf(set, 65.0));
assert(15 == indexOf(set, 64.0));
assert(15 == indexOf(set, 63.0));
assert(1 == indexOf(set, 5.0));
assert(2 == indexOf(set, 6.0));
assert(2 == indexOf(set, 7.0));
assert(2 == indexOf(set, 8.0));
assert(3 == indexOf(set, 9.0));
```
---
* Answer:
```c=
int indexOf(Set* set, double needle){
// a few special cases
if (set->size == 0) return 0;
if (set->size == 1) return (set->data[0] > needle)? 0 : 1;
if (set->data[set->size-1]<needle) return set->size;
int lo = 0, hi = set->size - 1, mid = lo + (hi - lo) / 2;
while (lo <= hi){
mid = lo + (hi - lo) / 2;
if (set->data[mid] < needle) lo = mid + 1;
else if (set->data[mid] > needle) hi = mid - 1;
else return mid;
}
return (set->data[mid] > needle)? mid : mid+1;
}
```
---
---
---

### Activity 9: Functions add and remove for the sorted array
Implement the functions add and remove for the Set variant that maintains a sorted data array. Use the indexOf function to simplify your implementations.
Use the Standard C Library functions memcpy and memmove to move and copy blocks of memory, instead of doing manual reassignment.
```c=
_Bool add(Set* set, double value){
if(contains(set, value) || set==NULL || set->size==DEFAULT_CAP) return false;
int index = indexOf(set, value);
if(set->size<set->capacity){
for(int i=set->size+1; i>index; i--){
set->data[i] = set->data[i-1];
}
set->data[index] = value;
}
else{
if(set->size+10<DEFAULT_CAP) set = realloc(set,set->capacity+10);
else set = realloc(set,set->capacity+1);
for(int i=set->size+1; i>index; i--){
set->data[i] = set->data[i-1];
}
set->data[index] = value;
}
set->size++;
return true;
}
_Bool sub(Set* set, double value) {
if (!contains(set, value)) return false;
set->size--;
for (int i = indexOf(set, value); i < set->size; i++) {
set->data[i] = set->data[i + 1];
}
return true;
}
```
### Activity 10: Time complexity of the binary search implementation
What is the worst, best and average/ expected time complexity of the function add, remove and contains in this new implementation? Why?
* Answer: The worst-case time complexities of add and remove are both O(n). This is due to the fact that elements must be shifted to make place for the new element or to fill the gap for the deleted element - in the worst case, all elements must be shifted. The worst-case time complexity of the contains function is O(log n), because it uses binary search.
---
---
---





### Activity 11: Functions indexOf and contains
Implement the function gs\_indexOf that works with GenericSet. This functions should work as previously – return the index of needle if it’s found in the set or the index of an insertion point that maintains the sorted ordering if needle is not in the set. Remember to use gs->cmp(...) for comparing elements and to use the size of an element for array indexing.
```c
int gs_indexOf(const GenericSet* gs, const void* needle);
// contains implementation that uses gs_ondexOf
_Bool gs_contains(GenericSet* gs, const void* needle){
void* element = gs_element_at(gs, gs_indexOf(gs, needle));
if (element && gs->cmp(element, needle) == 0)
return true;
return false;
}
```
### Activity 12: Functions add and remove
Finally, it’s time to implement the functions gs\_add and gs\_remove. They both should use gs\_indexOf.
```c
// Adds item to the set. Returns true if item was added and false if it was already a member of the set
_Bool add(GenericSet* gs, void* item);
// Removes item from the set. Returns true if item was removed and false if it wasn't a member of the set
_Bool remove(GenericSet* gs, void* item);
```
## Week 6 - Dictionaries/Maps

### Activity 1: Activity: count letters in a string
Implement a function countLetters. This functions takes an array freqs of (26) unsigned←- longs and a null-terminated string as its arguments. It updates the freqs array by adding the counts of the letters in the string to it. It returns the total number of letters it counted.
Counting is not case-sensitive. Remember to ignore all non-letter characters in str.
```c
unsigned long countLetters(unsigned long counts\[static 26\], const char* str);
int main(void){
unsigned long counts\[26\] = {0};
const char* text = "Alice in Wonderland.";
(void)countLetters(&counts\[0\], text);
}
```
After executing the above piece of code the counts array will have the following values stored in it:
[2, 0, 1, 2, 2, 0, 0, 0, 2, 0, 0, 2, 0, 3, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
---
* Answer:
```c=
unsigned long countLetters(unsigned long counts[static 26], const char* str){
for(int i=0; i<strlen(str); i++){
counts[tolower(str[i]) - 'a']++;
}
return *counts;
}
```
---
---
---

### Activity 2: Recognizing languages
**Part 1:**
Write a function makeSignature that creates a six-letter language signature from a letter counts array. A language signature consists of the six most frequent letters ordered from the most to the least frequent.
```c=
const char* makeSignature(unsigned long counts\[static 26\]){
static char signature\[7\];
// write the implementation
signature\[6\] = '\\0'; // null terminator
return &signature\[0\];
}
```
Notice that this function returns a pointer to an array defined locally within a function. Normally, this is illegal and dangerous. However, signature is a static variable, which means that’s it allocated in the global program memory and guaranteed to exist throughout the lifetime of the program. The danger of using such a construct is that the signature returned by makeSignature is valid only until the next time this function is called. Moreover, returning a pointer from a function usually has the meaning of transferring ownership of a dynamically allocated object (like for the function processFile). Here, the makeSignature function remains the owner of signature (and it’s illegal to call free on the pointer returned by this function). Always document such uncommon behaviors (they also happen in the standard C library, see for example the function local_time: https://en.cppreference.com/w/c/chrono/localtime.)
**Part 2:**
Use the function matchLanguage (LanguageReconizer.h header) to automatically recognize the languages the aliceX.txt files are written in. What are the results?
```c=
const char* makeSignature(unsigned long counts[static 26]){
static char buffer[7];
for(int i=0; i<6; i++){
int biggest_index = 0;
for(int j=1; j<ALPHABET_SIZE; j++){
if(counts[biggest_index]<counts[j]){
biggest_index=j;
}
}
buffer[i] = biggest_index + 'a';
counts[biggest_index]=0;
}
buffer[6] = '\0';
return &buffer[0];
}
```
---
---
---

### Activity 3: Find out: dictionaries around us
Name at least five (other) real-life examples of key-value relationships that can be modeled as a dictionary
* Answer:
* each phone number is mapped to person
* each address is mapped to a landlord
* each city is mapped to a country
* each class is mapped to a teacher
* each book is to an author
---
---
---


### Activity 4: Find out: Dictionaries in other languages
Find out how a dictionary type is called in at least five other programming languages (including Python and C#). For two of them list the functions that: insert a key-value pair, delete a key, check if a key exists and retrieve the value associated with a key, writing code examples of how it’s done.
* Answer:
* **Java's Hashtable:**
* Hashtable<type, type> dict = new Hashtable<type, type>();
* dict.put(key,val);
* dict.remove(key);
* dict.get(key);
* **Python's Dictionary:**
* dict = {'jack': 4098, 'sape': 4139}
* dict['guido'] = 4127
* del dict['sape']
* dict['jack']
* **C#'s Dictionary/Hashtable**
* **Rust's HashMap**
* **R's RDict**
---
---
---




### Activity 5: ensureCapacity function
Implement the \_ensureCapacity function with the following signature:
```c
_Bool _ensureCapacity(Counter counter[static 1], const int minimumCapacity);
```
This function takes an argument minimumCapacity and, if the capacity of the counter passed to it is less than this value, resizes this counter’s data array to fit at least minimumCapacity items. This function returns true if resizing succeeded. If resizing failed (for instance, if allocating new memory failed) it returns false.
_ensureCapacity should use the realloc function of the standard library.
The initial check for whether there is enough capacity can be done as follows:
```c
static const int GROWTH\_FACTOR = 2;
int new\_capacity = counter->capacity;
if (new\_capacity == 0){
new\_capacity = DEFAULT\_CAPACITY;
}
while (minimumCapacity > new\_capacity){
new\_capacity *= GROWTH\_FACTOR;
}
if (new_capacity > counter->capacity){
// resize the data array and modify the appropriate counter's variables
}
```
---
* Answer:
```c=
_Bool _ensureCapacity(Counter counter[static 1], const int minimumCapacity){
int new_capacity = counter->capacity;
if (new_capacity == 0){
new_capacity = DEFAULT_CAPACITY;
}
while (minimumCapacity > new_capacity){
new_capacity *= GROWTH_FACTOR;
}
if (new_capacity > counter->capacity){
counter = (Counter*) realloc(counter, sizeof(Counter*) * new_capacity);
counter->capacity = new_capacity;
if(counter!=NULL){
return true;
}
else return false;
}
return true;
}
```
### Activity 6: insertAt function
Now that _ensureCapacity is ready, it’s time to implement the insertAt function. Notice, that the first thing this function does is calling _ensureCapacity to resize the data array as needed:
```c
_Bool _insertAt(Counter counter[static 1], const int index, const char* key, const unsigned long value){
if (_ensureCapacity(counter, counter->size + 1)){
...
}
return false; //ensuring capacity failed
}
```
In the remainder of the function it must:
1. Create a new Pair with the passed key and value.
2. Shift the items of the data array in the range [index...counter->size - 1] by one position to make space for the inserted item. Use memmove to do this.
3. Insert the just created pair into the freed spot of data.
4. Manipulate the counter’s variables if needed. Similarly to _ensureCapacity this function returns false if the (insertion) operation succeeded and false otherwise.
---
* Answer:
```c=
static _Bool _insertAt(Counter counter[static 1], const int index, const char* key, const unsigned long value){
if (_ensureCapacity(counter, counter->size + 1)) {
Pair new_entry = makePair(key, value);
if(counter->size>0) memmove(&counter->data[index + 1], &counter->data[index], sizeof(Pair[counter->size - index]));
counter[index].data = &new_entry;
counter->size++;
return true;
}
return false;
}
```
### Activity 7: Activity: increment function
Finally, implement the increment function. This function increments the count for an item with the passed key. If such an item doesn’t belong to the counter it is inserted with the count of 1. The function returns the new count for an item or 0 if increment hasn’t succeeded (this can happen if the key didn’t exist and insertion failed). Use the functions getOrDefault and insert in your implementation – no other function calls are needed. increment has the following signature:
```c
unsigned long increment(Counter counter[static 1], const char* key);
```
---
* Answer:
```c=
unsigned long increment(Counter counter[static 1], const char* key){
KeyIndex index = indexOf(counter, key);
if (index.found) {
return ++counter[index.index].data->value;
}
else{
_insertAt(counter, index.index, key, 1);
return 1;
}
return 0;
}
```
---
---
---

### Activity 8: Find out: function strtok
It’s time for you to find out how the strtok function works! Describe what you discovered. Moreover, write down explanations of what each line of code in the snippet above does.
* Answer: The **strok** seperates the char array pointer 'line' given as an argument based on the established 'sep' chars in the first line. The token is an individual word from the line and as long as the line has words it calls the increment function with the token then sets the token to NULL.
---
---
---


### Activity 9: How many words
**Part 1**
How many time are the following words occurring in the "alice0.txt" document: "Alice", "the", "rabbit"?
**Part 2**
How many unique words are there in the "alice0.txt" document?
## Week 7 - Hashing and Hashmaps

### Activity 1: Find out: cost of maintaining order
RecordPairs are stored in a sorted array. This array has a capacity of 8 and is initially empty. How many times a relocation of items in this array will be needed if the elements are being inserted with ids in the following order: [8609, 6348, 4992, 5839, 1622, 5450, 7197, 4827]? How many RecordPairs will be moved in total during this process?
For instance, for the ids: [3, 2, 1] items will need to be relocated two times and precisely three RecordPair will be moved:
1. Insertion of 3 – array after: [3], no relocation needed.
2. Insertion of 2 – array after: [2, 3], one relocation consisting of moving one RecordPair to make space for the inserted item.
3. Insertion of 1 – array after: [1, 2, 3], one relocation consisting of moving two RecordPairs to make space for the inserted item.
---
* Answer: All values except the first one will cause a relocation, so seven relocations are triggered. The total number of moves is 19.
---
---
---


### Activity 2: Making numbers fit
What kind of transformations can you think of that can make the array needed to store elements with ids [8609, 6348, 4992, 5839, 1622, 5450, 7197, 4827] smaller? To get you started: an obvious choice is to subtract the minimum value (1622) from each index. This will bring the indices in the range of 0 . . . 6987 and reduce the required capacity to 8609-1622+1=6988 elements.
What is the time complexity if the insertion, removal and lookup operations of a map, where keys are transformed into indices directly?
* Answer: Insertion, removal and lookup all are O(1), because a key is transformed into an index, and indexing an array is O(1) (random access).
---
---
---

### Activity 3: Find out: calculate the moduli
Calculate the remainders for divisions of the ids: [8609, 6348, 4992, 5839, 1622, 5450, 7197, 4827] by 8, 32 and 31. What can you say about the results? Can indices obtained in this way be used for storing the elements with those ids in an array?

---
* Answer: This depends on the divisor - if we use mod 8 or mod 16, each id will be transformed to a unique index, but if we use mod 17, then there will be ids that map to the same index.
---
---
---


### Activity 4: Find out: hash functions
Find out and what hash functions are and describe their properties. Give a few examples of hash functions.
* Answer: Hash functions are used in order to turn big integers and string into smaller more practical intergers which can be used as indexes to store the values in an array. This can be done in many different ways for example the moduli calculations above would count as a hash function know as the division method. Another example would be the multiplication method.
---
---
---

### Activity 5: Implement FNV-1a
Implement the functions hash_fnv_gen and hash_fnv_str that calculate FNV hashes for generic data and null–terminated strings, respectively. Those functions use constants FNV_BASIS_OFFSET and FNV_PRIME – they are defined in source hash.c.
```c
// computes fnv-1a hash of data with given size
hash_t hash_fnv_gen(const void* data, size_t size);
// computes fnv-1a hash of null-terminated string
hash_t hash_fnv_str(const char* str);
```
Test your implementation with function test_fnv() declared in header tests.h.
The FNV algorithm is as follows:

---
* Answer:
```c=
hash_t hash_fnv_gen(const void* data_, size_t size){
hash_t hash = FNV_OFFSET_BASIS;
const unsigned char* ptr = (const unsigned char*)data_;
while (size--) {
hash ^= *ptr++;
hash *= FNV_PRIME;
}
return hash;
}
hash_t hash_fnv_str(const char* str){
hash_t hash = FNV_OFFSET_BASIS;
while (*str) {
hash ^= (unsigned char) *str++;
hash *= FNV_PRIME;
}
return hash;
}
```
---
---
---

### Activity 6: Implement hash-based indexing
Implement functions hash_index_str and hash_index_gen with the following signatures:
```c
// returns the modulo index of a binary key based on its FNV-1a hash
size_t hash_index_gen(const void* key, size_t size, size_t M);
// returns the modulo index of a string key based on its FNV-1a hash
size_t hash_index_str(const char* key, size_t M);
```
Those functions calculate an index at which an item with a given key will be stored in an array of size M. An algorithm for those functions is shown below:

---
* Answer:
```c=
size_t hash_index_gen(const void* data, size_t size, size_t M){
hash_t hash = hash_fnv_gen(data, size);
size_t index = hash % M;
return index;
}
size_t hash_index_str(const char* str, size_t M){
hash_t hash = hash_fnv_str(str);
size_t index = hash % M;
return index;
}
```
---
---
---






### Activity 7: Word counting
Play around with the hash\_maps target and modify the main function in main\_words.c to see how many times various words occur in the file alice0.txt by calling hm_getOrDefault. Check how those results compare with the previous weeks’ activity. Write down the counts of words Alice, the, and rabbit. Now check how many times the word it occurs. Try to find other words that are in the file but weren’t counted properly.
Alice: 400
the: 1685
rabbit: 2
it: 0
---
---
---

### Activity 8: Hash collisions
Modify function hm_increment to count how many times in total a hash collision occurs during processing of the text file alice0.txt. Next, include printf statements in this function to see for what word (key) a collision happens and what existing key (that’s already stored in the map) it collides with. Note the first five collision pairs (a key that cannot be inserted and a key that’s in the map at the conflicting index ). How many entries are stored in the map after processing the file?
Change the constant DEFAULT_CAPACITY in the simple map source to 3049. Perform the same experiments again and note the results. Has the total number of collisions changed?
Before:
collisions: 11553
After:
collisions: 5385
Alice: 400
the: 1685
rabbit: 2
it: 521
---
---
---




### Activity 9: Activity
The default data array size in the separate chaining implementation is 499. Run your program using this implementation to count the words. It will work correctly, however many slots will contain long chains of linked items. Modify your program so that after counting the words in file alice0.txt it prints the following information:
* The number of free and occupied slots in the data array.
* The load factor (count/size).
* The number of slots that contain chains of more than one item.
* The length of the longest chain in the data array.
* The keys that belong to the items in the longest chain.
* Note all of those results in your logbook. Can you find the DEFAULT_CAPACITY for which no hash collisions occur while processing alice0.txt? What is this value?
Size: 499
Count: 3873
---
---
---


### Activity 10: Implement rehashing (optional)
This activity might be frustrating, proceed with caution.
At the end of the hash\_map\_closed.c file, you’ll find a partially implemented rehash function. This function is triggered by insertion operations when the load factor exceeds 0.7. Analyze other functions to familiarize yourself with the implementation. Try to implement the rehashing algorithm. When done, process alice0.txt and check how many times rehash was triggered. Moreover, note the same numbers in your logbook as in the previous activity.