# <font color="#EBB471">Topic 5</font> Algorithms & Complexity

> Author: Eric Tsao (NTU BA)
> Inspector: Queven Liu (NTU IM)

## <font color="#EBB471">5.1</font> Algorithms

#### Introduction

[Programs = Algorithms + Data Structures](https://en.wikipedia.org/wiki/Algorithms_%2B_Data_Structures_%3D_Programs)

To design a program, choose data structures to store your data and ==choose algorithms to process your data==. "Data structures" is a course offered next semester, and "Algorithms" is a course offered in the first half of next year.

Simply put, our working definition of an algorithm today is: =="given a task, collect a series of steps and follow them; in the end, you complete the task or its subproblems."==

#### Selecting an algorithm

If a problem can be solved by multiple algorithms, which one is the best? Judge by:
1. correctness
2. low time complexity

#### The professor's example

Because this chapter focuses on ==improving logical thinking== rather than implementation details, I write the examples in ==pseudocode==, so that nobody gets distracted from the logic of the algorithms by code syntax, and everyone can ignore the details of the implementation.

![](https://i.imgur.com/VsozoGN.jpg)

Listing all prime numbers:

1. If you have no idea where to start, try the basic way:

```cpp=
input n
for i from 2 to n
    for j from 2 to i - 1
        if j divides i
            mark i as composite
    if i is not marked composite
        mark i as prime
```

2. Next, with a little arithmetic thinking, we discover that we can check fewer numbers:

```cpp=
input n
for i from 2 to n
    for j from 2 while j * j <= i
        if j divides i
            mark i as composite
    if i is still not marked composite
        mark i as prime
```

If i has a factor larger than sqrt(i), that factor must be ==paired with a cofactor smaller than sqrt(i)==, so it is enough to check divisors up to sqrt(i). Pick an arbitrary number and list its factor pairs, and you will understand this algorithm.
If you still do not see why it is enough to check divisors up to sqrt(i), this [YouTube video](https://www.youtube.com/watch?v=V0RTWy4XyCU) gives a more detailed explanation of the algorithm.

3. If we want to run even faster, we need to change our fundamental way of thinking:

```cpp=
given a Boolean array of length n + 1
initialize all elements to true  // assume every number is prime
for i from 2 to n
    if element i is true
        for j from 2 to (n / i)  // eliminate the composite multiples of i
            set element i * j to false
```

#### Summary

An algorithm is a step-by-step procedure for solving a problem. Given a task, we describe **what to do** at each moment to complete that task; it is a detailed description of actions such that each action is doable.

If you want to know more, check this page on [programming algorithms](https://www.geeksforgeeks.org/top-algorithms-and-data-structures-for-competitive-programming/); it covers common algorithms for competitive programming, such as trees, greedy, binary search, and so on.

---

## <font color="#EBB471">5.2</font> Recursion

#### Introduction & Definition

A function is recursive if it invokes itself, either directly or indirectly. Recursion is a natural way to cut a problem into ==subproblems==, particularly when the original problem involves much repetitive action or subproblems that look just like the original problem. Simply put, we only need to write one function, and it can solve a big problem. I will use a simple example, **finding the maximum**, to make the explanation clearer and easier to follow.

#### Finding the maximum

We want to find the maximum of n numbers stored in an array `A[0]`, `A[1]`, ..., `A[n - 1]`. Observe that if we find the maximum of `A[0]`, `A[1]`, ..., `A[n - 2]` and compare it with `A[n - 1]`, we get the answer.
Moreover, to find the maximum of `A[0]`, `A[1]`, ..., `A[n - 2]`, we can apply the very same action again, and this is exactly how standard recursion runs. Do not forget, though, that once the recursion reaches `A[0]` alone there is no further subproblem to solve; this is called the **stopping condition**, and without it the recursion would never end. Only one question remains: "How do we turn this algorithm into code?"

```cpp=
int findMax(int [], int);

int main() {
    ......
}

int findMax(int array[], int n){
    if(n == 1){  // stopping condition: one element is its own maximum
        return array[0];
    }
    int max = findMax(array, n - 1);  // maximum of the first n - 1 elements
    if(array[n - 1] > max){
        max = array[n - 1];
    }
    return max;
}
```

If you still do not fully grasp the main principle of recursion and why we use it, the next sample should answer those questions.

#### Computing factorials

To write a function that computes the factorial of n, we can follow the same steps as in the previous sample:
1. find the subproblem
2. think of the strategy
3. build a stopping condition

Notice that if we can compute the factorial of (n - 1), the problem is solved; that is our subproblem. For the strategy, we call the function again with the argument (n - 1) and multiply the return value by n, which gives the factorial of n. Finally, the stopping condition is n = 1, since the factorial of 1 is simply 1.

```cpp=
int factorial(int);

int main() {
    ......
}

int factorial(int n){
    if(n == 1){  // stopping condition
        return 1;
    }else{
        return factorial(n - 1) * n;
    }
}
```

#### When recursion is inefficient: Fibonacci

Why is it inefficient to use recursion here?

```cpp=
#include <iostream>
using namespace std;
// 1 1 2 3 5 8 13 21

int fibonacci(int);

int main() {
    ......
}

int fibonacci(int n){
    if(n == 1){
        return 1;
    }else if(n == 2){
        return 1;
    }else{
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}
```

The graph below shows the problem:

![](https://i.imgur.com/E4HvU2v.jpg =x300)

When we call the function for a large n, it must repeatedly re-call itself for the same smaller values of n, so the larger n is, the longer the recursion runs. Even though this algorithm is easy to understand, we do not advise using it; we recommend the iterative way instead:

```cpp=
#include <iostream>
using namespace std;
// 1 1 2 3 5 8 13 21

int fibonacci(int);

int main() {
    ......
}

int fibonacci(int n){
    if(n == 2 || n == 1){
        return 1;
    }
    int fib1 = 1, fib2 = 1;
    int fib3 = 0;
    for(int i = 2; i < n; i++){
        fib3 = fib1 + fib2;  // each value is computed exactly once
        fib1 = fib2;
        fib2 = fib3;
    }
    return fib3;
}
```

If we compare the two algorithms with Big O, the iterative way takes only O(n), while the recursive way needs O(2^n). We say the iterative way runs in ==polynomial time== (functions such as n, n^2, n^3, log n, ...), and the recursive way runs in ==exponential time== (functions such as 2^n, 3^n, ...); in general, exponential time is too inefficient.

#### When recursion is efficient: Tower of Hanoi

```cpp=
#include <cstdio>
using namespace std;

void hanoiTower(char, char, char, int);

int main() {
    char a = 'A';
    char b = 'B';
    char c = 'C';
    hanoiTower(a, b, c, 2);
}

void hanoiTower(char from, char via, char to, int n){
    if(n == 1){  // stopping condition
        printf("From %c to %c\n", from, to);
        return;
    }
    hanoiTower(from, to, via, n - 1);     // -- 1: move n - 1 disks out of the way
    printf("From %c to %c\n", from, to);  // -- 2: move the largest disk
    hanoiTower(via, from, to, n - 1);     // -- 3: move the n - 1 disks back on top
}
```

![](https://i.imgur.com/0xANk1X.jpg =x300)

It is a great example of using recursion: it not only enhances readability for programmers but also solves the problem efficiently.
---

## <font color="#EBB471">5.3</font> Searching and sorting

I use some code to demonstrate searching and sorting. The code in the book differs only in syntax and some customized printing; the algorithms are the same, so you can try it yourself. I also attach websites with graphs and more specific examples.

#### Searching Algorithms

###### Binary Search

It is more efficient than linear search when the array is large, but it is possible only if the array is sorted.

```cpp=
#include <iostream>
using namespace std;

int BinarySearch(int, int, int, int []);

int main() {
    ......
}

// Search for target in the sorted range data[low] ... data[high - 1].
int BinarySearch(int low, int high, int target, int data[]){
    if(low == high){
        return -1;  // empty range: target not found
    }
    int mid = (high + low) / 2;
    if(target > data[mid]){
        return BinarySearch(mid + 1, high, target, data);
    }else if(target < data[mid]){
        return BinarySearch(low, mid, target, data);
    }else{
        return mid;  // found: return the index of target
    }
}
```

[binary search](https://shengyu7697.github.io/cpp-binary-search/)

If you are proficient with this searching method and with linear search, you can solve most searching problems.

#### Sorting Algorithms

###### Insertion Sort

It is the popular way of sorting that we use every time we play cards.

```cpp=
#include <iostream>
using namespace std;

void insertionSort(int, int []);

int main() {
    ......
}

// Sort the first n elements of data in increasing order.
void insertionSort(int n, int data[]){
    if(n <= 1){
        return;  // stopping condition: one element is already sorted
    }
    insertionSort(n - 1, data);       // sort the first n - 1 elements
    int last = data[n - 1];
    int j = n - 2;
    while(j >= 0 && data[j] > last){  // shift larger elements one slot right
        data[j + 1] = data[j];
        j--;
    }
    data[j + 1] = last;               // insert the last element in place
}
```

[insertion sort](http://alrightchiu.github.io/SecondRound/comparison-sort-insertion-sortcha-ru-pai-xu-fa.html)

###### Merge Sort

Sorry, I do not know how to write it myself yet; please check the website first, and I will add my answer as soon as I can.
[merge sort](http://alrightchiu.github.io/SecondRound/comparison-sort-merge-sorthe-bing-pai-xu-fa.html)

###### Bubble Sort

A supplement covering one more way of sorting.

```cpp=
#include <iostream>
using namespace std;

void bubbleSort(int, int []);

int main() {
    ......
}

void bubbleSort(int n, int data[]){
    for(int i = 0; i < n; i++){
        for(int j = 1; j < n - i; j++){
            if(data[j - 1] > data[j]){  // swap out-of-order neighbors
                int change = data[j];
                data[j] = data[j - 1];
                data[j - 1] = change;
            }
        }
    }
}
```

[bubble sort](https://medium.com/@oturngo/study-note-01-%E6%B0%A3%E6%B3%A1%E6%8E%92%E5%BA%8F%E6%B3%95-bubble-sort-ee534b6f91eb)

---

## <font color="#EBB471">5.4</font> Complexity

We can roughly classify complexity into two types:

- [ ] Time complexity <-- typically more critical
- [ ] Space complexity

To explain complexity better, we go back to the example we discussed in 5.1. For convenience, ==I name the algorithms 1, 2, 3 in order==, which differs from the names in the picture.

![](https://i.imgur.com/2UmbIpv.png =x300)

In this picture, you can clearly see that algorithm 1 has a good chance of TLE (Time Limit Exceeded) when the input needs 18 bits. As the saying goes, "Everyone can write correct code, but few people can write good code." Algorithms 2 and 3 can still run even when given a lot of data.

Run time may also be affected by hardware efficiency, how many programs are running at the same time, and so on, which is why we describe algorithms with Big O instead of raw timings.

Big O:

O(1): The algorithm is independent of the number of vector elements. Such an algorithm is said to have a ==constant runtime==, meaning that its runtime does not grow as the size of the vector increases. For instance, if the vector has 10 elements, this algorithm requires only one comparison; if the vector has 100 elements, it still requires only one comparison.
By the way, an algorithm that tests whether the first element of a vector is equal to any of the next three elements will always require three comparisons, but in Big O it is still considered O(1). So with O(1) we focus on the constant-runtime feature; it is often pronounced "on the order of one" or, more simply, "**order one**".

O(n): An algorithm that tests whether the first element of a vector is equal to any of the other elements of the vector requires at most n - 1 comparisons, where n is the number of elements in the vector. An O(n) algorithm is referred to as having a **linear runtime**, and it is often pronounced "on the order of n" or, more simply, "**order n**".

O(n^2): Consider an algorithm that tests whether any element of a vector is duplicated elsewhere in the vector. The first element must be compared with every other element. The second element must be compared with every other element except the first. The third element must be compared with every other element except the first two. In the end, this algorithm makes (n - 1) + (n - 2) + ...... + 2 + 1 = n^2 / 2 - n / 2 comparisons. In Big O we keep the dominant n^2 term, leaving n^2 / 2.

---

To summarize: in amateur programming, we usually just count the `for` or `while` loops in the code. A single loop over the data usually means a runtime of O(n), and nesting one more loop **inside** it usually means O(n^2). Remember that this rule of thumb holds in most cases, not all, and it assumes each loop runs close to n times.

Big O is concerned with how an algorithm's runtime **grows** in relation to the number of items processed. Go back to the algorithm requiring n^2 comparisons: with 4 elements, it requires 16 comparisons; with 8 elements, 64 comparisons. With this algorithm, doubling the number of elements quadruples the number of comparisons. Now consider a similar algorithm requiring n^2 / 2 comparisons: with 4 elements, it requires 8 comparisons; with 8 elements, 32 comparisons.
Again, doubling the number of elements quadruples the number of comparisons. You can see that both of these algorithms grow as the square of n, **so Big O ignores the constant factor** and the lower-order terms of n, and both algorithms are considered to be O(n^2).