Arrays in IB Pseudocode

# Arrays in IB PSEUDOCODE ###### tags: `Computer Science` `IB` [ToC] ## Arrays :::info For this section I'm relying heavily in this sources: https://www.programiz.com/cpp-programming/arrays https://www.w3schools.com/cpp/cpp_arrays.asp ::: Arrays are a particular type of collection. The idea is that we are going to store in one variable more than one value. One way to do it is using an array. This is the declaration of an array of integers (int) with 5 numbers: ```cpp= int numbers[5] = {1, 10, 20, 30, 50}; ``` **The length (or size) of a specific array is how many elements can contain.** Arrays in most programming languages are very efficient (specially regarding memory usage) but they have some inconvenients. The main one is that once we have declared **the size of an array it cannot be changed** during the execution. ### Array index and values The first problem that we need to address is that we want to access a particular element (also called member) of the array. To do that we need to use the concept of index. The index is the address inside of the array. To access one of the elements we need to write the name of the array and, in brackets write the number of the index. Indexes in computer science start with zero, so the first element is the [0] element, the second is the [1] element and the last is [n-1] being n the length of the array. Let's supose the example of the web in programiz. In C++ we can initialize the array like this: ```cpp= // declare and initialize and array int x[6] = {19, 10, 8, 17, 9, 15}; ``` Here is the visualization of the elements ![imagen](https://hackmd.io/_uploads/HktB_0GOT.png) [source](https://www.programiz.com/cpp-programming/arrays) So if we write this code we can use those members to asign other values or we can use it in if statements as we please. For example ```cpp= int x[6] = {19, 10, 8, 17, 9, 15}; int y = x[2]; //8 int z = 0; if (x[4] < x[1]) //this happens because 9 is lesser than 10 { z = x[3] + x[0]; //17+19 =36 } else { //this doesn't happen z = z + x[5]; } ``` ### Array definition and array access You may see that sometimes we are using MyArray[100] and other times is MyArray[0]. Why is that? Is not the size of the array constant? When we define the array in many programming languages we can define the array size in the same line. For example: ```cpp= int x[6] = {19, 10, 8, 17, 9, 15}; ``` #### Exercise 15 Write the output of this code that it's going to happen through serial. :::info This and the next is a typical exam exercise to see if you get the concept of index of arrays ::: ```cpp= Serial.begin(9600); int x[7] = {19, 10, 8, 17, 9, 15, -10}; int y = x[3]; Serial.println(y); int z = 0; if (x[6] < x[1]) { z = x[4] + x[0]; Serial.println("Dinosaur"); } else { z = z + x[2] +10; Serial.println("Shark"); } Serial.println(z); ``` The solution with the explanation in the comments is in the spoiler section :::spoiler Solution: ``` 17 Dinosaur 28 ``` The answer in this case are separated through lines since we're using Serial.println but in an exam I will accept them separated in other ways as long as they are in the correct order. ```cpp= Serial.begin(9600); int x[7] = {19, 10, 8, 17, 9, 15, -10}; int y = x[3]; //forth element Serial.println(y); int z = 0; if (x[6] < x[1]) //this happens { z = x[4] + x[0]; // 9 +19 Serial.println("Dinosaur"); } else { //this doesn't happen z = z + x[2] +10; Serial.println("Shark"); } Serial.println(z); ``` ::: #### Exercise 16 Write the output of this code that it's going to happen through serial. ::: success In this case we are going to use also values as indexes. Beware! ::: ```cpp= Serial.begin(9600); int numbers[8] = {3, 1, 5, 8, 1, 2, 1, 3}; int x = numbers[1]; int y = numbers[x+1]; int z = numbers[numbers[2]]; Serial.println(x); Serial.println(y); Serial.println(z); ``` The solution with the explanation in the comments is in the spoiler section :::spoiler Solution: ``` 1 5 2 ``` The answer in this case are separated through lines since we're using Serial.println but in an exam I will accept them separated in other ways as long as they are in the correct order. ```cpp= Serial.begin(9600); int numbers[8] = {3, 1, 5, 8, 1, 2, 1, 3}; int x = numbers[1]; // Second element int y = numbers[x+1]; // x is 1, so the third element. 5 int z = numbers[numbers[2]]; //the third element is 5, so the sixth element is 2 Serial.println(x); Serial.println(y); Serial.println(z); ``` ::: #### The length problem As a teacher I observed that the concept of length of an array sometimes is not clear and is _key_ for understanding the loops through arrays. ::: warning The length of the array will be usually given in a variable or as part of the problem "we have a hotel with 100 rooms, and each number of the room is stored in an array" means that we have an array of length 100. ::: In C++ we cannot use length of an array as propiety so it's common to have something like this ```cpp= int MyArrayLength = sizeof(MyArray) / sizeof(MyArray[0]); ``` This calculates the size in memory of the whole array and divides it by the size of the first element. This gives the numbers of elements. Sometimes we have a more complex version of this. In the code that we used in the [melody player](https://github.com/robsoncouto/arduino-songs/blob/master/greenhill/greenhill.ino) we found this code in the line 172 ```cpp= int notes = sizeof(melody) / sizeof(melody[0]) / 2; ``` This was because in this code the melody array has 2 elements for each note (frequency and the divider of the duration). More info [here](https://www.w3schools.com/cpp/cpp_arrays_size.asp) In other programming languages (such as java) we usually can access the length of an array using a length propiety. ```java= //BEWARE JAVA CODE int[] exampleArray = {1,2,3,4,5}; int exampleArraySize = exampleArray.length; System.out.print("This Java array size is: " + exampleArraySize ); //The Java array length example prints out the number 5 ``` [source](https://www.theserverside.com/blog/Coffee-Talk-Java-News-Stories-and-Opinions/Java-array-size-explained-by-example) ### Loops with arrays In class we have discussed the implementation of simple algorithms such as min, max and average. For doing that we need to use loops (for loops or while loops). This types of algorithms are called **iterate through an array**. ![Sin título](https://hackmd.io/_uploads/ryvGWFmd6.png) [source](https://iterategame.com/) Usually we can see 2 types of iteration algorithms (but there are more!). * _Extensive loops._ This kind needs to go through each element to do something with that element (add to a sum when we find the average for example). * _Selective loops._ that only want to do something to part of the elements or even just one (like when we do the min) but we need to check a condition to do that (With an if clause) In the for loop usually will be something like this: ```cpp= for (int i = 0; i < [ARRAY_LENGTH]; i ++) { //code to execute } ``` ``` //PSEUDOCODE loop I from 0 to [ARRAY_LENGTH]-1 //things to do //the element will be accessed with ARRAY[I] end loop ``` Is common to use i of index, but this is usually up to the programmer and I will consider correct any name for the variable. The ARRAY_LENGTH is the array length given by the context. Consult [the length problem](#The-length-problem) to revise if you need. In the code usually we want to do something with the element of the array. If the array is called "myArray" this would be "myArray[i]". A selective loop would look like this ```cpp= for (int i = 0; i < [ARRAY_LENGTH]; i ++) { if (condition) { //code to execute } } ``` ``` //PSEUDOCODE loop I from 0 to [ARRAY_LENGTH]-1 if (condition) then //things to do //the element will be accessed with ARRAY[I] end if end loop ``` The condition usually will look for a condition of the specific element so we can expect to see in the condition "myArray[i]". :::info We can use for loops, while loops and for each loops. I haven't covered for each loops but I will mark it as correct (if they are correctly written). You can find more info about for each loops [in this LINK](https://www.w3schools.com/cpp/cpp_arrays_loop.asp) ::: Now that we have discussed the abstraction we can see the basic algorithms. #### Exercise 17 Now we're running a foodtruck about selling corn. ![imagen](https://hackmd.io/_uploads/Hkts2FmdT.png) [source](https://www.achoclonados.cl/) We have an array with all the prices of our products (23 different ways of having a cup of corn with stuff on it) with a range that goes from 2 euros to 10 euros. Output the minimum, the maximum and the average of the prices that are available. The solutions with the explanation in the comments is in the spoiler section :::success This exercise is not an exam like question but the student should know how these works in order to create more complex algorithms ::: Maximum :::spoiler ``` PSEUDOCODE I = 0 MAX = prices [0] loop while I < 23 if MAX < prices[I] then MAX = prices[I] end if I = I +1 end loop output MAX ``` ```cpp= double max = prices[0]; //we initialize the maximum with the first value. It's a double because we're comparing doubles. for(int i = 0; i < 23; i++) //the number is the name of different products that we can find { if (max < prices[i]) { //if the element that we are inspectinc has a bigger value that our max, we update the max max = prices[i]; } } Serial.println(max); ``` ::: Minimum :::spoiler ```cpp= double min = prices[0]; //we initialize the maximum with the first value. It's a double because we're comparing doubles. for(int i = 0; i < 23; i++) //the number is the name of different products that we can find { if (min > prices[i]) {//if the element that we are inspectinc has a smaller value that our min, we update the min min = prices[i]; } } Serial.println(min); ``` ::: Average :::spoiler ``` SUM = 0 loop for X from 0 to 22 SUM = SUM + prices[X] end loop average = sum div 23 output average ``` ```cpp= double sum = 0; // we're adding all the values on the array so we start with 0 for(int i = 0; i < 23; i++) //the number is the name of different products that we can find { sum = sum + prices[i]; //also } double average = sum / 23; Serial.println(min); ``` ::: #### Exercise 18 We have an array of 50 different grades from students that go from 0.0 to 10. We want to know the index of the highest array and the number of passes (they pass with a 5.0 or higher grade). Construct a code that outputs this information. The name of the array is grades. The solutions with the explanation in the comments is in the spoiler section. :::success This exercise is an exam like question. ::: :::spoiler We can do it in just one loop or two different loops. Both options are fine for this purpose. Finding the maximum index with pseudocode ``` PSEUDOCODE MAX = grades[0] MAX_I = 0 //maximum index. Temporary for now loop A from 0 to 49 if MAX < grades[A] MAX = grades[A] MAX_I = A end if end loop output MAX_I ``` Output the number of passes ``` PSEUDOCODE COUNT= 0 I = 0 loop while I<=49 if grades[I] >= 5 then COUNT = COUNT + 1 end if I = I + 1 end loop output COUNT ``` ```cpp= int bestGradeIndex = 0; //we can initialize this as 0 or -1. It's an int because the double bestGrade = grades[0]; //the same idea when we had the max. We need to use it because we still need to know the max. for(int i = 0; i < 50; i++) //the number is the number of grades { if (bestGrade < grades[i]) { //if the element that we are inspectinc has a bigger value that our max, we update the max bestGrade = grades[i]; bestGradeIndex = i; //We need to update the index that is the one that we're going to output } } Serial.println(bestGradeIndex); //output the first task. int passed = 0; //initialized counter of people that have passed. for(int i = 0; i < 50; i++) //the number is the number of grades { if (grades[i] >= 5) { //if the element that we are inspectinc has a bigger value that our max, we update the max passed ++; } } Serial.println(passed);//output the second task. ``` Solution with only one loop (a bit more confusing but legal for the exam) ```cpp= int bestGradeIndex = 0; double bestGrade = grades[0];¡ for(int i = 0; i < 50; i++)¡ { if (bestGrade < grades[i]) { ¡ bestGrade = grades[i]; bestGradeIndex = i; ¡ } if (grades[i] >= 5) { passed ++; } } Serial.println(bestGradeIndex); //output the first task. Serial.println(passed);//output the second task. ``` ::: #### Exercise 19 :::info This exercise is the same as 18 but is asking about construting functions. Take notice of the differences in the statement ::: We have an array of 50 different grades from students that go from 0.0 to 10. We want to know the index of the highest array and the number of passes (they pass with a 5.0 or higher grade). We're going to implement a couple of functions that return this information. The first one is "highestGradeIndex" and the second is "passes". The name of the array is grades and is a double type and is a global variable. :::info Remember that global variables are variables that we can access in any part of the code. ::: We can consider that the instruction Serial.begin(9600); has been already called. The solutions with the explanation in the comments is in the spoiler section. :::spoiler We need to split this into two different functions, we cannot do them in just one algorithm. The highestGradeIndex function: ```_= highestGradeIndex () BEST_GRADE = grades[0] BEST_GRADE_INDEX = 0 loop for I from 1 to 49 if BEST_GRADE < grades[I] then BEST_GRADE_INDEX = I BEST_GRADE = grades[I] end if end loop return BEST_GRADE_INDEX end method ``` ```_= passes () PASSED = 0 loop for I from 0 to 49 if (grades[I]) >= 5) then PASSED = PASSED +1 end if end loop return passed end method ``` ::: ### Strings: arrays of characters ::: success Here you have more activities to do to practice (with pseudocode and C++) https://hackmd.io/@dprieto/FLA-array ::: ![imagen](https://hackmd.io/_uploads/SkGotDrda.png) Strings are arrays of characters. We have access to several functions that are going to do something ### Parallel arrays When we have 2 or more arrays which elements of the same index refer to the same entity (a student, a car, an employee, a flight) we call them Parallel arrays. We usually have to collect information from one of the arrays and output the other one. Classic example is to store in one array the grades of the students and in other the names of the students. So, if you want to output who got the best (or worst) grade you will check the array of the grades but you will use the same index to output the name of the student using the array which has the names of the students #### Exercise 20 ![EXO31](https://hackmd.io/_uploads/rJC7hILtT.jpg) Number of followers of EXO. We're the managers of EXO and we want to check the performance of these members in social media. We have 2 arrays. One array is EXObandMember with the names of the band members ( Baekhyun, Chen, Lay, Sehun, Chanyeol, D.O., Kai, Suho, Xiumin) and the other is a parallel array called followersInstagram Create in pseudocode an algorithm that outputs who has the most number of followers in instagram and the number of follows that they have. The solutions with the explanation in the comments is in the spoiler section. Result only: :::spoiler ```= PSEUDOCODE EXObandMember[9] followersInstagram[9] MAX = followersInstagram[0] MAX_INDEX = 0 loop for N from 1 to 8 // we start at 1 because we're already have the first one and we end in LENGTH -1 if followersInstagram[N] > MAX then MAX = followersInstagram[N] MAX_INDEX = N end if end loop output EXObandMember[MAX_INDEX] output MAX ``` ::: Process :::spoiler ``` //EXObandMember[9] // finding only the maximum value of followersInstagram MAX = followersInstagram[0] loop for I from 0 to 8 //The array has 9 elements if MAX < followersInstagram[I] then MAX = followersInstagram[I] end if end loop // finding also the index of followersInstagram that has the maximum value MAX = followersInstagram[0] MAX_INDEX = 0 loop for I from 0 to 8 //The array has 9 elements if MAX < followersInstagram[I] then MAX = followersInstagram[I] MAX_INDEX = I end if end loop // using that index to output the name of the person //followersInstagram [] //EXObandMember [] MAX = followersInstagram[0] MAX_INDEX = 0 loop for I from 0 to 8 //The array has 9 elements if MAX < followersInstagram[I] then MAX = followersInstagram[I] MAX_INDEX = I end if end loop Output EXObandMember[MAX_INDEX] //band member with most followers on instagram Output MAX //most followers on instagram //Output EXObandMember[MAX] Error because EXObandMember[ 3 million something]doesn't exist since there are only 9 members Output followersInstagram[MAX_INDEX] //Same as output Max ``` ::: #### Exercise 21 In the same context of the last exercise we need to know the member that is nearest the average amount of followers in instagram. For this we need 2 loops. First to find the average. Second to decide which band member has a lower difference with the average. The solutions with the explanation in the comments is in the spoiler section. :::spoiler ```= PSEUDOCODE EXObandMember[9] followersInstagram[9] //First find the average SUM = 0 loop for N from 0 to 8 SUM = SUM + followersInstagram[N] end loop AVG = SUM div 9 //remember to use div in these contexts // Once we know the average we need to know the difference between each member audience and that average. And which difference has a lower absolute value. //We find the difference between the average and the first group member. (later we're going to compare it with the rest of the differences) DIFFERENCE = AVG - followersInstagram[0] //In this part we asign to LOWEST_DIFFERENCE the absolute value if (DIFFERENCE < 0) then LOWEST_DIFFERENCE = - DIFFERENCE else LOWEST_DIFFERENCE = DIFFERENCE end if // if we had access to something like abs(x) that returns the absolute value of a given number we can write this other one liner // LOWEST_DIFFERENCE = ABS(AVG - followersInstagram[0]) LOWEST_DIFFERENCE_INDEX = 0 //We have the index value because we want the name so we have to save it somewhere. loop for I from 1 to 8 // FROM 1 because we're comparing the first one with the rest, so we can start with the second one DIFFERENCE = AVG - followersInstagram[I] //calculate difference if (DIFFERENCE < 0) DIFFERENCE = - DIFFERENCE // Finding the absolute value of the difference end if if (DIFFERENCE < LOWEST_DIFFERENCE) //Minimum absolute value finding LOWEST_DIFFERENCE = DIFFERENCE LOWEST_DIFFERENCE_INDEX = I end if end loop output EXObandMember[LOWEST_DIFFERENCE_INDEX] ``` ::: #### Exercise 22 Using the same context of EXO members, output the name of all the members that are over 10 000 000 followers in instagram. :::spoiler (solution by Y.C) ![IMG_1166](https://hackmd.io/_uploads/BJnXY4BF6.jpg) ::: ### Flag variables while looking into arrays Flag variables are boolean variables that we use to store information. #### Exercise 23 ![BCN](https://hackmd.io/_uploads/SkSM2VSta.jpg) Flights destinations. We have in an array called **flightDestinations[4500]** the destination code for all the flights that had happen in one month in Barcelona's Airport. This destination codes are 3 letters that are unique for each airport ("IBZ" for Ibiza, "SCL" Santiago de Chile, "EZE" for Buenos Aires, "MAD" for Madrid and so on). These flights can have the same destination so they will be repeated. For example we can expect that IBZ could be even repeated 100 times in a month. 1st part. Output how many times there is a flight to ICN (Seoul) ``` PSEUDOCODE COUNT = 0 loop for N from 0 to 4499 if flightDestinations[N] = "ICN" then //remember to put the quotes in strings! COUNT = COUNT + 1 end if end loop output COUNT ``` We have another array that is called SpanishDestinations[] with the codes of the Spanish airports. [There are 48 airports in Spain](https://www.aena.es/es/pasajeros/nuestros-aeropuertos.html). The arrays SpanishDestinations and flightDestinations are not parallel (they don't refer to the same flight) First we're going to check if the first flight is a National flight or an international flight. To do that we need to check if the String is inside Spanish Destinations. So let's start simple and let's see if the 3rd flight of the month(index number 2, remember) is a National or international flight. Output "national" if it is, "international" otherwise ```_ PSEUDOCODE flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports NATIONAL_DESTINATION = false // we set a boolean flag. The name of the variable, as always is depending on the student. I wrote here a long one but you can write FLAG or NATIONAL or even F loop for N from 0 to 47 // Length of Spanish Destinations - 1 if (flightDestination[2] = SpanishDestinations[N]) then //here we are checking if the 3rd flight on flightDestinations is equal to each of the elements of the array of 48 airports in Spain NATIONAL_DESTINATION = true // if it is, we need to change the flag variable to true, because we have found that this destination is national break //here we break the loop so we don't keep looking if we have found what we're looking for. end if end loop if NATIONAL_DESTINATION then //writing NATIONAL_DESTINATION is the same as NATIONAL_DESTINATION = true output National else output international end if ``` Now let's output how many national flights have we had in Barcelona last month. In this case we are going to need that flag variable and we're not looking for only one of the flights in flightDestinations but all of them. So we need to loop through all of them. We're going to need a nested loop (a loop inside a loop) ```_ PSEUDOCODE flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports COUNT = 0 loop for X from 0 to 4499 //total number of flights last month NATIONAL_DESTINATION = false loop for N from 0 to 47 if (flightDestination[X] = SpanishDestinations[N]) then NATIONAL_DESTINATION = true break end if end loop if NATIONAL_DESTINATION then //writing NATIONAL_DESTINATION is the same as NATIONAL_DESTINATION = true COUNT = COUNT +1 //instead of outputing "national" we update the counter end if end loop output COUNT ``` #### Execrise 23 B Change the previous code so it displays the international destinations instead of the national ones. Solution in the spoilers: :::spoiler Three posible solutions ```_ flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports COUNT = 0 loop for X from 0 to 4499 //total number of flights last month NATIONAL_DESTINATION = false loop for N from 0 to 47 if (flightDestination[X] = SpanishDestinations[N]) then NATIONAL_DESTINATION = true break end if end loop if NATIONAL_DESTINATION then //writing NATIONAL_DESTINATION is the same as NATIONAL_DESTINATION = true COUNT = COUNT +1 //instead of outputing "national" we update the counter end if end loop output 4500 - COUNT ``` ```_ flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports COUNT = 0 loop for X from 0 to 4499 //total number of flights last month NATIONAL_DESTINATION = false loop for N from 0 to 47 if (flightDestination[X] = SpanishDestinations[N]) then NATIONAL_DESTINATION = true break end if end loop if NOT_NATIONAL_DESTINATION then //writing NATIONAL_DESTINATION is the same as NATIONAL_DESTINATION = true COUNT = COUNT +1 //instead of outputing "national" we update the counter end if end loop output COUNT ``` ```_ flightDestinations[4500] SpanishDestinations[48] COUNT = 0 loop for X from 0 to 4499 INTERNATIONAL_DESTINATION = True loop for N from 0 to 47 if (flightDestination[X] = SpanishDestinations[N]) then INTERNATIONAL_DESTINATION = False break end if end loop if INTERNATIONAL_DESTINATION then COUNT = COUNT +1 end if end loop output COUNT ``` :::danger This solution is not correct because it doesn't evaluate properly ``` flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports COUNT = 0 loop for X from 0 to 4499 //total number of flights last month INTERNATIONAL_DESTINATION = FALSE loop for N from 0 to 47 if (flightDestination[X] =/ SpanishDestinations[N]) then INTERNATIONAL_DESTINATION = TRUE break end if end loop if INTERNATIONAL_DESTINATION then COUNT = COUNT +1 end if end loop output COUNT ``` ::: #### Exercise 24 Output the _percentage_ of flights that were **international** last month. (Small variation from the last exercise) Solution: :::spoiler ```_? PSEUDOCODE flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports COUNT = 0 loop for X from 0 to 4499 //total number of flights last month NATIONAL_DESTINATION = false loop for N from 0 to 47 if (flightDestination[X] = SpanishDestinations[N]) then NATIONAL_DESTINATION = true break end if end loop if NOT NATIONAL_DESTINATION then //writing NATIONAL_DESTINATION is the same as NATIONAL_DESTINATION = true COUNT = COUNT +1 //instead of outputing "national" we update the counter end if end loop PERCENTAGE = (COUNT / 4500) * 100 output PERCENTAGE ``` credits to S.A. ::: Output how many different locations are there in the array flightDestinations. ```_= PSEUDOCODE flightDestinations[4500] //given array differentLocations[4500] //empty array of strings with the Maximum possible different number of destinations (the number of actual flights) COUNT = 1 // there is at least 1 posible location differentLocations[0] = flightDestinations[0] //We set the first possible flight destination loop for N from 1 to 4499 //look for every flight that is not the first one newDestination = true // this is a boolean flag. It's true if it's a new destination but it will be false if we prove otherwise. loop for X from 0 to COUNT -1 //we do a loop looking to the filled elements of the array differentLocations if flightDestinations[N] = differentLocations[X] newDestination = false break // this is optional. This will break the inner loop (the X one) end if end loop if (newDestination) then //if there is no match in the differentLocations array it means that we have to add it differentLocations[COUNT] = flightDestinations[N] COUNT = COUNT + 1 end end loop output COUNT ``` Let's trace this table for this 6 values of flightDestinations {MAD, IBZ, MAD, SCL, EZE, IBZ} For being more understandable we are going to write the line of the code that we are. In an exam you are not going to be asked to do that. | line | COUNT| N | flightDestinations[N] | X | differentLocations [X] | newDestination | | -------- |----- | -------- | -------- |-------- | -------- | -------- | | 5 | 1 | | | | | | | 7 | 1 | 1 | IBZ | | | | | 8 | 1 | 1 | IBZ | | | true | | 8 | 1 | 1 | IBZ | 0 | MAD | true | | 10| 1 | 1 | IBZ | 0 | MAD | true | | 15| 1 | 1 | IBZ | / | / | true | | 16| 1 | 1 | IBZ | / | / | true | | 8 | 2 | 2 | MAD | / | / | true | | 9 | 2 | 2 | MAD | 0 | MAD | true | | 10| 2 | 2 | MAD | 0 | MAD | true | Do a trace table with this data in Flight Destinations {MAD, CDG, MAD, SCL, CDG, IBZ} | COUNT| N | flightDestinations[N] | X | differentLocations [X] | newDestination | |----- | -------- | -------- |-------- | -------- | -------- | | | - - | -------- |-------- | -------- | -------- | #### Counting with another array Output which Spanish airport has had the most number of flights from Barcelona last month. In this case we have these arrays flightDestinations[4500] SpanishDestinations[48] So for doing that we are going to need another array (SpanishCounters) to store the counts of each of the 48 Spanish airports. After that we just have to find the maximum and output the name using the code stored in SpanishDestinations ```_ PSEUDOCODE flightDestinations[4500] // the array with all the flights even if we're using here just one to start easy SpanishDestinations[48] // the size is the number of airports SpanishCounters[48] //we define this array loop for X from 0 to 4499 //total number of flights last month //no need for flag variable loop for N from 0 to 47 if (flightDestination[X] = SpanishDestinations[N]) then SpanishCounters[N]= SpanishCounters[N] + 1 break end if end loop end loop output MAX = SpanishCounters[0] MAX_INDEX loop C from 0 to 47 if MAX < SpanishCounters[C] then MAX = SpanishCounters[C] MAX_INDEX = C end if end loop output SpanishDestinations[MAX_INDEX] ``` #### EVEN MORE COMPLEX Exercise 25 Tell (output) which destination is the most popular in the last month For doing that we need the array with diffent possible destinations. If it's not given, we can do it with the previous code. After this, then we need to populate an array. ```_= PSEUDOCODE flightDestinations[4500] differentLocations[4500] COUNT = 1 differentLocations[0] = flightDestinations[0] loop for N from 1 to 4499 newDestination = true loop for X from 0 to COUNT -1 if flightDestinations[N] = differentLocations[X] newDestination = false break end if end loop if (newDestination) then differentLocations[COUNT] = flightDestinations[N] COUNT = COUNT + 1 end end loop //define a new array with the population of the destinations (numbers that are repeated) counterLocations[COUNT] //started with 0 all the elements loop for D from 0 to 4499 // loop through flightDestinations loop for I from 0 to COUNT - 1 // loop through differentLocations if differentLocations[I] = flightDestinations[D] then counterLocations[I] = counterLocations[I] +1 end if end loop end loop // MAX = counterLocations[0] MAX_INDEX = 0 loop T from 0 to COUNT - 1 if MAX < counterLocations[T] then MAX = counterLocations[T] MAX_INDEX = T end if end loop output differentLocations[MAX_INDEX] ``` ``` flightDestinations[4500] differentLocations[4500] counterLocations[4500] COUNT = 1 differentLocations[0] = flightDestinations[0] counterLocations[0] = 1 loop for N from 1 to 4499 newDestination = true loop for X from 0 to COUNT -1 if flightDestinations[N] = differentLocations[X] newDestination = false counterLocations[X] = counterLocations[X] +1 break end if end loop if (newDestination) then differentLocations[COUNT] = flightDestinations[N] counterLocations[COUNT] = 1 COUNT = COUNT + 1 end end loop // MAX = counterLocations[0] MAX_INDEX = 0 loop T from 0 to COUNT - 1 if MAX < counterLocations[T] then MAX = counterLocations[T] MAX_INDEX = T end if end loop output differentLocations[MAX_INDEX] ``` ### Exercise 26 from a class exam We selling lemonade and other juices for having some extra fundings. To analyze the different sales we have filled an array with all the names of the different products PRODUCTS with these values [“Lemonade”, “Orange juice”, “Orange banana”, “Orange carrot”, “Tropical smash”] and we have another parallel array with their corresponding prices called PRICES with values that go from 500 to 3000 Chilean Pesos. A.1) State the size of the array PRICES [1 mark] :::spoiler 5 ::: Last weekend we have been taking notes of the sales and we have an array called SALES with the 80 sales with the name of the different products we have sold. Here you can see the first values of the array SALES: {“Orange juice”, “Orange juice”, “Lemonade”, “Orange carrot”, “Lemonade”, “Orange juice”, ...} All the possible values of SALES are inside PRODUCTS. A.2) Construct a code that outputs the number of times that we have sold lemonade last weekend. [4 marks] :::spoiler ```_= COUNT = 0 loop for X from 0 to 79 if SALES[X] = "Lemonade" then COUNT = COUNT +1 end if end loop output COUNT ``` Marking: 1 for correct loop, 1 for output, 1 for updating the variable, 1 for correct if ::: A.3) Construct a code that outputs the total amount of income in chilean pesos this weekend. Use the parallel array PRICES. [6 marks] :::spoiler The solution is at the end of the explanation. For doing this we need to understand that each sale has associated with one name (in products) and each product has a price. To find the total amount of income we need to add the prices of all the elements that we have sold. So first we need a loop through the array of SALES (that one has a length of 80). For each element we need to loop through the array of PRODUCTS (that one has a length of 5) to find which index is it. Once we know the index of the PRODUCT then we can add to SUM the price of the product since PRICES is a parallel array of PRODUCTS. ```_= SUM = 0 loop for X from 0 to 79 loop for J from 0 to 4 if SALES[X] = PRODUCTS[J] then SUM = SUM +000 PRICES[J] break //optional end if end loop end loop output SUM ``` Marking: 1 for correct each loop, 1 for output, 1 for updating the variable, 1 for correct if, 1 for using the parallel array ::: ### Exercise 27 trace table of simple array (inspired by Nov 22 exercise) Consider this one-dimensional array COUNTRIES | | | | ----| ----| | [0] | China | | [1] | Tailand | | [2] | Australia | | [3] | Egypt | | [4] | Denmark | Construct a trace table for the following algorithm ``` T = 4 loop while T > = 0 B = T mod 4 output COUNTRIES[B] T = T -1 end while ``` Solution: :::spoiler | T | T > = 0 | B | Output | | --- | ---| --- | --- | | 4 | true| 0 | China | | 3 | true| 3 | Egypt | | 2 | true| 2 | Australia | | 1 | true| 1 | Tailand | | 0 | true| 0 | China | | -1 | false| 0 | - | ::: #### Exercise 27 variation 1 (I changed the algorithm) Consider this one-dimensional array COUNTRIES | | | | ----| ----| | [0] | China | | [1] | Tailand | | [2] | Australia | | [3] | Egypt | | [4] | Denmark | Construct a trace table for the following algorithm ``` T = 6 loop while T > = 0 B = T mod 4 output COUNTRIES[B] if T mod 2 = 0 then T = T - 3 else T = T + 1 end while ``` Solution :::spoiler ![imagen](https://hackmd.io/_uploads/ryFqNxrqp.png) credit to Y.C. ::: #### Exercise 27 variation 2 (I changed the algorithm) Consider this one-dimensional array COUNTRIES | | | | ----| ----| | [0] | China | | [1] | Tailand | | [2] | Australia | | [3] | Egypt | | [4] | Denmark | Construct a trace table for the following algorithm ``` T = 0 loop while T < 7 B = T mod 4 output COUNTRIES[B] T = T + 1 + B end while ``` ### Exercise with parallel arrays and use of functions. (credit: R.G. with modifications) There is a calculation that you can use to find the Goal Assist that is a ratio that ponders a little bit more the Goal than the assist (we don't need to know the details). To find out you need to use the function calcGA(Goals, Assist) that if you send the number of goals and the number of assist it will send you back the info that ratio. We have 3 parallel arrays that refer to 30 players, with NAMES, GOALS and ASSIST that they have done over a season. Construct an algorithm which will output the names of all the football players whose GA (Goal/assist)is greater than this group's average GA. Be aware that you should use "calcGA()" in your answer Solution in the spoiler :::spoiler sum = 0 loop for K from 0 to 29 sum = sum + calcGA(GOALS[K], ASSIST[K]) end loop average = sum/30 loop for K from 0 to 29 if calcGA (GOALS[K], ASSISTS[K]) > average then output NAME[K] end if end loop ::: In the first part of this code (1-5 line) we are counting group's average GA, using "for" and then dividing sum to 30 (30 people). In the second part we are already writing the code itself, wich calculates how many people have GA greater than the group's average ## Standard algorithms on linear arrays (4.2.1) ### Search algorithms #### Sequential search ("linear to find") Starts with the first element and compares each elemen to the one we're looking for until it is found. ``` MY_ARRAY [25] //array that we need to find something SEARCHED_VALUE //Value that we're looking for loop for I from 0 to 24 if MY_ARRAY[I] == SEARCHED_VALUE then // do something // usually we add here break end if end loop ``` Works on every type of array, no matter if it's sorted and it has a O(n) complexity. With collections work this way: ``` MY_COLLECTION SEARCHED_VALUE //Value that we're looking for loop while MY_COLLECTION.hasNext() CURRENT_VALUE = MY_COLLECTION.getNext() if CURRENT_VALUE == SEARCHED_VALUE then // do something // usually we add here break end if end loop ``` Example 1: We have a couple of parallel arrays with a size 20. One that is called NAMES and the other is called TIMES_THAT_HAVE_GONE_TO_SKI If we want to know if somebody has gone to ski exactly 3 times we can use a linear search. Construct the pseudocode to output the name of somebody who has been skiing exactly 3 times ``` loop I from 0 to 19 if (TIMES_THAT_HAVE_GONE_TO_SKI[I] == 3) then output NAMES[I] break //if we want to output only one end if end loop ``` #### Binary search (aka sorted and split) Also known as "half-interval search". This is a good algorithm (O(log (n)) of complexity) but requires that the array has to be **sorted** before executing it. So the steps are Compare the search value with the element of the middle of the array. The posibilities are 3. 1) We found it! 2) The value in the middle of the array is less than the searched value. Then we have to look in the top part of the array. 3) The value in the middle of the array is more than the searched value. Then we have to look in the bottom part of the array. Here you have an image of how it works ![BinarySearch](https://hackmd.io/_uploads/H16Dkl-sp.png) _source geeksforgeeks_ :::danger Beware: this array doesn't work! find out why using a trace table! ``` MY_ARRAY [25] //SORTED array that we need to find something SEARCHED_VALUE //Value that we're looking for I = 25 div 2 +1 HALF = 25 div 2 RANGE = HALF FOUND = FALSE EXIT = FALSE loop while NOT FOUND AND RANGE > 0 RANGE = HALF if SEARCHED_VALUE = MY_ARRAY[I] then FOUND = TRUE else if SEARCHED_VALUE < MY_ARRAY[I] then HALF = HALF div 2 I = I - HALF else then HALF = HALF div 2 I = I + HALF end if end loop if FOUND then //do something else then // do something else end if ``` ::: Implementation transalated from here: https://www.geeksforgeeks.org/binary-search/ ``` MY_ARRAY [25] //SORTED array that we need to find something SEARCHED_VALUE //Value that we're looking for FOUND = FALSE LOW = 0 TOP = 25-1 //MY_ARRAY length - 1 loop while LOW <= TOP and NOT FOUND MID = LOW + ((TOP - LOW) div 2); if MY_ARRAY[MID] = SEARCHED_VALUE then FOUND = TRUE else if MY_ARRAY[MID] < SEARCHED_VALUE then LOW = MID + 1 else TOP = MID - 1 end if end loop if FOUND then //do something else then // do something else end if ``` Remember that LOW, MID and TOP are refering to **indexes** and not about values. The index of the Searched value will be stored in the MID variable in this implementation. Example: We have a dictionary of slang words used in the 70s in London in a corpus that has 1000 words. All those words are sorted alphabetically. The name of the array is WORDS_CORPUS :::info Corpus linguistics is the study of a language as that language is expressed in its text corpus (plural corpora), its body of "real world" text. More info [here](https://en.wikipedia.org/wiki/Corpus_linguistics) ::: Construct a pseudocode algorithm using _binary search_ to try to find if "punk" is in that dictionary. It will output if it's or not. :::info In _most_ programming languages you can compare words in English with usual comparators and it will work fine. For not English languages things are usually tricker. In this way 'A' is "smaller" than 'B' and 'D' is "bigger" than 'C' Also "Aaron" will be evaluated as smaller if you compare it with "Berthe" and "Sofya" will be bigger than "Chenxi" ::: Solution in the spoiler section. :::spoiler ```= WORDS_CORPUS [1000] //we write the array so we have clear what are we working with SEARCHED_WORD = "punk" FOUND = false LOW = 0 TOP = 1000 -1 loop while LOW <= TOP and FOUND == false MID = LOW + ((TOP-LOW) div 2) if WORDS_CORPUS [MID] == SEARCHED_WORD then FOUND = true else if WORDS_CORPUS [MID] < SEARCHED_WORD then LOW = MID +1 else //this is n case WORDS_CORPUS is bigger than searched word TOP = MID -1 end if end loop if FOUND then output "the word "+ SEARCHED_WORD + " is in the corpus" else output "the word "+ SEARCHED_WORD + " is not in the corpus" end if ``` ::: ### Exercise of tracing a binary search We have the following array: ```TIMESTAMPS = {1002, 1003, 1023, 1240, 1560, 1850, 1950}``` Write a tracetable using a binary search algorithm. One of the colums should be output. The searched value is 1560 ``` TIMESTAMPS = {1002, 1003, 1023, 1240, 1560, 1850, 1950} SEARCHED_VALUE = 1560 FOUND = FALSE LOW = 0 TOP = 7-1 //TIMESTAMPS length - 1 loop while LOW <= TOP and NOT FOUND MID = LOW + ((TOP - LOW) div 2); if TIMESTAMPS[MID] = SEARCHED_VALUE then FOUND = TRUE else if TIMESTAMPS[MID] < SEARCHED_VALUE then output "we need to look in the upper values" LOW = MID + 1 else output "we need to look in the lower values" TOP = MID - 1 end if end loop if FOUND then output "value found!" else then output "value NOT found! = (" end if ``` :::spoiler Possible solution ![imagen](https://hackmd.io/_uploads/rycV6WGjp.png) credit to Y.C. ::: Do the same with the searched values of 900, 1003 and 1951 ### Sorting arrays #### Why sorting Because if we have the elements sorted, we can use binary search, so everything is way more easy to find. ![Sin título](https://hackmd.io/_uploads/r1BmO3_iT.jpg) ([source](https://www.google.com/url?sa=i&url=https%3A%2F%2Fwww.britannica.com%2Fsummary%2Fdictionary&psig=AOvVaw1aXH6pXMD3bXGuRU1OVIvd&ust=1707903349398000&source=images&cd=vfe&opi=89978449&ved=0CBMQ3YkBahcKEwjAsZSAgqiEAxUAAAAAHQAAAAAQDA)) Imagine that we have a dictionary without sorting, it would be very complicated to use! #### Remember how to swap ellements For sorting we need to one way or another swap elements between them. For doing that the main way is with this template. ```? //we want to interchange MY_ARRAY[N] with MY_ARRAY[M] TEMP = MY_ARRAY[N] MY_ARRAY[N] = MY_ARRAY[M] MY_ARRAY[M] = TEMP ``` Remember that MY_ARRAY depends on context and it's the name of an array, and N and M are variables that can vary on context or be dependent one to the other. In python ```python myArray[N], myArray[M] = myArray[M], myArray[N] ``` There are more sorting methods but IB wants you to know these: ### Bubble sort Bubble sort works by comparing pairs of elements, if they match they swap, if not, we continue looking. More info on that here: https://www.geeksforgeeks.org/bubble-sort/ Citing them: ```! In Bubble Sort algorithm, traverse from left and compare adjacent elements and the higher one is placed at right side. In this way, the largest element is moved to the rightmost end at first. This process is then continued to find the second largest and place it and so on until the data is sorted. ``` The complexity is $O(n^2)$ so is not good. But is easy to implement. Generic implementation in IB pseudocode ```= MY_ARRAY [X] //array of x elements loop I from 0 to X-2 loop J from 0 to (X-2)- I if MY_ARRAY[J] > MY_ARRAY[J+1] then TEMP = MY_ARRAY[J] MY_ARRAY[J]=MY_ARRAY[J+1] MY_ARRAY[J+1] = TEMP end if end loop end loop ``` :::info #### WHY -2 ?? ? QUESTION MARK ? Remember that we are looking into pairs of elements. So, if we're looking to the element J and the next, J+1, the last element index is J+1. The last index is X-1 so the maximum value of J is X -2 Using math: $J+1 = X-1$ $J = X - 2$ ::: ### Selection sort More info here: https://www.geeksforgeeks.org/selection-sort/ ``` Selection sort is a simple and efficient sorting algorithm that works by repeatedly selecting the smallest (or largest) element from the unsorted portion of the list and moving it to the sorted portion of the list. ``` ### Sort of sorting exercise (similar to Nov 2018) We have 2 **parallel** arrays: FIRSTNAMES [6000] and LASTNAMES [6000]. They hold the information of the last 6000 people that want a ticket to go to a concert of Coldplay. Since we want to use binary search to get the information of these people, we need to sort these arrays. ![favsu7ywiamypvk](https://hackmd.io/_uploads/Hye6IhdiT.jpg) ([source](https://www.google.com/url?sa=i&url=https%3A%2F%2Fwww.totbarcelona.cat%2Fcultura-i-oci%2Fcoldplay-barcelona-nova-gira-226900%2F&psig=AOvVaw1-SDi1MnaK9fjJF4CHV8RJ&ust=1707903001675000&source=images&cd=vfe&opi=89978449&ved=0CBMQ3YkBahcKEwjgmrbUgKiEAxUAAAAAHQAAAAAQAw) I think that 6000 is maybe too few) Construct in pseudocode an algorithm that sort these arrays by last name **using bubble sort** (A not very good idea but you have to follow orders). Take into account that you will also need to change the first names so the arrays keep being parallel. ![imagen](https://hackmd.io/_uploads/By2BQXqoT.png) Credit Y.C. ``` loop I from 0 to 6000 -2 UNSORTED = false loop J from 0 to 6000 -2 - I if LASTNAMES[J] > LASTNAMES[J+1] then TEMP1 = LASTNAMES[J+1] TEMP2 = FIRSTNAMES[J+1] LASTNAMES[J+1] = LASTNAMES[J] FIRSTNAMES[J+1] = FIRSTNAMES[J] LASTNAMES[J] =TEMP1 FIRSTNAMES[J] = TEMP2 UNSORTED = TRUE end if end loop if NOT UNSORTED then break end if end loop ``` Credit L.C. (with edit) Construct in pseudocode an algorithm that sort these arrays by last name **using selection sort** (A not very good idea but you have to follow orders). Take into account that you will also need to change the first names so the arrays keep being parallel. ``` FIRSTNAMES[6000] LASTNAMES[6000] LOOP MIN FROM 0 TO 5998 I = MIN LOOP CURRENT FROM MIN+1 TO 5998 if LASTNAMES[CURRENT]< LASTNAMES[I] then I = CURRENT end if end loop if I != MIN then TEMP = LASTNAMES[I] LASTNAMES[I]= LASTNAMES[MIN] LASTNAMES[MIN] = TEMP TEMP = FIRSTNAMES[I] FIRSTNAMES[I] = FIRSTNAMES[MIN] FIRSTNAMES[MIN] = TEMP end if end loop ``` Credit A.P. (with edit) ### Sorting exercise 2 We have an array of the grades of students. There are 2 parallel arrays, one is NAMES and the othere is GRADES (50 students). The grades are integers that go from 1 to 7 (IB style) First we want to order the Students so the students with the best grades are in the first part of the array and the last are the ones with lower grades. Construct an algorithm that using selection sort sort both arrays. Construct an algorithm that using bubble sort sort both arrays. ### DOUBLE UP :::danger This is just for practice. I don't recall any problem with 2 levels of order in an IB exam. ::: In the same context (NAMES, GRADES, 50 elements) Now we want to order not only by grades (that we have from 1 to 7 but we want also a sub-order that is going to be by name) so if there are 2 students with the same grade, the first in the alphabetical order will be displayed first. For this one we are going to order it by parts. We need to detect if they are the same and them use a function to sort the sub-arrays. ```? loop I from 0 to 48 UNSORTED = false loop J from 0 to 48 - I // remember that this is lesser than! if GRADES[J] < GRADES[J+1] then //SWAPPING TEMP1 = GRADES[J+1] TEMP2 = NAMES[J+1] GRADES[J+1] = GRADES[J] NAMES[J+1] = NAMES[J] GRADES[J] =TEMP1 NAMES[J] = TEMP2 UNSORTED = TRUE end if end loop if NOT UNSORTED then break end if end loop //Now Grades is sorted and names is mostly sorted CURRENT_INDEX = 0 loop I from 1 to 48 if GRADE[CURRENT_INDEX] != GRADE[I] loop J from CURRENT_INDEX to I -1 loop K from CURRENT_INDEX to I -1 -J // we're looking for alphabetical order // so we swap in the other case if NAMES[K] > NAMES[K+1] then TEMP = NAMES[K] NAMES[K] = NAMES [K+1] NAMES[K+1] = TEMP //we don't need to update grades because they yave the same value! end if end loop end loop CURRENT_INDEX = I end if end loop ```