--- tags: Python Workshop 沈煒翔 --- # Lesson 4: Advance Data Structures <!-- ## Recap (loops) ### Multiplication Table Print Nine-Nine Multiplication Table (九九乘法表) --> ## Data Structures Variables are used for storing a single value, such as an int or a float. When you want to store multiple values, it can be troublesome. ```python # Storing the days in each month jan = 31 feb = 28 mar = 31 apr = 30 # ... not very practical ``` We use data structures to store multiple values at once. ```python # Making list in Python days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31] # Getting the days in the fifth month (May) print(days_in_month[4]) # >>> 31 print(days_in_month[1]) # >>> 28 ``` In different programming languages, there are different data structures. In Python, there are 4 built-in data structures. 1. List 2. Tuple 3. Dictionary 4. Set ## List List is used to store a sequence of values and is indexed by their integer position. ### List indexing ```python # Initializing a list with some elements in it scores = [1, 2, 3, 4, 5] # Normally, We name the list using plural # Index the list print(scores[0]) # the first element in the list # >>> 1 print(scores[1]) # the second element in the list # >>> 2 print(scores[-1]) # the last element in the list # >>> 5 print(scores[-3]) # the third last element in the list # >>> 3 # We can change the value of the element scores[0] = 0 print(scores) # >>> [0, 2, 3, 4, 5] # Also, we can index the list by slicing print(scores[2:4]) # Getting the range from 2 to 4 (which is 2, 3) # >>> [3, 4] print(scores[:3]) # Slice the list to get the first three elements # >>> [0, 2, 3] print(scores[-2:]) # Slice the list to get th last two elements # >>> [4, 5] ``` ### Adding/removing elements in list We can also add or remove elements into the list. ```python scores = [1, 2, 3] scores.append(4) # add an element at the tail of the list print(scores) # >>> [1, 2, 3, 4] scores.insert(3, 5) # insert an element at position 3 print(scores) # >>> [1, 2, 3, 5, 4] scores.pop() # remove an element at the tail of the list print(scores) # >>> [1, 2, 3, 5] scores.pop(2) # remove the element at position 2 print(scores) # >>> [1, 2, 5] ``` ### The length of the list ```python scores = [1, 2, 3, 4] print(len(scores)) # >>> 4 ``` ### Using list in loops We can use while loops or for loops to go through a list. ```python scores = [1, 2, 3] i = 0 # our looping index while i < len(scores): print(scores[i]) i += 1 # >>> 1 # >>> 2 # >>> 3 ``` Most of the time, we would use for loop instead of while loop for better readability and also better performance. ```python scores = [1, 2, 3] for score in scores: print(score) # >>> 1 # >>> 2 # >>> 3 ``` Sometimes, we want to also know the index (like the looping index) of each element. ```python results = [50, 60, 70] for i, result in enumerate(results): print(i, result) # >>> 0 50 # >>> 1 60 # >>> 2 70 ``` ### Using list in if-else statements We can use ```in``` to return the Boolean results of whether an element is in the list. ```python fruits = ["apple", "banana", "grape"] if "peach" in fruits: print('Yes peach') else: print('No peach') # >>> No peach ``` ### Exercise Print the mean values of the list. ```python numbers = [1, 3, 6, 8, 17] ``` Add two lists to form a new list. ```python group1 = [1, 3, 4, 5] group2 = [3, 5, -1, 7.5] # >>> [4, 8, 3, 12.5] ``` Print the days in each month. ```python month = 4 # April # >>> 30 month = 7 # July # >>> 31 ``` ## Tuple Tuples are also a data structure for storing sequential data, like list. However, tuples are immutable, which means you cannot change/modify the tuple after it is created. ```python lottery_numbers = (3, 17, 8, 3, 9) # You can only read the tuple. print(lottery_numbers[3]) # >>> 3 # You can not change it lottery_numbers[0] = 1 # >>> TypeError: 'tuple' object does not support item assignment ``` So why use tuple in the first place? 1. Prevent changing the tuple items (less bug and increased readability) 2. Tuples use less memory space (compared with lists) ## Dictionary In lists and tuples, we use the integer position to index the elements inside the data structures. It is useful to store sequential data but not useful for storing data without orders. Elements in a dicionary is indexed by its key. ```python # Storing the height of each person person_height = {'Emily': 162, 'Tommy': 175, 'Larry': 168} # Adding a element person_height['Sunny'] = 155 # Updating a element person_height['Sunny'] = 160 # Test if a key is inside a dictionary print('Kevin' in person_height) # >>> False # Print all keys in the dictionary print(person_height.keys()) # >>> dict_keys(['Emily', 'Tommy', 'Larry', 'Sunny']) # Note that the order of the keys is not tracked and can be changed at any time # Print all the values in the dictionary print(person_height.values()) # >>> dict_values([162, 175, 168, 160]) # The order is also arbitrary ``` ### Storing data with multiple levels Dictionaries are often used for storing multi-level data. ```python student_infos = { "Emily": { "tests": [88, 95, 97], "gender": "female", "ID": 1 }, "Bob": { "tests": [50, 58, 68], "gender": "male", "ID": 2 }, } print(student_infos['Emily']['tests'][0]) # >>> 88 ``` ### Exercise Get the price of the given item. ```python prices = {"coke": 20, "tea": 30, "coffee": 40} item = "coke" # >>> print(20) item = "water" # >>> print("Free?") ``` ## Class Exercise Given a list of tuples. Each tuple represents the education level and the salary of a single person. ```python datas = [('master', 1038), ('bachelor', 843), ('master', 1034), ('master', 1008), ('bachelor', 810), ('master', 1004), ('bachelor', 839), ('bachelor', 806), ('master', 990), ('master', 999), ('bachelor', 750), ('bachelor', 758), ('bachelor', 764), ('bachelor', 790), ('bachelor', 792), ('master', 1030), ('PhD', 1208), ('master', 956), ('PhD', 1207), ('master', 985)] ``` 1. You collect 3 more data, append the data to the original list. ```python new_data = [('master', 1017), ('master', 1027), ('bachelor', 829)] ``` 2. Calculate the average salary of master ```python print(1008) ``` 3. Which education level is the most common? What is the percentage? ```python #print("bachelor") #print(0.43478) ``` 4. You notice that all the collected data from PhD is incorrect. Remove all data point of PhD. ## Take-home Practices ### List operations Double the values of each element in the list and then print the list in reverse order. ```python the_list = [-9, 6, 0, 3, 8] # Your result should look like # >>> [16, 6, 0, 12, -18] ``` ```python the_list = [-9, 6, 0, 3, 8] new_list = [] # traverse the list in reverse order for i in reversed(range(len(the_list))): value = the_list[i] new_list.append(value*2) print(new_list) ``` ### List operations 2 Remove all numbers in a given list whose factors contain 3 and then print the list. ```python the_list = [3, 7, 17, 21, 1, -6, 5, 9] # Your result should be # >>> [7, 17, 1, 5] ``` ```python the_list = [3, 7, 17, 21, 1, -6, 5, 9] new_list = [] for value in the_list: if value % 3 != 0: new_list.append(value) print(new_list) ``` ### Dictionary operation Count the top three most frequency appearing alphabet letters in the string. ```python the_string = "Bitcoin is a decentralized digital currency, without a central bank or single administrator, that can be sent from user to user on the peer-to-peer bitcoin network without the need for intermediaries." # HINT: to loop through the alphabet letters, we can also use for loops. for letter in "Egg": print(letter) # >>> 'E' # >>> 'g' # >>> 'g' ``` ```python the_string = "Bitcoin is a decentralized digital currency, without a central bank or single administrator, that can be sent from user to user on the peer-to-peer bitcoin network without the need for intermediaries." # count all the appear letters counting_dict = {} for letter in the_string: if letter in counting_dict: counting_dict[letter] += 1 else: counting_dict[letter] = 1 # get the most frequent letter three times (so top three) for _ in range(3): max_count = 0 max_letter = '' for key in counting_dict.keys(): # linear search the most frequent letter if counting_dict[key] > max_count: max_letter = key max_count = counting_dict[key] print('Letter', max_letter, 'appears', max_count, 'times') # remove the letter from the dictionary by setting it to zero counting_dict[max_letter] = 0 ``` ### Sequence prediction There is a sequence of values with the following relation. ```x[n] = (x[n-1] + x[n-2]) / 2``` Given a list of values, predict the next 5 values and print the list. ```python the_list = [24, 8, 16] # Your result should look like # >>> [24, 8, 16, 12, 14, 13, 13.5, 13.25] ``` ```python the_list = [24, 8, 16] for _ in range(5): next_value = (the_list[-1] + the_list[-2]) / 2 the_list.append(next_value) print(the_list) ``` ### Record lucky numbers A lucky number is a number that has the digit 7 in it, such as 7, 17, 37, or 76. Build a dictionary whose keys is integer ranging from 1 to 999, and the values is a Boolean value that indicates whether the integer key is a lucky number. ```python # HINT: you can get the units digit (個位數), tens digit (十位數), hundreds digits (百位數) using: num = 123 units_digit = num // 10**0 % 10 # "//" is floor division tens_digit = num // 10**1 % 10 hundreds_digit = num // 10**2 % 10 # Your result dictionary should look something like this lucky_numbers = {1: False, 2: False, 3: False, 4: False, 5: False, 6: False, 7: True} # and so on to 1000 ``` ```python lucky_numbers = {} for num in range(1, 1000): units_digit = num // 10**0 % 10 tens_digit = num // 10**1 % 10 hundreds_digit = num // 10**2 % 10 if units_digit==7 or tens_digit==7 or hundreds_digit==7: lucky_numbers[num] = True else: lucky_numbers[num] = False ```