---
tags: Python Workshop 沈煒翔
---
# Lesson 4: Advance Data Structures
<!-- ## Recap (loops)
### Multiplication Table
Print Nine-Nine Multiplication Table (九九乘法表) -->
## Data Structures
Variables are used for storing a single value, such as an int or a float. When you want to store multiple values, it can be troublesome.
```python
# Storing the days in each month
jan = 31
feb = 28
mar = 31
apr = 30
# ... not very practical
```
We use data structures to store multiple values at once.
```python
# Making list in Python
days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
# Getting the days in the fifth month (May)
print(days_in_month[4])
# >>> 31
print(days_in_month[1])
# >>> 28
```
In different programming languages, there are different data structures. In Python, there are 4 built-in data structures.
1. List
2. Tuple
3. Dictionary
4. Set
## List
List is used to store a sequence of values and is indexed by their integer position.
### List indexing
```python
# Initializing a list with some elements in it
scores = [1, 2, 3, 4, 5] # Normally, We name the list using plural
# Index the list
print(scores[0]) # the first element in the list
# >>> 1
print(scores[1]) # the second element in the list
# >>> 2
print(scores[-1]) # the last element in the list
# >>> 5
print(scores[-3]) # the third last element in the list
# >>> 3
# We can change the value of the element
scores[0] = 0
print(scores)
# >>> [0, 2, 3, 4, 5]
# Also, we can index the list by slicing
print(scores[2:4]) # Getting the range from 2 to 4 (which is 2, 3)
# >>> [3, 4]
print(scores[:3]) # Slice the list to get the first three elements
# >>> [0, 2, 3]
print(scores[-2:]) # Slice the list to get th last two elements
# >>> [4, 5]
```
### Adding/removing elements in list
We can also add or remove elements into the list.
```python
scores = [1, 2, 3]
scores.append(4) # add an element at the tail of the list
print(scores)
# >>> [1, 2, 3, 4]
scores.insert(3, 5) # insert an element at position 3
print(scores)
# >>> [1, 2, 3, 5, 4]
scores.pop() # remove an element at the tail of the list
print(scores)
# >>> [1, 2, 3, 5]
scores.pop(2) # remove the element at position 2
print(scores)
# >>> [1, 2, 5]
```
### The length of the list
```python
scores = [1, 2, 3, 4]
print(len(scores))
# >>> 4
```
### Using list in loops
We can use while loops or for loops to go through a list.
```python
scores = [1, 2, 3]
i = 0 # our looping index
while i < len(scores):
print(scores[i])
i += 1
# >>> 1
# >>> 2
# >>> 3
```
Most of the time, we would use for loop instead of while loop for better readability and also better performance.
```python
scores = [1, 2, 3]
for score in scores:
print(score)
# >>> 1
# >>> 2
# >>> 3
```
Sometimes, we want to also know the index (like the looping index) of each element.
```python
results = [50, 60, 70]
for i, result in enumerate(results):
print(i, result)
# >>> 0 50
# >>> 1 60
# >>> 2 70
```
### Using list in if-else statements
We can use ```in``` to return the Boolean results of whether an element is in the list.
```python
fruits = ["apple", "banana", "grape"]
if "peach" in fruits:
print('Yes peach')
else:
print('No peach')
# >>> No peach
```
### Exercise
Print the mean values of the list.
```python
numbers = [1, 3, 6, 8, 17]
```
Add two lists to form a new list.
```python
group1 = [1, 3, 4, 5]
group2 = [3, 5, -1, 7.5]
# >>> [4, 8, 3, 12.5]
```
Print the days in each month.
```python
month = 4 # April
# >>> 30
month = 7 # July
# >>> 31
```
## Tuple
Tuples are also a data structure for storing sequential data, like list. However, tuples are immutable, which means you cannot change/modify the tuple after it is created.
```python
lottery_numbers = (3, 17, 8, 3, 9)
# You can only read the tuple.
print(lottery_numbers[3])
# >>> 3
# You can not change it
lottery_numbers[0] = 1
# >>> TypeError: 'tuple' object does not support item assignment
```
So why use tuple in the first place?
1. Prevent changing the tuple items (less bug and increased readability)
2. Tuples use less memory space (compared with lists)
## Dictionary
In lists and tuples, we use the integer position to index the elements inside the data structures. It is useful to store sequential data but not useful for storing data without orders.
Elements in a dicionary is indexed by its key.
```python
# Storing the height of each person
person_height = {'Emily': 162, 'Tommy': 175, 'Larry': 168}
# Adding a element
person_height['Sunny'] = 155
# Updating a element
person_height['Sunny'] = 160
# Test if a key is inside a dictionary
print('Kevin' in person_height)
# >>> False
# Print all keys in the dictionary
print(person_height.keys())
# >>> dict_keys(['Emily', 'Tommy', 'Larry', 'Sunny'])
# Note that the order of the keys is not tracked and can be changed at any time
# Print all the values in the dictionary
print(person_height.values())
# >>> dict_values([162, 175, 168, 160])
# The order is also arbitrary
```
### Storing data with multiple levels
Dictionaries are often used for storing multi-level data.
```python
student_infos = {
"Emily": {
"tests": [88, 95, 97],
"gender": "female",
"ID": 1
},
"Bob": {
"tests": [50, 58, 68],
"gender": "male",
"ID": 2
},
}
print(student_infos['Emily']['tests'][0])
# >>> 88
```
### Exercise
Get the price of the given item.
```python
prices = {"coke": 20, "tea": 30, "coffee": 40}
item = "coke"
# >>> print(20)
item = "water"
# >>> print("Free?")
```
## Class Exercise
Given a list of tuples. Each tuple represents the education level and the salary of a single person.
```python
datas = [('master', 1038),
('bachelor', 843),
('master', 1034),
('master', 1008),
('bachelor', 810),
('master', 1004),
('bachelor', 839),
('bachelor', 806),
('master', 990),
('master', 999),
('bachelor', 750),
('bachelor', 758),
('bachelor', 764),
('bachelor', 790),
('bachelor', 792),
('master', 1030),
('PhD', 1208),
('master', 956),
('PhD', 1207),
('master', 985)]
```
1. You collect 3 more data, append the data to the original list.
```python
new_data = [('master', 1017),
('master', 1027),
('bachelor', 829)]
```
2. Calculate the average salary of master
```python
print(1008)
```
3. Which education level is the most common? What is the percentage?
```python
#print("bachelor")
#print(0.43478)
```
4. You notice that all the collected data from PhD is incorrect. Remove all data point of PhD.
## Take-home Practices
### List operations
Double the values of each element in the list and then print the list in reverse order.
```python
the_list = [-9, 6, 0, 3, 8]
# Your result should look like
# >>> [16, 6, 0, 12, -18]
```
```python
the_list = [-9, 6, 0, 3, 8]
new_list = []
# traverse the list in reverse order
for i in reversed(range(len(the_list))):
value = the_list[i]
new_list.append(value*2)
print(new_list)
```
### List operations 2
Remove all numbers in a given list whose factors contain 3 and then print the list.
```python
the_list = [3, 7, 17, 21, 1, -6, 5, 9]
# Your result should be
# >>> [7, 17, 1, 5]
```
```python
the_list = [3, 7, 17, 21, 1, -6, 5, 9]
new_list = []
for value in the_list:
if value % 3 != 0:
new_list.append(value)
print(new_list)
```
### Dictionary operation
Count the top three most frequency appearing alphabet letters in the string.
```python
the_string = "Bitcoin is a decentralized digital currency, without a central bank or single administrator, that can be sent from user to user on the peer-to-peer bitcoin network without the need for intermediaries."
# HINT: to loop through the alphabet letters, we can also use for loops.
for letter in "Egg":
print(letter)
# >>> 'E'
# >>> 'g'
# >>> 'g'
```
```python
the_string = "Bitcoin is a decentralized digital currency, without a central bank or single administrator, that can be sent from user to user on the peer-to-peer bitcoin network without the need for intermediaries."
# count all the appear letters
counting_dict = {}
for letter in the_string:
if letter in counting_dict:
counting_dict[letter] += 1
else:
counting_dict[letter] = 1
# get the most frequent letter three times (so top three)
for _ in range(3):
max_count = 0
max_letter = ''
for key in counting_dict.keys(): # linear search the most frequent letter
if counting_dict[key] > max_count:
max_letter = key
max_count = counting_dict[key]
print('Letter', max_letter, 'appears', max_count, 'times')
# remove the letter from the dictionary by setting it to zero
counting_dict[max_letter] = 0
```
### Sequence prediction
There is a sequence of values with the following relation.
```x[n] = (x[n-1] + x[n-2]) / 2```
Given a list of values, predict the next 5 values and print the list.
```python
the_list = [24, 8, 16]
# Your result should look like
# >>> [24, 8, 16, 12, 14, 13, 13.5, 13.25]
```
```python
the_list = [24, 8, 16]
for _ in range(5):
next_value = (the_list[-1] + the_list[-2]) / 2
the_list.append(next_value)
print(the_list)
```
### Record lucky numbers
A lucky number is a number that has the digit 7 in it, such as 7, 17, 37, or 76. Build a dictionary whose keys is integer ranging from 1 to 999, and the values is a Boolean value that indicates whether the integer key is a lucky number.
```python
# HINT: you can get the units digit (個位數), tens digit (十位數), hundreds digits (百位數) using:
num = 123
units_digit = num // 10**0 % 10 # "//" is floor division
tens_digit = num // 10**1 % 10
hundreds_digit = num // 10**2 % 10
# Your result dictionary should look something like this
lucky_numbers = {1: False, 2: False, 3: False, 4: False, 5: False, 6: False, 7: True} # and so on to 1000
```
```python
lucky_numbers = {}
for num in range(1, 1000):
units_digit = num // 10**0 % 10
tens_digit = num // 10**1 % 10
hundreds_digit = num // 10**2 % 10
if units_digit==7 or tens_digit==7 or hundreds_digit==7:
lucky_numbers[num] = True
else:
lucky_numbers[num] = False
```