<style>
.markdown-body h1:first-of-type { margin-top: 24px; }
.markdown-body h1 { margin-top: 64px; }
.markdown-body h1 + h2 { margin-top: 32px; }
.markdown-body h2 { margin-top: 48px; }
.markdown-body h2.topics { font-size: 1.8em; border-bottom: none; }
.markdown-body h3 { color: cornflowerblue; }
.markdown-body p strong { font-weight: normal; color: red; }
.exercise { font-size: 150%; font-weight: bold; color: rgb(227,112,183); }
.note { color: red; }
.reveal { background-color: black; margin-bottom: 16px; border: 1px solid white; }
.reveal p { transition: opacity 1s ease; opacity: 0; }
.reveal:hover p { opacity: 1; }
</style>

# ACIT 1515 - Lesson 5

<h2 class="topics">Topics</h2>

- [Sequences: Strings, Lists, and Tuples](#Sequences-Strings-Lists-and-Tuples)
- [Sequence Conversion Methods](#Conversion-Methods-for-Sequences)
- [For Loops](#For-Loops)
- [Ranges](#Ranges)

## Sequences: Strings, Lists, and Tuples

In Lesson 2 we discussed some of the basic [data types](https://hackmd.io/@charris17/acit1515-lesson2#Data-Types) in Python: ints and floats (for whole and decimal numbers), booleans (True/False), None (to indicate a variable is empty, or unset), and ==strings==.

<!--
Strings are simply characters inside of single or double quotes, and are used for storing _text_. The examples below demonstrate the difference between the _text representation of a number_ and the actual number, as well as the _text representation of a value_ and the value itself:

```python=
course_number = 1515 # the integer 1515
course_number_string = '1515' # the string 1515

learns_python = True # the value True
learns_data_types = 'True' # the string True

print(course_number == course_number_string) # returns False! The int 1515 is not the same as the string '1515'!
```
-->

Strings (previously described as characters inside of single or double quotes) are in fact ==sequences== of characters: an ordered collection of individual values.
Strictly speaking, the value below is not the *word* Python

```python=
language = 'Python'
```

it is the _sequence_ of the letters P, y, t, h, o, and n.

### Sequence Operations

Python defines several operations that can be performed on any sequence type. If we want to know whether a particular value is found in a sequence of values, we can use the ==in== operator:

```python=
language = 'Python'

print('P' in language) # prints True
print('x' in language) # prints False
print('x' not in language) # prints True
print('ython' in language) # prints True
```

We can concatenate (join) sequences together:

```python=
first = 'ACIT'
second = '1515'

print(first + second) # prints ACIT1515
```

We can also get the _length_ of a sequence using the `len()` function:

```python=
language = 'Python'

num_characters = len(language)
print(num_characters) # prints 6
```

If we want to read _part_ of a sequence, we can use ==square brackets== and a numeric ==index==:

```python=
language = 'Python'

first_letter = language[0] # the first character is always at index zero, not one!
print(first_letter) # prints P

second_letter = language[1]
print(second_letter) # prints y
```

A negative index is allowed if we want to read starting at the _end_ of the sequence:

```python=
language = 'Python'

last_letter = language[-1] # -1 is always the last character in the string
print(last_letter) # prints n

second_last_letter = language[-2]
print(second_last_letter) # prints o
```

We can even read multiple values within a sequence using square bracket _slice_ notation:

```python=
language = 'Python'

starts_at = language[2:] # start at index 2, stop at end of string
print(starts_at) # prints thon

starts_and_ends_at = language[2:4] # start at index 2, end at 3! (the second index, 4, is not included)
print(starts_and_ends_at) # prints th

ends_at = language[:3] # start at the beginning, end at index 2
print(ends_at) # prints Pyt
```

We can _count_ the number of occurrences of a value inside a sequence using the `count()` method:

```python=
state = 'Mississippi'

how_many = state.count('s')
print(how_many) # prints 4
```

and find the index of the _first occurrence_ of a value inside a sequence using the `index()` method:

```python=
state = 'Mississippi'

first_s = state.index('s')
print(first_s) # prints 2, the first index of the character s
```

### Lists

If strings are sequences of individual characters, e.g. the sequence 'P', 'y', 't', 'h', 'o', and 'n', then lists (similar to what many other programming languages call 'arrays') can be thought of as sequences where the individual values can be _anything_, not just characters. Lists are a way for us to store ==multiple values in a single variable==.

Lists are declared using square brackets and can be made up of a mix of numbers, strings, booleans, and more - even lists inside of lists! Here are a few simple examples:

```python=
number_list = [1, 2, 3, 4, 5]
string_list = ['a', 'b', 'c', 'Python']
boolean_list = [True, False, True, True]
mixed_list = [1, '1', True]
```

Because they are sequences, all of the above [sequence operations](#Sequence-Operations) can be used on lists as well! We can check if a value is ==in== a list, concatenate, get the length of a list, slice, count, and so on.

```python=
my_list = ['BCIT', 'SFU', 'VCC', 'UBC']

print('BCIT' in my_list) # prints True
print(my_list + my_list) # prints ['BCIT', 'SFU', 'VCC', 'UBC', 'BCIT', 'SFU', 'VCC', 'UBC']
print('Capilano' not in my_list) # prints True
print(len(my_list)) # prints 4
print(my_list[1:3]) # prints ['SFU', 'VCC']
```

The last example:

```python=
print(my_list[1:3]) # prints ['SFU', 'VCC']
```

is instructive because it shows us two things about lists.

1. When we slice a list using square bracket notation, e.g. `[1:3]`, we get back a _new_ list made up of values from the original list
2. Even though the values inside `my_list` are strings (sequences), they are each counted as **one** value. 'SFU' is **one** of the values in `my_list`, 'VCC' is another **one** of the values in the list, and so on.

Below are several self-tests (answers are provided at the end of this section):

Given the following list

```python=
courses = [1310, 1515, 1620, 1630]
```

1. What is the result of the following print statement?

```python=
print(courses[2])
```

Given the following _nested_ list (i.e. a list inside of a list)

```python=
letters = [ 'a', 'b', ['c', 'd', 'e'], 'f' ]
```

2. How would you print the letter 'b'?
3. How would you print the list ['c', 'd', 'e']?
4. How would you print the letter 'd'?

#### Answers

```python
print(courses[2])     # 1. prints 1620
print(letters[1])     # 2. prints b
print(letters[2])     # 3. prints ['c', 'd', 'e']
print(letters[2][1])  # 4. prints d
```

Question 4 demonstrates that we can always follow a list with square brackets. Because `letters` is a list, it can be followed by square brackets to read one of the values, e.g. `letters[2]`. The _value_ at index 2 is itself also a list, so it too can be followed by square brackets to read a value, e.g. `letters[2][1]`

### Differences Between Lists and Strings

As noted above, lists and strings can (for the most part) be treated the same. Sequence operations for strings can be used on lists and vice-versa.

Lists _differ_ from strings in at least one important way: lists are ==mutable==. 'Mutable' is just a fancy way of saying we can _change_ the values inside a list. Strings are ==immutable==, meaning that we can never change the values inside the string; we can only create *new* strings.

Because lists are mutable, there are certain operations that are valid for lists that are not valid for strings.
For example, it is common to need to add new values to the _end_ of a list (typically because we are not concerned with the _order_ of things inside a list), so lists have an `append()` method:

```python=
a_list = [] # empty list

a_list.append('a')
a_list.append('b')
a_list.append('c')

print(a_list) # prints ['a', 'b', 'c'], values are in the order they are appended
```

We can also assign new values inside a list. If square bracket notation appears to the left of an assignment operator, the list value at that index is changed.

```python=
a_letter_list = ['a', 'b', 'c', 'd']

a_letter_list[1] = 'B'

print(a_letter_list) # prints ['a', 'B', 'c', 'd']
```

As a reminder, these operations are _not possible_ with strings. Strings are ==immutable==, meaning the characters inside the sequence cannot be changed.

```python=
language = 'python'

language[0] = 'P' # TypeError! 'str' object does not support item assignment
```

### Tuples

Tuples are another sequence type in Python, functionally identical to lists with one exception: they are (like strings) ==immutable==. Tuples are declared using parentheses:

```python=
first_tuple = ('BCIT', 'SFU', 'UBC')
```

and use square brackets in exactly the same way as lists:

```python=
first_tuple = ('BCIT', 'SFU', 'UBC')

print('SFU' in first_tuple) # prints True
print(first_tuple[0]) # prints BCIT
print(first_tuple[1:]) # slice - prints ('SFU', 'UBC')
```

but you cannot change the values inside a tuple:

```python=
second_tuple = ('a', 'b', 'c')

second_tuple[0] = 'A' # TypeError: 'tuple' object does not support item assignment
```

If tuples behave exactly like lists, then when and why should you use a tuple? Use a tuple any time the values inside should *not* change. If you want to ensure that values inside a list do not accidentally get changed or modified, use a tuple instead.
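As a small illustration of the point above, the sketch below shows that joining tuples never changes the original tuple; it only builds a new one. The `weekdays` values are made-up sample data, not part of the lesson.

```python
# Made-up sample data: values that should stay fixed
weekdays = ('Mon', 'Tue', 'Wed')

# Concatenation builds a NEW tuple; the original is left untouched
# (note the trailing comma, which is needed to write a one-value tuple)
longer_week = weekdays + ('Thu',)

print(weekdays)       # prints ('Mon', 'Tue', 'Wed')
print(longer_week)    # prints ('Mon', 'Tue', 'Wed', 'Thu')
print(len(weekdays))  # prints 3
```

Because the only way to "change" a tuple is to create a new one and store it somewhere, an accidental modification of the original values cannot happen silently.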
## Conversion Methods for Sequences

Just like the conversion functions that exist for our basic data types, all of the above sequences have related functions that allow us to change from one type to another.

```
str()
list()
tuple()
```

These functions have many useful applications, like:

- turning a string into a list of individual characters
- converting all the values in a list into a single string
- preventing values in a list from being modified by turning it into a tuple

```python=
grades = [90, 95, 80, 76]

immutable_grades = tuple(grades)

print(immutable_grades) # prints (90, 95, 80, 76)
```

## Using Loops with Sequences

Consider the following example of printing a list, one value per line:

```python=
cities = ['Vancouver', 'Richmond', 'Surrey', 'Burnaby']

print(f'1. {cities[0]}')
print(f'2. {cities[1]}')
print(f'3. {cities[2]}')
print(f'4. {cities[3]}')
```

The code above works correctly, but (as with any programming problem) we should always analyze what might happen if we use the same approach under different circumstances. What if the list contained 1000 cities? You could (but shouldn't) write 1000 lines of code to print all the cities in the list. What if the list was being modified while the program was running and the length was therefore unknown? In this case the solution above fails completely.

Instead of hard-coding each line, we can use a ==loop== to automatically step through the list and achieve the same output with less code. Generally speaking, loops allow us to repeat one or more lines of code multiple times. They help to make our code more [D.R.Y.](https://www.baeldung.com/cs/dry-software-design-principle) (i.e. less repetitive), shorter, and more able to handle the types of problems mentioned above. Any time you see code like the example above, think about how it can be rewritten to be more efficient using a technique like looping.
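As a preview, here is one possible way to rewrite the numbered city listing with a loop. This is only a sketch: the helper name `numbered_lines` is hypothetical (it is not part of the lesson), and the `for` syntax it relies on is explained in the next section.

```python
def numbered_lines(items):
    """Hypothetical helper: return one 'N. value' string per item."""
    lines = []
    number = 1
    for item in items:  # runs once for each value, however many there are
        lines.append(f'{number}. {item}')
        number += 1
    return lines


cities = ['Vancouver', 'Richmond', 'Surrey', 'Burnaby']

for line in numbered_lines(cities):
    print(line)

# prints:
# 1. Vancouver
# 2. Richmond
# 3. Surrey
# 4. Burnaby
```

The same code works unchanged whether the list holds 4 cities or 1000, which is exactly the situation the hard-coded version cannot handle.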
## For Loops

Unlike `while` loops, which run until a boolean condition becomes false, ==for== loops can be used to run one or more statements a set number of times or to step through a sequence. We can step through sequences backwards or forwards, in increments of 1 or more.

The first example below is the simplest implementation of a for loop - it steps through the list we specify (letters), one at a time, from beginning to end.

```python=
letters = ['a', 'b', 'c', 'd']

print('Before loop')

for letter in letters:
    print(letter)

print('After loop')

# prints:
# Before loop
# a
# b
# c
# d
# After loop
```

Note that line 6, `print(letter)` runs 4 times. Why 4 times? Because the length of the list is 4. This style of for loop will run every statement indented underneath for as many times as there are values in the list.

And what is `letter`? `letter` is a variable that we are declaring in line 5 that will be automatically assigned the values from the list, in order. The example uses the name `letter` because it is the singular equivalent of the name of the list, `letters`, but this is not required. The variable declared after `for` can be named whatever you like.

The first time the loop runs, `letter` is assigned the value at `letters[0]`. The second time the loop runs, `letter` is assigned the value at `letters[1]` and so on, until the end of the list is reached. Once we have stepped through the entire list, the rest of the script continues on.

Here is another example, demonstrating that we can put as many statements inside a for loop as we want. Note that the loop is considered finished (just like a conditional statement) when the next _unindented_ line is encountered.
```python=
letters = ['a', 'b', 'c', 'd']

print('Before loop')

for letter in letters:
    print('Inside loop')
    print(letter)

print('After loop')

# prints:
# Before loop
# Inside loop
# a
# Inside loop
# b
# Inside loop
# c
# Inside loop
# d
# After loop
```

## Ranges

All of the `for` loop examples from the previous section show looping through a sequence, one value at a time, beginning to end. The variable we declare (_letter_) is assigned the *values* from the sequence, one-by-one, beginning to end.

Using the `range()` function in a loop gives us a sequence of *numbers*, rather than the values stored in an existing sequence, that we can use to do things like:

- loop backwards
- step through a sequence in increments greater than one
- loop multiple statements with no sequence involved

<!---
The `range()` function gives us three things (two of them optional):

1. A start point
2. An end point
3. An amount to increment

It does *not* generate a sequence of numbers - just the above-mentioned start/end/step values.
--->

Here is a basic example of using range to loop one or more statements:

```python=
for i in range(5):
    print(i)

# prints
# 0
# 1
# 2
# 3
# 4
```

As demonstrated by the output above, the variable `i` is assigned _numbers_ beginning at ==start== and increasing by ==step== each time, stopping before ==end== is reached (==end== itself is never included). When only one number is passed to range it is assigned to ==end==. ==start== defaults to zero, and ==step== defaults to one, so we get the sequence 0 to 4 (i.e. 0 to (5 - 1)).
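One quick way to check which numbers a range produces is to convert it to a list with `list()`, the conversion function introduced earlier. The variable name `first_five` below is just an illustrative choice.

```python
# list() collects every number the range produces so we can see them at once
first_five = list(range(5))

print(first_five)       # prints [0, 1, 2, 3, 4]
print(len(first_five))  # prints 5
print(5 in range(5))    # prints False, because the end value is never included
```

Note that `range(5)` produces five numbers, but the last of them is 4, not 5.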
To define a different start point, or different step value, we must pass more numeric arguments to `range()` in the form `range(start, end, step)`:

```python=
# start at 1 (instead of zero), go to 4, increase by one each time
for i in range(1, 5):
    print(i) # prints 1, 2, 3, 4

# start at 2, go to 4, increase by two each time
for i in range(2, 6, 2):
    print(i) # prints 2, 4

# start at 100
# stop before reaching 0, so the last value is end - step = 0 - (-10) = 10
# which gives the sequence 100 down to 10, decreasing by 10 each time
for i in range(100, 0, -10):
    print(i)
```

Rather than passing in a hard-coded number, we can pass in the result of a function, like `len()`. This way we can create a (more complicated) version of the simple for loops from the previous section:

```python=
cities = ['Vancouver', 'Richmond', 'Surrey', 'Burnaby']

# Simpler version without range
for city in cities:
    print(city)

# Version with range
for i in range(len(cities)):
    print(cities[i])
```

While the second version using range is more complicated than the first, it does offer benefits. We have access to our current position in the sequence (through the variable i), so we could conceivably end the loop at any point we choose, or, for example, print only the even numbered cities. Using range in this example gives us much more flexibility than just start to end, one value at a time.

## Enumeration

Given a sequence, it is sometimes useful to be able to access both the _values_ (e.g. `for city in cities`) *and* a numeric index (e.g. `for i in range(10)`). The `enumerate()` function in Python allows us to do both. It takes a sequence of values, e.g.:

```python=
options = ['Add', 'Update', 'Delete']
```

and returns a new sequence of _tuples_, where each tuple contains a) the numeric position and b) the value.
We can access this sequence of tuples by either using `enumerate` in a for loop, or converting it to a list:

```python=
for index, value in enumerate(options):
    print(index, value)

# prints:
# 0 Add
# 1 Update
# 2 Delete

enumerated = enumerate(options)
print(list(enumerated)) # prints [(0, 'Add'), (1, 'Update'), (2, 'Delete')]
```

Note that when using enumerate in a for loop we declare *two* variables, _index_ and _value_ (these two variables can also be named whatever you like). We use two variables so that the first and second values in each tuple returned from the enumerate function are each assigned to their own name.

## The `break` and `continue` Keywords

Recall this example of an 'infinite' loop from the previous lesson:

```python=
while True:
    print('Hello!')
```

The above loop will run forever, with the user unable to stop it (unless they type ctrl+c, which stops any running program in the terminal). A loop with `while True` as its condition _can_ be safe to use provided you include a way for the loop to stop.

You can stop any loop (while or for) early by using the ==break== statement.

Example using `while`:

```python=
password = ''

while True:
    password = input('Please enter your password: ')

    if len(password) >= 16:
        break # condition has been met, stop the loop
```

Example using `for`:

```python=
schools = ['UBC', 'SFU', 'BCIT', 'VCC', 'Capilano']

for school in schools:
    if school == 'BCIT':
        break # stop the loop if/when we reach BCIT
```

We can also use the ==continue== statement to **skip** any iterations of the loop we wish. `continue` essentially means 'stop at this line and go back to the beginning of the loop'.
Below is an example:

```python=
for i in range(10):
    if i % 2 == 1: # if i is an odd number,
        continue # restart the loop, skipping line 5

    print(i) # prints 0, 2, 4, 6, 8
```

## Combining Loops and Conditional Statements

As shown above, it is often necessary to put conditional statements _inside_ of loops, so that we can test values, or end loops early. It is also possible of course to put a loop inside of a conditional statement, meaning: only run a loop if a condition is true.

The second half of ACIT 1515 is focused on these types of combinations - taking the basic tools (data types, conditional statements, loops, and functions) and combining them in different ways to achieve a solution to a given problem.

How to combine these techniques correctly takes practice, but an important reminder as we advance through the course is: **indentation matters**. Remember that Python relies on indentation to group lines of code together, and incorrect indentation can lead to incorrect results. Take these two examples:

example 1:

```python=
print('Before loop')

for i in range(5):
    print('Inside loop')

print('After loop')
```

example 2:

```python=
print('Before loop')

for i in range(5):
    print('Inside loop')

    print('After loop')
```

The only difference is that line 6 is _indented_ in the second example, and because it is indented it becomes _part of the loop_. Recall that any 'block' (a condition, or loop - any statement that ends with a colon) is considered finished _whenever the next **unindented** line is encountered_.

This minor change results in very different output! Example one outputs:

```
Before loop
Inside loop
Inside loop
Inside loop
Inside loop
Inside loop
After loop
```

and example two outputs:

```
Before loop
Inside loop
After loop
Inside loop
After loop
Inside loop
After loop
Inside loop
After loop
Inside loop
After loop
```

This problem becomes even more pronounced when we have blocks inside of other blocks, e.g. conditional statements inside loops.
There is unfortunately no easy fix; make sure that you are paying attention to how your statements line up horizontally, and test your scripts often!
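
To make the nested case concrete, here is a small sketch of a conditional statement inside a loop, where one level of indentation decides whether a line belongs to the `if` block or only to the loop. The names `numbers`, `inside_if`, and `outside_if` are made up for this illustration.

```python
numbers = [1, 2, 3, 4]

inside_if = []   # filled by the first loop
outside_if = []  # filled by the second loop

for number in numbers:
    if number % 2 == 0:
        print(number, 'is even')
        inside_if.append(number)   # indented under the if: runs only for even numbers

for number in numbers:
    if number % 2 == 0:
        print(number, 'is even')
    outside_if.append(number)      # lined up with the if: runs on every iteration

print(inside_if)   # prints [2, 4]
print(outside_if)  # prints [1, 2, 3, 4]
```

The two loops differ only in how far the `append()` line is indented, yet they collect different values.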