# Lists

Lists are interesting and versatile data types. We will see everything we can do with lists in this module.

## What is a list

A list is a data type that contains elements (making it a collection type). Lists can contain any type of data (ints, floats, strings, other lists, etc). A list can also contain a mix of data types. However, it is good practice to have each list use only one data type to avoid confusion or errors.

Example: list of ints -- `[1,2,3,4,5]`

Example: list of strings -- `['a','b','c']`

The elements in a list are separated by commas.

## Comparing lists

Like most data types, lists can be compared with the equality operators (== and !=):
* == returns True if the two lists contain the same elements in the same order
* != returns True if the two lists differ in any way

## Index Notation, Slice Notation, and Stride Notation

Similar to strings, we can refer to and access specific elements in a list by using their indices. We label list indices in the same way we label string indices: the first element is index 0, the second element is index 1, etc...

```
list  : ['a', 'b', 'c']
index :   0    1    2
```

### Index Notation

We access list elements the same way we access string elements: `my_list[index]`

```python=
my_list = ['a','b','c']
print(my_list[0])   # prints 'a'
print(my_list[-1])  # prints 'c' because negative indices count from the end
```

Remember, when we index into a list, the data type that is returned depends on what data type the list holds.

### Slice Notation

We can also use slice notation the same way we did for strings. The only difference is that slice notation for strings returns a substring, while slice notation for lists returns a *sublist* (that is, a smaller list). Thus, when we slice (and stride) a list, we have to remember that the result is a list.

```python=
my_list = ['a','b','c']
print(my_list[:2])  # output: ['a', 'b']
```

### Stride Notation

By this point, you will assume that stride notation is similar to that of strings. And you would be correct.
Stride adds a third value, the step (or increment), which dictates which elements end up in the returned sublist. As with strings, it is important to understand how stride works and how to use it, but the most relevant use of stride is the *negative* stride.

```python=
my_list = ['a','b','c']
print(my_list[::-1])  # output: ['c', 'b', 'a']
```

**Remember:** slice and stride return a *sublist* and do not modify the actual list itself. Everything returned by these notations is a new list copied from the original.

## List concatenation

Recall string concatenation, where we were able to add strings together to make a bigger string. The same can be said about lists: lists can be added together to produce bigger lists.

We might want to save the new bigger list to a new variable:

```python=
list1 = [1,2,3]
list2 = [4,5,6]
list3 = list1 + list2  # list3 is [1,2,3,4,5,6]; list1 and list2 are unchanged
```

Or we might want to add/concatenate a list to the end of an existing list:

```python=
list1 = [1,2,3]
list2 = [4,5,6]
list1 += list2  # list1 is now [1,2,3,4,5,6]
```

## Iterating through indices of a list

Similar to strings, we might want to keep track of the current index as we iterate through a list. The syntax is just like how we do it for strings:

```python=
list1 = ["apple", "orange", "banana"]
for index in range(len(list1)):
    print(f"Index {index}: {list1[index]}")
```

## Lists are mutable

Recall from last week how we called strings *immutable*. That is, we cannot modify specific characters in a string. Lists are *mutable*, meaning we *can* modify elements at specific indices of the list.

Say we have a list, `fruit_list = ["apple", "orange", "banana"]`, and we want to change "banana" to "pear". Instead of reassigning the entire fruit_list variable with a new list, we can just modify the value of the element where "banana" is.

```python=
fruit_list = ["apple", "orange", "banana"]
fruit_list[2] = "pear"
```

We are saying that we want to replace the element at index 2 with the value "pear". However, we must be careful when we do this.
Just like when we try to access indices out of range, we get an IndexError when we try to modify an element at an index out of range...

```python=
# CAUSES AN INDEX ERROR
fruit_list = ["apple", "orange", "banana"]
fruit_list[10] = "pear"
```

**Takeaway:** We can modify the elements in a list directly through in-place modification. Mutability means we can change the values of elements without having to reassign the entire list. We will learn other ways mutability affects lists when we talk about methods.

## List Methods

Lists' mutability also affects the way we apply methods to them. 7 of the 8 list methods provided in the List Methods "Cheat Sheet" directly change the elements in the list. Some might change values in the list, some might change the order of the list, some might change the length of the list.

Most of the methods that modify lists do not return anything because they are modifying the list directly (`.pop()`, covered below, is the exception: it modifies the list *and* returns the removed element). That means, unlike strings, we don't have to "save" the changes.

### .append(x)

The `.append(x)` method *adds* a value (x) to the **end** of the list.

```python=
my_list = ["apple", "orange", "banana"]
my_list.append("pear")
# list is now: ["apple", "orange", "banana", "pear"]
```

### .insert(i,x)

The `.insert(i,x)` method inserts a value, x, at index i. x is inserted *at* index i, meaning that everything previously at index i (and after) gets shifted to the right. If i *exceeds* the valid index range, x is inserted at the end of the list.

```python=
my_list = ["apple", "orange", "banana"]
my_list.insert(1, "peach")
# my_list is now: ["apple", "peach", "orange", "banana"]
```

### .remove(x)

The `.remove(x)` method removes the first instance of value x from the list. If there are multiple instances of x, it will remove the one at the smallest index.
If there is no x in the list, it will cause a ValueError.

```python=
my_list = ["apple", "orange", "banana", "orange"]
my_list.remove("orange")
# result list: ["apple", "banana", "orange"]
```

```python=
# ERROR
my_list = ["apple", "orange", "banana"]
my_list.remove("watermelon")
```

### .pop() and .pop(i)

The `.pop()` method has two uses: `.pop()` and `.pop(i)`.

`.pop()` removes and *returns* the last element of the list. If the method is applied to an empty list, it will produce an IndexError.

```python=
my_list = ["apple", "orange", "banana"]
my_list.pop()
# resulting list: ["apple", "orange"]
```

Since `.pop()` returns the value it removes, we can save the removed element to use in various other tasks:

```python=
my_list = ["apple", "orange", "banana"]
removed_fruit = my_list.pop()
# code that uses the removed_fruit
# removed_fruit will contain the string "banana"
```

`.pop(i)` removes and *returns* the element at index i of the list. If i is out of the index range, it will produce an IndexError.

```python=
my_list = ["apple", "orange", "banana"]
my_list.pop(1)
# my_list becomes ["apple", "banana"]
```

Since `.pop(i)` returns the value it removes, we can save the removed value to use in other actions...

```python=
my_list = ["apple", "orange", "banana"]
removed_fruit = my_list.pop(1)
# code that uses removed_fruit
# removed_fruit is given value "orange"
```

### .reverse()

The `.reverse()` method reverses the list *in-place*. Unlike negative stride, `[::-1]`, which returns a reversed copy, `.reverse()` directly modifies the list it is called on.

```python=
my_list = ["apple", "orange", "banana"]
my_list.reverse()
# my_list is now ["banana", "orange", "apple"]
```

### .sort()

The `.sort()` method sorts a list *in-place*. For this class, we will restrict the use of `.sort()` to lists with numeric types. The `.sort()` method works on other data types, but we won't worry about that...
```python=
my_list = [1,5,4,2,3]
my_list.sort()
# my_list now contains [1,2,3,4,5]
```

### .count(x)

The `.count(x)` method is the only method covered here that does not modify the list. This method counts the number of times the element x appears in the list and returns the count.

```python=
my_list = ["apple", "orange", "banana"]
print(my_list.count("apple"))  # output: 1
print(my_list.count("pear"))   # output: 0
```

## Nested Lists

Similarly to nested loops, we do not have to worry about nested lists... But, since lists can contain any data type, a list can also contain lists! This produces what is known as a *nested* list.

As with other lists, it is best practice that if a list is going to contain another list as an element, all of the list's elements should be lists. This way, you know that the object/variable is entirely a list of lists.

When we have nested lists, we refer to the bigger list (the list that contains the other lists) as the **outer list**, and the smaller lists are the **inner lists**.

A nested list might look something like:

```!
[["corgi", "golden retriever", "husky"], ["tabby", "calico"], ["cardinal", "blue jay", "pigeon"]]
```

(The outer list contains one list per kind of animal; each inner list contains specific animals of that kind.)

For readability, we can also lay out the nested list like:

```
[
    ["corgi", "golden retriever", "husky"],
    ["tabby", "calico"],
    ["cardinal", "blue jay", "pigeon"]
]
```

Notice that the inner lists are separated by commas, as we do for all elements in a list.

### Index Notation for Nested Lists

We can access specific elements of nested lists (whether that is an inner list as a whole or an element of an inner list) through index notation!

When we want to access an element of an inner list, we must use *two* indices: the index of the inner list we want to access *and* the index of the element *in* that inner list.
```
nested_list[outer index][inner index]
```

(This is not the case if we just want an entire inner list; then the outer index alone is enough.)

**Example:** Given our list of animals, let's access some specific nested elements.

```python=
animal_list = [["corgi", "golden retriever", "husky"], ["tabby", "calico"], ["cardinal", "blue jay", "pigeon"]]
```

If I want to access "corgi" from my `animal_list`, I have to figure out where it is. The inner list that "corgi" is in is the first inner list (aka the first element of the outer list). So the outer index that I am accessing is 0: `animal_list[0][inner index]`

In the list that contains "corgi", `["corgi", "golden retriever", "husky"]`, "corgi" is the first element. So the inner index is also 0: `animal_list[0][0]`

### Iterating through Nested Lists

We also have a way of accessing each individual element, no matter which inner list it is in. The way we can do this is through *nested loops*. The outer loop accesses the elements of the outer list (the inner lists), and the inner loop accesses the elements of each inner list.

Thus, given our list of animals, we can access each string:

```python=
animal_list = [["corgi", "golden retriever", "husky"], ["tabby", "calico"], ["cardinal", "blue jay", "pigeon"]]

for group in animal_list:
    # for each iteration, group holds an inner list
    # e.g. for the first iteration,
    # group = ["corgi", "golden retriever", "husky"]

    # if we want to iterate over the inner list to
    # see the individual elements of the inner list
    # we have to have our inner loop iterate through the elements
    # of the temp variable, group:
    for animal in group:
        print(animal)
```

## Lists as arguments

An interesting characteristic of lists is that, since they are mutable, if you write a function that takes a list as a parameter and your function modifies that list, you do NOT need to return the list to save the changes.
The function will still modify the list directly, altering the list that exists outside of the function!

As a basic example, let's write a function that takes a list of numbers and sorts it in descending order:

```python=
def sort_list_descending(my_list):
    my_list.sort()
    my_list.reverse()
    # note: we cannot chain these as
    # my_list.sort().reverse()
    # because .sort() modifies the list in place and returns None,
    # so there would be nothing to call .reverse() on

list_num = [1,6,5,8,9,13,2,33,5,66,342]
# since sort_list_descending() doesn't return anything, we don't
# need an assignment statement
sort_list_descending(list_num)
print(list_num)
```

## Caution: adding/removing elements to/from a list during a loop

If we try to add or remove elements in the middle of a list while a loop is iterating over it, we can run into issues: each removal shifts the remaining elements to the left, so the loop skips over the element that slides into the current position...

```python=
# example: removing items from a list using a loop
# something wonky happens...
list_fruits = ['apple', 'grape', 'banana', 'orange', 'pear', "strawberry"]
list_fruits2 = ['apple', 'blueberry', 'banana', 'strawberry', 'pear']

for fruit in list_fruits:
    if fruit in list_fruits2:
        list_fruits.remove(fruit)

print(list_fruits)
# output: ['grape', 'orange', 'strawberry']
```

For this particular problem, we can choose to iterate over a *copy* of the list (because we are not dealing with indices):

```python=
# example of iterating over a copy of a list
# same example as above, but prevents the mistake of not removing 'strawberry' from the list
list_fruits = ['apple', 'grape', 'banana', 'orange', 'pear', "strawberry"]
list_fruits2 = ['apple', 'blueberry', 'banana', 'strawberry', 'pear']

# remember, slicing returns a NEW value that contains the elements from index start:end
for fruit in list_fruits[:]:
    if fruit in list_fruits2:
        list_fruits.remove(fruit)

print(list_fruits)
# output: ['grape', 'orange']
```

However, this is not always the correct solution. If you come across this problem, try to figure it out on your own, but I am happy to help!
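When iterating over a copy is not a good fit, one other option is to leave the original list alone and build a *new* list that holds only the elements we want to keep. The sketch below reuses the fruit lists from the example above; the function name `keep_fruits_not_in` and its parameter names are made up for illustration and are not part of the course material.

```python
def keep_fruits_not_in(fruits, unwanted):
    """Hypothetical helper: return a NEW list with every element of
    fruits that does not appear in unwanted."""
    kept = []
    for fruit in fruits:
        # we only append to kept, so the list we are looping over never changes
        if fruit not in unwanted:
            kept.append(fruit)
    return kept


list_fruits = ['apple', 'grape', 'banana', 'orange', 'pear', 'strawberry']
list_fruits2 = ['apple', 'blueberry', 'banana', 'strawberry', 'pear']

remaining = keep_fruits_not_in(list_fruits, list_fruits2)
print(remaining)    # output: ['grape', 'orange']
print(list_fruits)  # the original list still has all six fruits
```

Because nothing is removed during the loop, no element gets skipped. The trade-off is that the result lives in a new variable, so this version needs a return statement and an assignment.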
## Examples involving lists

Fill a list with 30 random numbers (choose your range). Sort the list in descending order. Calculate and print the average of the values in the list.

```python=
import random

# make the list variable. It's initially empty
random_list = []

# fill the list
for x in range(30):
    rand = random.randint(1,500)
    random_list.append(rand)

# sort in descending order
# (these must be two separate statements: .sort() returns None,
# so random_list.sort().reverse() would cause an error)
random_list.sort()
random_list.reverse()

# get average
# (named total so we don't overwrite the built-in function sum)
total = 0
for num in random_list:
    total += num

print(f"average: {total / len(random_list)}")
```

## Lists Practice Exercises

1. Write a function that makes a list of the unique letters used in a string. That is, if the letter x is used twice in a sentence, it should appear only once in your list. Capitalization shouldn't matter, meaning that if there is an "a" and an "A" in the string, only one a (either uppercase or lowercase, your choice) should be in the list. [^1]

[^1]: Answer
    ```python=
    def get_unique_letters(text):
        letter_list = []
        for char in text.lower():  # lowercase so "a" and "A" count as the same letter
            # keep only letters that are not already in the list
            if char.isalpha() and char not in letter_list:
                letter_list.append(char)
        return letter_list
    ```

2. Fill a list with 30 random numbers (choose your range). Sort the list in descending order. Calculate and print the average of the values in the list. [^2]

[^2]: Answer:
    ```python=
    import random

    # make the list variable. It's initially empty
    random_list = []

    # fill the list
    for x in range(30):
        rand = random.randint(1,500)
        random_list.append(rand)

    # sort in descending order
    # (these must be two separate statements: .sort() returns None,
    # so random_list.sort().reverse() would cause an error)
    random_list.sort()
    random_list.reverse()

    # get average
    # (named total so we don't overwrite the built-in function sum)
    total = 0
    for num in random_list:
        total += num

    print(f"average: {total / len(random_list)}")
    ```
    Could also find the sum with the built-in function, `sum`, which returns the sum of a list of numeric types:
    ```python=
    total = sum(random_list)
    print(f"average = {total / len(random_list)}")
    ```

3. Write a function that, given a list as a parameter, removes all duplicates from the list. Outside of the function, print the modified list. (this one is tricky) [^3]

[^3]: Answer:
    ```python=
    def get_unique_list(my_list):
        # iterate over a copy, because we remove from my_list inside the loop
        # (looping over my_list itself can skip elements and miss duplicates,
        # e.g. with [1,2,1,3,2,3])
        for item in my_list[:]:
            while my_list.count(item) > 1:
                my_list.remove(item)

    my_list = [1,2,1,5,3,5,0]
    get_unique_list(my_list)
    # since the actions on my_list in the function are
    # done directly to the list, we don't need to return
    # the list. Since there is no return
    # statement in the function, there is no need to do
    # any assignment statement
    print(my_list)
    ```
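For exercise 3, another possible approach is to collect the unique items into a temporary list first and then copy them back into the original list, so nothing is removed while looping. This is only a sketch under some assumptions: the function name `remove_duplicates_in_place` is made up for illustration, and the line `my_list[:] = unique_items` uses slice *assignment*, which replaces the contents of the existing list and has not been covered in this module.

```python
def remove_duplicates_in_place(my_list):
    """Hypothetical alternative: keep the first occurrence of each item."""
    unique_items = []
    for item in my_list:
        # only unique_items grows here; my_list is not changed during the loop
        if item not in unique_items:
            unique_items.append(item)
    # slice assignment swaps the contents of the SAME list object,
    # so the change is visible outside the function without a return
    my_list[:] = unique_items


numbers = [1, 2, 1, 3, 2, 3]
remove_duplicates_in_place(numbers)
print(numbers)  # output: [1, 2, 3]
```

Note that this version keeps the *first* occurrence of each value, whereas repeatedly calling `.remove()` keeps the *last* one, so the final order of the list can differ between the two approaches.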