---
tags: python-course
title: lesson-07
---

# Collections of Values

[![hackmd-github-sync-badge](https://hackmd.io/_-6IcLirRS6WCnjqCm1smg/badge)](https://hackmd.io/_-6IcLirRS6WCnjqCm1smg)

:::info
:bulb: The data you work with will often consist of several values. This lesson introduces collection types to help you deal with multiple related values.

:school: Teacher is Nina, student is James.
:::

:::success
:movie_camera: VIB background fading to course title slide. James's and Nina's smiling faces appear.
:::

:::warning
:notes: Upbeat intro music
:::

**James**: Do you know how many DNA base pairs you have in every cell?

**Nina**: How many?

**James**: More than 3 billion! If I want to work with a human genome, do I have to write out 3 billion variables?

**Nina**: Thankfully, no! You can make your life much easier; hopefully that's what you're using the computer for. You can put each base in a "list" and store that list in a single variable.

:::success
:movie_camera: Animation of Python syntax storing DNA bases in a list of strings.
```python
["A", "T", "G", "C"]
```
:::

**Nina** (over the above animation): This is how you can write a list of values in Python. You can treat the list as a single value; for example, you can assign it to a variable.

:::success
:movie_camera: Animation storing the list in a variable.
```python
dna = ["A", "T", "G", "C"]
```
:::

**James**: Can I still access each base separately?

**Nina**: This list contains 4 string values: "A", "T", "G", and "C". You can still access each of them separately if you want to. First, it's important to understand that the list is ordered. What I mean is that it makes sense to ask, "what is the value at the beginning of the list?" or "what is the value at the end of the list?".

**Nina** (over indexing animation): You can extract a value from a list using a special operator, known as the "index" operator. You can think of the index as an "offset": the number of values to skip, counting from the beginning of the list.
So the first value in the list is the beginning plus an offset of 0 values. So the "index" of the first value is 0.

:::success
:movie_camera: Animation of indexing the first value in the `dna` list.
```python
dna[0]
# then
["A", "T", "G", "C"]
#^ beginning + 0
"A"
```
> [name=Bruna Piereck] It could be interesting in the animation to number all the indices below the list (0, 1, 2, 3) as a visual reference, to highlight that indexing doesn't start at 1. This will be especially important for slicing.

This could be shown in steps as it is narrated.
:::

**James**: What if there are no values in the list, or if I try to access the dna list at index 500?

**Nina**: You should try to find out.

**James** (over animation): Python tells me I made a mistake: the list index I used is out of range.

:::success
:movie_camera: Animation of
```python
>>> dna[500]
IndexError: list index out of range
```
:::

**Nina**: Of course that index is out of range, because there are only 4 values in the list. So it doesn't make any sense to ask for the value at index 500. To avoid this error, you can ask Python how many values the list contains before trying to index into the list. To do that you ask, "what is the length of the list?"

:::success
:movie_camera: Animation
```python
len(dna)
```
:::

**James** (over animation): Now I can work on individual values in the list. Suppose I change a value; how can I put it back into the list? I tried to assign the first value in my list to a variable and then assign "T" to that variable. But that doesn't change the value in my list.

:::success
:movie_camera: Animation in the background showing:
```python
first = dna[0]
# then
first = "T"
# then
dna
# ["A", "T", "G", "C"] but not ["T", "T", "G", "C"]
```
> [name=Bruna Piereck] We should also print the result of `first` after both assignments, for visualization and easier understanding.

Show steps demonstrating that the value in the list is not changed.
:::

**Nina**: When you extract the first value from the list with the index operator and assign it to a variable, this variable now has no connection to the list. Instead, you will need to assign to the list directly at index 0, like this:

:::success
:movie_camera: Animation in the background showing:
```python=
dna[0] = "T"
# Then show that dna contains ["T", "T", "G", "C"]
```
:::

**Nina**: There are several more operations on lists of values that are useful when writing programs: namely slicing, sorting, and checking if a value is in a list.

**James**: OK, I can understand what sorting and checking are. But what is slicing?

**Nina**: Slicing is a lot like indexing. The operator even looks the same. The difference is that, while you can index a single value, you can slice ranges of indices. If I want to discard every second value from my list, I can use a slice instead of an index.

**James**: So slicing looks like indexing, but for ranges of indices?

**Nina** (over animation): That's right, James! The first number in the slice is the "start" index. Then you write the "end" index after a colon; the slice stops just before it. And finally you can write the "step", the number to add to the current index to get to the next value. So we start at 0, that's "A", and we keep that. Then we step by 2, skipping over "T" to get to "G", and we keep "G". Then we step by 2 again, over "C", and there are no more values in the list.

:::success
:movie_camera: Animation in the background showing the construction of a slice discarding every second value.
```python
dna[
# then
dna[0
# then
dna[0:4
# then
dna[0:4:2]
# then
["A", "G"]
# ["A", x"T", "G", x"C"]
```
:::

**Nina**: As a result, that gives you another list. Slicing a list always gives you a list back.

**James**: So what happens to the `dna` list?

**Nina**: The original `dna` list is unchanged. The list you get from the slicing operator is a new list containing the sliced values. And you can operate on it as a separate value: assign it to a variable, sort it, etc.
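
Nina's point that slicing leaves the original list untouched can be checked with a short sketch. It assumes the four-base `dna` list from the start of the lesson; the variable name `every_second` is hypothetical and not part of the script.

```python
# Assumption: the four-base list used at the start of the lesson.
dna = ["A", "T", "G", "C"]

# Slice from index 0 up to (but not including) index 4, stepping by 2.
every_second = dna[0:4:2]  # hypothetical variable name

print(every_second)  # ['A', 'G']
print(dna)           # ['A', 'T', 'G', 'C'] (unchanged)

# Assigning into the new list does not affect the original list.
every_second[0] = "T"
print(every_second)  # ['T', 'G']
print(dna)           # ['A', 'T', 'G', 'C']
```

Storing the slice in its own variable makes it visible that there are two independent lists after the slicing operation.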

**James**: OK, I think that makes sense. So how would I sort the resulting list?

**Nina** (over demo): You can use the `sorted` built-in function to sort lists.

**James** (over demo): OK. Sorting the `dna` list gives me what I would expect... Sorting numbers too. What about my name...

:::success
:movie_camera: Demo of `sorted()`
```python
# First
sorted(dna)
# Second
sorted([8, 9, 7, 3, 6, 5, 8])
# Third
sorted("james")
```
:::

**Nina**: Is that what you expected to happen?

**James**: So, Python knows when to sort alphabetically and when to sort numerically based on the types in the list?

**Nina**: Not quite. Computers don't really have a notion of "text". Under the hood, it's all just numbers, and humans have chosen specific numbers to represent letters. For example, the number 65 represents the uppercase 'A', 66 the uppercase 'B', and so on. So really it's all just sorting numerically, and you can see that by sorting a mixed-case string.

:::success
:movie_camera: Sort a mixed-case string
```python
sorted("James")
```
:::

**James**: Alright, that seems hard to understand.

**Nina**: Yes, it's complicated, but it's not super important to understand these details. Instead, you can remember that uppercase letters are sorted before lowercase letters.

**James**: OK, I can leave that until it becomes important. Finally, how do I check if a value exists in a list?

**Nina** (over the demo): You can use an operator called "in". By using it, you're essentially asking a question, "Is something in a list?" and that's exactly how you write it...

:::success
:movie_camera: Demo `in` operator
```python
"A" in dna
# => True
# Then
"X" in dna
# => False
```
:::

**James**: This seems useful in `if` statements!

**Nina** (over the below demo): Exactly! Anything that results in a boolean value is useful in an `if` statement. Let's write an example while constructing a list of DNA bases...

:::success
:movie_camera: Demo constructing a list of bases
```python=
dna = []
if "T" not in dna:
    dna = dna + ["T"]
if "A" not in dna:
    dna = dna + ["A"]
if "G" not in dna:
    dna = dna + ["G"]
if "C" not in dna:
    dna = dna + ["C"]
# dna = ["T", "A", "G", "C"]
```
:::

**Nina**: That's obviously a trivial example, but it illustrates the use of the `in` operator. You can imagine a less trivial example where you don't know the contents of `dna` before trying to modify it. It also illustrates the use of the plus operator when the operands are lists: it concatenates them, just as we saw in one of the previous lessons that plus concatenates when its operands are strings.

:::success
:movie_camera: Animation highlighting the `+` operator on lists.
```python
if "T" not in dna:
    dna = dna + ["T"]
```
:::

**James**: I suppose the operand types need to match when using the `+` operator?

**Nina** (over animation): Just like with strings and numbers, if the operand types do not match then Python will tell you.

:::success
:movie_camera: Animation showing a `TypeError`.
```python
>>> if "T" not in dna:
...     dna = dna + "T"
TypeError: can only concatenate list (not "str") to list
```
:::

**James**: So that is how I add values to a list?

**Nina**: That's exactly right: you concatenate your list with a new list containing the values you want to include, then assign the result.

> [name=James Collier] Deletion should be covered in the notebook / or linked somewhere.

**James**: Thanks Nina!

**Nina**: You're welcome!

**Nina** (to the camera): In this lesson you learned how to store multiple values in an ordered list and store that list in a variable.

**James**: I also learned about some of the operations on lists, including concatenation, indexing, slicing, sorting, and checking the existence of a value inside a list.

**James and Nina**: See you next time!

> [name=Bruna Piereck] It could be interesting to show or comment on how to handle case sensitivity when searching strings...

:::warning
:notes: Upbeat outro music. James and Nina wave.
:::

:::success
:movie_camera: Fade to VIB logo slide.
:::
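
As a possible recap for readers of this script, the list operations named in the closing dialogue can be sketched in one runnable snippet. It assumes the four-base `dna` list from the start of the lesson; the variable names on the left-hand side are hypothetical and chosen only for illustration.

```python
# Assumption: the four-base list used at the start of the lesson.
dna = ["A", "T", "G", "C"]

first = dna[0]              # indexing: 'A'
count = len(dna)            # length: 4
every_second = dna[0:4:2]   # slicing: ['A', 'G']
ordered = sorted(dna)       # sorting: ['A', 'C', 'G', 'T']
mixed = sorted("James")     # uppercase first: ['J', 'a', 'e', 'm', 's']
has_a = "A" in dna          # membership: True
has_x = "X" in dna          # membership: False
longer = dna + ["A"]        # concatenation: ['A', 'T', 'G', 'C', 'A']

# Checking the length first avoids an IndexError for an out-of-range index.
index = 500
if index < len(dna):
    print(dna[index])
else:
    print("index", index, "is out of range for a list of length", count)

# None of the operations above changed the original list.
print(dna)  # ['A', 'T', 'G', 'C']
```

Only an assignment such as `dna[0] = "T"` or `dna = dna + ["A"]` changes what the `dna` variable holds; every operation in the sketch produces a new value instead.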