---
tags: python-course
title: lesson-07
---
# Collections of Values (edited)
:::info
:bulb: The data you are working with will often be several values. This lesson introduces collection types to aid in dealing with multiple related values.
:school: Teacher is Nina, student is James.
:::
:::success
:movie_camera: VIB background fading to course title slide. James's and Nina's smiling faces appear.
:::
:::warning
:notes: Upbeat intro music
:::
**James**: Do you know how many DNA base pairs you have in every cell?
**Nina**: How many?
**James**: More than 3 billion! If I want to work with a human genome, do I have to write out 3 billion variables?
**Nina**: Thankfully, no! Making your life easier is hopefully what you're using the computer for. You can put each base in a "list" and store that list in a single variable.
:::success
:movie_camera: Animation of Python syntax storing DNA bases in a list of strings.
```python
["A", "T", "G", "C"]
```
:::
**Nina** (over the above animation): This is how you write a list of values in Python. You can treat the list as a single value; for example, you can assign it to a variable.
:::success
:movie_camera: Animation storing the list in a variable.
```python
dna = ["A", "T", "G", "C"]
```
:::
**James**: Can I still access each base separately?
**Nina**: This list contains 4 string values: "A", "T", "G", and "C". You can still access each of them separately if you want to. First, it's important to understand that the list is ordered: it makes sense to ask, "what is the value at the beginning of the list?" or "what is the value at the end of the list?".
**James**: Ok, then how would I access the first string, the "A"?
**Nina** (over indexing animation): You can extract a value from a list using a special operator, known as the "index" operator. You can think of this as an "offset" of the number of values from the beginning of the list. So the first value in the list is the beginning plus an offset of 0 values. So the "index" of the first value is 0.
:::success
:movie_camera: Animation of indexing the first value in the `dna` list.
```python
dna[0]
# then
["A", "T", "G", "C"]
#^ beginning + 0
"A"
```
This could be shown in steps as it is narrated.
:::
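:::info
:bulb: A possible companion snippet for the indexing step. It is only a sketch that reuses the four-base `dna` list from the animation and reads the values at offsets 0, 1 and 3. The names `first`, `second` and `last` are illustrative.

```python
dna = ["A", "T", "G", "C"]

first = dna[0]   # offset 0 from the beginning of the list
second = dna[1]  # offset 1 from the beginning of the list
last = dna[3]    # offset 3: the fourth and final value

print(first, second, last)  # A T C
```

Because counting starts at 0, the last index of this four-value list is 3, not 4.
:::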
**James**: What if there are no values in the list, or if I try to access the dna list at index 500?
**Nina**: You should try to find out.
**James** (over animation): Python tells me I made a mistake: the list index I used is out of range.
:::success
:movie_camera: Animation of an out-of-range index.
```python
>>> dna[500]
IndexError: list index out of range
```
:::
**Nina**: Of course that index is out of range! But you won't always know how many values are in the list, or even if there are any in the list. So you can avoid an IndexError by first asking Python how many values the list contains. To do that you ask, "what is the length of the list?"
:::success
:movie_camera: Animation
```python
len(dna)
```
:::
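:::info
:bulb: One way the length check could look in practice. This is a sketch under assumptions: the `empty` list, the `index` value and the `None` placeholder are hypothetical, and the `if` statement is assumed to be familiar from an earlier lesson.

```python
dna = ["A", "T", "G", "C"]
empty = []  # a list with no values at all

print(len(dna))    # 4
print(len(empty))  # 0, so even index 0 would be out of range

# Valid indexes run from 0 up to len(dna) - 1, so check before indexing.
index = 500  # hypothetical index to look up
if 0 <= index < len(dna):
    base = dna[index]
else:
    base = None  # placeholder meaning "no value at that position"

print(base)  # None
```
:::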
**James** (over animation): Now I can work on individual values in the list. Suppose I change a value: how can I put it back into the list? I tried to assign the first value in my list to a variable and then assign "T" to that variable. But that doesn't change the value in my list.
:::success
:movie_camera: Animation in the background showing:
```python
first = dna[0]
# then
first = "T"
# then
dna # ["A", "T", "G", "C"] but not ["T", "T", "G", "C"]
```
Show steps demonstrating that the value in the list is not changed.
:::
**Nina**: You're overthinking this. Instead, you can use the index operator to assign the new value directly into the list where you want it. To change the value at index 0 you can do this:
:::success
:movie_camera: Animation in the background showing:
```python=
dna[0] = "T"
# Then show that dna contains
["T", "T", "G", "C"]
```
:::
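:::info
:bulb: The two attempts can be placed side by side in one runnable sketch. The `unchanged` variable is an illustrative helper, used only to record that the first attempt left the list alone.

```python
dna = ["A", "T", "G", "C"]

first = dna[0]  # first now holds the string "A"
first = "T"     # gives the name first a new value; the list is untouched
unchanged = (dna == ["A", "T", "G", "C"])
print(unchanged)  # True

dna[0] = "T"    # replaces the value stored at index 0 of the list itself
print(dna)      # ['T', 'T', 'G', 'C']
```

Assigning to `first` only changes what that name refers to, whereas assigning to `dna[0]` changes the list.
:::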
**James**: Great! Thank you! But I was also thinking about extracting a gene, a sequence of DNA nucleotides, from the list. If I know the exon coordinates of the gene, how would I extract just the gene's sequence from the whole sequence?
**Nina**: Let's assume you have the sequence for the entire chromosome in your `dna` variable. Then the exon coordinates are the start and end indexes of your gene of interest. You can extract just this range of nucleotides by _slicing_ the list.
**James**: Perfect! How would I slice, say index 35 to 71?
**Nina**: The slicing operator looks and behaves like the index operator, except that you give it a range of indexes to extract and it gives you another list as output. Slicing a list always gives you a list back.
:::success
:movie_camera: Animation in the background showing the construction of a slice.
```python
>>> dna[35:71]
["A","C","T","C","T","T","C","T","G","G","T","C", ..., "A"]
```
:::
**James**: Ok that's nearly what I expected but it's missing a single letter at the very end of the gene. What happened?
**Nina**: Well spotted, James! The slice starts at the index you give it but ends one index before the end index you provide. So to get the whole sequence you want, you will need to add `1` to the end index.
:::success
:movie_camera: Animation in the background showing the construction of a slice.
```python
>>> dna[35:72]
["A","C","T","C","T","T","C","T","G","G","T","C", ..., "A", "G"]
```
:::
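:::info
:bulb: The same rule is easier to check by eye on a short list. The sketch below assumes a made-up eight-base sequence and hypothetical coordinates `start` and `end`, where `end` is the last index that should be included.

```python
dna = ["A", "T", "G", "C", "G", "A", "T", "T"]

start = 2  # first index to include (hypothetical coordinate)
end = 5    # last index to include (hypothetical coordinate)

missing_last = dna[start:end]      # stops one index before end
whole_gene = dna[start:end + 1]    # adding 1 includes index end

print(missing_last)  # ['G', 'C', 'G']
print(whole_gene)    # ['G', 'C', 'G', 'A']
print(len(whole_gene) == end + 1 - start)  # True
```

One consequence of the rule is that the length of a slice is simply its end index minus its start index.
:::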
**James**: Oh ok, that's strange.
**Nina**: It does seem strange, doesn't it? But I promise you'll get used to it, and it does make thinking about indexes easier in many situations. For now, just accept it as the rule for how slicing a range of indexes works.
**James**: Alright, I can do that. Another task I would like to do is validate that the gene sequence isn't nonsense. Does it contain error markers like an "X" character?
**Nina**: So you would like to check if the string "X" is anywhere in the list you sliced? You can use the conveniently named `in` operator.
:::success
:movie_camera: Animation in the background showing a membership check with the `in` operator.
```python
>>> "X" in dna[35:72]
False
```
:::
**James**: Oh! That was easy! I thought that was going to be hard.
**Nina**: It's such a common question to ask, "Is some value part of a collection?". Did you see the type of the result too?
**James**: A boolean false?
**Nina**: Yeah! And where are booleans useful?
**James**: In if statements to alter the behaviour of my program! Ok. And I do want to alter the behaviour of my program now because I know there are no error markers in the sequence. And if there are error markers I'd like to know how many there are...
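:::info
:bulb: A sketch of how the boolean from `in` could drive an `if` statement. The `gene` list, the `has_error` name and the messages are all made up for illustration.

```python
gene = ["A", "C", "T", "X", "G"]  # hypothetical slice with one error marker

has_error = "X" in gene  # the in operator gives back a boolean
print(has_error)  # True

if has_error:
    message = "sequence contains error markers"
else:
    message = "sequence looks clean"

print(message)  # sequence contains error markers
```
:::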
**Nina**: Can you think of a way to find out how many are there?
**James**: Can I sort the list so that all the "X" error markers are in the same place?
**Nina**: Good idea! And you can indeed. You can use the `sorted` built-in function to sort lists.
:::success
:movie_camera: Sorting a gene
```python
>>> sorted(["G", "T", "G", "X", "A", "C", "X", "C", "A"])
['A', 'A', 'C', 'C', 'G', 'G', 'T', 'X', 'X']
```
:::
**James**: Ok so this sequence contains 2 error markers which I can now clearly see at the end of the list.
**Nina**: That's good, but if your sequence is really long you would probably prefer to see the error markers at the beginning of the list. Can you sort in reverse order?
**James**: I have no idea how to... Maybe I can ask Python for help...
:::success
:movie_camera: Asking Python for help on `sorted`
```python
>>> help(sorted)
sorted(iterable, /, *, key=None, reverse=False)
[...]
the reverse flag can be set to request the result in descending order.
```
:::
**James**: Aha! So I can just reverse the order by passing an extra argument to the `sorted()` function!
:::success
:movie_camera: Sorting a gene
```python
>>> sorted(["G", "T", "G", "X", "A", "C", "X", "C", "A"], reverse=True)
['X', 'X', 'T', 'G', 'G', 'C', 'C', 'A', 'A']
```
:::
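:::info
:bulb: A small sketch that keeps the gene in a variable to compare both sort orders. It also shows that `sorted()` gives back a new list and leaves the original order alone. The variable names are illustrative.

```python
gene = ["G", "T", "G", "X", "A", "C", "X", "C", "A"]

ascending = sorted(gene)
descending = sorted(gene, reverse=True)

print(ascending)   # ['A', 'A', 'C', 'C', 'G', 'G', 'T', 'X', 'X']
print(descending)  # ['X', 'X', 'T', 'G', 'G', 'C', 'C', 'A', 'A']
print(gene)        # ['G', 'T', 'G', 'X', 'A', 'C', 'X', 'C', 'A']

# With the markers sorted to the front, a short slice shows them immediately.
print(descending[0:2])  # ['X', 'X']
```
:::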
**Nina**: Great! Now you can easily see the error markers. But... sorting is a common thing you will do when programming, so let's explore it a little bit. What happens when you sort a list of numbers?
:::success
:movie_camera: Sorting a list of numbers
```python
>>> sorted([5, 3, 4, 8, 1, 9, 7, 0, 2, 6])
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```
:::
**Nina**: Is that what you expected to happen?
**James**: Yes it is now sorted. But it seems obvious. Why am I practising this?
**Nina**: Hold on, Mr. Confidence. What happens when you sort your full name?
:::success
:movie_camera: Sorting the letters of a name
```python
>>> sorted(["J", "a", "m", "e", "s", " ", "C", "o", "l", "l", "i", "e", "r"])
[' ', 'C', 'J', 'a', 'e', 'e', 'i', 'l', 'l', 'm', 'o', 'r', 's']
```
:::
**James** (over demo): Wow!! Why are "C" and "J" before "a"? And what's the space even doing there?
**Nina**: Computers don't really have a notion of "text". Under the hood it's all just numbers, and humans have chosen specific numbers to represent characters. For example, the number 65 represents the upper case 'A', 66 the upper case 'B', and so on. Every upper case letter has a smaller number than every lower case letter, and the space has a smaller number still. So really it's all just sorting numerically.
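:::info
:bulb: The numbers behind the characters can be inspected with the built-in `ord()` function, which the dialogue does not mention. The sketch below looks up a few characters from the name and sorts their numbers. The `codes` name is illustrative.

```python
print(ord(" "))  # 32
print(ord("A"))  # 65
print(ord("C"))  # 67
print(ord("J"))  # 74
print(ord("a"))  # 97

# Sorting the numbers gives the same order as sorting the characters:
# space, then "C", then "J", then "a".
codes = sorted([ord("a"), ord("J"), ord(" "), ord("C")])
print(codes)  # [32, 67, 74, 97]
```
:::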
**James**: Ok I can accept that. I have one final question for you though: how would I remove the error markers from my list? And how would I add a new value to the list?
**Nina**: Excellent question! Removing values is easy: you can use the `remove()` method... (interrupted)...
**James**: Ok let me try...
:::success
:movie_camera: Removing error markers...
```python=
["G", "T", "G", "X", "A", "C", "X", "C", "A"].remove("X")
```
:::
**Nina**: Yeah, I was going to say that `remove()` modifies the list in-place, unlike `sorted()`, which gives you back a new sorted list. So you get nothing back from `remove()`.
**James**: Oh ok so I need to keep my list in a variable...
:::success
:movie_camera: Removing error markers...
```python=
>>> error_sequence = ["G", "T", "G", "X", "A", "C", "X", "C", "A"]
>>> error_sequence.remove("X")
>>> error_sequence
['G', 'T', 'G', 'A', 'C', 'X', 'C', 'A']
```
:::
**James**: Ok but it only removed the first "X"?
**Nina**: Yeah. But we will cover repeating your actions in another lesson.
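:::info
:bulb: Until repetition is covered, one stopgap is to call `remove()` once per marker. This sketch also captures what `remove()` gives back, in a hypothetical `result` variable, to show that it is `None`.

```python
error_sequence = ["G", "T", "G", "X", "A", "C", "X", "C", "A"]

result = error_sequence.remove("X")  # modifies the list in place

print(result)          # None
print(error_sequence)  # ['G', 'T', 'G', 'A', 'C', 'X', 'C', 'A']

# Each call removes only the first match, so a second call is needed here.
error_sequence.remove("X")
print(error_sequence)         # ['G', 'T', 'G', 'A', 'C', 'C', 'A']
print("X" in error_sequence)  # False
```

Calling `remove()` for a value that is no longer in the list raises a `ValueError`, so the `in` check is a useful guard.
:::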
**James**: And adding a new nucleotide to my list?
**Nina**: Do you remember the `+` operator and how it worked on strings?
**James**: Yes I think so, it concatenates the strings together. Does it do the same thing for lists?
**Nina**: Why don't you try it?
:::success
:movie_camera: Adding to a list
```python=
>>> ["A", "T"] + ["G", "C"]
['A', 'T', 'G', 'C']
```
:::
**James**: Cool!
**Nina**: Do you see how this can be used to add values to your list?
**James**: No. Not really.
**Nina** (over demo): Well, if you have a list in a variable, you can append to that list just by concatenating your new value then assigning the result back to your variable.
:::success
:movie_camera: Appending to a list
```python=
>>> error_sequence = ["G", "T", "G", "X", "A", "C", "X", "C", "A"]
>>> error_sequence = error_sequence + ["C"]
>>> error_sequence
['G', 'T', 'G', 'X', 'A', 'C', 'X', 'C', 'A', 'C']
```
:::
**Nina**: That's obviously a trivial example, but it illustrates the use of assignment and the `+` operator for appending values to a list.
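:::info
:bulb: A short sketch of what `+` does and does not change, using a made-up three-base `sequence`. The `longer` name is illustrative.

```python
sequence = ["G", "T", "G"]

longer = sequence + ["C"]  # builds a new list; sequence itself is unchanged
print(sequence)  # ['G', 'T', 'G']
print(longer)    # ['G', 'T', 'G', 'C']

sequence = sequence + ["C"]  # assigning the result back "appends" the value
print(sequence)  # ['G', 'T', 'G', 'C']

# Both sides of + must be lists, so the new value is wrapped in [ ].
# sequence + "C" would raise:
# TypeError: can only concatenate list (not "str") to list
```
:::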
**James**: Thanks Nina!
**Nina**: You're welcome!
**Nina**: In this lesson you learned how to store multiple values in an ordered list and store that list in a variable.
**James**: I also saw some operations on lists, including concatenation, indexing, slicing, sorting, and checking the existence of a value inside a list.
:::warning
:notes: Upbeat outro music. James and Nina wave.
:::
:::success
:movie_camera: Fade to VIB logo slide.
:::