# Input and Output

[![hackmd-github-sync-badge](https://hackmd.io/_-6IcLirRS6WCnjqCm1smg/badge)](https://hackmd.io/_-6IcLirRS6WCnjqCm1smg)

:::info
:bulb: In this lesson students learn about the basics of file and network I/O.

:school: Teacher is James, student is Nina.
:::

:::success
:movie_camera: VIB background fading to course title slide. James and Nina's smiling faces appear.
:::

:::warning
:notes: Upbeat intro music
:::

> scene 1

**Nina**: When I write real programs I usually don't want to edit my program to work with new data. For example, I don't want to manually change a list when I have a file on my laptop with the raw data I want to process. Why haven't we talked about accepting data into our programs and getting data out of our programs yet?

**James** (laughing nervously): Well, it's simultaneously an interesting topic and not a very interesting one. It's not interesting because this only happens at the "boundary" of our programs; all of the interesting or useful stuff happens on the inside. And it is interesting because this "boundary" is a major source of errors. Nevertheless, "accepting data in" and "getting data out" is the topic of this very lesson.

**James**: Incidentally, programmers shorten the phrases "accepting data in" and "getting data out" to "input and output" and sometimes even just "I/O".

**Nina**: What do you mean by the "boundary"?

**James**: I just mean the place in your code where your program talks to the outside world. Where you're not thinking in terms of variables, loops, and lists but in terms of files, or the keyboard.

**Nina**: Or even the internet?

**James**: Yes indeed! So Nina, where might you want to store or retrieve data from?

**Nina**: I usually want to read data from a file. But sometimes I also want to fetch data from the internet.

**James**: This is likely to be where your data sources and sinks are too. However, you may also access data over serial ports, I2C, or various other devices.
Luckily the same general pattern applies in all cases.

:::success
:movie_camera: Animation of context manager pattern
```python=
with access_resource() as handle:
    handle.read()   # Read data from resource
    handle.write()  # Write data to resource
```
:::

> scene 2

**James** (over animation): In Python, there is a general pattern for interacting with the outside world called a "context manager". Here you can see what the general pattern looks like. The `with` keyword creates a context that is usable within the indented block. It also gives us a variable that we can use to read data from, or write data to, the resource. When the block ends, Python cleans up the resource for us, for example by closing the file. We've called that variable `handle` in this example but you can call it whatever you like; it is just a variable name.

**Nina**: What's a `resource` James?

**James**: Ah! Good question! "Resource" is the name we will use for the general class of things outside our program. Things like files, a document on the network, a serial device, or any other I/O mechanism.

**Nina**: Ok but when I run this it says, `NameError: name 'access_resource' is not defined`.

**James**: Sorry Nina. I'm only using that name as a placeholder to describe the pattern. In reality there is a different function for each different type of resource. For files the function is called `open()` and you can give it the name of the file you want to interact with.

:::success
:movie_camera: Animation from `access_resource()` to `open()`
```python=
with open("test.txt", mode="w") as handle:
    handle.write("Hello world!")
```
:::

> scene 3

**James**: The `open()` function takes an optional second argument, called `mode`. By default, when you open a file and do not specify the mode, you will only be allowed to read data from the file. If you want to write data to the file you will have to specify `mode="w"`.

**Nina**: Thanks James. Maybe I can apply this myself now. I have a file containing a table of data: `scores.csv`. Each column contains scores and I want to know the average score for each column.
I can start by reading in the data...

:::success
:movie_camera: Animation of Nina opening "scores.csv".
```python=
with open("scorez.csv") as handle:
    data = handle.read()
print(data)
```
:::

> scene 4

**Nina**: Ah but now it raises a `FileNotFoundError`!

**James**: This is one of the problems of doing I/O. In this case, either the file you're trying to read has a different name, or it's not there at all. But you can't know this when you're writing your program. It's an unavoidable potential error.

**Nina**: Oh yeah I made a typo in the name of my file. (Pause) Now I see the data printed out but how do I get the table?

**James**: That's a good start Nina. You've got the data into your program from the filesystem. In your case, the data has some structure: the table was turned into a string before it was saved to your disk, using a format called Comma-Separated Values, or CSV. A CSV file contains columns separated by commas and rows separated by newlines. You could easily turn a CSV string into a table yourself, but I want to use this opportunity to introduce you to the `csv` standard library module.

:::success
:movie_camera: Animation of reading "scores.csv"
```python=
import csv

# Running totals for the three columns
a = 0
b = 0
c = 0
with open("scores.csv") as handle:
    myreader = csv.reader(handle)
    # Each row is a list of strings, one per column
    for row in myreader:
        a = a + float(row[0])
        b = b + float(row[1])
        c = c + float(row[2])
```
:::

> scene 5

**Nina**: Why would I do this instead of making the table myself?

**James**: You definitely could, and I encourage you to try. But in general you should try to use existing facilities because saving data has a lot of weird corner cases that can easily ruin your day.

**Nina**: Ok so how do I use this `csv` module?

**James** (over animation): You can start by opening your file resource with a context manager. Inside the indented block you can make a "reader" by passing the file handle to `csv.reader`. You can then loop over the rows in your CSV table using a for loop.
You can access each column of the row by using the index operator like this (point at the demo).

**Nina**: Ok that seems very convenient. But in my case, the data I want to analyse isn't a file on my laptop, it's on the internet. Can I use `open` to get files from the internet?

**James** (over animation): Ah! No. Each resource you might want to use has its own function to access it. You will still use the context manager pattern, but instead of calling `open()` you call `request.urlopen()` from the `urllib` package of the standard library. You need an import for that as shown here (point). Otherwise, the code is exactly the same as when reading data from a local file.

:::success
:movie_camera: Animation of opening a network resource.
```python
from urllib import request

with request.urlopen("https://httpbin.org/get") as handle:
    data = handle.read()
print(data)
```
:::

> scene 6

**Nina**: What does the `b` in front of the string output mean?

**James**: That's an excellent question but I can't answer that without some back story. The short answer is that this isn't a string. It's raw binary data. That's what the `b` means. Earlier in this course we talked about how strings are just characters "strung" together, and that individual characters are really just numbers that someone made up to represent those characters. Well, we simplified that a little. It turns out that lots of people came up with lots of incompatible ways of representing characters with different numbers. The most common encoding on the internet today is called "UTF-8", which can represent everything from Latin characters to emoji.

**Nina**: So how do I convert it to a string? Can I use the `str()` conversion function?

**James**: That's an excellent guess Nina but unfortunately that won't do what you want in this case.
Instead you need to "decode" the data into a string using "UTF-8".

**Nina**: How do you know it's UTF-8?

**James**: In general I don't. But it's very, very likely to be UTF-8, and if the guess is wrong it will become obvious because decoding will raise an error or the string will contain weird characters.

**Nina**: Ok I'll take your word for it. Now I should be able to load my table of data from the internet...

:::success
:movie_camera: Animation of reading csv data from the internet.
```python
import csv
from urllib import request

with request.urlopen("https://storage.googleapis.com/vib-training-data/iris.csv") as handle:
    data = handle.read()
text = data.decode("utf8")
reader = csv.reader(text)
for row in reader:
    print(row)
```
:::

**Nina**: Huh? I don't understand what happened.

**James**: The `csv.reader` function expects you to give it a sequence of lines, such as a file handle or a list of strings. But you've given it a single string, so it treats each character as a line. You can split the text into lines first with `text.splitlines()` and pass that to `csv.reader` instead.

**Nina**: Ah ok that makes sense. And now I can work with my data.

> scene 7

**Nina**: What if my data isn't in a CSV format? Is there a standard library module for every conceivable data format?

**James**: Not every format, no. But if there's no standard library module for it then there will likely be a third-party library available to help you. Installing and using third-party libraries is out of scope for this course though.

**Nina**: Ah! But we didn't yet talk about how to save data to a file.

**James**: Right! Well it's a very similar pattern. Let's start by saving a couple of rows to a file named `eggs.csv`.

:::success
```python
import csv

# newline='' lets the csv module handle line endings itself
with open('eggs.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile)
    # Each call to writerow writes one list as one CSV row
    spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
    spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
```
:::

> scene 8

**Nina**: That looks very similar to reading data from the file. But this time you set the mode argument to "w"?

**James**: Yeah, passing `"w"` as the mode means that I want to write to the file resource that I've opened.
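To make the effect of the mode argument concrete, here is a minimal sketch. The file name `demo.txt` and the temporary directory are assumptions chosen for illustration so that no real file is overwritten; they are not part of the lesson's own demos.

```python
import os
import tempfile

# Hypothetical file name, placed in a temporary directory for safety.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# mode="w" opens the file for writing, creating it if it does not exist.
with open(path, mode="w") as handle:
    handle.write("first version\n")

# Opening with mode="w" again empties the file before writing.
with open(path, mode="w") as handle:
    handle.write("second version\n")

# With no mode given, the file is opened for reading text.
with open(path) as handle:
    contents = handle.read()

print(contents)
```

Only the second string survives, because opening an existing file with `mode="w"` discards its previous contents.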
**Nina**: Why do I use `writerow` rather than `write` this time?

**James**: Notice that you're not directly using the file handle (called `csvfile`). Instead you're using this special "writer" thing from the `csv` module. It doesn't have a `write` function but a more specialised `writerow` method that writes a list as a CSV row.

**Nina**: What if I want to read _and_ write to the file resource?

**James**: You can do that, but you have to be very careful. You can easily get yourself in a mess and corrupt the data you already had there.

**Nina**: Ok I guess I should use the `csv` standard library module to write CSV data then?

**James**: That's what I would do. Can you give it a try?

:::success
Insert demo here
:::

> scene 9

**James**: You can even insert column headers like this: ...

:::success
Insert demo here
:::

**Nina**: That was really easy.

**James**: That's the basics of I/O and the end of this lesson. Can you tell me what you've learned?

**Nina**: Thanks James! Ok, I learned about "I/O" in this lesson: accepting data into a program and getting data out of a program. There is a pattern called a context manager that allows me to do this independently of how I need to access the data: be it in a local file, a remote website, or otherwise.

:::warning
:notes: Upbeat outro music
:::

:::success
:movie_camera: Fade to VIB logo slide.
:::
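The two demos left as placeholders in the final scenes could look something like the sketch below. It is only one possible version: the file name `scores_out.csv`, the column names, and the scores are all made up for illustration. It writes a header row followed by data rows with the `csv` module, then reads them back.

```python
import csv
import os
import tempfile

# Hypothetical file name and made-up scores, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "scores_out.csv")
header = ["round_1", "round_2", "round_3"]
rows = [[7.5, 8.0, 6.5], [9.0, 7.0, 8.5]]

# newline="" lets the csv module control line endings itself.
with open(path, mode="w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(header)  # the first row holds the column headers
    writer.writerows(rows)   # writerows writes one CSV row per inner list

# Read the file back: the first row is the header, the rest are data rows.
with open(path, newline="") as handle:
    reader = csv.reader(handle)
    read_header = next(reader)
    read_rows = [[float(value) for value in row] for row in reader]

print(read_header)
print(read_rows)
```

A header is just an ordinary first row, so the reading side has to skip it (here with `next(reader)`) before converting the remaining values to numbers.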