# M6: Files & Exceptions (Theory)
## a. Writing our first file
```python
file = open("testfile.txt","w")
file.write("Hello World\n")
file.write("This is our new text file\n")
file.write("and this is another line.\n")
file.write("Why? Because we can.\n")
file.close()
# Appending new lines
file = open("testfile.txt", "a")
file.write("This is a test\n")
file.write("To add more lines.\n")
file.close()
```
- **Opening** a file creates what we call a file handle. In this example, the variable `file` refers to the new handle object. Our program calls methods on the handle, and these change the actual file, which is usually located on our disk.
- With **mode** "w", if there is no file named `testfile.txt` on the disk, it will be created. If there already is one, it will be replaced by the file we are writing.
* Files in Python are usually opened in one of three modes: read ("r"), write ("w"), or append ("a"). Adding "+" to a mode (for example "r+") opens the file for both reading and writing.
- **Closing** the file handle (line 6) tells the system that we are done reading or writing and makes the disk file available for reading or writing by other programs (or by our own program).
* While in principle you could keep a file open during the whole execution of the program, it is a matter of good manners towards other programs to close your files when you don't need access to them any more. For this reason, in our examples we are always closing our files.
* It’s important to understand that when you use the `close()` method, any further attempts to use the file object will fail.
- The file **`write`** method requires a single parameter: the string you want to write. The same method can add content to an existing file, as long as you open the file in **append** mode "a"; otherwise you overwrite the existing file.
* You can also use the `writelines` method to write (or append) multiple lines to a file at once. It does not add newline characters itself, so each string must already end with `\n`:
```python
file = open("testfile.txt", "a")
lines_of_text = ["One line of text here\n", "and another line here\n"]
file.writelines(lines_of_text)
file.close()
```
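The difference between the modes, and the effect of `close()`, can be checked with a short experiment. This is a minimal sketch: the file name `modes_demo.txt` and the temporary directory are assumptions, chosen so that the example does not touch `testfile.txt`.

```python
import os
import tempfile

# Work in a throwaway directory so no existing file is overwritten.
demo_dir = tempfile.mkdtemp()
path = os.path.join(demo_dir, "modes_demo.txt")  # hypothetical file name

file = open(path, "w")      # "w" creates the file (or empties an existing one)
file.write("first\n")
file.close()

file = open(path, "a")      # "a" keeps the content and adds to the end
file.write("second\n")
file.close()

file = open(path, "r")
after_append = file.read()
file.close()

file = open(path, "w")      # opening with "w" again discards both lines
file.write("replaced\n")
file.close()

file = open(path, "r")
after_rewrite = file.read()
file.close()

# Using a closed handle fails with a ValueError.
try:
    file.read()
    closed_error = None
except ValueError as error:
    closed_error = str(error)

print(repr(after_append))
print(repr(after_rewrite))
print(closed_error)
```

The first `print` shows both lines, the second only the replacement line, and the third the message of the error raised by reading from a closed handle.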
## b. Reading a Text File in Python
```python
file = open("testfile.txt", "r")
print(file.read())
file.close ()
# Or
file = open("testfile.txt", "r")
print(file.readline())
print(file.readline(),end="")
file.close ()
# Or
file = open("testfile.txt", "r")
print(file.readlines())
file.close ()
```
- `print(file.read())` displays all the text inside the file, because `read()` returns the whole content as a single string.
- Another way to read a file is to read a certain number `n` of characters: `read(n)`.
- Finally, if you want to read the file line by line, as opposed to pulling the content of the entire file into a string at once, you can use the `readline()` method.
* Note that by default, `print()` adds a newline after everything it prints. Because a line returned by `readline()` already ends with `\n`, printing it normally leaves an empty line after it.
* We can tell `print` to end its output with something other than a newline, for example the empty string `""`, by passing `end=""`.
* Related to the `readline()` method is the `readlines()` method.
* The output you would get is a list containing each line as a separate element.
* Every line is ended with a `\n`, the newline character
```python
print(file.readlines()[2])
```
would print
> and this is another line.
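
The three reading methods share one position in the file: each call continues where the previous one stopped. The sketch below illustrates this on a small file; the name `read_demo.txt` and the temporary directory are assumptions made so the example is self-contained.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")  # hypothetical name
file = open(path, "w")
file.write("Hello World\n")
file.write("This is our new text file\n")
file.write("and this is another line.\n")
file.close()

file = open(path, "r")
first_five = file.read(5)        # the first 5 characters
rest_of_line = file.readline()   # continues where read(5) stopped
remaining = file.readlines()     # list of the lines that are left
file.close()

print(repr(first_five))    # 'Hello'
print(repr(rest_of_line))  # ' World\n'
print(remaining)
```

Printing with `repr()` makes the trailing `\n` of each line visible.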
## c. Looping over a file object
```python
# Not memory-efficient method
file = open("testfile.txt", "r")
for line in file.readlines():
    print(line, end='')
file.close()
# Better:
file = open("testfile.txt", "r")
for line in file:
    print(line, end='')
file.close()
```
* When you want to read all the lines of a file with a for-loop, prefer the second option: it is more memory-efficient and usually faster. `readlines()` first builds a list of every line, whereas looping over the file object reads one line at a time, so Python avoids loading the entire file into memory.
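
As an illustration of processing a file one line at a time, the sketch below counts the lines and remembers the longest one without ever holding the whole file in memory. The file name `loop_demo.txt` and its content are assumptions for the sake of the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "loop_demo.txt")  # hypothetical name
file = open(path, "w")
file.write("Hello World\n")
file.write("This is our new text file\n")
file.write("Why? Because we can.\n")
file.close()

line_count = 0
longest = ""
file = open(path, "r")
for line in file:                 # one line at a time
    line_count += 1
    text = line.rstrip("\n")      # drop the trailing newline
    if len(text) > len(longest):
        longest = text
file.close()

print(line_count, "lines; longest:", longest)
```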
## d. Processing files with `split()`
```python
# with closes the file automatically when the block ends
with open("testfile.txt", "r") as file:
    data = file.readlines()
    for line in data:
        words = line.split()
        print(words)
```
The output for this will be something like (depending on what your testfile currently contains):
```bash
['One', 'line', 'of', 'text', 'here']
['and', 'another', 'line', 'here']
```
The words are presented in this manner because `split()` breaks each line on whitespace and returns the pieces as a list.
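
A small sketch of what `split()` returns, without needing a file. The comma-separated `record` line is a made-up example; passing a separator to `split()` is shown only as a common variation.

```python
line = "One line of text here\n"
words = line.split()              # splits on any whitespace, drops the "\n"
print(words)
print(len(words), "words; first word:", words[0])

record = "Ada,36,London\n"        # hypothetical comma-separated line
fields = record.strip().split(",")
print(fields)
```

Because the result is an ordinary list, individual words can be reached by index or counted with `len()`.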
## e. Binary files
> A binary file is computer-readable but not human-readable. All executable programs are stored in binary files, as are most numeric data files. In contrast, text files are stored in a form (usually ASCII) that is human-readable.
* Files that hold photographs, videos, zip files, executable programs, etc. are called **binary files**: they're not organized into lines, and cannot be opened with a normal text editor.
* Python works just as easily with binary files, but when we read from the file we're going to get **bytes** back rather than a string
```python
f = open("somefile.zip", "rb")
g = open("thecopy.zip", "wb")
while True:
    buf = f.read(1024)
    if len(buf) == 0:
        break
    g.write(buf)
f.close()
g.close()
```
* In lines 1 and 2 we added a "b" to the mode to tell Python that the files are binary rather than text files.
* In line 4 (`buf = f.read(1024)`), we chose to read and write up to 1024 bytes on each iteration of the loop.
* In line 5, when we get back an empty buffer from our attempt to read, we know we can break out of the loop and close both the files.
* If we print `type(buf)` we'll see that the type of buf is `bytes`. We don't do any detailed work with *bytes objects* in this textbook.
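
The copy loop above can be wrapped in a function and tried on generated data. This is a sketch under assumptions: the function name `copy_binary`, the `chunk_size` parameter, and the file names `source.bin` and `copy.bin` are all invented for the illustration.

```python
import os
import tempfile

def copy_binary(source_path, target_path, chunk_size=1024):
    """Copy a file in chunks; return the number of bytes copied."""
    copied = 0
    f = open(source_path, "rb")
    g = open(target_path, "wb")
    while True:
        buf = f.read(chunk_size)
        if len(buf) == 0:         # empty buffer: end of file reached
            break
        g.write(buf)
        copied += len(buf)
    f.close()
    g.close()
    return copied

demo_dir = tempfile.mkdtemp()
source = os.path.join(demo_dir, "source.bin")   # hypothetical file names
target = os.path.join(demo_dir, "copy.bin")

sample = open(source, "wb")
sample.write(bytes(range(256)) * 10)            # 2560 bytes of sample data
sample.close()

total = copy_binary(source, target)
f = open(target, "rb")
buf = f.read(4)
f.close()
print(total, type(buf), buf)
```

With 2560 bytes and a chunk size of 1024, the loop reads two full chunks, one partial chunk of 512 bytes, and finally an empty buffer that ends the loop.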
## f. Directories
> **File system**
A method for naming, accessing, and organizing files and the data they contain.
> **Directory**
A named collection of files, also called a folder. Directories can contain files and other directories, which are referred to as subdirectories of the directory that contains them.
> **Path**
A sequence of directory names that specifies the exact location of a file.
* When we create a new file by opening it and writing, the new file goes in the **current directory** (wherever we were when we ran the program). Similarly, when we open a file for reading, Python looks for it in the current directory.
* If we want to open a file somewhere else, we have to specify the **path** to the file.
```python
wordsfile = open("/usr/share/dict/words", "r")
wordlist = wordsfile.readlines()
# Prints out the first 6 elements
print(wordlist[:6])
```
> ['\n', 'A\n', "A's\n", 'AOL\n', "AOL's\n", 'Aachen\n']
* This (Unix) example opens a file named words that resides in a directory named `dict`, which resides in `share`, which resides in `usr`, which resides in the top-level directory of the system, called `/` .
* A Windows path might be `"c:/temp/words.txt"` or `"c:\\temp\\words.txt"`.
* Because backslashes are used to escape things like newlines and tabs, we need to write two backslashes in a literal string to get one! So the length of these two strings is the same!
* **NB** We cannot use / or \ as part of a filename; they are reserved as a delimiter between directory and filenames.
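
The claim about the two Windows path strings can be verified directly, and the standard `os` module can build paths for us. In this sketch the directory `data` and the file `words.txt` are hypothetical names; nothing is opened or created.

```python
import os

forward = "c:/temp/words.txt"
backward = "c:\\temp\\words.txt"   # two backslashes in the source, one in the string
raw = r"c:\temp\words.txt"         # a raw string needs no doubling
print(len(forward), len(backward))
print(backward)
print(raw == backward)

current = os.getcwd()                           # the current directory
relative = os.path.join("data", "words.txt")    # hypothetical directory and file
print(os.path.join(current, relative))
print(os.path.isabs(relative), os.path.isabs(current))
```

`os.path.join` inserts the delimiter that is appropriate for the system the program runs on, which avoids hard-coding `/` or `\`.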
# 2. Exceptions
## a. Glossary
> **Exception**
An error that occurs at runtime. Spoiler: an exception is a built-in Python *object*!
> **Handle an exception**
To prevent an exception from causing our program to crash, by wrapping the block of code in a `try ... except` construct.
> **Raise an exception**
To create a deliberate exception by using the `raise` statement.
## b. Catching exceptions
```python
>>> tup = ("a", "b", "d", "d")
>>> tup[2] = "c"
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
```
* Whenever a runtime error occurs, it creates an **exception** object.
* The program stops running at this point and Python prints out the traceback, which ends with a message describing the exception that occurred.
* Examples:
* Dividing by zero
* Accessing a non-existent list item
* Trying to make an item assignment on a tuple
* The error message has two parts: the type of error before the colon, and specifics about the error after the colon.
* Sometimes we want to execute an operation that might cause an exception, but we don't want the program to stop. In this case, we wish to handle the exception by using `try...except` :
```python
filename = input("Enter a file name: ")
try:
    f = open(filename, "r")
    lines = f.readlines()
    f.close()
except:
    print("There is no file named", filename)
```
* The `try` statement executes and monitors the statements in the first block.
* If no exceptions occur, it skips the block under the `except` clause. If any exception occurs, it executes the statements in the `except` clause and then continues.
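
The three example errors listed above can each be caught in the same way. The sketch below is an illustration only: the helper `try_operation` and the small functions it runs are invented names, and the `except Exception as error` form (which gives access to the exception object) is explained in the next section.

```python
def try_operation(operation):
    """Run operation(); return a description of the exception, if any."""
    try:
        operation()
        return "no exception"
    except Exception as error:
        # type(error).__name__ is the part before the colon in a traceback,
        # str(error) is the part after it.
        return "{0}: {1}".format(type(error).__name__, error)

def divide_by_zero():
    return 1 / 0

def bad_index():
    return [1, 2, 3][10]

def assign_to_tuple():
    tup = ("a", "b", "d", "d")
    tup[2] = "c"

def harmless():
    return 1 + 1

for operation in (divide_by_zero, bad_index, assign_to_tuple, harmless):
    print(try_operation(operation))
```

The program keeps running after each error, which is exactly what handling an exception means.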
## c. Raising our own exceptions
```python
def get_age():
    age = int(input("Please enter your age: "))
    if age < 0:
        # Create a new instance of an exception
        my_error = ValueError("{0} is not a valid age".format(age))
        raise my_error
    return age
```
* `ValueError` is a type of exception that is built into Python and is used by Python in case it encounters a value problem; we can also raise it ourselves.
* Python's `raise` statement is somewhat similar to the `return` statement: it also passes information back to the code that called this function. There are, however, some important differences:
* Where a `return` statement in a function will always return to the place where the function was called, a `raise` statement will break out of multiple function calls, until it reaches a place where the exception is handled using a `try ... except` block.
* We call this "*unwinding the call stack*".
```python
def get_information():
    age = get_age()
    username = get_username()
    return (age, username)

try:
    age, username = get_information()
    print("Your username is {0}; your age is {1}".format(username, age))
except:
    print("Error entering information")
```
* In this program:
* function `get_information()` does not contain a `try ... except` block.
* functions `get_username()` and `get_age()` do raise exceptions but, as the definition of `get_age()` above shows, they do not handle them either.
* What happens if the user does not enter a valid age?
* In this case, the `get_age` function raises an exception. However, the execution will not continue in the `get_information` function.
* As `get_information` does not handle exceptions, the program will backtrack towards the main part of the program, which contains a `try...except` block.
* Here, it will not print the `Your username is...` message, but will rather print the `Error entering information` message.
* However, we can also print the more specific error message using this code:
```python
try:
    # ...
except ValueError as error:
    print(error)
```
* In this case, the `try ... except` block will only catch exceptions of type `ValueError`. The exception object created by the `raise` statement is stored in the variable `error`, which we can subsequently `print`.
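
Unwinding the call stack can be observed without typing any input. The sketch below mirrors the structure of the example above, but the functions `check_age`, `build_profile` and `describe` are hypothetical stand-ins that take their values as parameters instead of calling `input()`.

```python
def check_age(text):
    age = int(text)               # int() itself raises ValueError for "abc"
    if age < 0:
        raise ValueError("{0} is not a valid age".format(age))
    return age

def build_profile(age_text, username):
    # No try ... except here: an exception from check_age passes straight through.
    age = check_age(age_text)
    return (age, username)

def describe(age_text, username):
    try:
        age, username = build_profile(age_text, username)
        return "Your username is {0}; your age is {1}".format(username, age)
    except ValueError as error:
        return "Error entering information: {0}".format(error)

print(describe("42", "ada"))
print(describe("-5", "ada"))
```

For `"-5"`, the exception raised inside `check_age` skips the rest of `build_profile` and is handled two calls higher, in `describe`, where the message stored in `error` is reused.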
## d. The `with` statement
```python
# Many tricky things are happening in this code...
filename = input("Enter a file name: ")
try:
    f = open(filename, "r")
    username = get_username()
    for line in f:
        # strip() removes the trailing newline before comparing
        if line.strip() == username:
            print("{0} found!".format(username))
    f.close()
except IOError:
    print("There is no file named", filename)
except ValueError:
    print("Incorrect name provided")
```
* In the program above, if an incorrect `username` is entered, the program will raise an exception, and jump towards printing the message `Incorrect name provided` **without** executing the `close()` instruction!
* To resolve this issue, the proper way to combine exception handling with file processing is as follows:
```python
filename = input("Enter a file name: ")
try:
    with open(filename, "r") as f:
        username = get_username()
        for line in f:
            # strip() removes the trailing newline before comparing
            if line.strip() == username:
                print("{0} found!".format(username))
except IOError:
    print("...")
```
* What does the construction `with open(filename, "r") as f:` do?
* Essentially, it associates the result of `open(filename, "r")` to `f`, and executes the next block of code
* The `with` statement can be used to ensure that a file is automatically closed in all circumstances, whether good or bad.
* Finally, to print all the data in a file, we can use:
```python
with open("testfile.txt") as file:
    data = file.readlines()
    for line in data:
        print(line, end='')
```
* It really makes things a lot easier, doesn’t it?
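
That `with` closes the file even when something goes wrong can be checked through the `closed` attribute of the file object. In this sketch the file name `with_demo.txt`, its content, and the deliberately raised `ValueError` are assumptions used to simulate a problem while reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical name
with open(path, "w") as file:
    file.write("alice\nbob\n")

handle = None
try:
    with open(path, "r") as f:
        handle = f                # keep a reference to inspect it afterwards
        for line in f:
            if line.strip() == "bob":
                raise ValueError("simulated problem while reading")
except ValueError as error:
    print("Caught:", error)

print("Closed after the exception?", handle.closed)
```

The last line prints `True`: the exception left the `with` block before its end, and the file was closed anyway.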