---
slideOptions:
  theme: white
---

# Chapter 8 - File manipulation

###### tags: `Computer programming`

---

## Learning objectives

1. Describe the steps for processing a file
2. Read and write data from and to files
3. Use the functions for text file and binary file input/output

## Program and files

In the previous chapters, our programs used variables and arrays to store data. The data is stored in RAM while the program is running. However, as RAM is volatile, all data is lost when the program is shut down.

If the data is important and must be kept permanently, it should be stored in secondary storage (such as a hard disk or SSD), which is non-volatile. The data, in the form of *files*, can be read even after the computer is shut down. However, we cannot use it directly in the program.

In Python, we can read and write files with specific functions. It is just like reading and writing something in a log book. A file can be read by more than one program, and one program can write more than one file.

![](https://i.imgur.com/Fhpra1B.png)

## Streams

In computers, data is input from and output to a *stream*. Python uses two types of streams.

1. **Text**: a sequence of *characters* divided into lines
1. **Binary**: a sequence of bytes that encode data values such as *integer*, *real*, or *complex*

![](https://i.imgur.com/6FWBB8R.png)

### Differences between text file and binary file

Both text files and binary files store data in secondary memory. However, text files store data as a *sequence of characters*, while binary files store data as they are stored in primary memory.

![](https://i.imgur.com/YxHBc5q.png)

## Processing a file

There are *three* steps to process a file in Python, as follows.

### Open the stream/file

To open a file, we use the function `open` with the following syntax.

`file_variable = open("filename", "mode")`

The *filename* must include the path of the file if *the file is not in the same folder as the program*.
Note that `\\` is used instead of `\` in the path, because `\` starts an escape sequence in a Python string. For example, we need to write `"C:\\MYDATA.DAT"` to access the file `C:\MYDATA.DAT`. The mode is discussed in the next section.

Below is an example of opening a file for writing.

`spData = open("MYFILE.DAT", "w")`

`spData = open("C:\\MYFILE.DAT", "w")`

Before opening the file, we can import the library `os` and use the function `os.path.exists(filename)` to check whether the file exists. Below is an example that checks the existence of `"MYDATA.DAT"`.

```python!
import os

# os.path.exists returns True if the file exists, False otherwise
if not os.path.exists("MYDATA.DAT"):
    print("File does not exist.")
```

### Read or write data

There are three *basic modes* for accessing files, namely read (`r`), write (`w`) and append (`a`). We can add a plus sign (`+`) to indicate the *update mode*, in which the file can be both read and written after it is opened. The following table shows the file modes.

|**Mode**|**r**|**w**|**a**|**r+**|**w+**|**a+**|
| :-: | :-: | :-: | :-: | :-: | :-: | :-: |
|Open State|read|<p>write,</p><p>point at beginning</p>|<p>write,</p><p>point at end</p>|read|<p>write,</p><p>point at beginning</p>|<p>write,</p><p>point at end</p>|
|Read Allowed|yes|no|no|yes|yes|yes|
|Write Allowed|no|yes|yes|yes|yes|yes|
|Append Allowed|no|no|yes|no|no|yes|
|File Must Exist\*|yes|no|no|yes|no|no|
|Contents of Existing File Lost|no|yes|no|no|yes|no|

\* Yes: an error is raised if the file does not exist. No: the file is created if it does not exist.

![](https://i.imgur.com/wb5Z6iw.png)

### Close the stream/file

When the file is no longer needed, we need to close it (saving the content from RAM to secondary storage) with the `close` method.

Syntax: `file_object.close()`

Below is the whole picture of using a file.

```python!
…
spTemp = open("MYDATA.DAT", "w")
…
spTemp.close()
…
```

## Input/output methods

We have learnt `input` and `print` for I/O through the keyboard (standard input) and the monitor (standard output). For I/O through files, we use the read and write methods of the file object.
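
As a minimal sketch of I/O through a file object, the code below writes, appends to and then reads a small text file, following the open, read/write, close steps. The file name `demo.txt`, the temporary folder and the text written are assumptions made for illustration only.

```python
import os
import tempfile

# Hypothetical file name, placed in a temporary folder so nothing is overwritten.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, "demo.txt")

# Step 1: open for writing ("w" creates the file, or erases an existing one).
spData = open(filename, "w")
# Step 2: write data.
spData.write("first line\n")
# Step 3: close the file so that the data is saved.
spData.close()

# "a" keeps the existing contents and writes at the end of the file.
spData = open(filename, "a")
spData.write("second line\n")
spData.close()

# "r" opens the existing file for reading only.
spData = open(filename, "r")
content = spData.read()
spData.close()

print(content)
```

Opening the same file again with `"w"` instead of `"a"` would erase the first line before writing the second one.
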
The table below shows some file methods for file input and output.

|**Method**|**Meaning**|
| :-: | :-: |
|`file.read(size)`|<p>Returns the specified number (`size`) of characters (bytes in binary mode) from the file.</p><p>`size` is optional. Default is `-1`, which means the whole file.</p>|
|`file.readline(size)`|<p>Returns one line from the file.</p><p>`size` is optional. The number of bytes from the line to return. Default is `-1`, which means the whole line.</p>|
|`file.readlines(hint)`|<p>Returns a list containing each line in the file as a list item.</p><p>`hint` is optional. If the number of bytes returned exceeds the hint number, no more lines will be returned. Default value is `-1`, which means all lines will be returned.</p>|
|`file.write(byte)`|<p>Writes `byte` to the file.</p><p>`byte` is the text or bytes object that will be written.</p>|
|`file.writelines(list)`|<p>Writes each item of `list` to the file.</p><p>`list` is the list of texts or bytes objects that will be written.</p>|
|`print(string, file=file)`|Writes the specified `string` to the file object `file`.|

The user can press Ctrl + Z in Windows to input EOF.

<iframe src="https://trinket.io/embed/python3/0be7d34e91?runOption=run" width="100%" height="356" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

### Example 1: Copy text file of integers

<iframe src="https://trinket.io/embed/python3/804868064f?runOption=run" width="100%" height="400" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

### Example 2: Append data to file

<iframe src="https://trinket.io/embed/python3/671d7bbf2d?runOption=run" width="100%" height="420" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

## Block input and output (\*)

To read and write binary files, we need to understand block input and output first. In a binary file, data is stored in *bytes* instead of characters. Hence, we need to convert the data (especially numbers) into bytes first.
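
To illustrate why this conversion matters, here is a minimal sketch that compares the same number stored as characters and stored as bytes. The value `1000`, the 2-byte size and the little-endian byte order are assumptions chosen for illustration, and `int.to_bytes` is only one possible way to do the conversion.

```python
number = 1000  # illustrative value

# As text: one byte for each character of the decimal digits "1000".
as_text = str(number).encode("ascii")

# As binary: the value itself packed into 2 bytes
# (the size and the little-endian order are assumptions of this sketch).
as_binary = number.to_bytes(2, "little")

print(as_text, len(as_text))
print(as_binary, len(as_binary))

# Converting the bytes back gives the original number.
restored = int.from_bytes(as_binary, "little")
print(restored)
```

The text form needs four bytes (one per digit), while the binary form needs only two, and the binary form is not readable as digits.
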
For numbers, we can use `bytearray` to convert a list of numbers into bytes. Below is an example.

<iframe src="https://trinket.io/embed/python3/46d4d6e6cf?runOption=run&start=result" width="100%" height="200" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

We can use the write method of the file object to put the *bytes* into the binary file. Opening a binary file is the same as opening a text file, except that the mode becomes "`rb`", "`wb`", "`ab`", "`rb+`", "`wb+`" or "`ab+`". Below is an example of writing a binary file.

<iframe src="https://trinket.io/embed/python3/f6bf750415?runOption=run" width="100%" height="356" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

The file `C08-04.DAT` is created, but you cannot read its content with a text editor.

![](https://i.imgur.com/cMdTgII.png)

Reading a binary file is similar to reading a text file, except that we need to convert the bytes into strings for us to read.

<iframe src="https://trinket.io/embed/python3/dc40f93d5f?runOption=run" width="100%" height="300" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

### Example 3: Write a binary file and print the stream on the screen

<iframe src="https://trinket.io/embed/python3/84f6c08130?runOption=run" width="100%" height="600" frameborder="0" marginwidth="0" marginheight="0" allowfullscreen></iframe>

## Chapter Summary

- Text files are multi-line strings stored in secondary memory, while binary files are stored in the form of bytes.
- A file may be opened for reading, writing or appending.
- When a file is opened for writing (`w` mode), the existing contents of the file are erased.
- Data is written to a file using the `print` function and the `write` method.
- When processing is finished, the files should be closed.

## Exercise

1. Which of the following is *not* a file-reading method in Python?
   - A. read
   - B. readline
   - C. readall
   - D. readlines
1. Before reading from or writing to a file, a file object must be created via
   - A. open
   - B. create
   - C. File
   - D. Folder

## Programming exercise

### Exercise 1

Write a menu-driven text utility program. The program maintains two text files, `text1.txt` and `text2.txt`. The two files should exist or be created at the beginning of the program. The program has the following features.

1. Display the content of a file on the screen
1. Append some text entered by the user to a particular file
1. Append the content of a file to another file

**Example:**

```
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 2
1 - text1.txt
2 - text2.txt
Which file to append? 1
Enter text below (ended by a new line with EOF):
This is a new line of text
^Z
text1.txt is updated
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 1
1 - text1.txt
2 - text2.txt
Which file to print? 1
text1.txt
=========
This is a new line of text
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 2
1 - text1.txt
2 - text2.txt
Which file to append? 2
Enter text below (ended by a new line with EOF):
I am editing a text file
^Z
text2.txt is updated
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 1
1 - text1.txt
2 - text2.txt
Which file to print? 2
text2.txt
=========
I am editing a text file
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 3
1 - text1.txt
2 - text2.txt
Source file? 1
1 - text1.txt
2 - text2.txt
Destination file? 2
text1.txt is appended to text2.txt
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 1
1 - text1.txt
2 - text2.txt
Which file to print? 2
text2.txt
=========
I am editing a text file
This is a new line of text
1 - Print a file
2 - Append a file
3 - Append a file to another file
4 - Quit
What do you want? (1-4): 4
Bye Bye
```

### Exercise 2

Write a program that prints itself to the standard output. (*Hint: The program source file itself is a text file.*)

### Exercise 3

Write a program to parse the words in the file `data.txt` onto separate lines, that is, locate and write each word on its own line. Words are defined as one or more characters separated by whitespace ( ), comma (,) and full stop (.).

**Example:**

If `data.txt` consists of the following text

> Write a program to parse words onto separate lines,
> that is, locate and write each word to its own line.

**The output is**

```
Write
a
program
to
parse
words
onto
separate
lines
that
is
locate
and
write
each
word
to
its
own
line
```

### Exercise 4

Write a program that reads the contents of a binary file of integers, outputs them on the screen, and copies them into a second binary file.