<style>
.reveal {
font-size: 27px;
}
.reveal div.para {
text-align: left;
}
.reveal ul {
display: block;
}
.reveal ol {
display: block;
}
img[alt=drawing] { width: 200px; }
</style>
# COMP1010
## 3.8 - Python: Serialization
---
## What is Serialization?
* Turning our data structures (lists, dictionaries, variables, etc.) into a sequential format, usually so they can be written to a file.
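---
## What is Serialization? (Example)
As a minimal sketch of the idea (the variable names are illustrative and not part of the lecture code), a list can be flattened into one string and rebuilt from it later:

```python
# A list held in memory
shopping = ["milk", "eggs", "bread"]

# Serialize: turn the list into one sequential string, one item per line
serialized = "\n".join(shopping)

# Deserialize: rebuild an equivalent list from the string
restored = serialized.split("\n")

print(repr(serialized))
print(restored == shopping)
```

This simple scheme only works while no item contains a newline itself, which is one reason richer formats exist.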
---
## Why?
* Store data on the server.
* Save data between executions of the program.
* Great for storing data that's independent of the user (things which aren't appropriate for cookies) or things we don't want to risk the user changing/deleting.
* Can't be seen, modified or deleted by the user without the server granting access/permission.
* Simpler to implement than databases but lacks a lot of features a database brings.
* Can be stored more securely by using encryption (not covered in this course).
---
## Different from Cookies
* Stored on server rather than user's machine.
* Can't be deleted by user.
* Much larger size limit (in practice, limited only by the server's storage).
* Can store things that are needed for all users rather than just the one.
---
## File Formats
* File formats:
    * Plain text
    * CSV (Comma Separated Values)
    * JSON (JavaScript Object Notation)
* Lecture Plan:
    * Today we will cover plain text and CSV primarily to develop an appreciation for the ways in which JSON is wonderful.
    * Then we will cover JSON.
* Context:
    * JSON is useful for saving data on the server for our own applications.
    * JSON is also commonly used when accessing data from online sources which we will cover in week 7.
---
## Plain Text
```{python}
def get_sample_data():
    return ["hello", "how are you?", "that's good",
            "nice to meet you", "goodbye\nI'll see you again"]
```
---
## Plain Text
```{python}
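def write_file(data, filename):
    # open file
    f = open(f'files/{filename}.txt', 'w')
    # write contents, one item per line
    # (in text mode Python translates '\n' to the platform's line ending,
    # so os.linesep should not be used here)
    for text in data:
        f.write(text + '\n')
    # close file
    f.close()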
```
```{python}
def read_file(filename):
    # open
    f = open(f'files/{filename}.txt', 'r')
    # read the contents
    data = []
    for line in f.readlines():
        data.append(line.strip())
    # close the file
    f.close()
    return data
```
---
## CSV
* Comma Separated Values
* Common data format
* Stores tabular data (e.g. anything that could go in a spreadsheet)...
    * in rows...
    * with the values in each row separated by commas
* Each row goes on a new line
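---
## CSV: What the Text Looks Like
To make the format concrete, here is a small sketch that writes two rows to an in-memory buffer (an `io.StringIO`, used here only so that no file is needed) and prints the resulting CSV text. The buffer and variable names are assumptions for illustration.

```python
import csv
import io

rows = [["T15A", "Liz", 21],
        ["H12A", "My Name Has A, Comma In It", 21]]

# Write to an in-memory text buffer instead of a real file
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows(rows)

# Each row becomes one line; values are separated by commas
csv_text = buffer.getvalue()
print(csv_text)
```

The value that contains a comma is wrapped in double quotes, which is how the `csv` module keeps it from being read back as two separate values.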
---
## CSV
```{python}
import csv
```
---
## CSV: List
```{python}
def get_sample_data_list():
    return [['T15A', 'Liz', 21],
            ['H12A', 'My Name Has A, Comma In It', 21],
            ['H19A', 'My Name Has A\n newline In It', 21],
            ['H15A', 'William', 21]]
```
---
## CSV: List
```{python}
def write_list_as_csv(data, filename):
    # open file for writing (saving)
    # newline='' lets the csv module handle line endings itself
    f = open(f'files/{filename}.csv', 'w', newline='')
    csv_writer = csv.writer(f)
    # write each row of the data list into the file
    for row in data:
        csv_writer.writerow(row)
    # close the file
    f.close()
```
```{python}
def read_csv_as_list(filename):
    # open file (newline='' so newlines inside quoted values are kept intact)
    f = open(f'files/{filename}.csv', 'r', newline='')
    csv_reader = csv.reader(f)
    data = []
    # read in lines from file to data list (every value is read as a string)
    for line in csv_reader:
        data.append(line)
    # close the file
    f.close()
    # return the list
    return data
```
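---
## CSV: List (Round Trip)
The following sketch shows what comes back after writing and then reading a list of rows. It uses a temporary directory rather than the `files/` folder so that it runs anywhere; the path and variable names are assumptions for illustration.

```python
import csv
import os
import tempfile

rows = [["T15A", "Liz", 21],
        ["H19A", "My Name Has A\n newline In It", 21]]

# Use a temporary directory so the sketch does not depend on a files/ folder
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tutorials.csv")

    # newline='' lets the csv module control line endings itself
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

    with open(path, "r", newline="") as f:
        loaded = list(csv.reader(f))

print(loaded)

# Numbers come back as strings, so they must be converted by hand
converted = [[code, tutor, int(count)] for code, tutor, count in loaded]
print(converted == rows)
```

CSV stores no type information, so the program reading the file has to know which columns hold numbers.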
---
## CSV: Dictionary
```{python}
def get_sample_data_dict():
    return [{'tutorial':'T15A', 'tutor':'Liz', 'enrollments':21},
            {'tutorial':'H12A', 'tutor':'My Name Has A, Comma In It', 'enrollments':21},
            {'tutorial':'H15A', 'tutor':'William', 'enrollments':21}]
```
---
## CSV: Dictionary
```{python}
def write_dict_as_csv(data, filename, fieldnames):
    f = open(f'files/{filename}.csv', 'w', newline='')
    csv_writer = csv.DictWriter(
        f, fieldnames=fieldnames
    )
    csv_writer.writeheader()
    for row in data:
        csv_writer.writerow(row)
    f.close()
```
```{python}
def read_csv_as_dict(filename):
    # create empty list to read the file into
    data = []
    # open the file (newline='' so the csv module handles line endings)
    f = open(f'files/{filename}.csv', 'r', newline='')
    # create a dictionary reader (the header row supplies the keys)
    csv_reader = csv.DictReader(f)
    # read in the file
    for row in csv_reader:
        data.append(row)
    # close the file
    f.close()
    # return the information read in
    return data
```
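---
## CSV: Dictionary (Round Trip)
As a rough sketch of how `DictWriter` and `DictReader` fit together, the example below writes two dictionaries to an in-memory buffer (an assumption made so that no file is needed) and reads them back.

```python
import csv
import io

data = [{'tutorial': 'T15A', 'tutor': 'Liz', 'enrollments': 21},
        {'tutorial': 'H15A', 'tutor': 'William', 'enrollments': 21}]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=['tutorial', 'tutor', 'enrollments'])
writer.writeheader()
writer.writerows(data)

# The first line of the output is the header row built from fieldnames
header = buffer.getvalue().splitlines()[0]
print(header)

# Rewind the buffer and read it back; the header row supplies the keys
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
print(loaded[0])
```

As with the list version, every value is read back as a string, so `'enrollments'` holds `'21'` rather than the number `21`.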
---
## Problems
* Newlines within pieces of data
* Complex data structures (where a list/dictionary may contain another list/dictionary of unknown length)
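---
## Problems (Example)
Both problems can be reproduced in a few lines. This is a sketch using in-memory strings and buffers instead of files; the variable names are illustrative.

```python
import csv
import io

# Problem 1: plain text, one item per line, with a newline inside an item
messages = ["hello", "goodbye\nI'll see you again"]
text = "".join(message + "\n" for message in messages)
restored = [line.strip() for line in text.splitlines()]
print(len(messages), len(restored))  # two items went in, three came out

# Problem 2: CSV has no notion of a list nested inside a value
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(["fridge", ["chiller", "door"]])
buffer.seek(0)
row = next(csv.reader(buffer))
print(row)  # the inner list came back as a single string
```

In the second case the nested list is silently converted to its string form, so the original structure is lost unless we parse that string ourselves.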
---
## JSON
* JavaScript Object Notation (JSON) is a serialisation format with its origins in the JavaScript language.
* More recently it has become a universal standard used by many languages and tools for storing data.
* Important points:
    * JSON looks very similar to how we write data structures in Python.
    * JSON supports two structures:
        * Arrays, which are similar to lists in Python
        * Objects, which are similar to dictionaries (keys *must* be strings)
    * Single values can be strings, numbers, booleans (`true`/`false`) or `null`.
    * In VSCode, you can format your JSON documents by right-click$\rightarrow$"Format Document"
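---
## JSON: Python to JSON
A short sketch of how Python values map onto JSON text, using `json.dumps` from the standard library. The `record` dictionary is made up for illustration. Besides strings and numbers, JSON also has `true`, `false` and `null`, which correspond to Python's `True`, `False` and `None`.

```python
import json

# A Python structure mixing the value types JSON can represent
record = {"tutorial": "T15A", "enrollments": 21, "full": False,
          "room": None, "tutors": ["Sim", "Kai"]}

text = json.dumps(record)
print(text)

# Keys that are not strings are converted to strings when serialized
print(json.dumps({1: "one"}))
```

Because keys are always stored as strings, a dictionary with integer keys will not come back identical after a round trip.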
---
## JSON
```{python}
import json
```
```{python}
def get_sample_data():
    return [
        {'tutorial': 'T15A', 'tutor': 'Sim', 'enrollments': 20},
        {'tutorial': 'T17A', 'tutor': 'Kai', 'enrollments': 23},
        {'tutorial': 'T17A', 'tutor': 'Ka\ni', 'enrollments': 23},
    ]
```
---
## JSON
```{python}
def write_json_file(data, filename):
    # open
    f = open(f'files/{filename}.json', 'w')
    # write
    f.write(json.dumps(data))
    # close
    f.close()
```
```{python}
def read_json_file(filename):
    # open
    f = open(f'files/{filename}.json', 'r')
    # read
    data = json.loads(f.read())
    # close
    f.close()
    return data
```
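---
## JSON (Round Trip)
The sketch below writes one record to a JSON file and reads it back, to show that the newline and the number both survive. It uses a temporary directory instead of the `files/` folder so that it runs anywhere, and `json.dump` (which writes directly to an open file) as an alternative to `json.dumps`; the path and variable names are assumptions.

```python
import json
import os
import tempfile

data = [{'tutorial': 'T17A', 'tutor': 'Ka\ni', 'enrollments': 23}]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tutorials.json")

    # json.dump writes straight to an open file (json.dumps returns a string)
    with open(path, "w") as f:
        json.dump(data, f)

    with open(path, "r") as f:
        raw_text = f.read()

loaded = json.loads(raw_text)

print(raw_text)        # the newline is stored as the two characters \ and n
print(loaded == data)  # the string and the integer both come back unchanged
```

Unlike CSV, no manual conversion back to `int` is needed, because JSON distinguishes numbers from strings.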
---
## JSON: Complex Data Structures
```{python}
def get_sample_data():
    return {
        "fridge": ["chiller", "door", "shelves"],
        "dishwasher": ["motor", "rack", "tube"],
        "vacuum": ["hose", "motor", "bag"]}
```
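---
## JSON: Complex Data Structures (Round Trip)
A dictionary of lists like the one above can be serialized without any extra work. This sketch uses `json.dumps` with `indent=2` to produce readable output; the variable names are illustrative.

```python
import json

appliances = {
    "fridge": ["chiller", "door", "shelves"],
    "dishwasher": ["motor", "rack", "tube"],
    "vacuum": ["hose", "motor", "bag"]}

# indent=2 produces human-readable output; it does not change the data
text = json.dumps(appliances, indent=2)
print(text)

# The nested lists are rebuilt as real lists, not strings
restored = json.loads(text)
print(restored["vacuum"][1])
```

The nesting can be arbitrarily deep, which is what plain text and CSV could not offer.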