<style>
.reveal {
  font-size: 27px;
}
.reveal div.para {
  text-align: left;
}
.reveal ul {
  display: block;
}
.reveal ol {
  display: block;
}
img[alt=drawing] {
  width: 200px;
}
</style>

# COMP1010
## 3.8 - Python: Serialization

---

## What is Serialization?

* Turning our data structures (lists, dictionaries, variables, etc.) into a sequential format, usually for the purpose of writing to a file.

---

## Why?

* Store data on the server.
* Save data between executions of a program.
* Great for storing data that's independent of the user (things which aren't appropriate for cookies) or things we don't want to risk the user changing/deleting.
* Can't be seen, modified or deleted by the user without the server granting access/permission.
* Simpler to implement than a database, but lacks a lot of the features a database brings.
* Can be stored more or less securely using encryption (not covered in this course).

---

## Different from Cookies

* Stored on the server rather than the user's machine.
* Can't be deleted by the user.
* No fixed size limit (or at least a much larger one).
* Can store things that are needed for all users rather than just one.

---

## File Formats

* File Formats
    * Plain text
    * CSV (Comma Separated Values)
    * JSON (JavaScript Object Notation)
* Lecture Plan:
    * Today we will cover plain text and CSV, primarily to develop an appreciation for the ways in which JSON is wonderful.
    * Then we will cover JSON.
* Context:
    * JSON is useful for saving data on the server for our own applications.
    * JSON is also commonly used when accessing data from online sources, which we will cover in week 7.
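---

## File Formats: Side by Side

As a rough sketch of what "a sequential format" means in practice, the snippet below turns one small list into each of the three formats in memory, without writing any files. The `tutorials` data and all variable names are made up for this illustration.

```python
import csv
import io
import json

# Hypothetical sample data, invented for this illustration.
tutorials = [["T15A", "Liz", 21], ["H15A", "William", 21]]

# Plain text: one record per line, fields joined by spaces.
as_text = "\n".join(" ".join(str(item) for item in row) for row in tutorials)

# CSV: let the csv module handle the commas and any quoting.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(tutorials)
as_csv = buffer.getvalue()

# JSON: keeps the nesting and the difference between strings and numbers.
as_json = json.dumps(tutorials)

print(as_text)
print(as_csv)
print(as_json)
```

Only the JSON output keeps the quotes that tell strings apart from numbers, which is one reason it is easier to read back into Python.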
---

## Plain Text

```{python}
def get_sample_data():
    return ["hello",
            "how are you?",
            "that's good",
            "nice to meet you",
            "goodbye\nI'll see you again"]
```

---

## Plain Text

```{python}
def write_file(data, filename):
    # open file
    f = open(f'files/{filename}.txt', 'w')
    # write contents, one item per line
    # ('\n' is translated to the platform's line ending in text mode)
    for text in data:
        f.write(text + '\n')
    # close file
    f.close()
```

```{python}
def read_file(filename):
    # open
    f = open(f'files/{filename}.txt', 'r')
    # read the contents
    data = []
    for line in f.readlines():
        data.append(line.strip())
    # close the file
    f.close()
    return data
```

---

## CSV

* Comma Separated Values
* Common data format
* Stores tabular data (e.g. anything that could go in a spreadsheet)...
    * in rows...
    * separated by commas
* Each row goes on a new line

---

## CSV

```{python}
import csv
```

---

## CSV: List

```{python}
def get_sample_data_list():
    return [['T15A', 'Liz', 21],
            ['H12A', 'My Name Has A, Comma In It', 21],
            ['H19A', 'My Name Has A\n newline In It', 21],
            ['H15A', 'William', 21]]
```

---

## CSV: List

```{python}
def write_list_as_csv(data, filename):
    # open file for writing (saving)
    # newline='' lets the csv module control line endings itself
    f = open(f'files/{filename}.csv', 'w', newline='')
    csv_writer = csv.writer(f)
    # write each row of the data list into the file
    for row in data:
        csv_writer.writerow(row)
    # close the file
    f.close()
```

```{python}
def read_csv_as_list(filename):
    # open file
    f = open(f'files/{filename}.csv', 'r', newline='')
    csv_reader = csv.reader(f)
    data = []
    # read in lines from file to data list
    # (every value comes back as a string, e.g. '21' rather than 21)
    for line in csv_reader:
        data.append(line)
    # close the file
    f.close()
    # return the list
    return data
```

---

## CSV: Dictionary

```{python}
def get_sample_data_dict():
    return [{'tutorial':'T15A', 'tutor':'Liz', 'enrollments':21},
            {'tutorial':'H12A', 'tutor':'My Name Has A, Comma In It', 'enrollments':21},
            {'tutorial':'H15A', 'tutor':'William', 'enrollments':21}]
```

---

## CSV: Dictionary

```{python}
def write_dict_as_csv(data, filename, fieldnames):
    # open the file
    f = open(f'files/{filename}.csv', 'w', newline='')
    # create a dictionary writer and write the header row
    csv_writer = csv.DictWriter(f, fieldnames=fieldnames)
    csv_writer.writeheader()
    # write one row per dictionary
    for row in data:
        csv_writer.writerow(row)
    # close the file
    f.close()
```

```{python}
def read_csv_as_dict(filename):
    # create empty list to read the file into
    data = []
    # open the file
    f = open(f'files/{filename}.csv', 'r', newline='')
    # create a dictionary reader
    csv_reader = csv.DictReader(f)
    # read in the file
    for row in csv_reader:
        data.append(row)
    # close the file
    f.close()
    # return the information read in
    return data
```

---

## Problems

* Newlines within pieces of data
* Complex data structures (where a list/dictionary may contain another list/dictionary of unknown length)

---

## JSON

* JavaScript Object Notation (JSON) is a serialisation format with its origins in the JavaScript language.
* More recently it has become a universal standard used by many languages and tools for storing data.
* Important points:
    * JSON looks very similar to how we write data structures in Python.
    * JSON supports two structures:
        * Arrays, which are similar to lists in Python
        * Objects, which are similar to dictionaries (keys *must* be strings)
    * Single values can be strings, numbers, `true`/`false` or `null`.
    * In VSCode, you can format your JSON documents by right-click $\rightarrow$ "Format Document"

---

## JSON

```{python}
import json
```

```{python}
def get_sample_data():
    return [
        {'tutorial': 'T15A', 'tutor': 'Sim', 'enrollments': 20},
        {'tutorial': 'T17A', 'tutor': 'Kai', 'enrollments': 23},
        {'tutorial': 'T17A', 'tutor': 'Ka\ni', 'enrollments': 23},
    ]
```

---

## JSON

```{python}
def write_json_file(data, filename):
    # open
    f = open(f'files/{filename}.json', 'w')
    # write
    f.write(json.dumps(data))
    # close
    f.close()
```

```{python}
def read_json_file(filename):
    # open
    f = open(f'files/{filename}.json', 'r')
    # read
    data = json.loads(f.read())
    # close
    f.close()
    return data
```

---

## JSON: Complex Data Structures

```{python}
def get_sample_data():
    return {
        "fridge": ["chiller", "door", "shelves"],
        "dishwasher": ["motor", "rack", "tube"],
        "vacuum": ["hose", "motor", "bag"]}
```

---
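
## JSON: Complex Data Structures

A minimal sketch of why JSON copes with this kind of data: the dictionary of lists from the sample above survives a round trip unchanged. To keep the example self-contained it uses `json.dumps` and `json.loads` on a string rather than a file, and the `appliances` name is only for illustration.

```python
import json

# Same shape as the sample data above: a dictionary whose values are lists.
appliances = {
    "fridge": ["chiller", "door", "shelves"],
    "dishwasher": ["motor", "rack", "tube"],
    "vacuum": ["hose", "motor", "bag"],
}

# Serialise to a string; indent=4 only makes the output easier to read.
text = json.dumps(appliances, indent=4)

# Deserialise back into Python data structures.
restored = json.loads(text)

print(restored == appliances)  # True
print(restored["vacuum"][1])   # motor
```

Not everything round-trips exactly: tuples come back as lists, and integer dictionary keys come back as strings.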