Reading files in Python

---
title: Reading files in Python
tags: Python Cheatbook
---

# Reading files in Python

Read a single file line by line

```python
import os

your_file = './data/site1/B1/path_data_files/somefile.txt'

if os.path.isfile(your_file):
    f = open(your_file, 'r', encoding='utf-8')
    for line in f:
        # Each line already ends with a newline, so suppress print's own.
        print(line, end='')  # Do whatever you like here.
    f.close()
```

Reading files line by line in a directory

```python
import os

your_path = './data/site1/B1/path_data_files'
files = os.listdir(your_path)
print('found {} files'.format(len(files)))

for file in files:
    file_path = os.path.join(your_path, file)
    if os.path.isfile(file_path):  # skip subdirectories
        f = open(file_path, 'r', encoding='utf-8')
        for line in f:
            print(line, end='')  # Do whatever you like here.
        f.close()
```

- `os.path.isfile(path)` checks whether `path` points to an existing regular file.
- `open(path, mode, [encoding])` opens a file with the given `mode` and an optional `encoding` parameter, usually `encoding='utf-8'`.
- `os.path.join(path, file)` joins two path components.
- `file.close()` closes the file stream.

# Reading a JSON file

```python
import json

json_file_path = './data.json'  # placeholder: replace with the path to your JSON file

with open(json_file_path, 'r', encoding='utf-8') as file:
    json_data = json.load(file)
    print(json_data['myJsonKey'])
```

- `open(filepath)` opens a file; inside a `with` statement the file is closed automatically when the block ends.
- `json.load(file)` parses a JSON file stream into a Python object (a dictionary when the top-level JSON value is an object).

# Reading a CSV file [1]

## With `csv` module

```python
import csv

filename = "car.data"
fields = []
rows = []

# newline='' lets the csv module handle line endings itself
with open(filename, 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile)

    # uncomment to take first row as fields
    # fields = next(csvreader)

    for row in csvreader:
        rows.append(row)

    print("Total no. of rows: %d" % (csvreader.line_num))

print('Field names are:' + ', '.join(fields))

print('\nFirst 5 rows are:')
print('----------------')
for row in rows[:5]:
    # Join by position, so a repeated value cannot end the line early.
    print(','.join(row))
```

- The `csv` module makes it easy to read CSV files.
- `csv.reader(csvfile)` creates a CSV reader that can be iterated over to extract each row as a list of strings.

**Output:**

```
Total no. of rows: 1728
Field names are:

First 5 rows are:
----------------
vhigh,vhigh,2,2,small,low,unacc
vhigh,vhigh,2,2,small,med,unacc
vhigh,vhigh,2,2,small,high,unacc
vhigh,vhigh,2,2,med,low,unacc
vhigh,vhigh,2,2,med,med,unacc
```

Alternatively, you can use the `pandas` library to read the CSV into DataFrames and do exploratory data analysis much more easily. [Check the `pandas` tutorial to learn more](https://hackmd.io/9kh0_lzMT46HuevPkckrFw#Read-seperated-value-files-CSV-TSV-etc).

# References

1. https://www.geeksforgeeks.org/working-csv-files-python/
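
# Reading a directory with `with`

The directory example in this note opens each file and closes it by hand with `close()`. As a hedged alternative, the sketch below wraps the same `os.listdir`, `os.path.join`, and `os.path.isfile` calls in a `with` block, so each file is closed even if processing a line raises an error. The helper names `read_lines` and `read_directory` are hypothetical and not part of the original note, and the temporary directory exists only to make the demo self-contained.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Return the lines of a text file without their trailing newlines."""
    # The with statement closes the file when the block exits,
    # including when an exception is raised inside it.
    with open(path, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


def read_directory(dir_path, encoding="utf-8"):
    """Map each regular file name in dir_path to its list of lines."""
    result = {}
    for name in sorted(os.listdir(dir_path)):
        full_path = os.path.join(dir_path, name)
        if os.path.isfile(full_path):  # skip subdirectories
            result[name] = read_lines(full_path, encoding)
    return result


# Demo on a throwaway directory (hypothetical data, not from the note).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "w", encoding="utf-8") as f:
        f.write("first\nsecond\n")
    os.mkdir(os.path.join(tmp, "subdir"))  # ignored by read_directory
    demo = read_directory(tmp)

print(demo)
```

To use it on real data, pass your own directory, for example `read_directory('./data/site1/B1/path_data_files')`. Returning the lines instead of printing them is a design choice for this sketch; for very large files, iterating over the open file directly avoids holding everything in memory.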