Python pandas Quick Reference

Read/Write CSV

import pandas as pd
a = pd.read_csv('input.csv')
a.to_csv('output.csv', index=False)

Merge DataFrames

Merge by a common column

"Merge" two data CSV files that

the first line is the column header
at least 1 common column exists in both CSV files

import pandas as pd

a = pd.read_csv('file1.csv')
b = pd.read_csv('file2.csv')
a = a.merge(b, on="CommonColumn", how="outer")
a.to_csv('output.csv', index=False)

Notes:

The option how determines how the merge is performed. See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.merge.html

Append new rows of identical columns

"Merge" two data CSV files that

the first line is the column header
the column headers are the same
the rows are from different data sets

import pandas as pd

a = pd.read_csv('file1.csv')
b = pd.read_csv('file2.csv')

# Introduce a new column 'DataSource', if necessary, to indicate the source of the entry
a['DataSource'] = 'file1'
b['DataSource'] = 'file2'

merged_data = pd.DataFrame()
merged_data = merged_data.append(a, ignore_index=True, sort=False)
merged_data = merged_data.append(b, ignore_index=True, sort=False)
merged_data.to_csv('output.csv', index=False)

Add new columns

d = {
    'Name': ['Alice', 'Bob'],
    'Math': [100, 90],
    'English': [80, 85],
}
df = pd.DataFrame(data=d)
"""
    Name  Math  English
0  Alice   100       80
1    Bob    90       85
"""

# Add a new column to all rows
df['Class'] = 'ClassA'

# Adda new column based on other columns
df['Avg'] = df.apply(lambda row: (row['Math'] + row['English']) / 2, axis=1)

Back to @lancec

Python pandas Quick Reference

Read/Write CSV

Merge DataFrames

Merge by a common column

Append new rows of identical columns

Add new columns

tags: quick-ref

Read more

Kalman Filter

Steering Vector

Python Quick Reference

Git Quick Reference

tags: `quick-ref`