# 4. Thursday, 18-07-2019, NumPy and Pandas

### NumPy

NumPy is short for Numerical Python.

* Functions

| Function Name | NaN-safe Version | Description |
|:-|:-|:-|
| np.sum | np.nansum | Compute sum of elements |
| np.prod | np.nanprod | Compute product of elements |
| np.mean | np.nanmean | Compute mean of elements |
| np.std | np.nanstd | Compute standard deviation |
| np.var | np.nanvar | Compute variance |
| np.min | np.nanmin | Find minimum value |
| np.max | np.nanmax | Find maximum value |
| np.argmin | np.nanargmin | Find index of minimum value |
| np.argmax | np.nanargmax | Find index of maximum value |
| np.median | np.nanmedian | Compute median of elements |
| np.percentile | np.nanpercentile | Compute rank-based statistics of elements |
| np.any | N/A | Evaluate whether any elements are true |
| np.all | N/A | Evaluate whether all elements are true |

* Array

```
import numpy as np

a = np.array([[1, 2, 3]])          # nested list -> rank 2 array with one row
print([1,2,3])                     # [1, 2, 3]  (plain Python list)
print(a)                           # [[1 2 3]]
print(type(a))                     # <class 'numpy.ndarray'>
print(a.dtype)                     # int64 (int32 on some platforms)
print(a.shape)                     # (1, 3)
print(a.ndim)                      # 2 (one row, three columns)
print(a[0][2])                     # 3
a[0][1] = 5
print(a)                           # [[1 5 3]]

b = np.array([[1,2,3], [4,5,6]])   # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"
print(b[:2, :2])                   # [[1 2]
                                   #  [4 5]]
print(b[0, 0])                     # 1
```

### Image

* Library

```
from skimage import io
photo = io.imread('/content/gdrive/My Drive/CoderSchool-FTMLE/data/assets/images/landscape.jpg')
print(type(photo))                 # <class 'numpy.ndarray'>

import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid", {'axes.grid' : False})
```

* Function to show a photo

```
def show_image(photo):
    plt.figure(figsize=(12,12))
    plt.axis("off")
    plt.imshow(photo)
```

>[ a, b, c ]
>a: row (height axis)
>b: column (width axis)
>c: color channel of the pixel
>
>Note that the first axis of the image array is the height, not the width.

### Pandas

Working with DataFrames.

* Library

```
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')
```

* Functions

```
titanic.head()                     # first 5 rows
titanic.tail(1)                    # last row
titanic.sample(5)                  # 5 random rows
titanic.info()                     # column types and non-null counts
titanic.describe()                 # summary statistics of numeric columns

titanic['Sex'].unique()            # distinct values
titanic['Sex'].value_counts()      # number of rows per value
titanic['Pclass'].unique()
titanic['Pclass'].value_counts()
titanic['Survived'].value_counts()
```

### Seaborn

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

* Charts

```
# Assumption: tips is seaborn's example dataset, e.g. tips = sns.load_dataset('tips')
sns.countplot(x='Sex', hue='Survived', data=titanic)
sns.distplot(tips['total_bill'], bins=20)   # deprecated since seaborn 0.11; sns.histplot is the replacement
sns.jointplot(x='tip', y='total_bill', data=tips)
sns.jointplot(x='tip', y='total_bill', kind='kde', data=tips)
sns.heatmap(tips.select_dtypes('number').corr(), annot=True, fmt='.3f', cmap='YlGnBu')  # correlate numeric columns only
sns.boxplot(x='sex', y='total_bill', hue='smoker', data=tips)
sns.swarmplot(x='day', y='tip', hue='smoker', data=tips)
sns.violinplot(x='day', y='tip', hue='smoker', data=tips)
tips.groupby('day')['total_bill'].sum().plot(kind='bar')   # drawn with the pandas plot method
```

* Code for Pie chart

```
# Count the rows for each party size, ordered by party size
counts = tips['size'].value_counts().sort_index()
labels = counts.index
sizes = counts.values              # wedge sizes are the counts, not the labels

# Pull the wedge for party size 3 out of the pie
explode = [0.2 if size == 3 else 0 for size in labels]

f, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%',
        shadow=True, startangle=90)

# Equal aspect ratio ensures that pie is drawn as a circle
ax1.axis('equal')
plt.tight_layout()
plt.show()
```
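
* Checking the pie chart inputs

A pie chart of party sizes needs one count per category, so it can help to inspect the three input lists before plotting. The snippet below is a minimal sketch of that check. `toy_tips` is a small made-up frame standing in for the real `tips` data (an assumption for illustration only), so the numbers are not results from the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the tips dataset: only the 'size' column matters here.
toy_tips = pd.DataFrame({"size": [2, 2, 3, 4, 2, 3, 1, 2]})

# value_counts() counts rows per party size; sort_index() orders by party size.
counts = toy_tips["size"].value_counts().sort_index()

labels = counts.index.tolist()     # party sizes: [1, 2, 3, 4]
sizes = counts.tolist()            # rows per party size: [1, 4, 2, 1]

# Offset only the wedge whose label is 3.
explode = [0.2 if label == 3 else 0 for label in labels]

print(labels)
print(sizes)
print(explode)
```

The wedge sizes should add up to the number of rows in the frame, which is a quick sanity check that counts, and not the labels themselves, are being plotted.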