## Espaço de Tecnologias e Artes - Sesc Avenida Paulista

### `hackmd.io/@sesc-av-paulista/estudos-em-python-4-junho`

# Python study group

## June 4: Exploring data visualization elements

### A few of the many libraries

> ![image](https://hackmd.io/_uploads/B1I375jER.png)
> 2018 diagram by Jake VanderPlas, from the article https://www.anaconda.com/blog/python-data-visualization-2018-why-so-many-libraries

- **matplotlib** - one of the oldest and most popular
- **seaborn** - built on matplotlib, but higher level
- **bokeh** - renders to HTML + JavaScript output for interactivity

We will leave maps and georeferenced data for another day!

A nice open reference is VanderPlas's *Handbook*: https://jakevdp.github.io/PythonDataScienceHandbook/

- Tour of Python https://jakevdp.github.io/WhirlwindTourOfPython/
- Jake VanderPlas keynote https://www.youtube.com/watch?v=ZyjCqQEUa8o&ab_channel=PyCon2017

#### matplotlib

- https://matplotlib.org/
- Histogram https://matplotlib.org/stable/gallery/statistics/hist.html#sphx-glr-gallery-statistics-hist-py

```python=
import matplotlib.pyplot as plt
import numpy as np

# Use one of the seaborn-like styles bundled with matplotlib
plt.style.use('seaborn-v0_8-whitegrid')

# Create an empty figure with a single set of axes
fig = plt.figure()
ax = plt.axes()
plt.show()
```

![Figure_1](https://hackmd.io/_uploads/H1VuvCn40.png)

![image](https://hackmd.io/_uploads/SySM_0nER.png)

```python
import matplotlib.pyplot as plt
import numpy as np

# Draw 10000 samples from a normal distribution (mean 100, std. dev. 15)
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)

# Count the samples in 50 bins, then draw the counts as bars
hist, bins = np.histogram(x, bins=50)
width = 0.7 * (bins[1] - bins[0])        # bars narrower than the bins
center = (bins[:-1] + bins[1:]) / 2      # midpoint of each bin
plt.bar(center, hist, align='center', width=width)
plt.show()
```

![EDxOG](https://hackmd.io/_uploads/HkMFjR3V0.png)

- https://jakevdp.github.io/PythonDataScienceHandbook/04.01-simple-line-plots.html
- https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.00-Introduction-To-Matplotlib.ipynb#scrollTo=UYpQdOIMux5z

#### seaborn

- https://seaborn.pydata.org/

Higher level, and a better fit for working with *pandas* (dataframes).

```python=
import matplotlib.pyplot as plt
import seaborn as sns

# Apply the default theme (otherwise the plot has no grid)
sns.set_theme()

# Load the example "tips" dataset
tips = sns.load_dataset("tips")

# Create a visualization
sns.relplot(
    data=tips,
    x="total_bill", y="tip", col="smoker",
    hue="time", style="smoker", size="size",
    # Alternative mapping to try:
    # x="size", y="tip", col="smoker",
    # hue="time", style="smoker", size="total_bill",
)
plt.show()
```

```python
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

sns.set()  # seaborn's method to set its chart style

# 2000 samples from a correlated two-dimensional normal distribution
data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
data = pd.DataFrame(data, columns=['x', 'y'])

# Overlay one semi-transparent histogram per column
for col in 'xy':
    plt.hist(data[col], density=True, alpha=0.5)
plt.show()
```

- https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.14-Visualization-With-Seaborn.ipynb

#### bokeh

- https://bokeh.org/
- https://colab.research.google.com/github/bebi103a/bebi103a.github.io/blob/master/lessons/06/intro_to_plotting_with_bokeh.ipynb

## Digression on "vectorization"

```python=
import numpy as np
from math import sin, tau

# Loop version: one sin() call per angle
# for n in range(1000):
#     i = 1 + n
#     a = (tau / 1000) * i
#     s = sin(a)
#     print(s)

# Vectorized version: a single np.sin() call over the whole array.
# Note that linspace includes both endpoints, so this yields 1001 values
# (it also includes angle 0, which the loop above skips).
n = np.linspace(0, 1, 1001)
a = tau * n
s = np.sin(a)
print(s)
```

Data challenges: https://www.kaggle.com/

Querido Diário: https://ok.org.br/projetos/

Serenata de Amor: https://serenata.ai/
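
Returning to the vectorization digression: the sketch below is one way to check that the loop and the NumPy version compute the same sine values. The function names `sines_loop` and `sines_vectorized` are hypothetical, introduced only for this illustration, and the loop here starts at angle 0 so that both versions produce the same number of points.

```python
import math

import numpy as np


def sines_loop(count):
    """Pure-Python loop: one math.sin call per angle, from 0 to tau inclusive."""
    step = math.tau / count
    return [math.sin(step * i) for i in range(count + 1)]


def sines_vectorized(count):
    """NumPy version: a single np.sin call applied to the whole array."""
    angles = math.tau * np.linspace(0, 1, count + 1)
    return np.sin(angles)


loop_values = sines_loop(1000)
vector_values = sines_vectorized(1000)

# Same number of points, and the same values up to floating-point rounding
print(len(loop_values), vector_values.shape)
print(np.allclose(loop_values, vector_values))
```

The vectorized form replaces the explicit Python loop with operations on whole arrays, which is usually both shorter and faster for large inputs.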