---
tags: Python Workshop 沈煒翔
---
# Lesson 8: Plotting
## Line plot
Matplotlib is a library commonly used for plotting figures in Python.
```python
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2, 100) # Sample data.
plt.figure()
plt.plot(x, x, label='linear') # Plot some data on the (implicit) axes.
plt.plot(x, x**2, label='quadratic') # etc.
plt.plot(x, x**3, label='cubic')
plt.xlabel('x label')
plt.ylabel('y label')
plt.title("Simple Plot")
plt.legend()
```
In some environments, you have to call ```plt.show()``` at the end of the script to show the figures.

You can control the color, marker, and line style of each line with a short format string.
```python
plt.figure()
plt.plot(x, x, 'r.', label='linear') # color (red) + marker (points), no connecting line
plt.plot(x, x**2, '.-', label='quadratic') # marker (points) + line style (solid)
```
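
The same properties can also be set with keyword arguments, which is often easier to read than a format string. The following is a minimal sketch; the particular colors, widths, and the variable names `line_a` and `line_b` are chosen only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2, 20)

plt.figure()
# Equivalent to the format string 'g--', plus an explicit line width.
(line_a,) = plt.plot(x, x, color='green', linestyle='--', linewidth=2,
                     label='linear')
# Markers only: linestyle='none' suppresses the connecting line.
(line_b,) = plt.plot(x, x**2, color='tab:orange', linestyle='none',
                     marker='o', markersize=4, label='quadratic')
plt.legend()
```

`plt.plot` returns a list of line objects, so the returned handles can be kept and modified later (e.g. `line_a.set_linewidth(3)`).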

## Scatter plot
We can plot a 2D scatter plot.
```python
x = np.random.normal(loc=0, scale=1, size=200)
y = np.random.normal(loc=0, scale=1, size=200)
x2 = np.random.normal(loc=3, scale=2, size=200)
y2 = np.random.normal(loc=2, scale=2, size=200)
plt.figure()
plt.scatter(x, y)
plt.scatter(x2, y2)
plt.legend(['a', 'b'])
```

## Histogram
We can plot a histogram of 1D data.
```python
x = np.random.normal(loc=10, scale=3, size=5000)
plt.figure()
plt.hist(x, bins=50)
```

## Bar chart
We can plot a bar chart to compare one value per category, here with error bars.
```python
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
error = np.random.rand(len(people))
plt.figure()
plt.barh(y_pos, performance, xerr=error, align='center')
plt.yticks(ticks=y_pos, labels=people)
```

### Exercise
Plot the curve ```y = x^2 + 3x - 5``` as a line, then plot the same equation with some random noise added, using points only.
```python
x = np.linspace(-5, 5, 30)
```

### Exercise
Plot three 2D normal distributions.

Given a sample point (e.g. (0, 0)), use the nearest neighbor method to find which distribution it belongs to.
```python
sample_point = (0, 0)
```
## Data processing
Assume we have a 1D time series that is noisy. We can apply moving average smoothing to eliminate some noise.
```python
x = np.linspace(-5, 5, 100)
y = x**2 + 3*x - 5 + 2*np.random.randn(100)
plt.figure()
plt.plot(x, y, '.-')
y_smoothed = np.zeros(x.shape)
for i in range(5, len(x)-5):
    y_smoothed[i] = np.mean(y[i-5:i+6]) # centered window: 5 samples on each side
y_smoothed[:5] = np.nan # not enough neighbors at the edges
y_smoothed[-5:] = np.nan
plt.figure()
plt.plot(x, y, '.-')
plt.plot(x, y_smoothed, '.-')
```
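
The loop can also be written without an explicit Python loop. The sketch below assumes a centered window of 11 samples (5 on each side) and uses `np.convolve` with a uniform kernel; the seeded generator `rng` and the names `half`, `window`, and `kernel` are introduced here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is reproducible
x = np.linspace(-5, 5, 100)
y = x**2 + 3*x - 5 + 2*rng.standard_normal(100)

half = 5                           # samples on each side of the center
window = 2*half + 1                # 11-sample window
kernel = np.ones(window) / window  # uniform weights that sum to 1

# mode='valid' keeps only positions where the window fits entirely,
# so the result is shorter than y by 2*half samples.
y_smoothed = np.full(y.shape, np.nan)
y_smoothed[half:-half] = np.convolve(y, kernel, mode='valid')
```

The edges stay NaN, and Matplotlib skips NaN values when drawing a line, so `plt.plot(x, y_smoothed)` only shows the part where the window fits.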


### Exercise
Assume we have a 1D time series in which some points were lost and are recorded as NaN.
Replace each NaN with the average of the previous and the next value.
```python
x = np.linspace(-5, 5, 100)
y = x**2 + 3*x - 5
y_drop = y.copy() # copy, so that y itself is not modified
for i in range(100):
    if np.random.random() < 0.05:
        y_drop[i] = np.nan
plt.figure()
plt.plot(x, y_drop, '.-')
# YOUR CORRECTION
# HINT: use np.isnan() to detect nan values!
# YOUR PLOTTING
```
Solution:
```python
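# YOUR CORRECTION
# HINT: use np.isnan() to detect nan values!
y_corrected = y_drop.copy()
for i in range(len(y_drop)):
    if np.isnan(y_drop[i]):
        # reconstruct from the previous and the next value;
        # at the two ends only one neighbor exists
        prev_i, next_i = max(i-1, 0), min(i+1, len(y_drop)-1)
        # np.nanmean ignores a neighbor that is itself NaN
        # (the result stays NaN if no valid neighbor is left)
        y_corrected[i] = np.nanmean([y_drop[prev_i], y_drop[next_i]])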
# YOUR PLOTTING
plt.figure()
plt.plot(x, y_corrected, '.-')
```
## Optimization
Assume we observe some noisy data and want to model it with a linear ```y=ax+b``` system.
```python
x = np.linspace(-5, 5, 100)
y = 3*x - 5 + np.random.randn(100)
plt.figure()
plt.plot(x, y)
```

We can find the optimal (a, b) using linear sweeping: try every pair (a, b) on a grid and keep the pair with the smallest mean squared error.
```python
optimal_error = 1e20
optimal_a = 0
optimal_b = 0
for a in np.arange(-10, 10, 0.1):
    for b in np.arange(-10, 10, 0.1):
        y_pred = a*x + b
        error = np.mean((y_pred - y)**2) # mean squared error
        if error < optimal_error:
            optimal_a = a
            optimal_b = b
            optimal_error = error
print(optimal_a, optimal_b)
```
Then we can plot it.
```python
y_pred = optimal_a*x + optimal_b
plt.figure()
plt.plot(x, y)
plt.plot(x, y_pred)
```
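
As a cross-check that is not part of the original lesson, NumPy can solve the same least-squares problem directly with `np.polyfit`. The sketch below regenerates the data with a seeded generator (an assumption made here for reproducibility) and fits a degree-1 polynomial.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is reproducible
x = np.linspace(-5, 5, 100)
y = 3*x - 5 + rng.standard_normal(100)

# For degree 1, np.polyfit returns the coefficients [a, b] of y = a*x + b,
# highest power first.
a_fit, b_fit = np.polyfit(x, y, 1)
print(a_fit, b_fit)
```

The fitted values should be close to the true parameters (3, -5), and are not limited to the 0.1 grid spacing used by the sweep.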

Linear sweeping is the most naive approach and takes a lot of time: the grid above already evaluates about 40,000 (a, b) pairs. You will learn more advanced optimization techniques (e.g. gradient descent) later in the course.
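
To give a first impression of gradient descent, here is a minimal sketch for the same ```y=ax+b``` model. It is an illustration under assumptions, not the course's reference implementation: the learning rate, the number of steps, the starting point, and the seeded data are all chosen here.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is reproducible
x = np.linspace(-5, 5, 100)
y = 3*x - 5 + rng.standard_normal(100)

a, b = 0.0, 0.0       # starting guess (assumed)
learning_rate = 0.05  # assumed; too large a value makes the updates diverge
for step in range(500):
    residual = a*x + b - y
    # Partial derivatives of the mean squared error with respect to a and b.
    grad_a = 2*np.mean(residual*x)
    grad_b = 2*np.mean(residual)
    # Move a small step against the gradient.
    a -= learning_rate*grad_a
    b -= learning_rate*grad_b
print(a, b)
```

This needs 500 gradient evaluations instead of roughly 40,000 error evaluations, and the result is not restricted to a grid.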
### Exercise
Assume there is a ```y=ax^2+bx+c``` system. Follow the same steps as above:
1. Generate the 1D data (with noise)
2. Use linear sweeping to model it with a ```y=ax+b``` system
3. Plot (line plot) both data (observation/prediction)