---
tags: Python Workshop 沈煒翔
---
# Lesson 9: Recursion, Dynamic Programming, T Test
## Recursion
Recursion (遞迴) is something defined by itself. Let's see an example:
Factorial is defined as
```
n! = 1 x 2 x ... x (n-1) x n
```
Or, we could use a recursive definition

```
factorial(n) = n * factorial(n-1)
```
In Python, we can use functions to design recursion.
```python
# returns the value of n!
def factorial(n):
if n == 1: # this is like the termination condition in while loop
return 1
else:
return n * factorial(n-1)
```
### Exercise
Define a function that can calculate the Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, 21, ...
```python
def fib(n: int) -> int:
pass
print(fib(8)) # >>> 21
print(fib(21)) # >>> 10946
```
## Dynamic programming
In the previous example when calculating Fibonacci sequence, we can break down the overall recursion steps. For example when calculating fib(5),

We can see a lot of repeated work is done, so this is a bad implementation.
A smarter way is to record the previous answers instead of calculate them everytime. ```Memoization``` is to track previously solved solutions so that they aren't recalculated when they are encountered again.
We can store previous ```fib(n)``` results in an array. When results are already computed, we won't compute that again
```python
N = 36
memo = np.zeros((N, )) # list also works
def fib_dp(n: int) -> int:
# if there is pervious result
if memo[n] != 0:
# return previous result from memo
return memo[n]
# compute the fib value
if n == 1:
ans = 1
elif n == 2:
ans = 2
else:
ans = fib_dp(n-1) + fib_dp(n-2)
# store answer into memo
memo[n] = ans
return ans
```
"Memoization" and "Dynamic Programming" are often used interchangeably.
We can see the execution speed of both implementations.
```python
import time
start_time = time.time()
ans1 = fib(35)
print(time.time() - start_time, 'sec') # >>> 3.33 sec
start_time = time.time()
ans2 = fib_dp(35)
print(time.time() - start_time, 'sec') # >>> 0.0 sec
```
Also, for others fibonacci numbers, DP approach can solve extremely quickly because all results are stored
### Exercise
Calculate the 35-th Tribonacci sequence: 1, 1, 1, 3, 5, 9, 17, 31, ...
```python
N = 35
```
## 2D Dynamic Programming
There's a 2D map with different cities, and each city has its cost to walk pass. Assume we can only walk right or down, what is the minimum cost to get to the end?

```python
arrs = np.array(
[[1, 7, 9, 2],
[8, 6, 3, 2],
[1, 6, 7, 8],
[2, 9, 8, 2],]
)
min_costs = np.zeros(arrs.shape)
def find_min_path(i, j):
# previous result from memo
if min_costs[i, j] != 0:
return min_costs[i, j]
# compute
if i==0 and j==0:
ans = arrs[0, 0]
elif i==0:
ans = find_min_path(i, j-1) + arrs[i, j]
elif j==0:
ans = find_min_path(i-1, j) + arrs[i, j]
else:
ans = min(find_min_path(i-1, j), find_min_path(i, j-1)) + arrs[i, j]
# store answer into memo
min_costs[i, j] = ans
return ans
ans = find_min_path(3, 3)
```
### Exercise
How about the maximum path? Also, there's a rock you cannot walk on.

```python
# 0 is the rock
arrs = np.array(
[[1, 7, 9, 2],
[8, 6, 3, 2],
[1, 0, 7, 8],
[2, 9, 8, 2],]
)
```
Solution that includes getting the actual path. We use an additional `arrows` array to record which city we came from to get to the current city.
```python
import numpy as np
arrs = np.array(
[[1, 7, 9, 2],
[8, 6, 3, 2],
[1, 0, 7, 8],
[2, 9, 8, 2],]
)
min_costs = np.zeros(arrs.shape)
arrows = np.zeros(arrs.shape)
def find_max_path(i, j):
# previous result from memo
if min_costs[i, j] != 0:
return min_costs[i, j]
# compute
if i==0 and j==0:
ans = arrs[0, 0]
elif i==0:
arrows[i, j] = 0
ans = find_max_path(i, j-1) + arrs[i, j]
elif j==0:
arrows[i, j] = 1
ans = find_max_path(i-1, j) + arrs[i, j]
else:
ans = max(find_max_path(i-1, j), find_max_path(i, j-1)) + arrs[i, j]
if find_max_path(i-1, j) > find_max_path(i, j-1):
arrows[i, j] = 1
else:
arrows[i, j] = 0
# store answer into memo
min_costs[i, j] = ans
return ans
ans = find_max_path(3, 3)
```

## Two sample T-test
In addition to `numpy`, there is another library in Python to handld scientific computation `scipy`.
It is impossible to memorize all the functions in every library, so you need to look it up whenever you need them.
For example, two sample T test
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html#scipy.stats.ttest_ind
```python
import scipy.stats
import numpy as np
# Creating data groups
data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
17, 16, 14, 19, 20, 21, 15,
15, 16, 16, 13, 14, 12])
data_group2 = np.array([15, 17, 14, 17, 14, 8, 12,
19, 19, 14, 17, 22, 24, 16,
13, 16, 13, 18, 15, 13])
# https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html#scipy.stats.ttest_ind
# Perform the two sample t-test with equal variances
scipy.stats.ttest_ind(a=data_group1, b=data_group2, equal_var=True)
# https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html
# Perform the Mann-Whitney U rank test on two independent samples.
scipy.stats.mannwhitneyu(data_group1, data_group2)
```
### Exercise

There's a 2D maze with some rocks on the way and you cannot walk across rocks. Calculate how many different but meaningful routes a person a take from the starting point to the ending point.
```python
# 0 means road, 1 means rock
maze = np.array(
[[0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0, 0],]
)
```