Lesson 9: Recursion, Dynamic Programming, T Test

--- tags: Python Workshop 沈煒翔 --- # Lesson 9: Recursion, Dynamic Programming, T Test ## Recursion Recursion (遞迴) is something defined by itself. Let's see an example: Factorial is defined as ``` n! = 1 x 2 x ... x (n-1) x n ``` Or, we could use a recursive definition ![](https://i.imgur.com/CSgVobE.png) ``` factorial(n) = n * factorial(n-1) ``` In Python, we can use functions to design recursion. ```python # returns the value of n! def factorial(n): if n == 1: # this is like the termination condition in while loop return 1 else: return n * factorial(n-1) ``` ### Exercise Define a function that can calculate the Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, 21, ... ```python def fib(n: int) -> int: pass print(fib(8)) # >>> 21 print(fib(21)) # >>> 10946 ``` ## Dynamic programming In the previous example when calculating Fibonacci sequence, we can break down the overall recursion steps. For example when calculating fib(5), ![](https://i.imgur.com/kP0Gcs1.png) We can see a lot of repeated work is done, so this is a bad implementation. A smarter way is to record the previous answers instead of calculate them everytime. ```Memoization``` is to track previously solved solutions so that they aren't recalculated when they are encountered again. We can store previous ```fib(n)``` results in an array. When results are already computed, we won't compute that again ```python N = 36 memo = np.zeros((N, )) # list also works def fib_dp(n: int) -> int: # if there is pervious result if memo[n] != 0: # return previous result from memo return memo[n] # compute the fib value if n == 1: ans = 1 elif n == 2: ans = 2 else: ans = fib_dp(n-1) + fib_dp(n-2) # store answer into memo memo[n] = ans return ans ``` "Memoization" and "Dynamic Programming" are often used interchangeably. We can see the execution speed of both implementations. ```python import time start_time = time.time() ans1 = fib(35) print(time.time() - start_time, 'sec') # >>> 3.33 sec start_time = time.time() ans2 = fib_dp(35) print(time.time() - start_time, 'sec') # >>> 0.0 sec ``` Also, for others fibonacci numbers, DP approach can solve extremely quickly because all results are stored ### Exercise Calculate the 35-th Tribonacci sequence: 1, 1, 1, 3, 5, 9, 17, 31, ... ```python N = 35 ``` ## 2D Dynamic Programming There's a 2D map with different cities, and each city has its cost to walk pass. Assume we can only walk right or down, what is the minimum cost to get to the end? ![](https://hackmd.io/_uploads/H1HCchsvo.png) ```python arrs = np.array( [[1, 7, 9, 2], [8, 6, 3, 2], [1, 6, 7, 8], [2, 9, 8, 2],] ) min_costs = np.zeros(arrs.shape) def find_min_path(i, j): # previous result from memo if min_costs[i, j] != 0: return min_costs[i, j] # compute if i==0 and j==0: ans = arrs[0, 0] elif i==0: ans = find_min_path(i, j-1) + arrs[i, j] elif j==0: ans = find_min_path(i-1, j) + arrs[i, j] else: ans = min(find_min_path(i-1, j), find_min_path(i, j-1)) + arrs[i, j] # store answer into memo min_costs[i, j] = ans return ans ans = find_min_path(3, 3) ``` ### Exercise How about the maximum path? Also, there's a rock you cannot walk on. ![](https://hackmd.io/_uploads/SkYYZaovs.png) ```python # 0 is the rock arrs = np.array( [[1, 7, 9, 2], [8, 6, 3, 2], [1, 0, 7, 8], [2, 9, 8, 2],] ) ``` Solution that includes getting the actual path. We use an additional `arrows` array to record which city we came from to get to the current city. ```python import numpy as np arrs = np.array( [[1, 7, 9, 2], [8, 6, 3, 2], [1, 0, 7, 8], [2, 9, 8, 2],] ) min_costs = np.zeros(arrs.shape) arrows = np.zeros(arrs.shape) def find_max_path(i, j): # previous result from memo if min_costs[i, j] != 0: return min_costs[i, j] # compute if i==0 and j==0: ans = arrs[0, 0] elif i==0: arrows[i, j] = 0 ans = find_max_path(i, j-1) + arrs[i, j] elif j==0: arrows[i, j] = 1 ans = find_max_path(i-1, j) + arrs[i, j] else: ans = max(find_max_path(i-1, j), find_max_path(i, j-1)) + arrs[i, j] if find_max_path(i-1, j) > find_max_path(i, j-1): arrows[i, j] = 1 else: arrows[i, j] = 0 # store answer into memo min_costs[i, j] = ans return ans ans = find_max_path(3, 3) ``` ![](https://hackmd.io/_uploads/BJNoxw3ws.png) ## Two sample T-test In addition to `numpy`, there is another library in Python to handld scientific computation `scipy`. It is impossible to memorize all the functions in every library, so you need to look it up whenever you need them. For example, two sample T test https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html#scipy.stats.ttest_ind ```python import scipy.stats import numpy as np # Creating data groups data_group1 = np.array([14, 15, 15, 16, 13, 8, 14, 17, 16, 14, 19, 20, 21, 15, 15, 16, 16, 13, 14, 12]) data_group2 = np.array([15, 17, 14, 17, 14, 8, 12, 19, 19, 14, 17, 22, 24, 16, 13, 16, 13, 18, 15, 13]) # https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html#scipy.stats.ttest_ind # Perform the two sample t-test with equal variances scipy.stats.ttest_ind(a=data_group1, b=data_group2, equal_var=True) # https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html # Perform the Mann-Whitney U rank test on two independent samples. scipy.stats.mannwhitneyu(data_group1, data_group2) ``` ### Exercise ![](https://i.imgur.com/NX0v2M5.png) There's a 2D maze with some rocks on the way and you cannot walk across rocks. Calculate how many different but meaningful routes a person a take from the starting point to the ending point. ```python # 0 means road, 1 means rock maze = np.array( [[0, 0, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 1], [0, 0, 0, 0, 1, 0, 0],] ) ```