# Physics 325 Scientific Computing
# Week 1: Introduction to Python
### Good study habits
* allocate time for studying & doing homework
* space out your studying
* recall instead of re-read
* the importance of hand-writing
* the Pomodoro method
### Syllabus
* hand-written class notes
* homework assignments
* lab assignments
* exams
* final project
### Reviewing the basics
[A good resource: W3 Schools](https://www.w3schools.com/python/default.asp)
* variables
* operators & precedence (PEMDAS): `a/b/c`
* exponent (power): right to left `a**b**c`
* modulo (remainder) `7 % 3`
* comments
* print
* strings
* list
* round parentheses (), square brackets [], curly braces {}
* scientific notation
* list and indexing
* range()
### Laptop exercises
[Google Colab](https://colab.research.google.com/)
* print "Hello World!"
* verify the precedence of `/` and `*`
* print the integers from 0 to 10
### Functions
* you can use it like a math function (returning a number)
* the "argument" is like a placeholder
* you can do something e.g. print
* inputs and outputs are optional
* use 4 spaces
```
def f(x):
    return x**2
def f(x, y):
    return x+y, x*y
```
### if-else statements
```
mass = -3
if mass < 0:
    print("mass cannot be negative") # 4 spaces
elif mass == 0:
    print("massless")
else:
    print("mass is positive")
```
### for-loops
* loop through a list of integers, floats, strings
```
for i in range(10):
    print(i) # 4 spaces
for i in range(0, 101, 50):
    print(i**2)
for particle in ['electron', 'proton', 'neutron']:
    print('the particle zoo includes', particle)
```
### Laptop exercises
* write a function $f(x) = x^{3/2}$
* scientific notation
* $6.63\times10^{-34}$
* $3\times10^8$
* $10^{13}$
### Pencil-paper exercises
* $10 ^\circ = \ ? \ \ \mbox{radian}$
* $\sin(30^\circ) = \ ?$
# Week 2: Numpy and Matplotlib
### Python list indexing
`fruit_list = ['apple', 'banana', 'kiwi']`
`price_list = [0.2, 0.5, 0.8]`
* how to get the first element?
* how to get the second element?
* how to get the last element?
`import numpy as np`
### The Numpy library
* np.pi
* np.exp()
* a numpy array
* np.arange()
* Indexing
### Numpy array
* `np.array([0.1, 0.2, 0.3, 0.4, 0.5])`
* `np.arange(0.1, 0.6, 0.1)`
* rule: start, end, step; the end is excluded
* useful if we know the step size we need.
* `np.linspace(0.1, 0.5, 5)`
* rule: (start, end, number of elements)
* useful if we know the number of points we need.
### Practice
generate `np.array([0.3, 0.5, 0.7, 0.9, 1.1])`
* using np.arange()
* using np.linspace()
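One possible solution (a sketch; both calls should give the same five values):

```python
import numpy as np

# np.arange: the end (1.2) is chosen slightly above 1.1 so 1.1 is included
a = np.arange(0.3, 1.2, 0.2)
# np.linspace: 5 evenly spaced points from 0.3 to 1.1, endpoints included
b = np.linspace(0.3, 1.1, 5)
print(a)
print(b)
```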
### Python `range` vs. `np.arange`
`range(start, stop, step)` only takes integers
e.g. `range(0, 11, 2)`
`np.arange(start, stop, step)` can take floats
e.g. `np.arange(0.1, 0.91, 0.2)`
### Numpy array: broadcasting
Using two numbers:
```
m = 50
v = 5
print(0.5 * m * v**2)
```
Using two arrays:
```
import numpy as np
m_arr = np.array([50, 100]) # don't forget []
v_arr = np.array([5, -3])
print(m_arr * v_arr)
kinetic_energy_arr = 0.5 * m_arr * v_arr**2
print(kinetic_energy_arr)
```
### Using numpy array in a function
```
import numpy as np
def f(x):
    return x**2
# you can put in a value
x = 0.5
y = f(x)
print(y)
# or you can put in a numpy array
x_arr = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
y_arr = f(x_arr)
print(y_arr)
```
### Using numpy array in a function & make a plot
```
import numpy as np
import matplotlib.pyplot as plt
def f(x):
    return x**2
# you can put in a number
x = 0.5
y = f(x)
print(y)
# or you can put in a numpy array
x_arr = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
y_arr = f(x_arr)
print(y_arr)
plt.plot(x_arr, y_arr) # make a plot!
```
### Jupyter tricks
* Go to *Runtime* and click *restart and run all* before submitting homework
* indenting a block of text
* commenting out a block of text
### Practice
$f(x) = e^{-x^2}$
1. How to write this as a Python function?
2. How to generate an array 0, 0.1, 0.2, 0.3, …., 10 (x_arr = …)?
3. How to plot this function for x from 0 to 10?
### Practice
How to calculate $\sum_{n=1}^{100} n^2$?
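One way to do it (a sketch); the closed form $\frac{100\cdot101\cdot201}{6}=338350$ is a good check:

```python
import numpy as np

n = np.arange(1, 101)   # the integers 1 through 100
total = np.sum(n**2)    # square each element, then sum
print(total)
```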
### Numpy array:
* array filtering
* find min and max of an array
* array sorting
Example: We have a table of the 50 states' population, area, etc.
* We want to find all the states with population > 5 million
* We want to find the states with the largest population
* We want to sort the states by population (high to low)
### array filtering
```
x = 7
print(x < 9)
# answer: True
```
```
import numpy as np
x_arr = np.array([7, 8, 9, 10])
filter_arr = (x_arr < 9)
print(filter_arr)
# answer: [True True False False]
x_arr_filtered = x_arr[filter_arr]
print(x_arr_filtered)
# answer: [7 8]
```
In this example, we have the heights of 7 people.
```
import numpy as np
x_arr = np.array([35, 46, 43, 10, 23, 58, 60])
```
* Write a filter array that selects the numbers greater than 42. Print them.
* Write a filter array that selects even numbers (hint: use %2). Print them.
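A possible solution sketch for both bullets:

```python
import numpy as np

x_arr = np.array([35, 46, 43, 10, 23, 58, 60])

tall = x_arr[x_arr > 42]       # boolean filter: keep values greater than 42
print(tall)                    # [46 43 58 60]

even = x_arr[x_arr % 2 == 0]   # modulo filter: keep the even values
print(even)                    # [46 10 58 60]
```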
### Filtering two arrays of the same length
```
import numpy as np
x_arr = np.array([7, 8, 9, 10])
name_arr = np.array(['Alice', 'Bob', 'Carol', 'Dan'])
filter_arr = (x_arr < 9)
print(filter_arr)
# answer: [True True False False]
x_arr_filtered = x_arr[filter_arr]
name_arr_filtered = name_arr[filter_arr]
print(name_arr_filtered)
# answer ['Alice' 'Bob']
```
### Find the maximum: `np.max()`, `np.argmax()`
```
import numpy as np
x_arr = np.array([7, 8, 9, 10])
name_arr = np.array(['Alice', 'Bob', 'Carol', 'Dan'])
print(np.max(x_arr)) # 10
index_max = np.argmax(x_arr)
print(index_max) # 3
print(name_arr[index_max]) # 'Dan'
```
### Numpy array sorting: np.sort() and np.argsort()
```
import numpy as np
x_arr = np.array([1.2, 1.1, 1.3])
print(np.sort(x_arr))
# [1.1 1.2 1.3]
index_sort = np.argsort(x_arr)
print(index_sort)
# [1 0 2]
print(x_arr[index_sort])
```
By default, sorting is in ascending order (small to large).
### Laptop Practice
```
import numpy as np
name_arr = np.array(['Abby', 'Ernie', 'Elmo', 'Oscar', 'Bert', 'Grover', 'Big Bird'])
height_arr = np.array([35, 46, 43, 10, 23, 58, 60])
```
* Write a filter array that selects the heights greater than 42.
* Print those heights.
* Print the names of the characters with heights greater than 42.
* Print the name of the tallest character
* Print the name of the shortest character
* Print the names of the characters from the shortest to the tallest.
* Reverse the order: from the tallest to the shortest
# Week 3: Numpy and Matplotlib (cont'd)
### Lab recap
```
import numpy as np
sat_scores = np.array([1440, 1256, 1543, 1043, 989, 1412, 1343])
students = np.array(["Elmo", "Big Bird", "Abby", "Bert", "Grover", "Count von Count", "Cookie Monster"])
```
Print out students' names sorted by their scores, from the highest to the lowest.
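One way to do the recap (a sketch): sort the indices with `np.argsort`, then reverse them for high-to-low:

```python
import numpy as np

sat_scores = np.array([1440, 1256, 1543, 1043, 989, 1412, 1343])
students = np.array(["Elmo", "Big Bird", "Abby", "Bert", "Grover",
                     "Count von Count", "Cookie Monster"])

order = np.argsort(sat_scores)[::-1]   # indices that sort scores high to low
print(students[order])
```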
### Lapboard Practice
* How to type $10^{11}$ in Python?
* How to calculate $\sum_{n=1}^{100} n$?
* What are the following?
```
np.arange(3)
np.arange(3, 8)
np.arange(3, 8, 2)
np.arange(8, 3, -2)
```
* Which one is different from the other two?
1. `1/x*(1+x)`
2. `1/(x*(1+x))`
3. `1/x/(1+x)`
### Array indexing and slicing
`fruit_list = ["apple0", "banana1", "cherry2", "durian3", "eggfruit4", "fig5", "grape6", "honeydew7"]`
* rule: [start: end: step]
* `fruit_list[7]` get element 7
* `fruit_list[:7]` start from 0 and end before 7
* `fruit_list[2:]` start from 2 and all the way to the end
* `fruit_list[2:7]` start from 2 and end before 7
* `fruit_list[2:7:2]` start from 2 and end before 7, with an increment of 2
* `fruit_list[5]` ?
* `fruit_list[1:7:3]` ?
* `fruit_list[:]` ?
* `fruit_list[::-1]` ?
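A quick check of a couple of these (a sketch):

```python
fruit_list = ["apple0", "banana1", "cherry2", "durian3",
              "eggfruit4", "fig5", "grape6", "honeydew7"]

print(fruit_list[2:7:2])   # elements 2, 4, 6
print(fruit_list[::-1])    # the whole list, reversed
```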
### Scope of variables
```
import numpy as np
a = 3
def f(x):
    return 1/(1+(x/a)**2)
x = np.arange(10)
print(f(x))
```
* What's the difference between the x inside `def f(x)` and the x defined outside? (Note: better to call the outer one something else, e.g. x_arr)
* What are the roles of a vs. the x inside the function? (You can learn more [here](https://realpython.com/python-namespaces-scope/#variable-scope))
### 2D arrays (e.g., matrices, data tables)
Rule: row by column
`arr = np.array([ [1, 2], [3, 4] ])`
* How to get the number 1 (element 0,0)
* How to get the first row (row 0)
* How to get the first column
`arr = np.array([[1, 2, 3], [4, 5, 6]])`
* arr[0, 0] # row, column
* arr[:, 0]
* arr[1, :]
* arr[0] same as arr[0,:]
* arr[0:1]
* np.transpose(arr)
* np.sum(arr)
* np.sum(arr, axis=0)
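The bullets above can be checked directly (a sketch):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[0, 0])            # element at row 0, column 0
print(arr[:, 0])            # the first column
print(arr[1, :])            # the second row
print(np.transpose(arr))    # shape changes from (2, 3) to (3, 2)
print(np.sum(arr))          # sum over all elements
print(np.sum(arr, axis=0))  # sum down each column
```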
#### Numpy: write calculation results in a file
```
import numpy as np
x_arr = np.arange(-3, 3.1, 0.1)
def gauss(x):
    return np.exp(-0.5*x**2)/np.sqrt(2*np.pi)
y_arr = gauss(x_arr)
np.savetxt('gauss_function.dat', np.transpose([x_arr, y_arr]),
           '%g', header="x, y")
```
#### Read the data back in from a file
```
import numpy as np
import matplotlib.pyplot as plt
data_in = np.loadtxt('gauss_function.dat')
# np.genfromtxt does the same
x_in = data_in[:, 0] # array for the first column
y_in = data_in[:, 1] # array for the second column
# Let's make a plot
plt.plot(x_in, y_in)
plt.show()
```
For Google Colab you'll need to upload the file from your laptop
```
from google.colab import files
uploaded = files.upload()
```
### Formatted string (superseding the % or .format method)
Example: print pi (3.1415926), 2 decimal places
`print(f"{np.pi:.2f}")`
It also works for a few numbers
`print(f"{np.pi:.2f} {np.sqrt(2):.2f}")`
It can mix numbers and strings
```
particle = 'electron'
mass = 9.10938356e-31
print(f"The {particle} mass is {mass:e} kg.")
```
Number formatting ([reference](https://mkaz.blog/working-with-python/string-formatting/))
| Number | Format | Output | Description |
| ----------- | ----------- |----------- | ----------- |
|3.1415926 | f | 3.141593 | 6 decimal places |
|3.1415926 |.2f |3.14|2 decimal places|
|3.1415926 |g|3.14159|6 significant digits (dropping 0s after the decimal point) |
|3.1415926 |.2g |3.1 |2 significant digits |
|$6.62607004\times10^{-34}$ | e |6.626070e-34|Scientific notation, with 6 decimal places |
|$6.62607004\times10^{-34}$ |.2e |6.63e-34|Scientific notation, with 2 decimal places |
### Practice
* print `np.e` with 4 significant digits
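A sketch using the `g` format from the table above:

```python
import numpy as np

# .4g keeps 4 significant digits
print(f"{np.e:.4g}")
```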
### Lab 3: COVID data
```
import numpy as np
data = np.loadtxt('covid_ada_2023.txt')
days = data[:, 0] # first column
cases = data[:, 1] # second column
# What are these?
print(data.shape)
print(len(data))
print(days.shape)
print(len(days))
```
Array slicing
```
import numpy as np
data = np.loadtxt('covid_ada_aug30_2022.txt')
days = data[:, 0] # first column
cases = data[:, 1] # second column
# How to get the elements 0 to 9 for cases?
# How to get elements 1 to 7 for cases?
```
day 1: `cases[1] - cases[0]`
day 2: `cases[2] - cases[1]`
...
day i: `cases[i] - cases[i-1]`
...
day 899: `cases[899] - cases[898]`
You can write a for-loop. You'll need "append" to form an array for the new cases.
Number of new cases vs. day
| day # | today | yesterday |
| ----- | ----- | --------- |
| day 1 | n1 | n0 |
| day 2 | n2 | n1 |
| day 3 | n3 | n2 |
| ... | ... | ... |
| day 1000 | n1000 | n999 |
```
# array slicing
new_cases = cases[1:] - cases[0:-1]
days_new_cases = days[1:] # cut Day 0
plt.plot(days_new_cases, new_cases)
# you can also use np.diff
np.diff(cases)
```
7-day moving average
Day 3, we calculate the average of Days 0, 1, 2, 3, 4, 5, 6
Day 4, we calculate the average of Days 1, 2, 3, 4, 5, 6, 7
...
Day i, we calculate the average of Days i-3, i-2, i-1, i, i+1, i+2, i+3
...
Day 896, we calculate the average of Days 893, 894, 895, 896, 897, 898, 899
That is, for each day, we "slice" out 7 days around that day
```
days_7day = []
mean_7day = []
ndays_new_cases = len(new_cases)
for i in range(3, ndays_new_cases-3): # if ndays=7, use range(3,4)
    days_7day.append(days[i])
    mean = np.mean(new_cases[i-3:i+4]) # if day 3: new_cases[0:7]
    mean_7day.append(mean)
plt.plot(days_7day, mean_7day)
```
Plot the raw data and the 7-day moving average
```
plt.plot(days_new_cases, new_cases, color='gray', alpha=0.5, label='raw data') # transparent gray line, with legend
plt.plot(days_7day, mean_7day, label='7-day average')
plt.xlabel('days')
plt.ylabel('daily new cases')
plt.legend() # put this last
```
# Week 4:
### How to add axis labels, legends, horizontal lines, vertical lines, plot title
### Python style guide
[PEP 8](https://peps.python.org/pep-0008/) is the industry standard
* For variables, functions, and file names, use all lower cases with underscores, e.g., `angular_momentum`, `mass_in_kg`, `simple_harmonic_oscillators.py`, etc.
* Use 4 spaces for indentation.
```
import this
```
### Plotting a user-defined function
#### Method 1: (for functions that can take arrays)
```
import numpy as np
import matplotlib.pyplot as plt
def f(x):
    return np.exp(-x**2)
x_arr = np.linspace(-10, 10, 50)
y_arr = f(x_arr)
plt.plot(x_arr, y_arr)
```
#### Method 2: for-loop (for functions that cannot take arrays)
Note: Some functions (like this one) only take scalars & don't take arrays.
```
def f(x):
    if x > 0:
        return np.exp(-x**2)
    else:
        return 0
x_arr = np.linspace(-10, 10, 50)
# y_arr = f(x_arr) # can't do this
y_arr = []
for x in x_arr:
    y_arr.append(f(x))
y_arr = np.array(y_arr) # make it a numpy array
plt.plot(x_arr, y_arr)
```
### Laptop practice:
Plot $f(x) = \frac{1}{1+x^2}$, for x between -10 and 10, 100 sample points
1. using the "numpy array" method
2. using the "for-loop" method
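A sketch of both methods side by side (they should agree):

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return 1 / (1 + x**2)

# Method 1: pass the whole array at once
x_arr = np.linspace(-10, 10, 100)
y_arr = f(x_arr)

# Method 2: build the y values one point at a time
y_loop = []
for x in x_arr:
    y_loop.append(f(x))
y_loop = np.array(y_loop)

plt.plot(x_arr, y_arr)
plt.plot(x_arr, y_loop, '--')
```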
### np.linspace vs. np.logspace
`lin_arr = np.linspace(0.1, 1, 10)`
`log_arr = np.logspace(-1, 0, 11)`
Note: np.logspace(a, b, n) is the same as 10**np.linspace(a, b, n)
### Log-Log plot
$f(x) = \frac{1}{x(1+x^2)}$
* write a python function for this f(x)
* make a log-spaced array called x_arr: from $10^{-3}$ to 10, 100 elements, using `np.logspace()`
* evaluate y_arr = f(x_arr)
* plot y_arr vs. x_arr
* make the axes log-scale by adding `plt.xscale('log')` and `plt.yscale('log')`
### Deep copy and shallow copy
If you want to create two separate arrays with the same values:
```
# Incorrect:
a = np.array(10)
b = a # a and b will affect each other
# Correct:
import copy
a = np.array(10)
b = copy.deepcopy(a) # a and b won't affect each other
```
Guess what will happen and why?
```
# case 1 (shallow copy)
a = np.array([81, 82, 83])
b = a
b[0] = 5
print(a)
# case 2 (deep copy)
a = np.array([81, 82, 83])
import copy
b = copy.deepcopy(a)
b[0] = 5
print(a)
```
Are they the same object, or different objects with the same value?
```
# shallow copy
a = np.array(10)
b = a # a and b will affect each other
print(a is b)
print(a == b)
# deep copy
import copy
a = np.array(10)
b = copy.deepcopy(a) # a and b won't affect each other
print(a is b)
print(a == b)
```
## Numerical accuracies
### Floating point arithmetic (laptop)
Use f-string, print 0.1 with 17 decimal places
What is the answer?
Is this a bug in Python?
### Decimal (base 10) vs. binary (base 2)
Humans use decimal fractions:
$$
\frac{1}{8} = 10^{-1} + 2\times10^{-2} + 5\times10^{-3}
= 0.125_{10}
$$
Computers use binary fractions
$$
\frac{1}{8} = 2^{-3} = 0.001_{2}
$$
Not all decimal fractions can be expressed exactly using binary fractions.
The decimal fraction can never express 1/3 exactly, no matter how many digits we use:
$$
\frac{1}{3} = 0.3333333333....
$$
Similarly, the binary fraction can never express 1/10 exactly:
0.1 in binary:
$$
0.0001100110011001100110011001100110011001100110011..._2
$$
What is the difference between 1e400 and `10**400`?
Hint: they are the same for humans.
```
print(1e400)
print(10**400)
```
### Float = floating point number
Basically, it's a number with a decimal point (as opposed to an integer).
Any number using scientific notation (e.g. 3e8, 1e-9) is a float.
What is the biggest float your computer can handle? (Beyond that, it will give you inf, infinity.)
How many significant digits?
In Python, a float has 64 bits (historically it's called a double-precision float)
* 1 bit for +/- sign
* 11 bits for exponent (positive and negative): $2^{\pm 1024} \sim 10^{\pm 308}$
* 52 bits for fraction: $2^{-52} \sim 10^{-16}$ (16 significant digits)
Therefore, the largest/smallest floats are $10^{\pm 308}$.
Each float has 16 significant figures.
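These limits can be verified with `np.finfo` (a sketch):

```python
import numpy as np

info = np.finfo(np.float64)
print(info.max)   # largest float, about 1.8e308
print(info.eps)   # machine epsilon, about 2.2e-16 (the 16-significant-digit limit)
```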
### Rounding errors
A floating point number has 16 significant digits
Anything beyond the 16 significant digits is a numerical error.
`print(1e20 + 0.1 - 1e20)`
What do you get? Why?
### Rounding errors revisited
```
a = 12345678912345678.
b = 12345678912345677.
print(a - b)
```
```
a = 0.1 + 0.1 + 0.1 # this could be some very complicated calculation
b = 0.3
print(a == b)
```
Print a using f-string, show 20 significant digits. What is the answer, and why?
```
print(f'{a:.20f}')
'0.30000000000000004441'
print(f'{b:.20f}')
'0.29999999999999998890'
```
### We should never compare two floats using ==
We can (1) ask if the difference is less than a tiny number, or (2) use `np.isclose()`
```
a = 0.1 + 0.1 + 0.1 # this could be some very complicated calculation
b = 0.3
print(a == b) # wrong
tolerance = 1e-6
print(abs(a-b) < tolerance)
print(np.isclose(a, b))
```
# Week 5: Integration
### Lapboard practices
* Create an array: $10^1, 10^{1.1}, 10^{1.2}, ... 10^{2}$
* Plot $f(x)=\frac{1}{x(1+x)}$, for x between 0.1 and 10, using log-log plot
* f(x) is a complicated function that does not take in an array. Write a for-loop to plot this function between -5 and 5.
* print G = 6.67430e-11 with scientific notation with 3 decimal places, and the string "The gravitational constant is..."
* generate an array from 0 to 2$\pi$ (including or exceeding 2$\pi$)
* 100 evenly-spaced numbers, using `np.linspace`
* in intervals of 0.1, using `np.arange`
* Calculate this by hand
$$
f(x) = x^4 - 2 x + 1 \\
\int_0^2 f(x) dx
$$
(The answer is 4.4)
### Integration
* Numerical integration
* Riemann sum & mid-point
* Trapezoidal method
* Simpson's method
### Numerical Integration (area under the curve)
Sometimes we have an analytical function e.g. $f(x) = e^{-x^2}$, and sometimes we have a curve made of sample points $(x_i, f(x_i))$ from experiments or some complicated calculations. We would like to find the area under the curve.
### Riemann Sum
Basic idea: evaluate f(x) at various sample points and sum them up.
Simplest: Riemann sum (h -> 0)
`(f(a) + f(a+h) + f(a+2h) + … + f(b-h)) * h`
We can do better with midpoints
`(f(a+0.5*h) + f(a+1.5*h) + f(a+2.5*h) + … + f(b-0.5*h))*h`
### Riemann Sum
Use `np.array` and `np.sum` to calculate `(f(a) + f(a+h) + f(a+2h) + … + f(b-h)) * h`
with `a=0`, `b=2`, `h=0.01`
1. define f(x)
2. create an array between 0, 1.99, with step size h=0.01
3. evaluate f(x) at these points
4. sum over them, multiply by h
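The four steps above, sketched in code (using $f(x) = x^4 - 2x + 1$ from the earlier practice, whose exact integral is 4.4):

```python
import numpy as np

def f(x):
    return x**4 - 2*x + 1

a, b, h = 0, 2, 0.01
x_arr = np.arange(a, b, h)       # sample points 0, 0.01, ..., 1.99 (b excluded)
riemann = np.sum(f(x_arr)) * h   # left Riemann sum
print(riemann)
```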
### Midpoint
Repeat for $(f(a+0.5*h) + f(a+1.5*h) + f(a+2.5*h) + … + f(b-0.5*h))*h$
create an array between 0+0.005 and 1.99+0.005, with step size `h=0.01`
Hint: what's the first and last values of the array?
### trapezoidal method
Instead of calculating the values at midpoints, we can use the average of the boundary:
$0.5*h*(f(a)+f(a+h)) + … + 0.5*h*(f(b-h)+f(b))$
This is effectively summing up the area of these narrow trapezoids
Observe this series:
$0.5*h*(f(a)+f(a+h)) + 0.5*h*(f(a+h)+f(a+2h)) + … + \\
0.5*h*(f(b-2h)+f(b-h)) + 0.5*h*(f(b-h)+f(b))$
Let's collect the repeated terms. The result is:
$0.5*h*(f(a) + f(b)) + h*(f(a+h)+f(a+2h)+...+f(b-2h)+f(b-h) )$
Practice this summation:
1. create an array between a+h, … , b-h (need to have the last point)
2. sum over f(x_arr), multiply by h
3. add the extra term: `0.5*h*(f(a) + f(b))`
### np.trapz
1. create an array between 0, 2, with step size h=0.01 (including 2)
2. evaluate f(x) at these points, call it y_arr
3. np.trapz(y_arr, x_arr)
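The three steps sketched in code (same $f$ as before; note that `np.trapz` was renamed `np.trapezoid` in NumPy 2.0, so the sketch accepts either name):

```python
import numpy as np

# np.trapz became np.trapezoid in NumPy 2.0; this line works with either version
trapz = getattr(np, "trapezoid", None) or np.trapz

def f(x):
    return x**4 - 2*x + 1

h = 0.01
x_arr = np.arange(0, 2 + h/2, h)   # the extra h/2 makes sure the endpoint 2 is included
y_arr = f(x_arr)
result = trapz(y_arr, x_arr)
print(result)                      # exact answer: 4.4
```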
### scipy.integrate.simpson
```
from scipy import integrate
integrate.simpson(y, x)
#or
integrate.simps(y, x)
```
### Practice
$$
f(x) = e^{-x^2} \\
\int_0^2 f(x) dx
$$
### Numerical errors of integration
$$
f(x) = x^4 - 2x +1
$$
$$
\int_0^2 f(x) dx
$$
Use `np.trapz`, with h = $10^{-1}$, $10^{-2}$, and $10^{-3}$.
What are the fractional numerical errors (absolute value of the difference between numerical and actual results, divided by the actual value) for each h?
### Errors for Trapz ($h^2$)
$$
\epsilon = \frac{1}{12}h^2 [f'(a) - f'(b)]
$$
(Euler-Maclaurin formula. See Newman 5.2 for derivation)
When h is 10x smaller, the error becomes 100x smaller.
```
h=0.1 , frac error = 6.06e-03
h=0.01 , frac error = 6.06e-05
h=0.001 , frac error = 6.06e-07
```
### Errors for Simpson ($h^4$)
Repeat for `scipy.integrate.simpson`
$$
\epsilon = \frac{1}{90}h^4[f'''(a)-f'''(b)]
$$
When h is 10x smaller, the error becomes 10000x smaller.
```
h=0.1 , frac error = 6.06e-06
h=0.01 , frac error = 6.06e-10
h=0.001 , frac error = 6.04e-14
```
### Estimating errors for Trapz if we don't have $f'(x)$
Trick: doubling the number of slices
Use $N_1$ slices and get result $I_1$, and then use $N_2 = 2 N_1$ slices and get result $I_2$.
The error of $I_2$ is given by
$$
\epsilon_2 = \frac{1}{3}(I_2 - I_1)
$$
(see Newman 5.2 for derivation)
### Practice error estimation (use np.trapz)
$$
f(x) = x^4 - 2x +1\\
\int_0^2 f(x) dx
$$
1. Use 10 slices (11 sample points) to calculate the integration, call it $I_1$
2. Use 20 slices (21 sample points) to calculate the integration, call it $I_2$
3. Then, estimate the error using $\epsilon_2 = \frac{1}{3}(I_2 - I_1)$
Compare it with the true error of $I_2$ (we know the exact answer is 4.4).
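A sketch of the doubling estimate, using Newman's $\epsilon_2 = \frac{1}{3}(I_2 - I_1)$ (the trapezoid routine was renamed `np.trapezoid` in NumPy 2.0, so the sketch accepts either name):

```python
import numpy as np

trapz = getattr(np, "trapezoid", None) or np.trapz

def f(x):
    return x**4 - 2*x + 1

a, b = 0, 2
x1 = np.linspace(a, b, 11)       # 10 slices
x2 = np.linspace(a, b, 21)       # 20 slices
I1 = trapz(f(x1), x1)
I2 = trapz(f(x2), x2)

error_estimate = (I2 - I1) / 3   # estimated error of I2
true_error = 4.4 - I2            # here we happen to know the exact answer
print(error_estimate, true_error)
```

The two numbers should agree to leading order; they differ only by the $O(h^4)$ terms.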
### Keep doubling the slices: Romberg (1955) method
We start with Trapz with N slices and get numerical result $I_1$.
We use Trapz again with 2N slices and get the numerical result $I_2$.
Previous method: $I_2$ itself has a numerical error
$
\frac{1}{3}(I_2 - I_1) = O(h^2)
$
By combining $I_1$ and $I_2$, we can get even more accurate answers than $I_2$.
We can keep repeating this process (doubling the number of slices and combining it with the previous iteration).
This is called Richardson extrapolation.
`scipy.integrate.romb` requires `2**k + 1` evenly spaced sample points.
`scipy.integrate.romberg` takes a user-defined function.
$$
I = I_2 + \frac{1}{3}(I_2 - I_1) + O(h^4)
$$
### Practice: scipy.integrate.romberg
Use `scipy.integrate.romberg` to calculate
$$
f(x) = x^4 - 2x +1\\
\int_0^2 f(x) dx
$$
(Note: The syntax is different from trapz and simps!)
### Practice:
using romberg to evaluate this integration
$$
f(x) = e^{-x^2} \\
\int_0^2 f(x) dx
$$
```
from scipy import integrate
integrate.romberg(f, a, b)
```
### lambda in Python: shorthand for defining a function
```
f = lambda x: x**2
# f is the same as the following:
def g(x):
    return x**2
print(f(1.5))
print(g(1.5))
```
```
# two arguments
f = lambda x, y: x**2+y**2
# f is the same as the following:
def g(x, y):
    return x**2+y**2
print(f(1.5, 2.5))
print(g(1.5, 2.5))
```
### Practice lambda
$f(x) = x^2$
Use `lambda`, make a plot for x between 0 and 10
# Week 6: Integration (cont'd)
### Practice
1. $f(x)$ is a function. Use `np.trapz` method to integrate $f(x)$ between -1 and 1
2. Repeat using `scipy.integrate.quad`
3. How to code $g(x) = \int_0^{x} f(t) dt$ (use quad)
4. How to plot $g(x)$ between [0,2] using a for-loop?
### Gaussian quadrature
Recall: trapezoidal method
$0.5*h*(f(a) + f(b)) + h*(f(a+h)+f(a+2h)+...+f(b-2h)+f(b-h) )$
All integration methods follow this principle
$$
\int_a^b f(x) dx \approx \sum_{k=1}^{N} w_k f(x_k) ~,
$$
where $x_k$ are our sample points.
Simpson's rule approximates the function locally with a 2nd-degree polynomial.
But why stop there? How about 3rd degree, 4th degree, ...?
How about (N-1)th degree? Here N is the number of sample points.
But we don't have to space the sample points evenly.
In this case, the sample points need to be chosen strategically (the roots of Legendre polynomials).
For a given N, there's an optimal way to choose sample points & give them weights. Python has a library for that.
Gauss showed that this is equivalent to fitting a polynomial of degree 2N-1.
`scipy.integrate.quadrature` takes in a user-defined function.
### Practice: scipy.integrate.quadrature
Use `scipy.integrate.quadrature` to calculate
$$
f(x) = e^{-x^2} \\
\int_0^2 f(x) dx
$$
### scipy.integrate.quad (most widely used)
Use `scipy.integrate.quad` to calculate the same integration.
The return has two values.
### There are many more methods [(see scipy documentation)](https://docs.scipy.org/doc/scipy/reference/integrate.html)
All these methods are used in one of the two ways:
1. pass two arrays: `y_arr` and `x_arr`
e.g. `integrate.simps(y_arr, x_arr)`
(using dx is error-prone and is discouraged.)
2. pass the integrand, lower limit, upper limit
e.g. `integrate.quad(func, a, b)`
### Practice (use quad)
$$
f(t) = e^{-t^2} \\
E(x) = \int_0^x f(t) dt
$$
The result can't be expressed analytically. It's called an error function.
1. Define $f(t) = e^{-t^2}$ and do this integration for t = 0 to 1
2. Write a function, E(x), with argument x and that returns the integration result.
3. Plot E(x) vs. x, for x from 0 to 3, 100 points (use the method in this slide)
### repeat with np.trapz
```
# step 1
import numpy as np
import matplotlib.pyplot as plt
def f(t):
    return np.exp(-t**2)
t_arr = np.linspace(0, 1, 100)
f_arr = f(t_arr)
step1 = np.trapz(f_arr, t_arr)
print('step 1 result =', step1)
# step 2
def e(x):
    def f(t):
        return np.exp(-t**2)
    t_arr = np.linspace(0, x, 100)
    f_arr = f(t_arr)
    result = np.trapz(f_arr, t_arr)
    return result
print('step 2 result =', e(1))
# step 3:
x_arr = np.linspace(0, 3, 100)
y_arr = []
for x in x_arr:
    y_arr.append(e(x))
plt.plot(x_arr, y_arr)
```
### Summary for integrations (Newman Ch 5)
We talked about four methods:
1. Trapezoidal (1st order): simple yet one of the most widely used
2. Simpson's (2nd order): also widely used
3. Romberg (high-order): built upon Trapz. Requires analytical integrand or $2^k+1$ sample points
4. Gaussian Quadrature (high-order): Requires analytical integrand.
## Linear Algebra
* Matrix multiplication
* Inverse of a matrix
* Simultaneous linear equations: Gaussian elimination
* LU decomposition
### Matrix multiplication
Row by column (like the shape 7)
$$
\begin{bmatrix}
a_{00} & a_{01} & a_{02}\\
a_{10} & a_{11} & a_{12}\\
a_{20} & a_{21} & a_{22}
\end{bmatrix}
\begin{bmatrix}
b_{00} & b_{01} & b_{02}\\
b_{10} & b_{11} & b_{12}\\
b_{20} & b_{21} & b_{22}
\end{bmatrix}=
\begin{bmatrix}
c_{00} & c_{01} & c_{02}\\
c_{10} & c_{11} & c_{12}\\
c_{20} & c_{21} & c_{22}
\end{bmatrix}
$$
$$
c_{00} = a_{00}b_{00} + a_{01}b_{10} + a_{02}b_{20}
$$
#### Practice 1
$$
\begin{bmatrix}
1 & 0 & 0\\
0 & 0 & 1\\
0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
a & b & c \\
d & e & f \\
g & h & i
\end{bmatrix}
$$
#### Practice 2
$$
\begin{bmatrix}
1 & 0 & 0\\
2 & 1 & 0\\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
a & b & c \\
d & e & f \\
g & h & i
\end{bmatrix}
$$
This matrix takes the first row, multiplies it by 2, and adds it to the second row. This matrix performs a "row operation".
#### Practice 3
$$
\begin{bmatrix}
1 & 0 & 0\\
2 & 1 & 0\\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
3 & 0 & 1
\end{bmatrix}
$$
#### Practice 4
$$
\begin{bmatrix}
1 & 0 & 0\\
2 & 1 & 0\\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0\\
-2 & 1 & 0\\
0 & 0 & 1
\end{bmatrix}
$$
### Inverse of a matrix
$\mathsf{A}$ is a square matrix. If
$$
\mathsf{A} \mathsf{A}^{-1} = \mathsf{I}
$$
we call $\mathsf{A}^{-1}$ the inverse of $\mathsf{A}$. Here $\mathsf{I}$ is the identity matrix with 1 on the diagonal. For example, the 3x3 identity matrix is given by
$$
\begin{bmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1
\end{bmatrix}
$$
We'll soon see that finding the inverse matrix & solving linear equations are equivalent problems.
#### Practice 2x2 matrix inversion
$$
A=\begin{bmatrix}
3 & 5 \\
-7 & 2
\end{bmatrix}
$$
Hint:
* divide by determinant
* swap diagonal
* flip signs of off-diagonal
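Worked out (a sketch to check against): the determinant is $3\cdot2 - 5\cdot(-7) = 41$, so

$$
A^{-1} = \frac{1}{41}\begin{bmatrix}
2 & -5 \\
7 & 3
\end{bmatrix}
$$

Multiplying $A A^{-1}$ gives $\frac{1}{41}\begin{bmatrix}41 & 0\\ 0 & 41\end{bmatrix} = \mathsf{I}$, as required.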
#### Laptop practice: `scipy.linalg.inv`
* Type A
* Calculate the inverse of A
* Check if $A A^{-1}$ gives the identity matrix
```
from scipy import linalg
linalg.inv(A)
```
```
from scipy.linalg import inv
inv(A)
```
```
import scipy.linalg
scipy.linalg.inv(A)
```
```
import scipy.linalg as LA
LA.inv(A)
```
### Solving simultaneous linear equations
$$
x + 2 y = 7 \\
4x + 5y = 22
$$
We multiply eq1 by 4 & subtract it from eq 2
$$
x + 2 y = 7 \\
-3y = -6
$$
From the new eq 2, we get y = 2. Plug into eq 1, we get x = 3
$$
\begin{bmatrix}
1 & 2 \\
4 & 5
\end{bmatrix}\begin{bmatrix}
x \\
y
\end{bmatrix}=\begin{bmatrix}
7 \\
22
\end{bmatrix}
$$
Multiply the first row by 4
Subtract it from the second row
$$
\begin{bmatrix}
1 & 2\\
0 & -3 \\
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix}
=\begin{bmatrix}
7\\
-6
\end{bmatrix}
$$
From the second row, we get y = 2
Combined with the first row, we get x = 3
#### How did we do it?
* We repeatedly multiply a row by a number and add it to another row (row operations).
* Eventually we get an "upper triangular" matrix, whose elements below the diagonal are zero (Gaussian elimination)
* Getting x, y from bottom up (back substitution)
### Solving simultaneous linear equations
The same set of equations can be re-written as
$$
A {\bf x} = {\bf b}
$$
where
$$
{\mathsf A}=\begin{bmatrix}
1 & 2\\
4 & 5 \\
\end{bmatrix}
$$
$$
{\bf x}=\begin{bmatrix}
x \\
y
\end{bmatrix}
$$
$$
{\bf b}=\begin{bmatrix}
7 \\
22
\end{bmatrix}
$$
The solution is given by
$$
{\bf x} = {\mathsf A}^{-1} {\bf b}
$$
Solving simultaneous linear equations ⇔ inverting matrix
#### Practice
1. use `np.linalg.solve()` to solve the equations above
2. check if you get the same answer using `np.linalg.inv()`
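A sketch for this practice (the equations are $x + 2y = 7$, $4x + 5y = 22$):

```python
import numpy as np

A = np.array([[1, 2], [4, 5]])
b = np.array([7, 22])

x = np.linalg.solve(A, b)      # solves A x = b directly
print(x)                       # expect x = 3, y = 2

x_inv = np.linalg.inv(A) @ b   # same answer via the inverse matrix
print(x_inv)
```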
# Week 7: Linear Algebra (cont'd)
### Gauss elimination (2x2)
$$
{\mathsf A}=\begin{bmatrix}
2 & 1\\
8 & 7\\
\end{bmatrix}
$$
Goal: we want an upper triangular matrix, obtained by doing row operations on A:
$$
E A = U
$$
$$
\begin{bmatrix}
1 & 0\\
-4 & 1\\
\end{bmatrix}
\begin{bmatrix}
2 & 1\\
8 & 7\\
\end{bmatrix}=
\begin{bmatrix}
2 & 1\\
0 & 3\\
\end{bmatrix}
$$
Here $E$ is the elimination matrix
But we want
$$
A = L U
$$
$$
L = \begin{bmatrix}
1 & 0\\
4 & 1\\
\end{bmatrix}
$$
### Gaussian elimination (3x3)
$$
{\mathsf A}=\begin{bmatrix}
1 & 2 & 3\\
4 & 5 & 6 \\
7 & 8 & 10
\end{bmatrix}
$$
1. Multiply the first row by 4 & subtract it from the second row
2. Multiply the first row by 7 & subtract it from the third row
$$
\begin{bmatrix}
1 & 2 & 3\\
0 & -3 & -6 \\
0 & -6 & -11
\end{bmatrix}
$$
3. Multiply the second row by 2 and subtract it from the third row
$$
{\mathsf U}=\begin{bmatrix}
1 & 2 & 3\\
0 & -3 & -6 \\
0 & 0 & 1
\end{bmatrix}
$$
Here is our upper triangular matrix. Let's call it U for the upper triangle.
### LU decomposition (L U stands for lower & upper)
Row operation: replacing $R_i$ by $R_i - k R_j$. $k$ will be an element of the $L$ matrix ($L_{ij}$)
$$
\mathsf{A = L U}
$$
where $\mathsf L$ is related to the steps of Gaussian elimination
* step 1: subtract 4x the first row from the second row $(R_1 - 4 R_0)$ ($L_{10}=4$)
* step 2: subtract 7x the first row from the third row $(R_2 - 7 R_0)$ ($L_{20}=7$)
* step 3: subtract 2x the second row from the third row $(R_2 - 2 R_1)$ ($L_{21}=2$)
$$
\begin{bmatrix}
1 & 2 & 3\\
4 & 5 & 6 \\
7 & 8 & 10
\end{bmatrix}=
\begin{bmatrix}
1 & 0 & 0\\
4 & 1 & 0 \\
7 & 2 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 2 & 3\\
0 & -3 & -6 \\
0 & 0 & 1
\end{bmatrix}
$$
Here is a very good [video](https://www.youtube.com/watch?v=a9S6MMcqxr4) by math professor Leah Howard.
#### Practice
$$
\begin{bmatrix}
2 & 2 & 3\\
5 & 9 & 10 \\
4 & 1 & 2
\end{bmatrix}
$$
### What are L & U matrices good for?
A = L U
To solve A x = b, we can instead solve L U x = b
Let's define y = Ux, and we can solve L y = b (easy, since L is triangular),
and then we solve Ux = y (also easy, since U is triangular).
If we change b, we can use the same L & U.
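A sketch of this reuse with scipy's `lu_factor`/`lu_solve` (one common way to factor once and solve for several b's; the matrix is the 3x3 example from this chapter):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])
lu, piv = lu_factor(A)           # do the (pivoted) LU factorization once

b1 = np.array([1., 2., 3.])
b2 = np.array([4., 5., 6.])
x1 = lu_solve((lu, piv), b1)     # reuse the same factorization
x2 = lu_solve((lu, piv), b2)     # ...for each new right-hand side
print(x1, x2)
```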
### Pivoting
What if an element on the diagonal is zero? We can't use it to eliminate other elements. Solution: We swap rows to make sure all the diagonal elements are non-zero (called pivoting).
Numbers that are close to zero can lead to large numerical errors. In numerical algebra, it's standard to put the number farthest from zero (positive or negative) on the diagonal before doing Gaussian elimination
$$
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 10
\end{bmatrix}
$$
swapping the first and third row
$$
\begin{bmatrix}
7 & 8 & 10 \\
4 & 5 & 6 \\
1 & 2 & 3
\end{bmatrix}
$$
#### laptop practice `scipy.linalg.lu`
1. Use `np.array` to make
$$
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 10
\end{bmatrix}
$$
2. Use `scipy.linalg.lu` to decompose A into P, L, U
3. Use matrix multiplication @ or `np.dot()` to calculate PLU, and use `np.allclose()` to verify that A = P L U is true
(for matrices A and B: `A @ B` or `np.dot(A, B)` )
### Eigenvalues & eigenvectors
A is an $N\times N$ matrix; an eigenvalue $\lambda$ (a number) and its eigenvector $\mathbf{x}$ (one column, N rows) satisfy:
$$
\mathsf{A} \mathbf{x} = \lambda \mathbf{x}
$$
To calculate $\lambda$, we solve
$$
\det(\mathsf{A} - \lambda \mathsf{I}) = 0
$$
There are N eigenvalues, and each one has its corresponding eigenvector. The eigenvectors are usually normalized to have length 1.
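In practice we let `np.linalg.eig` do the work. A sketch with an illustrative matrix (not the practice one), checking $A\mathbf{x} = \lambda\mathbf{x}$ for each pair:

```python
import numpy as np

A = np.array([[4., 1.], [2., 3.]])
lam, X = np.linalg.eig(A)   # lam: eigenvalues, X: eigenvectors as columns
print(lam)                  # roots of det(A - lambda I) = 0: 5 and 2 (in some order)
for i in range(2):
    print(np.allclose(A @ X[:, i], lam[i] * X[:, i]))   # A x = lambda x
```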
#### Practice
Calculate the eigenvalue and eigenvectors for this matrix:
$$
A=\begin{bmatrix}
1 & 2 \\
2 & 1
\end{bmatrix}
$$
1. Solve $\det(A - \lambda I ) =0$, there should be two solutions for $\lambda$. Call the bigger one $\lambda_0$ and the smaller one $\lambda_1$.
2. Solve
$$
A \begin{bmatrix}
x\\
y
\end{bmatrix} = \lambda\begin{bmatrix}
x\\
y
\end{bmatrix}
$$
### Eigenvalue decomposition ("diagonalizing")
Let's assume $\mathsf{A}$ is a 3$\times$3 matrix and **symmetric** ($A = A^T$)
We can organize the eigenvalues and eigenvectors like this:
$$
{\mathsf \Lambda} =
\begin{bmatrix}
\lambda_0 & 0 & 0 \\
0 & \lambda_1 & 0 \\
0 & 0 & \lambda_2
\end{bmatrix}
$$
$$
{\mathsf X} =
\begin{bmatrix}
& & \\
\mathbf{x}_0 & \mathbf{x}_1 & \mathbf{x}_2 \\
& &
\end{bmatrix}
$$
Here ${\mathsf X}$ is orthonormal
${\mathsf X} {\mathsf X}^T = {\mathsf X}^T {\mathsf X} = {\mathsf I}$
The equation can be re-written as
$$
{\mathsf A} {\mathsf X} = {\mathsf X} {\mathsf \Lambda}
$$
or
$$
{\mathsf A} = {\mathsf X} {\mathsf \Lambda}{\mathsf X}^{T}
$$
### Note for eigendecomposition
* If A is symmetric, $A = X \Lambda X^T$ , the eigenmatrix is orthonormal: $X X^T= I$
* If A is not symmetric, $A = X \Lambda X^{-1}$, and X is not orthonormal
#### Practice Eigenvalue decomposition (lapboard)
$$
A=\begin{bmatrix}
1 & 2 \\
2 & 1
\end{bmatrix}
$$
Use your previous results, but write the $\Lambda$ and $X$ matrices
$$
\mathsf{\Lambda}=\begin{bmatrix}
\lambda_0 & 0 \\
0 & \lambda_1
\end{bmatrix}
$$
and
$$
\mathsf{X}=\begin{bmatrix}
x_0 & x_1 \\
y_0 & y_1
\end{bmatrix}
$$
Then verify
$$
{\mathsf A} = {\mathsf X} {\mathsf \Lambda}{\mathsf X}^{T}
$$
#### Practice (laptop)
1. use np.array() to create this matrix
$$
A=\begin{bmatrix}
1 & 2 \\
2 & 1
\end{bmatrix}
$$
2. calculate the eigenvalues & eigenvectors of this matrix. assign the eigenvalues to lam0, lam1, and the corresponding eigenvectors (columns) to v0, v1
3. verify that A v0 = lam0 v0, and A v1 = lam1 v1 are True
4. verify that v0 · v0 = 1, v1 · v1 = 1, v0 · v1 = 0 (use `np.dot`), and $X X^T = X^T X = I$
5. use `np.diag` to create the $\Lambda$ matrix and verify that $X \Lambda X^T = A$
# Week 8: Linear Algebra (cont'd)
### Singular matrix
$$ \mathsf{A} =
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
Use Python to
1. calculate the LU decomposition
2. calculate the determinant $\det(A) = \det(U)$
3. calculate the eigenvalues and eigenvectors
### Invertible vs. singular matrices
* An $N \times N$ square matrix A is invertible if there exists an $N\times N$ square matrix B such that $\mathsf{A B = BA = I}_N$. Otherwise, it is singular.
* B is often denoted as $\mathsf{A}^{-1}$
* det$(\mathsf{A}) \neq 0$
* rank$(\mathsf{A}) = N$
* Rank is the number of linearly independent rows
* When you do Gaussian elimination, rank is the number of non-zero rows
* $\mathsf{A} \mathbf{x} = 0$ has only the solution x = 0
* All the eigenvalues are non-zero
## Singular Value Decomposition
Any matrix can be decomposed into
$$
\mathsf{A = U \Sigma V^T}
$$
Let's first focus on a square matrix: $N\times N$
U ($N\times N$): left singular vectors, $\mathsf{U U^T = I}$ (orthonormal)
V ($N\times N$): right singular vectors, $\mathsf{V V^T = I}$ (orthonormal)
$\Sigma$ ($N\times N$): a diagonal matrix, all non-negative, sorted from large to small.
$$\Sigma =
\begin{bmatrix}
\sigma_0 & 0 & \cdots & 0 \\
0 & \sigma_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{N-1}
\end{bmatrix}
$$
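A quick check of these properties (a minimal sketch with an arbitrary 2$\times$2 matrix):

```python
import numpy as np
from scipy.linalg import svd

A = np.array([[1., 2.], [3., 4.]])
U, sigma, VT = svd(A)                   # sigma is a 1-D array, sorted large to small
Sigma = np.diag(sigma)                  # turn it into the diagonal matrix above
print(sigma)                            # all non-negative, descending
print(np.allclose(A, U @ Sigma @ VT))   # A = U Sigma V^T
print(np.allclose(U @ U.T, np.eye(2)), np.allclose(VT @ VT.T, np.eye(2)))
```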
#### Practice: SVD for matrix A
$$A=
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
1. use `scipy.linalg.svd` to calculate U, Sigma, VT
2. use `np.diag` to turn Sigma into a matrix
3. Verify that A = U @ Sigma @ VT
### "Range" vs. "null-space"
$$\Sigma=
\begin{bmatrix}
16.8 & 0 & 0 \\
0 & 1.1 & 0 \\
0 & 0 & 0
\end{bmatrix}
$$
$u_0$ - $u_1$ plane: range of A
$v_2$ (the right singular vector with $\sigma_2$=0): null-space of A
### Defining an "inverse" for a singular matrix (Moore-Penrose pseudo-inverse)
$$
\mathsf{A = U \Sigma V^T}
$$
$$
\mathsf{A^+ = V \Sigma^{-1} U^T}
$$
where
$$
\mathsf{\Sigma}^{-1} = {\rm diag}(1/\sigma_i)
$$
* Condition number: max($\sigma_i$)/min($\sigma_i$). If it's large, we call it "ill-conditioned" (close to singular)
* If some $\sigma_i$=0, set $1/\sigma_i$= 0
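This recipe translates directly to code (a sketch; the tolerance `tol` for treating a singular value as zero is a choice):

```python
import numpy as np
from scipy.linalg import svd, pinv

A = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])   # singular (rank 2)
U, sigma, VT = svd(A)
tol = 1e-10 * sigma[0]                     # treat tiny singular values as zero
inv_sigma = np.array([1.0 / s if s > tol else 0.0 for s in sigma])
A_plus = VT.T @ np.diag(inv_sigma) @ U.T   # A+ = V Sigma^{-1} U^T
print(np.allclose(A_plus, pinv(A)))        # agrees with scipy's pseudo-inverse
```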
#### Practice: calculate the pseudoinverse of A
$$A=
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
1. use `scipy.linalg.svd` to calculate U, Sigma, VT
2. Create $\Sigma^{-1}$ (remember to replace $\infty$ with 0)
3. Use this formula to calculate $A^+$
4. Compare the result with `scipy.linalg.pinv(A)`
### What is the pseudo-inverse good for?
If $A$ is singular, $Ax = b$ either (1) has infinitely many solutions or (2) has no solution
$$
1x + 2y + 3z = 1 \\
4x + 5y + 6z = 2 \\
7x + 8y + 9z = 3
$$
If there are infinitely many solutions,
$x = A^+ b$ gives the solution with the shortest $|x|$
$$
1x + 2y + 3z = 1 \\
4x + 5y + 6z = 2 \\
7x + 8y + 9z = 5
$$
If there is no solution, $x = A^+ b$ gives the solution with the shortest $|Ax - b|$
#### Practice:
$$A=
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
$$b_1=
\begin{bmatrix}
1 \\
2 \\
5
\end{bmatrix}
$$
1. Calculate the pseudo-inverse of A (`linalg.pinv`)
2. Solve $A x = b_1$
3. Calculate the norm of $Ax - b_1$ (i.e. the square root of the sum of the squares of the elements, the Euclidean norm)
### Singular value decomposition for a non-square matrix
A is a MxN matrix (M>N), and it can be decomposed into
$A = U \Sigma V^T$
There are two conventions. First convention (set full_matrices=True in scipy):
* U: MxM
* $\Sigma$: MxN
* $V^T$: NxN
$$
U = \begin{bmatrix}
& & \\
& & \\
u_0 & \cdots & u_{M-1} \\
& & \\
& &
\end{bmatrix}
$$
the bigger $M\times M$ matrix, called the "left singular vectors"
$U U^T = U^T U = I_{M\times M}$
$$
\Sigma = \begin{bmatrix}
\sigma_0 & 0 & 0 & 0 \\
0 & \sigma_1 & 0 & 0 \\
0 & 0 & \cdots & 0\\
0 & 0 & 0 & \sigma_{N-1} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
$$
$M\times N$: singular values, all non-negative
$\sigma_0 \ge \sigma_1 \ge \cdots \ge \sigma_{N-1} \ge 0$
$$
V = \begin{bmatrix}
& & \\
& & \\
v_0 & \cdots & v_{N-1} \\
& & \\
& &
\end{bmatrix}
$$
the smaller $N\times N$ matrix, called the "right singular vectors"
$V V^T = V^T V = I_{N\times N}$
#### Practice: SVD for a non-square matrix
$$
A = \begin{bmatrix}
5 & 7 \\
6 & 3 \\
-1 & 2
\end{bmatrix}
$$
1. What are the dimensions of $U$, $\Sigma$, $V^T$ matrices?
2. Use `linalg.svd` to calculate $U$, $\Sigma$, $V^T$, set `full_matrices=True`
3. Create the $\Sigma$ matrix by populating the main diagonal and filling the extra rows with zeros.
4. Verify that $A = U \Sigma V^T$
Starter code
```
import numpy as np
from scipy import linalg
A = np.array([[5,7],[6,3],[-1,2]])
U, sigma_arr, VT = # do SVD
Sigma = np.zeros((3,2))
for i in range(2):
    # try to construct Sigma
print(np.allclose()) # verify your results
```
# Week 9: Linear Algebra (cont'd)
### Summary for matrices
1. What is the rank of a matrix? How to calculate it?
The rank is the number of linearly independent rows. We can calculate it by (1) counting the non-zero rows of the U matrix after Gaussian elimination, (2) counting the number of non-zero eigenvalues (for symmetric matrices), or (3) counting the number of non-zero singular values
2. What does it mean that a matrix is orthonormal?
$A A^T = I$
3. What are the properties of a singular matrix?
(1) non-invertible, (2) determinant = 0, (3) has at least one zero eigenvalue, (4) has at least one zero singular value, (5) has an all-zero row in the U matrix, i.e. rank < N
4. How to diagonalize a matrix?
Perform eigenvalue decomposition $A = X \Lambda X^{-1}$, the $\Lambda$ matrix is the diagonalized matrix
5. What are the differences between eigenvalue decomposition and SVD?
Eigenvalue decomposition => only works for square matrices
Singular value decomposition => works for both square and rectangular matrices
### What are the differences between eigen decomposition and singular value decomposition?
| | Eigen decomposition | Singular value decomposition |
| ----------- | ----------- | ----------- |
|Formula | $A = X \Lambda X^{-1}$ | $A = U \Sigma V^T$ |
|Shape of the matrix|all matrices are (N $\times$ N)| A: (M $\times$ N), U: (M $\times$ M), $\Sigma$: (M $\times$ N), $V^T$: (N $\times$ N)|
|Signs of eigen / singular values|can be any real or complex numbers | non negative|
|How to calculate|$\det(A - \lambda I) = 0$ | calculate the eigensystems for $AA^T$ and $A^TA$|
#### Practice:
$$A=
\begin{bmatrix}
2 & -1 & 0 \\
-1 & 2 & -1 \\
0 & -1 & 2
\end{bmatrix}
$$
Use `numpy.linalg` to
1. Calculate the determinant
2. Calculate the eigenvalues and eigenvectors
3. Verify that $A = X \Lambda X^T$
4. Calculate the singular values, left singular vectors, and right singular vectors
5. Verify that $A = U \Sigma V^T$
6. What's the relation between eigenvalue decomposition and singular value decomposition for this matrix?
```
import numpy as np
from numpy import linalg
A = np.array([[2,-1,0],[-1,2,-1],[0,-1,2]])
```
#### Repeat for this singular matrix
#### Practice:
$$A=
\begin{bmatrix}
1 & -1 & 0 \\
0 & 1 & -1 \\
1 & 0 & -1
\end{bmatrix}
$$
Use `numpy.linalg` to
1. Calculate the determinant
2. Calculate the eigenvalues and eigenvectors
3. Verify that $A = X \Lambda X^{-1}$
4. Calculate the singular values, left singular vectors, and right singular vectors
5. Verify that $A = U \Sigma V^T$
6. What's the relation between eigenvalue decomposition and singular value decomposition for this matrix?
```
import numpy as np
from numpy import linalg
A = np.array([[1,-1,0],[0,1,-1],[1,0,-1]])
```
### Eigenvalue decomposition & singular value decomposition
* If A is symmetric, then $A = X \Lambda X^T$, and $X$ is orthonormal $XX^T = I$
* If A is not symmetric, then $A = X \Lambda X^{-1}$, and $X$ is not orthonormal
* If A is symmetric and positive-definite (all positive eigenvalues), then the two decompositions give the same results
#### Practice on laptop
$$
A = \begin{bmatrix}
2 & 4 & 6 & 8 \\
-1 & 0 & 1 & 0 \\
1 & 2 & 3 & 4 \\
1 & 4 & 7 & 8
\end{bmatrix}
$$
$$
b = \begin{bmatrix}
12 \\
0 \\
6 \\
12
\end{bmatrix}
$$
1. Calculate the LU decomposition of A. What is the dimension of the null-space of A? What is the rank of A?
2. Calculate the eigenvalues of A. How many eigenvalues are 0 (within numerical accuracy)?
3. Perform the SVD of A using "full_matrices=False". How many singular values are 0?
4. Construct a matrix Sigma_inv = np.diag(1/sigma_i). When $\sigma_i$ = 0 (within numerical accuracy), replace $1/\sigma_i$ by 0
5. Calculate the pseudo-inverse A+ = V Sigma_inv UT, and find $x = A^+ b$
6. Use `scipy.linalg.lstsq` (least squares) to solve A x = b. Do you get the same solution?
Starter code
```
import numpy as np
from numpy import linalg
A = np.array([2,4,6,8,-1,0,1,0,1,2,3,4,1,4,7,8]).reshape(4,4)
```
### Pseudo-inverse and least square solution
When A is singular, Ax = b has either (1) no solution or (2) infinitely many solutions.
When there's no solution, we can calculate the pseudo-inverse A+.
Here x = A+ b gives a "compromise solution", i.e. the least-squares solution.
That is, Ax - b is not zero, but its norm |Ax - b| is as small as possible.
#### Practice
$$
A = \begin{bmatrix}
5 & 7 & -1 \\
7 & 3 & 2 \\
-1 & 2 & 4
\end{bmatrix}
$$
1. Perform the Singular Value Decomposition $A = U \Sigma V^T$. Print $U$ and $\Sigma$.
2. Calculate the eigenvalues and eigenvectors of $A A^T$
3. What's the relation between the singular values of A and the eigenvalues of $A A^T$
Starter code:
```
import numpy as np
from numpy import linalg
A = np.array([[5,7,-1],[7,3,2],[-1,2,4]])
```
### How to calculate SVD by hand?
Let's assume A is M$\times$N
$A = U \Sigma V^T$
$A^T = V \Sigma^T U^T$
$A A^T = U \Sigma \Sigma^T U^T$
This is the eigendecomposition of ($A A^T$), which is MxM and symmetric
$A^T A = V \Sigma^T \Sigma V^T$
This is the eigendecomposition of ($A^T A$), which is NxN and symmetric
$\sigma_i = \sqrt{\lambda_i}$
The singular values are the non-negative square roots of the eigenvalues of $A A^T$ (equivalently, of $A^T A$)
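A quick numerical check of this relation (the matrix is illustrative):

```python
import numpy as np

A = np.array([[5., 7., -1.], [7., 3., 2.], [-1., 2., 4.]])
sigma = np.linalg.svd(A, compute_uv=False)           # singular values of A
lam = np.linalg.eigvalsh(A @ A.T)                    # eigenvalues of A A^T (symmetric)
print(np.allclose(np.sort(sigma**2), np.sort(lam)))  # sigma_i^2 = lambda_i
```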
# Data analysis
### An example
Let's consider the heights of all the people in the US.
There's a true mean $\mu$, a true variance $\sigma^2$.
$$
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}
(x_i - \mu)^2
$$
But we cannot measure the heights of all the people in the US. We can only take a sample (e.g. 1000 people) and estimate the mean and variance.
We expect that a bigger sample will lead to a better estimate of the mean.
We have *population mean, population variance* vs *sample mean, sample variance*.
### Population vs. sample variance
* Population mean $\mu$ & variance $\sigma^2$ (truth, unknown to us)
* Now we have N data points : $x_1$ , $x_2$, … , $x_N$
Sample mean
$$
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
$$
Sample variance
$$
s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2
$$
Here "N-1" comes from the degrees of freedom (number of data points minus number of estimated parameters: we already used the data once to estimate $\bar{x}$, so N - 1)
Standard deviation: s
*Standard error of the mean*
$$
\sigma_{\bar{x}} = \frac{s}{\sqrt{N}}
$$
(decreasing with sample size)
We describe our uncertainty in estimating the population mean using
$$
\bar{x} \pm \sigma_{\bar{x}}
$$
#### Practice
```
import numpy as np
x = np.array([63.65868476, 64.59765675, 65.30541444, 63.53366255, 66.26193878,
67.60795265, 67.57792303, 69.38398625, 71.18409535, 70.90454756])
```
1. code up the sample standard deviation s
2. compare with np.std (set ddof=1)
3. code up the standard error of the mean sigma_xbar
#### Linear regression: the least square approach
We have data points $(x_i, y_i)$, and we want to find the line
$y = a + b x$
that best describes these data points.
For data point $(x_i, y_i)$, the difference between the prediction $(a+ b x_i)$ and the measurement $(y_i)$ is given by $y_i - (a + b x_i)$
To find a & b, we minimize the sum of the square difference of all these points:
$$
\chi^2 = \sum_i (y_i - (a + b x_i))^2
$$
(pronounced as chi-squared)
We want to find a & b that minimizes $\chi^2$. We solve
$$
\frac{\partial \chi^2}{\partial a} = 0 \\
\frac{\partial \chi^2}{\partial b} = 0
$$
and obtain
$$
b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \\
a = \bar{y} - b \bar{x}
$$
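These closed-form formulas translate directly to NumPy — a sketch with made-up data (`polyfit` with deg=1 returns [intercept, slope] for comparison):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # roughly y = 2x

xbar, ybar = x.mean(), y.mean()
b = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)   # slope
a = ybar - b * xbar                                             # intercept

# compare with polyfit: coefficients come out as [c0, c1] = [intercept, slope]
c0, c1 = np.polynomial.polynomial.polyfit(x, y, deg=1)
print(np.allclose([a, b], [c0, c1]))
```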
#### Practice on laptop
```
import numpy as np
x = np.array([120.61051228, 125.31693251, 125.31726086, 127.23471398,
139.93428306, 140.85120087, 142.95377076, 145.34869458,
160.46059713, 161.58425631])
y = np.array([63.65868476, 64.59765675, 65.30541444, 63.53366255, 66.26193878,
67.60795265, 67.57792303, 69.38398625, 71.18409535, 70.90454756])
```
1. calculate xbar, ybar
2. calculate b and a
3. compare with `np.polynomial.polynomial.polyfit` results
# Week 10: Data Analysis (cont'd)
## using `polyfit`
`np.polynomial.polynomial.polyfit(x, y, deg=n)`
The output is c0, c1, c2, …, cn
If we have "deg = 1"
we have intercept (c0) and slope (c1)
## Linear regression with error bars
Each data point has an error bar, $σ_i$. We weight each data point by $1/σ_i$ (or each square difference is weighted by $1/σ_i^2$)
Degrees of freedom $\nu$ (Greek letter nu) = number of data points - 2
(number of parameters = 2)
If $\chi^2/\nu \lesssim 1$, then the model provides a good fit
(chi-squared per degree of freedom)
## Practice
```
import numpy as np
import matplotlib.pyplot as plt
x = np.array([120.61051228, 125.31693251, 125.31726086, 127.23471398, 139.93428306, 140.85120087, 142.95377076, 145.34869458, 160.46059713, 161.58425631])
y = np.array([63.65868476, 64.59765675, 65.30541444, 63.53366255, 66.26193878,
67.60795265, 67.57792303, 69.38398625, 71.18409535, 70.90454756])
sigma_y = np.array([0.58065631, 1.01441054, 0.26079558, 0.65342395, 0.78509076, 0.06280835, 0.82113238, 0.23663287, 0.09261278, 1.34560599])
```
1. use `plt.errorbar` to make a plot with error bars
2. use `np.polynomial.polynomial.polyfit` to fit a line, with w = 1/sigma_y
3. add the best-fit model to the plot
4. calculate the chi-squared
## chi-squared
The concept can be generalized to any model.
$$
\chi^2 = \sum_i \frac{(y_i - {\rm model}_i)^2}{\sigma_i^2}
$$
Degrees of freedom $\nu$ (Greek letter nu) = number of data points - number of model parameters.
If $\chi^2/\nu \lesssim 1$ (the difference between data and model is comparable to or smaller than the error bars), the model provides a good fit.
In physics, we often ask/hear "What is your chi-squared per degree of freedom?"
which basically means "How good is your fit?"
### Finding the minimum: Nelder–Mead downhill simplex method
Basic idea: Rolling a triangle downhill. The triangle can be reflected, stretched, or shrunk according to the landscape
[YouTube video](https://www.youtube.com/watch?v=yZjt0qlG7OU)
#### Practice: scipy.optimize.minimize
$$f(x) = 1/x^6 - e^{-x}$$ (potential energy of interactions between atoms)
1. Plot f(x) between 1 and 6
2. Find x where f(x) is the lowest
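A possible solution sketch using `scipy.optimize.minimize` (the starting guess x0 = 2.0 is a choice; plotting is left out):

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    x = float(np.atleast_1d(x)[0])   # minimize passes an array; make it a scalar
    return 1.0 / x**6 - np.exp(-x)

res = minimize(f, x0=2.0, method='Nelder-Mead')
print(res.x[0])                      # the minimum is near x ~ 1.6
```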
### What about finding the maximum of f(x)?
Finding the minimum of -f(x)
### Data fitting example: fitting a normal distribution
We have a histogram, and we want to fit it to a normal distribution (also called Gaussian distribution or bell curve) and find mu and sigma
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
$$
#### Practice: Data fitting example: fitting a normal distribution
```
x = np.array([-0.34450244, -0.27538229, -0.20626214, -0.13714199, -0.06802183, 0.00109832, 0.07021847, 0.13933862, 0.20845877, 0.27757892])
y = np.array([0.02893512, 0.05787024, 0.56423488, 1.34548316, 3.24073364, 3.73263071, 3.19733096, 1.67823706, 0.44849439, 0.17361073])
```
$$
\chi^2 = \sum_i (y_i - f(x_i))^2
$$
Let's assume mu = 0 and we want to find sigma
1. Code up this f(x, sigma) function
2. Code up a function: chi_squared(sigma) for this data set & model.
3. Minimize the chi-squared. Find the sigma value that minimizes chi-squared
## Interpolation
We have several (x,y) points. We want a continuous function that passes through all those points. Then we can predict values in between points, use this function for integration, etc.
What's the difference between interpolation and fitting?
Interpolation: Going through all points. Using all the values we've got
Fitting: finding a model (not necessarily going through all points) that best describes all the data. The model parameters usually have a physical meaning.
### Linear interpolation
(Plot from Newman's book)
f(x) is known at x=a and x=b
Points in between a and b are estimated by assuming a straight line.
Most widely used and robust. There are more complicated methods, but sometimes they give bad results.
#### Practice `scipy.interpolate.interp1d`
```
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([ 0. , -3.8, -1.6, 6.6, 20.8, 41. , 67.2, 99.4, 137.6, 181.8])
```
1. Plot y vs. x
2. Make the interpolation function f(x) = y(x) and find the y value at x = 4.3
3. Put this new point on the same plot (`plt.scatter`)
4. Make the interpolation function g(y) = x(y) and find the x value at y = 50
5. Put this new point on the same plot (`plt.scatter`)
### What about extrapolation?
We can use the `scipy.interpolate.interp1d` to do extrapolation.
But extrapolation is usually a bad idea unless you know what you are doing.
# Week 11
## Solving nonlinear equations
How do you find the solution for
$$
x = 2 - e^{-x}
$$
1. Plot $y = x$
2. Plot $y = 2 - e^{-x}$
Then we can see that the two curves cross at around -1.5 and 2.
### Relaxation method
initial guess: x0 = 2
plug into the right-hand side: we get $2 - e^{-2} \approx 1.865$, so set x1 = 1.865
plug into the right-hand side again to get x2
and again, and again...
After a few steps, the answer stops changing, which means we have found our solution.
If we want the solution near -1.5, an initial guess x0 = -1.5 is not enough: that fixed point is unstable under this iteration. The trick is to invert the equation, $x = -\ln(2 - x)$, and iterate that form instead.
### Binary (bisection) method
We want to find the root of f(x) = 0
Let's start with two guesses: x1 and x2
If f(x1) and f(x2) have the opposite sign (one positive and the other negative), we know that the root is in between x1 and x2
We then bisect the interval: x'=(x1+x2)/2, and check whether f(x') is positive or negative.
We divide again, and again… until the interval is small enough (achieving the desired numerical accuracy.)
### Newton's method
We again want to find the root $f(x) = 0$
We start with an initial guess x, and calculate the derivative at x: $ f'(x)$
Then we move to the next point $x' = x - \frac{f(x)}{f'(x)}$
Then we calculate the derivative at x', and move to the next point, and again..
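The iteration is only a few lines. A sketch using $f(x) = 5e^{-x} + x - 5$ from the practice below, with derivative $f'(x) = -5e^{-x} + 1$ (the starting guess is a choice):

```python
import numpy as np

def f(x):
    return 5 * np.exp(-x) + x - 5

def fprime(x):
    return -5 * np.exp(-x) + 1

x = 4.0                        # initial guess (away from the trivial root x = 0)
for _ in range(20):
    x = x - f(x) / fprime(x)   # Newton step: x' = x - f(x)/f'(x)
print(x)                       # the non-trivial root, near x ~ 4.965
```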
#### Practice: scipy.optimize.root_scalar
$$
f(x) = 5 e^{-x} + x -5
$$
Obviously, x=0 is a root for f(x)=0. But we want the other root.
First, plot f(x) and see where the root might be. Try different plotting ranges. Add a horizontal line y = 0.
Use method='bisect' (a binary method) to find the root for f(x) = 0.
Use method='newton' to find the root. Hint: need fprime
# Ordinary Differential Equations
Ordinary differential equations (ODE): Vocabulary
$$
\frac{dx}{dt} = \frac{2x}{t} + \frac{3x^2}{t^3}
$$
Or in general
$$
\frac{dx}{dt} = f(x, t)
$$
* x is called a dependent variable
* t is called an independent variable
* The equation is called ordinary because there's only one independent variable (x only depends on t)
* The equation is called first-order because only the first derivative appears
## ODE: Euler's method (1707-1783)
We are given the value of $x$ at time $t$ (i.e. the initial condition)
After a short time interval $h$, the $x$ value can be calculated using the Taylor expansion
$$
x(t+h) = x(t) + h\left(\frac{dx}{dt}\right)_t + {\mathcal O}(h^2) \\
\approx x(t) + h f(x, t)
$$
We can keep going: x(t+2h), x(t+3h), …..
If h is very small, ${\mathcal O}(h^2)$ will be negligible, but a numerical error will build up as we take more and more steps.
### Practicing Euler's method by hand
Consider this equation:
$$
\frac{dx}{dt} = -x^3 + \sin(t)
$$
Initial condition: x=0, t=0. Let's use h = 0.1
1. Find x at t = 0.1: need dx/dt(t=0)
2. Find x at t = 0.2: need dx/dt(t=0.1)
Fill this table:
| t | x(t) | dx/dt(t) | x(t+h) |
| ----------- | ----------- |----------- | ----------- |
| t = 0 | x=0 | | |
| t = 0.1 | | | |
| t = 0.2 | | | |
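A short loop fills in (and extends) this table — a sketch of Euler's method for this equation:

```python
import numpy as np

def f(x, t):
    return -x**3 + np.sin(t)

h = 0.1
x, t = 0.0, 0.0                 # initial condition
for _ in range(3):              # three steps: t = 0.1, 0.2, 0.3
    x = x + h * f(x, t)         # x(t+h) = x(t) + h f(x, t)
    t = t + h
    print(f"t = {t:.1f}, x = {x:.6f}")
```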
## The second-order Runge-Kutta (RK2) method
Trick: Taylor expansion around t+h/2
$$
x(t+h) = x\left(t+\frac{1}{2}h\right) + \frac{1}{2}h \left(\frac{dx}{dt}\right)_{t+\frac{1}{2}h} + \frac{1}{8}h^2 \left(\frac{d^2x}{dt^2}\right)_{t+\frac{1}{2}h} + \mathcal{O}(h^3)
$$
$$
x(t) = x\left(t+\frac{1}{2}h\right) - \frac{1}{2}h \left(\frac{dx}{dt}\right)_{t+\frac{1}{2}h} + \frac{1}{8}h^2 \left(\frac{d^2x}{dt^2}\right)_{t+\frac{1}{2}h} + \mathcal{O}(h^3)
$$
Combining these two equations, we have
$$
x(t+h) = x(t) + h \left(\frac{dx}{dt}\right)_{t+\frac{1}{2}h} + \mathcal{O}(h^3)
$$
### Comparing Euler and RK2
Euler
$$
x(t + h) = x(t) + h ~ f(x,t) + \mathcal{O}(h^2)
$$
RK2:
$$
x(t + h) = x(t) + h ~ f\left(x,t+\frac{1}{2}h\right) + \mathcal{O}(h^3)
$$
Euler: use the slope at $t$, error $\mathcal{O}(h^2)$
RK2: use the slope at $t + h/2$, $\mathcal{O}(h^3)$
The accuracy is improved by an order of magnitude!
## 2nd-order Runge-Kutta (RK2) in practice
$$
\begin{aligned}
k_1 &= h ~ f(x,t) \\
k_2 &= h ~ f\left(x + \frac{1}{2}k_1, t+ \frac{1}{2}h \right)
\end{aligned}
$$
The $k_1$ stage is an Euler step, used only to estimate $x$ at $t+h/2$.
$$
x(t+h) = x(t) + k_2
$$
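One RK2 step can be sketched as a small function (the example equation reuses $dx/dt = -x^3 + \sin t$ from earlier):

```python
import numpy as np

def rk2_step(f, x, t, h):
    k1 = h * f(x, t)                        # Euler estimate of the change over h
    k2 = h * f(x + 0.5 * k1, t + 0.5 * h)   # slope evaluated at the midpoint t + h/2
    return x + k2

f = lambda x, t: -x**3 + np.sin(t)
x, t, h = 0.0, 0.0, 0.1
for _ in range(100):                        # integrate from t = 0 to t = 10
    x = rk2_step(f, x, t, h)
    t += h
print(round(t, 1), x)
```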
## The fourth-order Runge-Kutta (RK4) method
Trick: Taylor expansion around smartly chosen locations. More higher-order terms got canceled.
The error is $\mathcal{O}(h^5)$.
And we are still only using the first derivatives!
#### Laptop practice: `scipy.integrate.solve_ivp`
$$
\frac{dx}{dt} = -x^3 + \sin(t)
$$
Initial condition: x=0, t=0.
solve x(t) for t from 0 to 10
(note that `solve_ivp` calls the dependent variable y, and the initial condition y0)
plot x vs. t
(hint: you can add t_eval = np.linspace(0,10))
## Two dependent variables: x(t) and y(t)
$$
\begin{aligned}
\frac{dx}{dt} &= xy - x \\
\frac{dy}{dt} &= y - xy + \sin^2t
\end{aligned}
$$
Or in general
$$
\frac{dx}{dt} = f_x(x, y, t) \\
\frac{dy}{dt} = f_y(x, y, t)
$$
These are still ordinary diff eq (only one independent variable t), not partial diff eq.
In this case, x(t) and y(t) affect each other. Example: the number of rabbits & foxes in an ecosystem, ...
Trick: define vectors (or Python arrays):
$$
\vec{r} = (x, y) \\
\vec{f} = (f_x, f_y)
$$
The equations become
$$
\frac{d\vec{r}}{dt} = \vec{f}(\vec{r},t)
$$
And Euler's formula can be easily generalized to:
$$
\vec{r}(t+h) \approx \vec{r}(t) + h ~ \vec{f}(\vec{r}, t)
$$
We can do the same for the Runge-Kutta formula.
#### Laptop practice: scipy.integrate.solve_ivp
$$
\begin{aligned}
\frac{dx}{dt} &= xy - x \\
\frac{dy}{dt} &= y - xy + \sin^2t
\end{aligned}
$$
Initial condition: x = y = 1 at t = 0
Plot x(t) for t = 0 to 10
Plot y(t) for t = 0 to 10 (in a separate plot)
Hint 1: Python calls the dependent variable y, but let's call it r, and it is now an array
Hint 2: r is an array r = [x,y],
initial condition is an array r0 = [x0, y0],
and f(t, r) returns an array [fx, fy]
Starter code:
```
def f(t, r):
    x = r[0]
    y = r[1]
    dxdt = ?
    dydt = ?
    return dxdt, dydt
from scipy.integrate import solve_ivp
sol = solve_ivp(f, t_span = ? , y0 = ?, t_eval=np.linspace(0,10,100))
t_sol = sol.?
x_sol = sol.?
y_sol = sol.?
plt.plot(t_sol, x_sol)
plt.plot(t_sol, y_sol)
```
# Week 12
### The rabbit-fox Problem
x(t): population size of rabbits
y(t): population size of foxes
$$
\begin{aligned}
\frac{dx}{dt} &= \alpha x - \beta x y \\
\frac{dy}{dt} &= \gamma x y - \delta y
\end{aligned}
$$
α: rabbits reproduce by themselves
β: rabbits die because of foxes
γ: foxes reproduce when they have rabbits to eat
δ: foxes die by themselves
Different sets of parameters will lead to different solutions
### Epidemiology: the SIR model
S(t): Susceptible (not sick)
I(t): Infectious (sick, can spread the disease)
R(t): Recovered (won't get sick again, won't spread the disease)
N: total population. N = S + I + R = constant
$$
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta I S }{N} \\
\frac{dI}{dt} &= \frac{\beta I S }{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
$$
The susceptible population decreases (some people get sick).
β: infection rate
Some become sick, and some people recover.
γ: recovery rate
#### Practice 1:
N = 1000, beta = 0.4, gamma = 0.04
N = S(t) + I(t) + R(t) (total population doesn't change)
Initial condition:
S(t=0) = 997
I(t=0) = 3
R(t=0) = 0
Solve: from t = 0 to 100
Plot S(t), I(t), R(t)
#### Practice 2:
Same initial condition
Let's keep gamma = 0.04, and change beta
Plot the following:
Plot I(t) for beta = 0.4
Plot I(t) for beta = 0.1
Plot I(t) for beta = 0.05
Plot I(t) for beta = 0.03
Calculate t = 0 to 1000
### $R_0$: basic reproduction number
$$
R_0 = \frac{\beta}{\gamma}
$$
$\beta$: infection rate
$\gamma$: recovery rate
$R_0 > 1$: the number of infected grows and the disease spreads through the population
$R_0 < 1$: the disease dies out on its own
"The number of people that one sick person will infect on average" [NPR](https://www.npr.org/sections/goatsandsoda/2021/08/11/1026190062/covid-delta-variant-transmission-cdc-chickenpox)
| R_0 | Virus |
| --- | ------ |
| 18 | measles |
| 12 | mumps |
| 10 | chickenpox |
| 9.5 | COVID (omicron strain) |
| 7 | COVID (delta strain) |
| 3 | COVID (original strain) |
### Vaccination: herd immunity
To contain a disease with a reproduction number $R_0$,
we need to vaccinate $(R_0 - 1)/R_0$ of the entire population.
E.g. if COVID has $R_0=3$, we need to vaccinate 2/3 of the population.
(To show this, we need to solve ODEs with a vaccinated population.)
See [here](https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology) for more models
### ODE recap: what does each number mean here?
sol = solve_ivp(f, t_span = (0, 10), y0 = [0, 1], t_eval=np.linspace(0,10,100))
Read the documentation
1. What is f (write an example)?
2. What is t_span?
3. What is y0?
4. What is t_eval?
5. How many differential equations are in this problem?
6. What is the initial condition of this problem?
7. What is sol.t?
8. What is sol.y?
## Second-order ODE
$$
\frac{d^2x }{dt^2} = f\left(x, \frac{dx}{dt}, t\right)
$$
Here is the trick:
$$
v = \frac{dx}{dt}
$$
Now we are solving two first-order differential equations!
$$
\frac{dx}{dt} = v \\
\frac{dv}{dt} = f(x, v, t)
$$
It's like one equation for position and the other equation for velocity.
Similarly, a third-order equation can be converted to 3 first-order equations, and so on.
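As a sketch of the trick, the harmonic oscillator $d^2x/dt^2 = -x$ with $x(0)=1$, $v(0)=0$, whose analytic solution is $\cos t$ (the tight tolerances are a choice, to make the comparison meaningful):

```python
import numpy as np
from scipy.integrate import solve_ivp

# d^2x/dt^2 = -x becomes the pair: dx/dt = v, dv/dt = -x
def f(t, r):
    x, v = r
    return [v, -x]

sol = solve_ivp(f, t_span=(0, 50), y0=[1, 0],
                t_eval=np.linspace(0, 50, 500), rtol=1e-8, atol=1e-8)
print(np.allclose(sol.y[0], np.cos(sol.t), atol=1e-4))  # matches cos(t)
```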
#### Practice
$$
\frac{d^2x}{dt^2} = -x
$$
Please don't solve it by hand :)
Initial condition: x=1, dx/dt = 0 at t = 0
Plot x(t) for t from 0 to 50
Step 1: convert it to 2 first-order equations
Step 2: use `scipy.integrate.solve_ivp`
Repeat for
$$
\frac{d^2x}{dt^2} = -x^3
$$
#### Practice
Projectile motion: initial height y(t=0)=0, initial velocity v(t=0)=20
(Please don't solve it by hand) g = 9.8
$$
\frac{d^2y}{dt^2} = -g
$$
Solve y(t) and make a plot
Initial conditions y(t=0)=0, v(t=0)=20
First, rewrite it into two first order equations, one for y and the other for v=dy/dt
Then solve them using scipy.integrate.solve_ivp, from t=0 to t=5
Plot y(t)
#### Practice
$$
\frac{d^2 x}{dt^2} + \frac{dx}{dt} + x = 0
$$
Initial condition: x(t=0)=3, dx/dt(t=0)=1
1. (by hand) convert it into 2 first-order ordinary differential equations
2. (code) solve x(t) from t=0 to 10
3. (code) plot x(t) from t=0 to 10
## Boundary value problem
### Example: Vertical projectile motion with an unknown initial velocity
All we know is that the projectile flew for 10 seconds
(Please don't solve it by hand)
$$
\frac{d^2y}{dt^2} = -g
$$
y(t=0) = 0, y(t=10) = 0
Find y(t)
So far we've only solved initial value problems. For example, we were given y(t=0)=0 and v(t=0)=20.
Here, we are given y(t=0) and y(t=10). How do we solve the problem?
### The Shooting Method
Trial-and-error to match the boundary value
Let's guess an initial velocity, e.g. v(t=0)=2. Does it lead to y(t=10)=0?
If we overshoot, we decrease the initial velocity.
If we undershoot, we increase the initial velocity.
Try v(t=0)=20 and calculate y(t=10). Is it larger or smaller than 0?
Should you increase or decrease the initial velocity?
Try a few initial velocities so that you get y(t=10) as close to 0 as you can!
(We'll do it systematically in the lab tomorrow.)
#### Practice
```
x = np.array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
y = np.array([ 0.07959564, 2.93576248, 3.11609594, 1.31749718, -1.75677014, -1.90673618, 0.08858298, 2.54574154, 2.99372428, 1.8276882, -1.52403515])
```
1. create an interpolation function y = f(x)
2. evaluate y at x0 = 1.5
We want to fit a model f(x)=a sin(x)
3. define a function chi_squared(a) that returns the sum of the square difference between the model and data
4. Find a that minimizes chi_squared
5. Plot the best-fit model together with the data
# Week 13: Monte Carlo Methods
Sometimes a system has some randomness and is described by probabilities. For example:
* an atom has a certain probability of decaying;
* a person has a certain probability of getting sick & transmitting disease.
Sometimes a function is difficult to integrate. We can use random numbers to sample this function
### Random numbers
Generating random numbers is a field of study on its own. We'll use Python's random number generator
`np.random.rand()` # gives a random number, uniformly selected between 0 and 1
`np.random.rand(10)` # gives 10 such numbers
There are non-uniform random numbers (e.g. Gaussian distribution) but let's focus on the uniform distribution today.
### Random seed
If we input a given number in "seed", it will give us the same random number every time.
`np.random.seed(42)` # here we can put in any integer
`print(np.random.rand())`
How about this?
`np.random.seed()` # with no argument, NumPy picks a fresh seed (e.g. from the operating system), so the numbers differ every run
`print(np.random.rand())`
#### Practice: Rolling two dice: 1 million times
Use `np.random.randint` (see the documentation; note the upper bound is exclusive, so a die is `np.random.randint(1, 7)`)
1. Generate two random numbers between 1 and 6
2. Generate 2x1,000,000 random numbers between 1 and 6. How many times do you get double six?
3. Plot a histogram of the sum of the two dice.
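One possible solution, using `np.random.randint(1, 7, ...)` since the upper bound is exclusive:

```python
import numpy as np
import matplotlib.pyplot as plt

n = 1_000_000
die1 = np.random.randint(1, 7, n)  # integers 1..6
die2 = np.random.randint(1, 7, n)

# double six: both dice show 6 (expected fraction: 1/36)
double_six = np.sum((die1 == 6) & (die2 == 6))
print(double_six)

# histogram of the sum (2..12), one bin per integer value
plt.hist(die1 + die2, bins=np.arange(1.5, 13.5, 1.0))
```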
#### A biased coin
We have a coin that shows heads 20% of the time.
Toss this coin 1000 times. How many heads do we get?
1. generate 1000 random numbers between 0 and 1, using np.random.rand()
2. count how many random numbers are less than 0.2 (use array filtering)
```
toss = np.random.rand(1000)
print(len(toss[toss < 0.2]))
```
### These problems are solved in the same way
* A coin-toss shows heads 20% of the time
* An atom has 20% probability to decay
* In a population, 20% of people get sick
* From 10000 data points, we randomly select 20% of the data points
If you want something to happen with 20% probability (e.g. a biased coin shows heads 20% of the time):
* draw random numbers in [0, 1) using np.random.rand();
* if the random number is less than 0.2, show heads;
* otherwise, show tails.
#### Practice: Radioactive decay
During a given time interval Δt, an atom A has some probability of decaying into an atom B.
This probability is given by
$p = 1 - 2 ^{-\Delta t/\tau}$
Let's consider τ = 180 s (the half-life), and Δt = 1 s.
Now assume we have 1000 A's. Let's simulate the number of A's and B's as a function of time, for t = 0, 1, 2, …, 1000 seconds.
First, set up `n_A = 1000, n_B = 0`.
1. At t = 1s, how many A's are left?
2. What about t = 2s?
3. Write a for-loop, for t = 1, 2, …, 1000s
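The loop might look like this, reusing the array-filtering trick from the biased-coin example:

```python
import numpy as np

tau = 180.0             # half-life in seconds
dt = 1.0                # time step
p = 1 - 2**(-dt / tau)  # decay probability per atom per time step

n_A, n_B = 1000, 0
history_A = [n_A]       # record n_A at t = 0, 1, 2, ... for plotting later
for t in range(1, 1001):
    # each surviving A decays with probability p during this step
    decays = np.sum(np.random.rand(n_A) < p)
    n_A -= decays
    n_B += decays
    history_A.append(n_A)

print(n_A, n_B)  # 1000 s is about 5.6 half-lives, so only a few percent of A remain
```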
### Monte Carlo integration
Classic example: to calculate the area of a quarter circle, we can spread beans randomly on the plane and count the beans inside vs. outside the circle.
We generate lots of random (x,y) pairs, and count how many pairs are inside vs. outside the area.
#### Practice
1. Generate 1,000,000 pairs of (x,y), which are random numbers between 0 and 1
2. Count how many points satisfy: `x**2 + y**2 < 1`
3. What's the fraction of such points? (should be `np.pi /4`)
(Yeah I know this is boring. If you have time, you can calculate the volume of a sphere in 10 dimensions.)
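A sketch of the quarter-circle estimate:

```python
import numpy as np

n = 1_000_000
x = np.random.rand(n)
y = np.random.rand(n)

# fraction of points inside the quarter circle of radius 1
inside = np.sum(x**2 + y**2 < 1)
frac = inside / n
print(frac, np.pi / 4)  # the two numbers should be close
```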
## Integrating an oscillating function
$$
I = \int_0^2 \sin^2\left(\frac{1}{x(2-x)}\right) dx
$$
The integrand oscillates like crazy.
Gaussian quadrature will not be accurate.
Strategy: put this function in a rectangle. This rectangle has area = 2 × 1 = 2.
Put lots of points in this rectangle, and count how many points fall under the curve.
#### Practice
1. define $f(x)$ and plot between 0 and 2
2. Calculate $\int_0^2 f(x)dx $ using scipy.integrate.quad (the error should be larger than usual)
3. Let's try Monte Carlo: generate 1,000,000 pairs of (x, y), with x in [0, 2] and y in [0, 1]. What fraction of the points satisfies $ y < \sin^2\left(\frac{1}{x(2-x)}\right)$?
The area of the rectangle is 2. What's the area under the curve?
Run this a few times. Do you get the same area every time?
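A hit-or-miss sketch for this integral:

```python
import numpy as np

def f(x):
    return np.sin(1.0 / (x * (2.0 - x)))**2

n = 1_000_000
x = 2 * np.random.rand(n)   # x uniform in [0, 2)
y = np.random.rand(n)       # y uniform in [0, 1)

# fraction of points under the curve, times the rectangle area (2 * 1 = 2)
area = 2.0 * np.sum(y < f(x)) / n
print(area)
```

Running this several times gives slightly different answers; the scatter is the Monte Carlo error.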
### The mean value method for Monte Carlo integration
$$
I = \int_a^b f(x) dx
$$
The mean of the integrand is given by:
$$
\langle f \rangle = \frac{1}{b-a}
\int_a^b f(x) dx = \frac{1}{b-a} I
$$
and we get
$$
I = (b-a) \langle f \rangle
$$
We can calculate f at randomly selected points, and take the average!
#### Practice
1. create x: 1,000,000 random values in [0, 2]
2. calculate f(x) at these random points
3. take the average of these values, and calculate $I = (b-a) \langle f \rangle$
Do you get the same answer as before?
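A mean-value sketch for the same oscillating integrand:

```python
import numpy as np

def f(x):
    return np.sin(1.0 / (x * (2.0 - x)))**2

a, b = 0.0, 2.0
n = 1_000_000
x = a + (b - a) * np.random.rand(n)  # uniform in [a, b)

# I = (b - a) * <f>
I = (b - a) * np.mean(f(x))
print(I)  # should be close to the hit-or-miss result
```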
# Week 14: Non-uniform Random Numbers and Mathematica
### Uniformly-distributed random numbers
So far, we've only looked at random numbers drawn from a uniform distribution.
For example, `np.random.rand()` gives us a random number uniformly selected between 0 and 1.
If we draw lots of them, we will get a flat histogram:
```
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(1000000)
plt.hist(x)
```
### Gaussian random numbers
But sometimes we want a non-flat distribution. For example, a Gaussian (normal) distribution with mean μ (mu) and standard deviation σ (sigma)
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
$$
`numpy.random.normal`
```
import numpy as np
import matplotlib.pyplot as plt
mu, sigma = 0, 0.1
x = np.random.normal(mu, sigma, 1000000)
plt.hist(x, bins=np.linspace(-0.5,0.5,101))
```
### Drawing from an arbitrary distribution function
NumPy and SciPy have many more built-in PDFs (probability density functions), e.g. the Poisson distribution, the Gamma distribution, ….
But what if we want to draw from a distribution that isn't built in?
For example, this distribution
$$
f(x) = 2^{-x/\tau} \frac{\ln 2}{\tau}
$$
Or you might have a histogram from experiments.
### Probability Density Function (PDF) vs. Cumulative Distribution Function (CDF)
Example:
f(x) is the PDF of people's heights
F(x) is the CDF of f(x)
* f(70) * 0.2 ≈ the probability that a person's height is between 69.9 and 70.1
* F(70) = the probability that a person's height is below 70
### Probability density functions (PDFs)
Example: the normal distribution
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
$$
It's usually normalized to 1, meaning
$$
\int_{-\infty}^{\infty} f(x)dx = 1
$$
Its cumulative distribution function (CDF) is the probability of falling below x and is given by
$$
F(x) = \int_{-\infty}^{x} f(x') dx'
$$
The value of F is always between 0 and 1.
### More on normalization
$$
\int_{-\infty}^{\infty} f(x)dx = 1
$$
Basically, it means the area under the curve is 1. If it's not, then you can simply divide f(x) by the total area.
(All probabilities sum up to 1)
When you plot a histogram using plt.hist(), set density=True so that the area of the histogram is one.
### PDF and CDF of the Gaussian distribution
PDF
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
$$
CDF
$$
F(x) = \int_{-\infty}^x f(x') dx'
= \frac{1}{2}\left[ 1 + erf\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)
\right]
$$
Here "erf" is the "error function"
$$
erf(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2} dt
$$
It doesn't have a closed-form expression, but it is very widely used (scipy.special.erf).
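We can sanity-check the erf formula against a direct numerical integral of the Gaussian PDF:

```python
import numpy as np
from scipy.special import erf
from scipy.integrate import quad

mu, sigma = 0.0, 1.0

def pdf(x):
    return np.exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * np.sqrt(2 * np.pi))

# CDF from the erf formula
def cdf(x):
    return 0.5 * (1 + erf((x - mu) / (sigma * np.sqrt(2))))

# compare with a direct numerical integral of the PDF up to x0
x0 = 1.0
numeric, _ = quad(pdf, -np.inf, x0)
print(numeric, cdf(x0))  # both ≈ 0.8413 for the standard normal
```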
### How to draw random numbers matching an arbitrary pdf f(x)
1. Calculate the CDF: y = F(x)
2. Draw uniform random numbers for y between 0 and 1
3. Invert to convert from y to x: $x = F^{-1}(y)$
Question: how do we draw 1,000,000 points so that their histogram follows this function?
$$
f(x) = 2^{-x/\tau} \frac{\ln 2}{\tau}
$$
Step 1: calculate the CDF F(x), numerically or analytically
Step 2: draw 1,000,000 random numbers uniformly between 0 and 1, call them yi
Step 3: solve yi = F(xi), numerically or analytically, and xi are our desired random numbers!
#### Practice
$$
f(x) = 2^{-x} {\ln 2}
$$
1. Its CDF is given by $F(x)=1-e^{-x\ln2}$. Make a plot of both f(x) and F(x) between 0 and 10.
2. (By hand) Set y = F(x) and calculate the inverse function $x = F^{-1}(y)$.
3. Code up this function Finv(y).
4. Draw 1,000,000 random numbers uniformly between 0 and 1; call them yi.
5. Use $x = F^{-1}(y)$ to convert yi to xi.
6. Plot a histogram of xi (with density=True), using bins=100.
7. Overplot f(x). Does the histogram agree with f(x)?
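A sketch of steps 3–7 (spoiler for step 2: inverting $y = 1-e^{-x\ln 2}$ gives $x = -\ln(1-y)/\ln 2$):

```python
import numpy as np
import matplotlib.pyplot as plt

# inverse of F(x) = 1 - e^(-x ln 2), worked out by hand
def Finv(y):
    return -np.log(1 - y) / np.log(2)

y = np.random.rand(1_000_000)   # uniform in [0, 1)
x = Finv(y)                     # these follow the desired distribution

# the histogram should follow f(x) = 2^(-x) ln 2
plt.hist(x, bins=100, range=(0, 10), density=True)
xs = np.linspace(0, 10, 200)
plt.plot(xs, 2.0**(-xs) * np.log(2))
```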
### Importance sampling
In the "mean-value method" for Monte Carlo integration, we draw uniform random numbers xi, evaluate f(xi) at each point, and calculate the mean.
But we don't have to draw uniform random numbers.
We can draw non-uniform random numbers to sample our function more efficiently.
If our samples follow a non-uniform distribution p(x), the integral is given by
$$
\int_a^b f(x) dx \approx \left(\int_a^b p(x)dx \right)\frac{1}{N}
\sum_{i=1}^{N}\frac{f(x_i)}{p(x_i)}
$$
#### Practice: mean value
Consider this integral:
$$
I = \int_0^1 f(x)\, dx, \qquad f(x) = \frac{x^{-1/2}}{e^x + 1}
$$
1. Draw 1 million uniform random numbers between 0 and 1.
2. Evaluate f(x) at each of these points.
3. Calculate the mean and $I = (b-a) \langle f \rangle$.
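A mean-value sketch (note the integrand diverges at x = 0, so this estimate is noisier than usual):

```python
import numpy as np

def f(x):
    return x**(-0.5) / (np.exp(x) + 1)

a, b = 0.0, 1.0
n = 1_000_000
x = np.random.rand(n)            # uniform in [0, 1)

# I = (b - a) * <f>
I = (b - a) * np.mean(f(x))
print(I)
```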
#### Practice: importance sampling (step 1: non-uniform random number)
$$
p(x) = \frac{1}{2\sqrt{x}} ~ \mbox{for} ~ 0 < x \le 1
$$
1. (By hand) Calculate the cumulative probability distribution y = F(x)
2. (By hand) calculate the inverse $x = F^{-1}(y)$
3. Draw 1 million points for y: uniform random number between 0 and 1
4. Convert y to x
#### Practice: importance sampling (step 2: evaluation)
$$
I \approx \left(\int_0^1 p(x) dx \right) \frac{1}{N}\sum_{i=1}^{N}\frac{f(x_i)}{p(x_i)}
$$
Using the xi samples from step 1, calculate the mean of f(xi) / p(xi).
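Putting steps 1 and 2 together (for this p(x), the hand calculation gives $F(x)=\sqrt{x}$ and $F^{-1}(y)=y^2$):

```python
import numpy as np

def f(x):
    return x**(-0.5) / (np.exp(x) + 1)

def p(x):
    return 1.0 / (2.0 * np.sqrt(x))

# step 1: draw x following p(x); F(x) = sqrt(x), so x = F^{-1}(y) = y^2
n = 1_000_000
y = np.random.rand(n)
x = y**2

# step 2: I ≈ (∫p dx) * mean(f/p); here ∫_0^1 p(x)dx = 1
I = np.mean(f(x) / p(x))
print(I)
```

Because f/p = 2/(e^x + 1) is bounded, this estimate has far less scatter than the plain mean-value method.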
## Mathematica
Go to https://www.wolfram.com/siteinfo/ and enter your Boise State email
Select the option to "Activate through your organization (SSO)".
Sign in with your University account.
## Mathematica Basics
```
Print["Hello World"]
(* Let's make a simple plot *)
(* You can use a space instead of * for multiplication *)
Plot[0.5 Sin[x]/x, {x, -10, 10}]
(* Define a function and plot it *)
f[x_] := x^2 Exp[-x^2]
Plot[f[x], {x, -3, 3}]
(* Complex number *)
a = 1+ 2 I
a^2
```
#### Practice: integration
$$
f(x) = x^4 - 2x + 1
$$
First, plot this function
Indefinite integral
$$
\int f(x) dx
$$
Definite integral
$$
\int_0^2 f(x) dx
$$
Definite integral with limit (a, b)
$$
\int_a^b f(x) dx
$$
#### Practice: matrix operations
Pauli matrices
$$\sigma_1 = \sigma_x
=\begin{bmatrix}
0 & 1 \\
1 & 0
\end{bmatrix}
$$
$$\sigma_2 = \sigma_y
=\begin{bmatrix}
0 & -i \\
i & 0
\end{bmatrix}
$$
$$\sigma_3 = \sigma_z
=\begin{bmatrix}
1 & 0 \\
0 & -1
\end{bmatrix}
$$
Verify that
$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = -i \sigma_1\sigma_2\sigma_3 = I$
$\det \sigma_j = -1$
${\rm tr}~ \sigma_j = 0$
Calculate the inverse, determinant, trace:
$$
\begin{bmatrix}
a & b & c \\
d & e & f \\
g & h & i
\end{bmatrix}
$$
### non-linear equations
$$
5 e^{-x} + x - 5 = 0
$$
You can also solve a system of simultaneous, non-linear equations (we didn't do this in class).
```
sol = FindRoot[5 Exp[-x] + x - 5 == 0, {x, 5}]
Print[x /. sol]
```
### minimization
```
(* Minimize the Rosenbrock function f(x, y) = 100 (y - x^2)^2 + (1 - x)^2 *)
Minimize[100 (y - x^2)^2 + (1 - x)^2, {x, y}]
```
### Resources
[BYU PHYS 230 lab notes](https://physics.byu.edu/courses/computational/phys230)
[Itsuko S Suzuki lecture notes (SUNY Binghamton)](https://bingweb.binghamton.edu/~itsuko/CompPhysLN.html)