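numpy
===

## What is numpy
numpy is a Python module.

## Why use it
Compared with Python's dynamically allocated list, numpy is written in C and Fortran underneath and applies operations to whole arrays at once in compiled code, so it is more efficient than a list.

## A foundation before learning other modules
In addition, most of Python's other heavyweight data science packages (for example Pandas, SciPy, and Scikit-learn) are built on top of Numpy, so learning numpy first lays a foundation that makes the other modules quicker to pick up.

# The focus of numpy is array operations
The focus of Numpy is operating on arrays: building homogeneous, multidimensional arrays, the ndarray.

## Turning a list into a numpy.array
Just call
```python=
>>> numpy.array(my_list)  # my_list is any Python list
```
and it is converted to a numpy.array, after which you can use the array features.

## Creating arrays
```python=
import numpy as np

dim = (2, 3)
np1 = np.zeros([2, 3])
# array([[0., 0., 0.], [0., 0., 0.]])
np2 = np.ones([2, 3])
# array([[1., 1., 1.], [1., 1., 1.]])
np3 = np.zeros(dim, dtype=int)
# array([[0, 0, 0], [0, 0, 0]])
# dtype switches the elements from float to int; the shape can also be given as a tuple
```

## Creating an identity matrix
```python=
import numpy
print(numpy.identity(3))  # 3 gives a 3 x 3 array

# Output
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
```

## Creating an eye array
```python=
import numpy
print(numpy.eye(8, 7, k=1))  # 8 x 7 array with 1s on the first upper diagonal

# Output
# [[0. 1. 0. 0. 0. 0. 0.]
#  [0. 0. 1. 0. 0. 0. 0.]
#  [0. 0. 0. 1. 0. 0. 0.]
#  [0. 0. 0. 0. 1. 0. 0.]
#  [0. 0. 0. 0. 0. 1. 0.]
#  [0. 0. 0. 0. 0. 0. 1.]
#  [0. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 0.]]

print(numpy.eye(8, 7, k=-2))  # 8 x 7 array with 1s on the second lower diagonal

# Output
# [[0. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 0.]
#  [1. 0. 0. 0. 0. 0. 0.]
#  [0. 1. 0. 0. 0. 0. 0.]
#  [0. 0. 1. 0. 0. 0. 0.]
#  [0. 0. 0. 1. 0. 0. 0.]
#  [0. 0. 0. 0. 1. 0. 0.]
#  [0. 0. 0. 0. 0. 1. 0.]]
```

## Arithmetic on arrays: add, subtract, multiply, divide, remainder, quotient
The usual operators (`+`, `-`, `*`, `/`, `%`, `//`) are all you need; they work element by element.

## Changing the type in an array
numpy.array is much like a list, except that all of its elements must have the same type. The type can also be passed as an argument to convert the elements.
```python=
>>> numpy.array([1, 2, 3])
array([1, 2, 3])
>>> numpy.array([1, 2, 3], float)
array([1., 2., 3.])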
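```

## Changing dimensions
```python=
import numpy
my_array = numpy.array([1, 2, 3, 4, 5, 6])
print(numpy.reshape(my_array, (3, 2)))

# Output
# [[1 2]
#  [3 4]
#  [5 6]]
```
A one-dimensional array is usually called a `vector`, and a two-dimensional array a `matrix`.

## Another way to add a dimension
```python=
import numpy as np
test = np.array([[1, 2], [3, 4]])
print(test.shape)       # (2, 2)
test = test[..., None]  # indexing with None appends a new axis of length 1
print(test)
print(test.shape)       # (2, 2, 1)
```

## transpose
```python=
import numpy
my_array = numpy.array([[1, 2, 3],
                        [4, 5, 6]])
print(numpy.transpose(my_array))

# Output
# [[1 4]
#  [2 5]
#  [3 6]]
```
Also, I tried passing a plain list as my_array, and that works just as well.

## flatten
Turns an N$\times$M array into a 1$\times$(N*M) array.
```python=
import numpy
my_array = numpy.array([[1, 2, 3],
                        [4, 5, 6]])
print(my_array.flatten())

# Output
# [1 2 3 4 5 6]
```

## concatenate
Joins two arrays along the chosen axis.
```python=
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.concatenate((a, b.T), axis=1)
array([[1, 2, 5],
       [3, 4, 6]])
>>> np.concatenate((a, b), axis=None)
array([1, 2, 3, 4, 5, 6])  # same result as flattening both
```

## ceil, floor, rint
rint rounds to the nearest integer, ceil rounds up, and floor rounds down.

## numpy.set_printoptions
Controls how arrays are displayed: you can set the precision, control how the sign of positive and negative numbers is shown, and choose whether the output follows the current numpy 1.14 format or the older 1.13 one.
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.set_printoptions.html

## numpy.all(), numpy.any()
Check whether the elements of an array are True.
With axis=0 the test runs down each column.
all requires every element to be True.
any is True as soon as one element is True.

## numpy.random.choice(a, size=None, replace=True, p=None)
a: if it is an array, samples are drawn from it; if it is an int, it is treated as range(a).
size: the shape of the array you want to generate.
replace: whether to sample with replacement, that is, whether the same element may be drawn more than once.
p: the probability assigned to each entry of a; if omitted, the distribution is uniform.

## Finding the maximum and minimum
np.argmax
np.argmin
These return the index of the largest or smallest element. Use the axis argument to search by column or by row: axis=0 looks down each column, axis=1 along each row, and if it is omitted (None) the search covers all elements.

## Inner and cross products
Use numpy.cross for the cross product and numpy.dot for the dot product.

## Polynomials
A vector can also be treated as the coefficients of a polynomial, which you can then integrate, differentiate, or solve for roots.

poly: returns the coefficients of the polynomial whose roots are the given vector
```python=
print(numpy.poly([-1, 1, 1, 10]))
# Output: [  1 -11   9  11 -10]
```
roots: finds the roots
```python=
print(numpy.roots([1, 0, -1]))
# Output: [-1.  1.]
```
polyint: indefinite integral
```python=
print(numpy.polyint([1, 1, 1]))
# Output: [0.33333333 0.5 1. 0.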
# ]
```
polyder: derivative
```python=
print(numpy.polyder([1, 1, 1, 1]))
# Output: [3 2 1]
```
polyval: evaluates the polynomial at a given value
```python=
print(numpy.polyval([1, -2, 0, 2], 4))
# Output: 34
```
polyfit: fits a polynomial of a specified degree to a set of data using a least-squares approach, giving the polynomial with the smallest error
```python=
print(numpy.polyfit([0, 1, -1, 2, -2], [0, 1, 1, 4, 4], 2))
# Output: [ 1.00000000e+00  0.00000000e+00 -3.97205465e-16]
```

#### Cross product vs. outer product, dot product vs. inner product: is there a difference within each pair?
In the math I had seen there seemed to be none (in Chinese, 叉積/外積 and 點積/內積 are often used interchangeably), but numpy's dot, cross, outer, and inner give different answers, which made me doubt it.
https://blog.csdn.net/u011599639/article/details/77926402
That post has an introduction; here is a summary.

##### Cross product:
Most problems I had met before had a and b both of size 1$\times$N. For the N$\times$N case, treat each row separately, which brings you back to the 1$\times$N case.

$$
\begin{aligned}\mathbf {a\times b} &={\begin{vmatrix}a_{2}&a_{3}\\b_{2}&b_{3}\end{vmatrix}}\mathbf {i} -{\begin{vmatrix}a_{1}&a_{3}\\b_{1}&b_{3}\end{vmatrix}}\mathbf {j} +{\begin{vmatrix}a_{1}&a_{2}\\b_{1}&b_{2}\end{vmatrix}}\mathbf {k} \\&=(a_{2}b_{3}-a_{3}b_{2})\mathbf {i} -(a_{1}b_{3}-a_{3}b_{1})\mathbf {j} +(a_{1}b_{2}-a_{2}b_{1})\mathbf {k} \end{aligned}
$$

The cross product of a and b in R^3 is a vector perpendicular to both a and b, just as taught in high school.

##### Outer product:
If a and b are not one-dimensional, they are first flattened to one dimension and then the outer product is taken.
```python=
# Schematic: for a = [a0, a1, ..., aM] and b = [b0, b1, ..., bN],
# numpy.outer(a, b) is
# [[a0*b0  a0*b1  ...  a0*bN]
#  [a1*b0    .
#  [ ...          .
#  [aM*b0            aM*bN]]
import numpy
print(numpy.outer([1, 2, 3], [4, 5]))
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
```

##### dot product
1. If both `a` and `b` are 1-D arrays, it is the inner product of vectors (without complex conjugation).
2. If both `a` and `b` are 2-D arrays, it is matrix multiplication, but using `matmul` or `a @ b` is preferred.
3. If either `a` or `b` is 0-D (scalar), it is equivalent to multiply, and using `numpy.multiply(a, b)` or `a * b` is preferred.

:::warning
Items 4 and 5 below look odd to me: it seems you have to index the result, as in np.dot(a,b)[1,2,3] with [] and numbers, before it reads as a sum product.
:::

4. 
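   If `a` is an N-D array and `b` is a 1-D array, it is a sum product over the last axis of `a` and `b`.
5. If `a` is an N-D array and `b` is an M-D array (where M>=2), it is a sum product over the last axis of `a` and the second-to-last axis of `b`.

For 1-D inputs, the dot product is the same as the inner product.

## Difference between @ and *
@ is the matrix multiplication we already know: a (3,1) array can only be multiplied by a (1,N) array.
\* is meant for scalars, but when applied to arrays it multiplies the corresponding entries as scalars (element-wise). That is a bit of a mouthful, so see the \* example below.
```python=
import numpy as np
a = np.array([[1], [2], [3]])
b = np.array([[4], [5], [6]])
print(a.shape, b.shape)
print(a * b)
# (3, 1) (3, 1)
# [[ 4]
#  [10]
#  [18]]
# a @ b would raise an error here
```

## [np.pad](https://blog.csdn.net/zenghaitao0128/article/details/78713663)
When working with convolutional neural networks, the borders often need to be padded; in Python this is commonly done with np's pad.

## Linear Algebra
You can find the determinant, the eigenvectors and eigenvalues, or the inverse of a matrix.

# Pitfalls you may run into with np
## Adding arrays
```python=
import numpy as np
a = np.array([[1, 2, 3], [0, 0, 0]])
b = np.array([4, 5, 6])
print(a.shape, b.shape)
print(a + b)
# Output
# (2, 3) (3,)
# [[5 7 9]
#  [4 5 6]]

# Another example
a = np.array([1, 2, 3])
b = np.array([[4], [5], [6]])
print(a.shape, b.shape)
print(a + b)
# (3,) (3, 1)
# [[5 6 7]
#  [6 7 8]
#  [7 8 9]]
```
Note that an array of shape (3,) is broadcast against whatever it is added to or multiplied with, so the shape of the result depends on the other operand.
Reshape it to (3,1) first so the addition behaves as expected.
:::success
So how do you fix its shape to (3,1)?
After thinking about it for a long time I finally found that you can simply use `a.reshape(3, 1)` (or `np.reshape(a, (3, 1))`) and that's it :+1:
:::

# Concepts
## Difference between array and matrix
A matrix is one kind of array: an array can be 1D, 2D, and so on, while a matrix is restricted to 2D.

## Difference between ndarray and array
array is not a class of its own.
np.array is a function, and what it returns is actually of type ndarray.
You can confirm this with type.
```python=
import numpy as np
a = np.array([[1, 2, 3], [0, 0, 0], [7, 8, 9]])
b = np.array([[4, 5, 6]])
print(type(a), type(b))
# <class 'numpy.ndarray'> <class 'numpy.ndarray'>
```

# Sources
https://blog.techbridge.cc/2017/07/28/data-science-101-numpy-tutorial/
https://www.hackerrank.com/dashboard

###### tags: `python`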
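
As a supplement to the array-addition pitfall described earlier in these notes, here is a minimal sketch of what the reshape fix does. The variable names (`a_col`, `broadcast_sum`, `column_sum`) are made up for illustration and are not from the original notes.

```python
import numpy as np

a = np.array([1, 2, 3])        # shape (3,)
b = np.array([[4], [5], [6]])  # shape (3, 1)

# Without reshaping, (3,) and (3, 1) broadcast against each other to (3, 3).
broadcast_sum = a + b

# Reshaping a to a column first keeps the addition element by element.
a_col = a.reshape(3, 1)
column_sum = a_col + b

print(broadcast_sum.shape)
print(column_sum.shape)
print(column_sum.ravel())
```

Reshaping is only one option; which shape is right depends on whether you actually want every pairwise sum (the broadcast result) or a row-by-row sum (the column result).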