Python Notes

# Python Basics

## Python is Object-Oriented

Q: What is an object?
A: An object is essentially ++a collection of data, together with functions that can operate on that data++. Almost everything you encounter in Python is an object, including data structures such as strings, lists, arrays, and dictionaries. Each kind of object comes with its own methods that can act on it.

> method: a ++function++ that operates on an object.

For example, suppose we have a string:

```python
c = "My dog's name is Bingo"
```

One of the methods available on a string object is `split()`:

```python
c.split()
>>> ['My', "dog's", 'name', 'is', 'Bingo']
```

> `split()` acts on the string `c` that it is attached to with `.`. It splits the string at whitespace into (in this case) five separate strings and returns a ++list++ containing them.

We can also pass an argument to `split()`, for example:

```python
c.split('g')
>>> ['My do', "'s name is Bin", 'o']
```

> This cuts the string `c` at every occurrence of `g`.

From this example we see that `split()` is a string method; there are many other string methods besides `split()`.

Besides methods, an object also has instance variables, which are ++the data stored with the object++. ++Methods and instance variables++ together are called attributes.

$\rightarrow$ Every kind of object (strings, lists, arrays, dictionaries, ...) has its own set of attributes, and that set is uniquely associated with the object's class.

For example, `split()` is an attribute of strings but not of arrays.

# list / dictionary / tuple / set

## list

list: a collection of Python objects indexed by ++int++, starting from 0.

- multidimensional lists and tuples

```python
a = [[3, 9], [8, 5], [11, 1]]

print(a[1])
>>> [8, 5]

print(a[1][0])
>>> 8
```

> The same idea applies to tuples, except that the contents of a tuple cannot be changed.

### Creating a list

- Create an empty list

```python
mylist = []
```

- Create a list of n elements, all initialized to 0 (here n = 5)

```python
mylist = [0]*5
print(mylist)
>>> [0, 0, 0, 0, 0]
```

- Create a list of n elements where the first is initialized to -1 and the rest to 0

```python
mylist = [-1] + [0]*4
print(mylist)
>>> [-1, 0, 0, 0, 0]
```

### Deleting list elements at given indices

If we do not need the deleted elements returned, we can use `del`. In the example below we want to delete `'b'` and `'d'`, which sit at indices 1 and 3:

```python
my_list = ['a', 'b', 'c', 'd', 'e']
indices_to_remove = [1, 3]

for index in sorted(indices_to_remove, reverse=True):
    del my_list[index]

print(my_list)
>>> ['a', 'c', 'e']
```

>
> First we sort the indices to delete with `sorted()`, then delete them one at a time.
>
> $\rightarrow$ Note that `reverse` is set to `True`. If we deleted in ascending index order, then after deleting index 1 the later elements would shift down by one, so deleting index 3 next would remove `'e'` instead of `'d'`.

## dictionary

### Basic properties

dictionary: a collection of Python objects whose indices (called keys) can be strings or numbers.

$\rightarrow$ A ++key++ does not have to be an int; it can even be a tuple. ++Almost any Python object can be a key as long as it is immutable (more precisely, hashable), so a list, for example, cannot be used as a key++.

Example:

```python
room = {"Emma":309, "Jake":582, "Olivia":764}
```

> Entries in a dictionary are separated by `,`. Each entry consists of a key and a value, separated by `:`.
>
> $\rightarrow$ Accessing a dictionary entry works like accessing a list element, except that a list uses an index while a dictionary uses a key.

An example that mixes different kinds of keys:

```python=
weird = {"tank":52, 846:"horse", "bones":[23, 'fox', 'grass'], "phase":'I am here'}

print(weird[846])
>>> horse

print(weird["bones"])
>>> [23, 'fox', 'grass']
```

> When accessing a dictionary element, an int key needs no quotation marks: on line 3 we write `weird[846]`, not `weird["846"]`.
>
> A string key, on the other hand, must be quoted. Single quotes `'` and double quotes `"` are interchangeable (although `"` is used by convention here).
>
> $\rightarrow$ Even if a key was written with double quotes when the dictionary was defined, it can be accessed with single quotes: changing `weird["bones"]` on line 6 to `weird['bones']` gives the same result.

### Creating a dictionary

Create an empty dictionary:

```python
d = {}
```

Once we have the empty dictionary `d`, we can add entries to it:

```python=
d["last name"] = "Alberts"
d["first name"] = "Marie"
d["birthday"] = "January 27"
print(d)
>>> {'last name': 'Alberts', 'first name': 'Marie', 'birthday': 'January 27'}
```

> What goes inside `d[]` plays the same role as the index used to access a list element, except that for a dictionary the index is a key. On line 1, `"last name"` is the key and `"Alberts"` is the value that key maps to.

### Counting with a dictionary

Suppose we need to count how many times each word appears in a list. We can build a dictionary that stores each word (key) and its number of occurrences (value). Example:

```python=
word_list = ['apple', 'banana', 'orange', 'apple', 'banana', 'apple']
word_counts = {}

for word in word_list:             # for each element of word_list
    if word in word_counts:        # check whether it is already a key in word_counts
        word_counts[word] += 1     # if so, increment its value (the occurrence count)
    else:                          # otherwise the word is not in the dictionary yet
        word_counts[word] = 1      # add it with a value (occurrence count) of 1

print(word_counts)
>>> {'apple': 3, 'banana': 2, 'orange': 1}
```

> `word_counts[word]` is the value that the key `word` maps to in the dictionary `word_counts`.
> $\rightarrow$ When we write `word_counts[word]`, Python looks up the key `word` in `word_counts` and returns its value.

### Using max to find the largest value in a dictionary and its key

```python
my_dict = {'apple': 10, 'banana': 20, 'orange': 15}

max_key = max(my_dict, key=my_dict.get)

print("Key with the greatest value:", max_key)
print("Value of the key:", my_dict[max_key])

>>> Key with the greatest value: banana
>>> Value of the key: 20
```

> The `key` parameter of `max()` is set to the function `my_dict.get`, which looks up the value of each key in `my_dict`. `max()` compares the keys by these values and returns the ++key++ whose value is largest.
>
> $\rightarrow$ Although `max()` returns the largest "item" of an iterable, what is returned and assigned to `max_key` here is not a value from the dictionary (that is, `max_key` is `banana`, not its value `20`).
>> This is because the iterable passed to `max()` is the dictionary `my_dict`, and iterating through a dictionary iterates over its keys by default.
>> So setting `key=my_dict.get` means each key is ranked by its value, but the "maximum" that `max()` returns is still a key.

## tuple

tuple: an immutable list, written with `()`

## set

What a set looks like:

```python
this_is_a_set = {1, 'apple', (2, 3), 4.5}
```

$\rightarrow$ A set is written with curly braces `{}`

$\rightarrow$ Its elements can have different types; the four elements here are, in order, an int, str, tuple, and float

- Note: although a set is written with curly braces `{}`, an empty set cannot be created with `new_empty_set = {}` the way an empty list is created with `new_empty_list = []`
> $\Rightarrow$ because `{}` is interpreted as an empty dictionary

To create an empty set, use: `new_empty_set = set()`

### set properties

- unordered
> $\rightarrow$ so elements of a set cannot be accessed by index
>> e.g. with a list we can write `target = mylist[0]`
>>
> $\rightarrow$ slicing is not available either
>> e.g. with a list we can write `newlist = oldlist[3:6]`

- elements in a set are unique
> $\rightarrow$ so adding an element that is already in the set is ignored
>> For example, after adding `"apple"` to a set twice, the set still contains only one
`"apple"`:

```python
myset = set()
myset.add("apple")
print(myset)
myset.add("apple")
print(myset)

>>> {'apple'}
>>> {'apple'}
```

### Uses of set

- A set is often used to check ++whether duplicates exist++, for example:

```python
def has_duplicate(nums: list[int]):
    seen_numbers_set = set()
    for num in nums:
        print("\ncurrent number examined is:", num)
        if num in seen_numbers_set:
            print("Exists duplicate")
            return True
        seen_numbers_set.add(num)
        print("Current seen numbers set is:", seen_numbers_set)
    return False

numbers_list = [1, 4, 5, 6, 2, 3, 1]
ans = has_duplicate(numbers_list)
print("Duplicate in this list?", ans)
```

output:

<img src="https://hackmd.io/_uploads/B1oL1j_W0.jpg" alt="output" width="70%">

# Some commonly used functions

## Getting input

There are two ways to get input: use `input` to let the user type it in, or use `loadtxt` to read it from a file (such as a txt or csv file).

### input

Python has a function called `input` that reads input from the user, which we can then assign to a variable. For example:

![image](https://hackmd.io/_uploads/Syama87b0.png)

When this `input` call runs, the string argument inside the parentheses is displayed on screen:

![image](https://hackmd.io/_uploads/HkU3h87ZR.png)

The program then waits for the user to type something, and the input is assigned to the variable `distance`. Suppose the user enters `450`; then:

![image](https://hackmd.io/_uploads/H1xg6IQbA.png)

$\rightarrow$ `input` returns a string, so to turn the input into a floating-point number, wrap the call in `float()`:

```python
distance = float(input("Input trip distance(miles): "))
```

### loadtxt

#### Reading input from a txt file

Suppose we have a file called `mydata.txt` that contains some experimental data and related information, like this:

```=
Data for experiment
Date: 16-Aug-2016
Data taken by L and J

data point  time(sec)  height(mm)  uncertainty(mm)
0           0.0        180         3.5
1           0.5        182         4.5
2           1.0        178         4.0
3           1.5        165         5.5
```

We want to read each column of this data into its own named array. One way to do that is with NumPy's `loadtxt` function:

![image](https://hackmd.io/_uploads/H1WS55mWR.png)

Example:

```python
import numpy as np

dataPt, time, height, error = np.loadtxt("mydata.txt", skiprows = 5, unpack = True)
```

Here `loadtxt` is called with three arguments:

1. The file name as a string
> in this example, `"mydata.txt"`
2. How many rows to skip at the top of the file (the header)
> in this example the first five rows are not part of the data itself, so they are skipped
3. `unpack`: if `unpack = True` (the default is `False`), each column is split off into a separate array; if `unpack = False`, all the data read is returned in a single array.
> In this example `unpack = True`, so the four columns of data are unpacked into four arrays.
>
> $\rightarrow$ With the default `unpack = False`
>
> ![image](https://hackmd.io/_uploads/H1SHei7bA.png)
>
> the result is:
>
> ![image](https://hackmd.io/_uploads/HyCOxi7bC.png)

- Note: `dataPt`, `time`, `height`, and `error` are all NumPy arrays, not lists, as the code below shows:

![image](https://hackmd.io/_uploads/BkpPC5Qb0.png)

which prints:

![image](https://hackmd.io/_uploads/rJdC09X-C.png)

Their elements can still be accessed by index, just as with a list, for example:

![image](https://hackmd.io/_uploads/B123JsQZ0.png)

The result is:

![image](https://hackmd.io/_uploads/SySRyoX-R.png)

## Deleting

### del

Use `del` to delete the element at a specific index of a list when the deleted value does not need to be returned.

> If the deleted element should also be returned, use `pop()` instead.

Example:

```python
my_list = [1, 2, 3, 4, 5]
n = 2

del my_list[n]

print(my_list)
>>> [1, 2, 4, 5]
```

- Note: `del` modifies the list in place. So if we have a list A and ++assign list A to list B++ (which binds both names to the same list rather than copying it), then using `del` on list A also changes what list B shows. Example:

```python
list_A = [1, 2, 3]
list_B = list_A

del list_A[0]

print("List A: ", list_A)  # Output: [2, 3]
print("List B: ", list_B)  # Output: [2, 3]

>>> List A: [2, 3]
>>> List B: [2, 3]
```

> $\rightarrow$ `list_A` and `list_B` refer to the same list object in memory, so any change made through `list_A` is also visible through `list_B`.

If we do not want `del list_A[0]` to change the contents of `list_B` as well, we can use `copy()` and replace `list_B = list_A` with `list_B = list_A.copy()`:

```python
list_A = [1, 2, 3]
list_B = list_A.copy()

del list_A[0]

print("List A: ", list_A)
print("List B: ", list_B)

>>> List A: [2, 3]
>>> List B: [1, 2, 3]
```

### Removing certain kinds of characters: join

Example: use `join()` to remove every character that is not a letter

```python
original_string = "Hello, 123world!"
result_string = ''.join(char for char in original_string if char.isalpha())
print(result_string)
>>> Helloworld
```

> The generator expression iterates over every character in `original_string` and calls `isalpha()` on each one to check whether it is a letter (`isalpha()` accepts any Unicode letter, not only English ones). `join()` then concatenates the characters that pass the check and returns a single string.

- Syntax of `join()`:

```python
string.join(iterable)
```

> `iterable` (a list, tuple, ...) holds the items to concatenate; `string` is the separator, that is, what gets inserted between every two elements when they are concatenated (for example a space).

For example:

```python
words = ['Hello', 'world', 'how', 'are', 'you?']
result_string = ' '.join(words)
print(result_string)
>>> Hello world how are you?
```

> In this example `string` is a single space `' '`, so a space is inserted between the strings of the iterable `words` as they are concatenated.

Note:
- `join()` is called on `string` (the separator)
- `join()` is typically used when concatenating a large number of strings, to avoid the performance overhead of concatenating with `+`.

#### Concatenating in reverse

A common exercise is to output the words of a sentence in reverse order. One approach is to first split the sentence into words (with `split()`; see the `split` section below) and then reverse the list with slicing, as in this example:

```python
my_list = ['apple', 'banana', 'orange']
result = ' '.join(my_list[::-1])
print(result)
>>> orange banana apple
```

> `my_list[::-1]` is a slicing operation that ++creates++ a reversed ++copy++ of the original list `my_list`. The contents of the square brackets have the form `[start:stop:step]`:
>> the number before the first `:` (start) is where the slice begins
>> the number between the first and second `:` (stop) is where it ends (the element at stop itself is not included)
>> the number after the second `:` (step) is the stride if positive; a negative value walks the list backwards
>>
> So here `join()` is applied to `my_list` in reverse, inserting a space `' '` between elements. Because start and stop are omitted, the slice runs from the last element to the first.

## Sorting

### sort

`sort()` sorts in place, meaning it modifies the original list.

$\rightarrow$ To leave the original list untouched, use `sorted()` instead (see the next section)

```python
my_list = [3, 1, 4, 1, 5, 9, 2, 6, 5]
my_list.sort()
print(my_list)
>>> [1, 1, 2, 3, 4, 5, 5, 6, 9]
```

### sorted

```python
my_list = [3, 1, 4, 1, 5, 9, 2, 6, 5]
sorted_list = sorted(my_list)

print("The original list is: ", my_list)
print("The sorted list is: ", sorted_list)

>>> The original list is: [3, 1, 4, 1, 5, 9, 2, 6, 5]
>>> The sorted list is: [1, 1, 2, 3, 4, 5, 5, 6, 9]
```

> $\rightarrow$ The contents of the original `my_list` are not changed by calling `sorted()`.

- Syntax of `sorted()`:

```python
sorted(iterable, *, key=None, reverse=False)
```

> - iterable: can be a list, tuple, string, ...
> - key (optional): the basis for sorting, given as a function that is applied to each element of the iterable. The default is `None`, which sorts by the elements' natural order.
>> For other common ways to use key, see the lambda section below
>>
> - reverse (optional): defaults to `False` (ascending); set it to `True` to sort in descending order.

### Changing the sort key: lambda

`lambda` is a keyword for defining a small anonymous function. It is typically used when we want a simple function right away without writing out a separate, larger code block. A very common example:

```python
my_list = [(1, 'b'), (2, 'a'), (3, 'c')]
sorted_list = sorted(my_list, key=lambda x: x[1])
print(sorted_list)
>>> [(2, 'a'), (1, 'b'), (3, 'c')]
```

In this example `lambda x: x[1]` is a lambda function. Its argument `x` stands for each tuple in `my_list`, and the function returns the item at index 1 of that tuple, `x[1]`. We use a lambda function here because we want to sort by the second item (index 1) of each tuple, so we pass the lambda, whose return value is `x[1]`, as the sort key.

- Syntax of a lambda function:

```python
lambda arguments: expression
```

- Note:
    - A lambda function is a ++user-defined++ function, not one of the predefined functions
    - A lambda function ++can only be one line++: its body is a single expression. If the function we need is too complex and needs several lines, we should define it in full with `def` instead.
    - A lambda function ++can take multiple arguments++, separated by `,`. For example:

```python
lambda x, y: x + y
```

> $\rightarrow$ The lambda function in this example takes two arguments `x` and `y` and returns their sum `x + y`

## Numeric size and bounds

### min

syntax: `min(iterable, *iterables, key, default)`

> $\rightarrow$ the last three are optional
> - iterable: e.g. a list, tuple, set, dictionary...
> - \*iterables: additional positional arguments. When more than one argument is passed, `min()` compares the arguments themselves and returns the smallest of them
> - key: a function; the minimum is determined by comparing the values this function returns
> - default: the value returned when the single iterable passed in is empty

For example, given a list in which every element is a string, we can find the shortest string in the list:

```python
my_list = ["apple", "banana", "pear"]
shortest_string = min(my_list, key = len)
print("Shortest string:", shortest_string)
>>> Shortest string: pear
```

$\rightarrow$ If we do not specify length as the basis of comparison, strings are compared in alphabetical (lexicographic) order, character by character starting from the first. With the same list as above:

```python
my_list = ["apple", "banana", "pear"]
smallest_string = min(my_list)
print("Smallest string:", smallest_string)
>>> Smallest string: apple
```

It can also be used for the common task of finding the smallest number in a list:

```python
numbers = [0,22,101,8,44]
print(min(numbers))
>>> 0
```

### range

`range(n)` produces the sequence of integers from 0 up to, but not including, n.

```python
list(range(0,10,2))
>>> [0, 2, 4, 6, 8]
```

> - `list()` is only needed to turn the result of `range(n)` into a list, for example to print the numbers it produces
>
> $\rightarrow$ The three parameters of `range()` are, in order, start, stop, and step: the number to start from, the number to stop at (not included), and how much to advance each time

If we make step negative, we get a descending sequence, for example:

```python
list(range(10,0,-2))
>>> [10, 8, 6, 4, 2]
```

Another way to create a descending sequence is with `reversed()`:

```python
list(reversed(range(0,10,2)))
>>> [8, 6, 4, 2, 0]
```

- `list(range(0))` outputs `[]` (empty list)
- `list(range(-10))` (or any negative number) outputs `[]` (empty list)

## split

Splits a string (typically a sentence or a paragraph) into individual words and ++returns++ them as a ++list++.

> Strictly speaking the pieces are not necessarily words: by default the string is split wherever whitespace or a line break occurs.

For example:

```python
sentence = "This is a sentence."
words = sentence.split()
print(words)
>>> ['This', 'is', 'a', 'sentence.']
```

- Syntax of `split()`:

```python
str.split(sep=None, maxsplit=-1)
```

> - `str`: the string to split
> - `sep` (optional): the delimiter to split on. If it is not given, the string is split on whitespace characters (space, tab `\t`, newline `\n`). A specific letter or symbol can be given as the delimiter instead, for example:

```python
date = "2022-04-19"
parts = date.split('-')
print(parts)
>>> ['2022', '04', '19']
```

> - `maxsplit` (optional): the maximum number of ++splits++ to perform, so with `maxsplit=2` at most three elements are produced. The default is `-1`, meaning there is no limit on the number of splits.
> Example:

```python
sentence = "This is a sentence."
words = sentence.split(maxsplit=2)
print(words)
>>> ['This', 'is', 'a sentence.']
```

Note: ++`split()` accepts only one `sep`++, so if several different symbols should act as delimiters, one approach is to first replace them all with the same one. Example:

```python
date = "2024,05-06 12:51"
date_number_only = date.replace(',',' ').replace('-', ' ').replace(':', ' ')
numbers = date_number_only.split()
print(numbers)
>>> ['2024', '05', '06', '12', '51']
```

> $\rightarrow$ Several `replace()` calls can be chained on the same string

## lower

`lower()` converts every uppercase letter to lowercase. Characters that are already lowercase are left unchanged, as are non-letter characters (such as digits).

Example: convert every string in a list to lowercase

```python
words = ['Hello', 'WORLD', 'How', 'Are', 'You?']
lowercase_words = [word.lower() for word in words]
print(lowercase_words)
>>> ['hello', 'world', 'how', 'are', 'you?']
```

Example: digits and symbols such as commas and exclamation marks are left unchanged

```python
string = "Hello, 123world!"
lowercase_string = string.lower()
print(lowercase_string)
>>> hello, 123world!
```
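
As a small illustration of how `lower()` can be combined with `split()` and the dictionary-counting pattern from earlier in these notes, the sketch below counts words without regard to case. The function name `count_words_case_insensitive` and the sample sentence are made up for this example; it is one possible way to do it, not the only one.

```python
def count_words_case_insensitive(text: str) -> dict[str, int]:
    """Hypothetical helper: count words, treating 'Apple' and 'apple' as the same word."""
    counts: dict[str, int] = {}
    # Lowercase the whole string first, then split it on whitespace.
    for word in text.lower().split():
        # get() returns 0 for a word we have not seen yet.
        counts[word] = counts.get(word, 0) + 1
    return counts


sample = "Apple banana APPLE Banana apple"
result = count_words_case_insensitive(sample)
print(result)  # prints {'apple': 3, 'banana': 2}
```

Lowercasing before splitting means every word is normalized in one pass; `counts.get(word, 0)` is a compact alternative to the explicit `if word in word_counts` check used in the counting section.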