Python Basics

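# Python Basics

###### tags: `python`

created_time: 2020-11-17 15:30
updated_time: 2022-10-26 15:15:03.887608

[TOC]

## Control Structures

* `if:` / `elif:` / `else:`
* `while <condition>:` make sure the condition eventually becomes false, otherwise the loop never ends.
* `for <variable> in <iterable>:`
* `pass`: does nothing. It keeps an unfinished block syntactically valid, so you can fill it in later without the interpreter complaining.
* `break`: leaves the current loop.
* `continue`: jumps straight to the next iteration.

## Basic Arithmetic

### Addition and subtraction

```
print(10+15)
print(10-15)
```

### Multiplication and division

```
print(10*15)
print(10/2)
```

### Exponentiation (powers)

```
print(10**2)
```

### Roots (square root)

```
print(16**0.5)
```

### Modulo: remainder

```
print(18%7)
print(18%2)
```

### Floor division

- Returns the integer part of the quotient

```
print(18//7)
```

## Output and Variables

### Print

```
# Console Out: the value of an expression is shown in the Out cell
"Hi"

# print(): prints the values passed to it
print("Hi")
print('Hello', 'Python')
print('Hello', 'Python', sep='__')
print('Hello', 'Python', sep=';')
```

NOTE: in Jupyter notebook, a cell displays only the value of its last expression.

### Create Variable and Variable Assignments

```
# Create a variable x
x = 5
```

### Assigning several variables at once

```
a, b, c = 1, 2, 3
print('a:', a)
print('b:', b)
print('c:', c)
```

### Deleting a named variable

```
temp = 50
del temp
```

### Checking a type

```
var = 10
type(var)
type(False)
```

## Tips

### About comments

```
# In Python, # starts a single-line comment.
# A triple-quoted string (""") is commonly used as a multi-line comment. For example:
"""
print("Nothing inside the triple-quoted block is executed")
"""
```

### Constant and Variable Naming Conventions

```
- A variable name is not only a place to store a value; a good name also keeps the program clear while you write it.
- Every programming language has its own naming rules and coding style:
    - Naming rules: mandatory. Breaking them can stop the program from running, or from running correctly.
    - Coding style: recommended. It covers naming, spacing, and so on. Following the style popular in the language's community helps development and makes the code easier for others to read.
- In Python, the following naming rules apply:
    - A name cannot start with a digit (ex: 1a, 1b).
    - A name cannot contain spaces (use _ instead).
    - A name cannot contain special characters other than _.
    - Avoid names that are just a single letter or a single generic word.
    - Never use a reserved keyword as a name, and avoid shadowing built-in names (ex: list, str, tuple).
```

For Python coding style, see PEP8: https://www.python.org/dev/peps/pep-0008/

### Listing reserved keywords

`import keyword`

`keyword.kwlist`

## String

```
phone = "iphone X"
# Index positions:
#   string: i p h o n e   X
#   index:  0 1 2 3 4 5 6 7
# Python indexing starts at 0
print(phone[0])  # 'i'
print(phone[5])  # 'e'
# Negative indices count back from the end
print(phone[-1])  # 'X'
# A slice includes the start and excludes the stop
print(phone[0:3])  # 'iph'
# Omitting the start index means "from the beginning"
print(phone[:3])  # 'iph'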
# Omitting the stop index means "through the last character"
print(phone[0:])  # 'iphone X'
# Everything except the last character
phone[:-1]  # 'iphone '
# The whole string
print(phone[:])  # 'iphone X'
# Error: strings are immutable, so item assignment raises TypeError
phone[0] = 'k'
```

### String concatenation

```
first_name = "Kobe"
last_name = "Bryant"
full_name = first_name + last_name
full_name + '24'
```

### String multiplication

```
letter = "A"
letter*10
```

### Strings and numbers are different types

```
5 == '5'  # False
# This will cause Error
5 + '5'
```

### Built-in String methods

```
phone = "iphone X"
# Capitalize the first letter of each word
phone.title()
# Convert to upper case
phone.upper()
# Convert to lower case
phone.lower()
# Split on whitespace (the default separator)
phone.split()
# Split on a given character
phone.split('e')
"Hello my name is SnackD".split(" ")
# ['Hello', 'my', 'name', 'is', 'SnackD']
# Length of the string
len(phone)
# Replace part of the string
phone.replace("X","!")
# Check the leading character (case matters!)
phone.startswith('I')
# Check the trailing character
phone.endswith('X')
```

In pandas (this assumes `import pandas as pd` and a local `csv/temp.csv` file):

```
temp = pd.read_csv("csv/temp.csv").dropna(how="all")
temp["Department"] = temp["Department"].astype("category")
temp.tail(3)
temp["Name"].str.split(",").head()
temp["Name"].str.split(",").str.get(0).str.title()
temp["Name"].str.split(",").str.get(0).str.title().head()
```

### Print String with .format() method

```
# Automatic field numbering with {}
'hello {}'.format('world!')
# Manual field specification with {0} {1}...
'{2}, {1}, {0}, {3}'.format('A', 'B', 'C', 'D')
# Named fields with keyword arguments
'{first}, {second}, {first}, {second}'.format(first='A', second='B')
# Values of any type are converted to their string form
'{}, {}, {}'.format(1, 'kobe', True)
# Usage example
fruit_name = 'apple'
fruit_price = 17
print('Fruit: {0}, Cost: {1} dollars.'.format(fruit_name, fruit_price))
```

### Formatted String Literals (f-strings)

A string feature added in Python 3.6 (note: it requires Python 3.6 or later).

- https://cito.github.io/blog/f-strings/
- https://docs.python.org/3/reference/lexical_analysis.html#f-strings

```
name = 'Jack'
age = 22
f'He said his name is {name} and he is {age} years old.'
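# (Added sketch, not from the original notes.) The braces of an f-string can hold
# an expression plus a format spec, not only a bare name.
# `price`, `qty` and `receipt` are hypothetical names used for illustration.
price = 17.5
qty = 3
receipt = f'{qty} items cost {price * qty:.2f} dollars'
print(receipt)  # 3 items cost 52.50 dollars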
num = 23.45678
print("My 10 character, four decimal number is:{0:10.4f}".format(num))
# Nested fields: width 10, precision 6 significant digits (prints the same text for this value)
print(f"My 10 character, four decimal number is:{num:{10}.{6}}")
```

## List

* A list is ordered and indexable
* A list can hold objects of different types
* The length of a list can change
* The contents of a list can change

References

1. https://openhome.cc/Gossip/Python/ListType.html
2. https://www.w3schools.in/python-tutorial/lists/

```
# Lists are ordered and indexable
num_list = [1, 3, 5, 7]
type(num_list)
# A list can hold objects of different types
mix_list = [100, 'apple', True]
mix_list
# Empty list
temp_list = []
print(type(temp_list), len(temp_list))
# Length of a list
num_list = [1, 3, 5, 7]
len(num_list)
# num_list[0]     # 1
# num_list[:]     # [1, 3, 5, 7]
# num_list[1:-1]  # [3, 5]
```

### Concatenate lists

```
first_list = [10, 20, 30]
second_list = [100, 200, 300]
total_list = first_list + second_list
total_list
# [10, 20, 30, 100, 200, 300]

# The length of a list can change
first_list = first_list + [40]
# Check first_list
first_list
# [10, 20, 30, 40]

# Repeat the list three times
first_list * 3
# [10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40]
# Check first_list (unchanged, because the result was not assigned)
first_list
# [10, 20, 30, 40]

# The contents of a list can change
first_list[0] = 50
first_list
# [50, 20, 30, 40]
```

### Methods built in list

```
list_1 = [10, 20, 30]
list_1
# [10, 20, 30]

# Append data
list_1.append(40)
list_1
# [10, 20, 30, 40]

# Pop off item with pop(). Default popped index is -1
list_1.pop()
list_1
# [10, 20, 30]

# Check the popped item, with popped-index -2
popped_item = list_1.pop(-2)
popped_item
# 20
# Check the remaining items
list_1
# [10, 30]

new_list = [50, 30, 60, 70]
alpha_list = ['A', 'C', 'B', 'F']
# Sort
new_list.sort()
new_list
# [30, 50, 60, 70]
# Reverse
alpha_list.reverse()
alpha_list
# ['F', 'B', 'C', 'A']
```

### Nested List

```
first_list = [1, 2, 3]
second_list = [4, 5, 6]
third_list = [7, 8, 9]
# Make a list of lists
nested_list = [first_list, second_list, third_list]
nested_list
# [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# index it
nested_list[0]
# [1, 2, 3]
nested_list[1][2]
# 6
```

### Dictionary

#### The dictionary material can be skipped on a first read

References

1. https://openhome.cc/Gossip/Python/DictionaryType.html
1. https://www.w3schools.in/python-tutorial/dictionaries/

```
# Build a dictionary with {'key': value}
students_dict = {'Jane': 90, 'Jack': 80}
students_dict
# {'Jack': 80, 'Jane': 90}
type(students_dict)
# Call values by their key
students_dict['Jane']
# 90

# A dictionary value can be of many types, such as a list or even a nested dict.
info_dict = {'name': 'Ken',
             'age': 20,
             'grades': [90, 80, 80],
             'fav_fruits': {'apple': 10, 'banana': 20}
             }
# Get the grade that is 90
info_dict['grades'][0]
# 90
# Get apple's value
info_dict['fav_fruits']['apple']
# 10
```

#### Add key-value

```
info_dict['fav_sport'] = 'baseball'
info_dict
# {'age': 20,
#  'fav_fruits': {'apple': 10, 'banana': 20},
#  'fav_sport': 'baseball',
#  'grades': [90, 80, 80],
#  'name': 'Ken'}
```

#### Reassign values

```
info_dict['age'] = 21
info_dict
# {'age': 21,
#  'fav_fruits': {'apple': 10, 'banana': 20},
#  'fav_sport': 'baseball',
#  'grades': [90, 80, 80],
#  'name': 'Ken'}
```

### Basic methods of dict

```
# Check dict_keys
info_dict.keys()
# dict_keys(['name', 'age', 'grades', 'fav_fruits', 'fav_sport'])

# Check dict_values
info_dict.values()
# dict_values(['Ken', 21, [90, 80, 80], {'apple': 10, 'banana': 20}, 'baseball'])

# Check dict_items
info_dict.items()
# dict_items([('name', 'Ken'), ('age', 21), ('grades', [90, 80, 80]), ('fav_fruits', {'apple': 10, 'banana': 20}), ('fav_sport', 'baseball')])

# Pop off a specific item by key
info_dict.pop('name')
info_dict
# {'age': 21,
#  'fav_fruits': {'apple': 10, 'banana': 20},
#  'fav_sport': 'baseball',
#  'grades': [90, 80, 80]}
```

### Tuple

```
temp_tuple = ()
first_tuple = (10, 20, 30)
# Check length
len(first_tuple)
# index
first_tuple[0]
# 10

# A tuple can hold several types
sec_tuple = ([1, 2, 3], 50, 'apple', {'height': 100} )
sec_tuple[0]
# [1, 2, 3]
sec_tuple[0][0]
# 1
sec_tuple[3]
# {'height': 100}
```

#### Methods built in tuple

```
first_tuple = (10, 20, 30, 10)
# index(value) returns the position of the first matching value
first_tuple.index(10)
# 0
# Count the number of times a value appears
first_tuple.count(10)
# 2
```

#### Add or Re-assign

```
first_tuple = (10, 20, 30, 10)
# This will cause Error
# > TypeError: 'tuple' object does not support item assignment
# first_tuple[3] = 40
```

### set

```
grades_list = [100, 90, 90, 80, 85, 85, 85]
grades_list
# [100, 90, 90, 80, 85, 85, 85]

# set
set(grades_list)
# {80, 85, 90, 100}
no_repeat_grades = list(set(grades_list))
no_repeat_grades
# e.g. [80, 90, 100, 85] (a set has no guaranteed order)
```

### None (NoneType)

```
t = None
print(t)
# None
type(t)
# NoneType
```

### Statements | if else

References

1. https://openhome.cc/Gossip/Python/IfStatement.html
1. https://www.w3schools.in/python-tutorial/decision-making/

Python compared with other kinds of programming languages.

Other languages:

```
if (a > 5){
    DO SOMETHING
}
```

Python:

```
if a > 5:
    DO SOMETHING
```

* Python statements drop the () and {}.
* Instead, indentation and a colon (:) delimit the block.
* Indentation matters a great deal.

```
if True:
    print("The condition is True, so this block runs")
# The condition is True, so this block runs

if False:
    print("The condition is False, so this block does not run")
```

```
# if-else is evaluated top-down: once a condition is true, the remaining branches at the same level are skipped.
x = 100
if x > 80:
    print("The condition is true, so this block runs")
else:
    print("This block does not run")
# The condition is true, so this block runs

grade = 75
if grade > 70:
    print("Your score is {}, which is a pass.".format(grade))
else:
    print("Your score is {}, which is a fail.".format(grade))
# Your score is 75, which is a pass.

p_name = 'Kobe'
if p_name == 'Kobe':
    print('Hi Kobe!')
else:
    print("Who?")
# Hi Kobe!
```

```
# More than one condition
grade = 63
if grade > 80:
    print("Your score is {}, higher than 80.".format(grade))
elif grade > 75:
    print("Your score is {}, higher than 75.".format(grade))
elif grade > 70:
    print("Your score is {}, higher than 70.".format(grade))
else:
    print("Everything else ends up here.")
# Everything else ends up here.

# More than one condition
price = 250
if price > 300:
    print("The price is {}, higher than 300.".format(price))
elif 300 >= price >= 200:
    print("The price is {}, between 200 and 300.".format(price))
else:
    print("Everything else ends up here.")
# The price is 250, between 200 and 300.
```

* A condition does not have to be an explicit comparison (ex: x == 5, a > 6) or a literal True / False.
* An empty list, empty dict, empty tuple, empty string, and 0 are all treated as False.

```
if []:
    print("This does not run")
if {}:
    print("This does not run")
if 0:
    print("This does not run")
if ():
    print("This does not run")
if "":
    print("This does not run")
```

```
if ["A"]:
    print("This runs")
if {'key1': 30}:
    print("This runs")
if 2:
    print("This runs")
if (3, 5):
    print("This runs")
if "hi":
    print("This runs")
```

#### Nested if-else

```
x = 5
y = 10
if x == 10:
    if y > 30:
        print("A")
    else:
        print("B")
else:
    if y == 10:
        print("C")
# C
```

### Statements | loops

References

1. https://openhome.cc/Gossip/Python/ForWhileStatement.html
1. https://www.w3schools.in/python-tutorial/loops/

Syntax:

```
for item in object:
    DO SOMETHING
```

Key point:

```
for item in object:
    ￣￣
    # DO SOMETHING
     ↑ the loop variable (item) can be named freely; the name does not change what the loop does
```

Conceptual example:

```
for student_name in enrolled_students:
    print(student_name)
```

```
num_list = [1, 2, 3, 4, 5]
# Print the numbers in num_list one by one
print(num_list[0])  # 1
print(num_list[1])  # 2
print(num_list[2])  # 3
print(num_list[3])  # 4
print(num_list[4])  # 5

# ↓ the same thing with a for loop

num_list = [1, 2, 3, 4, 5]
# Print the numbers in num_list with a for loop
for num in num_list:
    print(num)
# Using x as the loop variable works just as well.
for x in num_list:
    print(x)
# 1
# 2
# 3
# 4
# 5

mix_list = [('A', 20), ('B', 40), ('C', 60)]
for (x, y) in mix_list:
    print(x)
    print(y)
# A
# 20
# B
# 40
# C
# 60
```

```
grade_list = [80, 70, 69, 50, 92]
for grade in grade_list:
    if grade > 75:
        print("Score {} is above 75!".format(grade))
# Score 80 is above 75!
# Score 92 is above 75!
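
# (Added sketch, not from the original notes.) A loop body can also accumulate a
# result instead of printing on every pass. `scores` and `passed` are hypothetical names.
scores = [80, 70, 69, 50, 92]
passed = 0
for score in scores:
    if score > 75:
        passed += 1
print(passed)  # 2
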
num_list = [1, 2, 3, 4, 5, 6]
for num in num_list:
    if num % 2 == 0:
        print(num, "even")
    else:
        print(num, 'odd')
# 1 odd
# 2 even
# 3 odd
# 4 even
# 5 odd
# 6 even
```

```
name_list = ["Kobe", "James", "Steve", "Klay"]
k_list = []
for name in name_list:
    # Lower-case the name, then check whether it starts with k
    if name.lower().startswith('k'):
        # Append it to k_list
        k_list.append(name)
print(k_list)
# ['Kobe', 'Klay']
```

```
# string
for letter in 'Congratulations!':
    print(letter)
# C
# o
# n
# g
# r
# a
# t
# u
# l
# a
# t
# i
# o
# n
# s
# !

# tuple
_tuple = (1,2,3,4,5)
for t in _tuple:
    print(t)
# 1
# 2
# 3
# 4
# 5
```

```
grade_dict = {'math': 80, 'physics': 90, 'geography': 89}
# check grade_dict.keys()
grade_dict.keys()
# dict_keys(['math', 'physics', 'geography'])
for key in grade_dict.keys():
    print(key)
# math
# physics
# geography

# check grade_dict.items()
grade_dict.items()
# dict_items([('math', 80), ('physics', 90), ('geography', 89)])
for key, value in grade_dict.items():
    print(key, value)
# math 80
# physics 90
# geography 89
```

#### break, continue

```
# grade_list
grade_list = [10, 20, 30, 40, 50]

# Counter
i = 0
for grade in grade_list:
    # When i > 3 is true, break leaves this loop
    if i > 3:
        break
    # Print i and grade
    print(i, grade)
    # Increment the counter on every pass (same as i = i + 1)
    i += 1
# 0 10
# 1 20
# 2 30
# 3 40

# Counter
i = 0
for grade in grade_list:
    # continue skips the rest of this iteration
    if i == 3:
        i += 1
        continue
    print(i, grade)
    i += 1
# 0 10
# 1 20
# 2 30
# 4 50
```

#### About pass

```
word = "python"
for letter in word:
    if letter == "t":
        pass
    else:
        print(letter)
```

### while loops

References

1. GIF animation: https://blog.penjee.com/top-5-animated-gifs-explain-loops-python/
1. https://www.w3schools.in/python-tutorial/loops/
1. When to use "while" or "for" in Python? https://stackoverflow.com/questions/920645/when-to-use-while-or-for-in-python

Syntax:

```
while condition:
    DO SOMETHING
else:
    Runs once the condition becomes false, i.e. when the loop ends without break
```

**Caution: a carelessly written condition can produce an infinite loop.**

```
age = 10
while age < 18:
    print("Age {}: under 18, no drinking yet".format(age))
    age += 1
else:
    print('Left the while loop, now age {}!'.format(age))

# Caution!! This would loop forever,
# because the while condition is always true!
# while True:
#     print("Stuck in an infinite loop =_=")

# Here the while condition is always true,
# and an if inside the loop uses break to get out.
# Note that with this approach the else block is not executed.
x = 10
while True:
    print(x)
    x += 1
    if x > 15:
        break
else:
    print("Huh?")
# 10
# 11
# 12
# 13
# 14
# 15
```

Frequently used techniques

* range http://www.runoob.com/python3/python3-func-range.html
* enumerate http://www.runoob.com/python3/python3-func-enumerate.html
* zip https://goo.gl/2rXdPB http://www.cnblogs.com/yemeng/p/4063769.html
* in https://goo.gl/pRgqz8
* list comprehension http://python-3-patterns-idioms-test.readthedocs.io/en/latest/Comprehensions.html

#### range

```
range(5)
# range(0, 5)
list(range(5))
# [0, 1, 2, 3, 4]
for i in range(5):
    print(i)
# 0
# 1
# 2
# 3
# 4
for i in range(1, 5):
    print(i)
# 1
# 2
# 3
# 4
```

#### enumerate

```
fruit_list = ['apple', 'banana', 'pineapple']

# get index method 1
i = 0
for item in fruit_list:
    print(i, item)
    i += 1
# 0 apple
# 1 banana
# 2 pineapple

# get index method 2 (index() returns the first match, so this misbehaves with duplicate items)
for item in fruit_list:
    item_index = fruit_list.index(item)
    print(item_index, item)
# 0 apple
# 1 banana
# 2 pineapple

# get index method 3, with range
for i in range(len(fruit_list)):
    print(i, fruit_list[i])
# 0 apple
# 1 banana
# 2 pineapple

# get index method 4, with enumerate
for index, item in enumerate(fruit_list):
    print(index, item)
# 0 apple
# 1 banana
# 2 pineapple
```

#### zip

```
number_list = [10, 20, 30, 40, 50]
subject_list = ['A', 'B', 'C', 'D', 'E']
zip(number_list, subject_list)
# <zip at 0x24135295b88>
list(zip(number_list, subject_list))
# [(10, 'A'), (20, 'B'), (30, 'C'), (40, 'D'), (50, 'E')]
```

#### in

Reference: [in Operators Example](https://goo.gl/pRgqz8)

```
'p' in 'python'
# True
'A' in ['A', 'B', 'C']
# True
['A'] in ['A', 'B', 'C']
# False
('A', 'B') in [('A', 'B'), ('B', 'C') ]
# True
```

#### list comprehension

```
number_list = [10, 20, 30, 40, 50]
collection_list = []
# with for loop
for number in number_list:
    collection_list.append(number)
print(collection_list)
# [10, 20, 30, 40, 50]

# with a list comprehension (same result in one expression)
collection_list = [number for number in number_list]
print(collection_list)
# [10, 20, 30, 40, 50]
```

### array

```
import numpy as np
g = np.array([40, 70, 25])
g[0]  # the first element
# 40
w = np.array([0.4, 0.3, 0.3])
(g*w).sum()
# 44.5
np.dot(g, w)
# 44.5
np.dot(g, 1)
# array([40, 70, 25])

grades = [77, 85, 66, 98, 0, 74, 90]
total = 0
for s in grades:
    total = total + s
print(total/len(grades))

newgrades = []
for s in grades:
    newgrades.append(s+5)
newgrades
# [82, 90, 71, 103, 5, 79, 95]

garr = np.array(grades)
garr
# array([77, 85, 66, 98, 0, 74, 90])
garr.mean()
# 70.0
garr.sum()
# 490
garr + 5
# array([ 82, 90, 71, 103, 5, 79, 95])
```

## Pandas

You can think of pandas as Excel for Python, only more flexible and more convenient!

Pandas has two basic data structures. One is the DataFrame, which you can think of as a table; the other is the Series, which behaves somewhat like an array or a list. We start by reading the practice file into a DataFrame.

```
%matplotlib inline
import pandas as pd
df = pd.read_csv("http://bit.ly/gradescsv")
df.head()  # what the first 5 rows look like
df.tail()  # the last 5 rows
df["數學"].mean()
# 11.57
df[["英文", "數學"]]  # select these two columns
df.數學.plot()
df.數學.hist(bins=15)
df.數學.std()  # standard deviation
df.describe()  # descriptive statistics
df.corr(numeric_only=True)  # correlation coefficients (numeric columns only)
```

```
# Add a column
df["沒用"] = 1
# Drop that column
df.drop("沒用", axis=1, inplace=True)
# Add a total-score column (summing numeric columns only)
df["總級分"] = df.sum(axis=1, numeric_only=True)
# Add a weighted column
df["加權"] = df.英文*1.5 + df.數學 * 2
# Sort from largest to smallest
# (the original notes never define df2; assigning the sorted frame to it is an assumption)
df2 = df.sort_values(by="總級分", ascending=False)
# Renumber the index after sorting (assumes the file has 100 rows)
df2.index = range(1,101)
df2[df2.數學 == 15]
df2[(df2.數學==15) & (df2.英文==15)]
df2[(df2.數學==15) | (df2.英文==15)]
```

Basic `loc` usage: `df.loc[row range, column range]`. Slices are label-based and include both endpoints.

```
df.loc[2:3, "數學":"社會"]
df.loc[(2), ("社會")]
df.loc[(2), ("社會")] = -1
```

### Matplotlib data visualization

Today matplotlib is practically the standard Python plotting package!
Before matplotlib, plotting in Python was not that convenient. As with many kinds of Python packages, there were several options, each with its own strengths and weaknesses, and none that everybody used.

```
# from IPython.display import YouTubeVideo
# YouTubeVideo('e3lTby5RI54')
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

x = np.linspace(-10,10,100)
y = np.sin(x)
plt.plot(x,y,'r')
plt.plot(x,y,'r-.')
```

#### [Tip] Changing the line style quickly

| Parameter | Description |
|---|---|
| `--` | dashed |
| `-.` | dot + dash |
| `:` | dotted |
| `o` | large dots |
| `^` | triangles |
| `s` | squares |

#### Basic styling

| Parameter | Description |
|---|---|
| `alpha` | transparency |
| `color` (`c`)| color |
| `linestyle` (`ls`) | line style |
| `linewidth` (`lw`) | line width |

```
r = 3
t = np.linspace(-2*np.pi, 2*np.pi, 200)
x = r*np.cos(t)
y = r*np.sin(t)
plt.plot(x, y, lw=3)

ax = plt.gca()
ax.set_aspect('equal')
plt.plot(x, y, lw=3)

# A radius that varies with t; recompute x and y from it to plot the new curve
r = np.sin(5*t)
```

#### subplot: drawing several plots

Every time we plot, matplotlib creates one figure (the drawing area). A figure can contain many sub-plots, which are called axes. So far each figure has held a single plot, so nothing needed to be set up. To draw 4 plots we first decide on the layout, for example 4 plots arranged 2x2.

```
x = np.linspace(0, 10, 100)
plt.subplot(221)
plt.plot(x, np.sin(x), c='#e63946', lw=3)
plt.subplot(222)
plt.plot(x, np.sin(3*x), c='#7fb069', lw=3)
plt.subplot(223)
plt.scatter(x, np.random.randn(100), c='#daa73e', s=50, alpha=0.5)
plt.subplot(224)
plt.bar(range(10), np.random.randint(1,30,10), fc='#e55934')
```

### Advanced colors

Color notation 1: c = 'r'. You can use blue (b), green (g), red (r), cyan (c), magenta (m), yellow (y), black (k), white (w).

Color notation 2: a number from 0 to 1 as a gray level; the larger the number, the whiter. c = '0.6'

Color notation 3: the hexadecimal RGB notation common on the web. c = '#00a676'

How do we know where to pick colors from?
You can use a tool such as Coolors.co, which 彥良 introduced earlier.

Color notation 4: RGB given as numbers from 0 to 1 also works. c=(0.7, 0.4, 1)

#### Parameters that can be set for markers

```
x = np.linspace(-10,10,200)
y = np.sinc(x)
plt.plot(x,y, c = '#00a676', lw=3)
```

```
x = range(20)
y = np.random.randn(20)
plt.plot(x, y, marker='o')
plt.plot(x, y, c='#6b8fb4', lw=5, marker='o', mfc='#fffa7c', mec="#084c61", mew=3, ms=20)
```

| Parameter | Description |
|---|---|
| `marker` | marker style |
| `markeredgecolor` (`mec`) | edge color |
| `markeredgewidth` (`mew`) | edge width |
| `markerfacecolor` (`mfc`) | marker fill color |
| `markerfacecoloralt` (`mfcalt`) | alternate marker fill color |
| `markersize` (`ms`) | marker size |
| `markevery` | draw a marker every N points |

#### bar

```
plt.bar(range(1,6), np.random.randint(1,30,5))
```

#### Two-color bar chart

```
x = np.arange(1,6)
plt.bar(x - 0.4, [3, 10, 8, 12, 6], width=0.4, ec='none', fc='#e63946')
plt.bar(x, [6, 3, 12, 5, 8], width=0.4, ec='none', fc='#7fb069')
```

#### Stacked data

```
A = np.random.randint(2,15,5)
B = np.random.randint(2,15,5)
C = np.random.randint(2,15,5)
plt.bar(x, A, fc='#e63946', ec='none')
plt.bar(x, B, fc='#7fb069', ec='none', bottom = A)
plt.bar(x, C, fc='#e55934', ec='none', bottom = A+B)
```

#### Horizontal bar chart

```
x = np.arange(0.6, 6)
plt.barh(x, np.random.randint(1,15,6), fc='#e55934', ec='none')
```

#### Two-sided bar chart

```
x = np.arange(0.6,6)
A = np.random.randint(1,15,6)
B = np.random.randint(1,15,6)
plt.barh(x, A, fc='#e63946', ec='none')
plt.barh(x, -B, fc='#7fb069', ec='none')
```

#### Settings for the plotting area

```
# x and y are redefined here as an assumption; the original relied on earlier notebook state
x = np.linspace(-6, 6, 200)
y = np.sin(x)
plt.title("My lovely sin function")
plt.xlabel('x-axes')
plt.ylabel('values')
plt.xlim(-6,6)
plt.ylim(-1.2,1.2)
# Raw strings keep the LaTeX backslashes intact
plt.plot(x, y, lw=3, label=r'$\sin$')
plt.plot(x, np.cos(x), lw=3, label=r'$\cos$')
plt.legend()
```

#### xticks

```
xv = np.linspace(0, 2*np.pi, 100)
yv = np.sin(xv)
plt.plot(xv,yv,lw=3)
# Tick labels match the (positive) tick positions
plt.xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi],
           [r'$0$', r'$\pi/2$', r'$\pi$', r'$3\pi/2$', r'$2\pi$']);
```

#### Getting the current axes

```
ax = plt.gca()
ax.set_facecolor('#69b8bb')
ax.set_xlim(-6,6)
ax.set_ylim(-1.2,1.2)
plt.plot(x,y,lw=5,c='white')

# Move the x and y axes
ax = plt.gca()
ax.set_facecolor('#69b8bb')
ax.set_xlim(-6,6)
ax.set_ylim(-1.2,1.2)
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['bottom'].set_position(('data',0))
ax.spines['left'].set_position(('data',0))
plt.plot(x,y,lw=5,c='white')
```

#### Displaying Chinese text

```
# Method 1: pass a font explicitly (the path below is specific to the author's machine)
import matplotlib.font_manager as fm
myfont = fm.FontProperties(fname="/Users/mac/Library/Fonts/NotoSansHant-Medium.otf")
x = np.linspace(-2*np.pi, 2*np.pi, 200)
y = np.sin(x)
plt.plot(x, y, lw=5)
plt.title("我可愛的 sin 函數", fontproperties=myfont, size=20)

# Method 2: change the font everywhere
# Use matplotlib's parameter settings, rcParams, to switch every font to a Chinese one.
plt.rcParams['font.sans-serif'] = ['SimHei']  # pick an ordinary sans-serif (Hei) font
plt.rcParams['axes.unicode_minus']=False  # keep minus signs rendering correctly
plt.plot(x, y, lw=5)
plt.title("我可愛的 sin 函數", size=15)  # no need to set the font again!
```

#### Playful xkcd style

```
save_state = plt.rcParams.copy()
plt.xkcd()
x = np.linspace(-5, 5, 200)
y = np.sin(2*x) + 0.2*x
plt.plot(x,y)
plt.rcParams.update(save_state)
```

#### seaborn to the rescue

```
import seaborn as sns
sns.set(color_codes=True)
x = np.linspace(-10,10,200)
y = np.sinc(x)
plt.plot(x,y)
```

### Merge (DataFrame): joining on columns

| Parameter | Description |
| - | - |
| left | The left DataFrame in the merge |
| right| The right DataFrame in the merge |
| how| Join type: 'inner' (default); also 'outer', 'left', 'right' |
| on | Column name(s) to join on; they must exist in both DataFrames. If not specified, the intersection of the column names of left and right is used as the join key |
| left_on | Column(s) of the left DataFrame used as the join key |
| right_on | Column(s) of the right DataFrame used as the join key |
| left_index| Use the row index of the left DataFrame as its join key |
| right_index| Use the row index of the right DataFrame as its join key |
| sort| Sort the merged data by the join key. The default is False in current pandas; leaving it off can give better performance on large datasets |
| suffixes | Tuple of strings appended to overlapping column names, default ('_x', '_y'). For example, if both DataFrames have a 'data' column, the result contains 'data_x' and 'data_y' |
| copy | Setting it to False can avoid copying data into the result in some special cases. By default the data is copied |

```
df1 = pd.DataFrame({'key':['a','b','c'],'data1':range(3)})
```

| key | data1 |
| - | - |
| a | 0 |
| b | 1 |
| c | 2 |

```
df2 = pd.DataFrame({'key':['a','b','d'],'data2':range(3)})
```

| key | data2 |
| - | - |
| a | 0 |
| b | 1 |
| d | 2 |

```
pd.merge(df1, df2)               # defaults to inner
pd.merge(df1, df2, how='inner')  # intersection
# Equivalent, naming the key explicitly: pd.merge(df1, df2, on='key', how='inner')
```

| key | data1 |data2|
| - | - | -|
| a | 0 |0|
| b | 1 |1|

```
pd.merge(df1,df2,how = 'outer')  # union
# pd.merge(df1,df2,on='key',how = 'outer')
```

| key | data1 |data2|
| - | - | -|
| a | 0 |0|
| b | 1 |1|
| c | 2 |NaN|
| d | NaN |2|

```
pd.merge(df1,df2,how = 'left')  # keep every row of the left frame, matching rows of the right
# pd.merge(df1,df2,on='key',how = 'left')
```

| key | data1 |data2|
| - | - | -|
| a | 0 |0|
| b | 1 |1|
| c | 2 |NaN|

```
pd.merge(df1,df2,how = 'right')  # keep every row of the right frame, matching rows of the left
# pd.merge(df1,df2,on='key',how = 'right')
```

| key | data1 |data2|
| - | - | -|
| a | 0 |0|
| b | 1 |1|
| d | NaN |2|

If the join key columns have different names in the left and right DataFrames but their values overlap, use left_on and right_on to name the key on each side.

#### Merging when the join key has repeated rows (one-to-many / many-to-many)

```
df3 = pd.DataFrame({'key':['a','b','c','d','e'],'data1':range(5)})
```

| key | data1 |
| - | - |
| a | 0 |
| b | 1 |
| c | 2 |
| d | 3 |
| e | 4 |

```
df4 = pd.DataFrame({'key':['e','e','e','f','g'],'data2':range(5)})
```

| key | data2 |
| - | - |
| e | 0 |
| e | 1 |
| e | 2 |
| f | 3 |
| g | 4 |

```
pd.merge(df3, df4, how='inner')
```

| key | data1 |data2|
| - | - | -|
| e | 4 |0|
| e | 4 |1|
| e | 4 |2|

```
pd.merge(df3, df4, how='outer')
```

| key | data1 |data2|
| - | - | -|
| a | 0 |NaN|
| b | 1 |NaN|
| c | 2 |NaN|
| d | 3 |NaN|
| e | 4 |0|
| e | 4 |1|
| e | 4 |2|
| f | NaN |3|
| g | NaN |4|

#### Merging on the index

When the join key lives in the index, the operation is a merge on the index. Pass left_index or right_index to merge to say which side should use its index.

In one table the join key is the index, in the other it is an ordinary column:

```
left = pd.DataFrame({'key':['a','b','a','a','b','c'],'value': range(6)})
```

| key| value |
| - | - |
| a | 0 |
| b | 1 |
| a | 2 |
| a | 3 |
| b | 4 |
| c | 5 |

```
right = pd.DataFrame({'group_val':[1,2]},index = ['a','b'])
```

| | group_val |
| - | - |
| a | 1 |
| b | 2 |

```
pd.merge(left,right,left_on = 'key',right_index = True)
```

| key | value | group_val |
| --- | ----- | --------- |
| a | 0 | 1 |
| a | 2 | 1 |
| a | 3 | 1 |
| b | 1 | 2 |
| b | 4 | 2 |

#### The index is the join key in both tables

```
left = pd.DataFrame(np.arange(6).reshape(3,2),index = ['a','b','c'],columns = ['col1','col2'])
```

| | col1 |col2|
|-| - | - |
| a | 0 |1|
| b | 2 |3|
| c | 4 |5|

```
right = pd.DataFrame(np.arange(7,15).reshape(4,2),index = ['b','c','d','e'],columns = ['col3','col4'])
```

| | col3 |col4|
|-| - | - |
| b | 7 |8|
| c | 9 |10|
| d | 11 |12|
| e | 13 |14|

```
pd.merge(left,right,left_index = True,right_index= True,how = 'outer')
```

| | col1 | col2 | col3 | col4 |
| - | ---- | ---- | ---- | ---- |
| a | 0 | 1 | NaN | NaN |
| b | 2 | 3 | 7 | 8 |
| c | 4 | 5 | 9 | 10 |
| d | NaN | NaN | 11 | 12 |
| e | NaN | NaN | 13 | 14 |

### Concat (Series, DataFrame)
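
As a rough orientation for this heading, here is a minimal sketch of `pd.concat` on both a DataFrame and a Series. The frames `top` and `bottom` and the Series `s1` and `s2` are made-up examples, not data from these notes, and the sketch only covers the two most common cases (stacking rows, and placing Series side by side).

```python
import pandas as pd

# Hypothetical example frames with the same columns
top = pd.DataFrame({'key': ['a', 'b'], 'value': [0, 1]})
bottom = pd.DataFrame({'key': ['c', 'd'], 'value': [2, 3]})

# Stack the rows; ignore_index=True renumbers the result 0..n-1
stacked = pd.concat([top, bottom], ignore_index=True)
print(stacked)

# Hypothetical Series; axis=1 places them side by side as columns
s1 = pd.Series([1, 2], name='left')
s2 = pd.Series([3, 4], name='right')
side_by_side = pd.concat([s1, s2], axis=1)
print(side_by_side)
```

Unlike `merge`, `concat` does not match rows on a key column; it simply glues objects together along one axis, aligning on the other axis's labels.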