# string

資訊之芽 (Sprout) 2023 Python Syntax Class
Author: Sean 韋詠祥

Note:
Date: 2023-03-26
Class time: 14:00 - 15:20

----

## What you will learn in this class

- String operations and comparison
- Format strings
- String manipulation and processing
- Related properties

----

## What is a string

- A sequence of text
- Can be enclosed in `'` or `"`
- Python has no concept of a "character" (char)

----

## Review: printing strings

```python
name = 'Sean'
print('Hello', name)

print("Hi, I'm here!")
```

Note:
Already covered: strings, variable assignment, output

----

## What counts as a string

```python
str1 = 'Strings can use single quotes'
str2 = "or be wrapped in double quotes"
```

```python
str3 = '''
Multi-line text
can use triple quotes
'''
```

```python
"""
Sometimes used as a comment
to keep a block of code from being executed
"""
```

```python
print(type(str1))  # <class 'str'>
```

---

## String operations: examples

```python
# Concatenate strings with +
print('app' + 'le')

# Repeat a string with *
print('ha' * 3)

# Adjacent string literals are merged automatically
print('to' 'day')
```

```python
# This also works across lines
print('I am a student'
      ' in the Department of Computer Science,'
      ' National Yang Ming Chiao Tung University.')
```

----

## String operations: practice

Note which lines fail to run,
and which ones print something you did not expect.

```python
# Try it: banana
print('1a', 'ba' + 'na' * 2)
print('1b', 'ba' + 2 * 'na' )
print('1c', 'ba'   'na' * 2)
print('1d', 'ba'   2 * 'na' )
```

```python
# Try it: lalaland
print('2a', 'la' * 3 + 'nd')
print('2b', 3 * 'la' + 'nd')
print('2c', 'la' * 3   'nd')
print('2d', 3 * 'la'   'nd')
```

Note:
1a banana
1b banana
1c banabana
1d invalid syntax

2a lalaland
2b lalaland
2c invalid syntax
2d landlandland

----

## String comparison

```python
print('foo' == 'bar')   # False
print('NYCU' > 'NTHU')  # True
```

```python
print('12' < 42)
# TypeError: '<' not supported between
#            instances of 'str' and 'int'
```

---

## Format strings

```python
height = 175
weight = 60
bmi = weight / (height/100) / (height/100)
```

```python
print('I am ' + str(height) + ' cm tall, BMI=' + str(bmi))

print('I am {} cm tall, BMI={}'.format(height, bmi))

print(f'I am {height} cm tall, BMI={bmi}')
```

```python
print(f'I am {height} cm tall, BMI={bmi:.1f}')

print('I am {} cm tall, BMI={:.3f}'.format(height, bmi))
```

----

## Exercise: multiplication table, revised

https://neoj.sprout.tw/problem/3089/

- Input: an integer n (n >= 0)
- Output: the 9×9 multiplication table, skipping n

----

## Escape sequences

| Escape Seq | Meaning |
| :--: | :--: |
| `\n` | ASCII line feed (LF) |
| `\t` | ASCII tab character |
| `\'`, `\"` | Single quote (`'`), double quote (`"`) |
| `\\` | Backslash (`\`) |
| `\xhh` | Character with hex value hh |
| `\uhhhh` | Character with 16-bit hex value hhhh |
| `\<newline>` | The line break is ignored |

----

## Escape sequences: practice

Try printing the following text

```text
1. Let's code and "debug"!

2. C:\boot\test\file

3. apple   $5
   banana  $10
   cat     $20
   deer    $40
```

---

## String manipulation

- Length
- Index
- Iterate
- Slice

----

## Length

```python
sentence = 'Hello World!'
length = len(sentence)
print(f'The sentence has {length} characters')
```

```python
sa = '國立陽明交通大學學生聯合會交通大學校區分會'
print(f'The word "{sa}" is {len(sa)} letters long.')
```

----

## Index

```python
sentence = 'Oak is strong and also gives shade'

print(f'The first letter is "{sentence[0]}",'
      f' and the 3rd letter is "{sentence[2]}".')

print(f'The 4th from last letter is "{sentence[-4]}",'
      f' and the last letter is "{sentence[-1]}".')
```

----

## Iterate

```python
abbr = 'COVID'

for c in abbr:
    print(f'Do you know what {c} in {abbr} stands for?')
```

----

## Slice

```python
lang = 'JavaScript'
#       0123456789
#       -987654321

print(f'{lang} is not {lang[:4]}.')

print(f'The words "{lang[1:3]}"'
      f' and "{lang[6:9]}" are forbidden.')

print(f'{lang} is a {lang[-6:]}.')
```

----

## Immutable

Suppose we want to change one of the characters

```python
name = 'Github'
name[3] = 'H'
# TypeError: 'str' object does not support item assignment
```

```python
# Alternative way
name = 'Github'
name = list(name)
name[3] = 'H'
name = ''.join(name)
print(name)
```

---

## String processing

- replace()
- find()
- split()
- join()
- strip()

----

## replace()

```python
sentence = 'PHP is the best language in the world.'
print(sentence.replace('PHP', 'Python'))
```

```python
DNA = 'ATCG ATAT TCTC TGTG TATA CCTT ATGT'
print('RNA:', DNA.replace('T', 'U'))
print('RNA:', DNA.replace('T', 'U', 3))
```

```python
# About security
xss1 = '<script>alert(1)</script>'
print(f'Safe! {xss1.replace("script", "")}')

xss2 = '<sscriptcript>alert(1)</sscriptcript>'
print(f'Safe? {xss2.replace("script", "")}')
```

----

## find()

```python
word = input('Your name: ')
pos = word.find('e')

if pos == -1:
    print(f'The character "e" not found in "{word}".')
else:
    print(f'"e" is the {pos+1}-th character in "{word}".')
```

```python
word = input('Your name: ')

if 'e' not in word:
    print(f'The character "e" not found in "{word}".')
else:
    print(f'We found "e" in "{word}"!')
```

----

## split()

```python
# By default, splits on one or more whitespace characters
score0 = 'Alice:5 Bob:10 Carol:20 Dave:40 Eva:60'
score1 = score0.split()
score2 = [item.split(':') for item in score1]
score3 = {name: val for name, val in score2}

print('Raw: ', score0, 'Each:', score1,
      'List:', score2, 'Dict:', score3, sep='\n')
```

```python
# Specify the separator string
attendees = 'Frank, Grace, Henry, Iris, Jack'
attendees = attendees.split(', ')

for name in attendees:
    print(f'Hello {name}!')
```

```python
# Advanced: split with a RegEx (regular expression)
import re
dep = 'CS,,EE,,,MATH,PHY,,,,LAW,,HS'
print(re.split(',+', dep))  # one or more ','
```

----

## join()

Note that the usage is `'separator'.join([list])`

```python
hobbies = ['Swimming', 'Reading', 'Meowing', 'Hiking']

print(f'My hobbies are {hobbies}.')

print(f'My hobbies are {", ".join(hobbies)}.')
```

```python
# Only strings can be joined
print(''.join([1, 2, 3]))
# TypeError: sequence item 0: expected
#            str instance, int found
```

----

## strip()

Very commonly used with `input()`

```python
name = ' Frank '

print(f'Hey, "{name}"!')
print(f'Hello, "{name.strip()}"!')
```

```python
names = '.... Bob, Alice, .... '

print(f'Hi, "{names}"!')
print(f'Hey, "{names.strip()}"!')
print(f'Hello, "{names.strip("., ")}"!')
```

----

## String processing: practice

You have another friend who never types punctuation:
he always uses a few spaces in place of a comma, and XD in place of a period.
Use Python to turn his text into correctly punctuated text.

```python
s = '我..熬夜...做..簡報..  累死..了...XD' \
    '熬夜..真的..會...變智障   我該..去睡覺XD'

# Ans: '我熬夜做簡報,累死了。熬夜真的會變智障,我該去睡覺。'
# (Roughly: "I stayed up late making slides, I'm exhausted.
#  Staying up late really makes you dumb, I should go to sleep.")
```

Advanced challenge: do it in a single statement

Note:
```python
','.join(s.split()) \
    .replace('XD', '。') \
    .replace('.', '')
```

---

## Extension: byte literal

- Python 2 vs Python 3
- Unicode / UTF-8
- byte literal

Note:
pwntools

----

## Python 2

Python 3.0 was released at the end of 2008
The older Python 2 was officially retired at the start of 2020

Strings in Python 2 are byte-based,
while strings in Python 3 are Unicode by default

----

## Unicode

In UTF-8 encoding, `U+D8` becomes `b'\xC3\x98'`

![](https://img.sean.taipei/2021/12/utf-8.jpg)

Ref: [Unicode vs. UTF-8](https://blog.sean.taipei/2021/12/unicode#unicode-diff)

----

## Byte

```python
print('\xFF\xD8\xFF\xE1'.encode('latin-1'))
print(bytes('\xFF\xD8\xFF\xE1', 'latin-1'))
print(b'\xFF\xD8\xFF\xE1')
```

Week 12. pwntools

----

## Extension: format strings

(If you are interested, see [pyformat.info](https://pyformat.info/))

```python!
# Python 3.8+
name = 'Sean'
age = 21
print(f'{name=}, {age=}')
```

```python
print('{:10}'.format('left'))
print('{:^10}'.format('center'))
print('{:>10}'.format('right'))
# left
#   center
#      right
# 0123456789
```

```python
user = {'name': 'Sean', 'univ': 'NCTU'}  # dict
print('Hello, {name} from {univ}!'.format(**user))
```

```python
print('{:.{prec}} = {:.{prec}f}'
      .format('Python', 3.14159, prec=2))
# Py = 3.14
```

---

## Easter egg: WTF Python

- Not knot!
- Half triple-quoted strings
- Strings can be tricky sometimes
- Splitsies
- Character

Ref: [satwikkansal/wtfpython](https://github.com/satwikkansal/wtfpython) ([Simplified Chinese translation](https://github.com/leisurelicht/wtfpython-cn))

----

## Not knot!

```python
print(not True == False)
# True

print(True == not False)
# SyntaxError: invalid syntax
```

----

## Half triple-quoted strings

```go
print('WTF''')  # OK
print("WTF""")  # OK

print('''WTF')  # SyntaxError
print("""WTF")  # SyntaxError
```

----

## Strings can be tricky sometimes

```python
a = 'wtf'
b = 'wtf'
print(a is b)  # True

a = 'wtf!'
b = 'wtf!'
print(a is b)  # False

a, b = 'wtf!', 'wtf!'
print(a is b)  # True
```

----

## Splitsies

```python
print(' a '.split())      # ['a']
print(' a '.split(' '))   # ['', 'a', '']

print('    '.split())     # []
print('    '.split(' '))  # ['', '', '', '', '']
```

----

## Character

The following expression is syntactically valid

```python
'bar'[0][0][0][0][0]
```

Because Python has no char data type,
indexing a string with `[x]` returns a string containing a single character, not a char

----

# Thanks

Slides: https://hackmd.io/@Sean64/py-string
<!-- .element: class="r-fit-text" -->

<br>

[![CC-BY](https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by.png)](https://creativecommons.org/licenses/by/4.0/deed.zh_TW)

###### This slide deck is released to the public under [Creative Commons Attribution (CC BY)](https://creativecommons.org/licenses/by/4.0/deed.zh_TW). For the source and speaker notes, see [this link](https://hackmd.io/@Sean64/py-string/edit).
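
----

## Appendix: escaping instead of deleting

The security example on the replace() slide suggests that deleting the word `script` once is not a real defense, because the deletion itself can reassemble the tag. Below is a minimal sketch of that failure next to one common alternative, escaping the special characters with the standard library's `html.escape`. The helper names `naive_strip` and `escape_for_html` are hypothetical and only used for illustration; whether escaping is sufficient depends on where the text ends up, so treat this as a sketch rather than a complete defense.

```python
import html


def naive_strip(text):
    """Hypothetical helper: delete every 'script' substring in a single pass."""
    return text.replace('script', '')


def escape_for_html(text):
    """Hypothetical helper: escape <, >, & and quotes instead of deleting anything."""
    return html.escape(text)


xss2 = '<sscriptcript>alert(1)</sscriptcript>'

# Removing the inner 'script' glues the remaining pieces back into a tag
print(naive_strip(xss2))      # <script>alert(1)</script>

# Escaping keeps the text but turns the angle brackets into entities
print(escape_for_html(xss2))  # &lt;sscriptcript&gt;alert(1)&lt;/sscriptcript&gt;
```

The escaped version leaves every character of the input visible to the reader, which also makes it easier to see what the user actually typed.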