# string
Sprout (資訊之芽) 2023 Python Syntax Class
Author: Sean 韋詠祥
Note:
Date: 2023-03-26
Class time: 14:00 - 15:20
----
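## In this lesson you will learn...
- String operations and comparison
- String formatting
- String manipulation and processing
- Related properties of strings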
----
## What Is a String
- A sequence of text
- Can be enclosed in `'` or `"`
- Python has no separate "character" (char) type
----
## Review: Printing Strings
```python
name = 'Sean'
print('Hello', name)
print("Hi, I'm here!")
```
Note:
Already covered: strings, variable assignment, output
----
## What Counts as a String
```python
str1 = 'Strings can use single quotes'
str2 = "or be wrapped in double quotes"
```
```python
str3 = '''
Multi-line text
can use triple quotes
'''
```
```python
"""
Sometimes used as a comment
to keep a block of code from running
"""
```
```python
print(type(str1)) # <class 'str'>
```
---
## String Operations: Examples
```python
# Concatenate strings with +
print('app' + 'le')
# Repeat a string with *
print('ha' * 3)
# Adjacent string literals are merged automatically
print('to' 'day')
```
```python
# This also works across lines
print('I am a student'
      ' in the Department of Computer Science,'
      ' National Yang Ming Chiao Tung University.')
```
----
## String Operations: Practice
Note which lines fail to run,
and which ones print something you did not expect
```python
# Try it: banana
print('1a', 'ba' + 'na' * 2)
print('1b', 'ba' + 2 * 'na' )
print('1c', 'ba' 'na' * 2)
print('1d', 'ba' 2 * 'na' )
```
```python
# Try it: lalaland
print('2a', 'la' * 3 + 'nd')
print('2b', 3 * 'la' + 'nd')
print('2c', 'la' * 3 'nd')
print('2d', 3 * 'la' 'nd')
```
Note:
1a banana
1b banana
1c banabana
1d invalid syntax
2a lalaland
2b lalaland
2c invalid syntax
2d landlandland
----
## String Comparison
```python
print('foo' == 'bar') # False
print('NYCU' > 'NTHU') # True
```
```python
print('12' < 42)
# TypeError: '<' not supported between
# instances of 'str' and 'int'
```
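Why does `'NYCU' > 'NTHU'` hold? Strings are compared left to right by the code point of each character. The sketch below makes that visible with `ord()`; the variable names and sample words are made up for illustration.

```python
# The first differing characters decide the result: 'Y' vs 'T'.
first_diff = (ord('Y'), ord('T'))
print(first_diff)                 # (89, 84), so 'NYCU' > 'NTHU'

# Uppercase letters have smaller code points than lowercase ones.
upper_first = 'Zebra' < 'apple'
print(upper_first)                # True: ord('Z') is 90, ord('a') is 97

# A string that is a prefix of another is the smaller one.
prefix_smaller = 'app' < 'apple'
print(prefix_smaller)             # True

# To compare as numbers, convert explicitly first.
numeric = int('12') < 42
print(numeric)                    # True
```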
---
## String Formatting
Format String
```python
height = 175
weight = 60
bmi = weight / (height/100) / (height/100)
```
```python
print('I am ' + str(height) + ' cm tall, BMI=' + str(bmi))
print('I am {} cm tall, BMI={}'.format(height, bmi))
print(f'I am {height} cm tall, BMI={bmi}')
```
```python
print(f'I am {height} cm tall, BMI={bmi:.1f}')
print('I am {} cm tall, BMI={:.3f}'.format(height, bmi))
```
----
## Exercise: Multiplication Table, Modified
https://neoj.sprout.tw/problem/3089/
- Input: an integer n (n >= 0)
- Output: the 9x9 multiplication table, skipping n
----
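## Escape Sequences
| Escape Seq | Meaning |
| :--: | :--: |
| `\n` | ASCII linefeed (LF) |
| `\t` | ASCII horizontal tab |
| `\'`, `\"` | Single quote (`'`), double quote (`"`) |
| `\\` | Backslash (`\`) |
| `\xhh` | Character with hex value hh |
| `\uhhhh` | Character with 16-bit hex value hhhh |
| `\<newline>` | Backslash and newline are ignored |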
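A quick way to check what an escape sequence produces is to look at the length of the resulting string. This is a small sketch with made-up variable names:

```python
# Each escape sequence becomes a single character in the string.
tab_and_newline = 'a\tb\n'
print(len(tab_and_newline))      # 4: 'a', tab, 'b', newline

# \x41 and \u0041 both name the character 'A'.
same_char = '\x41' == '\u0041' == 'A'
print(same_char)                 # True

# A backslash at the end of a line continues the string literal.
joined = 'one \
two'
print(joined)                    # one two
```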
----
## Escape Sequences: Practice
Try to print the following text
```text
1. Let's code and "debug"!
2. C:\boot\test\file
3.
apple $5
banana $10
cat $20
deer $40
```
---
## String Manipulation
- Getting the length
- Indexing
- Iteration
- Slicing
----
## Getting the Length
```python
sentence = 'Hello World!'
length = len(sentence)
print(f'The sentence has {length} characters')
```
```python
sa = '國立陽明交通大學學生聯合會交通大學校區分會'
print(f'The word "{sa}" is {len(sa)} letters long.')
```
----
## Indexing
```python
sentence = 'Oak is strong and also gives shade'
print(f'The first letter is "{sentence[0]}",'
      f' and the 3rd letter is "{sentence[2]}".')
print(f'The 4th from last letter is "{sentence[-4]}",'
      f' and the last letter is "{sentence[-1]}".')
```
----
## Iteration
```python
abbr = 'COVID'
for c in abbr:
    print(f'Do you know what {c} in {abbr} stands for?')
```
----
## Slicing
```python
lang = 'JavaScript'
#       0123456789
#       -987654321
print(f'{lang} is not {lang[:4]}.')
print(f'The words "{lang[1:3]}"'
      f' and "{lang[6:9]}" are forbidden.')
print(f'{lang} is a {lang[-6:]}.')
```
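A slice `s[start:stop]` includes `start` and excludes `stop`. The sketch below, using made-up variable names, shows a few consequences of that rule:

```python
lang = 'JavaScript'

# Splitting at the same index gives two halves that rebuild the string.
head, tail = lang[:4], lang[4:]
print(head, tail)                  # Java Script
print(head + tail == lang)         # True

# Positive and negative indices can describe the same slice.
same_slice = lang[6:9] == lang[-4:-1]
print(same_slice)                  # True, both are 'rip'

# A slice that runs past the end is clipped instead of raising an error.
clipped = lang[8:100]
print(clipped)                     # pt

# The length of an in-range slice is stop - start.
print(len(lang[2:5]))              # 3
```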
----
## Immutable
Suppose we want to change a single character
```python
name = 'Github'
name[3] = 'H'
# TypeError: 'str' object does not support item assignment
```
```python
# Alternative way
name = 'Github'
name = list(name)
name[3] = 'H'
name = ''.join(name)
print(name)
```
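Another possible approach, sketched here with a made-up variable name `fixed`, is to build a new string from slices instead of going through a list:

```python
name = 'Github'

# Take everything before index 3, the new character, then everything after.
fixed = name[:3] + 'H' + name[4:]
print(fixed)   # GitHub
print(name)    # Github (the original string is unchanged)
```

Either way a new string object is created; the original is never modified in place.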
---
## String Processing
- replace()
- find()
- split()
- join()
- strip()
----
## replace()
```python
sentence = 'PHP is the best language in the world.'
print(sentence.replace('PHP', 'Python'))
```
```python
DNA = 'ATCG ATAT TCTC TGTG TATA CCTT ATGT'
print('RNA:', DNA.replace('T', 'U'))
print('RNA:', DNA.replace('T', 'U', 3))
```
```python
# A note on security
xss1 = '<script>alert(1)</script>'
print(f'Safe! {xss1.replace("script", "")}')
xss2 = '<sscriptcript>alert(1)</sscriptcript>'
print(f'Safe? {xss2.replace("script", "")}')
```
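The second case slips through because removing `script` once leaves a new `script` behind. As a rough sketch of one alternative, the standard library's `html.escape()` escapes the special characters rather than deleting substrings. Whether escaping is sufficient depends on where the text ends up, so treat this as an illustration only.

```python
import html

xss2 = '<sscriptcript>alert(1)</sscriptcript>'

# Deleting the substring once re-creates the tag we wanted to remove.
stripped = xss2.replace('script', '')
print(stripped)   # <script>alert(1)</script>

# Escaping turns <, > and & into HTML entities instead.
escaped = html.escape(xss2)
print(escaped)    # &lt;sscriptcript&gt;alert(1)&lt;/sscriptcript&gt;
```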
----
## find()
```python
word = input('Your name: ')
pos = word.find('e')
if pos == -1:
    print(f'The character "e" was not found in "{word}".')
else:
    print(f'"e" is the {pos+1}-th character in "{word}".')
```
```python
word = input('Your name: ')
if 'e' not in word:
    print(f'The character "e" was not found in "{word}".')
else:
    print(f'We found "e" in "{word}"!')
```
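`find()` only reports the first match. It also accepts an optional start position, which can be used to collect every match. A minimal sketch with a fixed sample word in place of `input()`:

```python
word = 'Genevieve'   # sample value chosen for illustration

positions = []
pos = word.find('e')
while pos != -1:
    positions.append(pos)
    # Continue searching just after the previous match.
    pos = word.find('e', pos + 1)

print(positions)         # [1, 3, 6, 8]
print(word.count('e'))   # 4
```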
----
## split()
```python
# By default, split on runs of whitespace
score0 = 'Alice:5 Bob:10 Carol:20 Dave:40 Eva:60'
score1 = score0.split()
score2 = [item.split(':') for item in score1]
score3 = {name: val for name, val in score2}
print('Raw: ', score0, 'Each:', score1,
      'List:', score2, 'Dict:', score3, sep='\n')
```
```python
# Specify the separator string to split on
attendees = 'Frank, Grace, Henry, Iris, Jack'
attendees = attendees.split(', ')
for name in attendees:
    print(f'Hello {name}!')
```
```python
# Advanced: split with a RegEx (regular expression)
import re
dep = 'CS,,EE,,,MATH,PHY,,,,LAW,,HS'
print(re.split(',+', dep)) # one or more ','
```
----
## join()
Note that the usage is `'separator'.join([list])`
```python
hobbies = ['Swimming', 'Reading', 'Meowing', 'Hiking']
print(f'My hobbies are {hobbies}.')
print(f'My hobbies are {", ".join(hobbies)}.')
```
```python
# Only strings can be joined
print(''.join([1, 2, 3]))
# TypeError: sequence item 0: expected
# str instance, int found
```
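To join values that are not strings, one common approach is to convert each item with `str()` first. A short sketch with made-up variable names:

```python
numbers = [1, 2, 3]

# Convert every item to a string before joining.
digits = ''.join(str(n) for n in numbers)
print(digits)        # 123

# map(str, ...) does the same conversion.
listed = ', '.join(map(str, numbers))
print(listed)        # 1, 2, 3
```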
----
## strip()
Commonly used together with `input()`
```python
name = ' Frank '
print(f'Hey, "{name}"!')
print(f'Hello, "{name.strip()}"!')
```
```python
names = '.... Bob, Alice, .... '
print(f'Hi, "{names}"!')
print(f'Hey, "{names.strip()}"!')
print(f'Hello, "{names.strip("., ")}"!')
```
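Two details are worth a quick sketch: `lstrip()` and `rstrip()` trim only one side, and the argument is treated as a set of characters to remove, not as a prefix or suffix. The variable names and the sample domain are made up for illustration.

```python
names = '.... Bob, Alice, .... '

# Trim only the left or only the right side.
left = names.lstrip('., ')
right = names.rstrip('., ')
print(f'"{left}"')    # "Bob, Alice, .... "
print(f'"{right}"')   # ".... Bob, Alice"

# The argument is a set of characters: w, ., c, o and m are all
# stripped from both ends until another character is reached.
domain = 'www.example.com'.strip('w.com')
print(domain)         # example
```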
----
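## String Processing: Practice
You have another friend who never types punctuation,
and always uses whitespace instead of commas and XD instead of periods.
Use Python to turn the text into properly punctuated sentences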
```python
s = '我..熬夜...做..簡報.. 累死..了...XD' \
'熬夜..真的..會...變智障 我該..去睡覺XD'
# Ans: '我熬夜做簡報,累死了。熬夜真的會變智障,我該去睡覺。'
```
Bonus challenge: do it in a single expression
Note:
```python
','.join(s.split()) \
.replace('XD', '。') \
.replace('.', '')
```
---
## Extra: byte literal
- Python 2 vs Python 3
- Unicode / UTF-8
- byte literal
Note:
pwntools
----
## Python 2
Python 3.0 was released at the end of 2008
The older Python 2 was officially retired at the start of 2020
Strings in Python 2 are based on bytes,
while Python 3 strings are Unicode by default
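In Python 3 the two kinds of data are separate types, and `encode()` / `decode()` convert between them. A minimal sketch, with a sample string chosen only for illustration:

```python
text = '字串'                    # a str: two Unicode characters
data = text.encode('utf-8')      # bytes: the UTF-8 encoding of text

print(type(text), len(text))     # <class 'str'> 2
print(type(data), len(data))     # <class 'bytes'> 6

# Decoding with the same encoding gives the original text back.
round_trip = data.decode('utf-8')
print(round_trip == text)        # True
```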
----
## Unicode
In UTF-8 encoding, `U+D8` becomes `b'\xC3\x98'`
![](https://img.sean.taipei/2021/12/utf-8.jpg)
Ref: [Differences between Unicode and UTF-8](https://blog.sean.taipei/2021/12/unicode#unicode-diff)
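The two-byte UTF-8 form can be reproduced by hand. This sketch, with made-up variable names, splits the code point of `U+00D8` into the bit groups used by the two-byte pattern `110xxxxx 10xxxxxx` and compares the result with `encode()`:

```python
ch = '\u00D8'
encoded = ch.encode('utf-8')
print(encoded)                    # b'\xc3\x98'

cp = ord(ch)                      # 0xD8
# The leading byte carries the upper bits, the second byte the low 6 bits.
byte1 = 0b11000000 | (cp >> 6)
byte2 = 0b10000000 | (cp & 0b00111111)
manual = bytes([byte1, byte2])
print(manual == encoded)          # True
```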
----
## Byte
```python
print('\xFF\xD8\xFF\xE1'.encode('latin-1'))
print(bytes('\xFF\xD8\xFF\xE1', 'latin-1'))
print(b'\xFF\xD8\xFF\xE1')
```
Week 12. pwntools
----
## Extra: String Formatting
(If you are interested, see [pyformat.info](https://pyformat.info/))
```python!
# Python 3.8+
name = 'Sean'
age = 21
print(f'{name=}, {age=}')
```
```python
print('{:10}'.format('left'))
print('{:^10}'.format('center'))
print('{:>10}'.format('right'))
# left
#   center
#      right
# 0123456789
```
```python
user = {'name': 'Sean', 'univ': 'NCTU'} # dict
print('Hello, {name} from {univ}!'.format(**user))
```
```python
print('{:.{prec}} = {:.{prec}f}'
.format('Python', 3.14159, prec=2))
# Py = 3.14
```
---
## Easter Egg: WTF Python
- Not knot!
- Half triple-quoted strings
- Strings can be tricky sometimes
- Splitsies
- Character
Ref: [satwikkansal/wtfpython](https://github.com/satwikkansal/wtfpython) ([Simplified Chinese translation](https://github.com/leisurelicht/wtfpython-cn))
----
## Not knot!
```python
print(not True == False) # True
print(True == not False) # SyntaxError: invalid syntax
```
----
## Half triple-quoted strings
```go
print('WTF''') # works
print("WTF""") # works
print('''WTF') # SyntaxError
print("""WTF") # SyntaxError
```
----
## Strings can be tricky sometimes
```python
a = 'wtf'
b = 'wtf'
print(a is b) # True
a = 'wtf!'
b = 'wtf!'
print(a is b) # False in the interactive shell (implementation-dependent)
a, b = 'wtf!', 'wtf!'
print(a is b) # True
```
----
## Splitsies
```python
print(' a '.split()) # ['a']
print(' a '.split(' ')) # ['', 'a', '']
print('    '.split()) # []
print('    '.split(' ')) # ['', '', '', '', '']
```
----
## Character
The following expression is syntactically valid
```python
'bar'[0][0][0][0][0]
```
This is because Python has no char data type
Indexing a string with `[x]` returns a string containing a single character, not a char
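A short sketch, with made-up variable names, that checks this directly:

```python
s = 'bar'
first = s[0]

print(type(first))          # <class 'str'>
print(len(first))           # 1

# Indexing a one-character string gives an equal one-character string,
# so the chain of [0] can go on indefinitely.
print(first[0] == first)    # True
deep = s[0][0][0][0][0]
print(deep)                 # b
```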
----
# Thanks
Slides: https://hackmd.io/@Sean64/py-string
<!-- .element: class="r-fit-text" -->
<br>
[![CC-BY](https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by.png)](https://creativecommons.org/licenses/by/4.0/deed.zh_TW)
###### These slides are released to the public under the [Creative Commons Attribution](https://creativecommons.org/licenses/by/4.0/deed.zh_TW) license. For the source and speaker notes, see [this link](https://hackmd.io/@Sean64/py-string/edit).