string

Sprout (資訊之芽) 2023 Python Syntax Class
Author: Sean 韋詠祥

Note:
Date: 2023-03-26
Class time: 14:00 - 15:20


What you will learn in this class

  • String operators and comparison
  • Format strings
  • String operations and processing
  • Related properties

What is a string

  • A sequence of text
  • Can be wrapped in either ' or "
  • Python has no separate "character" (char) type

Review: printing strings

name = 'Sean'
print('Hello', name)

print("Hi, I'm here!")

Note:
Already covered: strings, variable assignment, output


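What counts as a string

str1 = 'Strings can use single quotes'
str2 = "or be wrapped in double quotes"
str3 = '''
Multi-line text
can use triple quotes
'''
"""
A bare triple-quoted string is sometimes used as a comment,
to keep a block of code from being executed
"""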
print(type(str1))  # <class 'str'>

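String operators: examples

# Concatenate strings with +
print('app' + 'le')

# Repeat a string with *
print('ha' * 3)

# Adjacent string literals are joined automatically
print('to' 'day')
# This also works across lines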
print('I am a student'
      ' in the Department of Computer Science,'
      ' National Yang Ming Chiao Tung University.')

String operators: practice

Note which lines fail to run,
and which ones print something you did not expect

# Try it: banana
print('1a', 'ba' +     'na' * 2)
print('1b', 'ba' + 2 * 'na'    )
print('1c', 'ba'       'na' * 2)
print('1d', 'ba'   2 * 'na'    )
# Try it: lalaland
print('2a',     'la' * 3 + 'nd')
print('2b', 3 * 'la'     + 'nd')
print('2c',     'la' * 3   'nd')
print('2d', 3 * 'la'       'nd')

Note:
1a banana
1b banana
1c banabana
1d invalid syntax

2a lalaland
2b lalaland
2c invalid syntax
2d landlandland
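
The results for 1c and 2d can be explained by adjacent string literals being merged before `*` is applied. A minimal sketch of that reading, with explicit parentheses added only to show the grouping:

```python
# Adjacent literals are merged first, so the whole 'bana' is repeated
implicit = 'ba' 'na' * 2
explicit = ('ba' + 'na') * 2
print(implicit, explicit)  # banabana banabana

# With an explicit +, normal precedence applies: * binds tighter than +
with_plus = 'ba' + 'na' * 2
print(with_plus)  # banana
```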


String comparison

print('foo' == 'bar')   # False
print('NYCU' > 'NTHU')  # True
print('12' < 42)
# TypeError: '<' not supported between
#            instances of 'str' and 'int'
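
Strings are compared character by character using each character's code point, which is why 'NYCU' > 'NTHU' is decided at the second letter. The sketch below illustrates this with `ord()` and shows one possible way around the TypeError, assuming the string really holds a number:

```python
# Each character is compared by its code point
print(ord('Y'), ord('T'))   # 89 84
print('NYCU' > 'NTHU')      # True, decided at the second character

# Uppercase letters come before lowercase ones
print('Z' < 'a')            # True

# To compare a numeric string with a number, convert it explicitly
text = '12'
is_smaller = int(text) < 42
print(is_smaller)           # True
```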

Format strings

height = 175
weight = 60
bmi = weight / (height/100) / (height/100)
print('I am ' + str(height) + ' cm tall, BMI=' + str(bmi))

print('I am {} cm tall, BMI={}'.format(height, bmi))

print(f'I am {height} cm tall, BMI={bmi}')
print(f'I am {height} cm tall, BMI={bmi:.1f}')

print('I am {} cm tall, BMI={:.3f}'.format(height, bmi))

Exercise: multiplication table, modified

https://neoj.sprout.tw/problem/3089/

  • Read an integer n (n >= 0)
  • Print the multiplication table (1 to 9), skipping n

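Escape sequences

Escape Seq   Meaning
\n           ASCII newline character (LF)
\t           ASCII Tab character
\', \"       Single quote ('), double quote (")
\\           Backslash (\)
\xhh         Character with hex value hh
\uhhhh       Character with 16-bit hex value hhhh
\<newline>   Line continuation: the newline is ignored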

Escape sequences: practice

Try printing the following text

1. Let's code and "debug"!

2. C:\boot\test\file

3.
apple   $5
banana  $10
cat     $20
deer    $40
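
One possible solution sketch for the three items above; other quoting choices work just as well. The variable names are only for illustration, and item 3 assumes the columns are meant to be separated by a Tab:

```python
# 1. Double quotes outside, so only the inner double quotes need escaping
line1 = "Let's code and \"debug\"!"

# 2. Every backslash is doubled; otherwise \b, \t and \f would be read as escapes
line2 = 'C:\\boot\\test\\file'

# 3. \t separates the columns and \n separates the rows
table = 'apple\t$5\nbanana\t$10\ncat\t$20\ndeer\t$40'

print(line1)
print(line2)
print(table)
```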

String operations

  • Length
  • Indexing
  • Iteration
  • Slicing

Length

sentence = 'Hello World!'
length = len(sentence)
print(f'The sentence has {length} characters')
sa = '國立陽明交通大學學生聯合會交通大學校區分會'
print(f'The word "{sa}" is {len(sa)} characters long.')
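
Note that `len()` on a string counts characters, not bytes. A small sketch of the difference, assuming the text is encoded as UTF-8:

```python
sa = '國立陽明交通大學學生聯合會交通大學校區分會'

# len() on a str counts characters (code points)
char_count = len(sa)
# After encoding, len() counts bytes; each of these characters takes 3 bytes in UTF-8
byte_count = len(sa.encode('utf-8'))

print(char_count, byte_count)  # 21 63
```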

Indexing

sentence = 'Oka is strong and also gives shade'

print(f'The first letter is "{sentence[0]}",'
      f' and the 3rd letter is "{sentence[2]}".')

print(f'The 4th from last letter is "{sentence[-4]}",'
      f' and the last letter is "{sentence[-1]}".')
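
A negative index counts from the end, and an index past the end is an error. A short sketch of both points, reusing the sentence above:

```python
sentence = 'Oka is strong and also gives shade'

# A negative index i refers to the same character as len(sentence) + i
last = sentence[-1]
same_as_last = sentence[len(sentence) - 1]
print(last, same_as_last)  # e e

# Indexing past the end raises IndexError instead of returning an empty string
try:
    sentence[len(sentence)]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)  # True
```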

Iteration

abbr = 'COVID'

for c in abbr:
    print(f'Do you know what {c} in {abbr} stands for?')
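
When the position of each character is also needed, `enumerate()` can be combined with the same loop. This is a sketch, and the message text is made up for illustration:

```python
abbr = 'COVID'

# enumerate() yields (index, character) pairs; start=1 makes the count human-friendly
lines = []
for i, c in enumerate(abbr, start=1):
    lines.append(f'Letter {i} of {abbr} is {c}')

print('\n'.join(lines))
```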

Slicing

lang = 'JavaScript'
#       0123456789
#       -987654321

print(f'{lang} is not {lang[:4]}.')

print(f'The words "{lang[1:3]}"'
      f' and "{lang[6:9]}" are forbidden.')

print(f'{lang} is a {lang[-6:]}.')
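
A few more slicing forms on the same word, as a sketch. The step argument (the third number) goes slightly beyond the examples above:

```python
lang = 'JavaScript'

head = lang[:4]             # 'Java': from the start up to, but not including, index 4
tail = lang[4:]             # 'Script': from index 4 to the end
every_other = lang[::2]     # 'JvSrp': a third number sets the step
reversed_lang = lang[::-1]  # 'tpircSavaJ': a step of -1 walks backwards
clipped = lang[6:100]       # 'ript': out-of-range slice bounds are clipped, no error

print(head, tail, every_other, reversed_lang, clipped)
```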

Immutable

What if we want to change a single character?

name = 'Github'
name[3] = 'H'
# TypeError: 'str' object does not support item assignment

# Workaround: convert to a list, change the item, then join it back
name = 'Github'
name = list(name)
name[3] = 'H'
name = ''.join(name)
print(name)
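
Another way to get the same result, sketched here with slicing, is to build a new string from the parts before and after the position:

```python
name = 'Github'

# Slices on either side of index 3, with the new character in between
fixed = name[:3] + 'H' + name[4:]
print(fixed)  # GitHub

# The original string is untouched; a new string was created
print(name)   # Github
```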

String processing

  • replace()
  • find()
  • split()
  • join()
  • strip()

replace()

sentence = 'PHP is the best language in the world.'
print(sentence.replace('PHP', 'Python'))
DNA = 'ATCG ATAT TCTC TGTG TATA CCTT ATGT'
print('RNA:', DNA.replace('T', 'U'))
print('RNA:', DNA.replace('T', 'U', 3))
# A note on information security
xss1 = '<script>alert(1)</script>'
print(f'Safe! {xss1.replace("script", "")}')

xss2 = '<sscriptcript>alert(1)</sscriptcript>'
print(f'Safe? {xss2.replace("script", "")}')
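
The second example is not actually safe: removing the inner "script" leaves pieces that join into a new one. The sketch below shows the result and, as one possible alternative that is not part of the original slides, escapes the text with the standard library's `html.escape` instead of deleting from it:

```python
import html

xss2 = '<sscriptcript>alert(1)</sscriptcript>'

# One replace() pass removes the inner "script", and the leftovers join into a new one
stripped = xss2.replace('script', '')
print(stripped)  # <script>alert(1)</script>

# Escaping keeps the text but turns the angle brackets into harmless entities
escaped = html.escape(stripped)
print(escaped)   # &lt;script&gt;alert(1)&lt;/script&gt;
```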

find()

word = input('Your name: ')

pos = word.find('e')
if pos == -1:
    print(f'The character "e" not found in "{word}".')
else:
    print(f'"e" is the {pos+1}-th character in "{word}".')

# If only presence matters, the in operator is enough
word = input('Your name: ')

if 'e' not in word:
    print(f'The character "e" not found in "{word}".')
else:
    print(f'We found "e" in "{word}"!')
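
`find()` also accepts a start position, which makes it possible to look for later occurrences. A sketch with a hard-coded name (a hypothetical input, so the code runs without typing), plus the related `index()` method:

```python
word = 'Sean Wei'  # hypothetical input

first = word.find('e')              # 1
second = word.find('e', first + 1)  # 6: search again, starting after the first hit
missing = word.find('z')            # -1 when nothing is found

# index() is like find(), but raises ValueError instead of returning -1
try:
    word.index('z')
    raised = False
except ValueError:
    raised = True

print(first, second, missing, raised)  # 1 6 -1 True
```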

split()

# By default, split on runs of whitespace
score0 = 'Alice:5 Bob:10   Carol:20 Dave:40   Eva:60'
score1 = score0.split()
score2 = [item.split(':') for item in score1]
score3 = {name: val for name, val in score2}

print('Raw: ', score0, 'Each:', score1,
      'List:', score2, 'Dict:', score3, sep='\n')
# Specify the separator string to split on
attendees = 'Frank, Grace, Henry, Iris, Jack'
attendees = attendees.split(', ')
for name in attendees:
    print(f'Hello {name}!')
# Advanced: split with a RegEx (regular expression)
import re
dep = 'CS,,EE,,,MATH,PHY,,,,LAW,,HS'
print(re.split(',+', dep))  # one or more ','
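
The pieces returned by `split()` are always strings. The sketch below converts the scores to `int` so they can be added, and shows the optional `maxsplit` argument:

```python
score0 = 'Alice:5 Bob:10   Carol:20 Dave:40   Eva:60'

# split(':') returns strings, so convert the score part to int
scores = {}
for item in score0.split():
    name, val = item.split(':')
    scores[name] = int(val)

print(scores['Bob'] + scores['Carol'])  # 30

# maxsplit limits the number of cuts; the remainder stays in one piece
first, rest = 'Frank, Grace, Henry'.split(', ', 1)
print(first, '|', rest)  # Frank | Grace, Henry
```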

join()

Note the usage: 'separator'.join([list])

hobbies = ['Swimming', 'Reading', 'Meowing', 'Hiking']
print(f'My hobbies are {hobbies}.')
print(f'My hobbies are {", ".join(hobbies)}.')
# Only strings can be joined
print(''.join([1, 2, 3]))
# TypeError: sequence item 0: expected
#            str instance, int found
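
To join values that are not strings, convert them first. A minimal sketch:

```python
numbers = [1, 2, 3]

# Convert every item to str first, then join
joined = ''.join(map(str, numbers))
print(joined)    # 123

# A generator expression works the same way
with_sep = ', '.join(str(n) for n in numbers)
print(with_sep)  # 1, 2, 3
```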

strip()

Very handy when reading input()

name = '  Frank '
print(f'Hey,   "{name}"!')
print(f'Hello, "{name.strip()}"!')
names = '.... Bob, Alice, .... '
print(f'Hi,    "{names}"!')
print(f'Hey,   "{names.strip()}"!')
print(f'Hello, "{names.strip("., ")}"!')
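
The argument to `strip()` is treated as a set of characters to trim, not as a prefix or suffix string. The sketch below also shows the one-sided variants `lstrip()` and `rstrip()`:

```python
names = '.... Bob, Alice, .... '

both = names.strip('., ')    # 'Bob, Alice'
left = names.lstrip('., ')   # 'Bob, Alice, .... '
right = names.rstrip('., ')  # '.... Bob, Alice'

# Characters in the middle are never removed
print([both, left, right])
```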

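String processing: practice

You have another friend who never types punctuation:
they use some spaces in place of a comma, and XD in place of a full stop.
Use Python to restore the proper punctuation for them.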

s = '我..熬夜...做..簡報..  累死..了...XD' \
    '熬夜..真的..會...變智障   我該..去睡覺XD'

# Ans: '我熬夜做簡報,累死了。熬夜真的會變智障,我該去睡覺。'

Bonus challenge: do it in one expression

Note:

','.join(s.split()) \
.replace('XD', '。') \
.replace('.', '')
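
The same chain, written one step at a time so that each intermediate value can be inspected. This is only an unrolled sketch of the expression above; the variable names are made up:

```python
s = '我..熬夜...做..簡報..  累死..了...XD' \
    '熬夜..真的..會...變智障   我該..去睡覺XD'

parts = s.split()                     # cut at each run of spaces
joined = ','.join(parts)              # put a comma where the spaces were
stopped = joined.replace('XD', '。')  # XD becomes a full stop
result = stopped.replace('.', '')     # drop the filler dots

print(len(parts))  # 3
print(result)
```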

Extension: byte literals

  • Python 2 vs Python 3
  • Unicode / UTF-8
  • byte literal

Note:
pwntools


Python 2

Python 3.0 was released at the end of 2008
The older Python 2 was officially retired at the start of 2020

Strings in Python 2 are byte-based,
while Python 3 strings are Unicode by default
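
In Python 3 the two kinds of string are separate types that do not mix implicitly. A small sketch:

```python
text = 'abc'   # str: a sequence of Unicode characters
data = b'abc'  # bytes: a sequence of integers in range(256)

print(type(text), type(data))  # <class 'str'> <class 'bytes'>
print(text == data)            # False: a str never equals a bytes object

# Mixing the two raises TypeError; convert explicitly instead
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

combined = text + data.decode('ascii')
print(mixed_ok, combined)      # False abcabc
```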


Unicode

In UTF-8, U+D8 is encoded as b'\xC3\x98'

Ref: Differences between Unicode and UTF-8
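
The encoding of a single code point can be checked directly in Python. A quick sketch using U+00D8:

```python
ch = '\xD8'  # U+00D8, the letter Ø

# One code point, but two bytes once encoded as UTF-8
encoded = ch.encode('utf-8')
print(len(ch), encoded)     # 1 b'\xc3\x98'

# Code points below U+0080 (ASCII) stay a single byte
print('A'.encode('utf-8'))  # b'A'

# Decoding reverses the process
print(encoded.decode('utf-8') == ch)  # True
```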


Byte

print('\xFF\xD8\xFF\xE1'.encode('latin-1'))
print(bytes('\xFF\xD8\xFF\xE1', 'latin-1'))
print(b'\xFF\xD8\xFF\xE1')
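
A few things that can be done with the resulting bytes object, as a sketch:

```python
magic = b'\xFF\xD8\xFF\xE1'

# Indexing bytes gives an int, unlike indexing a str
print(magic[1])     # 216 (0xD8)

# Slicing still gives bytes, and hex() shows the raw values
print(magic[:2])    # b'\xff\xd8'
print(magic.hex())  # ffd8ffe1

# latin-1 maps each byte to the code point with the same number,
# so decoding and encoding again is lossless
round_trip = magic.decode('latin-1').encode('latin-1')
print(round_trip == magic)  # True
```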

Week 12. pwntools


Extension: format strings

(If you are interested, see pyformat.info)

# Python 3.8+
name = 'Sean'
age = 21
print(f'{name=}, {age=}')
print('{:10}'.format('left'))
print('{:^10}'.format('center'))
print('{:>10}'.format('right'))
# left
#   center
#      right
# 0123456789
user = {'name': 'Sean', 'univ': 'NCTU'}  # dict
print('Hello, {name} from {univ}!'.format(**user))
print('{:.{prec}} = {:.{prec}f}'
      .format('Python', 3.14159, prec=2))
# Py = 3.14
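
The same format specifications also work inside f-strings. A sketch of rough equivalents; the trailing `|` is added here only to make the padding visible:

```python
user = {'name': 'Sean', 'univ': 'NCTU'}

left = f'{"left":10}|'
center = f'{"center":^10}|'
right = f'{"right":>10}|'
print(left, center, right, sep='\n')

# Nested braces inside the format spec are filled in from variables
prec = 2
mixed = f'{"Python":.{prec}} = {3.14159:.{prec}f}'
print(mixed)  # Py = 3.14

greeting = f'Hello, {user["name"]} from {user["univ"]}!'
print(greeting)  # Hello, Sean from NCTU!
```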

Easter egg: WTF Python

  • Not knot!
  • Half triple-quoted strings
  • Strings can be tricky sometimes
  • Splitsies
  • Character

Ref: satwikkansal/wtfpython, Simplified Chinese translation


Not knot!

print(not True == False)  # True
print(True == not False)  # SyntaxError: invalid syntax
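
The difference comes from operator precedence: `==` binds tighter than `not`. A sketch with parentheses that makes the second form valid:

```python
# Read as: not (True == False)
a = not True == False
print(a)  # True

# Parentheses around the not expression make this form valid
b = True == (not False)
print(b)  # True
```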

Half triple-quoted strings

print('WTF''')  # OK
print("WTF""")  # OK
print('''WTF')  # SyntaxError
print("""WTF")  # SyntaxError

Strings can be tricky sometimes

a = 'wtf'
b = 'wtf'
print(a is b)  # True

a = 'wtf!'
b = 'wtf!'
print(a is b)  # False in the interactive REPL; may be True when run from a file

a, b = 'wtf!', 'wtf!'
print(a is b)  # True
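
These results depend on CPython implementation details, so `is` should not be used to compare string contents. A sketch of the safer comparison; `sys.intern()` is shown only to illustrate what sharing one object means:

```python
import sys

a = 'wtf!'
b = ''.join(['wtf', '!'])  # built at run time rather than written as a literal

# == compares contents and is the normal way to compare strings
print(a == b)  # True

# is compares object identity; whether two equal strings share one object
# is an implementation detail and should not be relied on.
# sys.intern() explicitly returns one shared object for equal strings.
print(sys.intern(a) is sys.intern(b))  # True
```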

Splitsies

print(' a '.split())     # ['a']
print(' a '.split(' '))  # ['', 'a', '']

print('    '.split())     # []
print('    '.split(' '))  # ['', '', '', '', '']
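
The same difference shows up whenever the input has more than one space in a row. A sketch, including one way to drop the empty strings afterwards:

```python
line = ' a  b '

# No argument: runs of whitespace are one separator, and empty ends are dropped
print(line.split())  # ['a', 'b']

# Explicit ' ': every single space is a separator, so empty strings appear
pieces = line.split(' ')
print(pieces)        # ['', 'a', '', 'b', '']

# Filtering out the empty strings recovers the first result
non_empty = [p for p in pieces if p]
print(non_empty)     # ['a', 'b']
```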

Character

The following expression is syntactically valid

'bar'[0][0][0][0][0] 

because Python has no char data type

Indexing a string with [x] returns a string containing a single character, not a char
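
This can be checked directly. A short sketch:

```python
first = 'bar'[0]

# The result of indexing is itself a str of length 1
print(type(first), len(first))  # <class 'str'> 1

# Indexing a one-character string at 0 returns an equal string, so the chain can go on
print(first[0] == first)        # True
print('bar'[0][0][0][0][0])     # b
```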


Thanks

Slides: https://hackmd.io/@Sean64/py-string


CC-BY

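These slides are released to the public under the Creative Commons Attribution (CC BY) license; for the source and speaker notes, see this link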