# Dictionaries and Sets
### A bite of `dict` and `set`
---
<!-- .slide: data-background="https://hackmd.io/_uploads/ByB5Vi_bC.jpg" -->
## About Me
- Yoyo
- PyCon TW volunteer since 2022
- Engineer working across both Backend and AI

---
## Set
<font size=5>
A collection that contains no duplicate values
</font>
```python
s = {1, 1, 2, 2, 3, 3}
print(s) # {1, 2, 3}
s = set(['a', 'b', 'c', 123, 321, '123'])
print(s) # Look at the output: the values do not come out in any particular order
```
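----
A common everyday use of this property is removing duplicates from a list. The sketch below is illustrative only (the `items` list is made up); it also shows `dict.fromkeys()` as one way to keep the original order, since a set does not promise one.

```python
items = ['b', 'a', 'b', 'c', 'a']  # hypothetical sample data

# set() drops duplicates but does not promise any particular order
unique = set(items)
print(len(unique))  # 3

# dict keys are unique too and keep insertion order,
# so dict.fromkeys() removes duplicates while preserving order
ordered_unique = list(dict.fromkeys(items))
print(ordered_unique)  # ['b', 'a', 'c']
```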
----
## Updating
```python
s = {1, 2, 3}
# In-place updates (these methods return None)
s.add(4)
s.update({5, 6})
s.update([7, 8, 7])
s.remove(8)
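s.discard(8) # Does nothing: 8 is already gone and discard() never raises
# Subtraction (set difference) is supported
s -= {0, 1, 2}
# Addition is not supported: uncommenting the next line raises TypeError
# s += {'invalid', 'operation'}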
```
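----
`remove()` and `discard()` only differ when the element is missing. A minimal sketch of that difference (the `removed` flag is just for illustration):

```python
s = {1, 2, 3}

# discard() silently ignores a missing element
s.discard(99)

# remove() raises KeyError when the element is missing
try:
    s.remove(99)
    removed = True
except KeyError:
    removed = False

print(s, removed)  # {1, 2, 3} False
```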
----
## Updating (2)
```python
# Operations that return a new set and leave the original unchanged
s = {'a', 'b'}
print(s.union({'c', 'd'})) # {'a', 'b', 'c', 'd'}
print(s | {'e', 'f'}) # {'a', 'b', 'e', 'f'}
print(s) # {'a', 'b'}
```
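----
Each of these operators also has an in-place form (`|=`, `&=`, `-=`, `^=`). The sketch below uses a second name, `alias`, purely to make visible which form creates a new object and which one modifies the existing set.

```python
s = {'a', 'b'}
alias = s            # a second name for the same set object

t = s | {'c'}        # builds a new set; s is untouched
s |= {'d'}           # in-place union; alias sees the change

print(t == {'a', 'b', 'c'})      # True
print(alias == {'a', 'b', 'd'})  # True
print(alias is s)                # True
```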
----
## Operators
```python
a = {1, 2, 3}
b = {3, 4, 5}
print(a & b) # {3}
print(a | b) # {1, 2, 3, 4, 5}
print(a - b) # {1, 2}
print(a ^ b) # {1, 2, 4, 5}
print(a > b) # False (a is not a proper superset of b)
print(a < b) # False (a is not a proper subset of b)
```
##### [Venn diagrams of the set operations](https://texample.net/media/tikz/examples/PNG/set-operations-illustrated-with-venn-diagrams.png)
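----
The comparison operators test subset and superset relations, which is easier to see with a set that really is contained in another. A small sketch (the set `c` is added here for illustration), which also shows that the method forms accept any iterable while the operators need sets on both sides:

```python
a = {1, 2, 3}
c = {1, 2}  # illustrative subset of a

print(c < a)            # True: c is a proper subset of a
print(c <= a, a <= a)   # True True
print(a < a)            # False: a set is not a proper subset of itself
print(a.isdisjoint({7, 8}))  # True: nothing in common

# The method form accepts a list; `a & [3, 4]` would raise TypeError
print(a.intersection([3, 4]))  # {3}
```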
----
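## Main use: checking whether a value exists
Checking whether an element is in a set is much faster than checking whether it is in a list: a set looks the value up by hash, while a list is scanned element by element
```python
# Open IPython and run the following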
a = set(range(10000))
b = list(range(10000))
import random
%timeit random.randint(0, 10000) in a # 572 ns ± 0.895 ns per loop
%timeit random.randint(0, 10000) in b # 35200 ns ± 244 ns per loop
# For reference: the same membership test on a dict
d = {v: v for v in range(10000)}
%timeit random.randint(0, 10000) in d # 587 ns ± 4.96 ns per loop
```
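----
Outside IPython, roughly the same comparison can be run with the standard `timeit` module. This is only a sketch: the helper `lookup` and the loop count of 2000 are arbitrary choices, and the absolute numbers will differ from machine to machine.

```python
import random
import timeit

data_set = set(range(10000))
data_list = list(range(10000))

def lookup(container):
    # Same membership test as in the IPython session
    return random.randint(0, 10000) in container

set_time = timeit.timeit(lambda: lookup(data_set), number=2000)
list_time = timeit.timeit(lambda: lookup(data_list), number=2000)
print(f"set: {set_time:.4f}s  list: {list_time:.4f}s")
```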
---
## Dict
<font size=5>
Python's most versatile data structure
</font>
```python
a = {'key': 'value'}
b = dict(key='value')
print(a == b) # True
print(type({})) # ??
```
----
### Frankendict
```python
# Data types that are supported as keys
d = {
    's': 123, # str
    321: 'yoyo', # int
    0.99: ['hello', 'yo'], # float
    ('tup', 'key'): 100, # tuple
    'nested': {'level': 2}, # str key with a nested dict as its value
    # ['list', 'not', 'allowed']: 'as key',  # a list is unhashable
}
print(d[321]) # yoyo
print(d['s']) # 123
```
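----
What makes a type usable as a key is that it is hashable. The sketch below shows what happens with a list key, plus one related surprise: keys that compare equal and hash equal (`1`, `1.0`, `True`) share a single slot.

```python
d = {}
try:
    d[['list', 'key']] = 'value'
except TypeError as exc:
    error = str(exc)
print(error)  # the message mentions "unhashable type"

# 1, 1.0 and True are equal and hash the same, so they are one key
d = {1: 'int'}
d[1.0] = 'float'
d[True] = 'bool'
print(d)  # {1: 'bool'}
```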
----
### Iterating over elements
```python
d = {'a': 100, 'b': 200}
# Get both key and value
for key, value in d.items():
    ...
# Get only the keys
for key in d:
    ...
# Get only the values
for v in d.values():
    ...
```
----
### Getting everything at once
```python
d = {'a': 100, 'b': 200}
keys = d.keys()
keys = list(keys)
values = d.values()
values = list(values)
```
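----
The `list(...)` call matters here because `keys()` and `values()` return live views, not copies. A short sketch of the difference (variable names are illustrative):

```python
d = {'a': 100, 'b': 200}

keys_view = d.keys()            # live view onto the dict
keys_snapshot = list(d.keys())  # independent copy taken right now

d['c'] = 300

print(list(keys_view))   # ['a', 'b', 'c']  (the view follows the dict)
print(keys_snapshot)     # ['a', 'b']       (the list does not)

# A keys view also supports set operations
print(keys_view & {'a', 'z'})  # {'a'}
```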
----
### Updating
<font size=5>
Keys are kept in insertion order (guaranteed since Python 3.7), so the order stays stable when you iterate
</font>
```python
d = {'a': 100}
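d['b'] = 200 # Add a key directly
d.update({'c': 300, 'a': 20}) # Update the dict with another dict
d |= {'d': 123, 'b': 40} # Same as above with a more concise syntax (Python 3.9+); recommended
d = {**d, **{'e': 33}} # Unpacking syntax, a workaround from older versions; not recommended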
print(d) # {'a': 20, 'b': 40, 'c': 300, 'd': 123, 'e': 33}
```
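----
Like sets, dicts have a non-mutating counterpart to `|=`. The sketch below (with made-up names `base` and `merged`) shows `|` building a new dict, and that an overwritten key keeps its original position.

```python
base = {'a': 1, 'b': 2}

# | returns a new dict (Python 3.9+); the right-hand side wins on conflicts
merged = base | {'a': 10, 'c': 3}

print(merged)  # {'a': 10, 'b': 2, 'c': 3}
print(base)    # {'a': 1, 'b': 2}
```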
----
### Deleting
```python
d = {'a': 10, 'b': 20}
del d['a']
v = d.pop('b')
print(d, v)
```
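----
Both `del` and `pop()` raise `KeyError` for a missing key, but `pop()` accepts a default that avoids it. A minimal sketch:

```python
d = {'a': 10}

value = d.pop('a')            # 10, and 'a' is removed
missing = d.pop('b', None)    # no KeyError: the default is returned
print(value, missing, d)      # 10 None {}

try:
    del d['b']                # del has no default form
except KeyError as exc:
    print('KeyError:', exc)   # KeyError: 'b'
```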
----
#### Safe lookup
```python
d = {'a': 100}
# print(d['b']) # Uncommenting this raises KeyError!!
print(d.get('b', 'No such key!')) # Prints 'No such key!'
```
#### Safely setting a value
```python
d = {'a': 10}
d.setdefault('b', 0) # Key is missing: adds the key with this value
d.setdefault('a', 700) # Key already exists: the original value is left untouched
print(d) # {'a': 10, 'b': 0}
```
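----
A typical place where `setdefault()` pays off is grouping values under a key. The sketch below uses a made-up word list and also shows `collections.defaultdict`, which is one alternative from the standard library for the same job.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']  # sample data

# setdefault() returns the existing list, or inserts and returns a new one
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)
print(by_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# defaultdict(list) does the same without spelling out the default each time
grouped = defaultdict(list)
for word in words:
    grouped[word[0]].append(word)
print(dict(grouped) == by_letter)  # True
```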
---
### Rules for using dicts safely!
> Pass by value or by reference, that is the question.
----
<font size=5>
Reassigning a parameter inside a function does not affect the outside
</font>
```python
def inplace_modify(v: int):
    v = 60
v = 100
inplace_modify(v)
print(v) # 100
```
<font size=5>
Huh? Why did it change!? So much for "does not affect the outside" QQ
</font>
```python
def inplace_modify(d: dict):
    d['a'] = 'modified'
d = {'a': 100, 'b': 200}
inplace_modify(d)
print(d) # {'a': 'modified', 'b': 200}
```
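----
The two examples differ because one rebinds a local name and the other mutates a shared object. The sketch below (function names `rebind` and `mutate` are made up for illustration) puts both side by side and shows one way to protect the caller's dict: pass a copy, keeping in mind that `dict.copy()` is shallow.

```python
import copy

def rebind(d):
    d = {'a': 'rebound'}      # rebinds the local name only

def mutate(d):
    d['a'] = 'mutated'        # changes the object the caller also sees

original = {'a': 100, 'nested': {'x': 1}}

rebind(original)
print(original['a'])            # 100

mutate(original.copy())         # the function gets a shallow copy
print(original['a'])            # 100

shallow = original.copy()
shallow['nested']['x'] = 2      # the nested dict is still shared
print(original['nested']['x'])  # 2

deep = copy.deepcopy(original)
deep['nested']['x'] = 3         # fully independent copy
print(original['nested']['x'])  # 2
```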
----
<font size=5>
Shared references still have handy and legitimate uses
</font>
```python
d = {'a': 100}
l = d.setdefault('b', [])
l.append(90)
print(d) # {'a': 100, 'b': [90]}
```
<font size=5>
Hmm!? What is going on here OwO!?
</font>
```python
l = [{'a': 10}] * 3
l[0]['a'] = 33
print(l) # [{'a': 33}, {'a': 33}, {'a': 33}]
```
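----
The surprise comes from `*` repeating a reference to one dict instead of copying it. A sketch of the usual fix, a list comprehension that builds a fresh dict per slot (the names `shared` and `independent` are illustrative):

```python
shared = [{'a': 10}] * 3                      # three references to ONE dict
independent = [{'a': 10} for _ in range(3)]   # three separate dicts

independent[0]['a'] = 33
print(independent)  # [{'a': 33}, {'a': 10}, {'a': 10}]

print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```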
----
### Ad(Dict)ion
<font size=6>
Symptoms of dict addiction
</font>
```shell
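1. A large share of the parameters in the program are dicts
2. Function arguments and return values are full of dicts
3. Functions add to or modify the dicts that are passed in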
```
----
```python
def do_something_complex(d: dict) -> dict:
    ...
d = {}
do_something_complex(d)
print(d) # ??
```
----
### The aftermath of Dict Everywhere
```shell
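1. Nobody knows what the dict finally looks like (which keys it has, what the values are)
2. It is hard to trace when and where a value gets modified
3. Without type hints to help, it is hard to tell what type each value is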
```
----
### Best Practice
<font size=6>
Do not store complex data structures in a dict; define a separate class to hold the data
</font>
***Explicit** is ALWAYS better than **implicit***
```python
# Don't do this
data = {'name': 'Yoyo', 'address': 'Japan'}
# Do this instead
class User:
    def __init__(self, name, address):
        self.name = name
        self.address = address
def modify_user(user):
    user.name = 'Gobby'
    user.extra_attr = 'New attribute'
user = User(name='Yoyo', address='Japan')
modify_user(user)
print(user.name) # Gobby
print(user.address) # Japan
print(user.extra_attr) # New attribute
```
---
<!-- .slide: data-background="https://hackmd.io/_uploads/Sk6dWsdb0.jpg" -->
## Thanks for listening~~
### Q&A
---
## Reference
[Colab](https://colab.research.google.com/drive/1Nf9pNZmgmlu6-tCntutz5HH1I-msHoKg?usp=sharing)