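###### tags: `Python`

# Scope of Names in Python

A Python program is organized into units called [**code blocks**](https://docs.python.org/3/reference/executionmodel.html#structure-of-a-program). Each **module** (a single script file), function body, and class definition is a code block. Each block has its own table that records names and the objects they refer to, called a **namespace**. Python uses the **nesting relationship between code blocks** to find the object that each name refers to.

## Binding names

Whenever an assignment statement, function definition, class definition, or import statement executes, or arguments are passed in a call to a function or method, a name is [**bound**](https://docs.python.org/3/reference/executionmodel.html#binding-of-names) to the corresponding object and recorded in the namespace of that code block.

The Python runtime provides a \_\_builtins\_\_ namespace by default. It corresponds to the [builtins module](https://docs.python.org/3/library/builtins.html#module-builtins) and contains all built-in names; the name of the built-in function `print`, for example, is recorded there. When a name cannot be found anywhere else, Python finally looks in \_\_builtins\_\_, which is why we can call built-in functions such as `print` without importing anything.

## Basic rule: use the name bound in the nearest enclosing code block

When a name is used, the basic rule is that the name bound in the [nearest code block](https://docs.python.org/3/reference/executionmodel.html#resolution-of-names) wins. Consider this simple example:

```python
a = 10

def test():
    b = 20
    print(b)
    print(a)

test()
```

Python starts by executing the code block of the module itself. After `a = 10` runs, the name `a` is recorded as bound to the integer object 10. After the definition of `test` runs, the name `test` is recorded as bound to the newly defined function:

```
__main__
|
| a   -----------> 10
| test-----------> test function
+-------
```

Python names the module being run "\_\_main\_\_"; an imported module is named after its file name.

When `test()` is called, the name `test` is found in the current namespace, so the function bound to that name runs. Python then executes the function's code block, creates a namespace for that block, and, after `b = 20` runs, records the name `b` as bound to the integer object 20:

```
+--__main__--
|
| a   -----------> 10
| test-----------> function
|
|  +--test--
|  |
|  | b---------> 20
|  +--------
+------------
```

When `print(b)` runs, the name `b` is found in the current namespace, so 20 is printed. When `print(a)` runs next, there is no name `a` in the current namespace, so Python searches the enclosing code block and prints 10, the object bound to `a` one level up. The final output is:

```
# py test.py
20
10
```

This name lookup happens at run time, so moving `a = 10` to after the definition of `test()` causes no problem:

```python
def test():
    b = 20
    print(b)
    print(a)

a = 10
test()
```

By the time `test()` is actually called, `a` has already been bound, so the name `a` is found and the program runs correctly:

```
# py test.py
20
10
```

## Local and global variables

Names bound in a particular code block disappear when that block finishes executing and can no longer be used. For example, suppose we append `print(b)` to the previous example:

```python
a = 10

def test():
    b = 20
    print(b)
    print(a)

test()
print(b)
```

Although `b` is bound to 20 inside the `test` function, the name `b` is gone once the call to `test()` ends. At run time `b` cannot be found in any namespace, which raises a [`NameError`](https://docs.python.org/3/library/exceptions.html#NameError) exception:

```
# py test.py
20
10
Traceback (most recent call last):
  File "D:\code\test\test.py", line 9, in <module>
    print(b)
NameError: name 'b' is not defined
```

Because every name is valid only within the code block where it was bound, such names are called **local variables**. Names bound at module level, such as `a` in the earlier examples, can be used anywhere in the module, so they are also called **global variables**. In other words, module-level names are both local variables of the module's code block and global variables of the module. A name that is used in a block but not bound there, such as `a` inside `test()` in the earlier example, is called a **free variable**.

If a block binds a name anywhere, that name is treated as a local variable of the block. Using the name before it is bound does not make Python search the enclosing blocks; instead it raises an [`UnboundLocalError`](https://docs.python.org/3/library/exceptions.html#UnboundLocalError) exception, meaning a local variable that has not been bound yet. For example:

```python
a = 10

def test():
    b = 20
    print(b)
    print(a)
    a = 30

test()
```

The output is:

```
# py test.py
20
Traceback (most recent call last):
  File "D:\code\python\test.py", line 9, in <module>
    test()
  File "D:\code\python\test.py", line 6, in test
    print(a)
UnboundLocalError: local variable 'a' referenced before assignment
```

This happens because `a` is bound in `test` only after `print(a)` runs. The binding makes `a` local to the whole function, so the error occurs even though an outer `a` with the same name exists.

## Handling identical names in inner and outer blocks

Because the search starts from the nearest code block, an inner name hides an outer name that has the same spelling. For example:

```python
a = 10

def test():
    a = 20
    print(a)
    test1()

def test1():
    a = 30
    print(a)

print(a)
test()
```

When the module-level `print(a)` runs, the name found is the `a` bound in the module:

```
__main__
|
| a    -----------> 10
| test -----------> test function
| test1-----------> test1 function
+-------
```

So 10 is printed. When `test` runs, the name found is the `a` bound inside the function. It has the same spelling as the `a` bound in the enclosing module, but the two belong to different namespaces:

```
__main__
|
| a    -----------> 10
| test -----------> test function
| test1-----------> test1 function
|
|  +--test--
|  |
|  | a ---------> 20
|  +--------
+-------
```

So 20 is printed. When `test1` runs, yet another `a` is bound:

```
__main__
|
| a    -----------> 10
| test -----------> test function
| test1-----------> test1 function
|
|  +--test--
|  |
|  | a ---------> 20
|  +--------
|
|  +--test1--
|  |
|  | a ---------> 30
|  +--------
+-------
```

So 30 is printed. The final output is:

```
# py test.py
10
20
30
```

Note carefully that the nesting of code blocks follows the structure of the source code, not the calling relationship between functions. Although `test1()` is called from inside `test()`, neither contains the other. So if we remove the statement that binds `a` in `test1`, like this:

```python
a = 10

def test():
    a = 20
    print(a)
    test1()

def test1():
    print(a)

print(a)
test()
```

the `a` printed in `test1` is the 10 bound to `a` in the enclosing module:

```
# py test.py
10
20
10
```

If `test1` is instead defined inside `test`, like this:

```python
a = 10

def test():
    def test1():
        print(a)

    a = 20
    print(a)
    test1()

print(a)
test()
```

then when `test1` runs, the blocks are nested like this (note that `test1` is now a name bound inside `test`, not in the module):

```
__main__
|
| a    -----------> 10
| test -----------> test function
|
|  +--test--
|  |
|  | a    ---------> 20
|  | test1---------> test1 function
|  |
|  |  +--test1--
|  |  |
|  |  +--------
|  +--------
+-------
```

The block nearest to `test1` is `test`, so the name `a` refers to 20 rather than to the 10 in the module:

```
# py test.py
10
20
20
```

If we remove the assignment that binds `a` in `test`, like this:

```python
a = 10

def test():
    def test1():
        print(a)

    print(a)
    test1()

print(a)
test()
```

`test1` keeps searching outward until it finds the outermost `a`, so 10 is printed three times:

```
# py test.py
10
10
10
```

## Explicitly using a global variable or an enclosing local variable

If you want the global variable `a` of the outermost module, you can use `global` inside `test1` to declare which global variable you mean. For example:

```python
a = 10

def test():
    def test1():
        global a
        print(a)

    a = 20
    print(a)
    test1()

print(a)
test()
```

Python then knows that the name `a` in `test1` must be looked up directly at the outermost level, so what is printed is the 10 bound to the outermost `a`:

```
# py test.py
10
20
10
```

If you specifically want the local variable of an enclosing function rather than the global variable of the top-level module, you can use `nonlocal`, like this:

```python
a = 10

def test():
    def test1():
        nonlocal a
        print(a)

    a = 20
    print(a)
    test1()

print(a)
test()
```

Now `test1` uses the `a` of the enclosing `test`:

```
# py test.py
10
20
20
```

`nonlocal` does not look just one level out; it searches outward level by level. For example:

```python
a = 10

def test():
    def test1():
        def test2():
            nonlocal a
            print(a)
        test2()

    a = 20
    test1()

test()
```

`test2` uses the `a` bound two levels out in `test`, so 20 is printed:

```
# py test.py
20
```

You may wonder why `nonlocal` is needed at all, since names are searched outward level by level even without it. One reason is that `nonlocal` never searches the outermost module for global variables. In addition, `nonlocal` (like `global`) is what lets an inner function rebind the outer name by assignment; without the declaration, an assignment would create a new local name instead. Consider the following example:

```python
b = 10

def test():
    def test1():
        nonlocal b
        print(b)
    test1()

test()
```

The outermost module does have a name `b`, but because `test1` declares `b` as `nonlocal`, the search never reaches the module level and an error results:

```
# py test.py
  File "D:\code\test\test.py", line 5
    nonlocal b
    ^^^^^^^^^^
SyntaxError: no binding for nonlocal 'b' found
```

In fact the program has not even started running. While compiling the code, Python notices that no enclosing function block binds the name `b`, and raises a [**SyntaxError**](https://docs.python.org/3/library/exceptions.html?highlight=syntaxerror#SyntaxError) exception, which indicates a syntax error.

## Indentation does not create a code block

Because a function body must be indented, it is easy to assume that indentation also creates a code block, the way braces do in C/C++. In fact, an indented region is not a code block. Names bound inside an indented region belong to the code block that contains it, and they still exist after the indented region ends. For example:

```python
for i in range(3):
    a = i
    print(i)

print(i)
print(a)
```

After the `for` loop ends, both `i`, which is bound by the `for` statement itself, and `a`, which is bound only in the loop body, remain valid. The output is:

```
# py test.py
0
1
2
2
2
```

## The code block of a class definition does not include the methods in the class

We said earlier that block nesting follows the source code, but there is one exception: **the code block of a class definition does not include the methods defined in the class**. Consider this example:

```python
class A:
    x = 10

    def test(self):
        print(x)

a = A()
a.test()
```

By the rule of searching the nearest block, the `x` that is not found in the `test` method should be found one level out, in the `x` bound in the class definition. In practice, however, this program fails:

```
# py test.py
Traceback (most recent call last):
  File "D:\code\test\test.py", line 8, in <module>
    a.test()
  File "D:\code\test\test.py", line 5, in test
    print(x)
NameError: name 'x' is not defined
```

This is because the class definition has its own namespace, which is independent of the methods inside the class. You can picture it like this:

```
__main__
|
| a -----------> A object
| A -----------> definition of class A
|
|  +--A--
|  |
|  | x ---------> 10
|  +--------
|
|  +--a.test--
|  |
|  +--------
+-------
```

A name that is not found in a method is looked up among the global variables, so if `x` is defined at the outermost level, `a.test` uses that outermost `x`. For example:

```python
class A:
    x = 10

    def test(self):
        print(x)

x = 20
a = A()
a.test()
```

The output is:

```
# py test.py
20
```

Alternatively, refer to the `x` defined in the class through `self`:

```python
class A:
    x = 10

    def test(self):
        print(self.x)

a = A()
a.test()
```

Now 10 is printed:

```
# py test.py
10
```

The namespace of a class definition becomes the attributes of the class itself. You can inspect them through [object.\_\_dict\_\_](https://docs.python.org/3/library/stdtypes.html?highlight=__dict__#object.__dict__), for example:

```python
>>> A.__dict__.keys()
dict_keys(['__module__', 'x', 'test', '__dict__', '__weakref__', '__doc__'])
```

You can see that `x` appears there. We can inspect `a` as well:

```python
>>> a.__dict__.keys()
dict_keys([])
```

It is empty. When `x` is accessed through `a`, `a` itself has no `x`, so the lookup continues through `a.__class__` to `A` and reaches the `x` in the class definition:

```python
>>> a.x
10
>>> a.__class__
<class '__main__.A'>
>>> a.__class__.x
10
```

If we add an `x` attribute to the object `a`, then `a.test()` reaches this `x` through `self`:

```python
>>> a.x = 20
>>> A.x
10
>>> a.test()
20
>>> a.__dict__.keys()
dict_keys(['x'])
```

## Namespaces of recursive calls

As mentioned earlier, a new namespace is created every time a code block is executed. A recursive function therefore gets several namespaces for the same function. Take the following example:

```python
def fact(n):
    if n < 2:
        return 1
    return n * fact(n - 1)

print(fact(4))
```

When `fact(4)` is running, the namespaces look like this:

```
__main__
|
| fact -----------> fact function
|
|  +--fact(4)--
|  |
|  | n ---------> 4
|  +--------
+-------
```

Because of the recursion, `fact(3)`, `fact(2)`, and `fact(1)` are then executed in turn, and the namespaces become:

```
__main__
|
| fact -----------> fact function
|
|  +--fact(4)--
|  |
|  | n ---------> 4
|  +--------
|
|  +--fact(3)--
|  |
|  | n ---------> 3
|  +--------
|
|  +--fact(2)--
|  |
|  | n ---------> 2
|  +--------
|
|  +--fact(1)--
|  |
|  | n ---------> 1
|  +--------
+-------
```

In other words, each call to `fact` has its own private `n`; the calls do not share a single `n`. Once `fact(1)` returns 1, the computed values are returned one by one up the chain, producing `4*3*2*1`, which is 24:

```
# py test.py
24
```

## Conclusion

This article uses short explanations and diagrams to help beginners tell which name a program is actually using, so they can avoid errors caused by using a name that is not yet bound or by using the wrong name. There are further details that are not covered here, but for ordinary programs the material in this article should be sufficient.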
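
As a recap, the lookup rules covered in this article can be observed side by side in one short script. This is only a minimal sketch: the names `outer`, `read_free`, `read_global`, `rebind`, and `Holder` are hypothetical and chosen for illustration. It also shows one case the earlier examples only read from: assigning to an enclosing function's name through `nonlocal`.

```python
x = "module"          # bound in the module's namespace (a global variable)


def outer():
    x = "outer"       # a separate local name in outer's namespace

    def read_free():
        return x      # free variable: resolved in the nearest enclosing function

    def read_global():
        global x      # skip enclosing functions and use the module-level name
        return x

    def rebind():
        nonlocal x    # rebind outer's x instead of creating a new local name
        x = "rebound"

    before = read_free()
    rebind()
    return before, read_free(), read_global()


class Holder:
    x = "class"       # becomes the attribute Holder.x

    def read(self):
        # The class block is skipped during name lookup, so the bare name x
        # is the module-level one; self.x reaches the class attribute.
        return x, self.x


results = outer()
print(results)          # ('outer', 'rebound', 'module')
print(Holder().read())  # ('module', 'class')
```

Each function returns what it sees instead of printing, which makes it easy to compare the three lookups in `outer` after `rebind()` has changed the enclosing name.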