# The World-Ending Python Module Import
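One day last week I suddenly discovered that all the data on Production had turned into RC data, and the project settings were all RC parameters. This is a very serious error to hit in Production, and at first I had no idea what was going on.
First, here is what the directory layout on the VM looks like:
(Don't roast me just yet for putting three environments on the same VM...)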
```
project-admin/
├── project-prod/
│   ├── app.py
│   └── ...
├── project-rc/
│   ├── app.py
│   └── ...
└── project-test/
    ├── app.py
    └── ...
```
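At first glance nothing looks wrong: each environment's code lives in its own directory, each runs gunicorn in its own virtual environment, and when the incident happened I confirmed that the Production process's working directory really was project-prod. So how could it still go wrong?!
<b>The cause turned out to be the order in which Python looks up modules.</b>
Our project is built with Flask, and app.py simply instantiates a Flask object.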
```python=
# app.py
from flask import Flask
app = Flask(__name__)
```
Next, let's look at the Flask source code:
```python=
# https://github.com/pallets/flask/blob/1.1.x/src/flask/app.py
class Flask(_PackageBoundObject):
    ...
    def __init__(
        self,
        import_name,
        static_url_path=None,
        static_folder="static",
        static_host=None,
        host_matching=False,
        subdomain_matching=False,
        template_folder="templates",
        instance_path=None,
        instance_relative_config=False,
        root_path=None,
    ):
        _PackageBoundObject.__init__(
            self, import_name, template_folder=template_folder, root_path=root_path
        )
        self.static_url_path = static_url_path
        self.static_folder = static_folder
        if instance_path is None:
            instance_path = self.auto_find_instance_path()
        elif not os.path.isabs(instance_path):
            raise ValueError(
                "If an instance path is provided it must be absolute."
                " A relative path was given instead."
            )
        ...
```
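The constructor has an optional parameter `root_path` that is passed on to `_PackageBoundObject.__init__()`. Let's look at that method next, along with the `get_root_path` helper it calls: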
```python=
# https://github.com/pallets/flask/blob/1.1.x/src/flask/helpers.py
class _PackageBoundObject(object):
    #: The name of the package or module that this app belongs to. Do not
    #: change this once it is set by the constructor.
    import_name = None
    #: Location of the template files to be added to the template lookup.
    #: ``None`` if templates should not be added.
    template_folder = None
    #: Absolute path to the package on the filesystem. Used to look up
    #: resources contained in the package.
    root_path = None
    def __init__(self, import_name, template_folder=None, root_path=None):
        self.import_name = import_name
        self.template_folder = template_folder
        if root_path is None:
            root_path = get_root_path(self.import_name)
        self.root_path = root_path
        self._static_folder = None
        self._static_url_path = None
        # circular import
        from .cli import AppGroup
        #: The Click command group for registration of CLI commands
        #: on the application and associated blueprints. These commands
        #: are accessible via the :command:`flask` command once the
        #: application has been discovered and blueprints registered.
        self.cli = AppGroup()
    ...
def get_root_path(import_name):
    """Returns the path to a package or cwd if that cannot be found. This
    returns the path of a package or the folder that contains a module.
    Not to be confused with the package path returned by :func:`find_package`.
    """
    # Module already imported and has a file attribute. Use that first.
    mod = sys.modules.get(import_name)
    if mod is not None and hasattr(mod, "__file__"):
        return os.path.dirname(os.path.abspath(mod.__file__))
    # Next attempt: check the loader.
    loader = pkgutil.get_loader(import_name)
    # Loader does not exist or we're referring to an unloaded main module
    # or a main module without path (interactive sessions), go with the
    # current working directory.
    if loader is None or import_name == "__main__":
        return os.getcwd()
    ...
```
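If no `root_path` is passed in, Flask calls `get_root_path` to find the root directory. One line there stands out: `sys.modules.get(import_name)`. `sys.modules` holds whatever the import system has already loaded, and the import system searches the entries of `sys.path` in order. So the root path behind our `app = Flask(__name__)` is not derived from the current working directory, but from whichever `app` module was found first on `sys.path`?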
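To make that first branch concrete, here is a minimal sketch of it using only the standard library. The function name `root_path_from_sys_modules` is made up for this post, and the fallback is simplified: the real `get_root_path` consults the module loader before falling back to the current working directory.

```python
import json  # any already-imported module works; json is just a convenient example
import os
import sys


def root_path_from_sys_modules(import_name):
    """Simplified stand-in for the first branch of Flask's get_root_path."""
    mod = sys.modules.get(import_name)
    if mod is not None and hasattr(mod, "__file__"):
        # The directory comes from wherever the module was actually loaded,
        # not from the directory the process was started in.
        return os.path.dirname(os.path.abspath(mod.__file__))
    # Simplification: the real function checks the loader first.
    return os.getcwd()


print(root_path_from_sys_modules("json"))           # directory of the json package
print(root_path_from_sys_modules("not_imported"))   # falls back to the cwd
```

The point is that the result depends entirely on which file the import system picked for that name, so anything that changes the import search order also changes the root path.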
Let's run an experiment. Create two Flask projects:
```bash
/home/kevin/
├── repo1/
│   └── app.py
└── repo2/
    └── app.py
```
```python=
# repo1/app.py
from flask import Flask
app = Flask(__name__)
print("repo1 app!")
```
```python=
# repo2/app.py
from flask import Flask
app = Flask(__name__)
print("repo2 app!")
```
Then set a wrong environment variable to change the order in which Python searches for modules:
```
export PYTHONPATH="/home/kevin/repo2:/home/kevin/repo1"
```
Then run it with gunicorn and see what happens:
```
$ gunicorn app:app
# repo2 app!
```
If we also print the `mod` variable from the line `mod = sys.modules.get(import_name)`, we see:
```
<module 'app' from '/home/kevin/repo2/app.py'>
```
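The same effect can be sketched without Flask or gunicorn. The snippet below is only an approximation of the experiment: it creates two throwaway directories that each contain an `app.py`, and prepends them to `sys.path` as a stand-in for `PYTHONPATH`. The helper name `first_match_on_path` and the temporary directory layout are invented for illustration.

```python
import importlib
import os
import sys
import tempfile


def first_match_on_path(search_order, module_name="app"):
    """Import module_name with search_order prepended to sys.path and
    return the absolute path of the file it was loaded from."""
    saved_path = list(sys.path)
    saved_module = sys.modules.pop(module_name, None)
    try:
        sys.path[:0] = search_order
        importlib.invalidate_caches()
        module = importlib.import_module(module_name)
        return os.path.abspath(module.__file__)
    finally:
        # Restore the interpreter state so the demo has no side effects.
        sys.path[:] = saved_path
        sys.modules.pop(module_name, None)
        if saved_module is not None:
            sys.modules[module_name] = saved_module


# Build two fake repos, each with its own app.py.
base = tempfile.mkdtemp()
repos = {}
for name in ("repo1", "repo2"):
    repo_dir = os.path.join(base, name)
    os.makedirs(repo_dir)
    with open(os.path.join(repo_dir, "app.py"), "w") as f:
        f.write("NAME = %r\n" % name)
    repos[name] = repo_dir

# repo2 is listed first, so its app.py is the one that gets imported.
loaded_from = first_match_on_path([repos["repo2"], repos["repo1"]])
print(loaded_from)
```

Swapping the order of the two entries makes `repo1/app.py` win instead, which is all it takes for one environment to silently load another environment's code.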
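That reproduces the bug. Fixing it is simple, and there are a few ways to do it:
1. Set the correct PYTHONPATH environment variable when running gunicorn
1. Package the Production, RC, and Test environments into separate containers
1. Pass `root_path` directly to `Flask()`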
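As a rough sketch of the third option, `root_path` can be derived from the file that defines the app instead of being looked up through `sys.modules`. The helper names `module_dir` and `create_app` are my own and not part of Flask; only the `root_path` keyword comes from the constructor shown earlier.

```python
import os


def module_dir(module_file):
    """Return the absolute directory that contains module_file."""
    return os.path.dirname(os.path.abspath(module_file))


def create_app(module_file):
    """Build a Flask app whose root_path is pinned to module_file's directory."""
    # Imported lazily so module_dir can be used (and tested) without Flask.
    from flask import Flask

    return Flask(__name__, root_path=module_dir(module_file))


print(module_dir("/home/kevin/repo1/app.py"))
```

In `app.py` this would be used as `app = create_app(__file__)`. Keep in mind that this only pins the directory Flask uses to locate its resources. It does not change which `app.py` Python imports in the first place, so a correct `PYTHONPATH` or separate containers are still worth having.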