
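# scripting env
Scenario: the program additionally lets users write scripts that run inside it. This note compares the possible approaches.

## boundary
* Scripts must run on the backend, so they keep running after the HMI is closed. --> security
* The target is a full language, not just reading gcode and converting it.
* Assume the final system is HMI + controller, both offline embedded Linux.
    * Network updates cannot be expected; manual copy is the upper limit.
* Assume the final system may also be tablet + controller.
* Full scripting is meant to let system integrators/FAEs interface with other systems directly, without support from the vendor.
    * Should be able to update the SDK version.
    * Should be able to update libs.
* end user scripting
    * Download (+ compile) a program within 5s.
    * Each program run starts within 1s.
    * > only update program

## script execution place
* Put the script on the frontend or the backend?
    * Frontend
        * TL;DR: give up.
        * A compiled result does not need its build dependencies, so the target machine does not need a pile of installs.
            * A binary needs neither runtime nor libs installed.
            * An AOT snapshot needs no libs installed.
        * Installing a compiler in the GUI env is genuinely hard. ~~Especially with a different architecture, which would also need docker; **basically not worth considering**.~~
            * Binary (`exe`, `aot-snapshot`): not an option.
            * Compiling to `kernel(dart)`, `js`, or **`wasm`** is possible.
                * But compiling still needs a compiler (it just moves from backend to frontend). The whole SDK is about 131MB.
                * The key is whether it can interact with OS IO: WASI (WebAssembly System Interface). Dart does not support it at present.
                    * But the script should only need the web client (now supported).
                    * When connecting from other clients, check the corresponding code?
                        * Backup.
    * Backend
        * As a script or as a binary?
            * Either way, the backend needs at least the compiler/interpreter and the libs.
            * Bytecode and script languages need language infrastructure, so version problems are guaranteed. A compiler that takes source code has them too, e.g. C++20.

## upgrade env level
System scope, from lower to higher:
1. program
2. libs
3. SDK, e.g. compiler or interpreter: cpython, gcc, ld, cmake...
4. VM, e.g. docker, lxc...

1 and 2 are required for development; without them the SI/FAE cannot work. 3 is needed to follow the latest versions (including language features).

> The rough direction is probably e.g. docker with three volumes, one each for program, libs, and SDK.

Importing libs depends heavily on the pkg manager. In an offline environment, the question is whether it supports export to zip and then import from zip. Binding to one pkg manager is not right either; the language/SDK should have a general import mechanism of its own, and whatever the developer's pkg manager exports should fit that mechanism. Also, "libs" is a bit ambiguous here. For gcc, are libs source code or .so files? The former depends weakly on the OS, the latter strongly (e.g.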
libc, ABI...).

### libs and sdk
libs and sdk often become inseparable. Libs processed by one sdk are useless once detached from it. In that case upgrading both together may be better, or even the only way. For python, the cpython folder (112MB in total) already contains various libs, separate from the `site-packages` in the repo's own venv.

## security
execute user uploaded untrusted code

https://g.co/gemini/share/1c0a8c00482c

[linux中的沙箱技术](https://atum.li/2017/04/25/linuxsandbox/#linux%E4%B8%AD%E7%9A%84%E6%B2%99%E7%AE%B1%E6%8A%80%E6%9C%AF)

* linux kernel feature
    * cgroups
    * chroot
        * [使用chroot 簡單隔離](https://hackmd.io/@iD40lBm-QAqgh62DVHbjPA/SJmdaPGns)
        * Also OK, if you know how to configure it to prevent jail escape.
    * selinux
    * apparmor
    * ...
* sandbox (only, compared to VM)
    * bubblewrap (bwrap)
        * [Notes on running containers with bubblewrap](https://jvns.ca/blog/2022/06/28/some-notes-on-bubblewrap/)
            * 0.3s (docker) --> 0.008s (bwrap)
        * Probably the underlying layer for everything here.
    * **Firejail**
        * `firejail uv run python -m esc_decode -h`
    * Bubblejail
    * Flatpak
        * backed by bubblewrap
    * [Security by sandboxing: Firejail vs bubblewrap vs other alternatives](https://www.reddit.com/r/linux/comments/knjzf2/security_by_sandboxing_firejail_vs_bubblewrap_vs/)
* VM(-like)
    * The key is whether libc is involved or everything is fully statically linked.
    * lxc, docker, podman, MicroVMs (e.g., Firecracker), kata containers, unikernel
        * unikernel is still research-stage.
    * startup time: ~~unikernel~~ < `containerd/runc` < **lxc** < **podman** < **docker** < Firecracker == kata
        * unikernel is very different from the rest.
        * containerd is too low-level to use directly.
    * [LXC vs.
Docker: Which One Should You Use?](https://www.docker.com/blog/lxc-vs-docker/)
        * docker was originally built on LXC and only later moved to containerd.
        * For the overhead of a single container, containerd may use less. For heavy IO and CPU work, LXC may run more efficiently.
    * [Docker 與 Podman：選擇適合您需求的容器技術](https://easontechtalk.com/tw/docker-vs-podman-containerization-comparison/)
    * `docker exec ${container_name} python main.py`: if `main.py` runs `rm -rf /`, it wipes the container, not the host OS.
    * Slower than a sandbox, of course, but it solves the libc version problem and is easier to use.
    * [An overview of LibraryOS and Unikernels](https://ianchen.tw/posts/2020-03-06-libos-and-unikernels)
    * [Kata Containers vs Firecracker vs gvisor](https://www.reddit.com/r/docker/comments/1fmuv5b/kata_containers_vs_firecracker_vs_gvisor/)
    * [Understanding Firecracker MicroVMs: The Next Evolution in Virtualization](https://medium.com/@meziounir/understanding-firecracker-microvms-the-next-evolution-in-virtualization-cb9eb8bbeede)
        * A microVM is still a VM, so it is necessarily heavier than a container such as docker, but it offers better security than docker's OS-level isolation.
        * Unless VM virtualization is more efficient than OS-level isolation for some workload, docker is always lighter.
        * Also, `flintlock` (formerly Ignite) can create a VM filesystem from a dockerfile.
    * [Kamal: Reason behind choosing Docker instead micro VMs (like firecracker)](https://news.ycombinator.com/item?id=41433588)
* language level
    * language-dependent, of course

My judgment: kernel features are too low-level to use directly, and a language-level approach is not general enough. Using a container directly fits best; a container is itself assembled from several kernel features.

## language ref
* backend server lang
    * The runtime of a script lang is usually not small, e.g. nodejs.
    * Compiling to a binary usually just bundles the runtime.
    * A smaller binary size needs a special runtime, e.g. [quickjs](https://github.com/quickjs-ng/quickjs)
        * but limited; quickjs does not even have a web server lib (like express in nodejs)
    * micropython
    * Considering only this case, I lean toward a typed lang to settle it once and for all.
* gui lang
    * qt
        * LGPL; 40MB is also large
    * flutter
    * android
* **auto mode script lang**
    * difficulty
        * Could a script break the system directly, similar to SQL injection?
            * js libs run in the browser anyway, so this is less likely.
            * If the script runs on the backend, this problem exists.
            * Running in a **sandbox** should avoid it.
        * Let users use their own libs
        * Let users change the interpreter/runtime?
    * ref
        * `online code editor`
            * The compiler/interpreter runs on the backend.
            * New compiler/interpreter versions arrive via OTA (a server-side option).
            * [online-python](https://www.online-python.com/)
                * Only preinstalled libs are allowed, e.g. numpy; no flask.
            * Conclusion: letting users bring their own libs or install 3rd-party libs is unlikely.
        * `sandbox`
            * python sandbox
            * container/VM
                * [What's the best way to run "untrusted" binaries?](https://www.reddit.com/r/linuxquestions/comments/rljlfz/whats_the_best_way_to_run_untrusted_binaries/)
                * alpine docker 5MB
            * compile to `WASM`
        * Whenever the compiler/interpreter has to be updated, the user cannot be trusted. Both offline and online updates need a password.
            * Or just let users do whatever they want inside the sandbox?
    * interpreter
        * lua
        * python
        * js
        * Script interpreters also have version matching problems; see JIT bytecode/runtime below.
    * bytecode/binary
        * `dart run` on \*.dart is slow, usually 0.7s versus 0.25s for python. `dart run` on a kernel file takes ~0.43s.
        * Bytecode changes with every major version, e.g. a java23 program usually cannot run on a java11 machine. The java23 compiler can target an older runtime version, but many new features are then disabled.
        * A runtime version goes stale quickly, so the right answer is not to depend on it; otherwise the runtime must be updatable.
        * Only wasm bytecode is relatively stable.
            * dart wasm only supports the browser env for now, so it is unusable outside a web frontend.
            * [[dart2wasm] Support non-JS wasm runtimes #53884](https://github.com/dart-lang/sdk/issues/53884)

---

* [Create a Single Binary Executable for Your Express.js API Using pkg](https://medium.com/@karanmamtora/create-a-single-binary-executable-for-your-express-js-api-using-pkg-cbf1a907454c)

## solution
The current direction is `docker run` with a specified image, while mounting three things as volumes: 1. program, 2. libs, 3. SDK. Most languages can be split this way (by observation), e.g. python libs live in `site-packages`, nodejs in `node_modules`, dart in `.pub-cache`.

* [dart docker emulated compile](https://hackmd.io/TJoUSM_XStK_rK7iAvoYHQ)

### pick a language
Among the top 20 languages, the libs of script languages are usually large. (Leaving out lua, which has no async/await.) Among script languages, all things considered, it is still **python**, unless some major flaw turns up. For compiled GC languages, go or dart are candidates. Since go has no async, **dart** is still more convenient. Start with python.

### LLM query template
> If I want to deploy a program written in xxx language. The program has to be deployed to an environment with the same architecture and OS as the development machine.
But there is no pkg manager, the SDK version is mismatched or not even installed, and the target machine is offline. Additionally, the program will be changed after it is deployed to the target machine (but the dependencies/libs/packages stay at the same version). What should I do to deploy it?
>
> A Self-Contained XXX Environment with Separated Code

### python
#### minimal start: relocatable venv
> The method below is not guaranteed to handle abs path problems (it depends on each lib's own implementation). Also, cpython is dynamically linked (shared), so it has to be prepared separately.
> https://g.co/gemini/share/da89b91852c2

* [venv: --copies? #2103](https://github.com/astral-sh/uv/issues/2103#issuecomment-2471902697)
    * Still open (2025); a fix is under discussion.

In the uv project:
```bash
mkdir release
# create a relocatable venv with a uv-managed python, copying files instead of linking
uv venv --relocatable --link-mode copy --managed-python --python 3.13 ./release
source ./release/bin/activate
# install runtime dependencies only, into the currently active venv
uv sync --no-dev --active
deactivate
```
Copy to the desired place:
```bash
# the two ${} are placeholders left unfilled in the original note
VENV_PATH="/home/${}/${}/.venv" && source "$VENV_PATH/bin/activate"
```
> You can write a `run.sh` and put whatever you want in it. Run a script: `python your_main_app_script.py`; run a pkg: `python -m your_module`.

Note in particular that this method cannot solve system shared lib problems, e.g. libc. So pay close attention to which docker image `uv sync` is run in.

---

```bash
export PYTHONPATH=/home/morgana/python_PCSgetValue/release/lib/python3.13/site-packages/
python script.py
```

#### baseline speed
```bash
$ time uv run python -m esc_decode -h
#...
real    0m0.278s
user    0m0.915s
sys     0m0.040s
```

#### combine libs, sdk
If the language really binds too deeply, e.g. hardcoded abs paths, links to specific lib versions, libc version..., updating 2 (libs) and 3 (SDK) together is much simpler.

##### docker sample
1. create `Dockerfile`
    1. alpine + multi-stage is necessary for python, [打造最小 Python Docker 容器](https://blog.wu-boy.com/2021/07/building-minimal-docker-containers-for-python-applications/)
    2. With numpy it still takes about 119MB.
2. `docker build -t ${img_name} -f ./Dockerfile .`
3.
   `docker run --rm -v $(pwd)/tests/data:/data ${img_name}`

* [Why is python slower inside a docker container?](https://stackoverflow.com/questions/76130370/why-is-python-slower-inside-a-docker-container)
    * `privileged` speeds it up but cannot be used, since it grants root power.
    * > why privilege result in massive performance boost of python container?
    * It seems python's import needs many system calls, and checking them makes it slower. Startup with docker is in fact considerably slower. It may also be due to alpine, `FROM python:3.12-alpine AS builder` (0.5s --> 1.5s). After switching to `FROM python:3.13-slim AS builder`, it is faster than alpine but still slower than native (0.5s --> 1s), and the image size goes `119MB` --> `186MB`.
* [Realistic startup times for container](https://www.reddit.com/r/docker/comments/cuqr67/realistic_startup_times_for_container/)
    * `time docker run hello-world` takes 0.3s on my pc (2025)
    * `time podman run hello-world` takes 0.28s

lxc is quite fast when the container has been started beforehand:
```bash
sudo lxc-create -n helloworld -t busybox
sudo lxc-start -n helloworld
time sudo lxc-attach -n helloworld -- /bin/echo "Hello, World!"
sudo lxc-stop -n helloworld
sudo lxc-destroy -n helloworld
```
```
Hello, World!

real    0m0.023s
user    0m0.002s
sys     0m0.007s
```

##### docker with container start
The same approach applied to docker:
```bash
docker run -d --name myalpine alpine sleep infinity
time docker exec myalpine echo "Hello, World!"
docker stop myalpine
docker rm myalpine
```
```
Hello, World!

real    0m0.076s
user    0m0.016s
sys     0m0.036s
```

---

```bash
$ docker run -d --name temp_esc_decode --entrypoint sleep esc_decode infinity
$ time docker exec temp_esc_decode python -m esc_decode
usage: __main__.py [-h] [-th THRESHOLD] [-i IGNORE_ADDRS [IGNORE_ADDRS ...]] input_file
__main__.py: error: the following arguments are required: input_file

real    0m0.810s
user    0m0.032s
sys     0m0.021s
```
1s --> 0.8s (an improvement, but not enough).

```bash
$ podman build -t esc_decode .
# ...
$ time podman run --rm esc_decode -h
# ...
real    0m1.002s
user    0m0.024s
sys     0m0.041s
$ podman run -d --name temp_esc_decode --entrypoint sleep esc_decode infinity
$ time podman exec temp_esc_decode python -m esc_decode -h
$ time podman exec temp_esc_decode python -m esc_decode -h
# ...
real    0m0.801s
user    0m0.018s
sys     0m0.027s
$ podman stop temp_esc_decode
$ podman rm temp_esc_decode
```
Same speed. LXC is next, but the gain is probably small. There really does seem to be an execution performance drop. (0.3 --> 0.1s), (1 --> 0.8s): with the startup time already optimized away, 0.8-0.1-0.4=0.3s is the part lost to the performance drop.

##### LXC
The script is complex, so it is omitted here. An lxc profile can only configure what gets preinstalled during cloud-init. Steps such as `uv sync` must go in a separate script and cannot be written in the lxc profile.
```bash
time lxc exec esc-decode-app -- bash -c "cd /app && python3 -m esc_decode --help"
#...
real    0m0.634s
user    0m0.023s
sys     0m0.040s
```
A small improvement for a large increase in work.

#### firecracker
[python-firecracker](https://github.com/Okeso/python-firecracker)

Half done: the most basic setup boots, but many settings remain. firecracker is still very low-level, roughly at the level of runc, and more often serves as the foundation for other tools, e.g. `flintlock`. Suggest revisiting it later.

#### sandbox sample
```bash
$ time firejail uv run python -m esc_decode -h
#...
real    0m0.336s
user    0m0.973s
sys     0m0.111s
```
0.27 --> 0.33s. The sandbox does cause a performance drop for python, but far less than a VM.

#### script update form
Program-level update alone is understood, but how large the scope of a "program" is remains unclear. It could be 1. scattered files, 2. one file per folder, 3. multiple files per folder. In addition, a later step will transpile gcode to a program and then do a program-level update. It would be better if the two could connect directly.

> One gcode per folder, translated to python in the same folder, or with one more layer so it runs as a module; either works. The extra folder is there so the sandbox can restrict the scope of code access. So it comes back to which way of running we want to offer. The key is to make specifying the script or module to launch as simple as possible.

* [PySide6-project-template](https://github.com/trin94/PySide6-project-template/tree/main)
* `uv run esc_decode/__main__.py -h` or `uv run python esc_decode/__main__.py -h`: choosing the script to run directly is more intuitive, since the opened file is the one that runs. It also works for both module-based projects and single scripts.
* Let users upload a project that contains folders, then provide a backend/frontend API.
    * `uv build --wheel --sdist`
    * Whether to mainly support sdist or wheel is undecided. Both should work; if the root has a main.py (or similar), use sdist.
* Let users access the filesystem.
    * Use sshfs directly?
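That would not work from a browser.

And if uploading a folder is possible, it means modules that need compiling can also be uploaded in the `.tar.gz`, a hack that achieves libs-level updates.

Pass the location of `resymot_server` in through an env variable?

#### libs update form
> Be especially careful with libs that use openssl.

* [Export a virtual environment as a wheel for offline deployment](https://www.reddit.com/r/learnpython/comments/10vh651/export_a_virtual_environment_as_a_wheel_for/)
    * wheel is not designed for this purpose, just copy .venv
* [把 Python 的 venv 移到其他機器](https://blog.pan93.com/posts/move-venv-to-other-machines/)
* pex is essentially a venv; it does not include cpython.
* As soon as C libs are involved, none of the three is cross-platform.
* > For sdk update, consider `python-build-standalone`, `standalone-python`.

### dart
dart's `.pub-cache` keeps previously downloaded files, see [Getting while offline](https://dart.dev/tools/pub/cmd/pub-get#getting-while-offline). This cache can be treated as the installed packages.

### summary 1: overall direction
The overall direction is:
1. compiled language + docker
    1. A compiled language starts fast, but is tied more strongly to dependencies, ABI and the like. docker simplifies this.
    2. A further improvement is to compile with the SDK in docker and run in a sandbox.
2. script language + sandbox
    1. Script loading is slow, which makes startup slow, so use the lighter sandbox. Pure-script libs are less affected by the system ABI and the like.
    2. Libs with C parts are hard to solve. Building them in a matching docker container is the only way (if the **target machine is known for certain**).
        1. Otherwise docker is the only option.

Whether the libc and ABI problems are really that serious is hard to say; people have been deploying this way for 20 years.

## special case study/implement
### live-updating auto mode script parameters
1. Only parameters can (and may) be changed; changing the program structure would segfault.
2. Reading the db value immediately before `send_straight_line` is enough (any general-purpose programming language can do this).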
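
A minimal sketch of point 2, assuming the parameters sit in a SQLite table. The `params` table, the `feed_rate` name, `read_param`, `run_auto_mode`, and the stub motion function are all hypothetical, made up for illustration; the real controller API and db are not shown in this note.

```python
import sqlite3


def read_param(conn: sqlite3.Connection, name: str) -> float:
    """Fetch the latest value of one parameter (hypothetical `params` table)."""
    row = conn.execute(
        "SELECT value FROM params WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return float(row[0])


def send_straight_line(x: float, y: float, feed: float) -> tuple:
    """Stand-in for the real motion command; it only echoes its arguments."""
    return (x, y, feed)


def run_auto_mode(conn, waypoints, after_move=None):
    """Re-read the parameter right before every move, so edits apply live."""
    sent = []
    for x, y in waypoints:
        feed = read_param(conn, "feed_rate")  # read just before sending
        sent.append(send_straight_line(x, y, feed))
        if after_move is not None:
            after_move()  # hook used below to simulate an operator edit
    return sent


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE params (name TEXT PRIMARY KEY, value REAL)")
    conn.execute("INSERT INTO params VALUES ('feed_rate', 100.0)")

    def operator_edit():
        # Simulates someone changing the parameter while the script runs.
        conn.execute("UPDATE params SET value = 50.0 WHERE name = 'feed_rate'")

    waypoints = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
    return run_auto_mode(conn, waypoints, after_move=operator_edit)


if __name__ == "__main__":
    print(demo())
```

The first move still uses the old value and every later move picks up the edit, without touching the program structure. A real setup would presumably read from whatever db the controller already uses rather than an in-memory SQLite.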