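# Bloom inference server maintenance

[Reference: Bloom-Inference](https://hackmd.io/VYkEmhnhTfWYxxFvPKFMFQ#Bloom-Inference)

This note uses the two VMs already set up in the 台固 (Taiwan Fixed Network) VCS environment as an example:

(screenshot)

The public key of the accessing client must be added on the master/worker VM nodes:

(screenshot)

## How to restart the inference server

At present, the inference server may need to be restarted in the following cases:

- The inference server or the backend has crashed.
- A new version of the model needs to be swapped in.

In both cases, the existing web server, backend, and inference server processes must be killed first.

List the processes with:

```bash
ps au -u ubuntu
```

master: (screenshot)

worker: (screenshot)

You can use the following shell script directly: kill_inf_proc.sh (on master it is under /transformers-bloom-inference; on worker it is in the home directory).

```bash
#!/bin/bash
# Kill the ui, pdsh, gunicorn, and deepspeed processes owned by ubuntu, in that order.
# "grep -v grep" drops the grep process itself from the match;
# "xargs -r" (GNU xargs) skips kill entirely when nothing matches.
ps au -u ubuntu | grep ui | grep -v grep | awk '{print $2}' | xargs -r kill
sleep 1
ps au -u ubuntu | grep pdsh | grep -v grep | awk '{print $2}' | xargs -r kill
sleep 1
ps au -u ubuntu | grep gunicorn | grep -v grep | awk '{print $2}' | xargs -r kill
sleep 1
ps au -u ubuntu | grep deepspeed | grep -v grep | awk '{print $2}' | xargs -r kill
sleep 5
echo "Killed all inference service processes!"
```

Run the script on master and on worker separately to stop the related processes.

### Starting the service

See:

- [4. Configure and run the Inference Server](https://hackmd.io/VYkEmhnhTfWYxxFvPKFMFQ?view#4%E8%A8%AD%E5%AE%9A%E8%88%87%E5%9F%B7%E8%A1%8CInference-Server)
- [5. Run the UI & Backend](https://hackmd.io/VYkEmhnhTfWYxxFvPKFMFQ?view#5%E5%9F%B7%E8%A1%8CUI-amp-Backend)

On the master node, under /transformers-bloom-inference in the home directory, run the following script (start_inf_serv.sh):

```bash
#!/bin/bash
# Start the inference server in the background.
nohup make twm &
echo "Starting inference server... wait for 60s..."
# Give the inference server time to come up before starting the UI.
sleep 60
echo "Starting backend and web server..."
nohup python -m ui &
```

## How to swap the model

See:

- [Model Convert](https://hackmd.io/VYkEmhnhTfWYxxFvPKFMFQ?view#Model-Convert/)
- [Pre-sharded a model for tensor parallelism](https://hackmd.io/VYkEmhnhTfWYxxFvPKFMFQ?view#Pre-sharded-a-model-for-tensor-parallelism)

The twm VMs now all have an ssfs mount folder, remote_disk_work/, so an example model can be found at the following path:

/home/ubuntu/remote_disk_work/twm_poc/twm-train-v2/456876/checkpoints/tr13-176B-ml-t0/checkpoints/xp3capmixnewcodelonglossseq/

(screenshot)

Megatron-Deepspeed checkpoint: (screenshot)

HuggingFace checkpoint: (screenshot)

Pre-sharded checkpoint: (screenshot)

The model ultimately used for inference is the pre-sharded one. This avoids the time spent on pre-sharding when a HuggingFace checkpoint is loaded for the first time.

In addition, the following files must be added to bring the directory to its final state:

(screenshot)

Note: use the ds_inference_config.json from the original .models/hfmodel-tp/ directory. The one generated during conversion points to paths in the environment where the conversion was run.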
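
The note about ds_inference_config.json can be checked mechanically before the server is started. The sketch below is a hypothetical helper (`check_config_paths` is not part of the repository) that scans a JSON file for quoted strings beginning with `/` and reports those that do not exist on the current machine. It assumes paths are stored as plain absolute-path strings; the actual layout of ds_inference_config.json is not shown in this note, so treat the result as a hint rather than a full validation.

```bash
#!/bin/bash
# Hypothetical helper: report absolute paths in a JSON config that are missing locally.
# Assumption: paths appear as quoted strings starting with "/" (no JSON escapes).
check_config_paths() {
    local config="$1"
    local missing
    # Extract every quoted string that starts with "/", strip the quotes,
    # and keep only the paths that do not exist on this machine.
    missing="$(grep -o '"/[^"]*"' "$config" | tr -d '"' | while IFS= read -r path; do
        [ -e "$path" ] || echo "missing: $path"
    done)"
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    echo "all absolute paths in $config exist"
}
```

Calling `check_config_paths` with the path to the config file prints one `missing:` line per path that is absent and returns a non-zero status, so it can gate a start script with `check_config_paths "$cfg" && ./start_inf_serv.sh`, where `$cfg` is whichever config file you intend to use.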