\$ supervisor
===
###### tags: `OS / Ubuntu`
###### tags: `OS`, `Ubuntu`, `linux`, `command`, `supervisor`
<br>
[TOC]
<br>
## Introduction
### Features
- Collects the stdout & stderr logs of each process into separate files
### Official documentation
- [[GitHub] Supervisor / supervisor](https://github.com/Supervisor/supervisor)
- [[doc] Configuration File](http://supervisord.org/configuration.html)
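### Other references
- [Command purposes](https://blog.51cto.com/u_13052892/4631081)
- supervisor: the name of the software package to install.
- supervisord: the supervisor daemon
- Names ending in `d` are generally daemons
- Once supervisor is installed, supervisord is used to start the supervisor service.
- supervisorctl: used to manage the other processes defined in the supervisor configuration file.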
<br>
<hr>
<br>
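## Installation
- ### Install hint printed automatically
> Some environments print this hint when the command is missing (on Ubuntu it typically comes from the `command-not-found` helper)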
```bash=
$ supervisord
Command 'supervisord' not found, but can be installed with:
sudo apt install supervisor
```
- ### Install command
- **root**
`# apt install -y supervisor`
- **non-root**
`$ sudo apt install supervisor`
- ### Running the service
> This runs supervisord system-wide.
> In practice, it does not have to run system-wide.
- **root**
```
# service supervisor status
supervisord is not running.
# supervisord
/usr/lib/python2.7/dist-packages/supervisor/options.py:461: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
# service supervisor status
supervisord is running
```
- **non-root**
```
$ supervisord
Traceback (most recent call last):
File "/usr/bin/supervisord", line 11, in <module>
load_entry_point('supervisor==4.1.0', 'console_scripts', 'supervisord')()
File "/usr/lib/python3/dist-packages/supervisor/supervisord.py", line 358, in main
go(options)
File "/usr/lib/python3/dist-packages/supervisor/supervisord.py", line 368, in go
d.main()
File "/usr/lib/python3/dist-packages/supervisor/supervisord.py", line 70, in main
self.options.make_logger()
File "/usr/lib/python3/dist-packages/supervisor/options.py", line 1466, in make_logger
loggers.handle_file(
File "/usr/lib/python3/dist-packages/supervisor/loggers.py", line 417, in handle_file
handler = RotatingFileHandler(filename, 'a', maxbytes, backups)
File "/usr/lib/python3/dist-packages/supervisor/loggers.py", line 213, in __init__
FileHandler.__init__(self, filename, mode)
File "/usr/lib/python3/dist-packages/supervisor/loggers.py", line 160, in __init__
self.stream = open(filename, mode)
PermissionError: [Errno 13] Permission denied: '/var/log/supervisor/supervisord.log'
```
- Without sudo, the traceback reveals the log path
`/var/log/supervisor/supervisord.log`
(this tells you where the log is located)
Running supervisord with sudo:
```
$ sudo supervisord
[sudo] password:
/usr/lib/python3/dist-packages/supervisor/options.py:470:
UserWarning: Supervisord is running as root and it is
searching for its configuration file in default locations
(including its current working directory); you probably
want to specify a "-c" argument specifying an absolute
path to a configuration file for improved security.
self.warnings.warn(
```
<br>
<hr>
<br>
## Help
```
$ supervisord -h
supervisord -- run a set of applications as daemons.
Usage: /usr/bin/supervisord [options]
Options:
-c/--configuration FILENAME -- configuration file path (searches if not given)
-n/--nodaemon -- run in the foreground (same as 'nodaemon=true' in config file)
-h/--help -- print this usage message and exit
-v/--version -- print supervisord version number and exit
-u/--user USER -- run supervisord as this user (or numeric uid)
-m/--umask UMASK -- use this umask for daemon subprocess (default is 022)
-d/--directory DIRECTORY -- directory to chdir to when daemonized
-l/--logfile FILENAME -- use FILENAME as logfile path
-y/--logfile_maxbytes BYTES -- use BYTES to limit the max size of logfile
-z/--logfile_backups NUM -- number of backups to keep when max bytes reached
-e/--loglevel LEVEL -- use LEVEL as log level (debug,info,warn,error,critical)
-j/--pidfile FILENAME -- write a pid file for the daemon process to FILENAME
-i/--identifier STR -- identifier used for this instance of supervisord
-q/--childlogdir DIRECTORY -- the log directory for child process logs
-k/--nocleanup -- prevent the process from performing cleanup (removal of
old automatic child log files) at startup.
-a/--minfds NUM -- the minimum number of file descriptors for start success
-t/--strip_ansi -- strip ansi escape codes from process output
--minprocs NUM -- the minimum number of processes available for start success
--profile_options OPTIONS -- run supervisord under profiler and output
results based on OPTIONS, which is a comma-sep'd
list of 'cumulative', 'calls', and/or 'callers',
e.g. 'cumulative,callers')
```
- Default configuration file (as installed by the Ubuntu package)
`/etc/supervisor/supervisord.conf`
- Default log file (as set by that configuration)
`/var/log/supervisor/supervisord.log`
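The flags listed in the help text can be combined into one command line. The following is a minimal sketch, not part of Supervisor: `build_supervisord_cmd` is a hypothetical helper that only assembles the `-c`, `-n`, and `-l` options shown above and does not start anything.

```python
import shlex


def build_supervisord_cmd(config, nodaemon=False, logfile=None):
    """Assemble a supervisord argument list from flags shown in `supervisord -h`.

    Hypothetical helper for illustration only; it builds the list
    and never launches a process.
    """
    cmd = ["supervisord", "-c", config]   # -c: configuration file path
    if nodaemon:
        cmd.append("-n")                  # -n: run in the foreground
    if logfile is not None:
        cmd += ["-l", logfile]            # -l: use this file as the logfile
    return cmd


# Show the command as it would be typed in a shell.
print(shlex.join(build_supervisord_cmd("/etc/supervisor/supervisord.conf", nodaemon=True)))
```

The returned list could be passed to `subprocess.run` if you want to launch supervisord from a script; passing an absolute path to `-c` also avoids the "searching for its configuration file in default locations" warning shown earlier.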
<br>
<hr>
<br>
## Getting-started example 1 (default: background mode)
> Runs in the background; supervisord must be killed when you are done
### 0. Preview
- ### Before running

- ### While running

- Because it runs in the background, the command returns immediately
- Generated files

- `supervisord.log`

- supervisord started with pid 269
- To stop supervisord, you have to `kill 269` yourself
- `supervisord.pid`

- Also records the supervisord pid
- `python.log` (generated by `run_and_sleep.py`)

<br>
### 1. Prepare `run_and_sleep.py`
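```python=
import os
import sys
import time
from datetime import datetime

# Show how the script was invoked and two environment variables
# (both print None unless they are set in the environment).
print("sys.argv:", sys.argv)
print('command:', os.environ.get('command'))
print('logfile:', os.environ.get('logfile'))

# Write one timestamped line per second for about 10 seconds.
with open('python.log', 'w') as f:
    for i in range(10):
        log = "[{}] ".format(i) + str(datetime.now())
        print(log)
        f.write(log + '\n')
        time.sleep(1)
```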
### 2. Prepare `my_supervisord.conf`
```ini=
[supervisord]
[program:python]
command=python run_and_sleep.py
```
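- ### [INI file](https://zh.wikipedia.org/wiki/INI%E6%96%87%E4%BB%B6)
- INI files sometimes use other extensions,
such as `.cfg`, `.conf`, or `.txt`.
- **Format**
- Section
`[section]`
- Parameter
`name=value`
- Comment
` ; comment text`
Comments start with a semicolon (;). Everything from the semicolon to the end of the line is a comment.
- `[supervisord]` is a required section
- Without it, the following error occurs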
```bash
$ supervisord -c my_supervisord.conf
Error: .ini file does not include supervisord section
For help, use /usr/bin/supervisord -h
```
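- `[program:python]`
:::warning
:warning: **This example does not write stdout & stderr logs to the current directory; the naming is only explained here in advance**
By default supervisord puts the AUTO child logs in the system temp directory (Python's `tempfile.gettempdir()`, usually `/tmp`).
The directory for the program's stdout & stderr logs
is controlled by the `childlogdir` setting (see example 2)
:::
- The stdout & stderr log file names are prefixed with `python`
`python-stderr---supervisor-5eu4bnx0.log`
`python-stdout---supervisor-oj8b8p41.log`
- Example: `[program:ABC]`
`ABC-stderr---supervisor-68ws35ys.log`
`ABC-stdout---supervisor-_3tn_t1v.log`
- Log file management
- The next time supervisord starts, files matching the same pattern are removed automatically
- After the program is renamed, the old files are also removed automatically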
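The INI rules above (sections, `name=value` parameters, `;` comments, and the required `[supervisord]` section) can be checked before starting supervisord. This is a rough sketch that uses Python's standard `configparser` as an approximation of supervisord's own parser; `list_programs` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import configparser

# Same structure as my_supervisord.conf, plus an inline comment.
SAMPLE = """\
[supervisord]

[program:python]
command=python run_and_sleep.py  ; inline comment
"""


def list_programs(text):
    """Return the program names in an INI text, or raise if [supervisord] is missing."""
    # RawConfigParser avoids '%' interpolation; ';' starts an inline comment.
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)
    if not parser.has_section("supervisord"):
        raise ValueError(".ini file does not include supervisord section")
    prefix = "program:"
    return [s[len(prefix):] for s in parser.sections() if s.startswith(prefix)]


print(list_programs(SAMPLE))
```

The name after `program:` is what supervisord uses as the prefix of the AUTO child log files described above.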
<br>
### 3. Run supervisord
```bash=
$ supervisord -c my_supervisord.conf
(no output)
```
- Output

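- Three new files appear
- `python.log` (written by the script itself)
- `supervisord.log` (generated by supervisord)
- `supervisord.pid` (generated by supervisord)
- `python.log` (written by the script itself)
> The log only appears after the run finishes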
```
[0] 2022-04-20 18:31:12.309555
[1] 2022-04-20 18:31:13.310658
[2] 2022-04-20 18:31:14.311786
[3] 2022-04-20 18:31:15.312910
[4] 2022-04-20 18:31:16.314029
[5] 2022-04-20 18:31:17.315147
[6] 2022-04-20 18:31:18.316276
[7] 2022-04-20 18:31:19.317407
[8] 2022-04-20 18:31:20.318548
[9] 2022-04-20 18:31:21.319673
```
- `supervisord.log` (generated by supervisord)
```
2022-04-22 10:03:47,289 INFO daemonizing the supervisord process
2022-04-22 10:03:47,291 INFO supervisord started with pid 269
2022-04-22 10:03:48,293 INFO spawned: 'python' with pid 270
2022-04-22 10:03:49,295 INFO success: python entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-04-22 10:03:58,329 INFO exited: python (exit status 0; expected)
```
- Besides printing its log to the terminal (when running in the foreground),
supervisord also writes the same log to `supervisord.log`
- :warning: `supervisord.log` does not contain what python printed to stdout
```
author: None
logfile: None
[0] 2022-04-20 18:31:12.309555
...
[9] 2022-04-20 18:31:21.319673
```
The program's stdout & stderr logs go to separate files
in the directory set by `childlogdir` (see example 2)
- `supervisord.pid` (generated by supervisord)
Records the current pid, 269
```text=
269
```
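The `supervisord.log` lines shown above follow a regular `timestamp LEVEL message` layout, so they can be summarised with a small script. This is a sketch under that assumption only; `parse_log`, `count_events`, and the regular expression are illustrative and not part of Supervisor.

```python
import re
from collections import Counter

# Log lines copied from the supervisord.log shown above.
SAMPLE = """\
2022-04-22 10:03:47,289 INFO daemonizing the supervisord process
2022-04-22 10:03:47,291 INFO supervisord started with pid 269
2022-04-22 10:03:48,293 INFO spawned: 'python' with pid 270
2022-04-22 10:03:49,295 INFO success: python entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-04-22 10:03:58,329 INFO exited: python (exit status 0; expected)
"""

LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<message>.+)$"
)


def parse_log(text):
    """Split each matching log line into timestamp, level, and message."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def count_events(events):
    """Count events by the first word of the message, e.g. 'spawned:' -> 'spawned'."""
    return Counter(e["message"].split()[0].rstrip(":") for e in events)


print(count_events(parse_log(SAMPLE)))
```

Counting `spawned` entries is a quick way to see whether a program was started once or retried several times.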
<br>
### 4. Stop supervisor
```
$ ps -aux | grep super
tj 269 0.0 0.1 30708 19384 ? Ss 10:03 0:00 /usr/bin/python3 /usr/bin/supervisord -c my_supervisord.conf
tj 295 0.0 0.0 8164 668 pts/1 S+ 10:24 0:00 grep --color=auto super
```
```
$ kill 269
```
```
$ ps -aux | grep super
```
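The manual `ps` and `kill` steps can also be scripted from the pid file. This is a minimal sketch, assuming `supervisord.pid` sits in the current directory as in this example. `stop_supervisord` is a hypothetical helper, and its `kill` parameter exists only so the function can be tried without signalling a real process.

```python
import os
import signal
from pathlib import Path


def read_pid(pidfile):
    """Read the integer pid stored in a supervisord pid file."""
    return int(Path(pidfile).read_text().strip())


def stop_supervisord(pidfile="supervisord.pid", kill=os.kill):
    """Send SIGTERM (the signal a plain `kill <pid>` sends) to the recorded pid.

    Hypothetical helper; `kill` can be replaced by a stub for dry runs.
    """
    pid = read_pid(pidfile)
    kill(pid, signal.SIGTERM)
    return pid
```

Calling `stop_supervisord()` with the defaults corresponds to the `kill 269` step shown above.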
<br>
<hr>
<br>
## Getting-started example 2 (foreground mode)
> Runs in the foreground; interrupting it stops it
### 0. Preview
- ### Before running

- ### While running
[](https://i.imgur.com/5W8IjwX.png)
- Because it runs in the foreground, it blocks the current shell
- Generated files
[](https://i.imgur.com/5ugxykK.png)
- ### After interrupting with ^C

- `my_supervisord.pid` is removed automatically
<br>
### 1. Prepare `run_and_sleep.py`
> Same as in example 1
<br>
### 2. Prepare `my_supervisord.conf`
```ini=
[supervisord]
nodaemon=true ; run in the foreground
logfile=my_supervisord.log ; write my_supervisord.log to the current directory
pidfile=my_supervisord.pid ; write my_supervisord.pid to the current directory
childlogdir=. ; write the programs' logs to the current directory
[program:python]
command=python run_and_sleep.py
```
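- `nodaemon=true`
Runs in the foreground and blocks the current shell
[![](https://i.imgur.com/5W8IjwX.png)](https://i.imgur.com/5W8IjwX.png)
- `logfile`: the default path & file name is
`supervisord.log` in the current directory (as shown in example 1)
- `pidfile`: the default path & file name is
`supervisord.pid` in the current directory (as shown in example 1)
- `childlogdir`
- Default: the system temp directory (Python's `tempfile.gettempdir()`), so no log files appear in the current directory
- Setting it chooses where the programs' AUTO stdout & stderr logs are written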
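With `childlogdir=.`, the per-program log files land next to the config and follow the `<program>-<stream>---supervisor-<random>.log` pattern seen in example 1. This is a small sketch for locating them, assuming that naming pattern holds; `find_child_logs` is a hypothetical helper.

```python
from pathlib import Path


def find_child_logs(childlogdir, program):
    """Return the AUTO child log files of `program`, grouped by stream.

    Assumes the '<program>-<stream>---supervisor-<random>.log' naming
    observed in the examples above.
    """
    base = Path(childlogdir)
    return {
        stream: sorted(base.glob(f"{program}-{stream}---supervisor-*.log"))
        for stream in ("stdout", "stderr")
    }
```

For the config above, `find_child_logs(".", "python")` would return the stdout and stderr files that supervisord created for `[program:python]`.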
<br>
### 3. Run supervisord
```bash=
$ supervisor -c my_supervisord.conf
sh: 2: supervisor: not found
$ supervisord -c my_supervisord.conf
2022-04-22 10:53:19,727 INFO supervisord started with pid 307
2022-04-22 10:53:20,732 INFO spawned: 'python' with pid 309
2022-04-22 10:53:21,734 INFO success: python entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-04-22 10:53:30,768 INFO exited: python (exit status 0; expected)
(blocks the current shell)
```
<br>
### 4. Stop supervisor
Interrupt with `^C`
<br>
<hr>
<br>
## Common example 3 (practical example)
### `my_supervisord.conf`
```ini=
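[supervisord] ; [required]
nodaemon=true ; [required] run in the foreground so the process blocks and the container does not exit
logfile=my_supervisord.log ; [optional] write my_supervisord.log to the current directory
pidfile=my_supervisord.pid ; [optional] write my_supervisord.pid to the current directory
childlogdir=. ; [optional] write the programs' logs to the current directory
user=root ; [optional]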
[program:run_and_sleep]
command=python run_and_sleep.py
var1
var2
var3
```
- :warning: A program section must not be empty
```
Error: program section program:XXX does not specify a command
in section 'program:XXX' (file: 'supervisord.conf')
```
- If supervisord runs as root, the following message appears
```
CRIT Supervisor running as root (no user in config file)
```
Adding the following setting to the `[supervisord]` section removes this message
```ini
user=root
```
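A config like the one in this example can also be generated from a script, which makes it easy to reject empty program sections up front. This is a sketch only: `render_config` is a hypothetical helper built on Python's `configparser`, and it emits just the `nodaemon`, `user`, and `command` settings discussed above.

```python
import configparser
import io


def render_config(programs, user=None):
    """Render a foreground supervisord config for a {name: command} mapping.

    Hypothetical helper; raises ValueError for a program without a command,
    mirroring the 'does not specify a command' error shown above.
    """
    parser = configparser.RawConfigParser()
    parser["supervisord"] = {"nodaemon": "true"}
    if user is not None:
        parser["supervisord"]["user"] = user
    for name, command in programs.items():
        if not command:
            raise ValueError(f"program section program:{name} does not specify a command")
        parser[f"program:{name}"] = {"command": command}
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)  # emit name=value without spaces
    return buf.getvalue()


print(render_config({"run_and_sleep": "python run_and_sleep.py"}, user="root"))
```

Writing the returned text to `my_supervisord.conf` gives a file that can be passed to `supervisord -c`.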
<br>
<hr>
<br>
## :warning: Limitations
:::warning
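:warning: **daemon vs. non-daemon?**
A program that runs for less than 1 second (the default `startsecs`) is spawned 4 times in total (the first start plus the default 3 `startretries`):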
```
INFO spawned: 'your_program' with pid 9999
INFO exited: your_program (exit status 0; not expected)
```
Finally it prints:
```
gave up: your_program entered FATAL state, too many start retries too quickly
```
:::
### Testing .sh
```ini
[program:sh_tester]
command=bash test.sh ; test.sh is an empty file
```
```=
$ supervisord -c my_supervisord.conf
2022-04-26 10:22:39,514 INFO supervisord started with pid 1967321
2022-04-26 10:22:40,517 INFO spawned: 'sh_tester' with pid 1967324
2022-04-26 10:22:40,526 INFO exited: sh_tester (exit status 0; not expected)
2022-04-26 10:22:41,529 INFO spawned: 'sh_tester' with pid 1967350
2022-04-26 10:22:41,535 INFO exited: sh_tester (exit status 0; not expected)
2022-04-26 10:22:43,539 INFO spawned: 'sh_tester' with pid 1967351
2022-04-26 10:22:43,551 INFO exited: sh_tester (exit status 0; not expected)
2022-04-26 10:22:46,556 INFO spawned: 'sh_tester' with pid 1967352
2022-04-26 10:22:46,565 INFO exited: sh_tester (exit status 0; not expected)
2022-04-26 10:22:47,567 INFO gave up: sh_tester entered FATAL state, too many start retries too quickly
```
- The `.sh` script is spawned 4 times
- How can supervisord be made to treat it as a **long-running program (daemon)**?
- Change `test.sh` as follows
```bash=
# delay 1 second before ending,
# so that supervisord won't restart this
sleep 1
```
- Test run
```=
$ supervisord -c my_supervisord.conf
2022-04-26 10:59:18,584 INFO supervisord started with pid 1974732
2022-04-26 10:59:19,587 INFO spawned: 'sh_tester' with pid 1974760
2022-04-26 10:59:20,588 INFO success: sh_tester entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-04-26 10:59:20,633 INFO exited: sh_tester (exit status 0; expected)
```
:+1: :100:
:::warning
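:warning: **Notes on running shell scripts**
- If `command` points at the script itself and nothing (an `sh`/`bash` prefix or a shebang) says how to run it, the following kinds of errors occur
- **Using a relative path**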
```
INFO spawnerr: can't find command 'test.sh'
```
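- Solution
Add `#!/bin/sh` or `#!/bin/bash` at the very top of the file (the message itself says the command was not found, so the script probably also needs to be referenced by a path supervisord can resolve, as in the absolute-path case below)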
```sh=
#!/bin/sh
sleep 1
```
- **Using an absolute path**
- test.sh is not executable
```
INFO spawnerr: command at '/home/tj/workplace/supervisord/test.sh' is not executable
```
- Solution
```
chmod +x test.sh
```
- test.sh is executable
```
INFO spawned: 'sh_tester' with pid 2011012
INFO exited: sh_tester (exit status 127; not expected)
```
Open sh_tester-stderr---supervisor-xxxxxxxx.log
```
supervisor: couldn't exec /home/tj/Asus/workplace/supervisord/test.sh: ENOEXEC
supervisor: child process was not spawned
```
- Solution
Add `#!/bin/sh` or `#!/bin/bash` at the very top of the file
```sh=
#!/bin/sh
sleep 1
```
- References
- [supervisord exiting with ENOEXEC](https://stackoverflow.com/questions/19285666)
:::
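The three failure modes in the note above (path not found, file not executable, missing shebang) can be checked before handing a script to supervisord. This is a rough preflight sketch; `check_script` is a hypothetical helper, and its messages are illustrative rather than supervisord's own.

```python
import os


def check_script(path):
    """Return a list of problems that would stop `path` from being exec'd directly."""
    problems = []
    if not os.path.isabs(path):
        # The notes above only got past the lookup stage with an absolute path.
        problems.append("path is not absolute")
    if not os.path.isfile(path):
        problems.append("file does not exist")
        return problems
    if not os.access(path, os.X_OK):
        problems.append("file is not executable (chmod +x)")
    with open(path, "rb") as f:
        if f.read(2) != b"#!":
            problems.append("missing shebang line such as #!/bin/sh")
    return problems
```

An empty list means none of the three problems was detected. Using `command=bash test.sh`, as in the earlier test, sidesteps the executable-bit and shebang requirements because `bash` is the program being executed.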
<br>
### Testing .py
```ini
[program:py_tester]
command=python test.py ; test.py is an empty file
```
```=
$ supervisord -c my_supervisord.conf
2022-04-26 10:26:12,170 INFO supervisord started with pid 1967964
2022-04-26 10:26:13,174 INFO spawned: 'py_tester' with pid 1967967
2022-04-26 10:26:13,198 INFO exited: py_tester (exit status 0; not expected)
2022-04-26 10:26:14,202 INFO spawned: 'py_tester' with pid 1967968
2022-04-26 10:26:14,224 INFO exited: py_tester (exit status 0; not expected)
2022-04-26 10:26:16,229 INFO spawned: 'py_tester' with pid 1967969
2022-04-26 10:26:16,254 INFO exited: py_tester (exit status 0; not expected)
2022-04-26 10:26:19,259 INFO spawned: 'py_tester' with pid 1967999
2022-04-26 10:26:19,281 INFO exited: py_tester (exit status 0; not expected)
2022-04-26 10:26:20,283 INFO gave up: py_tester entered FATAL state, too many start retries too quickly
```
- The `.py` script is spawned 4 times
- How can supervisord be made to treat it as a **long-running program (daemon)**?
- Change `test.py` as follows
```python=
import time
time.sleep(1)
```
- Test run
```=
$ supervisord -c my_supervisord.conf
2022-04-26 10:55:39,943 INFO supervisord started with pid 1974098
2022-04-26 10:55:40,947 INFO spawned: 'py_tester' with pid 1974102
2022-04-26 10:55:41,950 INFO success: py_tester entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-04-26 10:55:42,473 INFO exited: py_tester (exit status 0; expected)
```
:+1: :100:
<br>
### Final solution
- [Configuration File > startsecs](http://supervisord.org/configuration.html#startsecs)
> The total number of seconds which the program needs to stay running after a startup to consider the start successful (moving the process from the STARTING state to the RUNNING state). Set to 0 to indicate that the program needn’t stay running for any particular amount of time.
```
[program:sh_tester]
command=bash test.sh
startsecs=0
```
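The effect of `startsecs` on the logs in this section can be reproduced with a toy model. The function below is a simplified sketch, not supervisord's actual implementation: it ignores the growing delay between retries and assumes the defaults `startsecs=1` and `startretries=3`.

```python
def simulate_start(run_seconds, startsecs=1, startretries=3):
    """Simplified model of the start logic; returns (final_state, spawn_count).

    run_seconds: how long the program stays up on each successive spawn.
    """
    spawns = 0
    for duration in run_seconds:
        spawns += 1
        if duration >= startsecs:
            return "RUNNING", spawns      # stayed up long enough
        if spawns > startretries:
            return "FATAL", spawns        # first start plus all retries failed
    return "BACKOFF", spawns              # ran out of simulated spawns


# An empty script exits almost immediately on every spawn.
print(simulate_start([0.01] * 10))
# The same script with startsecs=0 counts as started right away.
print(simulate_start([0.01] * 10, startsecs=0))
```

In this model the empty script ends in `FATAL` after 4 spawns, which matches the logs above, while `startsecs=0` reports `RUNNING` on the first spawn.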
<br>
<hr>
<br>
## References
- ### [[GitLab] notebook-images / ngcTensorflow / supervisord.conf](http://10.78.26.44:30000/cloud_infra2/notebook-images/-/blob/master/ngcTensorflow/supervisord.conf)
- ### [[Official site] Supervisor 4.2.4 doc](http://supervisord.org/configuration.html)
- [The role and configuration of Supervisor](https://www.jianshu.com/p/0226b7c59ae2)
```
cat /etc/supervisor/supervisord.conf
```
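- [program:x]: the configuration file must contain at least one program; x is the program name and is required (it cannot be empty)
- command: the command to execute when this program starts
- directory: supervisord temporarily changes to this directory before executing the child process
- user: the account name
- startsecs: how long (in seconds) the program must keep running for the process to move from the STARTING state to the RUNNING state
- redirect_stderr: if true, the process's stderr output is sent back to supervisord on its stdout file descriptor
- stdout_logfile: write the process's stdout output to the specified file
- stdout_logfile_maxbytes: maximum size in bytes of the log file given by stdout_logfile; the default is 50MB, and suffixes such as KB, MB, or GB can be used
- stdout_logfile_backups: the number of stdout_logfile backups to keep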
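Size values such as `50MB` for `stdout_logfile_maxbytes` combine a number with an optional suffix. This is a small sketch of how such a value could be converted to bytes, assuming 1024-based multiples and case-insensitive suffixes; `parse_byte_size` is a hypothetical helper, not Supervisor's own parser.

```python
import re

# Assumption: KB/MB/GB are powers of 1024.
_MULTIPLIERS = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_byte_size(value):
    """Convert strings like '50MB', '20mb', or '1024' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB)?\s*", value, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    suffix = (match.group(2) or "").upper()
    return int(match.group(1)) * _MULTIPLIERS[suffix]


print(parse_byte_size("50MB"))
```

This can help when comparing a configured limit with the actual size of a log file on disk.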
- ### [[Linux] Using Supervisor](https://www.huweihuang.com/article/linux/supervisor-usage/)
- The default log path is /var/log/supervisor/supervisord.log
- Create the confd.conf configuration
```ini=
[program:confd]
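directory = /usr/local/bin ; the program's startup directory
command = /usr/local/bin/confd -config-file /etc/confd/confd.toml ; startup command, the same as the one used on the command line
autostart = true ; start automatically when supervisord starts
startsecs = 5 ; considered started successfully if it has not exited abnormally 5 seconds after launch
autorestart = true ; restart automatically after the program exits abnormally
startretries = 3 ; number of automatic retries after a failed start, default 3
user = root ; which user to start the program as
redirect_stderr = true ; redirect stderr to stdout, default false
stdout_logfile_maxbytes = 20MB ; stdout log file size, default 50MB
stdout_logfile_backups = 20 ; number of stdout log file backups
; stdout log file; note that startup fails when the target directory does not exist, so create the directory manually (supervisord creates the log file itself)
stdout_logfile = /etc/supervisord.d/log/confd.log ; all logs go in the log directory
; environment can be used to add required environment variables; a common use is modifying PYTHONPATH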
; environment=PYTHONPATH=$PYTHONPATH:/path/to/somewhere
```