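Preface
A streaming media service and a gateway service I developed earlier needed to run reliably, so I put their processes under a supervisor, which also made operations much easier. That supervisor is Supervisor.
Supervisor is a general-purpose process management tool written in Python. It turns an ordinary command-line program into a background daemon, monitors its state, and restarts it automatically if it exits abnormally. It starts each managed program as a child process of supervisord via fork/exec, so all you need to do is put the path of the executable in the Supervisor configuration file. Because supervisord is the parent, it learns exactly when and how a child exits, and you can choose whether it should restart the child automatically and raise an alert.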
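The parent/child idea can be sketched in a few lines of Python. This is only an illustration of the technique, not Supervisor's actual implementation; the function name supervise and its parameters are made up for this example.

```python
import subprocess
import sys


def supervise(argv, expected_exit_codes=(0,), max_restarts=3):
    """Run argv as a child process and restart it after unexpected exits.

    Returns the exit codes observed, one per run.
    """
    exit_codes = []
    for _attempt in range(max_restarts + 1):
        # Popen forks and execs the command; this process stays its parent.
        child = subprocess.Popen(argv)
        # wait() blocks until the child exits and returns its exit status,
        # which is how the parent learns how the child ended.
        code = child.wait()
        exit_codes.append(code)
        if code in expected_exit_codes:
            break  # expected exit: do not restart
        print(f"child exited unexpectedly with code {code}", file=sys.stderr)
    return exit_codes


if __name__ == "__main__":
    # Hypothetical child that always fails with exit code 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing, max_restarts=2))  # prints [3, 3, 3]
```

A real supervisor adds back-off between restarts, logging, and signal handling on top of this loop.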
Installing Supervisor
Run:
apt-get install supervisor
Wait for the installation to finish.
Generating the default configuration file with echo_supervisord_conf
Run:
echo_supervisord_conf > /etc/supervisor/supervisord.conf
Open supervisord.conf in the /etc/supervisor directory. Its contents are shown below; a leading ";" marks a comment.
; Sample supervisor config file.
;
; For more information on the config file, please see:
; http://supervisord.org/configuration.html
;
; Notes:
; - Shell expansion ("~" or "$HOME") is not supported. Environment
; variables can be expanded using this syntax: "%(ENV_HOME)s".
; - Comments must have a leading space: "a=b ;comment" not "a=b;comment".
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
;chmod=0700 ; socket file mode (default 0700)
;chown=nobody:nogroup ; socket file uid:gid owner
;username=user ; (default is no username (open server))
;password=123 ; (default is no password (open server))
;[inet_http_server] ; inet (TCP) server disabled by default
;port=127.0.0.1:9001 ; (ip_address:port specifier, *:port for all iface)
;username=user ; (default is no username (open server))
;password=123 ; (default is no password (open server))
[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
;umask=022 ; (process file creation umask;default 022)
;user=chrism ; (default is current user, required if root)
;identifier=supervisor ; (supervisord identifier, default is 'supervisor')
;directory=/tmp ; (default is not to cd during start)
;nocleanup=true ; (don't clean up tempfiles at start;default false)
;childlogdir=/tmp ; ('AUTO' child log dir, default $TEMP)
;environment=KEY="value" ; (key value pairs to add to environment)
;strip_ansi=false ; (strip ansi escape codes in logs; def. false)
; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris ; should be same as http_username if set
;password=123 ; should be same as http_password if set
;prompt=mysupervisor ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history ; use readline history if available
; The below sample program section shows all possible program subsection values,
; create one or more 'real' program: sections to be able to control them under
; supervisor.
;[program:theprogramname]
;command=/bin/cat ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=999 ; the relative start priority (default 999)
;autostart=true ; start at supervisord start (default: true)
;startsecs=1 ; # of secs prog must stay up to be running (def. 1)
;startretries=3 ; max # of serial start failures when starting (default 3)
;autorestart=unexpected ; when to restart if exited after running (def: unexpected)
;exitcodes=0,2 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false ; send stop signal to the UNIX process group (default false)
;killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=chrism ; setuid to this UNIX account to run the program
;redirect_stderr=true ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10)
;stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
;stdout_events_enabled=false ; emit events on stdout writes (default false)
;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10 ; # of stderr logfile backups (default 10)
;stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
;stderr_events_enabled=false ; emit events on stderr writes (default false)
;environment=A="1",B="2" ; process environment additions (def no adds)
;serverurl=AUTO ; override serverurl computation (childutils)
; The below sample eventlistener section shows all possible
; eventlistener subsection values, create one or more 'real'
; eventlistener: sections to be able to handle event notifications
; sent by supervisor.
;[eventlistener:theeventlistenername]
;command=/bin/eventlistener ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;events=EVENT ; event notif. types to subscribe to (req'd)
;buffer_size=10 ; event buffer queue size (default 10)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=-1 ; the relative start priority (default -1)
;autostart=true ; start at supervisord start (default: true)
;startsecs=1 ; # of secs prog must stay up to be running (def. 1)
;startretries=3 ; max # of serial start failures when starting (default 3)
;autorestart=unexpected ; autorestart if exited after running (def: unexpected)
;exitcodes=0,2 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false ; send stop signal to the UNIX process group (default false)
;killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=chrism ; setuid to this UNIX account to run the program
;redirect_stderr=false ; redirect_stderr=true is not allowed for eventlisteners
;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10 ; # of stdout logfile backups (default 10)
;stdout_events_enabled=false ; emit events on stdout writes (default false)
;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10 ; # of stderr logfile backups (default 10)
;stderr_events_enabled=false ; emit events on stderr writes (default false)
;environment=A="1",B="2" ; process environment additions
;serverurl=AUTO ; override serverurl computation (childutils)
; The below sample group section shows all possible group values,
; create one or more 'real' group: sections to create "heterogeneous"
; process groups.
;[group:thegroupname]
;programs=progname1,progname2 ; each refers to 'x' in [program:x] definitions
;priority=999 ; the relative start priority (default 999)
; The [include] section can just contain the "files" setting. This
; setting can list multiple files (separated by whitespace or
; newlines). It can also contain wildcards. The filenames are
; interpreted as relative to this file. Included files *cannot*
; include files themselves.
;[include]
;files = relative/directory/*.ini
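The note in the sample file that comments need a leading space is easy to get wrong. The sketch below uses Python's standard configparser as a stand-in to show the effect; Supervisor's own parser may differ in details, and the section name program:demo is hypothetical.

```python
import configparser

SAMPLE = """
[program:demo]
autostart=true ; comment with a leading space
autorestart=true; comment without a leading space
"""


def read_values(text):
    """Parse INI text, treating ';' as an inline comment only after whitespace."""
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    for key, value in read_values(SAMPLE)["program:demo"].items():
        print(f"{key} = {value!r}")
```

With this parser, autostart comes out as 'true', while autorestart keeps the whole string 'true; comment without a leading space' as its value, which is not a valid setting.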
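Uncomment the [inet_http_server] section and its port, username, and password entries by removing the leading ";". Once the Supervisor service is started, you can view and manage the configured processes at http://ip:9001, which makes operations more convenient. Note that the sample value port=127.0.0.1:9001 listens only on the loopback interface; as the sample comment says, use *:9001 to listen on all interfaces.
When you open http://ip:9001 after startup, the page asks for a user name and password. These are the username and password values from the configuration file.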
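Managing processes through configuration
If you have many processes to manage, give each one its own configuration file and pull those files in through the [include] section of supervisord.conf.
I keep all process configuration files in the /etc/supervisor/config.d directory, so the [include] section only needs the following entry: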
[include]
files = /etc/supervisor/config.d/*.conf
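To preview which files a files setting would pick up, the wildcard expansion can be approximated with Python's glob module. This is a sketch that follows the rules in the sample file's comments (whitespace-separated patterns, wildcards allowed, relative names resolved against the main configuration file); it is not Supervisor's own code, and expand_include is a hypothetical helper name.

```python
import glob
import os
import tempfile


def expand_include(files_setting, config_dir):
    """Expand a whitespace-separated 'files' value into a list of paths."""
    matches = []
    for pattern in files_setting.split():
        # Relative patterns are resolved against the main config's directory.
        if not os.path.isabs(pattern):
            pattern = os.path.join(config_dir, pattern)
        matches.extend(sorted(glob.glob(pattern)))
    return matches


if __name__ == "__main__":
    # Build a throwaway directory so the demo does not touch /etc.
    with tempfile.TemporaryDirectory() as root:
        conf_dir = os.path.join(root, "config.d")
        os.mkdir(conf_dir)
        for name in ("mediaserver.conf", "gateway.conf", "notes.txt"):
            open(os.path.join(conf_dir, name), "w").close()
        for path in expand_include("config.d/*.conf", root):
            print(os.path.basename(path))
```

Only the two .conf files are listed; notes.txt does not match the pattern and would be ignored.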
For example, to manage the mediaserver process, I added a file named mediaserver.conf to the /etc/supervisor/config.d directory
with the following contents:
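[program:mediaserver] ; defines a supervised program named mediaserver
user=root ; user that mediaserver runs as
directory=/root/MediaServer_20200714 ; change to this directory before starting
command=/root/MediaServer_20200714/mediaserver -l 2 ; start command; it must stay in the foreground, so no nohup or &
autostart=true ; start when supervisord starts
autorestart=true ; restart automatically whenever the process exits
startretries=10 ; number of start attempts before giving up
stdout_logfile=/root/MediaServer_20200714/mediaserver.log/medias.log
priority=1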
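Supervisor expects a program to stay in the foreground, and it does not run the command through a shell, so shell habits such as nohup and a trailing & do not work in a command value. The check below is a minimal sketch of how such mistakes could be caught before deploying a configuration file; command_problems is a hypothetical helper, not part of Supervisor.

```python
import shlex


def command_problems(command):
    """Return reasons why a command value would not suit supervisord."""
    problems = []
    tokens = shlex.split(command)
    if tokens and tokens[0] == "nohup":
        problems.append("nohup is unnecessary: supervisord already owns the process")
    if tokens and tokens[-1].endswith("&"):
        problems.append("'&' is shell syntax and would be passed as a literal argument")
    return problems


if __name__ == "__main__":
    for cmd in ("nohup ./mediaserver -l 2 &",
                "/root/MediaServer_20200714/mediaserver -l 2"):
        print(cmd, "->", command_problems(cmd) or "ok")
```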
Starting the Supervisor service
Run:
supervisord -c /etc/supervisor/supervisord.conf
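Common commands
supervisorctl status      show the status of all processes
supervisorctl stop all    stop all processes
supervisorctl start all   start all processes
supervisorctl update      apply configuration changes; run this after editing a configuration file
supervisorctl reload      restart supervisord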
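The same status information is also reachable from code through Supervisor's XML-RPC interface. The sketch below assumes [inet_http_server] is enabled; the URL shape, host, and credentials are placeholders, and format_status only imitates the general layout of supervisorctl status.

```python
from xmlrpc.client import ServerProxy


def format_status(process_infos):
    """Render process info dicts as one 'name state description' line each."""
    lines = []
    for info in process_infos:
        lines.append(f"{info['name']:<24}{info['statename']:<10}{info['description']}")
    return "\n".join(lines)


def fetch_status(url):
    """Ask a running supervisord for the state of all its processes."""
    # Assumed URL shape: http://user:password@host:9001/RPC2
    with ServerProxy(url) as proxy:
        return proxy.supervisor.getAllProcessInfo()


if __name__ == "__main__":
    # Placeholder credentials and host; replace with your own values.
    print(format_status(fetch_status("http://user:123@127.0.0.1:9001/RPC2")))
```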
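Common problems
Each program's configuration fails in its own way, so the logs differ. Follow the error messages in the log file and fix the configuration step by step.
Problem 1: with the mediaserver configuration, the main error was: error while loading shared libraries: libuv.so.1: cannot open shared object file: No such file or directory
Solution: edit /etc/ld.so.conf and add the directory that contains the library on a new line (/usr/lib in my case), then run ldconfig to update /etc/ld.so.cache.
Problem 2: the error "/var/run/supervisor.sock no such file" or "/var/run/supervisor.sock refused connection"
Solution: comment out the [include] section in /etc/supervisor/supervisord.conf and restart the Supervisor service. Enter the supervisorctl console and run status; no managed programs should be listed at this point. Then uncomment the [include] section, enter the supervisorctl console again, and run update.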
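A quick way to see whether a library name resolves after running ldconfig is Python's ctypes.util.find_library. This is only an approximate check, because find_library does not follow exactly the same lookup as the dynamic loader; unresolved_libraries is a hypothetical helper name, and "uv" is the short name assumed for libuv.

```python
from ctypes.util import find_library


def unresolved_libraries(names):
    """Return the library names that the system lookup cannot resolve."""
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    missing = unresolved_libraries(["uv"])
    print("missing:", missing if missing else "none")
```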
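The two error messages correspond to two different states of the socket file, and it can help to tell them apart before changing any configuration. The following sketch classifies a Unix socket path; probe_unix_socket is a hypothetical helper, and the path in the demo is the one from the error message.

```python
import os
import socket
import tempfile


def probe_unix_socket(path):
    """Classify a Unix socket path as 'missing', 'refused', or 'ok'."""
    if not os.path.exists(path):
        return "missing"  # supervisord not started, or it uses another path
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except ConnectionRefusedError:
        return "refused"  # stale file left behind; nothing is listening
    finally:
        client.close()
    return "ok"


if __name__ == "__main__":
    print(probe_unix_socket("/var/run/supervisor.sock"))
    # A path that was never created is reported as missing.
    print(probe_unix_socket(os.path.join(tempfile.gettempdir(), "no-such.sock")))
```

A result of "missing" usually means supervisord is not running or that the file setting in [unix_http_server] and the serverurl in [supervisorctl] point to different paths.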
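Conclusion
From here on, operations can be handled through http://ip:9001.
The page lists the processes currently being monitored and offers routine operations on them, such as restart and stop.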