
Patroni reports an error when started as a service

yafeishi
Published: June 1, 2020

Starting Patroni through the service (systemctl start patroni) logs an error:

May 29 12:07:05 template systemd: Started Runners to orchestrate a high-availability AntDB5.0.
May 29 12:07:05 template systemd: Starting Runners to orchestrate a high-availability AntDB5.0...
May 29 12:07:05 template patroni: FATAL: Patroni requires psycopg2>=2.5.4 or psycopg2-binary
May 29 12:07:05 template systemd: patroni.service: main process exited, code=exited, status=1/FAILURE
May 29 12:07:05 template systemd: Unit patroni.service entered failed state.
May 29 12:07:05 template systemd: patroni.service failed.



This is the check in Patroni's code. Judging by which message was printed, the import itself appears to be failing:

def main():
    min_psycopg2 = (2, 5, 4)
    min_psycopg2_str = '.'.join(map(str, min_psycopg2))
    def parse_version(version):
        for e in version.split('.'):
            try:
                yield int(e)
            except ValueError:
                break
    try:
        import psycopg2
        version_str = psycopg2.__version__.split(' ')[0]
        version = tuple(parse_version(version_str))
        if version < min_psycopg2:
            fatal('Patroni requires psycopg2>={0}, but only {1} is available', min_psycopg2_str, version_str)
    except ImportError:
        fatal('Patroni requires psycopg2>={0} or psycopg2-binary', min_psycopg2_str)
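
To see what this check does with a version string such as the one psycopg2 reports, the comparison can be reproduced in isolation. The sketch below copies parse_version from the excerpt above; meets_minimum is a hypothetical helper name added for illustration and is not part of Patroni.

```python
def parse_version(version):
    """Yield the leading integer components of a dotted version string."""
    for e in version.split('.'):
        try:
            yield int(e)
        except ValueError:
            # Stop at the first non-numeric component, e.g. 'dev1'.
            break


def meets_minimum(raw_version, minimum=(2, 5, 4)):
    """Hypothetical helper: apply the same tuple comparison as the excerpt."""
    # psycopg2.__version__ looks like '2.8.4 (dt dec pq3 ext lo64)';
    # keep only the part before the first space.
    version_str = raw_version.split(' ')[0]
    return tuple(parse_version(version_str)) >= minimum


if __name__ == "__main__":
    print(meets_minimum('2.8.4 (dt dec pq3 ext lo64)'))  # True
    print(meets_minimum('2.5.3'))                        # False
```

Because the comparison is on tuples of integers, 2.10.0 correctly ranks above 2.5.4, which a plain string comparison would get wrong.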



A simple import check shows no problem:

[antdb@test-node25 ~]$ python -c "import psycopg2; print(psycopg2.__version__)"
2.8.4 (dt dec pq3 ext lo64)
[antdb@test-node25 ~]$ sudo python -c "import psycopg2; print(psycopg2.__version__)"
2.8.4 (dt dec pq3 ext lo64)



Running the same logic as a standalone script shows no problem either:

[antdb@test-node25 danghb]$ sudo python t.py
('Patroni requires psycopg2 version {0} available', '2.8.4')
[antdb@test-node25 danghb]$ cat t.py
if __name__ == "__main__":
    min_psycopg2 = (2, 5, 4)
    min_psycopg2_str = '.'.join(map(str, min_psycopg2))
    def parse_version(version):
        for e in version.split('.'):
            try:
                yield int(e)
            except ValueError:
                break
    try:
        import psycopg2
        version_str = psycopg2.__version__.split(' ')[0]
        version = tuple(parse_version(version_str))
        if version < min_psycopg2:
            print('Patroni requires psycopg2>={0}, but only {1} is available', min_psycopg2_str, version_str)
        else:
            print('Patroni requires psycopg2 version {0} available', version_str)
    except ImportError:
        print('Patroni requires psycopg2>={0} or psycopg2-binary', min_psycopg2_str)
[antdb@test-node25 danghb]$



Starting Patroni by hand: patroni /etc/patroni.yml

2020-05-29 12:59:39,173 INFO: Lock owner: None; I am node1
2020-05-29 12:59:39,187 INFO: Lock owner: None; I am node1
2020-05-29 12:59:39,192 INFO: starting as a secondary
2020-05-29 12:59:39,231 INFO: postmaster pid=68565
2020-05-29 12:59:39.240 CST [68565] LOG: listening on IPv4 address "10.238.99.74", port 5432
2020-05-29 12:59:39.241 CST [68565] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-05-29 12:59:39.450 CST [68565] LOG: redirecting log output to logging collector process
2020-05-29 12:59:39.450 CST [68565] HINT: Future log output will appear in directory "pg_log".
10.238.99.74:5432 - rejecting connections
10.238.99.74:5432 - accepting connections
2020-05-29 12:59:39,513 INFO: establishing a new patroni connection to the postgres cluster
2020-05-29 12:59:39,526 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2020-05-29 12:59:39,537 INFO: promoted self to leader by acquiring session lock
server promoting
2020-05-29 12:59:39,545 INFO: cleared rewind state after becoming the leader
2020-05-29 12:59:40,611 INFO: Lock owner: node1; I am node1
2020-05-29 12:59:40,638 INFO: no action. i am the leader with the lock
2020-05-29 12:59:45,612 INFO: Lock owner: node1; I am node1
2020-05-29 12:59:45,627 INFO: no action. i am the leader with the lock
2020-05-29 12:59:50,615 INFO: Lock owner: node1; I am node1
2020-05-29 12:59:50,638 INFO: no action. i am the leader with the lock
2020-05-29 12:59:55,613 INFO: Lock owner: node1; I am node1



That works too, so the suspicion falls on the service file: some environment variables are probably not being loaded.



Put the verification script into a service:

sudo vi /usr/lib/systemd/system/dangtest.service
[Unit]
Description=dangtest
After=syslog.target network.target
[Service]
Type=simple
User=antdb
Group=antdb
ExecStart=/usr/bin/python /home/antdb/danghb/t.py
KillMode=process
[Install]
WantedBy=multi-user.target



sudo systemctl daemon-reload

sudo systemctl start dangtest



May 29 13:49:10 template systemd: Started dangtest.
May 29 13:49:10 template systemd: Starting dangtest...
May 29 13:49:10 template python: ('Patroni requires psycopg2>={0} or psycopg2-binary', '2.5.4')
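
The message above only says that the import failed, not why: both t.py and Patroni's check discard the exception text. As a sketch, a variant that keeps the text could narrow things down faster, since a compiled extension that cannot load a shared library also surfaces as ImportError. describe_import is a hypothetical helper name, not something from Patroni.

```python
import importlib


def describe_import(module_name):
    """Try to import module_name and describe the outcome, keeping the error text."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # The exception message usually names the missing module or library.
        return 'import of {0} failed: {1}'.format(module_name, exc)
    origin = getattr(module, '__file__', None) or '<built-in>'
    return '{0} imported from {1}'.format(module_name, origin)


if __name__ == "__main__":
    print(describe_import('psycopg2'))
```

Running this from the test service instead of t.py would put the underlying reason for the failure into the system log.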



Systemd starts the processes with a minimal environment



Sure enough, it fails under systemd.



The output under systemd is the same as what env -i produces:

[antdb@test-node25 ~]$ env -i /usr/bin/python /home/antdb/danghb/t.py
('Patroni requires psycopg2>={0} or psycopg2-binary', '2.5.4')
[antdb@test-node25 ~]$ /usr/bin/python /home/antdb/danghb/t.py
('Patroni requires psycopg2 version {0} available', '2.8.4')

In other words, the environment variables are not loaded.
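
The env -i comparison can also be reproduced from Python. This is only a sketch: DEMO_MARKER is a made-up variable name and child_sees is a hypothetical helper. Passing env={} to subprocess starts the child with an empty environment, much like env -i, while env=None lets it inherit the parent's.

```python
import os
import subprocess
import sys


def child_sees(name, env=None):
    """Return True if a child Python process has the variable `name` set.

    env=None inherits the parent's environment; env={} starts from nothing.
    """
    code = "import os, sys; sys.stdout.write(str(sys.argv[1] in os.environ))"
    out = subprocess.check_output([sys.executable, "-c", code, name], env=env)
    return out.decode().strip() == "True"


if __name__ == "__main__":
    os.environ["DEMO_MARKER"] = "1"
    print(child_sees("DEMO_MARKER"))          # inherited environment
    print(child_sees("DEMO_MARKER", env={}))  # empty environment
```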



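Specifying PYTHONPATH explicitly also works, though without -i these commands still inherit the rest of the login environment, so this alone does not show that PYTHONPATH is the missing piece: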

[antdb@test-node25 ~]$ env PYTHONPATH=/usr/lib64/python2.7/site-packages /usr/bin/python /home/antdb/danghb/t.py
('Patroni requires psycopg2 version {0} available', '2.8.4')
[antdb@test-node25 ~]$ sudo env PYTHONPATH=/usr/lib64/python2.7/site-packages /usr/bin/python /home/antdb/danghb/t.py
('Patroni requires psycopg2 version {0} available', '2.8.4')



Edit dangtest.service again and add an Environment option:

sudo vi /usr/lib/systemd/system/dangtest.service
Environment=PYTHONPATH=/usr/lib64/python2.7/site-packages
ExecStart=/usr/bin/python /home/antdb/danghb/t.py

That has no effect.

Change ExecStart to:

ExecStart=/usr/bin/env PYTHONPATH=/usr/lib64/python2.7/site-packages /usr/bin/python /home/antdb/danghb/t.py

That does not work either.



sudo systemctl daemon-reload

sudo systemctl start dangtest
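
Neither attempt shows what environment the service process actually received. As a sketch, t.py could be swapped temporarily for a script that prints the variables of interest, so the system log shows whether the Environment setting took effect. report_env is a hypothetical helper, and the list of names is only an assumption about which variables might matter here.

```python
import os


def report_env(names, env=None):
    """Return one line per name: 'NAME=value' or 'NAME is not set'."""
    env = os.environ if env is None else env
    lines = []
    for name in names:
        if name in env:
            lines.append('{0}={1}'.format(name, env[name]))
        else:
            lines.append('{0} is not set'.format(name))
    return lines


if __name__ == "__main__":
    # Assumed list of variables worth checking; adjust as needed.
    for line in report_env(['PYTHONPATH', 'LD_LIBRARY_PATH', 'PATH', 'HOME']):
        print(line)
```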



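Running the script directly works and running it under env -i fails, so somewhere in between the script must rely on some of the user's environment variables. Save the antdb user's environment (the output of env) to a file: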

sudo vi /etc/antdb_env.conf
SELINUX_ROLE_REQUESTED=
TERM=vt100
SHELL=/bin/bash
HISTSIZE=5
PERL5LIB=/home/antdb/perl5/lib64/perl5:/home/antdb/perl5/usr/local/share/perl5/:
SELINUX_USE_CURRENT_RANGE=
ADB_HOME=/data/antdb/app/antdb
SSH_TTY=/dev/pts/30
USER=antdb
PGPORT=5432
LD_LIBRARY_PATH=/data/antdb/app/antdb/lib:/data/antdb/oracle/instantclient_11_2:
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
PGDATABASE=postgres
MAIL=/var/spool/mail/antdb
PATH=/data/antdb/app/antdb/bin:/data/antdb/oracle/instantclient_11_2:/home/antdb/perl5/usr/local/bin/:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/antdb/.local/bin:/home/antdb/bin
PWD=/home/antdb
LANG=en_US.UTF-8
SELINUX_LEVEL_REQUESTED=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/home/antdb
LOGNAME=antdb
PGDATA=/data/antdb/data/antdb
LESSOPEN=||/usr/bin/lesspipe.sh %s
XDG_RUNTIME_DIR=/run/user/1012
ORACLE_HOME=/data/antdb/oracle/instantclient_11_2
_=/usr/bin/env
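
Copying the whole login environment works, but the dump above also carries session-only values such as SSH_TTY, SHLVL and LS_COLORS that a service has no use for. As a sketch, the output of env could be trimmed before it is saved. The SESSION_ONLY set below is an assumption about which names are safe to drop, and filter_env_dump is a hypothetical helper.

```python
import os

# Assumption: these names describe the interactive session, not the service.
SESSION_ONLY = {
    'TERM', 'SSH_TTY', 'SHLVL', 'PWD', '_', 'LS_COLORS', 'LESSOPEN',
    'HISTSIZE', 'HISTCONTROL', 'MAIL', 'XDG_RUNTIME_DIR',
}


def filter_env_dump(dump, drop=SESSION_ONLY):
    """Keep KEY=VALUE lines from an `env` dump whose KEY is not in `drop`."""
    kept = []
    for line in dump.splitlines():
        name, sep, _value = line.partition('=')
        if not sep or name in drop:
            # Skip lines without '=' and session-only variables.
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    dump = '\n'.join('{0}={1}'.format(k, v) for k, v in os.environ.items())
    print('\n'.join(filter_env_dump(dump)))
```

A shorter file also makes it easier to see later which variables the service really depends on.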



Set EnvironmentFile in the service file to:

EnvironmentFile=/etc/antdb_env.conf



Start the dangtest service, and it finally works:

May 29 15:32:25 template systemd: Configuration file /usr/lib/systemd/system/dangtest.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
May 29 15:32:25 template systemd: Started dangtest.
May 29 15:32:25 template systemd: Starting dangtest...
May 29 15:32:25 template python: ('Patroni requires psycopg2 version {0} available', '2.8.4')



Apply the same change to Patroni's service file:

[antdb@test-node25 ~]$ sudo vi /usr/lib/systemd/system/patroni.service
[Unit]
Description=Runners to orchestrate a high-availability AntDB5.0
After=syslog.target network.target

[Service]
Type=simple
User=antdb
Group=antdb
EnvironmentFile=/etc/antdb_env.conf
ExecStart=/usr/bin/patroni /etc/patroni.yml
#KillMode=mixed
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Start Patroni through the service again:

sudo systemctl start patroni
Log:
May 29 15:56:04 template patroni: 2020-05-29 15:56:04.165 CST [69645] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
May 29 15:56:04 template patroni: 2020-05-29 15:56:04.325 CST [69645] LOG: redirecting log output to logging collector process
May 29 15:56:04 template patroni: 2020-05-29 15:56:04.325 CST [69645] HINT: Future log output will appear in directory "pg_log".
May 29 15:56:04 template patroni: 10.238.99.74:5432 - rejecting connections
May 29 15:56:04 template patroni: 10.238.99.74:5432 - accepting connections
May 29 15:56:04 template patroni: 2020-05-29 15:56:04,397 INFO: establishing a new patroni connection to the postgres cluster
May 29 15:56:04 template patroni: 2020-05-29 15:56:04,421 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
May 29 15:56:04 template patroni: 2020-05-29 15:56:04,443 INFO: promoted self to leader by acquiring session lock
May 29 15:56:04 template patroni: server promoting
May 29 15:56:04 template patroni: 2020-05-29 15:56:04,454 INFO: cleared rewind state after becoming the leader
May 29 15:56:05 template patroni: 2020-05-29 15:56:05,528 INFO: Lock owner: node1; I am node1
May 29 15:56:05 template patroni: 2020-05-29 15:56:05,565 INFO: no action. i am the leader with the lock
May 29 15:56:10 template patroni: 2020-05-29 15:56:10,526 INFO: Lock owner: node1; I am node1
May 29 15:56:10 template patroni: 2020-05-29 15:56:10,536 INFO: no action. i am the leader with the lock



It starts normally.



References:

  • https://wizardforcel.gitbooks.io/vbird-linux-basic-4e/content/150.html

  • https://stackoverflow.com/questions/35641414/python-import-of-local-module-failing-when-run-as-systemd-systemctl-service


