Patroni fails to start as a service
Starting Patroni as a service with systemctl start patroni logs the following error:
    May 29 12:07:05 template systemd: Started Runners to orchestrate a high-availability AntDB5.0.
    May 29 12:07:05 template systemd: Starting Runners to orchestrate a high-availability AntDB5.0...
    May 29 12:07:05 template patroni: FATAL: Patroni requires psycopg2>=2.5.4 or psycopg2-binary
    May 29 12:07:05 template systemd: patroni.service: main process exited, code=exited, status=1/FAILURE
    May 29 12:07:05 template systemd: Unit patroni.service entered failed state.
    May 29 12:07:05 template systemd: patroni.service failed.
This is the check in Patroni's code. Judging by which message was printed, the import itself appears to be failing:
    def main():
        min_psycopg2 = (2, 5, 4)
        min_psycopg2_str = '.'.join(map(str, min_psycopg2))

        def parse_version(version):
            for e in version.split('.'):
                try:
                    yield int(e)
                except ValueError:
                    break

        try:
            import psycopg2
            version_str = psycopg2.__version__.split(' ')[0]
            version = tuple(parse_version(version_str))
            if version < min_psycopg2:
                fatal('Patroni requires psycopg2>={0}, but only {1} is available', min_psycopg2_str, version_str)
        except ImportError:
            fatal('Patroni requires psycopg2>={0} or psycopg2-binary', min_psycopg2_str)
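To see why the message points at the import rather than at the version, the version comparison can be tried in isolation. The following is a minimal sketch that reuses the parse_version logic from the excerpt above; the helper name meets_minimum is hypothetical and not part of Patroni.

```python
def parse_version(version):
    """Yield the leading numeric components of a dotted version string."""
    for part in version.split('.'):
        try:
            yield int(part)
        except ValueError:
            # Stop at the first non-numeric component, e.g. 'dev1'.
            break


def meets_minimum(raw_version, minimum=(2, 5, 4)):
    """Hypothetical helper: True if raw_version is at least `minimum`."""
    # psycopg2.__version__ looks like '2.8.4 (dt dec pq3 ext lo64)';
    # only the part before the first space is the version number.
    version_str = raw_version.split(' ')[0]
    return tuple(parse_version(version_str)) >= minimum


if __name__ == '__main__':
    print(meets_minimum('2.8.4 (dt dec pq3 ext lo64)'))
    print(meets_minimum('2.4.6'))
```

A version such as 2.8.4 passes the tuple comparison, and a version that is too old would produce the "but only ... is available" message instead. The message seen in the service log is only emitted from the except ImportError branch.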
A simple import check shows no problem:
    [antdb@test-node25 ~]$ python -c "import psycopg2; print(psycopg2.__version__)"
    2.8.4 (dt dec pq3 ext lo64)
    [antdb@test-node25 ~]$ sudo python -c "import psycopg2; print(psycopg2.__version__)"
    2.8.4 (dt dec pq3 ext lo64)
Running the same check as a standalone script also works:
    [antdb@test-node25 danghb]$ sudo python t.py
    ('Patroni requires psycopg2 version {0} available', '2.8.4')
    [antdb@test-node25 danghb]$ cat t.py
    if __name__ == "__main__":
        min_psycopg2 = (2, 5, 4)
        min_psycopg2_str = '.'.join(map(str, min_psycopg2))

        def parse_version(version):
            for e in version.split('.'):
                try:
                    yield int(e)
                except ValueError:
                    break

        try:
            import psycopg2
            version_str = psycopg2.__version__.split(' ')[0]
            version = tuple(parse_version(version_str))
            if version < min_psycopg2:
                print('Patroni requires psycopg2>={0}, but only {1} is available', min_psycopg2_str, version_str)
            else:
                print('Patroni requires psycopg2 version {0} available', version_str)
        except ImportError:
            print('Patroni requires psycopg2>={0} or psycopg2-binary', min_psycopg2_str)
    [antdb@test-node25 danghb]$
Start Patroni manually: patroni /etc/patroni.yml
    2020-05-29 12:59:39,173 INFO: Lock owner: None; I am node1
    2020-05-29 12:59:39,187 INFO: Lock owner: None; I am node1
    2020-05-29 12:59:39,192 INFO: starting as a secondary
    2020-05-29 12:59:39,231 INFO: postmaster pid=68565
    2020-05-29 12:59:39.240 CST [68565] LOG: listening on IPv4 address "10.238.99.74", port 5432
    2020-05-29 12:59:39.241 CST [68565] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
    2020-05-29 12:59:39.450 CST [68565] LOG: redirecting log output to logging collector process
    2020-05-29 12:59:39.450 CST [68565] HINT: Future log output will appear in directory "pg_log".
    10.238.99.74:5432 - rejecting connections
    10.238.99.74:5432 - accepting connections
    2020-05-29 12:59:39,513 INFO: establishing a new patroni connection to the postgres cluster
    2020-05-29 12:59:39,526 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
    2020-05-29 12:59:39,537 INFO: promoted self to leader by acquiring session lock
    server promoting
    2020-05-29 12:59:39,545 INFO: cleared rewind state after becoming the leader
    2020-05-29 12:59:40,611 INFO: Lock owner: node1; I am node1
    2020-05-29 12:59:40,638 INFO: no action. i am the leader with the lock
    2020-05-29 12:59:45,612 INFO: Lock owner: node1; I am node1
    2020-05-29 12:59:45,627 INFO: no action. i am the leader with the lock
    2020-05-29 12:59:50,615 INFO: Lock owner: node1; I am node1
    2020-05-29 12:59:50,638 INFO: no action. i am the leader with the lock
    2020-05-29 12:59:55,613 INFO: Lock owner: node1; I am node1
That works too, so the suspicion falls on the service file: some environment variables are probably not being loaded.
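Patroni's check turns every ImportError into the same generic message, so the original reason for the failure is hidden. A minimal diagnostic sketch, assuming it is run both from the shell and from the service so the two outputs can be compared; the function name describe_import and the choice of variables to print are illustrative only.

```python
import importlib
import os
import sys
import traceback


def describe_import(module_name):
    """Try to import a module and report what happened."""
    report = {
        'module': module_name,
        'executable': sys.executable,
        'ok': False,
        'error': None,
    }
    try:
        importlib.import_module(module_name)
        report['ok'] = True
    except ImportError:
        # Keep the full traceback text; its last line usually names the
        # module or shared library that could not be loaded.
        report['error'] = traceback.format_exc()
    return report


if __name__ == '__main__':
    result = describe_import('psycopg2')
    for key in ('module', 'executable', 'ok', 'error'):
        print('{0}: {1}'.format(key, result[key]))
    # Variables that commonly influence module and library lookup.
    for name in ('PATH', 'PYTHONPATH', 'LD_LIBRARY_PATH'):
        print('{0}={1}'.format(name, os.environ.get(name)))
```

The traceback text is more specific than the FATAL line and can show whether a Python module or a native library is the part that cannot be found.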
Wrap the verification script in a service of its own:
    sudo vi /usr/lib/systemd/system/dangtest.service
    [Unit]
    Description=dangtest
    After=syslog.target network.target
    [Service]
    Type=simple
    User=antdb
    Group=antdb
    ExecStart=/usr/bin/python /home/antdb/danghb/t.py
    KillMode=process
    [Install]
    WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl start dangtest
    May 29 13:49:10 template systemd: Started dangtest.
    May 29 13:49:10 template systemd: Starting dangtest...
    May 29 13:49:10 template python: ('Patroni requires psycopg2>={0} or psycopg2-binary', '2.5.4')
Systemd starts the processes with a minimal environment
Sure enough, it fails under systemd.
The output under systemd matches what an empty environment produces:
    [antdb@test-node25 ~]$ env -i /usr/bin/python /home/antdb/danghb/t.py
    ('Patroni requires psycopg2>={0} or psycopg2-binary', '2.5.4')
    [antdb@test-node25 ~]$ /usr/bin/python /home/antdb/danghb/t.py
    ('Patroni requires psycopg2 version {0} available', '2.8.4')
In other words, the environment variables are not being loaded.
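The env -i comparison can also be scripted. The sketch below runs the import in a child interpreter under a given environment and reports whether it succeeded; the helper name import_works is hypothetical, and an empty dict plays the role of env -i.

```python
import os
import subprocess
import sys


def import_works(module_name, env, python=sys.executable):
    """Return True if `python -c "import <module_name>"` exits with 0 under env."""
    with open(os.devnull, 'w') as devnull:
        status = subprocess.call(
            [python, '-c', 'import ' + module_name],
            env=env,           # replaces the child's environment entirely
            stdout=devnull,
            stderr=devnull,
        )
    return status == 0


if __name__ == '__main__':
    # Same interpreter, two environments: the current one and an empty one.
    print('login environment: {0}'.format(import_works('psycopg2', dict(os.environ))))
    print('empty environment: {0}'.format(import_works('psycopg2', {})))
```

If the first line prints True and the second False, the failure depends on the environment alone and not on the user or the interpreter.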
Specify PYTHONPATH explicitly:
    [antdb@test-node25 ~]$ env PYTHONPATH=/usr/lib64/python2.7/site-packages /usr/bin/python /home/antdb/danghb/t.py
    ('Patroni requires psycopg2 version {0} available', '2.8.4')
    [antdb@test-node25 ~]$ sudo env PYTHONPATH=/usr/lib64/python2.7/site-packages /usr/bin/python /home/antdb/danghb/t.py
    ('Patroni requires psycopg2 version {0} available', '2.8.4')
Edit dangtest.service again and add the Environment option:
    sudo vi /usr/lib/systemd/system/dangtest.service
    Environment=PYTHONPATH=/usr/lib64/python2.7/site-packages
    ExecStart=/usr/bin/python /home/antdb/danghb/t.py
That has no effect.
Change ExecStart to:
ExecStart=/usr/bin/env PYTHONPATH=/usr/lib64/python2.7/site-packages /usr/bin/python /home/antdb/danghb/t.py
That has no effect either:
sudo systemctl daemon-reload
sudo systemctl start dangtest
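Since running the script directly works while running it under env -i fails, some of the user's environment variables must be involved, so save the user's full environment to a file: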
    sudo vi /etc/antdb_env.conf
    SELINUX_ROLE_REQUESTED=
    TERM=vt100
    SHELL=/bin/bash
    HISTSIZE=5
    PERL5LIB=/home/antdb/perl5/lib64/perl5:/home/antdb/perl5/usr/local/share/perl5/:
    SELINUX_USE_CURRENT_RANGE=
    ADB_HOME=/data/antdb/app/antdb
    SSH_TTY=/dev/pts/30
    USER=antdb
    PGPORT=5432
    LD_LIBRARY_PATH=/data/antdb/app/antdb/lib:/data/antdb/oracle/instantclient_11_2:
    LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
    PGDATABASE=postgres
    MAIL=/var/spool/mail/antdb
    PATH=/data/antdb/app/antdb/bin:/data/antdb/oracle/instantclient_11_2:/home/antdb/perl5/usr/local/bin/:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/antdb/.local/bin:/home/antdb/bin
    PWD=/home/antdb
    LANG=en_US.UTF-8
    SELINUX_LEVEL_REQUESTED=
    HISTCONTROL=ignoredups
    SHLVL=1
    HOME=/home/antdb
    LOGNAME=antdb
    PGDATA=/data/antdb/data/antdb
    LESSOPEN=||/usr/bin/lesspipe.sh %s
    XDG_RUNTIME_DIR=/run/user/1012
    ORACLE_HOME=/data/antdb/oracle/instantclient_11_2
    _=/usr/bin/env
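A file like this can be generated instead of pasted by hand. The following is a minimal sketch, assuming the plain KEY=VALUE line format that systemd's EnvironmentFile option reads; the function name format_env_file and the default skip list are illustrative, and dropping those session-specific variables is an assumption rather than something the steps above verified.

```python
import os


def format_env_file(env, skip=('_', 'LS_COLORS', 'SSH_TTY')):
    """Render a mapping as KEY=VALUE lines, one variable per line."""
    lines = []
    for key in sorted(env):
        value = env[key]
        # Skip variables named in `skip` (an illustrative choice) and
        # values that span several lines, which do not fit this format.
        if key in skip or '\n' in value:
            continue
        lines.append('{0}={1}'.format(key, value))
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    # Print the current environment in file form; redirect it as needed.
    print(format_env_file(dict(os.environ)))
```

Run as the antdb user, the output can be redirected to a file such as /etc/antdb_env.conf. A smaller file is easier to review, and values such as SSH_TTY describe one login session rather than the service.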
Set EnvironmentFile in the service file to:
EnvironmentFile=/etc/antdb_env.conf
Start the dangtest service again, and it finally works:
    May 29 15:32:25 template systemd: Configuration file /usr/lib/systemd/system/dangtest.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
    May 29 15:32:25 template systemd: Started dangtest.
    May 29 15:32:25 template systemd: Starting dangtest...
    May 29 15:32:25 template python: ('Patroni requires psycopg2 version {0} available', '2.8.4')
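Loading the whole login environment fixes the service, but it does not show which variables the import actually depends on. One way to narrow that down is a greedy reduction: remove one variable at a time and keep the removal whenever the check still passes. The sketch below is an illustration under that assumption; minimal_env and import_ok are hypothetical names, and the result is a set from which nothing more can be dropped, not necessarily the smallest possible set.

```python
import os
import subprocess
import sys


def import_ok(module_name, env):
    """Run `python -c "import <module_name>"` under env; True on exit status 0."""
    with open(os.devnull, 'w') as devnull:
        status = subprocess.call(
            [sys.executable, '-c', 'import ' + module_name],
            env=env, stdout=devnull, stderr=devnull)
    return status == 0


def minimal_env(env, works):
    """Drop every variable whose removal still leaves works(env) true."""
    needed = dict(env)
    if not works(needed):
        raise ValueError('check already fails with the full environment')
    for key in sorted(env):
        trial = dict(needed)
        del trial[key]
        if works(trial):
            needed = trial  # the variable was not required
    return needed


if __name__ == '__main__':
    full = dict(os.environ)
    check = lambda e: import_ok('psycopg2', e)
    if check(full):
        needed = minimal_env(full, check)
        for key in sorted(needed):
            print('{0}={1}'.format(key, needed[key]))
    else:
        print('psycopg2 does not import even with the full environment')
```

Whatever variables remain are candidates for a much shorter environment file, or for a single Environment line in the service file.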
Apply the same change to Patroni's service file:
    [antdb@test-node25 ~]$ sudo vi /usr/lib/systemd/system/patroni.service
    [Unit]
    Description=Runners to orchestrate a high-availability AntDB5.0
    After=syslog.target network.target
    [Service]
    Type=simple
    User=antdb
    Group=antdb
    EnvironmentFile=/etc/antdb_env.conf
    ExecStart=/usr/bin/patroni /etc/patroni.yml
    #KillMode=mixed
    TimeoutSec=30
    Restart=no
    [Install]
    WantedBy=multi-user.target
Start Patroni as a service once more:
    sudo systemctl start patroni

Log:

    May 29 15:56:04 template patroni: 2020-05-29 15:56:04.165 CST [69645] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
    May 29 15:56:04 template patroni: 2020-05-29 15:56:04.325 CST [69645] LOG: redirecting log output to logging collector process
    May 29 15:56:04 template patroni: 2020-05-29 15:56:04.325 CST [69645] HINT: Future log output will appear in directory "pg_log".
    May 29 15:56:04 template patroni: 10.238.99.74:5432 - rejecting connections
    May 29 15:56:04 template patroni: 10.238.99.74:5432 - accepting connections
    May 29 15:56:04 template patroni: 2020-05-29 15:56:04,397 INFO: establishing a new patroni connection to the postgres cluster
    May 29 15:56:04 template patroni: 2020-05-29 15:56:04,421 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
    May 29 15:56:04 template patroni: 2020-05-29 15:56:04,443 INFO: promoted self to leader by acquiring session lock
    May 29 15:56:04 template patroni: server promoting
    May 29 15:56:04 template patroni: 2020-05-29 15:56:04,454 INFO: cleared rewind state after becoming the leader
    May 29 15:56:05 template patroni: 2020-05-29 15:56:05,528 INFO: Lock owner: node1; I am node1
    May 29 15:56:05 template patroni: 2020-05-29 15:56:05,565 INFO: no action. i am the leader with the lock
    May 29 15:56:10 template patroni: 2020-05-29 15:56:10,526 INFO: Lock owner: node1; I am node1
    May 29 15:56:10 template patroni: 2020-05-29 15:56:10,536 INFO: no action. i am the leader with the lock
Patroni starts normally.
References:
https://wizardforcel.gitbooks.io/vbird-linux-basic-4e/content/150.html
https://stackoverflow.com/questions/35641414/python-import-of-local-module-failing-when-run-as-systemd-systemctl-service
Copyright notice: this is an original article by InfoQ author yafeishi.
Original link: http://xie.infoq.cn/article/c51669760577b9826fa8dc2a2. Please contact the author before reposting.