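Prometheus Exporter (12): Consul Exporter
This article is part of the Prometheus collection "What metrics can Prometheus collect? A collection of common Exporters".
In a microservice architecture, service discovery is an important module. Commonly used frameworks for it include Consul, Etcd, and ZooKeeper. Today we look at how to monitor Consul; the other frameworks will be covered when time allows.
For monitoring Consul, the Prometheus project provides an official exporter called Consul Exporter, available at https://github.com/prometheus/consul_exporter . The repository has not been updated for a year, but since it is an official Prometheus project it can still be used. If it has still not been updated by October 2022, it should be re-evaluated.
The latest version of Consul Exporter is currently 0.7.1, released on 2020-07-21.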
Installation
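Consul Exporter can run as a binary or from a Docker image. One Consul Exporter per Consul machine is enough. It only needs the address of the Consul server, passed with the consul.server flag. By default it exposes the monitoring data on port 9107.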
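As a rough sketch of starting the binary, the snippet below builds the command line in Python. The binary path ./consul_exporter, the address localhost:8500, and the helper names exporter_command and start_exporter are assumptions made for illustration; adjust them to your environment.

```python
import subprocess


def exporter_command(consul_server="localhost:8500", binary="./consul_exporter"):
    """Build the argv for running the exporter binary.

    Equivalent shell command (path and address are assumptions):
        ./consul_exporter --consul.server=localhost:8500
    """
    return [binary, f"--consul.server={consul_server}"]


def start_exporter(consul_server="localhost:8500"):
    """Start the exporter as a child process and return the process handle.

    Once running, the metrics are served on the default port 9107.
    """
    return subprocess.Popen(exporter_command(consul_server))
```

Calling start_exporter() launches the process; exporter_command() alone is enough to inspect the command before running it.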
It can also be started from the Docker image, with the same consul.server flag passed to the container.
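A possible way to start the container is sketched below. The image name prom/consul-exporter, the Consul address 172.17.0.1:8500, and the helper name docker_command are assumptions for illustration; check the image name and address against your own setup.

```python
def docker_command(consul_server, image="prom/consul-exporter", port=9107):
    """Build the argv for running the exporter from a Docker image.

    Equivalent shell command (image name and address are assumptions):
        docker run -d -p 9107:9107 prom/consul-exporter --consul.server=172.17.0.1:8500
    """
    return [
        "docker", "run", "-d",
        "-p", f"{port}:{port}",            # publish the metrics port on the host
        image,
        f"--consul.server={consul_server}",  # flag is passed through to the exporter
    ]
```

The resulting list can be handed to subprocess.run, for example subprocess.run(docker_command("172.17.0.1:8500"), check=True).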
Consul Exporter accepts many startup flags:
consul.allow_stale: Allows any Consul server (non-leader) to service a read.
consul.ca-file: File path to a PEM-encoded certificate authority used to validate the authenticity of a server certificate.
consul.cert-file: File path to a PEM-encoded certificate used with the private key to verify the exporter's authenticity.
consul.health-summary: Collects information about each registered service and exports consul_catalog_service_node_healthy. This requires n+1 Consul API queries to gather all information about each service. Health check information is available via consul_health_service_status as well, but only for services which have a health check configured. Defaults to true. Disable using --no-consul.health-summary.
consul.key-file: File path to a PEM-encoded private key used with the certificate to verify the exporter's authenticity.
consul.insecure: Disable TLS host verification.
consul.require_consistent: Forces the read to be fully consistent.
consul.server: Address (host and port) of the Consul instance we should connect to. This could be a local agent (localhost:8500, for instance), or the address of a Consul server.
consul.server-name: When provided, this overrides the hostname for the TLS certificate. It can be used to ensure that the certificate name matches the hostname we declare.
consul.timeout: Timeout on HTTP requests to consul.
consul.request-limit: Limit the maximum number of concurrent requests to consul, 0 means no limit.
log.format: Set the log target and format. Example: logger:syslog?appname=bob&local=7 or logger:stdout?json=true
log.level: Logging level. info by default.
web.listen-address: Address to listen on for web interface and telemetry.
web.telemetry-path: Path under which to expose metrics.
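To show how several of these flags combine, here is a small sketch that turns a dictionary of options into command-line flags. The helper name build_flags and all values (the 2s timeout, the listen address, the telemetry path) are hypothetical examples, not recommendations. Boolean options are rendered as --name or --no-name, following the negated form that the list above gives for consul.health-summary.

```python
def build_flags(options):
    """Turn a dict of option names into command-line flags.

    True becomes --name, False becomes --no-name, and any other
    value becomes --name=value.
    """
    flags = []
    for name, value in options.items():
        if value is True:
            flags.append(f"--{name}")
        elif value is False:
            flags.append(f"--no-{name}")
        else:
            flags.append(f"--{name}={value}")
    return flags


# Hypothetical values, for illustration only.
example_flags = build_flags({
    "consul.server": "localhost:8500",
    "consul.timeout": "2s",
    "consul.health-summary": False,   # skip the n+1 per-service queries
    "web.listen-address": ":9107",
    "web.telemetry-path": "/metrics",
})
```

The list in example_flags can be appended to the exporter binary path to form the full command.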
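Consul Exporter supports all the environment variables provided by the official Consul API client, including CONSUL_HTTP_TOKEN for setting the ACL token. The list of supported environment variables can be found at https://github.com/hashicorp/consul/blob/b2478036d88a7e8eb9d6a0daf1a1c9ad0c8885ca/api/api.go#L24-L74 .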
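A minimal sketch of passing the ACL token through the environment when the exporter is launched from Python. The helper name exporter_env and the token value are hypothetical; only the variable name CONSUL_HTTP_TOKEN comes from the text above.

```python
import os


def exporter_env(acl_token, base_env=None):
    """Return a copy of the environment with CONSUL_HTTP_TOKEN set.

    base_env defaults to the current process environment; the
    original mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CONSUL_HTTP_TOKEN"] = acl_token
    return env
```

The result can be passed as the env argument of subprocess.Popen, for example subprocess.Popen(["./consul_exporter"], env=exporter_env("my-token")), where the binary path is again an assumption.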
Metrics
Consul Exporter exposes the following metrics:
consul_up Was the last query of Consul successful
consul_raft_peers How many peers (servers) are in the Raft cluster
consul_serf_lan_members How many members are in the cluster
consul_serf_lan_member_status Status of member in the cluster. 1=Alive, 2=Leaving, 3=Left, 4=Failed. Labels: member
consul_catalog_services How many services are in the cluster
consul_catalog_service_node_healthy Is this service healthy on this node. Labels: service, node
consul_health_node_status Status of health checks associated with a node. Labels: check, node, status
consul_health_service_status Status of health checks associated with a service. Labels: check, node, service, status
consul_catalog_kv The values for selected keys in Consul's key/value catalog. Keys with non-numeric values are omitted. Labels: key
consul_service_checks Link the Consul service ID with check name if available. Labels: service_id, service_name, check_id, check_name, node
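To get a feel for these metrics, the sketch below parses a scrape in the Prometheus text format and checks consul_up and the member status. The parser is deliberately simplified, the scrape text in SAMPLE is made up for illustration, and the names parse_metrics and consul_is_up are hypothetical.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def consul_is_up(samples):
    """True when the scrape contains consul_up with value 1."""
    return any(name == "consul_up" and value == 1.0
               for name, _, value in samples)


# Made-up scrape output, for illustration only.
SAMPLE = """\
# HELP consul_up Was the last query of Consul successful.
# TYPE consul_up gauge
consul_up 1
consul_raft_peers 3
consul_serf_lan_member_status{member="node-a"} 1
consul_serf_lan_member_status{member="node-b"} 4
"""

samples = parse_metrics(SAMPLE)
# Members whose status is 4 (Failed).
failed_members = [labels["member"] for name, labels, value in samples
                  if name == "consul_serf_lan_member_status" and value == 4]
```

In a real setup the text would come from the exporter's metrics endpoint on port 9107 rather than from a constant.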
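Some common calculations over these metrics:
Taking the minimum of consul_catalog_service_node_healthy per service gives 1 when all nodes of the service are healthy (status passing), and 0 when at least one node of the service is in an abnormal state.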
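The per-service minimum can be sketched as follows. The PromQL string in QUERY and the label name service_name are assumptions based on the queries commonly shown for this exporter; the label list above names the label service, so verify the actual label against your exporter's output. The Python function only mimics the aggregation on made-up samples.

```python
# PromQL form of the calculation (label name is an assumption).
QUERY = "min(consul_catalog_service_node_healthy) by (service_name)"


def min_by_service(samples, label="service_name"):
    """Mimic `min(...) by (label)` over (labels, value) pairs."""
    result = {}
    for labels, value in samples:
        key = labels[label]
        # Keep the smallest value seen so far for this service.
        result[key] = min(value, result.get(key, value))
    return result


# Made-up samples: "web" has one unhealthy node, "db" is fully healthy.
samples = [
    ({"service_name": "web", "node": "node-a"}, 1.0),
    ({"service_name": "web", "node": "node-b"}, 0.0),
    ({"service_name": "db", "node": "node-a"}, 1.0),
]
health = min_by_service(samples)
```

Here health maps web to 0.0 because one of its nodes is unhealthy, while db stays at 1.0.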
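Going one step further, selecting the consul_catalog_service_node_healthy series whose value is 0 yields the names of the unhealthy services together with the names of the affected nodes.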
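A sketch of this filter is shown next. As before, the PromQL string and the label name service_name are assumptions to be checked against your exporter's output, the function name failing_service_nodes is hypothetical, and the samples are made up.

```python
# PromQL form of the filter (label names are assumptions).
QUERY = "sum by (node, service_name) (consul_catalog_service_node_healthy == 0)"


def failing_service_nodes(samples):
    """Return sorted (service_name, node) pairs whose healthy value is 0."""
    return sorted(
        (labels["service_name"], labels["node"])
        for labels, value in samples
        if value == 0
    )


# Made-up samples for illustration.
samples = [
    ({"service_name": "web", "node": "node-a"}, 1.0),
    ({"service_name": "web", "node": "node-b"}, 0.0),
    ({"service_name": "db", "node": "node-a"}, 1.0),
    ({"service_name": "cache", "node": "node-c"}, 0.0),
]
failing = failing_service_nodes(samples)
```

Each pair in failing names a service and the node on which it is currently not passing.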
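Selecting consul_health_service_status with the label status set to critical yields the names of the services whose checks are in the critical state.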
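One way to run such a query is through the Prometheus HTTP API. The sketch below only builds the request URL; the Prometheus address http://localhost:9090, the exact PromQL string, and the helper name instant_query_url are assumptions for illustration.

```python
from urllib.parse import urlencode

# PromQL form of the selection (an assumption, adapt as needed).
QUERY = 'consul_health_service_status{status="critical"} == 1'


def instant_query_url(prometheus_base, query):
    """Build a URL for the Prometheus instant-query endpoint /api/v1/query."""
    params = urlencode({"query": query})  # percent-encodes braces, quotes, '='
    return prometheus_base.rstrip("/") + "/api/v1/query?" + params


url = instant_query_url("http://localhost:9090", QUERY)
```

Fetching url with any HTTP client returns a JSON document whose result entries carry the labels of each critical check.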
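Summary
This article briefly covered how to use Consul Exporter and the metrics it exposes. Making good use of these metrics helps to spot problems quickly.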
Copyright notice: This is an original article by InfoQ author 耳东@Erdong.
Original link: http://xie.infoq.cn/article/8e6fa54814fae28ee8a2ce8f7 . Reproduction without the author's permission is prohibited.