2024 Huawei Cloud Open Source Developer Forum Project Preview | Kmesh: A Detailed Look at Metrics and Access Logs

Author: Huawei Cloud Native Team

2024-11-27
Hong Kong, China

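Kmesh is a kernel-native, sidecarless service mesh data plane. Using eBPF and a programmable kernel, it pushes traffic governance down into the operating system kernel, which significantly reduces the resource overhead and network latency of the service mesh.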

With eBPF, traffic data can be read directly in the kernel and passed to user space through a BPF map. Kmesh uses this data to build metrics and access logs.

▍How the raw data is collected

In the kernel, the traffic information carried by a socket can be read directly.

bpf_tcp_sock carries the following data:

struct bpf_tcp_sock {
    __u32 snd_cwnd;         /* Sending congestion window */
    __u32 srtt_us;          /* smoothed round trip time << 3 in usecs */
    __u32 rtt_min;
    __u32 snd_ssthresh;     /* Slow start size threshold */
    __u32 rcv_nxt;          /* What we want to receive next */
    __u32 snd_nxt;          /* Next sequence we send */
    __u32 snd_una;          /* First byte we want an ack for */
    __u32 mss_cache;        /* Cached effective mss, not including SACKS */
    __u32 ecn_flags;        /* ECN status bits. */
    __u32 rate_delivered;   /* saved rate sample: packets delivered */
    __u32 rate_interval_us; /* saved rate sample: time elapsed */
    __u32 packets_out;      /* Packets which are "in flight" */
    __u32 retrans_out;      /* Retransmitted packets out */
    __u32 total_retrans;    /* Total retransmits for entire connection */
    __u32 segs_in;          /* RFC4898 tcpEStatsPerfSegsIn
                             * total number of segments in.
                             */
    __u32 data_segs_in;     /* RFC4898 tcpEStatsPerfDataSegsIn
                             * total number of data segments in.
                             */
    __u32 segs_out;         /* RFC4898 tcpEStatsPerfSegsOut
                             * The total number of segments sent.
                             */
    __u32 data_segs_out;    /* RFC4898 tcpEStatsPerfDataSegsOut
                             * total number of data segments sent.
                             */
    __u32 lost_out;         /* Lost packets */
    __u32 sacked_out;       /* SACK'd packets */
    __u64 bytes_received;   /* RFC4898 tcpEStatsAppHCThruOctetsReceived
                             * sum(delta(rcv_nxt)), or how many bytes
                             * were acked.
                             */
    __u64 bytes_acked;      /* RFC4898 tcpEStatsAppHCThruOctetsAcked
                             * sum(delta(snd_una)), or how many bytes
                             * were acked.
                             */
    __u32 dsack_dups;       /* RFC4898 tcpEStatsStackDSACKDups
                             * total number of DSACK blocks received
                             */
    __u32 delivered;        /* Total data packets delivered incl. rexmits */
    __u32 delivered_ce;     /* Like the above but only ECE marked packets */
    __u32 icsk_retransmits; /* Number of unrecovered [RTO] timeouts */
};

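Note: not all of the data above is used by the metrics and access log features yet. Kmesh will add the remaining fields step by step in later development.

The data used at this stage is: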

struct tcp_probe_info {
    __u32 type;
    struct bpf_sock_tuple tuple;
    __u32 sent_bytes;
    __u32 received_bytes;
    __u32 conn_success;
    __u32 direction;
    __u64 duration; // ns
    __u64 close_ns;
    __u32 state;         /* tcp state */
    __u32 protocol;
    __u32 srtt_us;       /* smoothed round trip time << 3 in usecs */
    __u32 rtt_min;
    __u32 mss_cache;     /* Cached effective mss, not including SACKS */
    __u32 total_retrans; /* Total retransmits for entire connection */
    __u32 segs_in;       /* RFC4898 tcpEStatsPerfSegsIn
                          * total number of segments in.
                          */
    __u32 segs_out;      /* RFC4898 tcpEStatsPerfSegsOut
                          * The total number of segments sent.
                          */
    __u32 lost_out;      /* Lost packets */
};
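
Two of these fields need a unit conversion before they are readable: srtt_us stores the smoothed RTT in microseconds shifted left by 3 (per its comment), and duration is in nanoseconds. The following is a minimal Go sketch of those conversions. The helper names smoothedRTT and connDuration are made up for illustration and are not taken from the Kmesh source.

```go
package main

import (
	"fmt"
	"time"
)

// smoothedRTT converts the srtt_us field into a time.Duration.
// The field holds the smoothed RTT in microseconds shifted left by 3,
// so shifting right by 3 recovers the value in microseconds.
// Hypothetical helper, not part of Kmesh.
func smoothedRTT(srttUs uint32) time.Duration {
	return time.Duration(srttUs>>3) * time.Microsecond
}

// connDuration converts the nanosecond duration field into a time.Duration.
// Hypothetical helper, not part of Kmesh.
func connDuration(durationNs uint64) time.Duration {
	return time.Duration(durationNs)
}

func main() {
	fmt.Println(smoothedRTT(800))      // 800 >> 3 = 100 microseconds
	fmt.Println(connDuration(2733902)) // the duration from the access log sample
}
```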

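Besides the data carried by the socket, Kmesh uses socket_storage to save temporary data when a connection is established. When the connection closes, Kmesh reads that stored data back to derive values such as the connection duration.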

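▍Data processing

After Kmesh collects a connection's data in the kernel, it passes the data to user space through a ringbuf.

In user space, Kmesh parses the ringbuf data, then uses the source and destination service information it carries to update the cache in metricController and to build the metricLabels.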
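
To make the parsing step concrete, here is a minimal Go sketch that decodes one raw record into a struct with the same field order as tcp_probe_info. It rests on several assumptions that the article does not state: the record is little-endian, struct bpf_sock_tuple occupies 36 bytes (the size of its IPv6 variant in the Linux UAPI header), and the C struct has no padding between fields. The names tcpProbeInfo and decodeProbe are hypothetical, and the actual Kmesh decoder may look different.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// tcpProbeInfo mirrors the field order of struct tcp_probe_info.
// Assumption: bpf_sock_tuple is 36 bytes and is kept here as raw bytes.
type tcpProbeInfo struct {
	Type          uint32
	Tuple         [36]byte
	SentBytes     uint32
	ReceivedBytes uint32
	ConnSuccess   uint32
	Direction     uint32
	Duration      uint64 // ns
	CloseNs       uint64
	State         uint32
	Protocol      uint32
	SrttUs        uint32
	RttMin        uint32
	MssCache      uint32
	TotalRetrans  uint32
	SegsIn        uint32
	SegsOut       uint32
	LostOut       uint32
}

// decodeProbe reads the fields sequentially from raw.
// Assumption: little-endian byte order; trailing bytes are ignored.
func decodeProbe(raw []byte) (tcpProbeInfo, error) {
	var info tcpProbeInfo
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &info)
	return info, err
}

func main() {
	// Build a fake record by hand; a real one would come from the ringbuf.
	raw := make([]byte, 112)
	binary.LittleEndian.PutUint32(raw[40:], 5)       // sent_bytes
	binary.LittleEndian.PutUint32(raw[44:], 292)     // received_bytes
	binary.LittleEndian.PutUint64(raw[56:], 2733902) // duration in ns

	info, err := decodeProbe(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(info.SentBytes, info.ReceivedBytes, info.Duration)
}
```

Reading field by field with encoding/binary only matches the C layout because every field here already sits on its natural alignment; a struct with internal padding would need explicit filler fields.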

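The metricLabels are built at both workload granularity and service granularity. The number of workload-granularity metrics, however, can grow to the square of the number of pods in the cluster (1,000 pods allow up to 1,000,000 source/destination pairs), so Kmesh provides a startup switch that lets users enable the metrics and access log features only when they need them.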

namespacedhost := ""
// Find the service whose target port matches the destination port.
for k, portList := range dstWorkload.Services {
    for _, port := range portList.Ports {
        if port.TargetPort == uint32(dstPort) {
            namespacedhost = k
            break
        }
    }
    if namespacedhost != "" {
        break
    }
}
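
The fragment above depends on surrounding Kmesh types, so it cannot run on its own. The sketch below wraps the same lookup in a function over simplified stand-in types. The types port, portList and workload, the function name serviceForPort, and the "namespace/hostname" key format are all assumptions made for illustration.

```go
package main

import "fmt"

// port, portList and workload are simplified, hypothetical stand-ins
// for the types behind dstWorkload in the snippet above.
type port struct {
	ServicePort uint32
	TargetPort  uint32
}

type portList struct {
	Ports []port
}

type workload struct {
	Services map[string]portList
}

// serviceForPort returns the key of a service that has a port whose
// target port equals dstPort, or "" when no service matches.
func serviceForPort(w workload, dstPort uint16) string {
	for name, pl := range w.Services {
		for _, p := range pl.Ports {
			if p.TargetPort == uint32(dstPort) {
				return name
			}
		}
	}
	return ""
}

func main() {
	w := workload{Services: map[string]portList{
		// Assumed key format: "namespace/hostname".
		"default/httpbin.default.svc.cluster.local": {
			Ports: []port{{ServicePort: 8000, TargetPort: 80}},
		},
	}}
	fmt.Println(serviceForPort(w, 80))
}
```

Go randomizes map iteration order, so if several services of one workload share a target port, which one is returned can vary between runs.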

Once the workload-granularity and service-granularity metricLabels are built, the cache is updated.

Every 5 seconds, the metric values are updated in Prometheus through the Prometheus API.
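
The cache-then-flush pattern described here can be sketched with the Go standard library alone. The sketch below accumulates byte counts per label set and hands the batch to a callback on every tick. The type metricCache, its methods, and the export callback are hypothetical; in Kmesh the export step goes through the Prometheus API, which is left out here to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// metricCache accumulates counter deltas per label set between flushes.
// Hypothetical type, not Kmesh's metricController.
type metricCache struct {
	mu      sync.Mutex
	pending map[string]uint64
}

func newMetricCache() *metricCache {
	return &metricCache{pending: map[string]uint64{}}
}

// add records n more bytes for the given label set.
func (c *metricCache) add(labels string, n uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pending[labels] += n
}

// flush passes every accumulated delta to export and resets the cache.
func (c *metricCache) flush(export func(labels string, delta uint64)) {
	c.mu.Lock()
	batch := c.pending
	c.pending = map[string]uint64{}
	c.mu.Unlock()

	for labels, delta := range batch {
		export(labels, delta)
	}
}

// run flushes once per interval until stop is closed.
func (c *metricCache) run(interval time.Duration, stop <-chan struct{}, export func(string, uint64)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			c.flush(export)
		case <-stop:
			return
		}
	}
}

func main() {
	c := newMetricCache()
	c.add(`source_workload="sleep",destination_workload="httpbin"`, 231)
	// A long-running process would call: go c.run(5*time.Second, stop, export)
	c.flush(func(labels string, delta uint64) {
		fmt.Println(labels, delta)
	})
}
```

Swapping the map under the lock keeps the critical section short, so connection events are not blocked while a batch is being exported.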

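Access logs are generated while the metrics are processed. Each time a connection closes, the generated Accesslog is printed to the Kmesh log.

The overall architecture of Kmesh's metrics and access log features is shown in the following figure:

[Figure: overall architecture of Kmesh metrics and access logs]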

▍Metric details

The L4 metrics Kmesh currently monitors are as follows:

Workload granularity:

Service granularity:

Example metric:

kmesh_tcp_workload_received_bytes_total{connection_security_policy="mutual_tls",destination_app="httpbin",destination_canonical_revision="v1",destination_canonical_service="httpbin",destination_cluster="Kubernetes",destination_pod_address="10.244.0.11",destination_pod_name="httpbin-5c5944c58c-v9mlk",destination_pod_namespace="default",destination_principal="-",destination_version="v1",destination_workload="httpbin",destination_workload_namespace="default",reporter="destination",request_protocol="tcp",response_flags="-",source_app="sleep",source_canonical_revision="latest",source_canonical_service="sleep",source_cluster="Kubernetes",source_principal="-",source_version="latest",source_workload="sleep",source_workload_namespace="default"} 231

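The metrics can also be viewed in the Prometheus dashboard. For the steps, see the Kmesh observability documentation.

The fields currently shown in Kmesh access logs are as follows:

Accesslog result: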

accesslog: 2024-09-14 08:19:26.552709932 +0000 UTC src.addr=10.244.0.17:51842, src.workload=prometheus-5fb7f6f8d8-h9cts, src.namespace=istio-system, dst.addr=10.244.0.13:9080, dst.service=productpage.echo-1-27855.svc.cluster.local, dst.workload=productpage-v1-8499c849b9-bz9t9, dst.namespace=echo-1-27855, direction=INBOUND, sent_bytes=5, received_bytes=292, duration=2.733902ms
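
For readers who want to see how a line in this shape can be produced, here is a minimal Go sketch that formats an access log entry with the same fields as the sample above. The accessLog type, its field names, and the sampleLog helper are hypothetical and only reproduce the layout of the sample; they are not taken from the Kmesh source.

```go
package main

import (
	"fmt"
	"time"
)

// accessLog holds the fields that appear in the sample line.
// Hypothetical type, not part of Kmesh.
type accessLog struct {
	Time                                           time.Time
	SrcAddr, SrcWorkload, SrcNamespace             string
	DstAddr, DstService, DstWorkload, DstNamespace string
	Direction                                      string
	SentBytes, ReceivedBytes                       uint32
	Duration                                       time.Duration
}

// String renders the entry in the same layout as the sample line.
func (a accessLog) String() string {
	return fmt.Sprintf(
		"accesslog: %s src.addr=%s, src.workload=%s, src.namespace=%s, "+
			"dst.addr=%s, dst.service=%s, dst.workload=%s, dst.namespace=%s, "+
			"direction=%s, sent_bytes=%d, received_bytes=%d, duration=%s",
		a.Time, a.SrcAddr, a.SrcWorkload, a.SrcNamespace,
		a.DstAddr, a.DstService, a.DstWorkload, a.DstNamespace,
		a.Direction, a.SentBytes, a.ReceivedBytes, a.Duration)
}

// sampleLog rebuilds the entry shown in the sample above.
func sampleLog() accessLog {
	return accessLog{
		Time:          time.Date(2024, 9, 14, 8, 19, 26, 552709932, time.UTC),
		SrcAddr:       "10.244.0.17:51842",
		SrcWorkload:   "prometheus-5fb7f6f8d8-h9cts",
		SrcNamespace:  "istio-system",
		DstAddr:       "10.244.0.13:9080",
		DstService:    "productpage.echo-1-27855.svc.cluster.local",
		DstWorkload:   "productpage-v1-8499c849b9-bz9t9",
		DstNamespace:  "echo-1-27855",
		Direction:     "INBOUND",
		SentBytes:     5,
		ReceivedBytes: 292,
		Duration:      2733902 * time.Nanosecond,
	}
}

func main() {
	fmt.Println(sampleLog())
}
```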

▍Summary

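Kmesh reads traffic data directly from sockets and passes it to user space through a ringbuf to generate metrics and access logs.

This avoids intercepting traffic in user space and lets the metrics be collected natively in the kernel. The metrics are updated in user space periodically and in batches, which avoids adding network latency under heavy traffic.

Next, we will also develop tracing to round out Kmesh's observability capabilities.

Anyone interested is welcome to join the Kmesh open source community!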

Kmesh GitHub: https://github.com/kmesh-net/kmesh

Kmesh Website: https://kmesh.net/

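On December 7, Kmesh technical experts will give the talk "The Future of Service Mesh: Kmesh's Design Philosophy and Evolution Direction" and make a major release announcement at the 2024 Huawei Cloud Open Source Developer Forum. Add the assistant account k8s2222 to register and get a ticket.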
