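Introduction to the garbage collector
The Kubernetes garbage collector lives in kube-controller-manager. It is responsible for reclaiming resource objects in Kubernetes: it watches resource object events, keeps the dependency relationships between objects up to date, and decides, based on an object's deletion policy, whether to delete its associated objects.
More specifically, when an owner is deleted with a cascading deletion policy, the dependents of that owner are deleted along with it.
As for the dependency relationships themselves, the garbage collector watches resource object events and uses the ownerReferences field of each object to build the relationships between objects, that is, the relationships between owners and dependents.
About owners and dependents
Creating a deployment object serves as the example here.
After a deployment object is created, kube-controller-manager creates a replicaset object for it and automatically records the deployment's information in the replicaset's ownerReferences. In the example below, the owner of replicaset test-1-59d7f45ffb is deployment test-1, and the dependent of deployment test-1 is replicaset test-1-59d7f45ffb.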
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-1
  namespace: test
  uid: 4973d370-3221-46a7-8d86-e145bf9ad0ce
...
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: test-1-59d7f45ffb
  namespace: test
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: test-1
    uid: 4973d370-3221-46a7-8d86-e145bf9ad0ce
  uid: 386c380b-490e-470b-a33f-7d5b0bf945fb
...
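Likewise, after the replicaset object is created, kube-controller-manager creates pod objects for it, and each pod records the replicaset's information in its own ownerReferences. The replicaset is the owner of the pods, and the pods are the dependents of the replicaset.
In short, the ownerReferences field of an object defines the relationship between owner and dependent.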
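To make the owner/dependent relationship concrete, here is a minimal sketch that checks ownership by comparing UIDs, using the deployment and replicaset from the example above. The OwnerReference and Object types and the IsOwnedBy helper are hypothetical simplifications written for this illustration; they are not the Kubernetes metav1 types.

```go
package main

import "fmt"

// OwnerReference is a simplified, hypothetical stand-in for the
// ownerReferences entries shown in the YAML above.
type OwnerReference struct {
	Kind       string
	Name       string
	UID        string
	Controller bool
}

// Object is a hypothetical minimal object carrying only the metadata we need.
type Object struct {
	Kind            string
	Name            string
	UID             string
	OwnerReferences []OwnerReference
}

// IsOwnedBy reports whether obj lists owner in its ownerReferences.
// Matching is done by UID, since names can be reused after deletion.
func IsOwnedBy(obj, owner Object) bool {
	for _, ref := range obj.OwnerReferences {
		if ref.UID == owner.UID {
			return true
		}
	}
	return false
}

// sampleObjects returns the deployment and replicaset from the example above.
func sampleObjects() (deploy, rs Object) {
	deploy = Object{Kind: "Deployment", Name: "test-1", UID: "4973d370-3221-46a7-8d86-e145bf9ad0ce"}
	rs = Object{
		Kind: "ReplicaSet",
		Name: "test-1-59d7f45ffb",
		UID:  "386c380b-490e-470b-a33f-7d5b0bf945fb",
		OwnerReferences: []OwnerReference{
			{Kind: "Deployment", Name: "test-1", UID: deploy.UID, Controller: true},
		},
	}
	return deploy, rs
}

func main() {
	deploy, rs := sampleObjects()
	fmt.Println("replicaset owned by deployment:", IsOwnedBy(rs, deploy))
	fmt.Println("deployment owned by replicaset:", IsOwnedBy(deploy, rs))
}
```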
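garbage collector architecture
The most important code of the garbage collector lives in two files: garbagecollector.go and graph_builder.go.
The garbage collector consists mainly of 1 graph (the object dependency graph), 2 processors (GraphBuilder and GarbageCollector), and 3 event queues (graphChanges, attemptToDelete and attemptToOrphan):
1 graph
(1) uidToNode: the object dependency graph, maintained by GraphBuilder, which holds the dependency relationships between all objects. Each k8s object corresponds to one node in the graph, and each node maintains a list of owners and a list of dependents.
Example: given a deployment A, a replicaset B (owner: deployment A) and a pod C (owner: replicaset B), the dependency relationships are as follows:
3 nodes: A, B and C
A maps to a node with no owner and with B in its dependent list;
B maps to a node with A in its owner list and C in its dependent list;
C maps to a node with B in its owner list and no dependents.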
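The A/B/C example can be sketched as a small in-memory graph. This is only an illustration of the idea of a UID-keyed map of nodes with owner and dependent lists; the node and graph types below are hypothetical and much simpler than the real uidToNode structure (for instance, creating a placeholder node for an owner that has not been added yet is a choice of this sketch).

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical, simplified graph node: it knows its owners
// and the nodes that depend on it.
type node struct {
	uid        string
	owners     []string
	dependents map[string]*node
}

// graph maps a UID to its node.
type graph map[string]*node

// get returns the node for uid, creating an empty one if needed.
func (g graph) get(uid string) *node {
	n, ok := g[uid]
	if !ok {
		n = &node{uid: uid, dependents: map[string]*node{}}
		g[uid] = n
	}
	return n
}

// add records an object and registers it as a dependent of each owner.
func (g graph) add(uid string, owners ...string) {
	n := g.get(uid)
	n.owners = owners
	for _, o := range owners {
		g.get(o).dependents[uid] = n
	}
}

// dependentsOf returns the sorted UIDs of the direct dependents of uid.
func (g graph) dependentsOf(uid string) []string {
	out := []string{}
	for d := range g.get(uid).dependents {
		out = append(out, d)
	}
	sort.Strings(out)
	return out
}

// buildExample builds the deployment A -> replicaset B -> pod C graph.
func buildExample() graph {
	g := graph{}
	g.add("A")
	g.add("B", "A")
	g.add("C", "B")
	return g
}

func main() {
	g := buildExample()
	for _, uid := range []string{"A", "B", "C"} {
		fmt.Printf("%s: owners=%v dependents=%v\n", uid, g.get(uid).owners, g.dependentsOf(uid))
	}
}
```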
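2 processors
(1) GraphBuilder: maintains the dependency graph of all objects and produces the events that trigger GarbageCollector to reclaim objects. GraphBuilder consumes events from the graphChanges queue and, based on each object's ownerReferences, builds, updates and deletes entries in the dependency graph, that is, the graph of owner/dependent relationships. It then acts as a producer and puts events into the attemptToDelete or attemptToOrphan queue, which triggers GarbageCollector to check whether associated objects need to be reclaimed. When GarbageCollector reclaims objects, it relies on the uidToNode graph.
(2) GarbageCollector: reclaims and deletes objects. As a consumer, GarbageCollector takes events from the attemptToDelete and attemptToOrphan queues and processes them. If an object has been deleted and its deletion policy is cascading deletion, its associated objects are reclaimed as well, so the dependents of the deleted owner are deleted along with it.
3 event queues
(1) graphChanges: holds the events obtained by list/watch against the apiserver; produced by the informers and consumed by GraphBuilder;
(2) attemptToDelete: the cascading deletion event queue; produced by GraphBuilder and consumed by GarbageCollector;
(3) attemptToOrphan: the orphan deletion event queue; produced by GraphBuilder and consumed by GarbageCollector.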
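The producer/consumer chain described above can be sketched with plain Go channels. This is a toy model under loud assumptions: channels stand in for the rate-limited work queues, the event type and runPipeline function are hypothetical, only the attemptToDelete path is modeled, and the cascade is a single level (in a real cluster, deleting a dependent would itself generate a new delete event).

```go
package main

import "fmt"

// event is a hypothetical, simplified change notification.
type event struct {
	kind  string // "add" or "delete"
	uid   string
	owner string // empty when the object has no owner
}

// runPipeline pushes events through a toy GraphBuilder and returns the UIDs
// that the toy GarbageCollector received on attemptToDelete, in order.
func runPipeline(events []event) []string {
	graphChanges := make(chan event)
	attemptToDelete := make(chan string)

	// Informer role: produce events into graphChanges.
	go func() {
		for _, e := range events {
			graphChanges <- e
		}
		close(graphChanges)
	}()

	// GraphBuilder role: consume graphChanges, maintain the owner -> dependents
	// relation, and enqueue dependents of deleted owners.
	go func() {
		dependents := map[string][]string{}
		for e := range graphChanges {
			switch e.kind {
			case "add":
				if e.owner != "" {
					dependents[e.owner] = append(dependents[e.owner], e.uid)
				}
			case "delete":
				for _, d := range dependents[e.uid] {
					attemptToDelete <- d
				}
				delete(dependents, e.uid)
			}
		}
		close(attemptToDelete)
	}()

	// GarbageCollector role: consume attemptToDelete.
	var deleted []string
	for uid := range attemptToDelete {
		deleted = append(deleted, uid)
	}
	return deleted
}

func main() {
	deleted := runPipeline([]event{
		{kind: "add", uid: "A"},
		{kind: "add", uid: "B", owner: "A"},
		{kind: "add", uid: "C", owner: "B"},
		{kind: "delete", uid: "A"},
	})
	fmt.Println("enqueued for deletion:", deleted)
}
```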
Analysis of the garbage collector startup flags
Among the startup flags of the kcm component, the code for the flags related to the garbage collector is as follows:
// cmd/kube-controller-manager/app/options/garbagecollectorcontroller.go
// AddFlags adds flags related to GarbageCollectorController for controller manager to the specified FlagSet.
func (o *GarbageCollectorControllerOptions) AddFlags(fs *pflag.FlagSet) {
	if o == nil {
		return
	}

	fs.Int32Var(&o.ConcurrentGCSyncs, "concurrent-gc-syncs", o.ConcurrentGCSyncs, "The number of garbage collector workers that are allowed to sync concurrently.")
	fs.BoolVar(&o.EnableGarbageCollector, "enable-garbage-collector", o.EnableGarbageCollector, "Enables the generic garbage collector. MUST be synced with the corresponding flag of the kube-apiserver.")
}
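As the code shows, two kcm startup flags relate to the garbage collector:
(1) enable-garbage-collector: whether to enable the garbage collector; defaults to true;
(2) concurrent-gc-syncs: the number of workers the garbage collector uses for sync operations; defaults to 20.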
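As a rough illustration of how these two flags and their defaults behave, here is a sketch that uses the standard library flag package instead of pflag (a substitution made so the example is self-contained). The gcOptions type and parseGCFlags function are hypothetical names; only the flag names and default values come from the text above.

```go
package main

import (
	"flag"
	"fmt"
)

// gcOptions is a hypothetical holder for the two garbage collector flags.
type gcOptions struct {
	ConcurrentGCSyncs      int
	EnableGarbageCollector bool
}

// parseGCFlags parses args with the defaults described above
// (20 workers, garbage collector enabled).
func parseGCFlags(args []string) (gcOptions, error) {
	opts := gcOptions{ConcurrentGCSyncs: 20, EnableGarbageCollector: true}
	fs := flag.NewFlagSet("gc", flag.ContinueOnError)
	fs.IntVar(&opts.ConcurrentGCSyncs, "concurrent-gc-syncs", opts.ConcurrentGCSyncs,
		"number of garbage collector workers allowed to sync concurrently")
	fs.BoolVar(&opts.EnableGarbageCollector, "enable-garbage-collector", opts.EnableGarbageCollector,
		"enables the generic garbage collector")
	err := fs.Parse(args)
	return opts, err
}

func main() {
	defaults, _ := parseGCFlags(nil)
	fmt.Printf("defaults: %+v\n", defaults)

	custom, err := parseGCFlags([]string{"--concurrent-gc-syncs=5", "--enable-garbage-collector=false"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("custom:   %+v\n", custom)
}
```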
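The source code analysis of the garbage collector is split into two parts:
(1) startup analysis;
(2) core processing logic analysis.
This post covers the startup analysis.
garbage collector source code analysis: startup
Based on tag v1.17.4
https://github.com/kubernetes/kubernetes/releases/tag/v1.17.4
The startGarbageCollectorController function is used as the entry point for the source code analysis.
startGarbageCollectorController
The main logic of startGarbageCollectorController is as follows:
(1) Decide whether to enable the garbage collector based on the EnableGarbageCollector variable, whose value comes from the kcm startup flag --enable-garbage-collector and defaults to true; if it is disabled, the function returns immediately and does nothing further;
(2) Initialize discoveryClient, which is mainly used to discover all resource types in the cluster;
(3) Call garbagecollector.GetDeletableResources to get every resource that the garbage collector needs to handle; a resource that supports the delete, list and watch verbs is called a deletableResource;
(4) Call garbagecollector.NewGarbageCollector to initialize the garbage collector;
(5) Call garbageCollector.Run to start the garbage collector;
(6) Call garbageCollector.Sync, which re-discovers the deletableResources in the cluster every 30 seconds and, when new deletableResources appear, syncs them into the monitors so that all resources in the cluster stay monitored;
(7) Return an HTTP handler that serves a debug endpoint exposing the dependency relationships, built by GraphBuilder, between all objects in the cluster.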
// cmd/kube-controller-manager/app/core.go
func startGarbageCollectorController(ctx ControllerContext) (http.Handler, bool, error) {
	if !ctx.ComponentConfig.GarbageCollectorController.EnableGarbageCollector {
		return nil, false, nil
	}

	gcClientset := ctx.ClientBuilder.ClientOrDie("generic-garbage-collector")
	discoveryClient := cacheddiscovery.NewMemCacheClient(gcClientset.Discovery())

	config := ctx.ClientBuilder.ConfigOrDie("generic-garbage-collector")
	metadataClient, err := metadata.NewForConfig(config)
	if err != nil {
		return nil, true, err
	}

	// Get an initial set of deletable resources to prime the garbage collector.
	deletableResources := garbagecollector.GetDeletableResources(discoveryClient)
	ignoredResources := make(map[schema.GroupResource]struct{})
	for _, r := range ctx.ComponentConfig.GarbageCollectorController.GCIgnoredResources {
		ignoredResources[schema.GroupResource{Group: r.Group, Resource: r.Resource}] = struct{}{}
	}
	garbageCollector, err := garbagecollector.NewGarbageCollector(
		metadataClient,
		ctx.RESTMapper,
		deletableResources,
		ignoredResources,
		ctx.ObjectOrMetadataInformerFactory,
		ctx.InformersStarted,
	)
	if err != nil {
		return nil, true, fmt.Errorf("failed to start the generic garbage collector: %v", err)
	}

	// Start the garbage collector.
	workers := int(ctx.ComponentConfig.GarbageCollectorController.ConcurrentGCSyncs)
	go garbageCollector.Run(workers, ctx.Stop)

	// Periodically refresh the RESTMapper with new discovery information and sync
	// the garbage collector.
	go garbageCollector.Sync(gcClientset.Discovery(), 30*time.Second, ctx.Stop)

	return garbagecollector.NewDebugHandler(garbageCollector), true, nil
}
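Step (3) above defines a deletable resource as one that supports the delete, list and watch verbs. The following sketch shows that filtering rule on its own. The apiResource type, the supportsAll and deletable helpers, and the sample verb lists are hypothetical and written for this illustration; this is not the GetDeletableResources implementation.

```go
package main

import "fmt"

// apiResource is a hypothetical, simplified view of a discovered resource.
type apiResource struct {
	Name  string
	Verbs []string
}

// supportsAll reports whether verbs contains every required verb.
func supportsAll(verbs []string, required ...string) bool {
	have := map[string]bool{}
	for _, v := range verbs {
		have[v] = true
	}
	for _, r := range required {
		if !have[r] {
			return false
		}
	}
	return true
}

// deletable returns the names of resources that support delete, list and watch,
// preserving input order.
func deletable(resources []apiResource) []string {
	out := []string{}
	for _, r := range resources {
		if supportsAll(r.Verbs, "delete", "list", "watch") {
			out = append(out, r.Name)
		}
	}
	return out
}

func main() {
	// The verb lists below are illustrative sample data.
	resources := []apiResource{
		{Name: "pods", Verbs: []string{"create", "delete", "get", "list", "update", "watch"}},
		{Name: "bindings", Verbs: []string{"create"}},
		{Name: "componentstatuses", Verbs: []string{"get", "list"}},
	}
	fmt.Println(deletable(resources))
}
```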
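The following sections expand on parts of the logic in startGarbageCollectorController.
1.garbagecollector.NewGarbageCollector
NewGarbageCollector initializes the garbage collector. Its main logic is as follows:
(1) Initialize the GarbageCollector struct;
(2) Initialize the GraphBuilder struct and assign it to the dependencyGraphBuilder field of the GarbageCollector struct.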
// pkg/controller/garbagecollector/garbagecollector.go
func NewGarbageCollector(
	metadataClient metadata.Interface,
	mapper resettableRESTMapper,
	deletableResources map[schema.GroupVersionResource]struct{},
	ignoredResources map[schema.GroupResource]struct{},
	sharedInformers controller.InformerFactory,
	informersStarted <-chan struct{},
) (*GarbageCollector, error) {
	attemptToDelete := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "garbage_collector_attempt_to_delete")
	attemptToOrphan := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "garbage_collector_attempt_to_orphan")
	absentOwnerCache := NewUIDCache(500)
	gc := &GarbageCollector{
		metadataClient:   metadataClient,
		restMapper:       mapper,
		attemptToDelete:  attemptToDelete,
		attemptToOrphan:  attemptToOrphan,
		absentOwnerCache: absentOwnerCache,
	}
	gb := &GraphBuilder{
		metadataClient:   metadataClient,
		informersStarted: informersStarted,
		restMapper:       mapper,
		graphChanges:     workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "garbage_collector_graph_changes"),
		uidToNode: &concurrentUIDToNode{
			uidToNode: make(map[types.UID]*node),
		},
		attemptToDelete:  attemptToDelete,
		attemptToOrphan:  attemptToOrphan,
		absentOwnerCache: absentOwnerCache,
		sharedInformers:  sharedInformers,
		ignoredResources: ignoredResources,
	}
	if err := gb.syncMonitors(deletableResources); err != nil {
		utilruntime.HandleError(fmt.Errorf("failed to sync all monitors: %v", err))
	}
	gc.dependencyGraphBuilder = gb

	return gc, nil
}
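1.1 gb.syncMonitors
The main job of gb.syncMonitors is to call gb.controllerFor for each of the deletableResources (resources that support the "delete", "list" and "watch" verbs) in order to initialize the informer for that resource and register an eventHandler (AddFunc, UpdateFunc and DeleteFunc) for its change events. Every add, update and delete event of a resource is pushed to the graphChanges queue, and gb.processGraphChanges then takes events from that queue and processes them (this is analyzed in detail later, together with the garbage collector's processing logic).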
// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) syncMonitors(resources map[schema.GroupVersionResource]struct{}) error {
	gb.monitorLock.Lock()
	defer gb.monitorLock.Unlock()

	toRemove := gb.monitors
	if toRemove == nil {
		toRemove = monitors{}
	}
	current := monitors{}
	errs := []error{}
	kept := 0
	added := 0
	for resource := range resources {
		if _, ok := gb.ignoredResources[resource.GroupResource()]; ok {
			continue
		}
		if m, ok := toRemove[resource]; ok {
			current[resource] = m
			delete(toRemove, resource)
			kept++
			continue
		}
		kind, err := gb.restMapper.KindFor(resource)
		if err != nil {
			errs = append(errs, fmt.Errorf("couldn't look up resource %q: %v", resource, err))
			continue
		}
		c, s, err := gb.controllerFor(resource, kind)
		if err != nil {
			errs = append(errs, fmt.Errorf("couldn't start monitor for resource %q: %v", resource, err))
			continue
		}
		current[resource] = &monitor{store: s, controller: c}
		added++
	}
	gb.monitors = current

	for _, monitor := range toRemove {
		if monitor.stopCh != nil {
			close(monitor.stopCh)
		}
	}

	klog.V(4).Infof("synced monitors; added %d, kept %d, removed %d", added, kept, len(toRemove))
	// NewAggregate returns nil if errs is 0-length
	return utilerrors.NewAggregate(errs)
}
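The kept/added/removed bookkeeping in syncMonitors is essentially a set difference between the monitors that already exist and the resources that should be monitored. The sketch below isolates that idea with plain string keys; diffMonitors and its parameters are hypothetical names used only for this illustration, and results are sorted purely to make the output stable.

```go
package main

import (
	"fmt"
	"sort"
)

// diffMonitors compares the existing monitor set with the desired resource set.
// Ignored resources are skipped; an existing monitor that is no longer desired
// (or is ignored) ends up in removed.
func diffMonitors(existing, desired, ignored map[string]bool) (kept, added, removed []string) {
	// Start by assuming every existing monitor must be removed.
	toRemove := map[string]bool{}
	for r := range existing {
		toRemove[r] = true
	}
	for r := range desired {
		if ignored[r] {
			continue
		}
		if toRemove[r] {
			// Already monitored: keep it and take it off the removal list.
			kept = append(kept, r)
			delete(toRemove, r)
			continue
		}
		added = append(added, r)
	}
	for r := range toRemove {
		removed = append(removed, r)
	}
	sort.Strings(kept)
	sort.Strings(added)
	sort.Strings(removed)
	return kept, added, removed
}

func main() {
	existing := map[string]bool{"pods": true, "jobs": true}
	desired := map[string]bool{"pods": true, "deployments": true, "events": true}
	ignored := map[string]bool{"events": true}

	kept, added, removed := diffMonitors(existing, desired, ignored)
	fmt.Printf("kept=%v added=%v removed=%v\n", kept, added, removed)
}
```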
gb.controllerFor
gb.controllerFor initializes the informer for a resource and registers an eventHandler (AddFunc, UpdateFunc and DeleteFunc) for its change events. Every add, update and delete event of the resource is pushed to the graphChanges queue.
// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) controllerFor(resource schema.GroupVersionResource, kind schema.GroupVersionKind) (cache.Controller, cache.Store, error) {
	handlers := cache.ResourceEventHandlerFuncs{
		// add the event to the dependencyGraphBuilder's graphChanges.
		AddFunc: func(obj interface{}) {
			event := &event{
				eventType: addEvent,
				obj:       obj,
				gvk:       kind,
			}
			gb.graphChanges.Add(event)
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			// TODO: check if there are differences in the ownerRefs,
			// finalizers, and DeletionTimestamp; if not, ignore the update.
			event := &event{
				eventType: updateEvent,
				obj:       newObj,
				oldObj:    oldObj,
				gvk:       kind,
			}
			gb.graphChanges.Add(event)
		},
		DeleteFunc: func(obj interface{}) {
			// delta fifo may wrap the object in a cache.DeletedFinalStateUnknown, unwrap it
			if deletedFinalStateUnknown, ok := obj.(cache.DeletedFinalStateUnknown); ok {
				obj = deletedFinalStateUnknown.Obj
			}
			event := &event{
				eventType: deleteEvent,
				obj:       obj,
				gvk:       kind,
			}
			gb.graphChanges.Add(event)
		},
	}
	shared, err := gb.sharedInformers.ForResource(resource)
	if err != nil {
		klog.V(4).Infof("unable to use a shared informer for resource %q, kind %q: %v", resource.String(), kind.String(), err)
		return nil, nil, err
	}
	klog.V(4).Infof("using a shared informer for resource %q, kind %q", resource.String(), kind.String())
	// need to clone because it's from a shared cache
	shared.Informer().AddEventHandlerWithResyncPeriod(handlers, ResourceResyncTime)
	return shared.Informer().GetController(), shared.Informer().GetStore(), nil
}
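One detail worth isolating from the handlers above is the delete path: the object may arrive wrapped in a "final state unknown" tombstone and must be unwrapped before it is queued. The sketch below imitates that step with hypothetical stand-in types (tombstone, event and queue are not the client-go types, and the slice-backed queue is a simplification of the work queue).

```go
package main

import "fmt"

// tombstone is a hypothetical stand-in for a wrapper that carries the last
// known state of an object whose delete notification was missed.
type tombstone struct {
	Key string
	Obj interface{}
}

// event is a hypothetical, simplified queue item.
type event struct {
	eventType string
	obj       interface{}
}

// queue is a minimal slice-backed stand-in for a work queue.
type queue struct {
	items []event
}

// Add appends an event to the queue.
func (q *queue) Add(e event) { q.items = append(q.items, e) }

// onAdd queues an add event for obj.
func onAdd(q *queue, obj interface{}) {
	q.Add(event{eventType: "add", obj: obj})
}

// onDelete unwraps a tombstone if present, then queues a delete event
// carrying the underlying object.
func onDelete(q *queue, obj interface{}) {
	if t, ok := obj.(tombstone); ok {
		obj = t.Obj
	}
	q.Add(event{eventType: "delete", obj: obj})
}

func main() {
	q := &queue{}
	onAdd(q, "pod-1")
	onDelete(q, tombstone{Key: "test/pod-1", Obj: "pod-1"})
	for _, e := range q.items {
		fmt.Printf("%s %v\n", e.eventType, e.obj)
	}
}
```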
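2.garbageCollector.Run
garbageCollector.Run starts the garbage collector. Its main logic is as follows:
(1) Call gc.dependencyGraphBuilder.Run to start GraphBuilder;
(2) Start as many goroutines as the configured number of workers, running gc.runAttemptToDeleteWorker and gc.runAttemptToOrphanWorker. Both belong to the core processing logic of GarbageCollector and handle the objects that need to be reclaimed; they are analyzed in detail in the next post.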
// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) Run(workers int, stopCh <-chan struct{}) {
	defer utilruntime.HandleCrash()
	defer gc.attemptToDelete.ShutDown()
	defer gc.attemptToOrphan.ShutDown()
	defer gc.dependencyGraphBuilder.graphChanges.ShutDown()

	klog.Infof("Starting garbage collector controller")
	defer klog.Infof("Shutting down garbage collector controller")

	go gc.dependencyGraphBuilder.Run(stopCh)

	if !cache.WaitForNamedCacheSync("garbage collector", stopCh, gc.dependencyGraphBuilder.IsSynced) {
		return
	}

	klog.Infof("Garbage collector: all resource monitors have synced. Proceeding to collect garbage")

	// gc workers
	for i := 0; i < workers; i++ {
		go wait.Until(gc.runAttemptToDeleteWorker, 1*time.Second, stopCh)
		go wait.Until(gc.runAttemptToOrphanWorker, 1*time.Second, stopCh)
	}

	<-stopCh
}
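2.1 gc.dependencyGraphBuilder.Run
gc.dependencyGraphBuilder.Run starts GraphBuilder. Its main logic is as follows:
(1) Call gb.startMonitors to start the informers mentioned in 1.1 gb.syncMonitors above;
(2) Call gb.runProcessGraphChanges in a loop through wait.Until with a 1s period; this is the core processing logic of GraphBuilder and is analyzed in the next post.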
// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) Run(stopCh <-chan struct{}) {
	klog.Infof("GraphBuilder running")
	defer klog.Infof("GraphBuilder stopping")

	// Set up the stop channel.
	gb.monitorLock.Lock()
	gb.stopCh = stopCh
	gb.running = true
	gb.monitorLock.Unlock()

	// Start monitors and begin change processing until the stop channel is
	// closed.
	gb.startMonitors()
	wait.Until(gb.runProcessGraphChanges, 1*time.Second, stopCh)

	// Stop any running monitors.
	gb.monitorLock.Lock()
	defer gb.monitorLock.Unlock()
	monitors := gb.monitors
	stopped := 0
	for _, monitor := range monitors {
		if monitor.stopCh != nil {
			stopped++
			close(monitor.stopCh)
		}
	}

	// reset monitors so that the graph builder can be safely re-run/synced.
	gb.monitors = nil
	klog.Infof("stopped %d of %d monitors", stopped, len(monitors))
}
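Both Run functions rely on the "run this function, wait a period, run it again, until the stop channel closes" pattern. The sketch below reimplements that pattern with only the standard library to show its shape; until and countRuns are hypothetical names, and this is not the apimachinery wait.Until implementation (which has additional behavior such as crash handling).

```go
package main

import (
	"fmt"
	"time"
)

// until calls f repeatedly, sleeping for period between calls,
// and returns once stopCh is closed.
func until(f func(), period time.Duration, stopCh <-chan struct{}) {
	for {
		// Check for stop before each run.
		select {
		case <-stopCh:
			return
		default:
		}

		f()

		// Wait for the next period, or return early on stop.
		select {
		case <-stopCh:
			return
		case <-time.After(period):
		}
	}
}

// countRuns runs a worker that closes its own stop channel after limit
// iterations and returns how many times it ran. limit must be at least 1.
func countRuns(limit int) int {
	stop := make(chan struct{})
	n := 0
	until(func() {
		n++
		if n == limit {
			close(stop)
		}
	}, time.Millisecond, stop)
	return n
}

func main() {
	fmt.Println("worker ran", countRuns(3), "times before stop")
}
```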
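3.garbageCollector.Sync
garbageCollector.Sync periodically queries all deletableResources in the cluster and calls gc.resyncMonitors to update GraphBuilder's monitors: for newly added resources it initializes an informer, registers an eventHandler and then starts the informer; for resources that have been removed it tears down their monitors.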
// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) Sync(discoveryClient discovery.ServerResourcesInterface, period time.Duration, stopCh <-chan struct{}) {
	oldResources := make(map[schema.GroupVersionResource]struct{})
	wait.Until(func() {
		// Get the current resource list from discovery.
		newResources := GetDeletableResources(discoveryClient)
		...
			if err := gc.resyncMonitors(newResources); err != nil {
				utilruntime.HandleError(fmt.Errorf("failed to sync resource monitors (attempt %d): %v", attempt, err))
				return false, nil
			}
			klog.V(4).Infof("resynced monitors")
		...
3.1 gc.resyncMonitors
Calls gc.dependencyGraphBuilder.syncMonitors to initialize the informers and register the eventHandlers;
Calls gc.dependencyGraphBuilder.startMonitors to start the informers.
// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) resyncMonitors(deletableResources map[schema.GroupVersionResource]struct{}) error {
	if err := gc.dependencyGraphBuilder.syncMonitors(deletableResources); err != nil {
		return err
	}
	gc.dependencyGraphBuilder.startMonitors()
	return nil
}
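4.garbagecollector.NewDebugHandler
garbagecollector.NewDebugHandler returns an HTTP handler that serves a debug endpoint. It is used for debugging and exposes the dependency relationships, built by GraphBuilder, between all objects in the cluster.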
// pkg/controller/garbagecollector/dump.go
func NewDebugHandler(controller *GarbageCollector) http.Handler {
	return &debugHTTPHandler{controller: controller}
}

type debugHTTPHandler struct {
	controller *GarbageCollector
}

func (h *debugHTTPHandler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	if req.URL.Path != "/graph" {
		http.Error(w, "", http.StatusNotFound)
		return
	}

	var graph graph.Directed
	if uidStrings := req.URL.Query()["uid"]; len(uidStrings) > 0 {
		uids := []types.UID{}
		for _, uidString := range uidStrings {
			uids = append(uids, types.UID(uidString))
		}
		graph = h.controller.dependencyGraphBuilder.uidToNode.ToGonumGraphForObj(uids...)

	} else {
		graph = h.controller.dependencyGraphBuilder.uidToNode.ToGonumGraph()
	}

	data, err := dot.Marshal(graph, "full", "", " ")
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/vnd.graphviz")
	w.Header().Set("X-Content-Type-Options", "nosniff")
	w.Write(data)
	w.WriteHeader(http.StatusOK)
}
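The request handling in ServeHTTP (404 for any path other than /graph, optional uid query parameters, Graphviz content type) can be tried out locally with a toy handler and the standard httptest package. Everything below is a hypothetical sketch: graphHandler, fetch and hasEdge are illustration-only names, the DOT text is written by hand rather than produced by the gonum library, the edge direction is an arbitrary choice, and the uid filter simply keeps edges that touch a requested UID.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"strings"
)

// graphHandler serves a toy owner -> dependents map as DOT text on /graph.
func graphHandler(dependents map[string][]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		if req.URL.Path != "/graph" {
			http.Error(w, "", http.StatusNotFound)
			return
		}
		// Optional filter: only keep edges touching one of the requested UIDs.
		wanted := map[string]bool{}
		for _, uid := range req.URL.Query()["uid"] {
			wanted[uid] = true
		}
		owners := make([]string, 0, len(dependents))
		for o := range dependents {
			owners = append(owners, o)
		}
		sort.Strings(owners)

		w.Header().Set("Content-Type", "text/vnd.graphviz")
		fmt.Fprintln(w, "digraph full {")
		for _, o := range owners {
			for _, d := range dependents[o] {
				if len(wanted) > 0 && !wanted[o] && !wanted[d] {
					continue
				}
				fmt.Fprintf(w, "  %q -> %q;\n", o, d)
			}
		}
		fmt.Fprintln(w, "}")
	})
}

// fetch issues an in-memory GET request and returns status code and body.
func fetch(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

// hasEdge reports whether body contains the edge from -> to.
func hasEdge(body, from, to string) bool {
	return strings.Contains(body, fmt.Sprintf("%q -> %q;", from, to))
}

func main() {
	h := graphHandler(map[string][]string{"A": {"B"}, "B": {"C"}})
	code, body := fetch(h, "/graph?uid=C")
	fmt.Println(code)
	fmt.Print(body)
}
```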
Getting the object dependency graph
To get the full object dependency graph:
curl http://{master_ip}:{kcm_port}/debug/controllers/garbagecollector/graph -o {output_file}
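To get the dependency graph for a specific uid (the URL is quoted so that the shell does not interpret the ? character):
curl "http://{master_ip}:{kcm_port}/debug/controllers/garbagecollector/graph?uid={project_uid}" -o {output_file}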
Example:
curl "http://192.168.1.10:10252/debug/controllers/garbagecollector/graph?uid=8727f640-112e-21eb-11dd-626400510df6" -o /home/test
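Summary
Introduction to the garbage collector
The Kubernetes garbage collector lives in kube-controller-manager. It is responsible for reclaiming resource objects in Kubernetes: it watches resource object events, keeps the dependency relationships between objects up to date, and decides, based on an object's deletion policy, whether to delete its associated objects.
garbage collector architecture
The garbage collector consists mainly of 1 graph (the object dependency graph), 2 processors (GraphBuilder and GarbageCollector), and 3 event queues (graphChanges, attemptToDelete and attemptToOrphan).
garbage collector startup analysis
Starting the garbage collector mainly means starting the 2 processors (GraphBuilder and GarbageCollector) and defining the object dependency graph and the 3 event queues (graphChanges, attemptToDelete and attemptToOrphan).
Events obtained by list/watch from the apiserver are put into the graphChanges queue. GraphBuilder takes events from graphChanges, builds the object dependency graph, and, depending on the object's deletion policy, puts the associated objects into the attemptToDelete or attemptToOrphan queue. GarbageCollector then takes events from attemptToDelete and attemptToOrphan, looks up the required information in the object dependency graph, and finally reclaims the objects.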