
Confusion about k8s pod eviction

Geek_f24c45
Posted: 1 hour ago

kube-scheduler watches pod delete events and, when one arrives, subtracts the pod's requests from the node's resource accounting.


So a pod in the Terminating state has not yet produced a delete event, and its resources are still counted against the node.


What happens on eviction, then? When kubelet pressure eviction kicks in, we see a large number of pods in the Failed state, and they do not occupy node resources; otherwise the node would have filled up long ago.


Let's run an experiment:

  1. Create a pod that requests all of the node's remaining resources and is pinned to that one node.

  2. Push the node into a disk-pressure condition.

Eviction then occurs, and a handful of Evicted pods are produced over and over, until the node is tainted with disk-pressure and no more pods are scheduled onto it.

Now suppose our pod tolerates all taints. Then the cycle never ends: the scheduler keeps placing pods on the node, kubelet keeps rejecting them, and Evicted pods keep accumulating.
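Tolerating every taint is easy to express: a toleration with `operator: Exists` and no `key` matches all taints. A pod-template fragment for the experiment might look like:

```yaml
# Pod spec fragment: tolerate every taint, including
# node.kubernetes.io/disk-pressure, so the scheduler keeps placing
# the pod on the pressured node and the kubelet keeps evicting it.
tolerations:
- operator: "Exists"
```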


From the scheduler's point of view, an Evicted pod no longer occupies node resources; otherwise it could not keep scheduling pods onto that node.


Yet we can still see the pod: it has not been deleted, and no delete event is emitted. So how does kube-scheduler subtract its resources?


Let's watch the Evicted pod's transitions with an informer of our own.


At eviction time there is only a status-update event; no delete event appears:

```
ERROR: logging before flag.Parse: I0917 10:39:09.988530    8591 main.go:45] POD CREATED: default/whoami-last1-5f6b56b7cd-fwfvd
ERROR: logging before flag.Parse: I0917 10:39:09.988575    8591 main.go:46] POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:09.988582    8591 main.go:47] POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:09.988588    8591 main.go:48] Add POD Deletetime: <nil>
ERROR: logging before flag.Parse: I0917 10:39:09.995592    8591 main.go:57] POD UPDATED. default/whoami-last1-5f6b56b7cd-fwfvd Pending
ERROR: logging before flag.Parse: I0917 10:39:09.995607    8591 main.go:61] old POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:09.995612    8591 main.go:62] old POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:09.995618    8591 main.go:64] new POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:09.995623    8591 main.go:65] new POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:09.995628    8591 main.go:67] Update POD Deletetime: <nil>
ERROR: logging before flag.Parse: I0917 10:39:10.574433    8591 main.go:57] POD UPDATED. default/whoami-last1-5f6b56b7cd-fwfvd Failed
ERROR: logging before flag.Parse: I0917 10:39:10.574466    8591 main.go:61] old POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:10.574471    8591 main.go:62] old POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:10.574476    8591 main.go:64] new POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:39:10.574481    8591 main.go:65] new POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:39:10.574486    8591 main.go:67] Update POD Deletetime: <nil>
```

Only when we `kubectl delete pod` the evicted pod, or when garbage collection kicks in, does a real delete event arrive:

```
ERROR: logging before flag.Parse: I0917 10:46:35.529885 12414 main.go:57] POD UPDATED. default/whoami-last1-5f6b56b7cd-fwfvd Failed
ERROR: logging before flag.Parse: I0917 10:46:35.529928 12414 main.go:61] old POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:46:35.529934 12414 main.go:62] old POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:46:35.529940 12414 main.go:64] new POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:46:35.529945 12414 main.go:65] new POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:46:35.529950 12414 main.go:67] Update POD Deletetime: 2021-09-17 10:46:35 +0800 CST
ERROR: logging before flag.Parse: I0917 10:46:35.533745 12414 main.go:76] POD DELETED: default/whoami-last1-5f6b56b7cd-fwfvd
ERROR: logging before flag.Parse: I0917 10:46:35.533758 12414 main.go:77] POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:46:35.533764 12414 main.go:78] POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:46:35.533769 12414 main.go:79] POD Deletetime: 2021-09-17 10:46:35 +0800 CST
```


By the logic above, eviction produces no immediate delete event, so kube-scheduler should not be able to release the resources right away. The theory doesn't add up...


kube-scheduler's own log, however, shows this sequence:

```
3758 eventhandlers.go:172] add event for unscheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
3758 scheduling_queue.go:812] About to try and schedule pod default/whoami-last1-5f6b56b7cd-fwfvd
3758 scheduler.go:466] Attempting to schedule pod: default/whoami-last1-5f6b56b7cd-fwfvd
3758 fixed_nodes.go:175] FixedNodes (whoami-last1-5f6b56b7cd-fwfvd) Get DeplyFixedNodes failed, err: deployfixednodes
3758 fixed_nodes.go:149] FixedNodes (whoami-last1-5f6b56b7cd-fwfvd) Get preFilterState failed, err: reading
kube-scheduler[3758]: I0917 10:39:10.009568    3758 default_binder.go:51] Attempting to bind default/
3758 cache.go:396] Finished binding for pod b7054b96-ae9f-428f-a4b7-d67a27a60878. Can be expired.
3758 scheduler.go:617] "Successfully bound pod to node" pod="default/whoami-last1-5f6b56b7cd-fwfvd" node="61.147.184.90"
3758 eventhandlers.go:209] delete event for unscheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
3758 eventhandlers.go:229] add event for scheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
// kube-scheduler received the delete event immediately....
3758 eventhandlers.go:287] delete event for scheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
```


With that event in hand, the resource subtraction makes sense. But why does kube-scheduler receive a delete event before the pod is actually deleted, while my informer never does?
