kube-scheduler watches for pod delete events, and only then subtracts the pod's requests from its per-node resource accounting.
So a pod stuck in Terminating, for which no delete event has arrived yet, still counts against the node's resources.
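A minimal sketch of that bookkeeping, with made-up types (the real logic lives in the scheduler's internal cache, `pkg/scheduler/internal/cache` upstream): requests are charged on the Add event and released only on the Delete event.

```go
package schedcache

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// nodeUsage is an illustrative stand-in for the per-node accounting the
// scheduler keeps in its internal cache; not the real scheduler types.
type nodeUsage struct {
	cpu resource.Quantity
	mem resource.Quantity
}

// addPod runs on a pod Add event: the pod's requests are charged to its node.
func (u *nodeUsage) addPod(pod *v1.Pod) {
	for _, c := range pod.Spec.Containers {
		u.cpu.Add(*c.Resources.Requests.Cpu())
		u.mem.Add(*c.Resources.Requests.Memory())
	}
}

// removePod runs on a pod Delete event; until that event arrives, a
// Terminating pod keeps counting against the node.
func (u *nodeUsage) removePod(pod *v1.Pod) {
	for _, c := range pod.Spec.Containers {
		u.cpu.Sub(*c.Resources.Requests.Cpu())
		u.mem.Sub(*c.Resources.Requests.Memory())
	}
}
```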
So what happens when a pod is evicted? When kubelet's pressure eviction kicks in, we see it produce a lot of pods in the Failed phase, and those pods do not count against node resources; otherwise the node would have filled up long ago.
Let's run an experiment:

- Create a pod that requests enough resources to fill the node, and that can only land on that one node (a sketch of such a pod follows below).
- Drive the node into disk pressure.

Eviction then kicks in, and several Evicted pods get produced in a row, until the node is tainted with disk pressure and no more pods are scheduled onto it.
Now suppose this pod tolerates all taints. Then the cycle never ends: the scheduler keeps placing pods onto the node, kubelet keeps rejecting them, and Evicted pods keep piling up.
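Roughly the pod used here, as a sketch. In the experiment it is actually managed by a Deployment (hence the `whoami-last1-5f6b56b7cd-*` names below), so a replacement is created each time one is evicted; the image, request sizes, and hostname label value are placeholders.

```go
package experiment

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// experimentPod sketches the experiment pod: big enough to fill the node,
// pinned to one node via nodeSelector, and tolerating every taint.
func experimentPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{GenerateName: "whoami-last1-", Namespace: "default"},
		Spec: corev1.PodSpec{
			// Pin to the node under test (label value is a placeholder).
			NodeSelector: map[string]string{"kubernetes.io/hostname": "61.147.184.90"},
			Containers: []corev1.Container{{
				Name:  "whoami",
				Image: "traefik/whoami",
				Resources: corev1.ResourceRequirements{
					Requests: corev1.ResourceList{
						corev1.ResourceCPU:    resource.MustParse("7"),
						corev1.ResourceMemory: resource.MustParse("30Gi"),
					},
				},
			}},
			// An empty key with operator Exists tolerates all taints,
			// including node.kubernetes.io/disk-pressure.
			Tolerations: []corev1.Toleration{{Operator: corev1.TolerationOpExists}},
		},
	}
}
```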
From the scheduler's point of view, then, Evicted pods no longer occupy node resources; otherwise it could not keep scheduling pods onto that node.
Yet we can still see the pod: it has not been deleted, and no delete event has fired for it. So how does kube-scheduler subtract its resources?
Let's watch the evicted pod's events through an informer of our own, roughly the following (a reconstruction from the log output below; the kubeconfig path and handler details are inferred):
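```go
package main

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/klog"
)

func main() {
	// Kubeconfig path is a placeholder.
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		klog.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Plain pod informer: no filtering, sees every pod in the cluster.
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*v1.Pod)
			klog.Infof("POD CREATED: %s/%s", pod.Namespace, pod.Name)
			klog.Infof("POD Phase: %s", pod.Status.Phase)
			klog.Infof("POD Reason: %s", pod.Status.Reason)
			klog.Infof("Add POD Deletetime: %v", pod.DeletionTimestamp)
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldPod := oldObj.(*v1.Pod)
			newPod := newObj.(*v1.Pod)
			klog.Infof("POD UPDATED. %s/%s %s", newPod.Namespace, newPod.Name, newPod.Status.Phase)
			klog.Infof("old POD Phase: %s", oldPod.Status.Phase)
			klog.Infof("old POD Reason: %s", oldPod.Status.Reason)
			klog.Infof("new POD Phase: %s", newPod.Status.Phase)
			klog.Infof("new POD Reason: %s", newPod.Status.Reason)
			klog.Infof("Update POD Deletetime: %v", newPod.DeletionTimestamp)
		},
		DeleteFunc: func(obj interface{}) {
			pod, ok := obj.(*v1.Pod)
			if !ok {
				return // tombstone (cache.DeletedFinalStateUnknown), ignored here
			}
			klog.Infof("POD DELETED: %s/%s", pod.Namespace, pod.Name)
			klog.Infof("POD Phase: %s", pod.Status.Phase)
			klog.Infof("POD Reason: %s", pod.Status.Reason)
			klog.Infof("POD Deletetime: %v", pod.DeletionTimestamp)
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	<-stop
}
```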
At the moment of eviction, there are only status-update events; there is no delete event:
```
ERROR: logging before flag.Parse: I0917 10:39:09.988530 8591 main.go:45] POD CREATED: default/whoami-last1-5f6b56b7cd-fwfvd
ERROR: logging before flag.Parse: I0917 10:39:09.988575 8591 main.go:46] POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:09.988582 8591 main.go:47] POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:09.988588 8591 main.go:48] Add POD Deletetime: <nil>
ERROR: logging before flag.Parse: I0917 10:39:09.995592 8591 main.go:57] POD UPDATED. default/whoami-last1-5f6b56b7cd-fwfvd Pending
ERROR: logging before flag.Parse: I0917 10:39:09.995607 8591 main.go:61] old POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:09.995612 8591 main.go:62] old POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:09.995618 8591 main.go:64] new POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:09.995623 8591 main.go:65] new POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:09.995628 8591 main.go:67] Update POD Deletetime: <nil>
ERROR: logging before flag.Parse: I0917 10:39:10.574433 8591 main.go:57] POD UPDATED. default/whoami-last1-5f6b56b7cd-fwfvd Failed
ERROR: logging before flag.Parse: I0917 10:39:10.574466 8591 main.go:61] old POD Phase: Pending
ERROR: logging before flag.Parse: I0917 10:39:10.574471 8591 main.go:62] old POD Reason:
ERROR: logging before flag.Parse: I0917 10:39:10.574476 8591 main.go:64] new POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:39:10.574481 8591 main.go:65] new POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:39:10.574486 8591 main.go:67] Update POD Deletetime: <nil>
```
Only when we `kubectl delete pod` the evicted pod, or when the pod garbage collector cleans it up, does a real delete event appear:
```
ERROR: logging before flag.Parse: I0917 10:46:35.529885 12414 main.go:57] POD UPDATED. default/whoami-last1-5f6b56b7cd-fwfvd Failed
ERROR: logging before flag.Parse: I0917 10:46:35.529928 12414 main.go:61] old POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:46:35.529934 12414 main.go:62] old POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:46:35.529940 12414 main.go:64] new POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:46:35.529945 12414 main.go:65] new POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:46:35.529950 12414 main.go:67] Update POD Deletetime: 2021-09-17 10:46:35 +0800 CST
ERROR: logging before flag.Parse: I0917 10:46:35.533745 12414 main.go:76] POD DELETED: default/whoami-last1-5f6b56b7cd-fwfvd
ERROR: logging before flag.Parse: I0917 10:46:35.533758 12414 main.go:77] POD Phase: Failed
ERROR: logging before flag.Parse: I0917 10:46:35.533764 12414 main.go:78] POD Reason: Evicted
ERROR: logging before flag.Parse: I0917 10:46:35.533769 12414 main.go:79] POD Deletetime: 2021-09-17 10:46:35 +0800 CST
```
By the logic above, eviction produces no immediate delete event, so kube-scheduler should have no way to subtract the resources right away either. The theory doesn't add up... Let's check kube-scheduler's own log:
```
3758 eventhandlers.go:172] add event for unscheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
3758 scheduling_queue.go:812] About to try and schedule pod default/whoami-last1-5f6b56b7cd-fwfvd
3758 scheduler.go:466] Attempting to schedule pod: default/whoami-last1-5f6b56b7cd-fwfvd
3758 fixed_nodes.go:175] FixedNodes (whoami-last1-5f6b56b7cd-fwfvd) Get DeplyFixedNodes failed, err: deployfixednodes
3758 fixed_nodes.go:149] FixedNodes (whoami-last1-5f6b56b7cd-fwfvd) Get preFilterState failed, err: reading
kube-scheduler[3758]: I0917 10:39:10.009568 3758 default_binder.go:51] Attempting to bind default/
3758 cache.go:396] Finished binding for pod b7054b96-ae9f-428f-a4b7-d67a27a60878. Can be expired.
3758 scheduler.go:617] "Successfully bound pod to node" pod="default/whoami-last1-5f6b56b7cd-fwfvd" node="61.147.184.90"
3758 eventhandlers.go:209] delete event for unscheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
3758 eventhandlers.go:229] add event for scheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
// kube-scheduler received the delete event right away....
3758 eventhandlers.go:287] delete event for scheduled pod default/whoami-last1-5f6b56b7cd-fwfvd
```
With that delete event, the resource subtraction makes sense after all. But why does kube-scheduler receive a delete event when the pod has not actually been deleted? And why does my informer never see one?
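One hypothesis that fits both observations (worth verifying against the scheduler version in use): kube-scheduler does not watch pods with a plain informer the way our tool does. In upstream releases around this era (~v1.19/v1.20), the scheduler builds its pod informer with a field selector, `status.phase!=Succeeded,status.phase!=Failed` (see `NewPodInformer` in `pkg/scheduler`). For a watch with a field selector, the API server delivers an object that stops matching the selector as a DELETED watch event, even though the object still exists in etcd; a plain, unfiltered informer sees the same transition only as an Update. A sketch that reproduces the scheduler's view (kubeconfig path is a placeholder):

```go
package main

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/klog"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config") // placeholder
	if err != nil {
		klog.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Same field selector the scheduler's pod informer uses around v1.19:
	// terminal pods are filtered out server-side.
	factory := informers.NewSharedInformerFactoryWithOptions(client, 0,
		informers.WithTweakListOptions(func(o *metav1.ListOptions) {
			o.FieldSelector = "status.phase!=Succeeded,status.phase!=Failed"
		}))

	factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		DeleteFunc: func(obj interface{}) {
			pod, ok := obj.(*v1.Pod)
			if !ok {
				return // tombstone (cache.DeletedFinalStateUnknown)
			}
			// With the selector above, this should fire as soon as the pod
			// turns Failed/Evicted, before any real deletion from etcd.
			klog.Infof("DELETED (for this watcher): %s/%s phase=%s",
				pod.Namespace, pod.Name, pod.Status.Phase)
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	<-stop
}
```

If this is right, the "delete event" the scheduler acts on is an artifact of its filtered watch, not of the pod object being removed, which would explain why resources are released at eviction time while the Evicted pod remains visible.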