1. Introduction
Kubernetes, abbreviated as K8s (the 8 replaces the eight letters "ubernete" between the K and the s), is an open-source system for managing containerized applications across multiple hosts on a cloud platform. Its goal is to make deploying containerized applications simple and powerful, and it provides mechanisms for deploying, scheduling, updating, and maintaining applications.
This article covers installation with kubeadm. Reference: https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
2. Preparation
Three KVM virtual machines were prepared earlier: 192.168.2.31, 192.168.2.32, 192.168.2.33.
For creating the VMs, see: Installing KVM virtual machines on Ubuntu Server 20.04.
Update the environment:
sudo apt update
sudo apt -y upgrade && sudo systemctl reboot
2.1 Set the hostname on each node
Setting hostnames on the nodes substitutes for DNS resolution.
sudo hostnamectl set-hostname "k8s-master" #192.168.2.31
sudo hostnamectl set-hostname "k8s-node-1" #192.168.2.32
sudo hostnamectl set-hostname "k8s-node-2" #192.168.2.33
2.2 Update the hosts file on each node
Edit /etc/hosts:
$ sudo vi /etc/hosts
192.168.2.31 k8s-master
192.168.2.32 k8s-node-1
192.168.2.33 k8s-node-2
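To sanity-check the entries, ping a peer by name from any node (any of the three hostnames from above works):
$ ping -c 1 k8s-node-1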
2.3 Disable swap
This is a mandatory step for installing k8s. Why disable it? In short, the kubelet does not support swap by default: with swap on, memory accounting and performance become unpredictable.
Temporarily disable it:
$ sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
$ sudo swapoff -a
Permanently disable it:
$ sudo vi /etc/fstab #comment out the swap line
#/swap.img none swap sw 0 0
Verify that swap is off:
$ free -h
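With swap off, the Swap row should read all zeros; the memory figures below are illustrative and will differ per machine:
              total        used        free      shared  buff/cache   available
Mem:          3.8Gi       250Mi       3.1Gi       1.0Mi       500Mi       3.4Gi
Swap:            0B          0B          0B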
2.4 Disable the firewall
Disable the firewall:
$ sudo ufw disable
Check the firewall status:
$ sudo ufw status
2.5 Set kernel parameters
Make sure the br_netfilter kernel module is loaded and that net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl configuration. Check that the module is loaded:
$ lsmod | grep br_netfilter
If anything is missing or differs, run the following to fix it:
sudo modprobe overlay
sudo modprobe br_netfilter
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
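To confirm the settings took effect, query them back; each should print 1:
$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward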
3. Installation
3.1 Install Docker
#Add the repository GPG key
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
#Use the Aliyun mirror
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install -y docker.io
# Create required directories
sudo mkdir -p /etc/systemd/system/docker.service.d
# Create daemon json config file
sudo tee /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
# Start and enable Services
sudo systemctl daemon-reload
sudo systemctl enable docker.service --now
sudo systemctl restart docker
sudo systemctl status docker
docker --version
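kubeadm expects the kubelet and the container runtime to use the same cgroup driver, so it is worth confirming that the systemd driver configured above is now active:
$ docker info | grep -i 'cgroup driver'
 Cgroup Driver: systemd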
3.2 Install the required packages
Set up the Kubernetes repository:
sudo apt update
sudo apt install -y apt-transport-https curl gnupg2 software-properties-common ca-certificates
#curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add - #use the Aliyun mirror
#sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
#Aliyun mirror source
sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
Install kubelet, kubeadm, and kubectl:
sudo apt update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl #hold to block automatic upgrades
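To pin a specific release rather than the latest, the package versions can be stated explicitly; the 1.22.1-00 strings below are illustrative, so check what apt-cache madison kubelet actually lists:
$ sudo apt-get install -y kubelet=1.22.1-00 kubeadm=1.22.1-00 kubectl=1.22.1-00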
Check the versions:
$ kubectl version --client && kubeadm version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:44:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
3.3 Pull the images needed for node initialization
Enable the kubelet:
sudo systemctl enable kubelet
Initialization needs to pull several images. It is best to fetch them before running init; otherwise the pull fails easily and kubeadm init just hangs on the master.
$ sudo kubeadm init --pod-network-cidr 10.5.0.0/16 --control-plane-endpoint=vm-ubuntu-06
If you see output like the following, it failed: the connections to the image registry timed out. At this point you either need a proxy, or you can switch to a mirror registry as shown below.
W0825 14:09:24.664761 1369 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://storage.googleapis.com/kubernetes-release/release/stable-1.txt": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
W0825 14:09:24.665057 1369 version.go:104] falling back to the local client version: v1.22.1
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-scheduler:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-proxy:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/pause:3.5: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/etcd:3.5.0-0: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.4: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
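kubeadm can also pull everything from an alternate registry in one step; a minimal sketch, assuming the registry.aliyuncs.com/google_containers mirror carries your version:
$ sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
Note that images pulled this way keep the mirror's name prefix, so kubeadm init would then need the same --image-repository flag.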
Another option is to pull from a mirror and retag the images manually. First list the required versions:
$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.1
k8s.gcr.io/kube-controller-manager:v1.22.1
k8s.gcr.io/kube-scheduler:v1.22.1
k8s.gcr.io/kube-proxy:v1.22.1
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
Write a script to pull and retag them in batch:
$ vi k8s.sh
$ sudo chmod +x k8s.sh
$ sudo ./k8s.sh
The content of k8s.sh is as follows:
#!/bin/bash
echo "修改docker普通用户权限"
sudo chmod 777 /var/run/docker.sock
echo "1.拉取镜像"
#下面的版本号要对应
KUBE_VERSION=v1.22.2
PAUSE_VERSION=3.5
CORE_DNS_VERSION=1.8.4
ETCD_VERSION=3.5.0-0
# pull kubernetes images from the Aliyun mirror
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
# pull aliyuncs mirror docker images
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
# retag to k8s.gcr.io prefix
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION k8s.gcr.io/kube-proxy:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION k8s.gcr.io/kube-controller-manager:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION k8s.gcr.io/kube-apiserver:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION k8s.gcr.io/kube-scheduler:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION k8s.gcr.io/pause:$PAUSE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION k8s.gcr.io/coredns/coredns:v$CORE_DNS_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION k8s.gcr.io/etcd:$ETCD_VERSION
# remove the mirror tags; the local images are kept
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
echo "====================执行完毕======================"
echo "所需镜像:"
kubeadm config images list
echo "已安装镜像:"
sudo docker images
echo "====如果数量不匹配请多执行几次k8s_pull_images.sh====="
Check that the required images are present:
$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-apiserver v1.22.1 f30469a2491a 5 days ago 128MB
k8s.gcr.io/kube-proxy v1.22.1 36c4ebbc9d97 5 days ago 104MB
k8s.gcr.io/kube-controller-manager v1.22.1 6e002eb89a88 5 days ago 122MB
k8s.gcr.io/kube-scheduler v1.22.1 aca5ededae9c 5 days ago 52.7MB
k8s.gcr.io/etcd 3.5.0-0 004811815584 2 months ago 295MB
k8s.gcr.io/coredns 1.8.4 8d147537fb7d 2 months ago 47.6MB
k8s.gcr.io/pause 3.5 ed210e3e4a5b 5 months ago 683kB
Run sudo kubeadm init again to initialize the node.
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local vm-ubuntu-06] and IPs [10.96.0.1 192.168.2.36]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost vm-ubuntu-06] and IPs [192.168.2.36 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost vm-ubuntu-06] and IPs [192.168.2.36 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 9.505747 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node vm-ubuntu-06 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node vm-ubuntu-06 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 0cmq3a.yfcloahv2a9f3p3k
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.2.36:6443 --token 0cmq3a.yfcloahv2a9f3p3k \
--discovery-token-ca-cert-hash sha256:02b3014490559fa323006f6b2dc59b372da3b226256b683770da1d2eaf35fa0b
Then follow the steps from the output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the status:
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.2.36:6443
CoreDNS is running at https://192.168.2.36:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Check the current node; its status is NotReady (expected until a pod network add-on is installed):
$ kubectl get node
NAME STATUS ROLES AGE VERSION
vm-ubuntu-06 NotReady control-plane,master 5m50s v1.22.1
For details, describe the node:
$ kubectl describe node vm-ubuntu-06
Check pod status:
$ kubectl get pod -n kube-system -o wide
3.4 Deploy a network plugin
For the cluster to work, a Pod network must be installed; without one, Pods cannot communicate with each other. Kubernetes supports several networking solutions; flannel and calico are the most common.
3.4.1 Deploy the flannel network plugin
Here we use flannel. Run the following on the master node:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
The first apply reports:
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
3.4.2 Deploy the calico network plugin
Install calico (pick either flannel or calico, not both):
$ kubectl create -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
Fetch the custom resources:
$ wget https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml
Edit the CIDR in the custom resources so it matches the --pod-network-cidr used when initializing the master:
$ sudo vi custom-resources.yaml
spec:
# Configures Calico networking.
calicoNetwork:
# Note: The ipPools section cannot be modified post-install.
ipPools:
- blockSize: 26
cidr: 10.5.0.0/16 # change this to your pod network CIDR
encapsulation: VXLANCrossSubnet
natOutgoing: Enabled
nodeSelector: all()
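The edit can also be scripted; a sketch assuming the manifest still ships calico's default 192.168.0.0/16 CIDR:
$ sed -i 's#cidr: 192.168.0.0/16#cidr: 10.5.0.0/16#' custom-resources.yaml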
Apply the configuration:
$ kubectl create -f custom-resources.yaml
Watch until all pods reach the Running state:
$ watch kubectl get pods -n calico-system
Remove the scheduling taint so that pods can also run on the control-plane node:
$ kubectl taint nodes --all node-role.kubernetes.io/master-
The result looks like:
node/<your-hostname> untainted
Confirm the nodes:
$ kubectl get nodes -o wide
Check that everything is healthy:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcd69978-gxdqg 1/1 Running 0 122m
kube-system coredns-78fcd69978-qhgq5 1/1 Running 0 122m
kube-system etcd-vm-ubuntu-06 1/1 Running 0 122m
kube-system kube-apiserver-vm-ubuntu-06 1/1 Running 0 122m
kube-system kube-controller-manager-vm-ubuntu-06 1/1 Running 0 122m
kube-system kube-flannel-ds-ltqjs 1/1 Running 0 108m
kube-system kube-proxy-zcj72 1/1 Running 0 122m
kube-system kube-scheduler-vm-ubuntu-06 1/1 Running 0 122m
To restart the services:
$ sudo systemctl restart kubelet docker
4. Initializing nodes
4.1 Initialize the master node
Enable the kubelet:
$ sudo systemctl enable kubelet
Initialize the master node:
$ sudo kubeadm init --pod-network-cidr 10.5.0.0/16 --control-plane-endpoint=vm-ubuntu-06
4.2 Join commands, tokens, and certificates
After kubeadm initializes the cluster successfully, it prints a join command containing the token, the discovery-token-ca-cert-hash, and related parameters; record them.
The token expires after 24 hours.
The certificate-key expires after 2 hours.
If you did not record them, recover them with:
$ kubeadm token list
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
1. If the token has expired, regenerate the basic join command:
$ kubeadm token create --print-join-command
2. To add another master node, also regenerate the certificate-key (on v1.22 the flag is --upload-certs; the older --experimental-upload-certs flag has been removed):
$ kubeadm init phase upload-certs --upload-certs
To add a master node, take the join command from step 1 and append --control-plane plus the --certificate-key value from step 2, as in the sketch below.
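Assembled, the control-plane join looks like the following; the token, hash, and key are placeholders, so use the values from your own output:
$ sudo kubeadm join vm-ubuntu-06:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key>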
References:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-join/#token-based-discovery-with-ca-pinning
https://kubernetes.io/docs/setup/independent/high-availability/#steps-for-the-first-control-plane-node
4.3 Join worker nodes
Add a node to the cluster:
$ sudo kubeadm join vm-ubuntu-06:6443 --token oabl1q.kb01sa3zzendcbhi --discovery-token-ca-cert-hash sha256:ce4c3ff7440587207345bd15045a37d8fe394c39aa9ebe2c37ec22b9d26ab5a1
4.4 Restart k8s
$ sudo systemctl restart kubelet && sudo systemctl restart docker
4.5 Reset k8s
After resetting the nodes, re-initialize the master and re-join the workers. Note: a reset destroys all pod data.
$ sudo kubeadm reset
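kubeadm reset does not clean up everything; the following follow-up steps are commonly needed afterwards (a sketch; adjust the CNI path to your plugin):
$ sudo rm -rf /etc/cni/net.d
$ sudo iptables -F && sudo iptables -t nat -F
$ rm -rf $HOME/.kube/config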
5. Troubleshooting
5.1 Errors when the swap partition is not disabled
If you only disabled swap permanently via /etc/fstab, a reboot is needed for the change to take effect.
sudo kubeadm init
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
On startup you may see The connection to the server localhost:8080 was refused. Without a kubeconfig file, kubectl falls back to localhost. Fix it by running the following as a regular user on the master node:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Or, when running as root, export the environment variable:
$ export KUBECONFIG=/etc/kubernetes/admin.conf
5.2 kubeadm join times out with errors
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
Inspect the kubelet logs:
$ journalctl -u kubelet
The log reports the following error:
error: failed to run Kubelet: failed to create kubelet:
misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Docker's cgroup driver does not match the kubelet's. Add the following to /etc/docker/daemon.json, then restart Docker with sudo systemctl restart docker:
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
}
}
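After editing, reload and restart both daemons so the change takes effect, then retry the join:
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ sudo systemctl restart kubelet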
5.3 Pods cannot ping the external gateway or reach the internet
Run the following on the node and check flannel's subnet file:
$ cat /var/run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.5.1.1/16
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Try pinging between Pods to see whether traffic flows; if not, it is most likely a firewall problem.
Inspect the iptables rules:
$ sudo iptables -L --line-numbers
Flush all iptables rules:
$ sudo iptables -F
Allow the pod subnet through in iptables:
$ sudo iptables -t nat -I POSTROUTING -s 10.5.0.0/16 -j MASQUERADE
Fix DNS resolution in /etc/resolv.conf. By default the file is a symlink that gets regenerated on reboot, so delete it and create a plain file instead:
$ sudo rm /etc/resolv.conf && sudo touch /etc/resolv.conf
$ sudo vim /etc/resolv.conf
nameserver 192.168.2.1 # use your gateway address
nameserver 223.5.5.5
nameserver 223.6.6.6
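To keep the system from rewriting the file on the next reboot, one workaround is to mark it immutable (undo with chattr -i before editing it again):
$ sudo chattr +i /etc/resolv.conf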
6. Scripts
Kubernetes installation script: k8s_install.sh
#!/bin/bash
echo "1.更新环境"
sudo apt update
sudo apt -y upgrade
echo "2.关闭swap分区"
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
sudo swapoff -a
echo "3.关闭防火墙"
sudo ufw disable
echo "4.设置参数"
sudo modprobe overlay
sudo modprobe br_netfilter
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
echo "5.安装docker"
#设置阿里云镜像源
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install -y docker.io
# Create required directories
sudo mkdir -p /etc/systemd/system/docker.service.d
# Create daemon json config file
sudo tee /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
# Start and enable Services
sudo systemctl daemon-reload
sudo systemctl enable docker.service --now
sudo systemctl restart docker
sudo systemctl status docker
docker --version
echo "6.设置 Kubernetes repository"
sudo apt update
sudo apt install -y apt-transport-https curl gnupg2 software-properties-common ca-certificates
#curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add - #use the Aliyun mirror
#sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
#Aliyun mirror source
sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
echo "7.安装k8s"
sudo apt update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl #hold to block automatic upgrades
kubectl version --client && kubeadm version
Master node initialization script: k8s_init_master.sh
#!/bin/bash
echo "修改docker普通用户权限"
sudo chmod 777 /var/run/docker.sock
echo "1.拉取镜像"
#下面的版本号要对应
KUBE_VERSION=v1.22.2
PAUSE_VERSION=3.5
CORE_DNS_VERSION=1.8.4
ETCD_VERSION=3.5.0-0
# pull kubernetes images from the Aliyun mirror
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
# pull aliyuncs mirror docker images
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
# retag to k8s.gcr.io prefix
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION k8s.gcr.io/kube-proxy:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION k8s.gcr.io/kube-controller-manager:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION k8s.gcr.io/kube-apiserver:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION k8s.gcr.io/kube-scheduler:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION k8s.gcr.io/pause:$PAUSE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION k8s.gcr.io/coredns/coredns:v$CORE_DNS_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION k8s.gcr.io/etcd:$ETCD_VERSION
# remove the mirror tags; the local images are kept
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
echo "====================执行完毕======================"
echo "所需镜像:"
kubeadm config images list
echo "已安装镜像:"
sudo docker images
echo "====如果数量不匹配请多执行几次k8s_pull_images.sh====="
echo "2.启用k8s"
sudo systemctl enable kubelet
lsmod | grep br_netfilter
echo "3.初始化master节点"
sudo kubeadm init --pod-network-cidr 10.5.0.0/16 --control-plane-endpoint=vm-ubuntu-06
#sudo kubeadm init --pod-network-cidr 10.5.0.0/16
echo "4.master节点配置"
#mkdir -p $HOME/.kube
#sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
#sudo chown $(id -u):$(id -g) $HOME/.kube/config
#sudo echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
export KUBECONFIG=/etc/kubernetes/admin.conf
echo "5.初始化网络"
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
kubectl cluster-info
kubectl get pods --all-namespaces
Recommended deployment scripts: https://github.com/easzlab/kubeasz
7. Management UI: Kuboard
Kuboard v3.x supports managing multiple Kubernetes clusters. If you are upgrading from Kuboard v1.0.x or v2.0.x, check the installation guide for migration notes.
Online demo account: demo, password: demo123
Installation guide: https://kuboard.cn/install/v3/install.html