1. Introduction
Kubernetes, abbreviated as K8s (the 8 replaces the eight letters "ubernete" between the K and the s), is an open-source system for managing containerized applications across multiple hosts on a cloud platform. Its goal is to make deploying containerized applications simple and powerful, and it provides mechanisms for deploying, scheduling, updating, and maintaining applications.
This article covers installation with kubeadm. Reference: https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
2. Preparation
Three KVM virtual machines were prepared earlier: 192.168.2.31, 192.168.2.32, 192.168.2.33.
For creating the VMs, see: Installing KVM virtual machines on Ubuntu Server 20.04.
Update the environment:
sudo apt update
sudo apt -y upgrade && sudo systemctl reboot
2.1 Set the hostname on each node
Setting hostnames on the nodes substitutes for DNS resolution.
sudo hostnamectl set-hostname "k8s-master" #192.168.2.31
sudo hostnamectl set-hostname "k8s-node-1" #192.168.2.32
sudo hostnamectl set-hostname "k8s-node-2" #192.168.2.33
2.2 Update the hosts file on each node
Edit /etc/hosts:
$ sudo vi /etc/hosts
192.168.2.31 k8s-master
192.168.2.32 k8s-node-1
192.168.2.33 k8s-node-2
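To sanity-check the entries, ping a peer by name from any node (any of the three hostnames from above works):
$ ping -c 1 k8s-node-1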
2.3 Disable swap
This is a mandatory step for installing k8s. Why disable it? In short, the kubelet does not support swap by default: with swap on, memory accounting and performance become unpredictable.
Temporarily disable it:
$ sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
$ sudo swapoff -a
Permanently disable it:
$ sudo vi /etc/fstab #comment out the swap line
#/swap.img none swap sw 0 0
Verify that swap is off:
$ free -h
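With swap off, the Swap row should read all zeros; the memory figures below are illustrative and will differ per machine:
              total        used        free      shared  buff/cache   available
Mem:          3.8Gi       250Mi       3.1Gi       1.0Mi       500Mi       3.4Gi
Swap:            0B          0B          0B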
2.4 Disable the firewall
Disable the firewall:
$ sudo ufw disable
Check the firewall status:
$ sudo ufw status
2.5 Set kernel parameters
Make sure the br_netfilter kernel module is loaded and that net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl configuration. Check that the module is loaded:
$ lsmod | grep br_netfilter
If anything is missing or differs, run the following to fix it:
sudo modprobe overlay
sudo modprobe br_netfilter
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
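To confirm the settings took effect, query them back; each should print 1:
$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward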
3. Installation
3.1 Install Docker
#Add the repository GPG key
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
#Use the Aliyun mirror
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install -y docker.io
# Create required directories
sudo mkdir -p /etc/systemd/system/docker.service.d
# Create daemon json config file
sudo tee /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
# Start and enable Services
sudo systemctl daemon-reload
sudo systemctl enable docker.service --now
sudo systemctl restart docker
sudo systemctl status docker
docker --version
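kubeadm expects the kubelet and the container runtime to use the same cgroup driver, so it is worth confirming that the systemd driver configured above is now active:
$ docker info | grep -i 'cgroup driver'
 Cgroup Driver: systemd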
3.2 Install the required packages
Set up the Kubernetes repository:
sudo apt update
sudo apt install -y apt-transport-https curl gnupg2 software-properties-common ca-certificates
#curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add - #use the Aliyun mirror
#sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
#Aliyun mirror source
sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
Install kubelet, kubeadm, and kubectl:
sudo apt update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl #hold to block automatic upgrades
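To pin a specific release rather than the latest, the package versions can be stated explicitly; the 1.22.1-00 strings below are illustrative, so check what apt-cache madison kubelet actually lists:
$ sudo apt-get install -y kubelet=1.22.1-00 kubeadm=1.22.1-00 kubectl=1.22.1-00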
Check the versions:
$ kubectl version --client && kubeadm version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:44:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
3.3 Pull the images needed for node initialization
Enable the kubelet:
sudo systemctl enable kubelet
Initialization needs to pull several images. It is best to fetch them before running init; otherwise the pull fails easily and kubeadm init just hangs on the master.
$ sudo kubeadm init --pod-network-cidr 10.5.0.0/16 --control-plane-endpoint=vm-ubuntu-06
If you see output like the following, it failed: the connections to the image registry timed out. At this point you either need a proxy, or you can switch to a mirror registry as shown below.
W0825 14:09:24.664761 1369 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://storage.googleapis.com/kubernetes-release/release/stable-1.txt": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
W0825 14:09:24.665057 1369 version.go:104] falling back to the local client version: v1.22.1
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-scheduler:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-proxy:v1.22.1: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/pause:3.5: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/etcd:3.5.0-0: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.4: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
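kubeadm can also pull everything from an alternate registry in one step; a minimal sketch, assuming the registry.aliyuncs.com/google_containers mirror carries your version:
$ sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
Note that images pulled this way keep the mirror's name prefix, so kubeadm init would then need the same --image-repository flag.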
Another option is to pull from a mirror and retag the images manually. First list the required versions:
$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.1
k8s.gcr.io/kube-controller-manager:v1.22.1
k8s.gcr.io/kube-scheduler:v1.22.1
k8s.gcr.io/kube-proxy:v1.22.1
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
Write a script to pull and retag them in batch:
$ vi k8s.sh
$ sudo chmod +x k8s.sh
$ sudo ./k8s.sh
The content of k8s.sh is as follows:
#!/bin/bash
echo "修改docker普通用户权限"
sudo chmod 777 /var/run/docker.sock
echo "1.拉取镜像"
#下面的版本号要对应
KUBE_VERSION=v1.22.2
PAUSE_VERSION=3.5
CORE_DNS_VERSION=1.8.4
ETCD_VERSION=3.5.0-0
# pull kubernetes images from the Aliyun mirror
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
# pull aliyuncs mirror docker images
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
# retag to k8s.gcr.io prefix
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION k8s.gcr.io/kube-proxy:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION k8s.gcr.io/kube-controller-manager:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION k8s.gcr.io/kube-apiserver:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION k8s.gcr.io/kube-scheduler:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION k8s.gcr.io/pause:$PAUSE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION k8s.gcr.io/coredns/coredns:v$CORE_DNS_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION k8s.gcr.io/etcd:$ETCD_VERSION
# remove the mirror tags; the local images are kept
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
echo "====================执行完毕======================"
echo "所需镜像:"
kubeadm config images list
echo "已安装镜像:"
sudo docker images
echo "====如果数量不匹配请多执行几次k8s_pull_images.sh====="
Check that the required images are present:
$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-apiserver v1.22.1 f30469a2491a 5 days ago 128MB
k8s.gcr.io/kube-proxy v1.22.1 36c4ebbc9d97 5 days ago 104MB
k8s.gcr.io/kube-controller-manager v1.22.1 6e002eb89a88 5 days ago 122MB
k8s.gcr.io/kube-scheduler v1.22.1 aca5ededae9c 5 days ago 52.7MB
k8s.gcr.io/etcd 3.5.0-0 004811815584 2 months ago 295MB
k8s.gcr.io/coredns 1.8.4 8d147537fb7d 2 months ago 47.6MB
k8s.gcr.io/pause 3.5 ed210e3e4a5b 5 months ago 683kB
Run sudo kubeadm init again to initialize the node.
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local vm-ubuntu-06] and IPs [10.96.0.1 192.168.2.36]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost vm-ubuntu-06] and IPs [192.168.2.36 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost vm-ubuntu-06] and IPs [192.168.2.36 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 9.505747 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node vm-ubuntu-06 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node vm-ubuntu-06 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 0cmq3a.yfcloahv2a9f3p3k
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.2.36:6443 --token 0cmq3a.yfcloahv2a9f3p3k \
--discovery-token-ca-cert-hash sha256:02b3014490559fa323006f6b2dc59b372da3b226256b683770da1d2eaf35fa0b
Then follow the steps from the output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the status:
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.2.36:6443
CoreDNS is running at https://192.168.2.36:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Check the current node; its status is NotReady (expected until a pod network add-on is installed):
$ kubectl get node
NAME STATUS ROLES AGE VERSION
vm-ubuntu-06 NotReady control-plane,master 5m50s v1.22.1
For details, describe the node:
$ kubectl describe node vm-ubuntu-06
Check pod status:
$ kubectl get pod -n kube-system -o wide
3.4 Deploy a network plugin
For the cluster to work, a Pod network must be installed; without one, Pods cannot communicate with each other. Kubernetes supports several networking solutions; flannel and calico are the most common.
3.4.1 Deploy the flannel network plugin
Here we use flannel. Run the following on the master node:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
The first apply reports:
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
3.4.2 Deploy the calico network plugin
Install calico (pick either flannel or calico, not both):
$ kubectl create -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
Fetch the custom resources:
$ wget https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml
Edit the CIDR in the custom resources so it matches the --pod-network-cidr used when initializing the master:
$ sudo vi custom-resources.yaml
spec:
# Configures Calico networking.
calicoNetwork:
# Note: The ipPools section cannot be modified post-install.
ipPools:
- blockSize: 26
cidr: 10.5.0.0/16 # change this to your pod network CIDR
encapsulation: VXLANCrossSubnet
natOutgoing: Enabled
nodeSelector: all()
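The edit can also be scripted; a sketch assuming the manifest still ships calico's default 192.168.0.0/16 CIDR:
$ sed -i 's#cidr: 192.168.0.0/16#cidr: 10.5.0.0/16#' custom-resources.yaml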
Apply the configuration:
$ kubectl create -f custom-resources.yaml
Watch until all pods reach the Running state:
$ watch kubectl get pods -n calico-system
Remove the scheduling taint so that pods can also run on the control-plane node:
$ kubectl taint nodes --all node-role.kubernetes.io/master-
The result looks like:
node/<your-hostname> untainted
Confirm the nodes:
$ kubectl get nodes -o wide
Check that everything is healthy:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcd69978-gxdqg 1/1 Running 0 122m
kube-system coredns-78fcd69978-qhgq5 1/1 Running 0 122m
kube-system etcd-vm-ubuntu-06 1/1 Running 0 122m
kube-system kube-apiserver-vm-ubuntu-06 1/1 Running 0 122m
kube-system kube-controller-manager-vm-ubuntu-06 1/1 Running 0 122m
kube-system kube-flannel-ds-ltqjs 1/1 Running 0 108m
kube-system kube-proxy-zcj72 1/1 Running 0 122m
kube-system kube-scheduler-vm-ubuntu-06 1/1 Running 0 122m
To restart the services:
$ sudo systemctl restart kubelet docker
4. Initializing nodes
4.1 Initialize the master node
Enable the kubelet:
$ sudo systemctl enable kubelet
Initialize the master node:
$ sudo kubeadm init --pod-network-cidr 10.5.0.0/16 --control-plane-endpoint=vm-ubuntu-06
4.2 Join commands, tokens, and certificates
After kubeadm initializes the cluster successfully, it prints a join command containing the token, the discovery-token-ca-cert-hash, and related parameters; record them.
The token expires after 24 hours.
The certificate-key expires after 2 hours.
If you did not record them, recover them with:
$ kubeadm token list
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
1. If the token has expired, regenerate the basic join command:
$ kubeadm token create --print-join-command
2. To add another master node, also regenerate the certificate-key (on v1.22 the flag is --upload-certs; the older --experimental-upload-certs flag has been removed):
$ kubeadm init phase upload-certs --upload-certs
To add a master node, take the join command from step 1 and append --control-plane plus the --certificate-key value from step 2, as in the sketch below.
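Assembled, the control-plane join looks like the following; the token, hash, and key are placeholders, so use the values from your own output:
$ sudo kubeadm join vm-ubuntu-06:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key>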
References:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-join/#token-based-discovery-with-ca-pinning
https://kubernetes.io/docs/setup/independent/high-availability/#steps-for-the-first-control-plane-node
4.3 Join worker nodes
Add a node to the cluster:
$ sudo kubeadm join vm-ubuntu-06:6443 --token oabl1q.kb01sa3zzendcbhi --discovery-token-ca-cert-hash sha256:ce4c3ff7440587207345bd15045a37d8fe394c39aa9ebe2c37ec22b9d26ab5a1
4.4 Restart k8s
$ sudo systemctl restart kubelet && sudo systemctl restart docker
4.5 Reset k8s
After resetting the nodes, re-initialize the master and re-join the workers. Note: a reset destroys all pod data.
$ sudo kubeadm reset
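kubeadm reset does not clean up everything; the following follow-up steps are commonly needed afterwards (a sketch; adjust the CNI path to your plugin):
$ sudo rm -rf /etc/cni/net.d
$ sudo iptables -F && sudo iptables -t nat -F
$ rm -rf $HOME/.kube/config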
5. Troubleshooting
5.1 Errors when the swap partition is not disabled
If you only disabled swap permanently via /etc/fstab, a reboot is needed for the change to take effect.
sudo kubeadm init
[init] Using Kubernetes version: v1.22.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
On startup you may see The connection to the server localhost:8080 was refused. Without a kubeconfig file, kubectl falls back to localhost. Fix it by running the following as a regular user on the master node:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Or, when running as root, export the environment variable:
$ export KUBECONFIG=/etc/kubernetes/admin.conf
5.2 kubeadm join times out with errors
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
Inspect the kubelet logs:
$ journalctl -u kubelet
The log reports the following error:
error: failed to run Kubelet: failed to create kubelet:
misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Docker's cgroup driver does not match the kubelet's. Add the following to /etc/docker/daemon.json, then restart Docker with sudo systemctl restart docker:
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
}
}
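After editing, reload and restart both daemons so the change takes effect, then retry the join:
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ sudo systemctl restart kubelet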
5.3 Pods cannot ping the external gateway or reach the internet
Run the following on the node and check flannel's subnet file:
$ cat /var/run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.5.1.1/16
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
Try pinging between Pods to see whether traffic flows; if not, it is most likely a firewall problem.
Inspect the iptables rules:
$ sudo iptables -L --line-numbers
Flush all iptables rules:
$ sudo iptables -F
Allow the pod subnet through in iptables:
$ sudo iptables -t nat -I POSTROUTING -s 10.5.0.0/16 -j MASQUERADE
Fix DNS resolution in /etc/resolv.conf. By default the file is a symlink that gets regenerated on reboot, so delete it and create a plain file instead:
$ sudo rm /etc/resolv.conf && sudo touch /etc/resolv.conf
$ sudo vim /etc/resolv.conf
nameserver 192.168.2.1 # use your gateway address
nameserver 223.5.5.5
nameserver 223.6.6.6
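To keep the system from rewriting the file on the next reboot, one workaround is to mark it immutable (undo with chattr -i before editing it again):
$ sudo chattr +i /etc/resolv.conf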
6. Scripts
Kubernetes installation script: k8s_install.sh
#!/bin/bash
echo "1.更新环境"
sudo apt update
sudo apt -y upgrade
echo "2.关闭swap分区"
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
sudo swapoff -a
echo "3.关闭防火墙"
sudo ufw disable
echo "4.设置参数"
sudo modprobe overlay
sudo modprobe br_netfilter
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
echo "5.安装docker"
#设置阿里云镜像源
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install -y docker.io
# Create required directories
sudo mkdir -p /etc/systemd/system/docker.service.d
# Create daemon json config file
sudo tee /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
# Start and enable Services
sudo systemctl daemon-reload
sudo systemctl enable docker.service --now
sudo systemctl restart docker
sudo systemctl status docker
docker --version
echo "6.设置 Kubernetes repository"
sudo apt update
sudo apt install -y apt-transport-https curl gnupg2 software-properties-common ca-certificates
#curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add - #use the Aliyun mirror
#sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
#Aliyun mirror source
sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
echo "7.安装k8s"
sudo apt update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl #hold to block automatic upgrades
kubectl version --client && kubeadm version
Master node initialization script: k8s_init_master.sh
#!/bin/bash
echo "修改docker普通用户权限"
sudo chmod 777 /var/run/docker.sock
echo "1.拉取镜像"
#下面的版本号要对应
KUBE_VERSION=v1.22.2
PAUSE_VERSION=3.5
CORE_DNS_VERSION=1.8.4
ETCD_VERSION=3.5.0-0
# pull kubernetes images from the Aliyun mirror
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
# pull aliyuncs mirror docker images
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
# retag to k8s.gcr.io prefix
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION k8s.gcr.io/kube-proxy:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION k8s.gcr.io/kube-controller-manager:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION k8s.gcr.io/kube-apiserver:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION k8s.gcr.io/kube-scheduler:$KUBE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION k8s.gcr.io/pause:$PAUSE_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION k8s.gcr.io/coredns/coredns:v$CORE_DNS_VERSION
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION k8s.gcr.io/etcd:$ETCD_VERSION
# remove the mirror tags; the local images are kept
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:$KUBE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:$PAUSE_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:$CORE_DNS_VERSION
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:$ETCD_VERSION
echo "====================执行完毕======================"
echo "所需镜像:"
kubeadm config images list
echo "已安装镜像:"
sudo docker images
echo "====如果数量不匹配请多执行几次k8s_pull_images.sh====="
echo "2.启用k8s"
sudo systemctl enable kubelet
lsmod | grep br_netfilter
echo "3.初始化master节点"
sudo kubeadm init --pod-network-cidr 10.5.0.0/16 --control-plane-endpoint=vm-ubuntu-06
#sudo kubeadm init --pod-network-cidr 10.5.0.0/16
echo "4.master节点配置"
#mkdir -p $HOME/.kube
#sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
#sudo chown $(id -u):$(id -g) $HOME/.kube/config
#sudo echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
export KUBECONFIG=/etc/kubernetes/admin.conf
echo "5.初始化网络"
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
kubectl cluster-info
kubectl get pods --all-namespaces
Recommended deployment scripts: https://github.com/easzlab/kubeasz
7. Management UI: Kuboard
Kuboard v3.x supports managing multiple Kubernetes clusters. If you are upgrading from Kuboard v1.0.x or v2.0.x, check the installation guide for migration notes.
Online demo account: demo, password: demo123
Installation guide: https://kuboard.cn/install/v3/install.html