Docker fails to start, reporting the error: Failed to load environment files: No such file or directory
[root@mcwk8s05 ~]# systemctl start docker
Job for docker.service failed because a configured resource limit was exceeded. See "systemctl status docker.service" and "journalctl -xe" for details.
[root@mcwk8s05 ~]# journalctl -xe
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit docker.service has failed.
-- The result is failed.
.....
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit docker.service has begun starting up.
Apr 18 00:33:44 mcwk8s05 kube-proxy[1006]: I0418 00:33:44.786333 1006 reflector.go:160] Listing and watching *v1.Endpoints from k8s.io/client-go/informers/factory.go:133
Apr 18 00:33:44 mcwk8s05 kube-proxy[1006]: I0418 00:33:44.788405 1006 reflector.go:160] Listing and watching *v1.Service from k8s.io/client-go/informers/factory.go:133
Apr 18 00:33:46 mcwk8s05 kube-proxy[1006]: I0418 00:33:46.143912 1006 proxier.go:748] Not syncing ipvs rules until Services and Endpoints have been received from master
Apr 18 00:33:46 mcwk8s05 kube-proxy[1006]: I0418 00:33:46.144004 1006 proxier.go:744] syncProxyRules took 185.651µs
Apr 18 00:33:46 mcwk8s05 kube-proxy[1006]: I0418 00:33:46.144024 1006 bounded_frequency_runner.go:221] sync-runner: ran, next possible in 0s, periodic in 30s
Apr 18 00:33:46 mcwk8s05 systemd[1]: docker.service holdoff time over, scheduling restart.
Apr 18 00:33:46 mcwk8s05 systemd[1]: Failed to load environment files: No such file or directory
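The relevant line is easy to miss among the kube-proxy noise in journalctl -xe. Filtering the journal by unit is a cleaner way to see only Docker's own messages (standard journalctl options):
journalctl -u docker.service --no-pager -n 20    # last 20 journal lines for docker.service only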
Check the environment file referenced by the docker.service unit:
[root@mcwk8s05 ~]# cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket containerd.service
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
EnvironmentFile=/run/flannel/subnet.env
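As an aside, systemd treats an EnvironmentFile= path as mandatory unless it is prefixed with "-". A hedged alternative, if you wanted Docker to start even before flannel has written the file (Docker would then come up without the FLANNEL_* variables, so on a flannel node fixing the startup order, as done below, is the better cure):
# A leading "-" tells systemd to ignore a missing environment file
EnvironmentFile=-/run/flannel/subnet.env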
This file turns out to be a runtime file written by flanneld, and flannel is not running, so the file does not exist. The fix is to start flannel first. Confirming the diagnosis: there is no flannel directory under /run/, and no flannel.1 interface yet:
[root@mcwk8s05 ~]# ls /run/
abrt console crond.pid dbus faillock lock mount NetworkManager sepermit sshd.pid svnserve systemd tuned user vmware
auditd.pid containerd cron.reboot docker.sock initramfs log netreport plymouth setrans sudo syslogd.pid tmpfiles.d udev utmp
[root@mcwk8s05 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:25:ef:dd brd ff:ff:ff:ff:ff:ff
inet 10.0.0.35/24 brd 10.0.0.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::3a1f:8b4:d1f1:9759/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:25:ef:e7 brd ff:ff:ff:ff:ff:ff
[root@mcwk8s05 ~]#
Start the network (flanneld) first, then start Docker; this time it starts normally:
[root@mcwk8s05 ~]# systemctl start flanneld.service
[root@mcwk8s05 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:25:ef:dd brd ff:ff:ff:ff:ff:ff
inet 10.0.0.35/24 brd 10.0.0.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::3a1f:8b4:d1f1:9759/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:25:ef:e7 brd ff:ff:ff:ff:ff:ff
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN
link/ether 4e:bb:c2:5c:bf:37 brd ff:ff:ff:ff:ff:ff
inet 172.17.98.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::4cbb:c2ff:fe5c:bf37/64 scope link
valid_lft forever preferred_lft forever
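With flannel.1 up, flanneld should also have written the environment file that docker.service needs. Checking it before starting Docker is a good habit; the values below are illustrative (FLANNEL_SUBNET and FLANNEL_MTU are inferred from the flannel.1/docker0 output on this host, FLANNEL_NETWORK is an assumption):
cat /run/flannel/subnet.env
# expected shape of the file, with this host's values:
FLANNEL_NETWORK=172.17.0.0/16
FLANNEL_SUBNET=172.17.98.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false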
[root@mcwk8s05 ~]# systemctl start docker
[root@mcwk8s05 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:25:ef:dd brd ff:ff:ff:ff:ff:ff
inet 10.0.0.35/24 brd 10.0.0.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::3a1f:8b4:d1f1:9759/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:25:ef:e7 brd ff:ff:ff:ff:ff:ff
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN
link/ether 4e:bb:c2:5c:bf:37 brd ff:ff:ff:ff:ff:ff
inet 172.17.98.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::4cbb:c2ff:fe5c:bf37/64 scope link
valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 02:42:f6:d4:62:1b brd ff:ff:ff:ff:ff:ff
inet 172.17.98.1/24 brd 172.17.98.255 scope global docker0
valid_lft forever preferred_lft forever
[root@mcwk8s05 ~]#
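docker0 now has 172.17.98.1/24, the subnet flannel allocated to this node, so containers will land on the flannel network. To keep the same failure from recurring after a reboot, make sure both units start automatically (same unit names as used above):
systemctl enable flanneld.service docker.service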
Troubleshooting a Kubernetes node in NotReady state
Check the status: the control-plane components are all Healthy, but the nodes are NotReady.
[root@mcwk8s03 ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-1               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
[root@mcwk8s03 ~]#
[root@mcwk8s03 ~]#
[root@mcwk8s03 ~]# kubectl get nodes
NAME       STATUS     ROLES    AGE    VERSION
mcwk8s05   NotReady   <none>   166d   v1.15.12
mcwk8s06   NotReady   <none>   166d   v1.15.12
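Before changing anything, kubectl describe node usually shows which condition is failing and when the kubelet last reported in:
kubectl describe node mcwk8s05    # check the Conditions block and the last heartbeat times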
Turn off the firewall:
systemctl stop firewalld.service
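If firewalld should stay off on these nodes permanently (optional, and only appropriate if nothing else depends on it), also disable it so it does not return on reboot:
systemctl disable firewalld.service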
On the node, kubelet is not running:
[root@mcwk8s05 ~]# systemctl status kubelet.service
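The status output is omitted above, but since kubelet was not running, the implied next step is to start it:
systemctl start kubelet.service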
Check the error messages on the node: both kubelet and kube-proxy are trying to reach 10.0.0.30:6443, which is the VIP of the nginx load balancer in front of the apiservers.
[root@mcwk8s05 ~]# tail -100f /var/log/messages
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.110814 2985 reflector.go:160] Listing and watching *v1.Node from k8s.io/kubernetes/pkg/kubelet/kubelet.go:454
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118520 2985 setters.go:753] Error getting volume limit for plugin kubernetes.io/azure-disk
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118562 2985 setters.go:753] Error getting volume limit for plugin kubernetes.io/gce-pd
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118568 2985 setters.go:753] Error getting volume limit for plugin kubernetes.io/cinder
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118573 2985 setters.go:753] Error getting volume limit for plugin kubernetes.io/aws-ebs
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118591 2985 kubelet_node_status.go:471] Recording NodeHasSufficientMemory event message for node mcwk8s05
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118605 2985 kubelet_node_status.go:471] Recording NodeHasNoDiskPressure event message for node mcwk8s05
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118628 2985 kubelet_node_status.go:471] Recording NodeHasSufficientPID event message for node mcwk8s05
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118644 2985 kubelet_node_status.go:72] Attempting to register node mcwk8s05
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118645 2985 event.go:258] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"mcwk8s05", UID:"mcwk8s05", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeHasSufficientMemory' Node mcwk8s05 status is now: NodeHasSufficientMemory
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118671 2985 event.go:258] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"mcwk8s05", UID:"mcwk8s05", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeHasNoDiskPressure' Node mcwk8s05 status is now: NodeHasNoDiskPressure
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.118701 2985 event.go:258] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"mcwk8s05", UID:"mcwk8s05", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeHasSufficientPID' Node mcwk8s05 status is now: NodeHasSufficientPID
Apr 18 01:14:35 mcwk8s05 kubelet: I0418 01:14:35.129924 2985 kubelet.go:1973] SyncLoop (housekeeping, skipped): sources aren't ready yet.
Apr 18 01:14:35 mcwk8s05 kubelet: E0418 01:14:35.194840 2985 kubelet.go:2252] node "mcwk8s05" not found
Apr 18 01:14:35 mcwk8s05 kubelet: E0418 01:14:35.295918 2985 kubelet.go:2252] node "mcwk8s05" not found
Apr 18 01:14:37 mcwk8s05 kubelet: E0418 01:14:37.012374 2985 kubelet.go:2252] node "mcwk8s05" not found
Apr 18 01:14:37 mcwk8s05 kube-proxy: E0418 01:14:37.109904 1006 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Get https://10.0.0.30:6443/api/v1/services?labelSelector=%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0: dial tcp 10.0.0.30:6443: connect: no route to host
Apr 18 01:14:37 mcwk8s05 kube-proxy: E0418 01:14:37.109992 1006 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Get https://10.0.0.30:6443/api/v1/endpoints?labelSelector=%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0: dial tcp 10.0.0.30:6443: connect: no route to host
Apr 18 01:14:37 mcwk8s05 kubelet: E0418 01:14:37.110082 2985 kubelet_node_status.go:94] Unable to register node "mcwk8s05" with API server: Post https://10.0.0.30:6443/api/v1/nodes: dial tcp 10.0.0.30:6443: connect: no route to host
Apr 18 01:14:37 mcwk8s05 kubelet: E0418 01:14:37.110127 2985 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:454: Failed to list *v1.Node: Get https://10.0.0.30:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmcwk8s05&limit=500&resourceVersion=0: dial tcp 10.0.0.30:6443: connect: no route to host
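Every request against https://10.0.0.30:6443 fails with "no route to host". A quick sketch to confirm from the node that the VIP itself is the problem, before logging in to the nginx hosts (even a 403 from curl would prove the network path works):
ping -c 2 10.0.0.30
curl -k https://10.0.0.30:6443/healthz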
The nginx process is not running. Start nginx on both load balancer servers, then start keepalived for high availability:
[root@mcwk8s01 ~]# ps -ef|grep nginx
root 1575 1416 0 01:17 pts/0 00:00:00 grep --color=auto nginx
[root@mcwk8s01 ~]# nginx
[root@mcwk8s01 ~]# systemctl start keepalived.service
[root@mcwk8s01 ~]#
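It is also worth verifying on the load balancer that keepalived actually claimed the VIP and that nginx is listening for apiserver traffic (the address and port here are assumptions based on this setup):
ip a | grep 10.0.0.30    # the VIP should be bound on the active nginx host
ss -lntp | grep 6443     # nginx should be listening on the apiserver port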
Check the nodes again: they are now Ready and the cluster is usable again.
[root@mcwk8s03 ~]# kubectl get nodes
NAME       STATUS   ROLES    AGE    VERSION
mcwk8s05   Ready    <none>   166d   v1.15.12
mcwk8s06   Ready    <none>   166d   v1.15.12
[root@mcwk8s03 ~]#
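One last note: nginx was started from the command line above, so it will not come back after a reboot. If nginx and keepalived are installed as systemd services (the unit names below are assumptions for these hosts), enabling them prevents a repeat:
systemctl enable keepalived.service
systemctl enable nginx.service    # only if an nginx unit file exists on these hosts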