While updating a service on Kubernetes today, I found a pod failing to start with the error `failed to start sandbox container`, as shown below:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 28m default-scheduler Successfully assigned kube-system/k8s-proxy-7wkt4 to tj1-staging-com-ocean007-201812.kscn
Warning FailedCreatePodSandBox 28m (x13 over 28m) kubelet, tj1-staging-com-ocean007-201812.kscn Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "k8s-proxy-7wkt4": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:297: getting the final child's pid from pipe caused \"EOF\"": unknown
Normal SandboxChanged 3m19s (x1364 over 28m) kubelet, tj1-staging-com-ocean007-201812.kscn Pod sandbox changed, it will be killed and re-created.
The sandbox creation failure is only a symptom; the real cause is some other anomaly on the host, usually CPU, disk I/O, or memory pressure.
First, log in to the node and check kubelet and docker. Their logs show no obvious errors, but `top` shows dockerd consuming an extremely high amount of CPU:
[root@tj1-staging-com-ocean007-201812 ~]# top
top - 17:55:00 up 265 days, 3:41, 1 user, load average: 10.71, 11.34, 10.76
Tasks: 816 total, 5 running, 811 sleeping, 0 stopped, 0 zombie
%Cpu(s): 24.0 us, 34.5 sy, 0.0 ni, 41.4 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 65746380 total, 20407940 free, 11007040 used, 34331400 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 49134416 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
115483 root 20 0 3965212 273188 34564 S 489.7 0.4 382260:40 dockerd
1367523 root 20 0 18376 2972 2716 R 66.9 0.0 20163:45 bash
1367487 root 20 0 11856 5616 4512 S 54.0 0.0 16748:26 containerd-shim
3200169 root 20 0 1300 4 0 R 53.3 0.0 14913:49 sh
2429952 root 20 0 1300 4 0 S 49.3 0.0 9620:56 sh
3200130 root 20 0 9392 4756 3884 S 47.7 0.0 13417:30 containerd-shim
3718475 root 20 0 1300 4 0 R 47.4 0.0 8600:20 sh
3718440 root 20 0 10736 5516 4512 S 42.1 0.0 7575:31 containerd-shim
2429917 root 20 0 11856 5556 4512 S 40.1 0.0 8313:22 containerd-shim
3205493 root 20 0 3775924 230996 66704 S 18.9 0.4 2559:07 kubelet
1 root 20 0 195240 157000 3932 S 7.9 0.2 1417:46 systemd
804 dbus 20 0 30308 6460 2464 S 1.7 0.0 462:18.84 dbus-daemon
1011737 root 20 0 277656 122788 18428 S 1.3 0.2 768:03.00 cadvisor
115508 root 20 0 7139200 32896 24288 S 1.0 0.1 662:25.27 containerd
806 root 20 0 24572 3060 2480 S 0.7 0.0 171:22.52 systemd-logind
511080 root 0 -20 2751348 52552 15744 S 0.7 0.1 178:27.51 sagent
1102507 root 20 0 11792 7292 4512 S 0.7 0.0 23:36.37 containerd-shim
1272223 root 20 0 164800 5296 3824 R 0.7 0.0 0:00.38 top
2866292 root 20 0 5045000 1.983g 3080 S 0.7 3.2 230:09.47 redis
Meanwhile, system CPU time (`sy`) is abnormally high:
%Cpu(s): 24.0 us, 34.5 sy, 0.0 ni, 41.4 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
Based on past experience, this is usually caused by certain containers. `top` shows a few `sh` processes with high CPU usage, and `ps` reveals that the process is in fact an infinite loop:
[root@tj1-staging-com-ocean007-201812 ~]# ps -ef | grep 1367523
root 1287628 1247781 0 17:55 pts/1 00:00:00 grep --color=auto 1367523
root 1367523 1367504 72 Feb28 ? 14-00:04:17 /bin/bash -c while true; do echo hello; done
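Instead of eyeballing `top`, the same short-list of CPU hogs can be produced directly (a quick sketch; `--sort` is a GNU procps option):

```shell
# List the five heaviest CPU consumers, highest first
ps -eo pid,pcpu,comm --sort=-pcpu | head -n 6
```

This is handy in scripts, since the output is stable and easy to grep.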
`/proc/<pid>/cgroup` maps the process back to its container:
11:freezer:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
10:devices:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
9:hugetlb:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
8:blkio:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
7:memory:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
6:perf_event:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
5:cpuset:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
4:pids:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
3:net_cls,net_prio:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
2:cpu,cpuacct:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
1:name=systemd:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
Then locate the corresponding container:
docker ps | grep 29842d554
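The ID in the `grep` above is just the first few characters of the 64-character hex container ID that ends each cgroup path; extracting it can be scripted, e.g. (a sketch using the sample line from the output above):

```shell
# One line from /proc/<pid>/cgroup (taken from the output above)
line='11:freezer:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd'

# The last path component is the full container ID: a 64-char hex string
cid=$(echo "$line" | grep -oE '[0-9a-f]{64}')

# Short form, as shown by `docker ps`
echo "$cid" | cut -c1-12   # -> 29842d5544b7
```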
After cleaning up the offending pods, the system returned to normal:
top - 18:25:57 up 265 days, 4:12, 1 user, load average: 1.05, 1.24, 4.02
Tasks: 769 total, 1 running, 768 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.7 us, 0.9 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 65746380 total, 22106960 free, 10759860 used, 32879560 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 49401576 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3205493 root 20 0 3775924 229844 66704 S 9.9 0.3 2563:18 kubelet
115483 root 20 0 3965468 249124 34564 S 7.9 0.4 382323:36 dockerd
1 root 20 0 195240 157000 3932 S 6.3 0.2 1419:48 systemd
804 dbus 20 0 30308 6460 2464 S 2.0 0.0 462:51.51 dbus-daemon
3085322 root 20 0 12.045g 1.578g 19028 S 1.3 2.5 767:51.19 java
115508 root 20 0 7139200 32264 24288 S 1.0 0.0 662:42.18 containerd
511080 root 0 -20 2751348 42116 15744 S 1.0 0.1 178:44.79 sagent
1011737 root 20 0 277656 111836 18428 S 1.0 0.2 768:49.01 cadvisor
1523167 root 20 0 164800 5436 4012 R 0.7 0.0 0:00.04 top
3199459 root 20 0 1554708 43668 9496 S 0.7 0.1 28:50.60 falcon-agent
7 root 20 0 0 0 0 S 0.3 0.0 619:07.64 rcu_sched
806 root 20 0 24572 3060 2480 S 0.3 0.0 171:33.69 systemd-logind
11921 root 20 0 94820 20480 5840 S 0.3 0.0 1402:42 consul
575838 root 20 0 411464 17092 7364 S 0.3 0.0 15:16.25 python
856593 root 20 0 1562392 37912 9612 S 0.3 0.1 21:34.23 falcon-agent
931957 33 20 0 90728 3392 1976 S 0.3 0.0 0:51.23 nginx
1212186 root 20 0 0 0 0 S 0.3 0.0 0:01.12 kworker/14:1
1726228 root 20 0 9392 4496 3808 S 0.3 0.0 0:00.67 containerd-shim
1887128 root 20 0 273160 7932 3128 S 0.3 0.0 46:05.23 redis-server
2788111 root 20 0 273160 6300 3080 S 0.3 0.0 25:18.55 redis-server
3199297 root 20 0 1563160 44812 9624 S 0.3 0.1 31:13.73 falcon-agent
Sandbox creation can fail for all sorts of reasons, e.g. [an exception triggered by a bad memory setting][1] or [a dockerd anomaly][2].
In this case, some test pods were started with `while true; do echo hello; done`. The infinite loop echoes `hello` endlessly, generating a flood of system calls (the shell writing to its stdout pipe, and containerd-shim reading it back), which sends CPU usage on those cores soaring. With several such pods, the system became too busy to handle other requests normally.
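The syscall flood is easy to reproduce in miniature: even a fraction of a second of that loop pushes thousands of lines through the pipe (exact numbers vary by machine):

```shell
# Let the offending loop run for 0.2s and count how many lines it pushed
# through the pipe; each iteration is at least one write()/read() pair.
timeout 0.2 sh -c 'while true; do echo hello; done' | wc -l
```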
Problems like this are hard to detect at pod creation time; the only practical defense is node-level alerting (dockerd CPU usage, node cpu.sys usage) to catch them early.
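A minimal sketch of such a node-level check, assuming a hypothetical threshold and leaving alert delivery to your monitoring stack (here it just prints):

```shell
# check_cpu PID THRESHOLD: print an alert line when the process's %CPU
# (as reported by ps) is at or above THRESHOLD. The threshold value is
# an assumption; tune it for your nodes.
check_cpu() {
    cpu=$(ps -o %cpu= -p "$1" 2>/dev/null | tr -d ' ')
    [ -n "$cpu" ] || return 0          # process gone, nothing to check
    if [ "${cpu%.*}" -ge "$2" ]; then  # integer compare on the whole part
        echo "ALERT: pid $1 CPU at ${cpu}%"
    fi
}

# e.g. alert when dockerd exceeds 200% CPU (does nothing if dockerd is absent)
check_cpu "$(pgrep -x dockerd | head -n1)" 200
```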
[1] https://github.com/kubernetes/kubernetes/issues/56996
[2] https://plugaru.org/2018/05/21/pod-sandbox-changed-it-will-be-killed-and-re-created/