While updating a service on k8s today, I found that a pod failed to start with the error failed to start sandbox container, as shown below:

Events:
  Type     Reason                  Age                     From                                           Message
  ----     ------                  ----                    ----                                           -------
  Normal   Scheduled               28m                     default-scheduler                              Successfully assigned kube-system/k8s-proxy-7wkt4 to tj1-staging-com-ocean007-201812.kscn
  Warning  FailedCreatePodSandBox  28m (x13 over 28m)      kubelet, tj1-staging-com-ocean007-201812.kscn  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "k8s-proxy-7wkt4": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:297: getting the final child's pid from pipe caused \"EOF\"": unknown
  Normal   SandboxChanged          3m19s (x1364 over 28m)  kubelet, tj1-staging-com-ocean007-201812.kscn  Pod sandbox changed, it will be killed and re-created.

The sandbox creation failure is only a symptom; the real cause is some other problem on the host, typically resource pressure (CPU, disk I/O, or memory).

First, log in to the node and check whether kubelet and docker show any anomalies. Their logs contain no obvious errors, but top shows that dockerd's CPU usage is extremely high.
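A minimal way to pull those logs (assuming kubelet and docker both run as systemd units, as they do on this node) is journalctl; the top snapshot that exposed dockerd follows right after.

# recent kubelet / docker logs on the node
journalctl -u kubelet --since "30 min ago" --no-pager | tail -n 100
journalctl -u docker --since "30 min ago" --no-pager | tail -n 100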

[root@tj1-staging-com-ocean007-201812 ~]# top
top - 17:55:00 up 265 days,  3:41,  1 user,  load average: 10.71, 11.34, 10.76
Tasks: 816 total,   5 running, 811 sleeping,   0 stopped,   0 zombie
%Cpu(s): 24.0 us, 34.5 sy,  0.0 ni, 41.4 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 65746380 total, 20407940 free, 11007040 used, 34331400 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 49134416 avail Mem 
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                      
 115483 root      20   0 3965212 273188  34564 S 489.7  0.4 382260:40 dockerd                                                                                                                                      
1367523 root      20   0   18376   2972   2716 R  66.9  0.0  20163:45 bash                                                                                                                                         
1367487 root      20   0   11856   5616   4512 S  54.0  0.0  16748:26 containerd-shim                                                                                                                              
3200169 root      20   0    1300      4      0 R  53.3  0.0  14913:49 sh                                                                                                                                           
2429952 root      20   0    1300      4      0 S  49.3  0.0   9620:56 sh                                                                                                                                           
3200130 root      20   0    9392   4756   3884 S  47.7  0.0  13417:30 containerd-shim                                                                                                                              
3718475 root      20   0    1300      4      0 R  47.4  0.0   8600:20 sh                                                                                                                                           
3718440 root      20   0   10736   5516   4512 S  42.1  0.0   7575:31 containerd-shim                                                                                                                              
2429917 root      20   0   11856   5556   4512 S  40.1  0.0   8313:22 containerd-shim                                                                                                                              
3205493 root      20   0 3775924 230996  66704 S  18.9  0.4   2559:07 kubelet                                                                                                                                      
      1 root      20   0  195240 157000   3932 S   7.9  0.2   1417:46 systemd                                                                                                                                      
    804 dbus      20   0   30308   6460   2464 S   1.7  0.0 462:18.84 dbus-daemon                                                                                                                                  
1011737 root      20   0  277656 122788  18428 S   1.3  0.2 768:03.00 cadvisor                                                                                                                                     
 115508 root      20   0 7139200  32896  24288 S   1.0  0.1 662:25.27 containerd                                                                                                                                   
    806 root      20   0   24572   3060   2480 S   0.7  0.0 171:22.52 systemd-logind                                                                                                                               
 511080 root       0 -20 2751348  52552  15744 S   0.7  0.1 178:27.51 sagent                                                                                                                                       
1102507 root      20   0   11792   7292   4512 S   0.7  0.0  23:36.37 containerd-shim                                                                                                                              
1272223 root      20   0  164800   5296   3824 R   0.7  0.0   0:00.38 top                                                                                                                                          
2866292 root      20   0 5045000 1.983g   3080 S   0.7  3.2 230:09.47 redis

At the same time, system CPU usage is abnormally high:

%Cpu(s): 24.0 us, 34.5 sy,  0.0 ni, 41.4 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st

Based on past experience, this is usually caused by certain containers. In the top output, a few sh processes stand out with relatively high CPU usage.
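If you prefer to capture this non-interactively (for example from a script that snapshots evidence before cleanup), something like the following ps one-liner works; note that ps reports %CPU averaged over the process lifetime, which is good enough here since the loops had been running for days.

# top CPU consumers; %CPU is averaged over each process's lifetime
ps -eo pid,ppid,%cpu,etime,comm --sort=-%cpu | head -n 15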

ps shows that the process is, surprisingly, an infinite loop:

[root@tj1-staging-com-ocean007-201812 ~]# ps -ef |grep 1367523
root     1287628 1247781  0 17:55 pts/1    00:00:00 grep --color=auto 1367523
root     1367523 1367504 72 Feb28 ?        14-00:04:17 /bin/bash -c while true; do echo hello; done

Use /proc/<pid>/cgroup to locate the corresponding container:

# cat /proc/1367523/cgroup
11:freezer:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
10:devices:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
9:hugetlb:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
8:blkio:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
7:memory:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
6:perf_event:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
5:cpuset:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
4:pids:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
3:net_cls,net_prio:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
2:cpu,cpuacct:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
1:name=systemd:/kubepods/besteffort/pod55d3adf2-67f7-11ea-93f2-246e968203b8/29842d5544b701dbb5ff647dba19bb4ebec821edc6ee1ffbd7aeee58fa5038fd
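Note that the cgroup path already embeds the pod UID (pod55d3adf2-67f7-11ea-...), so as a cross-check you can map it straight to a pod without going through docker at all; a rough sketch:

# match the UID from the cgroup path against pod metadata
kubectl get pods --all-namespaces \
  -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,UID:.metadata.uid \
  | grep 55d3adf2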

Then find the corresponding container:

docker ps | grep 29842d554
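To get from the container ID back to the pod that needs cleaning up, the labels that the kubelet's docker integration puts on each container are convenient (the exact label names are worth double-checking on your Docker/kubelet version); roughly:

# namespace/name of the owning pod, read from the kubelet-set container labels
docker inspect --format '{{ index .Config.Labels "io.kubernetes.pod.namespace" }}/{{ index .Config.Labels "io.kubernetes.pod.name" }}' 29842d554

# then delete (or scale down) the offending pod, e.g.
# kubectl delete pod -n <namespace> <pod-name>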

After cleaning up the offending pods, the system returned to normal:

top - 18:25:57 up 265 days,  4:12,  1 user,  load average: 1.05, 1.24, 4.02
Tasks: 769 total,   1 running, 768 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.7 us,  0.9 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 65746380 total, 22106960 free, 10759860 used, 32879560 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 49401576 avail Mem 
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                      
3205493 root      20   0 3775924 229844  66704 S   9.9  0.3   2563:18 kubelet                                                                                                                                      
 115483 root      20   0 3965468 249124  34564 S   7.9  0.4 382323:36 dockerd                                                                                                                                      
      1 root      20   0  195240 157000   3932 S   6.3  0.2   1419:48 systemd                                                                                                                                      
    804 dbus      20   0   30308   6460   2464 S   2.0  0.0 462:51.51 dbus-daemon                                                                                                                                  
3085322 root      20   0 12.045g 1.578g  19028 S   1.3  2.5 767:51.19 java                                                                                                                                         
 115508 root      20   0 7139200  32264  24288 S   1.0  0.0 662:42.18 containerd                                                                                                                                   
 511080 root       0 -20 2751348  42116  15744 S   1.0  0.1 178:44.79 sagent                                                                                                                                       
1011737 root      20   0  277656 111836  18428 S   1.0  0.2 768:49.01 cadvisor                                                                                                                                     
1523167 root      20   0  164800   5436   4012 R   0.7  0.0   0:00.04 top                                                                                                                                          
3199459 root      20   0 1554708  43668   9496 S   0.7  0.1  28:50.60 falcon-agent                                                                                                                                 
      7 root      20   0       0      0      0 S   0.3  0.0 619:07.64 rcu_sched                                                                                                                                    
    806 root      20   0   24572   3060   2480 S   0.3  0.0 171:33.69 systemd-logind                                                                                                                               
  11921 root      20   0   94820  20480   5840 S   0.3  0.0   1402:42 consul                                                                                                                                       
 575838 root      20   0  411464  17092   7364 S   0.3  0.0  15:16.25 python                                                                                                                                       
 856593 root      20   0 1562392  37912   9612 S   0.3  0.1  21:34.23 falcon-agent                                                                                                                                 
 931957 33        20   0   90728   3392   1976 S   0.3  0.0   0:51.23 nginx                                                                                                                                        
1212186 root      20   0       0      0      0 S   0.3  0.0   0:01.12 kworker/14:1                                                                                                                                 
1726228 root      20   0    9392   4496   3808 S   0.3  0.0   0:00.67 containerd-shim                                                                                                                              
1887128 root      20   0  273160   7932   3128 S   0.3  0.0  46:05.23 redis-server                                                                                                                                 
2788111 root      20   0  273160   6300   3080 S   0.3  0.0  25:18.55 redis-server                                                                                                                                 
3199297 root      20   0 1563160  44812   9624 S   0.3  0.1  31:13.73 falcon-agent     

Sandbox creation can fail for all sorts of reasons, for example [an exception triggered by a bad memory setting][1] or [a dockerd problem][2].

In this case, the problem was that some test pods had been started with while true; do echo hello; done. The endless echo floods the container's stdout, which generates a huge number of read() system calls on the host (dockerd and containerd-shim continuously read the container's output to handle its logs, which matches their high CPU usage in top), so CPU usage soars. With several such pods on the node, the system was kept so busy that it could no longer serve other requests properly.
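To double-check where those system calls come from, one option is to attach strace briefly to dockerd (or to one of the busy containerd-shim processes) and read the per-syscall summary; a rough sketch (strace adds real overhead, so detach after a few seconds):

# attach for a few seconds, then Ctrl-C; -c prints a per-syscall count/time summary
strace -f -c -p "$(pidof dockerd)"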

Problems like this are hard to catch at pod-creation time; in practice the way to surface them quickly is to add node-level alerts (dockerd CPU usage, node cpu.sys usage).
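What exactly such a check looks like depends on the monitoring stack (this cluster runs falcon-agent); as a stack-neutral sketch, the two numbers the alert needs can be sampled with standard tools (pidstat comes from the sysstat package):

# dockerd CPU usage, averaged over five 1-second samples
pidstat -u -p "$(pidof dockerd)" 1 5

# node-wide system CPU share; column 14 of vmstat's output is "sy"
vmstat 1 6 | tail -n 5 | awk '{sum += $14} END {printf "%.1f\n", sum / NR}'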

[1]: https://github.com/kubernetes/kubernetes/issues/56996
[2]: https://plugaru.org/2018/05/21/pod-sandbox-changed-it-will-be-killed-and-re-created/
