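Today I ran into an HTTP error whose error code was 'ECONNRESET'. I read through a lot of material but still found the explanations unclear, so I decided to capture the TCP traffic and look at it directly. The capture tool I used is tcpdump, and this post records how I analyzed the problem from the capture.
1 Start a Node.js HTTP server;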
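const http = require("http")
// Keep-alive agent for client requests (not used by the server itself)
const agent = new http.Agent({ keepAlive: true })
// Server: answer every request on port 8083 with "hello world"
http
  .createServer((req, res) => {
    res.write("hello world")
    res.end()
  })
  .listen(8083)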
2 Capture traffic on port 8083 with tcpdump;
tcpdump tcp port 8083 -X
3 Send a request from the client;
curl host:8083
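4 tcpdump output for one ordinary HTTP request (three-way handshake to open the connection, data transfer, then the FIN exchange to close it; here the server's ACK and FIN travel in the same segment, so the usual four-step teardown shows up as three packets)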
03:01:42.256536 IP 202.105.123.52.51392 > 66.42.50.27.vultr.com.us-srv: Flags [S], seq 3324167342, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 718175066 ecr 0,sackOK,eol], length 0
03:01:42.256634 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.51392: Flags [S.], seq 3571998284, ack 3324167343, win 28960, options [mss 1460,sackOK,TS val 787904822 ecr 718175066,nop,wscale 7], length 0
03:01:42.509613 IP 202.105.123.52.51392 > 66.42.50.27.vultr.com.us-srv: Flags [.], ack 1, win 2058, options [nop,nop,TS val 718175317 ecr 787904822], length 0
03:01:42.509729 IP 202.105.123.52.51392 > 66.42.50.27.vultr.com.us-srv: Flags [P.], seq 1:81, ack 1, win 2058, options [nop,nop,TS val 718175317 ecr 787904822], length 80
03:01:42.509751 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.51392: Flags [.], ack 81, win 227, options [nop,nop,TS val 787905075 ecr 718175317], length 0
03:01:42.577114 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.51392: Flags [P.], seq 1:130, ack 81, win 227, options [nop,nop,TS val 787905142 ecr 718175317], length 129
03:01:42.836414 IP 202.105.123.52.51392 > 66.42.50.27.vultr.com.us-srv: Flags [.], ack 130, win 2056, options [nop,nop,TS val 718175643 ecr 787905142], length 0
03:01:42.836468 IP 202.105.123.52.51392 > 66.42.50.27.vultr.com.us-srv: Flags [F.], seq 81, ack 130, win 2056, options [nop,nop,TS val 718175643 ecr 787905142], length 0
03:01:42.837813 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.51392: Flags [F.], seq 130, ack 82, win 227, options [nop,nop,TS val 787905403 ecr 718175643], length 0
03:01:43.088383 IP 202.105.123.52.51392 > 66.42.50.27.vultr.com.us-srv: Flags [.], ack 131, win 2056, options [nop,nop,TS val 718175893 ecr 787905403], length 0
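To make the seq and ack numbers in the capture above easier to follow, here is a small sketch of the arithmetic using tcpdump's relative sequence numbers. The helper name `nextAck` is made up for illustration; it only encodes the rule that the expected ack is the starting sequence number plus the payload length, with SYN and FIN each consuming one extra sequence number.

```javascript
// Hypothetical helper: given the relative sequence number a segment starts at,
// its payload length, and whether it carries SYN or FIN, return the ack number
// the peer is expected to send back.
function nextAck(seqStart, length, { syn = false, fin = false } = {}) {
  return seqStart + length + (syn ? 1 : 0) + (fin ? 1 : 0)
}

// Client SYN at relative seq 0: the handshake ACK carries ack 1
console.log(nextAck(0, 0, { syn: true }))   // 1
// Client request, seq 1:81, length 80: server replies with ack 81
console.log(nextAck(1, 80))                 // 81
// Server response, seq 1:130, length 129: client replies with ack 130
console.log(nextAck(1, 129))                // 130
// Client FIN at seq 81, no payload: server replies with ack 82
console.log(nextAck(81, 0, { fin: true }))  // 82
// Server FIN at seq 130, no payload: client replies with ack 131
console.log(nextAck(130, 0, { fin: true })) // 131
```

Each value matches the ack field of the packet that follows in the capture.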
TCP Flags
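CWR, ECE These two flags work together for congestion control (Explicit Congestion Notification) and usually have little to do with the application layer. When the receiver sets ECE (ECN-Echo) to 1 in a packet it sends back, it is telling the sender that congestion was encountered on the path; the sender then sets CWR (Congestion Window Reduced) to 1 to show that it received the congestion signal and acted on it. We will focus on the other six flags.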
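URG URG stands for Urgent. It marks the segment as carrying urgent data that the receiver should handle ahead of ordinary data. For example, when you press ctrl+c in a remote terminal session to stop a task, the packet carrying that command may be sent as urgent.
ACK The familiar acknowledgment: it tells the peer that the data up to the acknowledgment number has been received. TCP often avoids sending a packet just for the ACK and instead sets the ACK bit on the next outgoing packet; this is one of TCP's optimizations, see delayed ACK.
PSH Push. When the receiver gets a packet with the P flag set, it should hand the data to the application layer right away instead of waiting for more. The last packet of an HTTP request usually has the P flag set.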
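RST Reset. The sender of the packet is aborting the connection immediately, without the normal FIN exchange. This typically happens when a packet arrives for a connection that the other side has already closed or that does not exist. A request that completes normally ends with FIN rather than RST, as in the capture above; an RST received on a connection still in use is what the application sees as ECONNRESET.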
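SYN Set when requesting a new connection. The familiar TCP three-way handshake is the combination of the SYN and ACK bits: SYN -> SYN+ACK -> ACK.
FIN Finish. Setting it means the sender has no more data to send and is closing its direction of the connection; the receiver normally replies with an ACK. When the receiver then sends its own FIN in the same way, the connection is closed in both directions.
Note: the "." in tcpdump's flag list is special. It is how tcpdump prints the ACK bit, so [.] is a plain ACK with no other flag set, [S.] is SYN+ACK, [P.] is PSH+ACK, and so on.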
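As a reading aid for the captures in this post, the flag letters tcpdump prints inside `Flags [...]` can be decoded with a few lines of code. This is only a sketch: `FLAG_NAMES` and `decodeFlags` are names invented here, and the letter mapping follows the tcpdump man page (S, F, P, R, U, W, E, and `.` for ACK, or the word `none` when no flag is set).

```javascript
// tcpdump's one-character flag codes mapped to TCP flag names.
const FLAG_NAMES = {
  S: 'SYN', F: 'FIN', P: 'PSH', R: 'RST',
  U: 'URG', W: 'CWR', E: 'ECE', '.': 'ACK',
}

// Hypothetical helper: decode the text inside "Flags [...]",
// e.g. "S." becomes ["SYN", "ACK"].
function decodeFlags(flags) {
  if (flags === 'none') return [] // tcpdump prints "none" when no flag is set
  return [...flags].map((ch) => FLAG_NAMES[ch] || `unknown(${ch})`)
}

console.log(decodeFlags('S').join('+'))  // SYN
console.log(decodeFlags('S.').join('+')) // SYN+ACK
console.log(decodeFlags('P.').join('+')) // PSH+ACK
console.log(decodeFlags('F.').join('+')) // FIN+ACK
console.log(decodeFlags('R.').join('+')) // RST+ACK
```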
Analyzing the abnormal keep-alive capture
// 202.105.123.52 is the client IP, 66.42.50.27 is the server IP
06:33:23.094793 IP 202.105.123.52.49851 > 66.42.50.27.vultr.com.us-srv: Flags [S], seq 324968035, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 725789992 ecr 0,sackOK,eol], length 0
06:33:23.094934 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.49851: Flags [S.], seq 1315577865, ack 324968036, win 28960, options [mss 1460,sackOK,TS val 800605660 ecr 725789992,nop,wscale 7], length 0
06:33:23.408659 IP 202.105.123.52.49851 > 66.42.50.27.vultr.com.us-srv: Flags [.], ack 1, win 2058, options [nop,nop,TS val 725790304 ecr 800605660], length 0
06:33:23.409495 IP 202.105.123.52.49851 > 66.42.50.27.vultr.com.us-srv: Flags [P.], seq 1:67, ack 1, win 2058, options [nop,nop,TS val 725790304 ecr 800605660], length 66
06:33:23.409526 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.49851: Flags [.], ack 67, win 227, options [nop,nop,TS val 800605975 ecr 725790304], length 0
06:33:23.410437 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.49851: Flags [P.], seq 1:130, ack 67, win 227, options [nop,nop,TS val 800605976 ecr 725790304], length 129
06:33:23.719494 IP 202.105.123.52.49851 > 66.42.50.27.vultr.com.us-srv: Flags [.], ack 130, win 2056, options [nop,nop,TS val 725790613 ecr 800605976], length 0
06:33:24.730787 IP 202.105.123.52.49851 > 66.42.50.27.vultr.com.us-srv: Flags [.], ack 130, win 2056, length 0
06:33:24.730861 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.49851: Flags [.], ack 67, win 227, options [nop,nop,TS val 800607296 ecr 725790613], length 0
06:33:47.113204 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.49851: Flags [.], ack 331, win 227, options [nop,nop,TS val 800629678 ecr 725810270], length 0
06:33:48.102732 IP 202.105.123.52.49851 > 66.42.50.27.vultr.com.us-srv: Flags [P.], seq 331:397, ack 646, win 2048, options [nop,nop,TS val 725814940 ecr 800629678], length 66
06:33:48.102848 IP 66.42.50.27.vultr.com.us-srv > 202.105.123.52.49851: Flags [R.], seq 646, ack 397, win 227, options [nop,nop,TS val 800630668 ecr 725814940], length 0
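Because keep-alive is in use, consecutive requests reuse the same connection instead of closing and re-opening it, until the connection is closed on timeout. In the abnormal capture above, the last two packets, [P.] and [R.], are almost simultaneous: the client sends a new request on the connection, and the server answers with a reset. The server dropped the connection first, and the client, still trying to reuse it, got the ECONNRESET error.
Since the error is caused by the server closing the connection first, we can make the client's timeout shorter than the server's: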
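// Server: advertise the keep-alive timeout to the client (the header value is in seconds)
response.setHeader('Keep-Alive', 'timeout=4')
// Client: the Agent's timeout option is in milliseconds
new http.Agent({ keepAlive: true, timeout: 4000 });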
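The same idea can be sketched with both timeouts set explicitly. The two constants below are assumed values chosen for illustration; `server.keepAliveTimeout` and the Agent's `timeout` option are both in milliseconds, and the only point is that the client's value stays below the server's.

```javascript
const http = require('http')

// Assumed values for illustration: the server keeps an idle connection for 5 s,
// the client stops waiting on an idle socket after 4 s.
const SERVER_KEEP_ALIVE_MS = 5000
const CLIENT_IDLE_TIMEOUT_MS = 4000

const server = http.createServer((req, res) => {
  res.end('hello world')
})
// How long the server waits for the next request on an idle keep-alive socket.
server.keepAliveTimeout = SERVER_KEEP_ALIVE_MS

// Client agent that reuses sockets; `timeout` is the socket timeout in ms.
const agent = new http.Agent({
  keepAlive: true,
  timeout: CLIENT_IDLE_TIMEOUT_MS,
})
```

Pass `{ agent }` in the options of `http.request` or `http.get` so that requests go through this agent. How much margin is needed between the two values depends on network latency, so the 1 s gap here is a guess rather than a recommendation.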
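Because of network latency and similar factors, a small number of these errors can still occur. They can be handled by catching the error and sending the request again (Chrome retries automatically).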
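let req = https.request(...)
// Errors on this request (including ECONNRESET) are delivered here
req.on('error', (err) => {
  console.log(err, 'request error caught')
})
// Last-resort handler for errors that nothing else caught
process.on('uncaughtException', (err) => {
  console.log(err, 'process-level error caught')
})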
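The retry mentioned above could look roughly like this. It is a minimal sketch, not a drop-in solution: `retryOnReset` and `get` are hypothetical helpers written for this post, and retrying is only safe for requests that can be repeated without side effects, such as a plain GET.

```javascript
const http = require('http')

// Hypothetical helper: run `sendRequest` (a function returning a Promise) and
// retry only when it fails with ECONNRESET, up to `maxRetries` extra attempts.
async function retryOnReset(sendRequest, maxRetries = 1) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await sendRequest()
    } catch (err) {
      // Any other error, or running out of retries, is passed to the caller.
      if (err.code !== 'ECONNRESET' || attempt >= maxRetries) throw err
    }
  }
}

// Hypothetical helper: Promise wrapper around http.get that resolves with the body.
function get(url, agent) {
  return new Promise((resolve, reject) => {
    const req = http.get(url, { agent }, (res) => {
      let body = ''
      res.setEncoding('utf8')
      res.on('data', (chunk) => (body += chunk))
      res.on('end', () => resolve(body))
      res.on('error', reject)
    })
    req.on('error', reject)
  })
}

// Usage, assuming the server from step 1 is reachable at host:8083:
// const agent = new http.Agent({ keepAlive: true })
// retryOnReset(() => get('http://host:8083', agent)).then(console.log)
```

With the default `maxRetries` of 1, a request that hits a reset on a stale keep-alive socket is sent one more time; if the second attempt also fails, the error is thrown to the caller.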