I have a small application that sends files over the network to an agent running on Windows.

When the application runs on Windows, everything works fine: the communication is OK and all the files are copied successfully.

But when the application runs on Linux (RedHat 5.3, with the receiver still on Windows), the Wireshark network trace shows TCP Zero Window and TCP Window Full messages every 1-2 seconds. The agent then closes the connection after a few minutes.

The Windows and Linux code is almost identical and fairly simple. The only non-trivial operation is a setsockopt call that sets SO_SNDBUF to 0xFFFF. Removing that call didn't help.

Can someone please help me with this issue?

EDIT: adding the sending code. It appears to handle partial writes properly:

int totalSent=0;
while(totalSent != dataLen)
{
    int bytesSent 
        = ::send(_socket,(char *)(data+totalSent), dataLen-totalSent, 0);
    if (bytesSent ==0) {
        return totalSent;
    }
    else if(bytesSent == SOCKET_ERROR){
#ifdef __WIN32
        int errcode = WSAGetLastError();
        if( errcode==WSAEWOULDBLOCK ){
#else
        if ((errno == EWOULDBLOCK) || (errno == EAGAIN)) {
#endif
            // nothing here: the loop simply retries the send
        }
        else{
            if( !totalSent ) {
                totalSent = SOCKET_ERROR;
            }
            break;
        }
    }
    else{
        totalSent+=bytesSent;
    }
}

Thanks in advance.

More details? Is the file being transferred successfully, only at a slower rate, or is the transfer failing? If it's failing, where is it failing? Is anything getting across or is it failing halfway through? – Robert S. Barnes Aug 8, 2010 at 8:08

@Robert, thanks. The transfer fails. If I transfer a folder containing, for example, 2 GB of 3 KB - 50 KB files, it sometimes transfers ~0.5 GB, sometimes ~1.3 GB of data, and then fails. – rkellerm Aug 8, 2010 at 8:36

What error messages are you getting, and which side is shutting down the connection? Are you using blocking or non-blocking I/O? Do you have a dedicated thread doing I/O? The more details the better, and if you could post code fragments that would be the best. – Robert S. Barnes Aug 8, 2010 at 9:06

What is ::send(...)? Is this a member of your class which wraps the standard send(...) function? – Robert S. Barnes Aug 8, 2010 at 11:59

Can you post the receiving code too? It sounds like data may not be getting pulled off at the receiving end. – SimonJ Aug 8, 2010 at 12:04

Without seeing your code, I'll have to guess.

You get a Zero Window in TCP when there is no room left in the receiver's recv buffer.

There are a number of ways this can occur. One common cause is sending over a LAN or other relatively fast network connection when one computer is significantly faster than the other. As an extreme example, say you've got a 3 GHz computer sending as fast as possible over Gigabit Ethernet to a machine running a 1 GHz CPU. Since the sender can send much faster than the receiving application is able to read, the receiver's recv buffer fills up, and its TCP stack advertises a Zero Window to the sender.

This can cause problems on both the sending and receiving sides if they're not both ready to deal with it. On the sending side, the send buffer can fill up, causing calls to send either to block or, if you're using non-blocking I/O, to fail with EAGAIN/EWOULDBLOCK. On the receiving side, you could be spending so much time on I/O that the application has no chance to process any of its data, giving the appearance of being locked up.
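To make the sender-side symptom concrete, here is a minimal sketch, assuming Linux and using a UNIX-domain socket pair as a stand-in for the TCP connection (so no real Zero Window is advertised, but the application sees the same thing). The helper names set_nonblocking and fill_until_would_block are made up for illustration. The peer never reads, so a non-blocking sender eventually gets EAGAIN/EWOULDBLOCK:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: keep sending until the kernel refuses more data.
 * Returns the number of bytes queued before EAGAIN/EWOULDBLOCK,
 * or -1 on any other error. */
long fill_until_would_block(int fd) {
    char chunk[4096];
    long queued = 0;

    assert(fd >= 0);
    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof chunk, MSG_NOSIGNAL);
        if (n > 0) {
            queued += n;          /* accepted into the send buffer */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;             /* interrupted by a signal: retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return queued;        /* buffers are full, the peer is not reading */
        return -1;                /* unexpected error */
    }
}
```

Once the peer drains its side, sends start succeeding again, which is roughly what happens on a TCP connection when the receiver reopens its window.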

From some of your answers and code, it sounds like your app is single-threaded and you're trying to do non-blocking sends for some reason. I assume you're setting the socket to non-blocking in some other part of the code.

Generally, I would say that this is not a good idea. Ideally, if you're worried about your app hanging on a send(2), you should set a long timeout on the socket using setsockopt and use a separate thread for the actual sending.

See socket(7):

SO_RCVTIMEO and SO_SNDTIMEO Specify the receiving or sending timeouts until reporting an error. The parameter is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default) then the operation will never timeout.
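As a rough sketch of what setting that option might look like (the helper name set_send_timeout_ms and the millisecond parameter are my own, not part of any API):

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: bound how long a blocking send() may wait.
 * Returns 0 on success, -1 with errno set on failure. */
int set_send_timeout_ms(int fd, long ms) {
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
}
```

With this in place, a send on a blocking socket that cannot make progress within the timeout returns -1 with errno set to EAGAIN or EWOULDBLOCK, as the man page excerpt describes. A value of 0 restores the default of never timing out.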

Your main thread can push each file descriptor into a queue (using, say, a Boost mutex to guard queue access), then start 1 to N threads to do the actual sending using blocking I/O with send timeouts.

Your send function should look something like this (assuming you're setting a timeout):

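// blocking send; the timeout is handled by the caller reading errno on a short send
#include <errno.h>
#include <sys/socket.h>
ssize_t doSend(int s, const void *buf, size_t dataLen) {
    const char *data = (const char *)buf;
    size_t totalSent = 0;
    while (totalSent != dataLen) {
        ssize_t bytesSent
            = send(s, data + totalSent, dataLen - totalSent, MSG_NOSIGNAL);
        if (bytesSent < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: retry the send
            break;          // timeout or other error: report a short send
        }
        totalSent += (size_t)bytesSent;
    }
    return (ssize_t)totalSent;
}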

The MSG_NOSIGNAL flag ensures that your application isn't killed by SIGPIPE when writing to a socket that's been closed or reset by the peer. Sometimes I/O operations are interrupted by signals, and checking for EINTR allows you to restart the send.

Generally, you should call doSend in a loop with chunks of data that are of TCP_MAXSEG size.

On the receive side you can write a similar blocking recv function using a timeout in a separate thread.
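A receive-side counterpart might look roughly like the sketch below. The names set_recv_timeout_ms and doRecv are my own, and resetting errno on entry is a design choice I'm assuming so the caller can tell a timeout (errno is EAGAIN/EWOULDBLOCK) from the peer closing the connection (errno stays 0):

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: bound how long a blocking recv() may wait. */
int set_recv_timeout_ms(int fd, long ms) {
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Blocking receive of up to dataLen bytes. Returns the number of bytes
 * received; on a short return the caller inspects errno, which is 0 if
 * the peer closed the connection and EAGAIN/EWOULDBLOCK on a timeout. */
ssize_t doRecv(int s, void *buf, size_t dataLen) {
    char *data = (char *)buf;
    size_t total = 0;

    errno = 0;
    while (total != dataLen) {
        ssize_t n = recv(s, data + total, dataLen - total, 0);
        if (n == 0)
            break;              /* orderly shutdown by the peer */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            break;              /* timeout or other error */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

The important part is that the receiver keeps calling recv so the kernel buffer drains. If the receiving thread stalls, the window closes no matter how the sender is written.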

Thanks for this post. It is very informative especially the MSG_NOSIGNAL which I think is my problem on one of my applications. – kuchi Feb 1, 2012 at 9:37

A common mistake when developing with TCP sockets is making incorrect assumptions about read()/write() behavior.

When you perform a read or write operation, you must check the return value: the call may not have read or written the requested number of bytes. You usually need a loop to keep track and make sure all of the data was transferred.
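A minimal sketch of such loops, assuming POSIX read()/write() on a blocking descriptor (write_all and read_all are illustrative names, not standard functions):

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes. Returns 0 on success, -1 on error (errno set). */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = (const char *)buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                 /* skip past what was actually written */
        len -= (size_t)n;
    }
    return 0;
}

/* Read until len bytes have arrived or EOF is hit. Returns the number of
 * bytes read (short only on EOF), or -1 on error (errno set). */
ssize_t read_all(int fd, void *buf, size_t len) {
    char *p = (char *)buf;
    size_t total = 0;

    while (total < len) {
        ssize_t n = read(fd, p + total, len - total);
        if (n == 0)
            break;              /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

Both functions treat a short read or write as normal and simply continue from where the last call left off.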

FYI, in Java the read and write methods have void return types. How can you check the return value? – Md. Alif Al Amin Dec 8, 2021 at 5:31

I tried disabling Nagle's algorithm (with TCP_NODELAY), and somehow it helped. The transfer rate is much higher, and the TCP window is no longer reported as full or reset. The strange thing is that changing the window size had no impact at all.
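For reference, disabling Nagle on a TCP socket looks roughly like this (a sketch, not my exact code; disable_nagle is just an illustrative name):

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn off Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 with errno set on failure. */
int disable_nagle(int fd) {
    int on = 1;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}
```

The option applies per socket, so it has to be set on each connection you open.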

Thank you.

That's really odd. Typically disabling Nagle is only useful for real-time apps where you want very low latency at the expense of wasting a lot of bandwidth. Disabling it for bulk file transfer seems counter-intuitive. Have you actually tested and seen objectively that disabling Nagle is what makes the difference? Maybe some other change you made could be responsible? – Robert S. Barnes Sep 20, 2010 at 10:29

@Robert S. Barnes: That's really odd, I agree. But this is the only change that was made, and it helped. Moreover, the receiver side has already disabled Nagle. I know that it may point to an underlying fundamental problem that is hiding somewhere, waiting to jump out and bite at another time. But as a workaround it is good enough. – rkellerm Sep 20, 2010 at 10:48
