TCP Series: accept

0x01 Description

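The figure above shows an abnormal case: the TCP connection has already been established, but an RST segment arrives before the server calls accept. How such an aborted connection is handled depends on the implementation. Berkeley-derived implementations handle the aborted connection entirely within the kernel, and the kernel never passes the connection to the server process. Most SVR4 implementations, however, return an error when the server process calls accept, and the error value depends on the implementation. These SVR4 implementations return EPROTO ("protocol error"), whereas POSIX specifies that ECONNABORTED ("software caused connection abort") be returned.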

0x02 The Problem When Combined with select

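When a new connection is established, select marks the listening socket descriptor as readable, and we can then call accept to obtain the connection. Now consider the following scenario:
1、The client establishes a connection with the server.
2、select in the server process reports the listening socket as readable, but the server is busy with other work and does not call accept immediately.
3、The client aborts the connection and sends an RST segment.
4、The server process now calls accept. What happens? If the listening socket descriptor is blocking, then on Berkeley-derived implementations accept blocks until another connection arrives, while other implementations return an error. If the listening socket descriptor is nonblocking, then Berkeley-derived implementations return EWOULDBLOCK, while other implementations return ECONNABORTED or EPROTO.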

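The fix for this situation is therefore:
1、Always set the listening socket descriptor to nonblocking.
2、When accept returns -1, check whether errno is one of EWOULDBLOCK (Berkeley-derived implementations), ECONNABORTED (POSIX implementations), or EPROTO (SVR4 implementations). If it is, ignore the error and go back to select.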

A side note on EAGAIN and EWOULDBLOCK, as described in the Linux accept(2) manual page: the socket is marked nonblocking and no connections are present to be accepted. POSIX.1-2001 and POSIX.1-2008 allow either error to be returned for this case, and do not require these constants to have the same value, so a portable application should check for both possibilities.
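The two-step fix above, together with the EAGAIN note, can be sketched roughly as follows. The helper names set_nonblocking, accept_error_is_ignorable, and accept_ready are hypothetical and chosen only for illustration, and the return-code convention (-1 for "nothing to accept", -2 for a real failure) is an assumption of this sketch rather than anything prescribed by the text.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>

/* Put a descriptor into nonblocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Returns 1 if an errno value from accept() only means "no usable
 * connection right now", so the caller should simply return to select(). */
int accept_error_is_ignorable(int err)
{
    if (err == EAGAIN || err == EWOULDBLOCK)  /* Berkeley-derived; check both */
        return 1;
    if (err == ECONNABORTED)                  /* POSIX */
        return 1;
    if (err == EPROTO)                        /* SVR4 */
        return 1;
    return 0;
}

/* Call after select() reports listenfd as readable.
 * Returns the connected descriptor on success, -1 if there is nothing
 * to accept (ignorable error), or -2 on a real failure (errno is set).
 * The -1 / -2 convention is an assumption of this sketch. */
int accept_ready(int listenfd)
{
    int connfd = accept(listenfd, NULL, NULL);
    if (connfd >= 0)
        return connfd;
    return accept_error_is_ignorable(errno) ? -1 : -2;
}
```

A server loop would call set_nonblocking on the listening socket once at startup, then call accept_ready each time select reports the socket readable, treating -1 as "go back to select".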

0x03 SO_LINGER Socket Option

This option specifies how the close function operates for a connection-oriented protocol. By default, close returns immediately, but if there is any data still remaining in the socket send buffer, the system will try to deliver the data to the peer. The SO_LINGER socket option lets us change this default. This option requires the following structure to be passed between the user process and the kernel.

struct linger {
    int l_onoff;   /* 0=off, nonzero=on */
    int l_linger;  /* linger time, POSIX specifies units as seconds */
};

Calling setsockopt leads to one of the following three scenarios:
1、If l_onoff is 0, the option is turned off. The value of l_linger is ignored and the previously discussed TCP default applies: close returns immediately.
2、If l_onoff is nonzero and l_linger is zero, TCP aborts the connection when it is closed. That is, TCP discards any data still remaining in the socket send buffer and sends an RST to the peer, not the normal four-packet connection termination sequence. This avoids TCP’s TIME_WAIT state, but in doing so, leaves open the possibility of another incarnation of this connection being created within 2MSL seconds and having old duplicate segments from the just-terminated connection being incorrectly delivered to the new incarnation.
3、If l_onoff is nonzero and l_linger is nonzero, then the kernel will linger when the socket is closed. That is, if there is any data still remaining in the socket send buffer, the process is put to sleep until either: (i) all the data is sent and acknowledged by the peer TCP, or (ii) the linger time expires. If the socket has been set to nonblocking, it will not wait for the close to complete, even if the linger time is nonzero. When using this feature of the SO_LINGER option, it is important for the application to check the return value from close, because if the linger time expires before the remaining data is sent and acknowledged, close returns -1 with errno set to EWOULDBLOCK and any remaining data in the send buffer is discarded.
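As a minimal sketch of the third scenario, the following shows one way to enable lingering and check the result of close. The function names enable_linger and close_checked are hypothetical, and whether close actually reports EWOULDBLOCK when the linger time expires depends on the platform, so the comment below only restates the behavior described above.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Turn SO_LINGER on with the given linger time in seconds.
 * seconds > 0 selects scenario 3; seconds == 0 selects scenario 2 (RST).
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int enable_linger(int sockfd, int seconds)
{
    struct linger ling;

    ling.l_onoff = 1;
    ling.l_linger = seconds;
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));
}

/* Close the socket and report failures instead of ignoring them.
 * Returns 0 on success, -1 on error (errno set by close). */
int close_checked(int sockfd)
{
    if (close(sockfd) == -1) {
        /* Per the description above, EWOULDBLOCK here would mean the
         * linger time expired and unsent data was discarded. */
        perror("close");
        return -1;
    }
    return 0;
}
```

A caller would typically invoke enable_linger(sockfd, 5) right after the connection is set up and use close_checked(sockfd) in place of a bare close.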

What does it actually mean when we call close and it returns? The basic principle here is that a successful return from close, with the SO_LINGER socket option set, only tells us that the data we sent (and our FIN) have been acknowledged by the peer TCP. This does not tell us whether the peer application has read the data. If we do not set the SO_LINGER socket option, we do not know whether the peer TCP has acknowledged the data.

One way for the client to know that the server has read its data is to call shutdown (with a second argument of SHUT_WR) instead of close and wait for the peer to close its end of the connection.
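The shutdown-then-wait approach might look like the sketch below. The function name shutdown_and_wait is hypothetical, and the sketch assumes the peer closes its end only after it has read everything, which is an application-level convention rather than something TCP guarantees.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send our FIN with shutdown(SHUT_WR), then block until the peer closes
 * its end of the connection (read() returns 0).
 * Returns 0 once EOF is seen, -1 on error. The caller still owns sockfd
 * and should close() it afterwards. */
int shutdown_and_wait(int sockfd)
{
    char buf[512];
    ssize_t n;

    if (shutdown(sockfd, SHUT_WR) == -1)
        return -1;

    for (;;) {
        n = read(sockfd, buf, sizeof(buf));
        if (n == 0)
            return 0;              /* peer closed: our data was consumed */
        if (n == -1 && errno != EINTR)
            return -1;             /* real error, e.g. ECONNRESET */
        /* n > 0: late data from the peer, discarded in this sketch */
    }
}
```

After the function returns 0 the client can call close knowing that the peer application, not just the peer TCP, has finished with the connection.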

0x04 Sending an RST Segment

struct linger ling;
ling.l_onoff = 1; /* cause RST to be sent on close() */
ling.l_linger = 0;
setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));
close(sockfd);
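To see the effect of the snippet above from the other end of the connection, here is a rough, self-contained loopback experiment. The function name observe_abortive_close is hypothetical, and the whole setup (a listener on 127.0.0.1 with a kernel-chosen port, client and server in one process) is an assumption made for illustration. On a typical Linux system the server-side read would be expected to fail with ECONNRESET, but treat that as something to verify rather than a guarantee.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connects a client to a local listener, aborts the client with
 * SO_LINGER {on, 0}, and returns the errno that the server-side read()
 * fails with. Returns 0 if read() did not fail, -1 on a setup error. */
int observe_abortive_close(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct linger ling = { .l_onoff = 1, .l_linger = 0 };
    char buf[16];
    int result = -1;
    int connfd = -1;
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    int clientfd = socket(AF_INET, SOCK_STREAM, 0);

    if (listenfd < 0 || clientfd < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */

    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(listenfd, 1) == -1 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) == -1)
        goto out;

    if (connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;
    connfd = accept(listenfd, NULL, NULL);
    if (connfd < 0)
        goto out;

    /* Abortive close on the client side: discard data and send an RST. */
    if (setsockopt(clientfd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling)) == -1)
        goto out;
    close(clientfd);
    clientfd = -1;

    /* The server side should now see an error instead of a clean EOF. */
    result = (read(connfd, buf, sizeof(buf)) == -1) ? errno : 0;

out:
    if (connfd >= 0)
        close(connfd);
    if (clientfd >= 0)
        close(clientfd);
    if (listenfd >= 0)
        close(listenfd);
    return result;
}
```

Comparing the result with a run where the setsockopt call is removed (read then returns 0, a normal EOF) makes the difference between an RST and the usual FIN sequence visible.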