Once a TCP connection is established, it provides reliable data communication from one end to the other. While the process of establishing a connection is mostly without any problems, connection loss and reconnect scenarios may lead to data loss, caused by a gap between application and kernel on the sending machine's side.

The Basics

When two computers fall in love and decide that they shall have a connection, their love-making-programs may issue the following POSIX commands in order to establish a TCP connection:
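
A minimal sketch of those calls in C, assuming IPv4 on the loopback interface. The helper names open_listener() and connect_loopback() are made up for this illustration; Peer1 is the listening side and Peer2 the connecting side, matching the drawing in the next section.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Peer1: socket(), bind(), listen() on 127.0.0.1 with a kernel-chosen port.
 * Returns the listening descriptor (or -1) and stores the port in *port. */
int open_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Peer2: socket() and connect(); the two kernels exchange SYN, SYN/ACK
 * and ACK on their own. Returns the connected descriptor or -1. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Peer1 would then call accept() on the listening descriptor to obtain the descriptor it can write() to.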

The situation

While TCP ensures that no data is lost within an existing connection, reality provides an opportunity to lose data even when using TCP. To be specific: when using an unreliable TCP connection.
So what is an unreliable TCP connection? In the context of this article, the term describes a TCP connection to a peer which may suddenly disappear (because Alice kill -9ed the corresponding process or Bob physically unplugged the network cable).
But since our service is clever, it offers automatic reconnect. And this reconnect -- that's the gap in which data disappears.
To help understand the situation, a little drawing:

Appli. --[socket()][bind()][listen()]--------[accept()][write()]
                              \              /            \
Peer1: --------------------[ L i s t e n ][ C o n n e c t e d 
                              / \        /                  \               
                            SYN SYNACK ACK                   \
                            /     \    /                      \
Peer2: ---------------[ c o n n e c t ][Connected][... somehow lost...

Time flows this direction -->
In other words, data loss happens when write()ing to a connection which has been interrupted on the side of Peer2 (as shown in the figure) without notification of Peer1. Note that no RST or FIN has been sent to Peer1. The application calls, say, write(fd, buf, 20) and it will return 20 -- because the OS has copied these 20 bytes into its send buffer. The return value of 20 signals ``Got it, dismissed!'' to the application, and it may feel free to free() the buffer or overwrite it or whatever.
And now the problem occurs: the OS tries to transmit these 20 bytes. A TCP packet is sent -- best case, Peer2 will return an ICMP message saying that no one is interested in that packet any more; maybe Peer2's kernel will even decide to answer by resetting the connection. Worst case, Peer1's kernel will just keep on retransmitting those 20 bytes until a timeout occurs and the connection is officially declared dead -- because absolutely no reaction occurred. The next time the application issues a write(), the call returns -1. Along with discovering the lost connection, the kernel may raise SIGPIPE, which by default terminates the application unless it has installed a signal handler or ignores the signal.
In any of these cases, the 20 bytes are lost: the application assumes they were sent to the other peer, but the peer never got them.
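
The sequence can be reproduced on a single machine with a rough sketch like the following. It rests on several assumptions: the function name lost_write_demo() is invented for this example, the connection runs over loopback, and Peer2 "disappears" by way of close(). Unlike a pulled cable, close() does make the kernel send a FIN, so this is only an approximation of the scenario above. What it shows is the part that matters here: the first write() still reports 20, and the failure only surfaces on a later write().

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the experiment ran, -1 on setup failure.
 * *first receives the result of the first write() after Peer2 vanished,
 * *err the errno of the first later write() that failed (0 if none did). */
int lost_write_demo(ssize_t *first, int *err)
{
    struct sockaddr_in addr;
    struct timespec nap = { 0, 20 * 1000 * 1000 };   /* 20 ms */
    socklen_t len = sizeof addr;
    char buf[20];
    int lfd, peer1 = -1, peer2, rc = -1, i;

    signal(SIGPIPE, SIG_IGN);   /* process-wide: report an error, do not die */
    memset(buf, 'x', sizeof buf);
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    peer2 = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || peer2 < 0 ||
        bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0 ||
        connect(peer2, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        (peer1 = accept(lfd, NULL, NULL)) < 0)
        goto out;

    close(peer2);               /* Peer2 goes away */
    peer2 = -1;

    /* The kernel copies the 20 bytes into its send buffer and says "ok". */
    *first = write(peer1, buf, sizeof buf);

    /* Keep writing until the kernel has noticed that nobody is there. */
    *err = 0;
    for (i = 0; i < 50 && *err == 0; i++) {
        nanosleep(&nap, NULL);
        if (write(peer1, buf, sizeof buf) < 0)
            *err = errno;
    }
    rc = 0;
out:
    if (peer1 >= 0) close(peer1);
    if (peer2 >= 0) close(peer2);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

On the systems I would expect this to run on, *first ends up as 20 and *err as EPIPE or ECONNRESET; the bytes of the first write() are never read by anybody.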

You can easily stress-test programs for such behavior by writing a server which issues just socket(), bind(), and listen(): the kernel will SYN/ACK the connection, the other side will write its first data packet, and then you can sit back and just close() the socket. Use Wireshark or tcpdump to observe.
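
Such a test server might look roughly like this. It is a sketch under assumptions: deaf_listen() and deaf_server() are names invented here, and how the kernel treats the never-accepted connections when the listening socket is closed (typically by resetting them) depends on the operating system.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* socket(), bind(), listen() and nothing else: the kernel completes the
 * handshake for incoming connections, but nobody ever accept()s them.
 * Pass port 0 to let the kernel choose. Returns the descriptor or -1. */
int deaf_listen(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow quick restarts of the test server on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Keep the port open for `seconds` so the program under test can connect
 * and write, then close the listening socket. Returns 0 on success. */
int deaf_server(unsigned short port, unsigned int seconds)
{
    int fd = deaf_listen(port);
    if (fd < 0)
        return -1;
    sleep(seconds);
    return close(fd);
}
```

Calling, say, deaf_server(7777, 10) (the port number is arbitrary) gives the program under test ten seconds to connect and write before everything is dropped.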

The reconnect scenario

In the context of an application, it may be desirable to wait for a reconnect and then retransmit the lost sequence. Unfortunately, to do so, the application must know how many bytes were lost. A quick-and-dirty solution seems to be keeping the last used buffer around: in case write() fails, the previous and the current buffer are retransmitted. Unfortunately, that won't work reliably. There is no guarantee that each call to write() results in a packet of its own, and the same holds for read(); assuming otherwise is a common misconception about the socket API. When I send three ten-byte buffers, there is no guarantee that I will receive three ten-byte buffers. I may receive two fifteen-byte buffers or one thirty-byte buffer; TCP just ensures that the data arrives in the correct order and does not get lost, nothing else. So by the time a write() finally fails, more than just the previous buffer may still be sitting undelivered in the kernel.
And now, things start getting ugly. And there is a reason -- the application starts doing stuff applications are not supposed to do. Applications do application stuff like displaying GUIs, harassing users, and torturing the FPU with calculations. Deciding layer-4 network stuff is strictly kernel business. Having this problem is a typical symptom of ``you're doing something you shouldn't be doing!''. A real application would know its data and thus be able to handle such problems without any trouble -- because it would have a session concept, covering a transaction model.
Unfortunately, the world needs programs which do not adhere to such claims. Even sadder, I have to write them. Whatever.
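
The remark about write() and read() boundaries can be made concrete with a small helper. This is only a sketch, and read_exact() is a name made up here: because one read() may return fewer (or more) bytes than the peer passed to one write(), code that needs exactly n bytes has to loop until it has them.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads exactly n bytes from a stream descriptor unless end-of-file or an
 * error intervenes. Returns the number of bytes read (less than n only on
 * end-of-file), or -1 on error. */
ssize_t read_exact(int fd, void *buf, size_t n)
{
    size_t got = 0;

    while (got < n) {
        ssize_t r = read(fd, (char *)buf + got, n - got);
        if (r == 0)
            break;              /* peer closed the stream */
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, try again */
            return -1;
        }
        got += (size_t)r;       /* may be any amount from 1 to n - got */
    }
    return (ssize_t)got;
}
```

Whether the thirty bytes arrive as three chunks, two, or one, read_exact(fd, buf, 30) hands them over in order. Note that this only deals with chunking; it does nothing about bytes that were lost on the way.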

As for my program, I do not know anything about the transaction state -- because my program is just a relay station intended for universal use. And thus I have only two possible ways out of the dilemma:

Yes, those were actually three items. But since one of them is not a real one and one of them is pretty useless, I took the liberty of counting the remaining one twice.
