Topics Covered: Reliable Data Transfer, TCP Congestion Control, QUIC
TCP must guarantee reliable data transfer, meaning we want to ensure that data sent by a sender is actually received and can be reassembled in the correct order.
We can see the functions from the BSD socket API. The TCP server calls listen() to mark its socket as passive, then a blocking accept(), which receives a connection initiated by the TCP client's connect(). What really happens?
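Before diving into the packets, here is a minimal sketch of those BSD socket calls in Python. The port number and payload are arbitrary choices for this example; a thread plays the server so the whole exchange runs in one script.

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 50007   # assumed free local port for this sketch
ready = threading.Event()

def server() -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)                 # mark the socket as passive
        ready.set()                   # let the client know it can connect
        conn, _ = srv.accept()        # blocks until a client connect() arrives
        with conn:
            conn.sendall(conn.recv(1024))   # echo the payload back

t = threading.Thread(target=server)
t.start()

ready.wait()
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))   # the kernel runs the three-way handshake here
    cli.sendall(b"hello")
    reply = cli.recv(1024)
t.join()
print(reply)                    # b'hello'
```

Note that the handshake described next happens entirely inside the kernel during connect()/accept(); the application never sees the SYN or ACK segments.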
The client will send the server a packet with SYN flag set to 1, along with a random sequence number. Think of the sequence number as some number keeping track of how much data has been sent so far, and acknowledgement number as how much has been received. No data is sent.
Next, the server will send a SYN-ACK (SYN and ACK flags set to 1), which sets the acknowledgement number to the client's sequence number plus one and announces the server's own sequence number. The client, upon receiving it, can assume the connection is established.
Finally, the client will ACK the server's SYN-ACK, setting the acknowledgement number to the server's sequence number plus one, and its own sequence number to the client's initial sequence number plus one. This segment can contain data, whereas the first two will not. Once the server receives this ACK, it can assume the connection is established.
Why does the client need to ACK the server's SYN-ACK before the server can assume the connection is established? In the real world, packets are lost in transit for various reasons, or time out. If a certain duration passes without an acknowledgement, the sender can assume its packet was lost (dropped). If the server's SYN-ACK is lost but the server assumes a connection is set up, that assumption is wrong.
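The sequence/acknowledgement arithmetic of the handshake can be traced with a toy example. The initial sequence numbers (ISNs) below are made up for illustration; real stacks pick them randomly.

```python
# Hypothetical three-way handshake trace with invented ISNs.
client_isn, server_isn = 1000, 5000

# 1. SYN: client announces its ISN; no data, no acknowledgement yet.
syn = {"flags": "SYN", "seq": client_isn}

# 2. SYN-ACK: server announces its own ISN and acknowledges the
#    client's SYN by setting ack = client ISN + 1.
syn_ack = {"flags": "SYN-ACK", "seq": server_isn, "ack": syn["seq"] + 1}

# 3. ACK: client acknowledges the server's SYN (server ISN + 1);
#    its own sequence number has advanced one past its SYN.
ack = {"flags": "ACK", "seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}

print(syn, syn_ack, ack, sep="\n")
```

The SYN and the server's SYN each consume one sequence number even though they carry no data, which is why both acknowledgements are "plus one".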
TCP uses cumulative ACK (acknowledgement). This means that the receiver will acknowledge one more than the highest sequence number received in order. So if the sender sends packets with sequence numbers 1, 2, 3, 4, then the receiver sends ACK 5.
Let's trace out what happens in the above example of something called fast retransmit. Our initial sequence number (measured in bytes) is 92, and the sender A transmits 500 bytes at a time. B (the receiver) acknowledges packets as they arrive. The second packet (SEQ 592) is lost. Since the next in-order byte B expects is 592, B will continue to ACK 592 until it receives the segment with SEQ 592 from A, while A is expecting ACK 1092 for that segment. After 3 duplicates (so 4 ACKs with acknowledgement number 592) are received, A assumes that the second packet was lost, and immediately retransmits it.
Why do we use fast retransmit over the alternative? The alternative is to wait for the lost packet to time out, which is usually much longer. See lecture 7 slides 18-23 for how we dynamically estimate the timeout using Karn's algorithm. In almost all cases, three duplicate ACKs are a strong enough signal that the sent packet was lost, and we get a boost in performance.
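As a quick sketch of the timeout estimation referenced above: the standard approach (RFC 6298 style) keeps an exponentially weighted moving average of the RTT plus a deviation term, and Karn's algorithm simply refuses to take RTT samples from retransmitted segments. The sample values below are invented.

```python
# EWMA timeout estimator sketch; constants are the RFC-recommended ones.
ALPHA, BETA = 0.125, 0.25

est_rtt, dev_rtt = 0.100, 0.025          # seconds; assumed starting estimates
for sample in [0.110, 0.090, 0.300]:     # invented RTT samples (seconds)
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample - est_rtt)
    est_rtt = (1 - ALPHA) * est_rtt + ALPHA * sample
    timeout = est_rtt + 4 * dev_rtt      # safety margin of four deviations
    print(f"sample={sample:.3f}  est_rtt={est_rtt:.3f}  timeout={timeout:.3f}")
```

Notice how the one large sample (0.300) inflates the deviation term, pushing the timeout well above the average RTT, which is exactly why waiting for a timeout is so much slower than fast retransmit.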
What happens once B receives the correct packet? B will have stored the packets it already received out of order (the 3rd, 4th, 5th packets) and will respond with ACK 2592 (since B has now received all three of the aforementioned packets). A will then proceed transmitting in order.
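The whole fast-retransmit trace can be replayed with a toy model of cumulative ACKs. The segment size, sequence numbers, and loss pattern match the example above.

```python
# Toy replay: A sends 500-byte segments starting at seq 92;
# the second segment (seq 592) is dropped on its first attempt.
SEG = 500
sent = [92, 592, 1092, 1592, 2092]    # sequence numbers A transmits
lost = {592}                          # segment lost in transit (first try)

received, acks = set(), []
expected = 92                         # next in-order byte B wants
for seq in sent:
    if seq in lost:
        continue                      # dropped; B never sees it
    received.add(seq)
    while expected in received:       # advance past in-order data
        expected += SEG
    acks.append(expected)             # cumulative ACK

print(acks)                           # [592, 592, 592, 592]
dupes = acks.count(acks[-1]) - 1      # 3 duplicate ACKs -> fast retransmit

received.add(592)                     # A retransmits the lost segment
while expected in received:
    expected += SEG
acks.append(expected)
print(acks[-1])                       # 2592: B now has everything through 2591
```

The single ACK 2592 after the retransmission shows the payoff of cumulative ACKs: B does not have to acknowledge the buffered segments one by one.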
This situation is trickier than the connection setup. Imagine you are a long distance from a friend, and you are yelling to communicate, but there is background noise so once in a while you may not hear them. You (the client) want to end the conversation with your friend (the server).
First you tell your friend that you have nothing more to say (send FIN). Your friend acknowledges that you're done talking, responds with FIN-ACK. Your friend then finishes their story and then tells you they are done too (FIN). You can now acknowledge they are done in the same way (FIN-ACK), and you and your friend can end the conversation… or can you?
How do you know that your friend heard your acknowledgement that they are done? If they didn't hear it, then they think you aren't aware, and they will keep trying to tell you that they are done, but maybe you've walked away by then. How do you solve this problem?
You can wait for a long time (twice the max segment lifetime) to make sure you don't hear anything coming from your friend, and if they are silent that means they probably walked away, and you can too.
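The client side of this teardown can be summarized as a tiny state machine. The state names below follow the TCP specification; the event names are informal labels for this sketch.

```python
# Client-side teardown states as a toy state machine.
transitions = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",    # we ACK the peer's FIN here
    ("TIME_WAIT", "2*MSL elapsed"): "CLOSED",   # linger in case our ACK was lost
}

state = "ESTABLISHED"
for event in ["send FIN", "recv ACK", "recv FIN", "2*MSL elapsed"]:
    state = transitions[(state, event)]
    print(event, "->", state)
```

The TIME_WAIT state is the "wait twice the max segment lifetime" step: if the final ACK was lost, the peer's retransmitted FIN will arrive within that window and can be re-acknowledged.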
Congestion control is mostly formulaic, like timeout estimation, so it's best to see the slides (lecture 8). I will instead try to explain some intuition here for why things are the way they are.
The RcvWindow field of the TCP header informs the sender how much free buffer space the receiver has, which lets the sender transmit no more than will fit in the most recently advertised buffer space. This is how flow control is implemented. Flow control guarantees that the receiver is not overwhelmed with new information. Usually you see the highway analogy online but I wasn't a fan of it when I took this class.
I prefer to think about a water pipe connecting to a sink. My sink can hold a finite amount of water and drains a fraction of that per second. Flow control makes sure the pipe does not dump water into the sink faster than it can drain, preventing my sink from overflowing. So flow control is concerned with the receiver and how it deals with incoming information. Meanwhile, congestion control deals with how much water goes through the pipe to reach my sink. If too much is pumped into the pipe at once, it can burst and the water never reaches my sink in the first place. Therefore congestion control is concerned with the network medium, specifically avoiding overuse and underuse.
cwnd starts at 1 MSS (segment = TCP lingo for packet). For every ACK received, increase cwnd by one MSS. This doesn't sound like much, but notice the above example from the Data Transfer section: as the sender's window grows, the number of ACKs per round trip grows as well, so cwnd doubles every RTT, making it a multiplicative increase. Once past a threshold ssthresh, we move to congestion avoidance (additive increase), in which cwnd grows by one MSS each RTT (one MSS/cwnd per ACK). Essentially, this means that we increase by one every time we go through all the packets in the window at a given moment in time.
What's the congestion control part of all this? If we receive 3 duplicate ACKs, then we halve cwnd. There's the MD in AIMD (Additive Increase, Multiplicative Decrease). And we fast retransmit the lost packet as before. If we time out, then we reduce cwnd to 1 MSS. What we aim to do is reach a threshold very quickly (moderate usage of the network), then increase cautiously. Once we start to see packet loss, we assume it is due to congestion, and we quickly decrease the number of packets we send out at a given time.
There are fairness issues with congestion control. Since it is not a requirement, some implementations can choose to forgo it and take advantage of network resources at the expense of others. We can technically have a tragedy-of-the-commons issue. But as long as all clients adhere to the protocol, it's mostly fine.
HTTP/3 uses QUIC. QUIC is an alternative to TCP, and technically sits within the transport layer, but is built on top of UDP.
Briefly, here are the problems with TCP which HTTP/2 and HTTP/3 sought to solve: head-of-line blocking (a single lost TCP segment stalls every stream multiplexed over the connection) and the latency of running the TCP and TLS handshakes back to back before any data can flow.
Developed at Google, QUIC solves the issues mentioned above: streams are delivered independently, and the transport and cryptographic (TLS 1.3) handshakes are combined.
Without restating the lecture 10 slides, the ideas for congestion control, flow control, and reliable delivery are very similar to those of TCP. The only real difference is that sequence numbers are on the QUIC packets (multiple per UDP packet, remember), congestion control is done with the packet number, and flow control with the stream frame sequence number.
The "hardest" part about QUIC is familiarizing yourself with the terminology and the new abstractions. Once those are figured out, the material from the TCP lectures translates over very naturally. Learning older protocols helps you develop and understand new ones, since they all solve the same fundamental problems!