Topics Covered: Reliable Data Transfer, TCP Congestion Control, QUIC
TCP must guarantee reliable data transfer, meaning we want to ensure that data sent by a sender is actually received and can be reassembled in the correct order.
We can see the functions from the BSD socket API. The TCP server calls listen() to mark its socket as passive, then a blocking accept(), which receives a connection initiated by the TCP client's connect(). What really happens?
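Before diving into the packets, here is a minimal sketch of those BSD socket calls in Python. The port number and payload are arbitrary choices for this example; a thread plays the server so the whole exchange runs in one script.

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 50007   # assumed free local port for this sketch
ready = threading.Event()

def server() -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)                 # mark the socket as passive
        ready.set()                   # let the client know it can connect
        conn, _ = srv.accept()        # blocks until a client connect() arrives
        with conn:
            conn.sendall(conn.recv(1024))   # echo the payload back

t = threading.Thread(target=server)
t.start()

ready.wait()
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))   # the kernel runs the three-way handshake here
    cli.sendall(b"hello")
    reply = cli.recv(1024)
t.join()
print(reply)                    # b'hello'
```

Note that the handshake described next happens entirely inside the kernel during connect()/accept(); the application never sees the SYN or ACK segments.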
The client will send the server a packet with SYN flag set to 1, along with a random sequence number. Think of the sequence number as some number keeping track of how much data has been sent so far, and acknowledgement number as how much has been received. No data is sent.
Next, the server will send a SYN-ACK (SYN and ACK flags set to 1), which sets the acknowledgement number to the client's sequence number plus one and announces the server's own sequence number. The client, upon receiving it, can assume the connection is established.
Finally, the client will ACK the server's SYN-ACK, setting the acknowledgement number to the server's sequence number plus one, and its own sequence number to the client's initial sequence number plus one. This segment can contain data, whereas the first two will not. Once the server receives this ACK, it can assume the connection is established.
Why does the client need to ACK the server's SYN-ACK before the server can assume the connection is established? In the real world, packets are lost in transit for various reasons, or time out. If a certain duration passes without an acknowledgement, the sender can assume its packet was lost (dropped). If the server's SYN-ACK is lost but the server assumes a connection is set up, that assumption is wrong.
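The sequence/acknowledgement arithmetic of the handshake can be traced with a toy example. The initial sequence numbers (ISNs) below are made up for illustration; real stacks pick them randomly.

```python
# Hypothetical three-way handshake trace with invented ISNs.
client_isn, server_isn = 1000, 5000

# 1. SYN: client announces its ISN; no data, no acknowledgement yet.
syn = {"flags": "SYN", "seq": client_isn}

# 2. SYN-ACK: server announces its own ISN and acknowledges the
#    client's SYN by setting ack = client ISN + 1.
syn_ack = {"flags": "SYN-ACK", "seq": server_isn, "ack": syn["seq"] + 1}

# 3. ACK: client acknowledges the server's SYN (server ISN + 1);
#    its own sequence number has advanced one past its SYN.
ack = {"flags": "ACK", "seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}

print(syn, syn_ack, ack, sep="\n")
```

The SYN and the server's SYN each consume one sequence number even though they carry no data, which is why both acknowledgements are "plus one".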
TCP uses cumulative ACK (acknowledgement). This means that the receiver will acknowledge one more than the highest sequence number received in order. So if the sender sends packets with sequence numbers 1, 2, 3, 4, then the receiver sends ACK 5.
Let's trace out what happens in the above example of something called fast retransmit. Our initial sequence number (measured in bytes) is 92, and the sender A transmits 500 bytes at a time. B (the receiver) acknowledges packets as they arrive. The second packet (SEQ 592) is lost. Since the next in-order byte B expects is 592, B will continue to ACK 592 until it receives the segment with SEQ 592 from A, while A is expecting ACK 1092 for that segment. After 3 duplicates (so 4 ACKs with acknowledgement number 592) are received, A assumes that the second packet was lost, and immediately retransmits it.
Why do we use fast retransmit over the alternative? The alternative is to wait for the lost packet to time out, which is usually much longer. See lecture 7 slides 18-23 for how we dynamically estimate the timeout using Karn's algorithm. In almost all cases, three duplicate ACKs are a strong enough signal that the sent packet was lost, and we get a boost in performance.
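As a quick sketch of the timeout estimation referenced above: the standard approach (RFC 6298 style) keeps an exponentially weighted moving average of the RTT plus a deviation term, and Karn's algorithm simply refuses to take RTT samples from retransmitted segments. The sample values below are invented.

```python
# EWMA timeout estimator sketch; constants are the RFC-recommended ones.
ALPHA, BETA = 0.125, 0.25

est_rtt, dev_rtt = 0.100, 0.025          # seconds; assumed starting estimates
for sample in [0.110, 0.090, 0.300]:     # invented RTT samples (seconds)
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample - est_rtt)
    est_rtt = (1 - ALPHA) * est_rtt + ALPHA * sample
    timeout = est_rtt + 4 * dev_rtt      # safety margin of four deviations
    print(f"sample={sample:.3f}  est_rtt={est_rtt:.3f}  timeout={timeout:.3f}")
```

Notice how the one large sample (0.300) inflates the deviation term, pushing the timeout well above the average RTT, which is exactly why waiting for a timeout is so much slower than fast retransmit.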
What happens once B receives the correct packet? B will have stored the packets it already received out of order (the 3rd, 4th, 5th packets) and will respond with ACK 2592 (since B has now received all three of the aforementioned packets). A will then proceed transmitting in order.
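The whole fast-retransmit trace can be replayed with a toy model of cumulative ACKs. The segment size, sequence numbers, and loss pattern match the example above.

```python
# Toy replay: A sends 500-byte segments starting at seq 92;
# the second segment (seq 592) is dropped on its first attempt.
SEG = 500
sent = [92, 592, 1092, 1592, 2092]    # sequence numbers A transmits
lost = {592}                          # segment lost in transit (first try)

received, acks = set(), []
expected = 92                         # next in-order byte B wants
for seq in sent:
    if seq in lost:
        continue                      # dropped; B never sees it
    received.add(seq)
    while expected in received:       # advance past in-order data
        expected += SEG
    acks.append(expected)             # cumulative ACK

print(acks)                           # [592, 592, 592, 592]
dupes = acks.count(acks[-1]) - 1      # 3 duplicate ACKs -> fast retransmit

received.add(592)                     # A retransmits the lost segment
while expected in received:
    expected += SEG
acks.append(expected)
print(acks[-1])                       # 2592: B now has everything through 2591
```

The single ACK 2592 after the retransmission shows the payoff of cumulative ACKs: B does not have to acknowledge the buffered segments one by one.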
This situation is trickier than the connection setup. Imagine you are a long distance from a friend, and you are yelling to communicate, but there is background noise so once in a while you may not hear them. You (the client) want to end the conversation with your friend (the server).
First you tell your friend that you have nothing more to say (send FIN). Your friend acknowledges that you're done talking, responds with FIN-ACK. Your friend then finishes their story and then tells you they are done too (FIN). You can now acknowledge they are done in the same way (FIN-ACK), and you and your friend can end the conversation… or can you?
How do you know that your friend heard your acknowledgement that they are done? If they didn't hear it, then they think you aren't aware, and they will keep trying to tell you that they are done, but maybe you've walked away by then. How do you solve this problem?
You can wait for a long time (twice the max segment lifetime) to make sure you don't hear anything coming from your friend, and if they are silent that means they probably walked away, and you can too.
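The client side of this teardown can be summarized as a tiny state machine. The state names below follow the TCP specification; the event names are informal labels for this sketch.

```python
# Client-side teardown states as a toy state machine.
transitions = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",    # we ACK the peer's FIN here
    ("TIME_WAIT", "2*MSL elapsed"): "CLOSED",   # linger in case our ACK was lost
}

state = "ESTABLISHED"
for event in ["send FIN", "recv ACK", "recv FIN", "2*MSL elapsed"]:
    state = transitions[(state, event)]
    print(event, "->", state)
```

The TIME_WAIT state is the "wait twice the max segment lifetime" step: if the final ACK was lost, the peer's retransmitted FIN will arrive within that window and can be re-acknowledged.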
Congestion control is mostly formulaic, like timeout estimation, so it's best to see the slides (lecture 8). I will instead try to explain some intuition here for why things are the way they are.
The RcvWindow field of the TCP header informs the sender how much free buffer space the receiver has, which lets the sender transmit no more than will fit in the most recently advertised buffer space. This is how flow control is implemented. Flow control guarantees that the receiver is not overwhelmed with new information. Usually you see the highway analogy online but I wasn't a fan of it when I took this class.
I prefer to think about a water pipe connecting to a sink. My sink can hold a finite amount of water and drains a fraction of that per second. Flow control makes sure the pipe does not dump water into the sink faster than it can drain, preventing my sink from overflowing. So flow control is concerned with the receiver and how it deals with incoming information. Meanwhile, congestion control deals with how much water goes through the pipe to reach my sink. If too much is pumped into the pipe at once, it can burst and the water never reaches my sink in the first place. Therefore congestion control is concerned with the network medium, specifically avoiding overuse and underuse.
cwnd starts at 1 MSS (segment = TCP lingo for packet). For every ACK received, increase cwnd by one MSS. This doesn't sound like much, but notice the above example from the Data Transfer section: as the sender's window grows, the number of ACKs per round trip grows as well, so cwnd doubles every RTT, making it a multiplicative increase. Once past a threshold ssthresh, we move to congestion avoidance (additive increase), in which cwnd grows by one MSS each RTT (one MSS/cwnd per ACK). Essentially, this means that we increase by one every time we go through all the packets in the window at a given moment in time.
What's the congestion control part of all this? If we receive 3 duplicate ACKs, then we halve cwnd. There's the MD in AIMD (Additive Increase, Multiplicative Decrease). And we fast retransmit the lost packet as before. If we time out, then we reduce cwnd to 1 MSS. What we aim to do is reach a threshold very quickly (moderate usage of the network), then increase cautiously. Once we start to see packet loss, we assume it is due to congestion, and we quickly decrease the number of packets we send out at a given time.
There are fairness issues with congestion control. Since it is not a requirement, some implementations can choose to forgo it and take advantage of network resources at the expense of others. We can technically have a tragedy-of-the-commons issue. But as long as all clients adhere to the protocol, it's mostly fine.
HTTP/3 uses QUIC. QUIC is an alternative to TCP, and technically sits within the transport layer, but is built on top of UDP.
Briefly, here are the problems with TCP which HTTP/2 and HTTP/3 sought to solve: head-of-line blocking (a single lost TCP segment stalls every stream multiplexed over the connection) and the latency of running the TCP and TLS handshakes back to back before any data can flow.
Developed at Google, QUIC solves the issues mentioned above: streams are delivered independently, and the transport and cryptographic (TLS 1.3) handshakes are combined.
Without restating the lecture 10 slides, the ideas for congestion control, flow control, and reliable delivery are very similar to those of TCP. The only real difference is that sequence numbers are on the QUIC packets (multiple per UDP packet, remember), congestion control is done with the packet number, and flow control with the stream frame sequence number.
The "hardest" part about QUIC is familiarizing yourself with the terminology and the new abstractions. Once those are figured out, the material from the TCP lectures translates over very naturally. Learning older protocols helps you develop and understand new ones, since they all solve the same fundamental problems!