Network Layer: Building Blocks of the Internet
===
Topics Covered: IP, Addressing, NAT
## Overview of Responsibilities
A network layer protocol (IP) runs on every single device connected to the Internet. This includes hosts and routers. On the sending side, all TCP/UDP/QUIC packets are encapsulated within IP packets. On the receiving side, the network layer is tasked with decapsulating received IP packets and delivering their contents to the transport layer.
**Aside**: Remember from discussion, a good way to conceptualize encapsulation is that each layer's packet is placed in the payload of the layer below it. In project one, you implemented HTTP (an application layer protocol) within the payload of TCP, sent over a `SOCK_STREAM` (connection-based) socket. In project two, you will implement the security layer (TLS-ish) within the payload of a UDP packet, and the payload of that security layer will contain the encrypted file contents.
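To make the nesting concrete, here is a minimal sketch of encapsulation and decapsulation. The bracketed text headers (`[IP]`, `[TCP]`) are toy stand-ins invented for illustration; real headers are binary and much longer.

```python
def encapsulate(header: bytes, payload: bytes) -> bytes:
    """Place the upper layer's bytes in the payload of the lower layer."""
    return header + payload


def decapsulate(packet: bytes, header_len: int) -> bytes:
    """Strip this layer's header and hand the payload to the layer above."""
    return packet[header_len:]


# Toy fixed-size headers (assumption: real headers are binary, not text).
TCP_HDR = b"[TCP]"
IP_HDR = b"[IP]"

http_message = b"GET / HTTP/1.1\r\n\r\n"
tcp_segment = encapsulate(TCP_HDR, http_message)  # HTTP inside TCP
ip_packet = encapsulate(IP_HDR, tcp_segment)      # TCP inside IP

# The receiver peels the layers off in the reverse order.
recovered = decapsulate(decapsulate(ip_packet, len(IP_HDR)), len(TCP_HDR))
```

Each layer only looks at its own header and treats everything after it as opaque payload.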
## The IP Packet
The exact format of the IP packet is not super important, aside from a few fields.
- The **IP version number** determines whether we are using IPv4 or IPv6.
- The **TTL** (time to live) is the maximum number of remaining hops the packet can take. Each router decrements it by one, and the packet is dropped when it reaches zero.
- There is a **protocol** field which identifies the upper layer protocol carried in the payload. Usually it's either TCP or UDP.
- **Source** and **destination** IP addresses. Self explanatory.
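The above are header fields. In the payload, we usually have a TCP or UDP packet, with its own headers.
The IPv4 header also has an options field, which is why its length is variable. These are IP-level options (for example, recording the route a packet takes). Don't confuse them with TCP options such as selective ACK or flow control values (window size), which live in the TCP header inside the payload.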
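As a rough illustration of where these fields sit, the sketch below unpacks the fixed 20-byte part of an IPv4 header with Python's `struct` module. The sample packet is made up for this example, and its checksum is left at zero rather than computed.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Pull out the fields discussed above from a raw IPv4 packet."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    header_len = (ver_ihl & 0x0F) * 4  # IHL counts 32-bit words; options make it > 20
    return {
        "version": ver_ihl >> 4,
        "header_len": header_len,
        "ttl": ttl,
        "protocol": proto,  # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "payload": packet[header_len:total_len],
    }


# Hypothetical packet: version 4, IHL 5 (no options), TTL 64, protocol 6 (TCP).
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 24, 0, 0, 64, 6, 0,
    socket.inet_aton("10.20.0.3"), socket.inet_aton("65.23.78.45"),
) + b"data"

fields = parse_ipv4_header(sample)
```

Note that the payload starts at `header_len`, not at byte 20, so a header carrying options is still handled.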
## Addressing
### IPv4
An IPv4 address is 32 bits long (4 bytes), and uniquely identifies a host or router *interface*. An interface is any connection between a host/router and a physical link.
Modern addressing revolves around the idea of a subnet. This involves splitting each address into two parts: the network ID (the high-order bits, shared by every interface in the subnet) and the host ID (the remaining low-order bits). Each individual network is identified by its network ID. Note that a subnet is assigned a block of IP addresses, but not all blocks of IP addresses necessarily belong to the same subnet.
![image](https://hackmd.io/_uploads/H1rWU7pI0.png)
CIDR (Classless Inter-Domain Routing) is the way we define this separation (it's just the slash notation). It states that the network ID can be an arbitrary number of bits, and we denote this using `/x` notation, where `x` is the number of bits in the network ID. When more than one network ID matches a destination, backbone routers (routers which exist to forward Internet traffic) pretty much always use the most specific (longest) one. Additionally, each subnet has two reserved addresses. The network address is the first address in the block (the bits after the network ID are all 0's) and the broadcast address is the last address (the bits after the network ID are all 1's).
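The two reserved addresses fall straight out of the bit layout. Here is a small sketch that computes them by hand with bit masks; the `192.168.1.0/26` block is a hypothetical example, and the helper name is made up.

```python
import ipaddress


def network_and_broadcast(cidr: str) -> tuple:
    """Return (network address, broadcast address) for a CIDR block."""
    addr_text, prefix_text = cidr.split("/")
    prefix_len = int(prefix_text)
    addr = int(ipaddress.IPv4Address(addr_text))
    host_bits = 32 - prefix_len
    host_mask = (1 << host_bits) - 1
    network = addr & ~host_mask & 0xFFFFFFFF  # host bits all 0's
    broadcast = network | host_mask           # host bits all 1's
    return (str(ipaddress.IPv4Address(network)),
            str(ipaddress.IPv4Address(broadcast)))


# Hypothetical block: 26 network bits, 6 host bits.
first, last = network_and_broadcast("192.168.1.0/26")

# The standard library agrees with the hand computation.
stdlib_net = ipaddress.ip_network("192.168.1.0/26")
```

In practice you would just use `ipaddress.ip_network(...).network_address` and `.broadcast_address`; the manual version is only there to show the all-0's and all-1's rule.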
Let's do a quick example.
Say we have a subnet A defined as `200.23.16.0/23`. If we are trying to forward a packet to `200.23.16.100`, we know it's within that subnet. If we have to choose between subnet B `200.23.16.96/27` and subnet A, we would pick subnet B, since it is more specific: it matches 27 bits of the destination instead of 23. How big is B? If we allocate 27 bits to the network ID, then we are left with 5 host bits, so the block contains 2^5 = 32 addresses, of which 32 - 2 = 30 can be assigned to interfaces. Remember to subtract two, for the broadcast and network addresses. Using the same logic, calculate the size and range of the address space of A on your own.
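The "pick the more specific subnet" rule from this example can be sketched with the standard `ipaddress` module. The table layout and function name below are illustrative assumptions, not how a real router stores its entries.

```python
import ipaddress

# The two subnets from the example above.
SUBNETS = {
    "A": ipaddress.ip_network("200.23.16.0/23"),
    "B": ipaddress.ip_network("200.23.16.96/27"),
}


def longest_prefix_match(dest: str):
    """Return the name of the most specific subnet containing dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, name) for name, net in SUBNETS.items() if addr in net]
    return max(matches)[1] if matches else None


choice = longest_prefix_match("200.23.16.100")

# Size of B: 2^5 addresses in the block, minus network and broadcast.
usable_in_b = SUBNETS["B"].num_addresses - 2
```

Here `choice` is `"B"`, since both subnets contain the address but B has the longer prefix, and `usable_in_b` is 30. A destination that only A contains falls back to A.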
### IPv6
IPv6 tries to solve the issue of running out of IPv4 addresses by using 128-bit addresses, but also simplifies the IP header. Instead of being variable length, the header is a fixed 40 bytes. It includes a 20-bit flow label, a next header field (the protocol of the upper layer), and an 8-bit priority field. The checksum is removed. Options are also moved outside the fixed header.
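As a sketch of the fixed layout, the code below packs and unpacks a 40-byte IPv6 header. The addresses use the `2001:db8::/32` documentation prefix and the field values are arbitrary; the function names are made up for this example.

```python
import ipaddress
import struct


def build_ipv6_header(src: str, dst: str, payload_len: int, next_header: int,
                      hop_limit: int = 64, priority: int = 0,
                      flow_label: int = 0) -> bytes:
    """Pack the fixed IPv6 header: always exactly 40 bytes."""
    # First 32 bits: 4-bit version, 8-bit priority, 20-bit flow label.
    first_word = (6 << 28) | (priority << 20) | flow_label
    return struct.pack(
        "!IHBB16s16s",
        first_word, payload_len, next_header, hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )


def parse_ipv6_header(header: bytes) -> dict:
    first_word, payload_len, next_header, hop_limit, src, dst = struct.unpack(
        "!IHBB16s16s", header[:40])
    return {
        "version": first_word >> 28,
        "priority": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_len": payload_len,
        "next_header": next_header,  # 17 = UDP, 6 = TCP
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(src)),
        "dst": str(ipaddress.IPv6Address(dst)),
    }


header = build_ipv6_header("2001:db8::1", "2001:db8::2", payload_len=8,
                           next_header=17, flow_label=0x12345)
parsed = parse_ipv6_header(header)
```

There is no header-length field to read and no checksum to verify, which is what makes the header simpler to process than IPv4's.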
![image](https://hackmd.io/_uploads/BynZG4T8A.png)
There's one big issue though. A lot of the Internet still runs on IPv4. So how can we gradually introduce IPv6 while still maintaining compatibility? One easy solution is IP tunneling.
Let's say we have two IPv6 networks, but to communicate between them we must go through an IPv4 network.
![image](https://hackmd.io/_uploads/rymEm46UC.png)
Our first IPv6 host sends a packet with source IP `2002:c0a8:101:1::1` and destination IP `4002:c0a8:202:2::2`. Our devices A and B are dual-stack routers, meaning they understand and speak both IPv4 and IPv6. Router A encapsulates the entire IPv6 packet (header and payload) in the payload of an IPv4 packet. Router A's public IPv4 address is `12.34.5.6`, and it knows B's public IPv4 address is `78.9.10.11`. So our source address over the tunnel is `12.34.5.6` and destination address is `78.9.10.11`, with our original IPv6 source and destination encapsulated inside. Once the packet reaches B, B will decapsulate it and forward the original IPv6 packet, headers unchanged, to the correct IPv6 host. Tunneling can also be used in VPNs, which we will discuss in discussion further with an example :)
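Here is a minimal sketch of what routers A and B do in this example. The `Packet` class is a toy model invented for illustration; a real tunnel works on raw bytes and marks the IPv4 payload as IPv6 using protocol number 41.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    version: int     # 4 or 6
    src: str
    dst: str
    payload: object  # application bytes, or another Packet when tunneling


def tunnel_encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Router A: wrap the whole IPv6 packet inside a new IPv4 packet."""
    return Packet(version=4, src=tunnel_src, dst=tunnel_dst, payload=inner)


def tunnel_decapsulate(outer: Packet) -> Packet:
    """Router B: unwrap and recover the original IPv6 packet untouched."""
    inner = outer.payload
    if not isinstance(inner, Packet) or inner.version != 6:
        raise ValueError("not an IPv6-in-IPv4 tunnel packet")
    return inner


original = Packet(6, "2002:c0a8:101:1::1", "4002:c0a8:202:2::2", b"hello")
in_tunnel = tunnel_encapsulate(original, "12.34.5.6", "78.9.10.11")
delivered = tunnel_decapsulate(in_tunnel)
```

The IPv4 routers in the middle only ever look at `in_tunnel.src` and `in_tunnel.dst`; the IPv6 addresses are just payload to them.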
## NAT
We're running out of IPv4 addresses to hand out. NAT (Network Address Translation) helps solve some of these issues, and also introduces a bit of security/privacy. This is a bit of a hack, with IPv6 as the main long-term goal.
There exist some address spaces specifically reserved for private addresses (`10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`). What makes these private? Remember that each router interface has an IP address assigned to it. In the case of a private network, say you have one physical link to the rest of the Internet on your router. That interface has a public IP address, which is reachable by any host or router connected to the Internet. That router knows about your private network addresses, and is able to act as a proxy for your IP packet, pretending that it is the source and not your private host. An example should help explain this best.
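A quick way to check whether an address falls in one of these three reserved blocks is sketched below, using the standard `ipaddress` module. The helper name is made up for this example.

```python
import ipaddress

# The three private blocks listed above.
PRIVATE_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def in_private_block(addr: str) -> bool:
    """True if addr belongs to one of the three private IPv4 blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in PRIVATE_BLOCKS)
```

For example, `in_private_block("10.20.0.3")` is true, while `in_private_block("65.23.78.45")` is false. Note that `172.16.0.0/12` ends at `172.31.255.255`, so `172.32.0.1` is a public address.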
![image](https://hackmd.io/_uploads/ryYCqmaU0.png)
Let's say we are sending a packet from host `10.20.0.3` with port 3345 on the left, and we want to reach the host on the right, `65.23.78.45` on port 80 (web server).
1. `10.20.0.3:3345` sends the packet with source IP `10.20.0.3`, source port `3345`, destination IP `65.23.78.45`, destination port `80`.
2. The NAT router creates an entry in its NAT table, stating that `10.20.0.3, 3345` is translated to `24.65.45.89, 5001` (the **public IP** associated with the interface to the rest of the Internet). The port number 5001 is picked by the router and only lasts as long as the table entry (RFC 6056 gives recommendations for choosing such ports). It then forwards the packet to the next hop, with source IP `24.65.45.89`, source port `5001`, destination IP `65.23.78.45`, destination port `80`.
3. The packet arrives at the destination, and the destination (probably a web server) replies with source IP `65.23.78.45`, source port `80`, destination IP `24.65.45.89`, destination port `5001`.
4. When the reply arrives at the NAT router, it uses the translation table entry to determine that port `5001` was created for private host `10.20.0.3`, port `3345`. It rewrites the destination back to that address and port, and delivers the packet to that host through our LAN link. Note: after some time, we have to remove the entry from the translation table, or we risk running out of port numbers.
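The four steps above can be sketched as a small translation table. This is a simplified model: the class and method names are made up, ports are handed out sequentially starting at 5001 to match the example (a real router would pick them less predictably), and entries never time out.

```python
class NatRouter:
    def __init__(self, public_ip: str, first_port: int = 5001):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}  # (private_ip, private_port) -> public_port
        self.in_map = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Step 2: rewrite the source to the router's public IP and a fresh port."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Step 4: rewrite the destination back to the private host, or drop."""
        entry = self.in_map.get(dst_port)
        if entry is None:
            return None  # no table entry: nothing inside asked for this
        private_ip, private_port = entry
        return (src_ip, src_port, private_ip, private_port)


nat = NatRouter("24.65.45.89")
sent = nat.outbound("10.20.0.3", 3345, "65.23.78.45", 80)
reply = nat.inbound("65.23.78.45", 80, "24.65.45.89", 5001)
unsolicited = nat.inbound("65.23.78.45", 80, "24.65.45.89", 6000)
```

The `unsolicited` case returns `None`, which is where the bit of security/privacy comes from: outside hosts cannot reach a private host unless it opened the mapping first.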
Therefore, every private network can reuse the same private address space, as long as these IP addresses are never exposed to the greater Internet as source or destination IPs. One major issue with NAT is that the private hosts are not aware of what their "public" IP and port are, so it is difficult to develop applications where a private host must accept incoming connections. A host can either learn its mapping through UPnP (Universal Plug and Play), or the port allocated by the NAT router for a given private host can be defined statically (pretty bad solution, why?).
## Routing vs Forwarding Addendum
Let's quickly clarify the difference between routing and forwarding.
**Routing**: Filling in the forwarding table for a router with the best path to each destination. So if a router needs to deliver a packet to host A within a network, it knows the optimal path to get there.
**Forwarding**: The actual act of sending the IP packet to the next hop, based on the forwarding table. So if we know the best path from A to C is through B, then we forward the packet from A to B.
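To see the split in code, here is a toy sketch where routing fills in the table once and forwarding is just a lookup. The three-node topology (A, B, C in a line) is hypothetical, and breadth-first search on hop count is only a stand-in for a real routing algorithm.

```python
from collections import deque


def compute_forwarding_table(graph: dict, source: str) -> dict:
    """Routing: map each destination to the first hop on a fewest-hops path."""
    table = {}
    visited = {source}
    queue = deque()
    for neighbor in graph[source]:
        table[neighbor] = neighbor  # direct neighbors are their own next hop
        visited.add(neighbor)
        queue.append(neighbor)
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                table[neighbor] = table[node]  # inherit the first hop
                queue.append(neighbor)
    return table


def forward(table: dict, destination: str):
    """Forwarding: a single table lookup per packet, no path computation."""
    return table.get(destination)


# Hypothetical topology: A - B - C.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
table_at_a = compute_forwarding_table(graph, "A")
next_hop = forward(table_at_a, "C")
```

Router A's table says the next hop toward C is B, so `next_hop` is `"B"`. Routing runs rarely (when the topology changes), while forwarding runs for every packet.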
#### Next Week's Topics: ICMP, DHCP, Routing Algorithms