# 50.012 Lecture 4: CDN and P2P
## CDN
Challenge: how to stream content to hundreds of thousands of simultaneous users?
Store multiple copies of videos at multiple **geographically distributed** sites
Content distribution networks (CDNs) store copies of content at CDN nodes. A subscriber requesting content from the application is directed to a nearby CDN copy (chosen by mapping the request's IP address to a geographical location) and retrieves the content from there.
### CDN Server Placement
#### Enter deep
Push CDN servers deep into many access ISPs (as close as possible to the users).
This gives users a lower delay. However, the provider must install and maintain many more clusters to cover its customer base.
#### Bring home
Place a smaller number of larger clusters at IXPs (Internet Exchange Points) near access networks.
### Types of CDN
* Commercial CDN (e.g. Akamai, Cloudflare, Limelight)
* Content provider's own CDN (e.g. Google, Netflix)
* Telco CDN (e.g. Singtel, StarHub)
### Cluster (Server) Selection
5 options:
* Geographically close
* Best performance: measure performance in real time and choose the server that currently performs best.
* Lowest load: load-balancing
* Cheapest: CDN may need to pay its provider ISP
* Any alive node: fault-tolerant
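The five options above can be read as different ranking keys over the same set of clusters. The following is a minimal sketch under that assumption; the `Cluster` fields, the policy names, and all numbers are hypothetical and chosen only for illustration, not taken from any real CDN.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    distance_km: float  # hypothetical distance from the client
    rtt_ms: float       # hypothetical real-time RTT measurement
    load: float         # fraction of capacity in use, 0.0 to 1.0
    cost: float         # hypothetical transit cost per GB
    alive: bool = True


# Each policy is a key function; the cluster with the smallest key wins.
POLICIES = {
    "closest": lambda c: c.distance_km,
    "best_performance": lambda c: c.rtt_ms,
    "lowest_load": lambda c: c.load,
    "cheapest": lambda c: c.cost,
    "any_alive": lambda c: 0,  # all alive clusters tie; the first one is returned
}


def select_cluster(clusters, policy):
    """Return the alive cluster that minimises the chosen policy key."""
    candidates = [c for c in clusters if c.alive]
    if not candidates:
        raise RuntimeError("no alive cluster")
    return min(candidates, key=POLICIES[policy])


# Illustrative data only.
CLUSTERS = [
    Cluster("sg-1", 10, 25.0, 0.90, 0.08),
    Cluster("my-1", 300, 12.0, 0.40, 0.05),
    Cluster("jp-1", 5300, 70.0, 0.10, 0.03),
    Cluster("us-1", 15000, 180.0, 0.05, 0.01, alive=False),
]

if __name__ == "__main__":
    for policy in POLICIES:
        print(policy, "->", select_cluster(CLUSTERS, policy).name)
```

Note that the policies can disagree: in this made-up data the geographically closest cluster is neither the fastest nor the least loaded, which is why a real CDN would likely combine several signals.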
### Content Access / Routing
3 options:
* Naming (DNS)-based
* Application-driven
    * e.g. use HTTP redirect
    * Requires multiple connection setups and name lookups
* Routing (anycast)-based
    * Anycast: several servers share one IP address, and the network routes each request to one of them (typically the closest in routing terms).
    * Coarse-grained: the choice is made at the network layer, so the application has no control over which server is used.
## DNS
### DNS name resolution
### DNS records
DNS: Distributed database storing resource records (RRs)
RR format: `(name, value, type, ttl)`
#### Type=A
* `name` is hostname
* `value` is IP address
#### Type=NS
* `name` is domain (e.g. foo.com)
* `value` is hostname of authoritative name server for this domain
#### Type=CNAME
* `name` is alias name for some "canonical" name
* `value` is canonical name
* e.g. `www.ibm.com` is the alias to `servereast.backup2.ibm.com`
#### Type=MX
* `value` is the name of the mail server associated with `name`
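To make the `(name, value, type, ttl)` format concrete, here is a minimal sketch of a lookup over an in-memory list of resource records. The record values, the helper names `lookup` and `resolve_a`, and the hop limit are assumptions for illustration; the addresses come from the documentation range `192.0.2.0/24`, and a real resolver does considerably more (caching, TTL expiry, querying remote servers).

```python
# Resource records as (name, value, type, ttl) tuples. All values are made up.
RECORDS = [
    ("foo.com", "dns.foo.com", "NS", 86400),
    ("dns.foo.com", "192.0.2.53", "A", 86400),
    ("www.foo.com", "server1.foo.com", "CNAME", 300),
    ("server1.foo.com", "192.0.2.10", "A", 300),
    ("foo.com", "mail.foo.com", "MX", 3600),
    ("mail.foo.com", "192.0.2.25", "A", 3600),
]


def lookup(name, rtype, records=RECORDS):
    """Return the values of all records matching name and type."""
    return [value for (n, value, t, _ttl) in records if n == name and t == rtype]


def resolve_a(name, records=RECORDS, max_hops=8):
    """Resolve a hostname to IP addresses, following CNAME aliases."""
    for _ in range(max_hops):
        addresses = lookup(name, "A", records)
        if addresses:
            return addresses
        aliases = lookup(name, "CNAME", records)
        if not aliases:
            return []  # neither an address nor an alias is known
        name = aliases[0]  # continue with the canonical name
    raise RuntimeError("CNAME chain too long")


if __name__ == "__main__":
    print(resolve_a("www.foo.com"))   # alias -> canonical name -> A record
    print(lookup("foo.com", "MX"))    # mail server name for the domain
    print(lookup("foo.com", "NS"))    # authoritative name server for the domain
```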
### CDN content access by DNS redirect
#### Akamai Resource Locators (ARL)
* Each customer has its own dedicated domain name, e.g. `a123.g.akamaitech.net`.
* Akamai operates the authoritative DNS servers for these names.
* The origin server can "akamaize" an original URL into an ARL (e.g. if `sutd.edu.sg` is hosted with Akamai, it resolves to `a123.g.akamaitech.net`).
* The client browser then issues its GET to the CDN instead of the origin server.
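The redirect steps above can be sketched as a two-stage lookup: the origin's zone hands back an alias into the CDN's domain, and the CDN's authoritative DNS then picks a cluster address for the client. This is a simplified illustration, not Akamai's actual mapping logic; the region names, the cluster IP addresses (from the documentation range `198.51.100.0/24`), and the function names are all hypothetical.

```python
# Stage 1: the origin's zone aliases the site name to the customer's CDN name.
ORIGIN_ZONE = {"sutd.edu.sg": ("CNAME", "a123.g.akamaitech.net")}

# Stage 2: the CDN's authoritative DNS maps a client region to a cluster IP.
# Regions and addresses are made up for illustration.
CDN_CLUSTERS = {"asia": "198.51.100.10", "europe": "198.51.100.20"}
DEFAULT_REGION = "asia"


def cdn_authoritative(name, client_region):
    """Answer for the CDN-operated name with a cluster near the client."""
    if name != "a123.g.akamaitech.net":
        return None
    return CDN_CLUSTERS.get(client_region, CDN_CLUSTERS[DEFAULT_REGION])


def resolve(hostname, client_region):
    """Follow the alias into the CDN domain, then ask the CDN for an address."""
    trace = [hostname]
    name = hostname
    while name in ORIGIN_ZONE:
        _rtype, name = ORIGIN_ZONE[name]  # follow the CNAME
        trace.append(name)
    return trace, cdn_authoritative(name, client_region)


if __name__ == "__main__":
    trace, ip = resolve("sutd.edu.sg", "europe")
    print(" -> ".join(trace), "->", ip)
    # The browser would then send its HTTP GET to `ip`, not to the origin server.
```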
## Peer to Peer Architecture
* No always-on server
* Arbitrary end systems directly communicate as peers
* Peers are intermittently connected and may change IP addresses.
### File distribution: client-server vs P2P
Time to distribute F to N clients using the client-server approach:

$$
D_{c\text{-}s} \geq \max \left\{ \frac{N F}{u_{s}}, \frac{F}{d_{\min}} \right\}
$$
Time to distribute F to N clients using the P2P approach:

$$
D_{\mathrm{P2P}} \geq \max \left\{ \frac{F}{u_{s}}, \frac{F}{d_{\min}}, \frac{N F}{u_{s}+\sum_{i} u_{i}} \right\}
$$
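The two lower bounds can be compared numerically. The sketch below assumes the usual meaning of the symbols, which the notes do not spell out: `F` is the file size in bits, `N` the number of clients, `u_s` the server upload rate, `d_min` the slowest client download rate, and `u_i` the upload rate of peer `i`. The function names and the example numbers are illustrative only.

```python
def client_server_time(F, N, u_s, d_min):
    """Lower bound on distribution time when the server sends all N copies."""
    return max(N * F / u_s, F / d_min)


def p2p_time(F, N, u_s, d_min, peer_uploads):
    """Lower bound on distribution time when peers also upload."""
    total_upload = u_s + sum(peer_uploads)
    return max(F / u_s, F / d_min, N * F / total_upload)


if __name__ == "__main__":
    F = 8000      # file size in Mbit (assumed, about 1 GB)
    u_s = 100     # server upload rate in Mbps (assumed)
    d_min = 20    # slowest download rate in Mbps (assumed)
    u_peer = 10   # upload rate of every peer in Mbps (assumed)

    for N in (10, 100, 1000):
        cs = client_server_time(F, N, u_s, d_min)
        p2p = p2p_time(F, N, u_s, d_min, [u_peer] * N)
        print(f"N={N:5d}  client-server >= {cs:9.1f} s   P2P >= {p2p:7.1f} s")
```

With these numbers the client-server bound grows linearly in `N`, while the P2P bound levels off because every new peer adds upload capacity along with demand.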