# CS 176C: CDN Measurement Systems
[Slides](https://hackmd.io/@arpitgupta/B1vwd4u58?type=slide#/)
---
## Learning Objectives
- Content Distribution Networks (CDNs)
- architecture, key components, and operational challenges
- CDN Measurement Systems
- requirements, challenges, and solutions
---
# Content Delivery Networks
---
## Architecture
![](https://i.imgur.com/ahPf44t.png =450x)
PoPs are pushed closer to the end users.
---
## Services
- Front-end caches
- cache static content
- Reverse proxies
- shorten RTTs for clients (how? why?)
---
## Redirection (1)
- Directing client requests to the right PoP is critical
- How? (recall DNS)
- Client's DNS request goes to local DNS resolver (LDNS)
- LDNS forwards the request to the authoritative (auth) DNS server
- Auth server returns the IP address of an appropriate PoP
---
## Redirection (2)
- Approach 1: Geolocate LDNS IP
- Lets auth server do the hard work
- Approach 2: Anycast
- Lets routing do the task
- Most CDNs use a mix of the two approaches
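
Approach 1 can be sketched roughly as follows. This is a minimal illustration, not a real resolver: the PoP names, coordinates, and documentation-range IPs are made up, and it assumes the auth server already has a (latitude, longitude) estimate for the LDNS.

```python
import math

# Hypothetical PoPs: front-end IP and (latitude, longitude).
POPS = {
    "pop-seattle": {"ip": "192.0.2.10", "loc": (47.6, -122.3)},
    "pop-newyork": {"ip": "192.0.2.20", "loc": (40.7, -74.0)},
    "pop-london": {"ip": "192.0.2.30", "loc": (51.5, -0.1)},
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve(ldns_location):
    """Return the IP of the PoP geographically closest to the LDNS."""
    name = min(POPS, key=lambda p: haversine_km(POPS[p]["loc"], ldns_location))
    return POPS[name]["ip"]
```

Note that the decision is based on where the LDNS is, not where the client is, so the answer is only as good as the LDNS-to-client proximity.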
---
## Anycast Redirection
- An IP is announced at multiple PoPs
- Clients reach the PoP with the shortest routing path
- Pros: simple, easy to implement
- Cons: hard to control, prone to routing failures/inefficiencies
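
A toy example of the "inefficiencies" con, under a strong simplification: real BGP applies routing policy before path length, and here path selection is reduced to AS-path length alone. The PoP names and numbers are invented.

```python
def anycast_pop(as_path_len):
    """PoP that routing would pick if only AS-path length mattered."""
    return min(as_path_len, key=as_path_len.get)


def lowest_rtt_pop(rtt_ms):
    """PoP the CDN would prefer if it could choose by latency."""
    return min(rtt_ms, key=rtt_ms.get)


# Hypothetical view from one client network.
as_path_len = {"pop-a": 2, "pop-b": 3}
rtt_ms = {"pop-a": 80.0, "pop-b": 20.0}

chosen = anycast_pop(as_path_len)
best = lowest_rtt_pop(rtt_ms)
print(chosen, best, chosen == best)
```

The shortest routing path and the lowest-latency path disagree here, and the CDN has no direct knob to fix it.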
---
## Operational Challenges
- Content delivery relies on the Internet, which is
- dynamic: demands, routing changes, congestion, etc.
- heterogeneous: tens of thousands of ASes, different link capacities
- unreliable: routing failures (planned/random)
---
# CDN Measurement Systems
---
## Goals and Requirements (1)
- Goal:
- Representative achievable performance
- Requirements
- high coverage of paths
- user-microsoft, user-others
- user-perceived performance
---
## Goals and Requirements (2)
- Goal:
- Detect relevant Internet events promptly
- Requirements
- high-frequency data collection
- explicit outage signals
- fault tolerance
---
## Goals and Requirements (3)
- Goal:
- Minimal operational overhead
- Requirements
- Minimal updates over time
- compliance across varying requirements
- privacy, security, etc.
---
## Existing Solutions
- Differ in vantage points for measurements
- End users: Cedexis (L7)
- L3 routers: RIPE Atlas (home), ThousandEyes (ISPs)
- Servers: Google, Facebook
---
### Limitations
- Insufficient coverage
- None covers more than 45% of Microsoft (MS) clients
- Not representative
- Many servers do not respond to ICMP probes
- not representative of user-perceived performance
- only measure used paths
---
## Design Decisions
- Active measurements from MS users
- Application-layer measurements
- External services and smarter clients
- Flexible orchestration and aggregation
---
## Odin Architecture
![](https://i.imgur.com/JWtuwIn.png =450x)
---
## Odin Architecture: Deep Dive
- Client performs the following tasks
1. fetches measurement config from `Orchestration Service`
2. issues measurements
3. uploads the result to `Report Upload Endpoint`
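
The three client tasks can be sketched as one function. Everything here is an assumption made for illustration: the config layout (`measurements`, `target`, `weight`), the interpretation of `weight` as a sampling probability, and the three injected callables standing in for the real services.

```python
import random
import time


def run_client(fetch_config, measure, upload, rng=random.random):
    """Run one measurement round; all three callables are hypothetical."""
    config = fetch_config()                  # 1. fetch measurement config
    results = []
    for entry in config["measurements"]:
        if rng() >= entry["weight"]:         # skip with probability 1 - weight
            continue
        start = time.perf_counter()
        ok = measure(entry["target"])        # 2. issue the measurement
        elapsed_ms = (time.perf_counter() - start) * 1000
        results.append(
            {"target": entry["target"], "ok": ok, "latency_ms": elapsed_ms}
        )
    upload(results)                          # 3. upload the results
    return results
```

Passing the services in as callables keeps the client "dumb": changing what gets measured only requires changing the config that `fetch_config` returns.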
---
## Odin Measurements
- HTTP or HTTPS (capture TLS/SSL overhead)
- Direct or DNS-based (learn user-LDNS association)
- Warm or cold (DNS caching, TCP slow start, etc.)
---
## Odin Clients
![](https://i.imgur.com/XmZxwXU.png =400x)
Direct or DNS-based measurements
---
## DNS-based Measurements
- MS controls the domain `contoso.com`
- Client issues DNS request to `$(RandID).contoso.com`
- Auth server resolves the IP address based on `RandID`
- Learn (LDNS, client) mappings
- (LDNS, `RandID`) + (`RandID`, Client)
What are the pros and cons of this approach?
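
A minimal sketch of the join that recovers (LDNS, client) pairs. The two logs are assumptions: `dns_log` stands for what the auth server sees, and `client_log` stands for wherever the (`RandID`, client) pair is recorded, for example the client's uploaded report.

```python
import uuid


def make_probe_hostname(domain="contoso.com"):
    """Generate a unique RandID and the hostname the client will look up."""
    rand_id = uuid.uuid4().hex
    return rand_id, f"{rand_id}.{domain}"


def join_ldns_client(dns_log, client_log):
    """Join (ldns_ip, rand_id) with (rand_id, client_ip) on rand_id."""
    client_by_id = dict(client_log)
    return [
        (ldns_ip, client_by_id[rand_id])
        for ldns_ip, rand_id in dns_log
        if rand_id in client_by_id
    ]
```

Because each `RandID` is fresh, the lookup cannot be answered from an LDNS cache, so the query always reaches the auth server.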
---
## Orchestration Service
![](https://i.imgur.com/6V1YTPJ.png =400x)
Example measurement configuration
- Why assign weights for (client, target) mappings?
---
## Example: Setting Weights
```graphviz
digraph graphname{
T1 [label="Target 1"]
T2 [label="Target 2"]
T3 [label="Target 3"]
C1 [label="1000"]
C2 [label="5000"]
C3 [label="2000"]
C1 -> T1
C2 -> T1
C2 -> T2
C2 -> T3
C3 -> T3
}
```
- Each target can handle 100 measurements
- Each client group gets fair share (why?)
- Determine the weights for client groups
- 1000 clients: {Target 1: $\frac{50}{1000}$}
- 5000 clients: {Target 1: $\frac{50}{5000}$, Target 2: $\frac{100}{5000}$, Target 3: $\frac{50}{5000}$}
- 2000 clients: {Target 3: $\frac{50}{2000}$}
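
The arithmetic above can be reproduced with a short sketch. It assumes the simplest fair-share rule (a target's capacity is split equally among the client groups mapped to it) and uses hypothetical labels `C1`..`C3` and `T1`..`T3` for the groups and targets in the graph.

```python
from fractions import Fraction


def fair_share_weights(group_sizes, edges, capacity=100):
    """Per-client weight for each (group, target) edge."""
    # Count how many client groups share each target.
    groups_per_target = {}
    for _, target in edges:
        groups_per_target[target] = groups_per_target.get(target, 0) + 1

    weights = {}
    for group, target in edges:
        # Equal split of the target's capacity, spread over the group's clients.
        share = Fraction(capacity, groups_per_target[target])
        weights.setdefault(group, {})[target] = share / group_sizes[group]
    return weights


sizes = {"C1": 1000, "C2": 5000, "C3": 2000}
edges = [("C1", "T1"), ("C2", "T1"), ("C2", "T2"), ("C2", "T3"), ("C3", "T3")]
weights = fair_share_weights(sizes, edges)
```

`Fraction` reduces the results, so `weights["C1"]["T1"]` is `Fraction(1, 20)`, the same value as $\frac{50}{1000}$.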
---
## Reporting
Fault tolerance is critical.
![](https://i.imgur.com/gRe8i6P.png =450x)
Types of faults
---
## Ensuring Fault Tolerance (1)
![](https://i.imgur.com/ktKC856.png =400x)
Use proxies in third-party ISPs
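
One way a client could use such proxies is simple ordered fallback. This is a sketch under assumptions: `send` is a hypothetical transport function that raises `OSError` (for example `ConnectionError`) on failure, and `endpoints` lists the primary upload endpoint first, followed by the proxies.

```python
def upload_with_fallback(report, endpoints, send):
    """Try each endpoint in order; return the one that accepted the report."""
    for endpoint in endpoints:
        try:
            send(endpoint, report)
            return endpoint
        except OSError:
            # Endpoint unreachable from this client; try the next one.
            continue
    return None  # every path failed; the caller may retry later
```

A report that arrives via a proxy is itself a signal that the direct path to the primary endpoint was broken.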
---
## Summary
- Content Distribution Networks
- Key components
- Operational challenges
- CDN Measurement Systems
- Requirements and challenges
- Design and implementation
###### tags: `lecture` `cs176c`