# CS 176C: CDN Measurement Systems

[Slides](https://hackmd.io/@arpitgupta/B1vwd4u58?type=slide#/)

---

## Learning Objectives

- Content Distribution Networks (CDNs)
    - architecture, key components, and operational challenges
- CDN Measurement Systems
    - requirements, challenges, and solutions

---

# Content Delivery Networks

---

## Architecture

![](https://i.imgur.com/ahPf44t.png =450x)

PoPs are pushed closer to the end users.

---

## Services

- Front-end caches
    - cache static content
- Reverse proxies
    - shorten RTTs for clients (how? why?)

---

## Redirection (1)

- Directing client requests to the right PoP is critical
- How? (recall DNS)
    - Client's DNS request goes to the local DNS resolver (LDNS)
    - LDNS forwards the request to the authoritative (Auth) server
    - Auth server assigns an appropriate IP address

---

## Redirection (2)

- Approach 1: Geolocate LDNS IP
    - Lets the auth server do the hard work
- Approach 2: Anycast
    - Lets routing do the task
- Most CDNs use a mix of the two approaches

---

## Anycast Redirection

- An IP is announced at multiple PoPs
- Clients reach the PoP with the shortest routing path
- Pros: simple, easy to implement
- Cons: hard to control, prone to routing failures/inefficiencies

---

## Operational Challenges

- Content delivery relies on the Internet, which is
    - dynamic: demands, routing changes, congestion, etc.
    - heterogeneous: tens of thousands of ASes, different link capacities
    - unreliable: routing failures (planned/random)

---

# CDN Measurement Systems

---

## Goals and Requirements (1)

- Goal:
    - Representative achievable performance
- Requirements
    - high coverage of paths
        - user-Microsoft, user-others
    - user-perceived performance

---

## Goals and Requirements (2)

- Goal:
    - Detect relevant Internet events promptly
- Requirements
    - high-frequency data collection
    - explicit outage signals
    - fault tolerance

---

## Goals and Requirements (3)

- Goal:
    - Minimal operational overhead
- Requirements
    - Minimal updates over time
    - compliance across varying requirements
        - privacy, security, etc.

---

## Existing Solutions

- Differ in vantage points for measurements
    - End users: Cedexis (L7)
    - L3 routers: RIPE Atlas (home), ThousandEyes (ISPs)
    - Servers: Google, Facebook

---

### Limitations

- Insufficient coverage
    - None covers more than 45% of MS clients
- Not representative
    - Many servers are not responsive to ICMP probes
    - not representative of user-perceived performance
    - only measure used paths

---

## Design Decisions

- Active measurements from MS users
- Application-layer measurements
- External services and smarter clients
- Flexible orchestration & aggregation

---

## Odin Architecture

![](https://i.imgur.com/JWtuwIn.png =450x)

---

## Odin Architecture: Deep Dive

- Client performs the following tasks
    1. fetches measurement config from `Orchestration Service`
    2. issues measurements
    3. uploads the result to `Report Upload Endpoint`

---

## Odin Measurements

- HTTP or HTTPS (capture TLS/SSL overhead)
- Direct or DNS-based (learn user-LDNS association)
- Warm or cold (DNS caching, slow start, etc.)

---

## Odin Clients

![](https://i.imgur.com/XmZxwXU.png =400x)

Direct or DNS-based measurements

---

## DNS-based Measurements

- MS controls the domain `contoso.com`
- Client issues DNS request to `$(RandID).contoso.com`
- Auth server resolves the IP address based on `RandID`
- Learn (LDNS, client) mappings
    - (LDNS, `RandID`) + (`RandID`, Client)

What are the pros and cons of this approach?

---

## Orchestration Service

![](https://i.imgur.com/6V1YTPJ.png =400x)

Example measurement configuration

- Why assign weights for (client, target) mappings?

---

## Example: Setting Weights

```graphviz
digraph graphname{
    T1 [label="Target 1"]
    T2 [label="Target 2"]
    T3 [label="Target 3"]
    C1 [label="1000"]
    C2 [label="5000"]
    C3 [label="2000"]
    C1 -> T1
    C2 -> T1
    C2 -> T2
    C2 -> T3
    C3 -> T3
}
```

- Each target can handle 100 measurements
- Each client group gets a fair share (why?)
- Determine the weights for client groups
    - {Target 1: $\frac{50}{1000}$}
    - {Target 1: $\frac{50}{5000}$, Target 2: $\frac{100}{5000}$, Target 3: $\frac{50}{5000}$}
    - {Target 3: $\frac{50}{2000}$}

---

## Reporting

Fault tolerance is critical.

![](https://i.imgur.com/gRe8i6P.png =450x)

Types of faults

---

## Ensuring Fault Tolerance (1)

![](https://i.imgur.com/ktKC856.png =400x)

Use proxies in third-party ISPs

---

## Summary

- Content Distribution Networks
    - Key components
    - Operational challenges
- CDN Measurement Systems
    - Requirements and challenges
    - Design and implementation

###### tags: `lecture` `cs176c`
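
---

## Setting Weights: Code Sketch

The weight assignment from the "Example: Setting Weights" slide can be sketched in a few lines of Python. This is only an illustration, not Odin's actual orchestration logic. It assumes that "fair share" means each target's budget of 100 measurements is split equally among the client groups that can reach it, and that a weight is the fraction of clients in a group that should measure a given target. The function name `fair_share_weights`, the group names `C1` to `C3`, and the `target_capacity` parameter are hypothetical.

```python
from collections import defaultdict


def fair_share_weights(group_sizes, edges, target_capacity):
    """Return {group: {target: weight}} under an equal-split assumption."""
    # Collect the client groups that can reach each target.
    groups_per_target = defaultdict(list)
    for group, target in edges:
        groups_per_target[target].append(group)

    weights = defaultdict(dict)
    for target, groups in groups_per_target.items():
        # Assumption: "fair share" = equal split of the target's budget.
        share = target_capacity / len(groups)
        for group in groups:
            # Weight = share of the budget divided by group size, capped at 1.
            weights[group][target] = min(1.0, share / group_sizes[group])
    return {group: dict(per_target) for group, per_target in weights.items()}


# Client groups (sizes) and (group, target) edges from the slide's graph.
group_sizes = {"C1": 1000, "C2": 5000, "C3": 2000}
edges = [
    ("C1", "Target 1"),
    ("C2", "Target 1"),
    ("C2", "Target 2"),
    ("C2", "Target 3"),
    ("C3", "Target 3"),
]

weights = fair_share_weights(group_sizes, edges, target_capacity=100)
for group in sorted(weights):
    print(group, weights[group])
```

Under these assumptions the sketch reproduces the fractions on the slide: $\frac{50}{1000}$ for the first group, $\frac{50}{5000}$, $\frac{100}{5000}$, $\frac{50}{5000}$ for the second, and $\frac{50}{2000}$ for the third. The equal split keeps a large client group from consuming a shared target's whole budget.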