# CS 176C:
Monitoring Systems for Cable Networks
[Slides](https://hackmd.io/@arpitgupta/BJyyVG5sU#/)
---
## Learning Objectives
- Proactive Network Maintenance (PNM)
- What is PNM, what is pre-equalization, etc.
- How is PNM data currently used?
- CableMon
- design goals, rationale, etc.
- how does it advance the state of the art?
---
# Background
---
## HFC Architecture
![](https://i.imgur.com/PK3DnNP.png =400x)
- Key components
- CMTS, fiber optical node, (trunk/line) RF amplifiers, splitters, cable modem (not shown)
---
## Proactive Network Maintenance (PNM)
- What? Measurement data collected at cable modem (e.g., SNR, FEC stats, etc.)
- Why? Proactively detect, localize, and correct impairments
- DOCSIS standardizes how to
- store it in a local MIB
- query it using SNMP
---
## PNM Measurement Data
- Previous generation:
- Data: error rates, SNRs, etc.
- Limits: not suited for root-cause analysis
- Next generation:
- Data: pre-equalization coefficients, full-band capture, etc.
- Limits: hard to scale
Analysis of pre-equalization coefficients was a game changer!
---
## Pre-Equalization (1)
- Objective
- Compensate for RF impairments to improve upstream performance
- How?
- CMTS analyzes the quality of signals received from cable modem
- It computes equalizer adjustment values and sends them to the modem!
---
## Pre-Equalization (2)
![](https://i.imgur.com/WscRnuY.png =250x)
![](https://i.imgur.com/pTd153f.png =250x)
Cable modem signal with (top) and without (bottom) pre-equalization
---
## Equalizer Taps (Simple)
![](https://i.imgur.com/b5Wyztq.png =400x)
2-Tap Equalizer
---
## Equalizer Taps (DOCSIS 2.0+)
24-Tap Equalizer
![](https://i.imgur.com/KsoZOea.png =300x)
- `Pre-main taps` (b-8 to b-1): lower is better
- `Main tap` (b0): higher is better
- `Post-main taps` (b1 to b15): lower is better
---
## Understanding Tap Values (1)
![](https://i.imgur.com/kqmcUdH.png =400x)
DOCSIS Pre-Equalization Tap Values
---
## Understanding Tap Values (2)
Frequency Response from Pre-Equalization Data
![](https://i.imgur.com/ojfz1u1.png =300x)
- Peak-to-valley: 18 dB > Th (0.5 dB)
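A minimal sketch of how a peak-to-valley figure could be computed from tap values, assuming the taps are available as complex numbers and that the frequency response is taken as a zero-padded DFT of the taps. The function names, the 64-point resolution, and the example taps are illustrative assumptions, not part of the slides.

```python
import cmath
import math

def frequency_response_db(taps, n_points=64):
    """Magnitude response (dB) from a zero-padded DFT of the complex taps."""
    response = []
    for k in range(n_points):
        h = sum(t * cmath.exp(-2j * math.pi * k * n / n_points)
                for n, t in enumerate(taps))
        # Guard against log10(0) for a perfectly nulled frequency bin.
        response.append(20 * math.log10(max(abs(h), 1e-12)))
    return response

def peak_to_valley_db(taps, n_points=64):
    """Difference between the largest and smallest magnitude in dB."""
    resp = frequency_response_db(taps, n_points)
    return max(resp) - min(resp)

# Hypothetical 24-tap sets: only the main tap set, and one added post-main echo.
ideal = [0j] * 8 + [1 + 0j] + [0j] * 15
echo = list(ideal)
echo[10] = 0.25 + 0j

THRESHOLD_DB = 0.5  # threshold value shown on the slide
for name, taps in (("ideal", ideal), ("echo", echo)):
    ripple = peak_to_valley_db(taps)
    print(name, round(ripple, 2), "impaired" if ripple > THRESHOLD_DB else "ok")
```

A flat response (main tap only) has essentially zero ripple; any energy in the other taps shows up as ripple across the band.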
---
## Understanding Tap Values (3)
- `preMTTER`:
- Pre-Main Tap to Total Energy Ratio
- `postMTTER`:
- Post-Main Tap to Total Energy Ratio
- `MRLevel`:
- Micro-reflection level
- `TDR`:
- Time domain reflectometer
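The energy ratios above can be sketched as follows, assuming tap energy is the squared magnitude of each complex tap and that each ratio is expressed in dB. The main-tap index (8 pre-main taps, as on the 24-tap slide), the function name, and the example values are assumptions for illustration. MTR (main tap ratio) is included because the scoreboarding slide refers to it.

```python
import math

MAIN_TAP_INDEX = 8  # assumption: taps b-8..b-1 precede the main tap b0

def _db(ratio):
    """Convert an energy ratio to dB; zero energy maps to -inf."""
    return 10 * math.log10(ratio) if ratio > 0 else float("-inf")

def tap_energy_metrics(taps, main=MAIN_TAP_INDEX):
    """Return preMTTER, postMTTER and MTR (all in dB) for one tap set."""
    energy = [abs(t) ** 2 for t in taps]
    pre = sum(energy[:main])          # energy in pre-main taps
    main_e = energy[main]             # energy in the main tap
    post = sum(energy[main + 1:])     # energy in post-main taps
    total = pre + main_e + post
    side = pre + post
    return {
        "preMTTER": _db(pre / total),
        "postMTTER": _db(post / total),
        "MTR": _db(main_e / side) if side > 0 else float("inf"),
    }

# Hypothetical tap set: main tap plus one small post-main tap.
taps = [0j] * 8 + [1 + 0j, 0.1 + 0j] + [0j] * 14
print(tap_energy_metrics(taps))
```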
---
## Understanding Tap Values (4)
Pre-Equalization Table After Complex FFT
![](https://i.imgur.com/F2n5tJi.png =400x)
- postMTTER > Th --> micro-reflection
- TDR = ~180 feet --> distance between modem and reflection source
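A rough sketch of how a distance can be derived from the position of a dominant post-main tap. It assumes symbol-spaced taps, a 5.12 Msym/s upstream symbol rate, and a coax velocity factor of 0.87; all three are assumptions chosen for illustration, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_FT_PER_S = 983_571_056  # speed of light in vacuum, feet per second

def reflection_distance_ft(tap_offset, symbol_rate_hz=5.12e6, velocity_factor=0.87):
    """Estimate the one-way distance implied by an echo `tap_offset` taps after the main tap."""
    delay_s = tap_offset / symbol_rate_hz  # echo delay relative to the main tap
    round_trip_ft = delay_s * velocity_factor * SPEED_OF_LIGHT_FT_PER_S
    return round_trip_ft / 2               # the echo travels out and back

for offset in (1, 2, 3):
    print(offset, round(reflection_distance_ft(offset), 1))
```

The resolution is limited by the tap spacing, so the estimate only narrows the search to a stretch of cable rather than an exact point.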
---
## Localizing Faults
![](https://i.imgur.com/tqFRxub.png =400x)
- Correlate tap values across modems
- Localize faults using TDR data
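One simple way to picture the correlation step: if several modems show the same impairment signature, the fault is likely at or below the deepest network element they share. The sketch below finds that element in a hypothetical topology; the `parent` map and element names are invented for illustration.

```python
def path_to_root(node, parent):
    """List of elements from `node` up to the root of the tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def lowest_common_ancestor(modems, parent):
    """Deepest element that lies on the upstream path of every given modem."""
    paths = [path_to_root(m, parent) for m in modems]
    common = set(paths[0]).intersection(*paths[1:])
    # Walking upward from the first modem, the first shared element is the deepest.
    return next(n for n in paths[0] if n in common)

# Hypothetical HFC topology: child -> parent.
parent = {
    "amp1": "node", "amp2": "node",
    "tap1": "amp1", "tap2": "amp1", "tap3": "amp2",
    "cm1": "tap1", "cm2": "tap1", "cm3": "tap2", "cm4": "tap3",
}

print(lowest_common_ancestor(["cm1", "cm2"], parent))  # shared tap
print(lowest_common_ancestor(["cm1", "cm3"], parent))  # shared amplifier
```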
---
## Facts
- Cable broadband is available to 93 % of US homes
- other options: DSL (43 %) and Fiber (29 %)
- FCC requires 99.99 % availability, but currently we only have 99 % availability
- PNM --> diagnose RF impairments --> avoid future outages --> improve availability
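To make the availability gap concrete, the percentages above can be converted into downtime per year. This is plain arithmetic; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24

def annual_downtime_hours(availability_pct):
    """Hours per year a service is unavailable at the given availability."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

for pct in (99.0, 99.99):
    print(f"{pct}% -> {annual_downtime_hours(pct):.3f} hours/year")
```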
---
## Measurement Systems (Collection)
- Approach:
- periodically collect instantaneous PNM snapshots from cable modems (every four hours)
- Questions:
- Why such low collection frequency?
- How to collect/analyse data at high frequency?
---
## Measurement Systems (Analysis)
- Approach:
- Scoreboarding (Comcast):
- per-signal (pre-specified) thresholds
- cumulative score across all signals
- MTR example: less than 18 dB 26% of the time
- Questions:
- How to handle noisy (?) data?
- How to adapt thresholds over time?
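A minimal sketch of threshold-based scoreboarding, assuming each signal has a fixed threshold, a direction in which values are bad, and a point value. Only the 18 dB MTR threshold comes from the slide; the other thresholds, the point values, and all names are hypothetical.

```python
# Hypothetical rules: signal -> (threshold, side on which a value is bad, points).
RULES = {
    "snr_db": (30.0, "below", 2),
    "mtr_db": (18.0, "below", 3),
    "uncorrectable_fec_pct": (1.0, "above", 4),
}

def score_modem(sample, rules=RULES):
    """Sum the points of every rule that the sample violates."""
    score = 0
    for signal, (threshold, bad_side, points) in rules.items():
        value = sample.get(signal)
        if value is None:
            continue  # signal missing from this sample
        violated = value < threshold if bad_side == "below" else value > threshold
        if violated:
            score += points
    return score

sample = {"snr_db": 28.5, "mtr_db": 21.0, "uncorrectable_fec_pct": 2.3}
print(score_modem(sample))
```

Because the thresholds are fixed in advance, a sketch like this also shows why noisy data and drifting baselines are a concern.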
---
# CableMon
---
## Data
- Collected for 8 months (2019) from 60 K modems
- PNM Data
- <ts, channel-freq, SNR, Tx/Rx-power, FEC stats, T3/T4 timeouts, pre-equalization coeffs, etc.>---every 4 hours
- Customer Complaint Tickets
- <customer-id, creation-time, close-time, etc.>
---
## Design Goals
- Accuracy
- Ticket prediction accuracy: $\frac{CableMon \cap Customer}{CableMon}$
- Ticket coverage: $\frac{CableMon \cap Customer}{Customer}$
- No manual labeling
- No extensive parameter tuning
- Efficient
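The two accuracy metrics can be sketched with sets, assuming each element identifies one (modem, time window) pair and reading coverage as the fraction of customer tickets that were also flagged. The function names and the example identifiers are hypothetical.

```python
def ticket_prediction_accuracy(flagged, ticketed):
    """Fraction of flagged events that coincide with a customer ticket."""
    return len(flagged & ticketed) / len(flagged) if flagged else 0.0

def ticket_coverage(flagged, ticketed):
    """Fraction of customer tickets that were also flagged."""
    return len(flagged & ticketed) / len(ticketed) if ticketed else 0.0

# Hypothetical (modem, window) identifiers.
flagged = {("cm1", 10), ("cm2", 10), ("cm3", 11), ("cm4", 12)}
ticketed = {("cm1", 10), ("cm3", 11), ("cm5", 13)}

print(ticket_prediction_accuracy(flagged, ticketed))
print(ticket_coverage(flagged, ticketed))
```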
---
## Approach
- Ticketing rate
- average number of customer tickets created in a unit time (4 hours)
- Divide PNM metric into bins
- For each bin, compute the average ticketing rate ($\frac{N_{b}}{T_{b}}$)
Set threshold based on ticketing rate.
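The binning step can be sketched as below, assuming each sample is one 4-hour collection period carrying a metric value and a flag for whether a ticket was created in that period. Here $N_b$ is the ticket count and $T_b$ the number of periods in bin $b$; the function name, bin width, and data are illustrative.

```python
from collections import defaultdict

def ticketing_rate_by_bin(samples, bin_width):
    """Map each bin's lower edge to tickets per collection period (N_b / T_b)."""
    tickets = defaultdict(int)   # N_b: tickets created while the metric was in bin b
    periods = defaultdict(int)   # T_b: collection periods spent in bin b
    for value, ticket_created in samples:
        b = int(value // bin_width)
        periods[b] += 1
        tickets[b] += int(ticket_created)
    return {b * bin_width: tickets[b] / periods[b] for b in periods}

# Hypothetical (SNR in dB, ticket created?) samples.
samples = [(22.0, True), (23.5, False), (31.0, False),
           (33.0, False), (34.9, True), (38.0, False)]
print(ticketing_rate_by_bin(samples, bin_width=5))
```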
---
## Ticketing Rate vs. PNM Data (SNR)
![](https://i.imgur.com/YIUi9R7.png =450x)
Ticketing rate increases
for low SNR values.
---
## Setting Fault Detection Threshold
- **Goal**
- Minimize both FPs and FNs
- **Observation**
- Ticketing rate is higher than normal during faulty periods
- **Approach**
- Set threshold that maximizes ticketing rate ratio
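A small sketch of the threshold search, assuming the ticketing rate ratio is the ticketing rate on the abnormal side of a candidate threshold divided by the rate on the normal side. The candidate list, the sample data, and the choice to skip candidates whose normal side has no tickets (to avoid dividing by zero) are assumptions.

```python
def ticketing_rate(samples):
    """Tickets per collection period for a list of (value, ticket) samples."""
    return sum(t for _, t in samples) / len(samples) if samples else 0.0

def best_threshold(samples, candidates, abnormal_if_below=True):
    """Return (threshold, ratio) with the highest ticketing rate ratio, or None."""
    best = None
    for th in candidates:
        abnormal = [s for s in samples if (s[0] < th) == abnormal_if_below]
        normal = [s for s in samples if (s[0] < th) != abnormal_if_below]
        normal_rate = ticketing_rate(normal)
        if not abnormal or normal_rate == 0:
            continue  # ratio undefined for this candidate
        ratio = ticketing_rate(abnormal) / normal_rate
        if best is None or ratio > best[1]:
            best = (th, ratio)
    return best

# Hypothetical (SNR in dB, ticket created? 1/0) samples.
samples = [(20, 1), (22, 1), (24, 0), (26, 1), (30, 0),
           (32, 0), (34, 1), (36, 0), (38, 0), (40, 0)]
print(best_threshold(samples, candidates=[25, 29, 35]))
```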
---
## Feature Selection
Select the top-K features with the highest ticketing rate ratio
![](https://i.imgur.com/QiaydsS.png =350x)
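Top-K selection reduces to a sort once each feature has a ticketing rate ratio. The feature names and ratio values below are made up for illustration.

```python
def top_k_features(ratio_by_feature, k):
    """Names of the k features with the highest ticketing rate ratio."""
    return sorted(ratio_by_feature, key=ratio_by_feature.get, reverse=True)[:k]

# Hypothetical ratios per PNM feature.
ratios = {"snr": 3.1, "tx_power": 1.4, "uncorrectable_fec": 4.2, "t3_timeouts": 2.0}
print(top_k_features(ratios, k=2))
```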
---
## Minimizing False Positives
![](https://i.imgur.com/b6GeVNi.png =250x)
- Use a moving window of size y (12 data points, i.e., 48 hours)
- If at least x (8) data points in the window are abnormal, then send a dispatch to fix the problem
Response can be much slower!
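The windowing rule can be sketched as an "x out of the last y" filter, using the slide's values (y = 12 points, x = 8) as defaults. The function name and the example flag sequences are illustrative.

```python
from collections import deque

def dispatch_points(abnormal_flags, window=12, min_abnormal=8):
    """Indices at which at least `min_abnormal` of the last `window` points are abnormal."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` flags
    hits = []
    for i, flag in enumerate(abnormal_flags):
        recent.append(bool(flag))
        if sum(recent) >= min_abnormal:
            hits.append(i)
    return hits

# Hypothetical flag sequences, one flag per 4-hour collection.
print(dispatch_points([1, 0] * 10))   # intermittent blips never reach 8 of 12
print(dispatch_points([1] * 8))       # a sustained problem triggers at the 8th point
```

With one sample every four hours, eight abnormal points means at least 32 hours pass before a dispatch, which is the cost of suppressing false positives this way.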
---
## Results (1)
Types of tickets detected
![](https://i.imgur.com/Pf7KR4B.png =400x)
CableMon detected more high-severity tickets than state-of-the-art systems
---
## Results (2)
Distribution of Ticket Life Time (`end-creation time`)
![](https://i.imgur.com/G8ahlov.png =350x)
- A longer ticket life time indicates that the problem that triggered the ticket takes a longer time to fix.
---
## Results (3)
- Distribution of Report Waiting Time
- time difference between when problem started and when customer reported it
![](https://i.imgur.com/XyVlzxf.png =350x)
- A shorter report waiting time indicates that the problem that triggered the ticket is more urgent.
---
## Summary
- Proactive Network Maintenance (PNM) data
- What PNM is, and how it is used to proactively detect and diagnose RF impairments in the last mile
- CableMon
- What CableMon is, and how it uses customer complaint tickets to process PNM data for inferring RF impairments