CS 176C:
Monitoring Systems for Cable Networks
Slides
Learning Objectives
Proactive Network Maintenance (PNM)
What is PNM, what is pre-equalization, etc.
How is PNM data currently used?
CableMon
design goals, rationale, etc.
how does it advance the state of the art?
HFC Architecture
Key components
CMTS, fiber optical node, (trunk/line) RF amplifiers, splitters, cable modem (not shown)
Proactive Network Maintenance (PNM)
What? Measurement data collected at cable modem (e.g., SNR, FEC stats, etc.)
Why? Proactively detect, localize, and correct impairments
DOCSIS standardizes how to
store it in a local MIB
query it using SNMP
PNM Measurement Data
Previous generation:
Data: error rates, SNRs, etc.
Limits: not suited for root-cause analysis
Next generation:
Data: pre-equalization coefficients, full-band capture, etc.
Limits: hard to scale
Analysis of pre-equalization coefficients was a game changer!
Pre-Equalization (1)
Objective
Compensate for RF impairments to improve upstream performance
How?
CMTS analyzes the quality of signals received from cable modem
It computes equalizer adjustment values and sends them to the modem!
Pre-Equalization (2)
Cable modem signal with (top) and without (bottom) pre-equalization
Equalizer Taps (Simple)
2-Tap Equalizer
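A minimal sketch of the two-tap idea, assuming real-valued symbols and a hypothetical channel that adds one echo a symbol later at 25% amplitude. The names `fir_filter`, `channel`, and `pre_eq` are illustrative, not DOCSIS terms.

```python
def fir_filter(samples, taps):
    """Apply an FIR filter: y[n] = sum over k of taps[k] * x[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out


def max_error(received, sent):
    """Largest absolute deviation between received and sent symbols."""
    return max(abs(r - s) for r, s in zip(received, sent))


# Hypothetical channel: direct path plus a 25% echo one symbol later.
channel = [1.0, 0.25]
# Two-tap pre-equalizer: main tap of 1.0 and one post-main tap that
# cancels the first-order echo before the signal enters the channel.
pre_eq = [1.0, -0.25]

symbols = [1.0, -1.0, 1.0, 1.0, -1.0]
received_plain = fir_filter(symbols, channel)
received_pre_eq = fir_filter(fir_filter(symbols, pre_eq), channel)

error_plain = max_error(received_plain, symbols)
error_pre_eq = max_error(received_pre_eq, symbols)
```

With only two taps the echo is reduced rather than removed, since a smaller second-order residue remains; this is one reason real equalizers use more taps.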
Equalizer Taps (DOCSIS 2.0+)
24-Tap Equalizer
Pre-main taps
(\(b_{-8}\) to \(b_{-1}\)): lower is better
Main tap
(\(b_0\)): higher is better
Post-main taps
(\(b_1\) to \(b_{15}\)): lower is better
Understanding Tap Values (1)
DOCSIS Pre-Equalization Tap Values
Understanding Tap Values (2)
Frequency Response from Pre-Equalization Data
Peak-to-valley: 18 dB > Th (0.5 dB)
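A rough sketch of how a frequency response and its peak-to-valley value could be derived from tap values, using a plain zero-padded DFT. The tap values are hypothetical, and the main tap is placed first for simplicity (shifting it only adds delay, which does not change the magnitude response).

```python
import cmath
import math


def frequency_response_db(taps, n_points=64):
    """Magnitude response in dB of complex taps via a zero-padded DFT."""
    response = []
    for k in range(n_points):
        acc = 0j
        for n, tap in enumerate(taps):
            acc += tap * cmath.exp(-2j * math.pi * k * n / n_points)
        # Floor the magnitude so a perfect null does not break log10.
        response.append(20 * math.log10(max(abs(acc), 1e-12)))
    return response


def peak_to_valley_db(taps, n_points=64):
    """Difference between the highest and lowest magnitude in dB."""
    response = frequency_response_db(taps, n_points)
    return max(response) - min(response)


THRESHOLD_DB = 0.5  # the threshold Th quoted on the slide

flat_taps = [1 + 0j]                      # no echo: flat response
echo_taps = [1 + 0j, 0j, 0j, 0.1 + 0j]    # hypothetical 10% echo, 3 taps late

flat_ripple = peak_to_valley_db(flat_taps)
echo_ripple = peak_to_valley_db(echo_taps)
echo_is_impaired = echo_ripple > THRESHOLD_DB
```

A single echo produces a periodic ripple across the channel, so a larger echo shows up directly as a larger peak-to-valley value.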
Understanding Tap Values (3)
preMTTER: Pre-Main Tap to Total Energy Ratio
postMTTER: Post-Main Tap to Total Energy Ratio
MRLevel: Micro-Reflection Level
TDR: Time Domain Reflectometer
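The two energy ratios can be sketched directly from their names: energy in the pre-main (or post-main) taps divided by total tap energy, in dB. This is a hedged illustration; the main tap index follows the slide's numbering (eight pre-main taps, so 0-based index 8), and the tap values are made up.

```python
import math


def energy_ratio_db(part, total):
    """Return 10*log10(part/total), or -inf when the part has no energy."""
    if part <= 0:
        return float("-inf")
    return 10 * math.log10(part / total)


def pre_post_mtter(taps, main_index=8):
    """Return (preMTTER, postMTTER) in dB for a list of complex taps."""
    energies = [abs(t) ** 2 for t in taps]
    total = sum(energies)
    pre = sum(energies[:main_index])
    post = sum(energies[main_index + 1:])
    return energy_ratio_db(pre, total), energy_ratio_db(post, total)


# Hypothetical 24-tap set: strong main tap, a weak pre-main tap,
# and a post-main tap at 10% amplitude (a possible micro-reflection).
taps = [0j] * 24
taps[8] = 1 + 0j
taps[7] = 0.01 + 0j
taps[11] = 0.1 + 0j

pre_db, post_db = pre_post_mtter(taps)
```

Both values are negative because each is a fraction of the total energy; values closer to 0 dB mean more energy sits outside the main tap.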
Understanding Tap Values (4)
Pre-Equalization Table After Complex FFT
postMTTER > Th -> micro-reflection
TDR = ~180 feet -> distance between modem and reflection source
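A back-of-the-envelope sketch of how an echo's tap position could map to a distance. Every constant here is an assumption rather than something stated on the slide: symbol-spaced taps, a 5.12 Msym/s upstream channel, and a cable velocity factor of 0.87. The function names are hypothetical.

```python
SPEED_OF_LIGHT_FT_PER_S = 983_571_056.0  # speed of light in vacuum, feet/second


def strongest_post_tap_offset(taps, main_index=8):
    """Offset (in taps after the main tap) of the largest post-main tap."""
    magnitudes = [abs(t) for t in taps[main_index + 1:]]
    return magnitudes.index(max(magnitudes)) + 1


def echo_distance_ft(tap_offset, symbol_rate_hz=5.12e6, velocity_factor=0.87):
    """Estimate the one-way distance in feet for an echo at a given tap offset."""
    echo_delay_s = tap_offset / symbol_rate_hz  # assumes symbol-spaced taps
    round_trip_ft = echo_delay_s * velocity_factor * SPEED_OF_LIGHT_FT_PER_S
    return round_trip_ft / 2.0  # the echo travels the extra path twice


# Hypothetical taps with the strongest echo two taps after the main tap.
taps = [0j] * 24
taps[8] = 1 + 0j
taps[10] = 0.1 + 0j

offset = strongest_post_tap_offset(taps)
distance_ft = echo_distance_ft(offset)
```

Under these assumptions each tap corresponds to roughly 84 feet, so whole-tap resolution is coarse; finer estimates would need interpolation between taps.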
Localizing Faults
Correlate tap values across modems
Localize faults using TDR data
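One plausible reading of "correlate across modems" is that a fault shared by several modems sits at or below the deepest network element they have in common. The sketch below assumes a hypothetical tree topology given as a child-to-parent map; it is an illustration, not the method used by any particular operator.

```python
def path_to_root(node, parent):
    """List of nodes from the given node up to the root of the tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def deepest_common_ancestor(nodes, parent):
    """Deepest element shared by the upstream paths of all given nodes."""
    paths = [path_to_root(n, parent) for n in nodes]
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    # Walking upward from the first node, the first shared element is the deepest.
    for node in paths[0]:
        if node in common:
            return node
    return None


# Hypothetical HFC topology (child -> parent).
parent = {
    "modem_a": "tap_1", "modem_b": "tap_1", "modem_c": "tap_2",
    "modem_d": "tap_3",
    "tap_1": "amp_1", "tap_2": "amp_1", "tap_3": "amp_2",
    "amp_1": "fiber_node", "amp_2": "fiber_node",
}

suspect_small = deepest_common_ancestor(["modem_a", "modem_b"], parent)
suspect_large = deepest_common_ancestor(["modem_a", "modem_b", "modem_c"], parent)
```

The TDR distance from each affected modem could then narrow the search along the cable below the suspected element.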
Facts
Cable broadband is available to 93% of US homes
other options: DSL (43%) and Fiber (29%)
FCC requires 99.99% availability, but currently we only have 99% availability
PNM -> diagnose RF impairments -> avoid future outages -> improve availability
Measurement Systems (Collection)
Approach:
periodically collect instantaneous PNM snapshots from cable modems (every four hours)
Questions:
Why such a low collection frequency?
How to collect/analyse data at high frequency?
Measurement Systems (Analysis)
Approach:
Scoreboarding (Comcast):
per-signal (pre-specified) thresholds
cumulative score across all signals
MTR example (below 18 dB for 26% of the time)
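The scoreboarding idea can be sketched as a table of per-signal thresholds whose violations add up to one score. Apart from the 18 dB MTR limit taken from the example above, every threshold, weight, and field name here is a hypothetical placeholder.

```python
# signal name -> (direction of violation, limit, points added on violation)
THRESHOLDS = {
    "snr_db": ("below", 30.0, 2),                 # hypothetical limit and weight
    "mtr_db": ("below", 18.0, 3),                 # 18 dB from the slide; weight hypothetical
    "uncorrectable_fec_pct": ("above", 1.0, 4),   # hypothetical limit and weight
}


def score_modem(measurement):
    """Sum the points of every signal that violates its fixed threshold."""
    score = 0
    for signal, (direction, limit, points) in THRESHOLDS.items():
        value = measurement.get(signal)
        if value is None:
            continue  # signal not reported in this snapshot
        violated = value < limit if direction == "below" else value > limit
        if violated:
            score += points
    return score


healthy = {"snr_db": 36.0, "mtr_db": 25.0, "uncorrectable_fec_pct": 0.0}
degraded = {"snr_db": 35.0, "mtr_db": 17.0, "uncorrectable_fec_pct": 0.1}
failing = {"snr_db": 24.0, "mtr_db": 12.0, "uncorrectable_fec_pct": 3.5}

scores = [score_modem(m) for m in (healthy, degraded, failing)]
```

Because both the limits and the weights are fixed by hand, this style of scoring leads directly to the questions that follow about noise and threshold drift.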
Questions:
How to handle noisy (?) data?
How to adapt thresholds over time?
Data
Collected for 8 months (2019) from 60 K modems
PNM Data
<ts, channel-freq, SNR, Tx/Rx-power, FEC stats, T3/T4 timeouts, pre-equalization coeffs, etc.>, collected every 4 hours
Customer Complaint Tickets
<customer-id, creation-time, close-time, etc.>
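As an illustration only, the two record types could be modeled as below. The field names mirror the tuples above, while the units, types, and the `life_time` helper are assumptions added for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class PnmRecord:
    """One PNM snapshot from a modem (units are assumed, not from the slide)."""
    ts: datetime
    channel_freq_hz: float
    snr_db: float
    tx_power_dbmv: float
    rx_power_dbmv: float
    fec_uncorrectable: int
    t3_timeouts: int
    t4_timeouts: int
    pre_eq_coeffs: List[complex] = field(default_factory=list)


@dataclass
class Ticket:
    """One customer complaint ticket."""
    customer_id: str
    creation_time: datetime
    close_time: datetime

    @property
    def life_time(self) -> timedelta:
        """Time between ticket creation and ticket close."""
        return self.close_time - self.creation_time


record = PnmRecord(datetime(2019, 3, 1, 4), 30.0e6, 34.5, 45.0, 0.5, 12, 0, 0)
ticket = Ticket("customer-1", datetime(2019, 3, 1, 8), datetime(2019, 3, 2, 8))
```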
Design Goals
Accuracy
Ticket prediction accuracy: \(\frac{|CableMon \cap Customer|}{|CableMon|}\)
Ticket coverage: \(\frac{|CableMon \cap Customer|}{|Customer|}\)
No manual labeling
No extensive parameter tuning
Efficient
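The two accuracy metrics above can be sketched with plain sets, reading coverage as the fraction of customer tickets that CableMon also flags (an assumption about the intended formula). The identifiers are hypothetical.

```python
def ticket_prediction_accuracy(predicted, reported):
    """Fraction of CableMon predictions that match a customer ticket."""
    if not predicted:
        return 0.0
    return len(predicted & reported) / len(predicted)


def ticket_coverage(predicted, reported):
    """Fraction of customer tickets that CableMon also predicted."""
    if not reported:
        return 0.0
    return len(predicted & reported) / len(reported)


# Hypothetical identifiers of flagged modems and of modems with tickets.
cablemon = {"m1", "m2", "m3", "m4"}
customer = {"m2", "m3", "m5"}

accuracy = ticket_prediction_accuracy(cablemon, customer)
coverage = ticket_coverage(cablemon, customer)
```

The two metrics pull in opposite directions: flagging more modems tends to raise coverage and lower accuracy.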
Approach
Ticketing rate
average number of customer tickets created in a unit time (4 hours)
Divide PNM metric into bins
For each bin, compute the average ticketing rate ( \(\frac{N_{b}}{T_{b}}\) )
Set threshold based on ticketing rate.
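The binning steps above can be sketched as follows, assuming each sample pairs a metric value with the number of tickets created in that 4-hour interval, so that \(N_b\) is the ticket count and \(T_b\) the number of intervals in bin \(b\). The sample data and the function name are hypothetical.

```python
import math
from collections import defaultdict


def ticketing_rate_by_bin(samples, bin_width):
    """Map each bin's lower edge to tickets per interval (N_b / T_b)."""
    tickets = defaultdict(int)
    intervals = defaultdict(int)
    for value, n_tickets in samples:
        b = math.floor(value / bin_width)
        tickets[b] += n_tickets
        intervals[b] += 1
    return {b * bin_width: tickets[b] / intervals[b] for b in sorted(intervals)}


# Hypothetical (SNR in dB, tickets created in that 4-hour interval) samples.
samples = [
    (22.0, 1), (23.5, 0), (24.9, 1),
    (31.0, 0), (33.0, 0), (34.5, 1),
    (36.0, 0), (38.0, 0),
]

rates = ticketing_rate_by_bin(samples, bin_width=5)
```

In this toy data the lowest SNR bin has the highest ticketing rate, which is the pattern a threshold would then be chosen to capture.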
Ticketing Rate vs. PNM Data (SNR)
Ticketing rate increases for low SNR values.
Setting Fault Detection Threshold
Goal
Minimize both false positives (FPs) and false negatives (FNs)
Observation
Ticketing rate is higher than normal during faulty periods
Approach
Set threshold that maximizes ticketing rate ratio
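A hedged sketch of the threshold search, assuming the ticketing rate ratio is the ticketing rate while the metric is abnormal (here: below the threshold) divided by the rate while it is normal. The data, candidate thresholds, and function names are hypothetical.

```python
def ticketing_rate(samples):
    """Average number of tickets per collection interval."""
    if not samples:
        return 0.0
    return sum(tickets for _, tickets in samples) / len(samples)


def ticketing_rate_ratio(samples, threshold):
    """Ticketing rate below the threshold divided by the rate at or above it."""
    abnormal = [s for s in samples if s[0] < threshold]
    normal = [s for s in samples if s[0] >= threshold]
    normal_rate = ticketing_rate(normal)
    if not abnormal or normal_rate == 0:
        return 0.0  # ratio undefined for this split; treat it as uninformative
    return ticketing_rate(abnormal) / normal_rate


def best_threshold(samples, candidates):
    """Candidate threshold with the largest ticketing rate ratio."""
    return max(candidates, key=lambda th: ticketing_rate_ratio(samples, th))


# Hypothetical (SNR in dB, tickets in that interval) samples.
samples = [
    (20, 1), (22, 1), (24, 0), (26, 1), (28, 0),
    (30, 0), (32, 0), (34, 1), (36, 0), (38, 0),
]

chosen = best_threshold(samples, candidates=[25, 29, 33])
chosen_ratio = ticketing_rate_ratio(samples, chosen)
```

No labeled faults are needed: the customer tickets themselves act as the signal that separates abnormal from normal periods.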
Feature Selection
Select the top-K features with the highest ticketing rate ratio
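Given one ticketing rate ratio per feature, the selection step reduces to a ranking. The ratio values below are invented purely to show the mechanics.

```python
def select_top_k(feature_ratios, k):
    """Names of the k features with the largest ticketing rate ratio."""
    ranked = sorted(feature_ratios.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical best ticketing rate ratio achieved by each feature.
feature_ratios = {
    "snr": 3.0,
    "postMTTER": 2.4,
    "tx_power": 1.1,
    "t3_timeouts": 1.8,
}

selected = select_top_k(feature_ratios, k=2)
```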
Minimizing False Positives
Use a moving window of size y (12 data points, i.e., 48 hours)
If x (8) data points in the window are abnormal, then send a dispatch to fix the problem
Response can be much slower!
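The x-of-y rule can be sketched with a fixed-length queue, assuming the window holds the last 12 four-hour samples (48 hours) and a dispatch is raised once at least 8 of them are abnormal. The function name and the input sequences are hypothetical.

```python
from collections import deque


def dispatch_indices(abnormal_flags, window_size=12, min_abnormal=8):
    """Indices at which the moving window holds enough abnormal samples."""
    window = deque(maxlen=window_size)  # oldest sample drops out automatically
    alerts = []
    for i, flag in enumerate(abnormal_flags):
        window.append(bool(flag))
        if sum(window) >= min_abnormal:
            alerts.append(i)
    return alerts


# Intermittent blips: never 8 abnormal samples within any 12-sample window.
blips = [1, 0] * 6
# Persistent fault: 8 abnormal samples in a row.
persistent = [1] * 8

blip_alerts = dispatch_indices(blips)
persistent_alerts = dispatch_indices(persistent)
```

The first alert for the persistent fault arrives only at the eighth sample, about 32 hours in, which illustrates the slower response noted above.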
Results (1)
Types of tickets detected
CableMon detected more high-severity tickets than state-of-the-art systems
Results (2)
Distribution of Ticket Life Time (end time - creation time)
A longer ticket life time indicates that the problem that triggered the ticket takes a longer time to fix.
Results (3)
Distribution of Report Waiting Time
time difference between when problem started and when customer reported it
A shorter report waiting time indicates that the problem that triggered the ticket is more urgent.
Summary
Proactive Network Maintenance (PNM) data
What is PNM, how it is used to proactively detect and diagnose RF impairments in the last mile
CableMon
What is CableMon, and how it uses customer complaint tickets to process PNM data for inferring RF impairments