# Media Mix Model
This document summarizes the paper **[Bayesian Hierarchical Media Mix Model Incorporating Reach and Frequency Data](https://research.google/pubs/bayesian-hierarchical-media-mix-model-incorporating-reach-and-frequency-data/)** by Yingxiang Zhang, Mike Wurm, Alexander Wakim, Eddie Li, and Ying Liu (Google LLC).
## Overview
The paper extends **Marketing Mix Modeling (MMM)** within a **Bayesian hierarchical framework** by introducing a methodology that incorporates **Reach & Frequency (R&F)** data. The model, called **R&F MMM**, builds on the **Geo-level Bayesian Hierarchical Media Mix Model (GBHMMM)** and aims to improve the estimation of advertising impact by separating unique audience exposure (reach) from repeated exposure per person (frequency).
## 1. Key Concepts
### 1.1. Marketing Mix Modeling (MMM)
- **Objective**: Quantify the contribution of marketing channels (TV, digital, print, etc.) to KPIs such as revenue or conversions.
- **Traditional Model**:
$$
y_t = \beta_0 + \sum_{m=1}^M \beta_m x_{t,m} + \sum_{c=1}^C \gamma_c z_{t,c} + \varepsilon_t
$$
Where:
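- $x_{t,m}$: media spend/impressions for channel $m$ at time $t$
- $z_{t,c}$: control variables (weather, price, etc.)
- $y_t$: response (e.g., sales)
- $\beta_0$, $\beta_m$, $\gamma_c$: intercept, media coefficients, and control coefficients
- $\varepsilon_t$: error term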
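
To make the regression above concrete, here is a minimal sketch that fits it by ordinary least squares on synthetic data. The dimensions, the "true" coefficients, and the noise level are all made-up assumptions for illustration; the paper itself uses Bayesian estimation, not OLS.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, C = 200, 2, 1                       # time periods, media channels, controls (assumed)
x = rng.gamma(2.0, 1.0, size=(T, M))      # media variables x_{t,m}
z = rng.normal(size=(T, C))               # control variables z_{t,c}

# Hypothetical ground-truth coefficients used only to generate toy data.
beta0, beta, gamma = 5.0, np.array([1.5, 0.7]), np.array([-0.4])
y = beta0 + x @ beta + z @ gamma + rng.normal(scale=0.1, size=T)

# Design matrix: intercept column, then media, then controls.
X = np.column_stack([np.ones(T), x, z])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # estimates of [beta0, beta_1, beta_2, gamma_1]
```

With low noise and independent regressors the estimates land close to the generating values; real media data is usually far more collinear, which is part of the motivation for the hierarchical model in the next section.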
---
## 2. GBHMMM (Geo-level Bayesian Hierarchical Media Mix Model)
### 2.1. Motivation
Traditional national-level MMM struggles with:
- Coarse data: a single national time series hides regional variation
- Small sample sizes relative to the number of parameters, which leads to poor identifiability
### 2.2. Model Specification
For geo $g$ and time $t$, the GBHMMM model is:
$$
y_{t,g} = \tau_g + \sum_{m=1}^M \beta_{m,g} \cdot \text{Adstock}\left(\text{Hill}(x^*_{t,m,g}; K_m, S_m), \alpha_m, L\right) + \sum_{c=1}^C \gamma_{c,g} z_{t,c,g} + \varepsilon_{t,g}
$$
Where:
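- $\text{Hill}(x; K, S) = \frac{1}{1 + (x/K)^{-S}}$: models **diminishing returns**; $K$ is the half-saturation point ($\text{Hill}(K; K, S) = 0.5$) and $S$ controls the shape of the curve
- $\text{Adstock}(\cdot, \alpha_m, L)$: models **carryover**/lagged effects via geometric decay with rate $\alpha_m$ over at most $L$ periods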
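
The two transformations can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation: the adstock below normalizes by the sum of the geometric weights and treats periods before the start of the series as zero, both of which are assumptions, and the parameter values are arbitrary.

```python
import numpy as np

def hill(x, K, S):
    """Hill(x; K, S) = 1 / (1 + (x/K)^(-S)), in an equivalent form that equals 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    return x**S / (K**S + x**S)

def adstock(x, alpha, L):
    """Geometric adstock over the current and previous L-1 periods (weights alpha^l)."""
    x = np.asarray(x, dtype=float)
    w = alpha ** np.arange(L)                       # alpha^0, ..., alpha^(L-1)
    out = np.empty_like(x)
    for t in range(len(x)):
        lags = x[max(0, t - L + 1): t + 1][::-1]    # x_t, x_{t-1}, ...
        out[t] = np.dot(w[:len(lags)], lags) / w.sum()
    return out

# As in the model equation, Hill is applied first and adstock second.
media = np.array([0.0, 4.0, 0.0, 0.0, 2.0])
transformed = adstock(hill(media, K=2.0, S=3.0), alpha=0.5, L=3)
print(transformed)
```

A single burst of media in period 1 keeps contributing in periods 2 and 3 with geometrically shrinking weight, which is the carryover behaviour described above.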
**Bayesian Hierarchical Priors**:
$$
\beta_{m,g} \sim \mathcal{N}(\beta_m, \eta_m^2), \quad \tau_g \sim \mathcal{N}(\tau, \kappa^2), \quad \varepsilon_{t,g} \sim \mathcal{N}(0, \sigma^2)
$$
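
As a rough illustration of what the hierarchical priors imply, the sketch below draws geo-level parameters around shared national-level values. All numbers (number of geos, means, standard deviations) are hypothetical and chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
G, M = 50, 2                                # hypothetical: 50 geos, 2 media channels
beta_m = np.array([1.0, 0.5])               # national-level media effects beta_m
eta_m = np.array([0.2, 0.1])                # between-geo spread eta_m per channel
tau, kappa = 10.0, 1.0                      # national baseline and its spread

beta_mg = rng.normal(beta_m, eta_m, size=(G, M))   # beta_{m,g} ~ N(beta_m, eta_m^2)
tau_g = rng.normal(tau, kappa, size=G)             # tau_g ~ N(tau, kappa^2)

print(beta_mg.mean(axis=0), beta_mg.std(axis=0))
```

Each geo gets its own coefficient, but all of them are pulled toward the same channel-level mean; a smaller $\eta_m$ means stronger pooling across geos.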
## 3. R&F MMM – Key Innovation
### 3.1. Motivation
**Impressions = Reach × Frequency**, but impressions don’t tell the whole story:
- 1,000 impressions could be 1,000 people once or 10 people 100 times
- Frequency effects are **nonlinear** (diminishing returns or ad fatigue)
### 3.2. R&F Model Equation
$$
y_{t,g} = \tau_g + \sum_{m=1}^M \beta_{m,g} \cdot \text{Adstock}\left(r^*_{t,m,g} \cdot \text{Hill}(f^*_{t,m,g}; K_m, S_m), \alpha_m, L\right) + \sum_{c=1}^C \gamma_{c,g} z_{t,c,g} + \varepsilon_{t,g}
$$
Where:
- $r_{t,m,g}$: Reach (unique people exposed)
- $f_{t,m,g}$: Frequency (average number of exposures)
### 3.3. Intuition
- Reach effect is modeled **linearly** to avoid overparameterization.
- Frequency undergoes **Hill transformation** to capture threshold and saturation behavior.
- Result: **Response surface** instead of a **response curve**.
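
A small numeric sketch shows why reach and frequency matter separately. It evaluates the media term $r \cdot \text{Hill}(f; K, S)$ for the two plans from the motivation (1,000 people once versus 10 people 100 times). The values of $K$ and $S$ are arbitrary assumptions, and adstock and $\beta$ are left out for brevity.

```python
import numpy as np

def hill(f, K, S):
    """Hill(f; K, S) = 1 / (1 + (f/K)^(-S)), in an equivalent form that equals 0 at f = 0."""
    f = np.asarray(f, dtype=float)
    return f**S / (K**S + f**S)

K, S = 3.0, 2.0                        # hypothetical Hill parameters
reach = np.array([1000.0, 10.0])       # unique people exposed
freq = np.array([1.0, 100.0])          # average exposures per reached person

impressions = reach * freq             # both plans deliver 1,000 impressions
effect = reach * hill(freq, K, S)      # reach enters linearly, frequency through Hill
print(impressions, effect)
```

Both plans deliver the same impressions, yet the modeled effect differs by roughly a factor of ten under these parameters. An impressions-only model would assign them the same value.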
## 4. Attribution Metrics
### 4.1. ROAS (Return on Ad Spend)
$$
ROAS_{m,g} = \frac{\sum_{t=T_0}^{T_1} \beta_{m,g} r_{t,m,g} \cdot \text{Hill}(f_{t,m,g}; K_m, S_m)}{\sum_{t=T_0}^{T_1} C_{t,m,g}}
$$
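
The ROAS formula translates directly into code. The sketch below follows the equation exactly as written above (no adstock term) for a single channel and geo; the function name `roas` and the toy inputs are assumptions for illustration.

```python
import numpy as np

def hill(f, K, S):
    """Hill(f; K, S) = 1 / (1 + (f/K)^(-S)), in an equivalent form that equals 0 at f = 0."""
    f = np.asarray(f, dtype=float)
    return f**S / (K**S + f**S)

def roas(beta, reach, freq, cost, K, S):
    """Incremental response over the window divided by total cost over the same window."""
    incremental = beta * reach * hill(freq, K, S)
    return incremental.sum() / cost.sum()

# Toy inputs for two time periods.
reach = np.array([100.0, 200.0])
freq = np.array([3.0, 3.0])
cost = np.array([50.0, 100.0])
value = roas(beta=2.0, reach=reach, freq=freq, cost=cost, K=3.0, S=1.0)
print(value)  # 2.0 for these toy numbers
```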
### 4.2. mROAS (Marginal ROAS)
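- Measures the incremental return per additional unit of spend when either Reach or Frequency is increased by 1%
- Calculated using perturbations, where $\Delta y$ is the change in predicted response relative to the unperturbed $(R, F)$, and $R_{1.01}$, $F_{1.01}$ denote a 1% increase: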
$$
mROAS^\text{Reach}_{m,g} = \frac{\Delta y(R_{1.01}, F)}{0.01 \cdot \text{Spend}}, \quad mROAS^\text{Freq}_{m,g} = \frac{\Delta y(R, F_{1.01})}{0.01 \cdot \text{Spend}}
$$
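
The perturbation can be sketched numerically as follows. The helper `predicted` and all inputs are hypothetical; the sketch uses the same simplified media term as the ROAS formula (no adstock) and follows the denominator $0.01 \cdot \text{Spend}$ as written.

```python
import numpy as np

def hill(f, K, S):
    """Hill(f; K, S) = 1 / (1 + (f/K)^(-S)), in an equivalent form that equals 0 at f = 0."""
    f = np.asarray(f, dtype=float)
    return f**S / (K**S + f**S)

def predicted(beta, reach, freq, K, S):
    """Media-driven response summed over the window (hypothetical helper)."""
    return (beta * reach * hill(freq, K, S)).sum()

beta, K, S = 2.0, 3.0, 1.0
reach = np.array([100.0, 200.0])
freq = np.array([3.0, 3.0])
spend = np.array([50.0, 100.0]).sum()

base = predicted(beta, reach, freq, K, S)
# Scale reach (or frequency) by 1.01 and divide the lift by 1% of spend.
mroas_reach = (predicted(beta, 1.01 * reach, freq, K, S) - base) / (0.01 * spend)
mroas_freq = (predicted(beta, reach, 1.01 * freq, K, S) - base) / (0.01 * spend)
print(mroas_reach, mroas_freq)
```

Because reach enters linearly, its marginal ROAS equals the average ROAS in this simplified setting, while the frequency mROAS is lower because the Hill curve is already flattening.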
## 5. 🎯 Optimal Frequency Estimation
### Goal:
Find the **frequency $f_m$** that maximizes **national-level ROAS**, under the constraint that frequency is held at the same value for every time period and geo:
$$
f_{t,m,g} = f_m \quad \forall t, g
$$
### Optimization Equation:
$$
ROAS_m(f_m) = \sum_{g=1}^G \sum_{t=T_0}^{T_1} \frac{x_{t,m,g} \cdot \beta_{m,g} \cdot \text{Hill}(f_m; K_m, S_m)}{C_m f_m}
$$
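Two Bayesian approaches:
1. **Maximizing the posterior expected ROAS**: average $ROAS_m(f_m)$ over posterior draws, then pick the $f_m$ that maximizes the averaged curve
2. **Sample-wise optimization**: find the maximizing $f_m$ separately for each posterior draw of the parameters, which yields a distribution of optimal frequencies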
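
Both approaches can be sketched with a grid search. In this sketch the first approach is read as maximizing the ROAS curve averaged over posterior draws, which is an interpretation. The "posterior draws" are stand-in random numbers rather than output from a fitted model, and the ROAS curve is taken, up to a constant factor, as $\beta \cdot \text{Hill}(f)/f$ following the optimization equation above.

```python
import numpy as np

def hill(f, K, S):
    """Hill(f; K, S) = 1 / (1 + (f/K)^(-S)), in an equivalent form."""
    return f**S / (K**S + f**S)

rng = np.random.default_rng(2)
n_draws = 500
# Stand-in posterior draws (assumption); real draws would come from the fitted model.
K_draws = rng.normal(3.0, 0.2, size=n_draws)
S_draws = rng.normal(2.0, 0.1, size=n_draws)
beta_draws = rng.normal(1.0, 0.1, size=n_draws)

f_grid = np.linspace(0.5, 10.0, 400)
# One ROAS curve per draw, shape (n_draws, len(f_grid)).
curves = (beta_draws[:, None]
          * hill(f_grid[None, :], K_draws[:, None], S_draws[:, None])
          / f_grid[None, :])

f_opt_mean = f_grid[curves.mean(axis=0).argmax()]   # approach 1: optimize the mean curve
f_opt_draws = f_grid[curves.argmax(axis=1)]         # approach 2: optimize each draw
lo, hi = np.percentile(f_opt_draws, [5, 95])
print(f_opt_mean, (lo, hi))
```

The first approach returns a single recommended frequency. The second also gives an uncertainty interval around it, at the cost of one optimization per draw.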
## 6. Simulation Study
### 6.1. User-Level Simulation
- **Reach** modeled as Bernoulli: $r_{i,t,m} \sim \text{Bern}(p_{t,m})$
- **Frequency** if reached: $x_{i,t,m} \sim \text{ZTP}(\lambda_{t,m})$
- Impressions $x$ transformed to **sales** via:
$$
\text{Conv}(x) = \frac{1}{1 + (x/K_m)^{-S_m}}; \quad \text{Adstocked with decay } a_m
$$
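
The user-level data-generating process can be sketched for one channel and one time period as below. The zero-truncated Poisson sampler uses simple rejection (redraw zeros), and the population size and parameter values are arbitrary assumptions; the adstock step is omitted to keep the sketch short.

```python
import numpy as np

def sample_ztp(lam, size, rng):
    """Zero-truncated Poisson via rejection: redraw any zeros until none remain."""
    out = rng.poisson(lam, size=size)
    zeros = out == 0
    while zeros.any():
        out[zeros] = rng.poisson(lam, size=zeros.sum())
        zeros = out == 0
    return out

rng = np.random.default_rng(3)
n_users, p, lam = 100_000, 0.3, 2.0       # hypothetical population and parameters
K, S = 2.0, 1.5                           # hypothetical Hill parameters

reached = rng.random(n_users) < p                         # r_i ~ Bern(p)
impressions = np.zeros(n_users, dtype=np.int64)
impressions[reached] = sample_ztp(lam, reached.sum(), rng)  # x_i ~ ZTP(lam) if reached

x = impressions.astype(float)
conv = x**S / (K**S + x**S)               # Hill-type conversion, 0 for unreached users

reach = reached.sum()
frequency = impressions[reached].mean()   # average exposures among reached users
print(reach, frequency, conv.sum())
```

Aggregating this way reproduces the identity impressions = reach × frequency, and the simulated average frequency is close to the ZTP mean $\lambda / (1 - e^{-\lambda})$.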
### 6.2. Results Summary
- **R² for R&F MMM (Geo)**: 0.99
- **R² for GBHMMM (Geo)**: 0.97
- **ROAS estimates** more accurate and tighter under R&F MMM
- Optimal frequency estimation aligns with simulation ground truth
## 7. Key Takeaways
- **R&F MMM is a meaningful extension** of traditional MMM, incorporating richer user exposure dynamics.
- **Bayesian hierarchical structure** shares information across geos, which helps stabilize estimates when region-level data are sparse.
- **Simulation-based validation** against a known ground truth supports the credibility of the methodology.
- Enables **media budget optimization** using mathematically interpretable and statistically grounded estimates.
## 📌 Glossary Reference
Term | Meaning | Explanation (Simple Language)
-------- | -------- | --------
MMM (Marketing Mix Modeling) | A statistical method to estimate the impact of marketing spend on sales | Multivariable regression model over time; goal is to measure which channels (TV, digital, etc.) drive business outcomes.
GBHMMM (Geo-level Bayesian Hierarchical Media Mix Modeling) | An advanced Bayesian MMM model using regional data | Adds a hierarchical (multi-level) structure: estimates allow sharing information across regions (geos).
R&F (Reach and Frequency) | Unique users exposed and how often they're exposed | Reach = number of unique people seeing an ad. Frequency = how many times they see it.
Adstock | Models how advertising effect decays over time | Like exponential decay: impact of an ad drops off after the user first sees it.
Hill Function | Models diminishing returns from ad exposure | A smooth "S" curve showing that the more ads you show, the less extra impact you get per additional ad.
Carryover Effect | The delayed impact of an ad on behavior | Ads don’t always cause immediate purchases; their effects "carry over" into the future.
ROAS (Return on Ad Spend) | The amount of sales generated per unit of ad spend | Measures advertising effectiveness: higher ROAS = better spend.
mROAS (Marginal ROAS) | Additional revenue generated by a small extra investment | Tells you if spending slightly more money would still be efficient.
Posterior Sampling | Sampling from the distribution of model parameters after fitting | Bayesian estimation produces not just point estimates, but distributions.
Hierarchical Model | Model where parameters vary across groups but are connected | Each geo (region) has its own parameters, but they are informed by a global average too.
ZTP (Zero Truncated Poisson) | Poisson distribution with no zeros allowed | Models counts (like number of ad views) but ensures at least one view happens.
Optimal Frequency | The best average number of ad exposures to maximize ROAS | Find the "sweet spot" between showing too few and too many ads.
Simulation Study | Artificial data creation to test models | Here: simulate user-level ad exposure and sales to validate the model.