# NDARMA(1,1) formulation

## As a choice model

For an observed time series $X_t$ with marginal distribution $X_t \sim \Pi$, in which $\Pi$ can be anything, say $Categorical(\lambda)$ or $Poisson(\lambda)$, the NDARMA(1,1) process is

$$
X_t =
\begin{cases}
X_{t-1} &\quad \text{with probability } \alpha_1 \\
U_{t}   &\quad \text{with probability } \beta_0 \\
U_{t-1} &\quad \text{with probability } \beta_1
\end{cases}
$$

in which $U_t$ is a latent process with $U_t \sim \Pi$ and $\alpha_1 + \beta_0 + \beta_1 = 1$.

## With ARMA(1,1)-type formulation

For a latent innovation process $U_t$ with $U_t \sim \Pi$, an observed time series $X_t$ with marginal distribution $X_t \sim \Pi$ is an NDARMA(1,1) process if it is given by

$$
X_t = a_{1,t} \cdot X_{t-1} + b_{0,t} \cdot U_t + b_{1,t} \cdot U_{t-1},
$$

in which $D_t = [a_{1,t}, b_{0,t}, b_{1,t}]$ is a latent decision variable (dice) with $D_t \sim Multinomial(1; ab)$ and decision probabilities $ab = [\alpha_1, \beta_0, \beta_1]$.

The log-likelihood of this process (if I'm not mistaken) may be written as

$$
\begin{align}
\log\mathcal{L} &= \log \Pr(X_t = x | \lambda, \alpha_1, \beta_0, \beta_1; X_{t-1}, \dots) \\
&= \sum_{t = 1}^N \log \Big[ \alpha_1 \Pr(X_{t-1} = x|\dots) + \beta_0 \Pr(U_{t} = x|\dots) + \beta_1 \Pr(U_{t-1} = x|\dots) \Big]
\end{align}
$$

## Some Stan attempts

``` stan
data {
  int<lower=1> N;                      // Number of observations
  int<lower=1> k;                      // Number of categories in X and U
  array[N] int<lower=1, upper=k> X_t;  // Observed data X_t
}
parameters {
  simplex[k] lambda;  // Parameters of the categorical distribution Pi
  simplex[3] ab;      // Probabilities for the dice (alpha1, beta0, beta1)
}
model {
  // Priors
  lambda ~ dirichlet(rep_vector(2.0, k));
  ab ~ dirichlet(rep_vector(2.0, 3));

  // The latent dice D_t and innovations U_t are discrete, so they cannot be
  // declared as parameters; marginalise them out instead.
  for (t in 2:N) {
    vector[3] contributions;
    // X_t = X_{t-1} with probability alpha1
    contributions[1] = log(ab[1])
                       + (X_t[t] == X_t[t-1] ? 0.0 : negative_infinity());
    // X_t = U_t with probability beta0, with U_t ~ Categorical(lambda)
    contributions[2] = log(ab[2]) + categorical_lpmf(X_t[t] | lambda);
    // X_t = U_{t-1} with probability beta1, treating U_{t-1} as a fresh draw
    contributions[3] = log(ab[3]) + categorical_lpmf(X_t[t] | lambda);
    target += log_sum_exp(contributions);
  }
}
```

---

**:rage: NO**

The conditional probability of the mixture should be

$$
\begin{align}
& \Pr(X_t = x | \lambda, \alpha_1, \beta_0, \beta_1; X_{t-1}, \dots) = \\
& \qquad \alpha_1 \Pr(X_{t-1} = x|\dots) + \beta_0 \Pr(U_{t} = x|\dots) + \beta_1 \Pr(U_{t-1} = x|\dots)
\end{align}
$$

So the log-likelihood would be

$$
\begin{align}
\log\mathcal{L} &= \log \Pr(X_t = x | \lambda, \alpha_1, \beta_0, \beta_1; X_{t-1}, \dots) \\
&= \sum_{t = 1}^N \log \Big[ \alpha_1 \Pr(X_{t-1} = x|\dots) + \beta_0 \underbrace{\Pr(U_{t} = x|\dots)}_{\lambda_x} + \beta_1 \underbrace{\Pr(U_{t-1} = x|\dots)}_{\lambda_x} \Big] \\
&= \sum_{t = 1}^N \log \Big[ \alpha_1 \Pr(X_{t-1} = x|\dots) + \lambda_x (\beta_0 + \beta_1) \Big] \\
&= \sum_{t = 1}^N \log \Big[ \alpha_1 \Pr(X_{t-1} = x|\dots) + \lambda_x (1 - \alpha_1) \Big]
\end{align}
$$
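A minimal sketch of that collapsed likelihood as a complete Stan program (same data and parameters as above; it assumes, as in the derivation, that $\Pr(U_{t-1} = x|\dots)$ can be replaced by $\lambda_x$):

``` stan
data {
  int<lower=1> N;
  int<lower=1> k;
  array[N] int<lower=1, upper=k> X_t;
}
parameters {
  simplex[k] lambda;
  simplex[3] ab;  // (alpha1, beta0, beta1)
}
model {
  lambda ~ dirichlet(rep_vector(2.0, k));
  ab ~ dirichlet(rep_vector(2.0, 3));

  // Collapsed mixture from the derivation above:
  //   Pr(X_t = x_t | X_{t-1} = x_{t-1}) = alpha1 * 1{x_t == x_{t-1}}
  //                                       + (1 - alpha1) * lambda[x_t]
  for (t in 2:N) {
    real p = ab[1] * (X_t[t] == X_t[t-1]) + (1 - ab[1]) * lambda[X_t[t]];
    target += log(p);
  }
}
```

The $t = 1$ term, $\log \lambda_{x_1}$, could be added separately if the first observation should also contribute to the likelihood.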
To check this, write out the joint probability of $X_t, X_{t-1}, U_t, U_{t-1}$:

$$
P(X_t, X_{t-1}, U_t, U_{t-1}) = P(X_t, U_t | X_{t-1}, U_{t-1}) \cdot P(X_{t-1}, U_{t-1})
$$

$$
\begin{align}
& P(X_t = i_0, X_{t-1} = i_1, U_t = j_0, U_{t-1} = j_1) \\
&= p_{j_0} \cdot \big( \beta_0 \delta_{i_0 j_0} + \beta_1 \delta_{i_0 j_1} + \alpha_1 \delta_{i_0 i_1} \big) \cdot P(X_{t-1} = i_1, U_{t-1} = j_1) \\
&= p_{j_0} \cdot \big( \beta_0 \delta_{i_0 j_0} + \beta_1 \delta_{i_0 j_1} + \alpha_1 \delta_{i_0 i_1} \big) \cdot p_{j_1} \big( (1 - \beta_0) p_{i_1} + \beta_0 \delta_{i_1 j_1} \big) \\
&= p_{j_0} p_{j_1} \cdot \big( \beta_0 \delta_{i_0 j_0} + \beta_1 \delta_{i_0 j_1} + \alpha_1 \delta_{i_0 i_1} \big) \big( (1 - \beta_0) p_{i_1} + \beta_0 \delta_{i_1 j_1} \big)
\end{align}
$$

using $P(X_{t-1} = i_1, U_{t-1} = j_1) = p_{j_1} \big( (1 - \beta_0) p_{i_1} + \beta_0 \delta_{i_1 j_1} \big)$, since $X_{t-1}$ copies $U_{t-1}$ with probability $\beta_0$ and is otherwise, marginally, an independent draw from $\Pi$.
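As a numerical sanity check, that last expression can be typed up as a small Stan function (the name `ndarma11_joint_pmf` is just a placeholder; `p` plays the role of $\lambda$):

``` stan
functions {
  // Placeholder helper: stationary joint pmf
  //   P(X_t = i0, X_{t-1} = i1, U_t = j0, U_{t-1} = j1)
  // following the last expression above; p is the marginal pmf (lambda).
  real ndarma11_joint_pmf(int i0, int i1, int j0, int j1,
                          vector p, real alpha1, real beta0, real beta1) {
    real mix = beta0 * (i0 == j0) + beta1 * (i0 == j1) + alpha1 * (i0 == i1);
    real lag = (1 - beta0) * p[i1] + beta0 * (i1 == j1);
    return p[j0] * p[j1] * mix * lag;
  }
}
```

Summing it over $i_0$, $j_0$ and $j_1$ for a fixed $i_1$ should give back $p_{i_1}$, which is a quick way to verify that it is a proper pmf.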