###### tags: `one-offs` `puzzles`
# Unbiased Estimators of the Inverse of the Mean from a Poisson Stream
**Overview**: In this post, I describe an estimation puzzle related to Poisson laws, and present a simple solution.
## The Puzzle
Suppose that you have a machine which generates iid $\mathrm{Poisson}(\lambda)$ samples with $\lambda$ unknown, and that I want a non-negative, unbiased estimate of $\lambda^{-1}$. You can take as long as you need, but after a finite amount of time, you have to give me an answer. How do you do this?
## First Thoughts
Note that from a single Poisson sample, many quantities admit simple estimators, e.g.
- For $n \in \mathbf{N}$, $\mathbf{E} \left[ \mathbf{I} \left[ X = n \right] \right] = \frac{\lambda^n \cdot \exp(-\lambda)}{n!}$.
- For $k \in \mathbf{N}$, $\mathbf{E} \left[ X(X-1)(X-2)\cdots(X-k+1) \right] = \lambda^k$.
- For $t \in \mathbf{R}$, $\mathbf{E} \left[ \exp (t \cdot X) \right] = \exp(\lambda\cdot(\exp (t) - 1))$.
In practice, when taking a fixed number of Poisson samples, one can either take simple averages of these basic estimators, or use tricks related to the Rao-Blackwell theorem to obtain improved estimators.
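As a quick numerical sanity check, here is a minimal Python sketch (assuming `numpy` is available; the rate $\lambda = 1.7$ and the particular choices of $n$, $k$, $t$ are arbitrary) which verifies the three identities above by Monte Carlo:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
lam = 1.7                           # arbitrary rate, fixed only for the check
X = rng.poisson(lam, size=10**6)    # iid Poisson(lam) samples

# E[ I[X = n] ] = lam^n * exp(-lam) / n!    (here, n = 2)
n = 2
print((X == n).mean(), lam**n * math.exp(-lam) / math.factorial(n))

# E[ X (X-1) ... (X-k+1) ] = lam^k         (here, k = 3)
print((X * (X - 1) * (X - 2)).mean(), lam**3)

# E[ exp(t X) ] = exp(lam * (exp(t) - 1))  (here, t = 0.5)
t = 0.5
print(np.exp(t * X).mean(), math.exp(lam * math.expm1(t)))
```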
Anyways, let's see whether there is a viable estimator which is computable from just one sample. Suppose that, for $X \sim \mathrm{Poisson}(\lambda)$, $f(X)$ is an unbiased estimator of $\lambda^{-1}$ for every $\lambda > 0$. Then, it holds that *as functions of $\lambda$*,
\begin{align}
\lambda^{-1} &= \sum_{n \geqslant 0} \frac{\lambda^n \cdot \exp(-\lambda)}{n!} \cdot f(n) \\
\lambda^{-1} \cdot \exp(\lambda) &= \sum_{n \geqslant 0} \frac{f(n)}{n!} \cdot \lambda^n.
\end{align}
This would relate $f$ to the Taylor expansion of $\lambda \mapsto \lambda^{-1} \cdot \exp(\lambda)$ around $\lambda = 0$. But there is a hitch: this function has a pole at $\lambda = 0$, and so does not admit a Taylor expansion there! As such, no such estimator can exist.
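To see the obstruction concretely, expand the left-hand side:
\begin{align}
\lambda^{-1} \cdot \exp(\lambda) = \lambda^{-1} + \sum_{n \geqslant 0} \frac{\lambda^n}{(n+1)!},
\end{align}
and no choice of $f$ can produce the $\lambda^{-1}$ term from the power series $\sum_{n \geqslant 0} \frac{f(n)}{n!} \cdot \lambda^n$.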
## Second Pass
When a single sample does not suffice, a fixed number of samples is sometimes enough, i.e. perhaps one could draw $K > 1$ independent Poisson realisations and combine them somehow. However, one can show that this strategy will also not work here. Roughly speaking, the sum $S = X_1 + \cdots + X_K \sim \mathrm{Poisson}(\lambda \cdot K)$ is a sufficient statistic, and so by conditioning on $S$ (à la Rao-Blackwell), any unbiased estimator of $\lambda^{-1}$ based on $(X_1, \ldots, X_K)$ could be converted into an unbiased estimator based on $S$ alone; dividing by $K$ would then yield a single-sample unbiased estimator of $(\lambda \cdot K)^{-1}$, and from our earlier discussion, we know that this is not possible.
The next thing to try is then to draw a random number of samples, perhaps adaptively. For example, one could try to design a procedure based around drawing independent Poisson variables until one of them returns a nonzero result. This is equivalent to simulating a Geometric random variable with success parameter $p = 1 - \exp(-\lambda) = \mathbf{P}(X > 0)$, and so the number $F$ of failed trials satisfies
\begin{align}
\mathbf{E} [ F ] &= p^{-1} - 1 \\
&= (1 - \exp(-\lambda))^{-1} - 1 \\
&= \frac{\exp(-\lambda)}{1 - \exp(-\lambda)} \\
&= ( \exp(\lambda) - 1)^{-1}.
\end{align}
As $\lambda \to 0^+$, this behaves like $\lambda^{-1}$, which suggests that we might be on the right path. I mean this in the following sense: if we are interested in estimating quantities of the form $g(\lambda)$, life is broadly easier if $g$ is smooth, continuous, etc. As such, we might hope to 'absorb the singularity' at $\lambda = 0$ somehow.
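As a sanity check on this formula, here is a small simulation sketch (the function name and the rate $\lambda = 0.8$ are mine, chosen only for illustration) comparing the empirical mean of $F$ against $(\exp(\lambda) - 1)^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 0.8                            # arbitrary rate, fixed only for the check

def count_failures(rng, lam):
    """Number of Poisson(lam) draws equal to zero before the first positive one."""
    f = 0
    while rng.poisson(lam) == 0:
        f += 1
    return f

F = [count_failures(rng, lam) for _ in range(10**5)]
print(np.mean(F), 1.0 / np.expm1(lam))   # E[F] = (exp(lam) - 1)^{-1}
```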
To this end, write
\begin{align}
\lambda^{-1} &= ( \exp(\lambda) - 1)^{-1} \cdot \frac{\exp(\lambda) - 1}{\lambda}.
\end{align}
Bearing this in mind, it becomes clear that if we were also able to form an independent unbiased estimator of the smooth factor $\frac{\exp(\lambda) - 1}{\lambda}$, then multiplying it by $F$ would give an unbiased estimator of $\lambda^{-1}$, and we would be done. As it turns out, the construction below realises the same idea through an additive, rather than multiplicative, decomposition.
Now, suppose that we implement this procedure, generating $F$ realisations taking the value of $0$, and then generating $X_*$ with the law of $\mathrm{Poisson} (\lambda)$ conditioned to be positive, i.e. with mass function
\begin{align}
\mathbf{P}_\lambda (X_* = n) := (\exp(\lambda) - 1)^{-1} \cdot \frac{\lambda^n}{n!}, \qquad n \geqslant 1.
\end{align}
As with standard Poisson variables, under this law the expectations of many expressions involving rising and falling factorials remain tractable. In particular, one can compute explicitly that
\begin{align}
\mathbf{E}_{\lambda} \left[ \frac{1}{X_* + 1} \right]
&=\lambda^{-1} - (\exp(\lambda) - 1)^{-1}.
\end{align}
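For completeness, here is the computation behind this identity:
\begin{align}
\mathbf{E}_{\lambda} \left[ \frac{1}{X_* + 1} \right]
&= (\exp(\lambda) - 1)^{-1} \cdot \sum_{n \geqslant 1} \frac{1}{n + 1} \cdot \frac{\lambda^n}{n!} \\
&= (\exp(\lambda) - 1)^{-1} \cdot \lambda^{-1} \cdot \sum_{n \geqslant 1} \frac{\lambda^{n+1}}{(n+1)!} \\
&= (\exp(\lambda) - 1)^{-1} \cdot \lambda^{-1} \cdot \left( \exp(\lambda) - 1 - \lambda \right) \\
&= \lambda^{-1} - (\exp(\lambda) - 1)^{-1}.
\end{align}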
Since $\mathbf{E}[F] = (\exp(\lambda) - 1)^{-1}$, it thus follows that $F + \frac{1}{X_* + 1}$ has expectation $\lambda^{-1}$; moreover, this estimator is non-negative by construction, and so we are done. As an aside (which is not strictly needed for the unbiasedness), it can be checked that $X_*$ and $F$ are independent of one another.
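Putting everything together, here is a minimal end-to-end sketch of the procedure (the function name and the rate $\lambda = 0.5$ are mine, fixed only so that unbiasedness can be checked by Monte Carlo):

```python
import numpy as np

rng = np.random.default_rng(0)

def inv_mean_estimate(rng, lam):
    """One realisation of the non-negative, unbiased estimator of 1/lam:
    draw Poisson(lam) variables until the first positive value X_*,
    counting the F zero draws along the way, and return F + 1/(X_* + 1)."""
    f = 0
    x = rng.poisson(lam)
    while x == 0:
        f += 1
        x = rng.poisson(lam)
    return f + 1.0 / (x + 1)

lam = 0.5                            # arbitrary rate, fixed only for the check
est = [inv_mean_estimate(rng, lam) for _ in range(10**5)]
print(np.mean(est), 1.0 / lam)       # should agree up to Monte Carlo error
```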
## Closing Comments
Some further refinements of this puzzle which might be worth considering are:
- Analyse the variance of this estimator, and describe a strategy for controlling the variance (either in absolute or relative terms) uniformly in $\lambda$.
- Given $K$ independent replications $\{ (F_k, X_{*, k}) \}_{k \in [K]}$ of this procedure, one can improve the estimate of $\lambda^{-1}$ by simple averaging. Can one do anything smarter, in terms of variance reduction?
- Similarly, for $K \in \mathbf{N}$, devise an estimator of $\lambda^{-K}$ which makes optimal use of the samples generated along the way.