---
# Challenge 1 Slides
---
<style>
.reveal {
font-size: 24px;
}
</style>
Challenge 1
Understanding the Fundamental Performance Limit of a Distributed RF Sensor System
----
| Collaborator | Affiliation | Email |
| ---------------- | ------------------------ | ----- |
| Matthew Aldridge | University of Leeds | m.aldridge@leeds.ac.uk |
| David Allwright | University of Oxford | allwrigh@maths.ox.ac.uk |
| Alister Burr | University of York | alister.burr@york.ac.uk |
| Gabriele Gradoni | University of Nottingham | gabriele.gradoni@nottingham.ac.uk |
| Tamara Grossmann | University of Cambridge | tg410@cam.ac.uk |
| Bill Liles | Independent | Lilesw@gmail.com |
| James Matthews | PA Consulting | james.matthews@paconsulting.com |
| Yuhui Luo | National Physical Laboratory | yuhui.luo@npl.co.uk |
| Tony Mulholland | University of Bristol | anthony.mulholland@bristol.ac.uk |
| Joe Kornycky | Dstl | JRKORNYCKY@dstl.gov.uk |
----
## Challenge description
* Radio frequency (RF) sensors are deployed to measure the electromagnetic spectrum for various applications. The measurement outputs allow us to perform signal analysis and geo-location. We typically seek to understand the nature and source location of RF activity e.g. what communications transmitters of certain types are active and where they may be located.
* To do this we measure the magnitude, phase and polarisation of electromagnetic waves at specified points in time and space, and apply signal processing to the results. The challenge is to measure these as accurately as possible, and the hypothesis is that there is an ultimate limit to performance driven by factors such as the ‘laws of physics’.
* It is not clear how close we are to the limit of what is achievable nor where best to invest for performance increase.
* The challenge is to derive a mathematical approach that looks at the fundamental limits as this has the potential to unlock better performance.
----
* Any measurement will exhibit errors of some kind in that we will not measure a field perfectly in magnitude, phase and polarisation at the point in time and space we believe we are measuring at. These errors will contribute to an inaccuracy in our understanding of the signal environment and geolocation.
### Key Questions
* How do we describe the system in such a way as to understand the limits of performance?
* What are the fundamental limits in our ability to measure field parameters using a distributed sensor network? How do the errors aggregate through the system?
* How does this vary with the nature of the network, e.g. how does it scale with the number of measurement points (sensors/sensor arrays)?
* Can we move sensing into the information space and define our sensors within this? Would mutual information help us to describe this?
----
## Stochastic Geometry model
* There is an approach that derives the distribution of the Signal to Interference plus Noise Ratio (SINR) from stochastic geometry, modelling the node locations as a Poisson point process $\Phi$.
$$ \text{SINR} = \frac{HpR^{-\alpha}}{\sigma^2 + I_R} $$
$$
I_R = \sum_{X_i \in {\Phi \setminus B_o}} G_i p \|X_i\|^{-\alpha}
$$
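* As a numerical sketch of this model (reading $H$ and $G_i$ as Rayleigh-fading gains, $p$ as transmit power, $\alpha$ as the path-loss exponent and $\sigma^2$ as noise power -- the usual interpretation of these formulas, assumed here -- and ignoring the exclusion region $B_o$ for simplicity), the SINR distribution can be sampled by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter values, not from the slides
lam = 0.1      # intensity of the Poisson point process (interferers per unit area)
alpha = 4.0    # path-loss exponent
p = 1.0        # transmit power
sigma2 = 1e-6  # noise power
R = 1.0        # distance to the transmitter of interest
radius = 50.0  # simulation window radius (far interferers are negligible)

def sinr_sample():
    """One SINR realisation: interferers form a PPP in a disc,
    fading gains H and G_i are exponential (Rayleigh fading)."""
    n = rng.poisson(lam * np.pi * radius**2)
    r = radius * np.sqrt(rng.uniform(size=n))  # radii of uniform points in the disc
    G = rng.exponential(size=n)                # interferer fading gains
    I = np.sum(G * p * r**(-alpha))            # interference I_R
    H = rng.exponential()                      # fading on the wanted link
    return H * p * R**(-alpha) / (sigma2 + I)

samples = np.array([sinr_sample() for _ in range(20000)])
# Empirical coverage probability P(SINR > 1)
print("P(SINR > 1) ≈", np.mean(samples > 1.0))
```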
----
## A toy model for geolocation from timing
* Sensors at $x_1, x_2, \dots, x_n$ receive a signal from a target at an unknown position $y$. The signal is received at position $x_i$ at time $T_i = \frac{1}{c} \|y - x_i\| + t_0 + \epsilon_i$, where $c$ is the speed of light (let's use units where $c = 1$), $t_0$ is the unknown time the target emitted the signal, and the $\epsilon_i$ are timing noise.
* With equally good sensors, the goal might be to minimise
$$ S = \sum_i \big(T_i - \|\hat y - x_i\| - t_0\big)^2 $$
over $\hat y$ (the thing we want to estimate) and $t_0$ (a nuisance parameter). If the noise is Gaussian, we can interpret this as a maximum likelihood estimate.
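* A minimal sketch of this least-squares estimate, with illustrative sensor positions and a direct numerical minimisation over $(\hat y, t_0)$:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Hypothetical setup: 6 sensors in the unit square, c = 1
x = rng.uniform(0, 1, size=(6, 2))   # sensor positions x_i
y_true = np.array([0.3, 0.7])        # unknown target position
t0_true = 2.0                        # unknown emission time (nuisance parameter)
sigma = 1e-3                         # timing-noise standard deviation

T = np.linalg.norm(y_true - x, axis=1) + t0_true + sigma * rng.standard_normal(6)

def S(params):
    """Sum of squared timing residuals over (y_hat, t0)."""
    y_hat, t0 = params[:2], params[2]
    return np.sum((T - np.linalg.norm(y_hat - x, axis=1) - t0) ** 2)

res = minimize(S, x0=np.array([0.5, 0.5, 0.0]), method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-12})
y_hat, t0_hat = res.x[:2], res.x[2]
print("estimated position:", y_hat, "error:", np.linalg.norm(y_hat - y_true))
```

The weighted variant below simply multiplies each squared residual by $w_i$.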
----
* If we model the noise by $\epsilon_i \sim \mathrm{N}(0, \sigma_i^2)$, we can represent good sensors with small $\sigma^2_i$ and bad sensors with large $\sigma^2_i$. We could then look at a *weighted* sum of squares; for example,
$$ S = \sum_i w_i \big(T_i - \|\hat y - x_i\| - t_0\big)^2 \qquad \text{with} \qquad w_i \propto \frac{1}{\sigma_i^2}, \qquad \sum_i w_i = 1$$
* But we might want to avoid assuming Gaussian errors. There might also be errors in the positions $x_i$; that is, we don't know exactly where our sensors are. The time differences
$$ \Delta T_{ij} = T_i - T_j = \big( \|y - x_i\| - \|y - x_j\| \big) + (\epsilon_i - \epsilon_j) $$
cancel out the unknown $t_0$. Note that $\Delta T_{ij}$ and $\Delta T_{ik}$ are dependent, whereas $\Delta T_{ij}$ and $\Delta T_{kl}$ are independent for $i,j,k,l$ distinct. In the noiseless case, $\Delta T_{ij} = \text{constant}$ defines one branch of a hyperbola (with foci at $x_i$ and $x_j$) on which $y$ must lie.
----
* Write
$$ \mathbf T = \begin{pmatrix} 0 & \Delta T_{12} & \Delta T_{13} & \cdots \\ \Delta T_{21} & 0 & \Delta T_{23} & \cdots \\ \Delta T_{31} & \Delta T_{32} & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
for the matrix of time differences. Note that
$$ \mathbf T = \begin{pmatrix} T_1-T_0 & T_1-T_0 & T_1-T_0 & \cdots \\ T_2-T_0 & T_2-T_0 & T_2-T_0 & \cdots \\ T_3-T_0 & T_3-T_0 & T_3-T_0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} - \begin{pmatrix} T_1-T_0 & T_2-T_0 & T_3-T_0 & \cdots \\ T_1-T_0 & T_2-T_0 & T_3-T_0 & \cdots \\ T_1-T_0 & T_2-T_0 & T_3-T_0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$
* We don't know -- but would like to know -- the two matrices on the right-hand side. We do know that each has rank 1, so $\mathbf T$ has rank at most 2. This suggests a method of estimating the $T_i$s: find a low-rank representation of $\mathbf T$.
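* A quick numerical check of this rank-2 structure, with arbitrary illustrative arrival times ($T_0$ drops out of the differences):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical arrival times T_i
T = rng.uniform(0, 10, size=7)

# Matrix of time differences: entry (i, j) is T_i - T_j
Tmat = T[:, None] - T[None, :]

# The two rank-1 matrices on the right-hand side differ only by a transpose,
# so their difference has rank at most 2 (and T is antisymmetric).
print("rank of T:", np.linalg.matrix_rank(Tmat))  # → 2
```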
----
### The 1D problem for Time Difference of Arrival (TDOA) (multilateration) with noise
* The 1D problem for TDOA with noise takes something like the following form: suppose we have sensors at points $x_j$ from a Poisson process with rate $\lambda$ (the average number of points per unit length), and each sensor reports a time $T_j=T_0+|x-x_j|+e_j$ at which it receives some distinct, recognizable signal emitted from position $x$ at time $T_0$. (This is effectively in units with the wave speed $c=1$.)
* From this we have to reconstruct $x$ (and $T_0$, though this is a nuisance parameter). Suppose initially that the $e_j$ are Gaussian with mean 0 and variance $\sigma^2(|x-x_j|)$, increasing with distance.
* For the case of Gaussian noise we expect (by the asymptotic efficiency of the maximum-likelihood estimator) that the variance in our estimated position will be asymptotic to the Cramér–Rao lower bound (CRLB). So we compute that from the log-likelihood function
$$\log L=-\sum_j\frac1{2\sigma_j^2}(T_j-T_0-|x-x_j|)^2 + \text{const } .$$
----
So then
$$\frac{\partial\log L}{\partial T_0}=\sum_j\frac{e_j}{\sigma_j^2},\qquad
\frac{\partial\log L}{\partial x}=\sum_j\frac{e_j\textrm{sgn}(x-x_j)}{\sigma_j^2},
$$
and then the Fisher information matrix conditional on the $x_j$ is
$$I = \mathbb{E} \left(\begin{array}{cc}\sum_j1/\sigma_j^2 & \sum_j\textrm{sgn}(x-x_j)/\sigma_j^2 \\
\sum_j\textrm{sgn}(x-x_j)/\sigma_j^2 & \sum_j1/\sigma_j^2\end{array}\right).$$
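* As a sanity check (with sensor positions and variances chosen purely for illustration), the covariance of the simulated score vector should reproduce this Fisher information matrix:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 1D setup: fixed sensors, target at x_true
xs = np.array([-2.0, -0.5, 1.0, 3.0])     # sensor positions x_j
sigma2 = np.array([0.1, 0.2, 0.15, 0.3])  # timing-noise variances sigma_j^2
x_true = 0.2
sgn = np.sign(x_true - xs)

# Theoretical Fisher information conditional on the x_j
I_theory = np.array([[np.sum(1 / sigma2), np.sum(sgn / sigma2)],
                     [np.sum(sgn / sigma2), np.sum(1 / sigma2)]])

# Monte Carlo: the covariance of the score (d logL / d(T0, x)) should match
n = 200_000
e = rng.standard_normal((n, len(xs))) * np.sqrt(sigma2)
score = np.stack([np.sum(e / sigma2, axis=1),
                  np.sum(e * sgn / sigma2, axis=1)], axis=1)
I_mc = score.T @ score / n
print(np.round(I_mc, 2))  # close to I_theory
```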
----
* So to get something sensible we are going to assume that $\int dx/\sigma(x)^2$ converges. The expectation of the off-diagonal term over the Poisson process will then be 0 (as the contribution to the expectation from sensors to the left of $x$ will cancel with that from the right). The expectation of the other terms will (by [Campbell's theorem](https://en.wikipedia.org/wiki/Campbell%27s_theorem_(probability))) be
$$\mathbb{E} \left(\sum_j\frac1{\sigma_j^2}\right)=\lambda\int_{-\infty}^{\infty}\frac1{\sigma(x)^2}\,\mathrm{d}x.$$
* So we could then model the performance of the sensors as say one of $$\sigma(x)^2=\sigma_0^2\times\begin{cases}(1+(x/L)^2)\\\exp(|x|/L)\\\exp(x^2/L^2)\end{cases},$$
so that in each case $\sigma_0$ is the RMS error at close range and $L$ is the range-scale of the sensor over which the RMS error increases.
* The details of what model to assume here would depend on a propagation model for how the SNR decreases with range, and then a further model for how the timing accuracy decreases as the SNR falls.
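* As a sketch, Campbell's theorem can be verified by simulation for the first of these models, $\sigma(x)^2=\sigma_0^2(1+(x/L)^2)$, whose integral of $1/\sigma(x)^2$ is $\pi L/\sigma_0^2$ (parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

lam, sigma0, L = 2.0, 0.5, 1.0
half_width = 200.0   # truncate the line to [-half_width, half_width]

def one_realisation():
    """Sum of 1/sigma(x_j)^2 over one draw of the Poisson process."""
    n = rng.poisson(lam * 2 * half_width)
    xj = rng.uniform(-half_width, half_width, size=n)   # PPP points on the line
    return np.sum(1.0 / (sigma0**2 * (1 + (xj / L) ** 2)))

mc = np.mean([one_realisation() for _ in range(5000)])
theory = lam * np.pi * L / sigma0**2    # Campbell's theorem
print(mc, "vs", theory)
```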
----
* The resulting CRLB is
$$\mathrm{Var}(\hat x)\ge \frac{\sigma_0^2}{c\lambda L},$$
where $c$ is a numerical constant depending on the model (in fact $\pi$, $2$ and $\sqrt\pi$ for the 3 cases above).
* So this shows quantitatively the kind of effects we expect, namely that to get good emitter location we need
- small $\sigma_0$, i.e. accurate sensors,
- large $\lambda$, i.e. dense sensors,
- large $L$, i.e. long-range sensors.
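* The three constants can be checked by numerical integration of $1/\sigma(x)^2$ for each model (a small sketch; the values of $\sigma_0$ and $L$ are arbitrary since $c$ is dimensionless):

```python
import numpy as np
from scipy.integrate import quad

sigma0, L = 1.0, 1.0
models = {
    "polynomial": lambda x: sigma0**2 * (1 + (x / L) ** 2),
    "exp |x|":    lambda x: sigma0**2 * np.exp(abs(x) / L),
    "Gaussian":   lambda x: sigma0**2 * np.exp(x**2 / L**2),
}
consts = {}
for name, var in models.items():
    # c = (sigma0^2 / L) * integral of 1 / sigma(x)^2
    integral, _ = quad(lambda x: 1.0 / var(x), -np.inf, np.inf)
    consts[name] = integral * sigma0**2 / L
    print(f"{name}: c = {consts[name]:.4f}")
# → pi, 2 and sqrt(pi) respectively
```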
----
* Note that the expected number of sensors within $L$ of a given point is $2\lambda L$, so the denominator above is proportional to the number of sensors within range -- which is just the form we would expect.
* Note also that we would expect a similar approach to work in $d=2$ or $d=3$ dimensions -- the terms $\textrm{sgn}(x-x_j)$ would be replaced by unit normal vectors, but they would still average to 0 in the off-diagonal terms of the Fisher information matrix. The diagonal terms relating to $x$ would be $1/d$ times the $d\times d$ identity matrix times the expectation of the sum of $1/\sigma_j^2$, and so there would be an additional factor of $d$ multiplying the CRLB for the variance of $x$.
----
* However, this is all predicated on the assumptions of Gaussian measurement noise, and the CRLB. To model non-Gaussian measurement noise, we would need to think about the causes of this:
* One cause would be multipath, so that a sensor detects the required signal but at a time *later* than it "should". This gives rise to an error $e_j$ that is necessarily positive -- a distinctly asymmetric noise distribution;
* Another cause might be that the sensor mistakes some other signal for the expected one, so that it reports a time perhaps uniformly distributed over the time window we are searching, or it fails to report receiving the signal at all.
* In addition to the causes of non-Gaussian noise, we need to address the question of whether there are better bounds on the extraction of parameters from observations, specifically in the case of non-Gaussian noise. There has certainly been a lot of work on this since the Cramér–Rao bound was introduced: in particular there is literature associated with the "Barankin bound" and the "Bhattacharyya bound", as well as work by [John Hammersley (1950)](https://www.jstor.org/stable/2983981), [Naudts (2004)](https://arxiv.org/abs/math-ph/0402005), and [others](https://arxiv.org/abs/1802.04483). These should be looked at not only for the results themselves, but more widely to see what approaches they suggest for cases where the CRLB is suspected of being unattainable.
----
### N Sensors that are "Good"/"Bad"
* Thinking back to the general problem posed --- the fundamental performance limits of a distributed RF sensor system --- we shall generally have the situation that some data is:
- from the contact we are interested in,
- from other contacts,
- clutter (e.g. multipath),
- noise,
- missing.
* Any algorithm to resolve such a situation must have ways of coping with these effects: these ways may be based on mathematical models, *ad hoc* approaches, experience with similar problems *etc.* But to get a *theoretical* understanding of the fundamental limits we would need to have mathematical models of *all* of the elements involved. The way they combine and interact will be very dependent on these models, the details of the sensors, the context and so on. This suggests that perhaps the question is too general to have a generic answer.
----
* To try to illustrate this, think about a system where we only have either good data or missing data. Suppose the ideal system needs $N_{\rm ideal}$ sensors (e.g. $N_{\rm ideal}=4$ for unambiguous TDOA geolocation in 2D), and suppose that in fact we have $n$ sensors, each of which provides data with probability $p$, independently of other sensors, and with probability $1-p$ provides no data.
* Then the number of sensors providing good data has a binomial distribution $B(n,p)$, and we would need this to be at least $N_{\rm ideal}$ with sufficiently high probability. But if we modify the model so that a sensor which does not provide good data provides noise (instead of no data at all), then the situation is completely different.
* Algorithmically it is necessary to partition the data into some that is regarded as good and some that is regarded as noise. The good data will be recognized by its fitting some characteristic pattern (e.g. lying on the forward light cone from some point in a TDOA geolocation system). Then the "ideal" algorithm will be applied to the data regarded as good. The errors that result are broadly of two kinds:
* if there is a subset of the noise points that happens to match the characteristic pattern then it may be picked up instead of the good data;
* even if the required set of good data points is picked up roughly correctly there are the possibilities that some points that should be in it are excluded, and some noise points are included.
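* For the good-and-missing model, the binomial availability calculation is straightforward to sketch (the availability probability $p=0.8$ here is illustrative):

```python
from math import comb

def prob_enough(n: int, p: float, n_ideal: int) -> float:
    """P(at least n_ideal of n sensors report good data), i.e. a B(n, p) upper tail."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n_ideal, n + 1))

# e.g. N_ideal = 4 sensors needed, each providing data with probability 0.8
for n in (4, 6, 8, 10):
    print(n, round(prob_enough(n, 0.8, 4), 4))
```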
----
* The first of these will result in a completely spurious indication of a contact.
* The second will hopefully lead to an approximately correct location but with an error that is not of the same form as the ideal algorithm since it is compounded by having some valid data points excluded (which will tend to increase the variance) and some spurious points included (which will tend to bias the result).
* So algorithmically these two models (good & missing *vs.* good & noise) have very different features, and from the theoretical point of view of determining the fundamental limits they also have very different features.
* The first has the straightforward solution using the binomial distribution but the second needs a more detailed specification before work can start.
----
* Can we propose a model for which we can address this in detail?
* For instance it might have the following form
* we have $n$ sensors at points $x_j$ that are intended to detect the time of arrival of a signal emitted from an unknown point $x$ at an unknown time $T$.
* With probability $p$, sensor $j$ reports the arrival time as $T+|x-x_j|/c+e_j$ (where the error $e_j$ has some known distribution, e.g. Gaussian with mean 0 and variance $\sigma_j^2$), but with probability $1-p$ it reports a random time drawn from some other distribution, which might be unknown.
* The value of $p$ might also be unknown.
* Finding the fundamental limits in this model would be a question explicit enough to address mathematically, and would hopefully shed light on how those limits depend on $p$, on the noise distribution, and so on. But is this a sufficiently realistic model problem from Dstl's viewpoint?
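* A minimal simulation of this mixture model, with a deliberately crude inlier-counting grid search standing in for a real algorithm (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# 1D mixture model: with probability p a sensor reports the true arrival time
# plus Gaussian noise; otherwise a time uniform on the search window (c = 1)
n, p, sigma = 30, 0.7, 0.01
x_true, T0 = 2.5, 1.0

xs = rng.uniform(0, 10, size=n)                  # sensor positions x_j
good = rng.uniform(size=n) < p
times = np.where(good,
                 T0 + np.abs(x_true - xs) + sigma * rng.standard_normal(n),
                 rng.uniform(0.0, 20.0, size=n)) # "bad" reports

# Crude estimator: grid search over candidate x, counting 3-sigma inliers
# around the median implied emission time
best_inliers, best_x = -1, None
for x in np.linspace(0, 10, 501):
    resid = times - np.abs(x - xs)               # implied T0 for each sensor
    T0_hat = np.median(resid)
    inliers = np.sum(np.abs(resid - T0_hat) < 3 * sigma)
    if inliers > best_inliers:
        best_inliers, best_x = inliers, x
print("estimated x:", best_x)
```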
----
### Sensor characterization and transfer function
* In the 1D toy model above, the parameter $\sigma_0$ (or $\sigma_j$ in the above) was introduced to capture the sensor uncertainty in the TDOA. In the context of wireless communications, the error/offset in estimating the TDOA of two delayed copies of the transmitted signal has been calculated, e.g., in orthogonal frequency-division multiplexing (OFDM) and ultra-wide bandwidth (UWB) signals.
* In OFDM, the time domain transmitted signal is modulated and narrowband, while in UWB the signal is unmodulated and broadband.
* UWB pulses are able to resolve, in the frequency domain, the multiple path components of the multipath signal generated within the environment hosting the transmitter.
* This leads, for a single propagation path, to
$$\sigma_0^2\ge \frac{1}{2 \, \beta^2 \, SINR},$$
where a zero-mean random TDOA estimation error was assumed, and
----
* $$\beta^2 = \frac{\int^{\infty}_{-\infty} \, f^2 \, S(f) \, df}{E_p},$$
is the second-order moment of the power spectrum $S(f)$ of the transmitted pulse, which can be calculated from the time-domain autocorrelation function by the Wiener–Khinchin theorem, and $E_p$ is the energy of the UWB pulse.
* This result has been derived using the Cramér–Rao Lower Bound (CRLB) approach and extended in [5] to include the effect of multiple path components -- attenuated and delayed replicas of the signal arriving at the sensor -- using the Ziv–Zakai lower bound.
* However this result is useful only if one deals with a single, perfect sensor/antenna.
* In order to tackle this Challenge, the UWB results need to be extended to include multiple (possibly correlated) sensors with different individual $SINR$, operating in a 2D/3D environment.
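* As a sketch of the $\beta^2$ calculation, assume (purely for illustration) a Gaussian power spectrum $S(f)=\exp(-f^2/2B^2)$ in normalized units, for which $\beta^2=B^2$ exactly:

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical Gaussian power spectrum in normalized frequency units
B = 1.0
S = lambda f: np.exp(-f**2 / (2 * B**2))

Ep, _ = quad(S, -np.inf, np.inf)                       # pulse energy E_p
num, _ = quad(lambda f: f**2 * S(f), -np.inf, np.inf)  # second-moment integral
beta2 = num / Ep
print("beta^2 =", beta2)

# The bound sigma_0^2 >= 1 / (2 beta^2 SINR) then tightens as SINR grows
for snr_db in (0, 10, 20):
    sinr = 10 ** (snr_db / 10)
    print(f"SINR = {snr_db:2d} dB: sigma_0 >= {np.sqrt(1 / (2 * beta2 * sinr)):.3f}")
```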
----
* Also, since actual sensors are imperfect, the transfer function of the sensor should be used to transform the power spectrum of the pulse into the power spectrum of the received electric voltage/current signal, viz.,
$$W_j(f) = \left | H_j(f) \right|^2 S(f),$$
where $H_j(f)$ acts as an antenna factor (the ratio between the electric field impinging on the sensor and the voltage appearing at the sensor terminal at a fixed frequency), which is either provided by the manufacturer or measured for the specific type of sensor $j$.
* Furthermore, as one moves through the processing chain after receiving the pulse signal, the noise figure increases with the number of electronic components manipulating the electric voltage/current, i.e., the $SINR$ of sensor $j$ decreases.
----
* After probe characterization, the voltage power spectrum passes through multiple transfer functions that are, again, imperfect and create additional uncertainty. In a situation where components are cascaded in a linear chain, we can imagine taking the product of random transfer functions modelling the individual components. In a similar (but much simpler) situation with lossy cascaded chaotic cavities, the probability of the signal/power propagating through the chain and arriving at the detector (in our case, arriving at the geolocation algorithm) has been derived using random matrix theory, for both single-port [6] and multi-port [7] interconnection between blocks/components.
* A more detailed research program can be developed to explore the possibility of applying this approach to characterize the error propagation within a generic transceiver architecture.
----
## Information Theoretic Approach
* As a more general and abstract approach, information theory may allow us to provide a more general bound, not limited to any specific geolocation algorithm, and providing a different means of specifying the quality of the sensors, and quantifying the value of the information they provide. This could be regarded as a generalisation of the "good/bad" sensor approach described above, allowing a continuum of sensor qualities to be included. Lines of attack include:
* Analyse the system as a single-input, multiple-output (SIMO) system, in which the vector of signals received at the sensors may be used to derive an overall likelihood function for the source location, from which the performance of a maximum-likelihood geolocation algorithm might be derived.
* Analyse the performance of the sensors in terms of information flow through them, using concepts of mutual information.
* This might allow the effects of multipath, noise, interference, phase noise, sampling errors, quantization noise, etc., to be separately quantified and then combined. In this way "bottlenecks" to the overall system performance might be identified. It would also provide a natural way of combining information from different sensors of different qualities. Rate–distortion theory may also have a role to play, as the location error constitutes a distortion, and the information rate through a sensor is limited by various effects.
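* As a toy starting point (an assumption for discussion, not a settled model), each sensor could be treated as an AWGN channel carrying $\tfrac12\log_2(1+\mathrm{SNR})$ bits per sample; if all sensors observe the same signal in independent noise, coherent (maximum-ratio) combining sums the SNRs:

```python
import numpy as np

# Per-sensor mutual information for an AWGN channel, in bits per sample
def mutual_info_bits(snr_db: float) -> float:
    return 0.5 * np.log2(1 + 10 ** (snr_db / 10))

sensors_db = [20, 10, 3, -3]    # an illustrative mix of good and poor sensors
per_sensor = [mutual_info_bits(s) for s in sensors_db]
print("bits/sample per sensor:", np.round(per_sensor, 2))

# Coherent combining of the same signal in independent noise sums the SNRs,
# so the poorest sensors contribute very little to the total
snrs = [10 ** (s / 10) for s in sensors_db]
combined = 0.5 * np.log2(1 + sum(snrs))
print("combined:", round(combined, 2))
```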
----
## Next Steps
* Select some more specific examples, working from toy problems towards more realistic ones
* Consider other location techniques (TOA, TDOA and RoPDC), e.g. [Rate of Phase Difference Change Estimation in Single Airborne Passive Locating System], as well as angle-of-arrival and power-difference methods
* Review the existing literature on fundamental limits that are *stronger* than the Cramér–Rao bound and more appropriate for non-Gaussian noise
* Extend the model to 2D/3D
* Examine the effect that the SINR timing error has on the spatial error in geolocation
* Bring in specific sensor transfer functions
* Bring in multipath modelling considerations
* Develop an information theoretic approach
----
Thank you for your attention