# Interleaving Schemes for Multidimensional Cluster Errors
Error Correcting Codes (ECC) typically assume that loss patterns are (uniformly) random. This is not always the case, however: wireless channels typically exhibit temporally correlated, bursty losses, as do 2D media like magnetic drives (think of a piece of dirt affecting a connected region) or barcodes. Interleaving techniques spread the symbols of codewords around an $n$-dimensional space so that, instead of relying on faults being random, we can tolerate bursty faults.
In this writeup I will discuss $t$-interleaving, a technique that relies on knowledge about burst sizes to construct interleavings that are optimal in a very specific sense: namely, that they contain the smallest number of distinct codewords required to tolerate error bursts of size up to $t$ under a single-error-correcting code, or $\tau$ bursts of size up to $t$ for an ECC that can tolerate up to $\tau$ errors.
**A strange goal.** It is unclear to me why packing as few codewords as possible is an interleaving goal, but I can speculate.
1. **Array size.** Because we are packing codewords into arrays and the codewords must be complete, fewer codewords mean we can have smaller arrays for a given codeword size. Indeed, for a given code $c$ and burst size $t$, the authors could be said to be finding the smallest array that can guarantee $t$-burst resilience using codewords drawn from $c$.
2. **Code size.** Fewer codewords might translate into a smaller code, which could also be seen as an advantage.
This is not explicitly written in the paper though, so _caveat emptor_.
## $t$-interleaved Arrays
To produce an optimal interleaving in one dimension that tolerates error bursts of size $t$ under a single-error-correcting code, one needs at least $t$ distinct codewords whose symbols are interleaved in a repeating sequence: $1\,2\,3\,\cdots t\,1\,2\,3\,\cdots\,t\,1\,2\,3\,\cdots$. This follows from the pigeonhole principle: with fewer than $t$ codewords, some window of size $t$ necessarily contains two (or more) symbols of the same codeword, so the interleaving is no longer resilient to bursts of size $t$.
Furthermore, if the code can tolerate up to $\tau$ errors, then up to $\tau$ bursts of size $t$ are tolerated by this interleaving as well[^1]. To see why this is so, notice that consecutive symbols of the same codeword are $t$ positions apart, so a single burst of length at most $t$ can destroy at most one symbol of any given codeword. It follows that $\tau$ bursts destroy at most $\tau$ symbols per codeword, which the code can still correct; more than $\tau$ bursts are required to destroy a codeword.
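The one-dimensional argument can be checked with a small sketch. The function names, the chosen $t = 4$, and the burst positions below are illustrative assumptions of mine, not something taken from the paper.

```python
def interleave_1d(t, length):
    """Label position i with codeword (i mod t) + 1: 1 2 ... t 1 2 ... t ..."""
    return [(i % t) + 1 for i in range(length)]


def max_hits_per_codeword(labels, burst_starts, t):
    """Largest number of symbols that any single codeword loses when
    bursts of length t start at the given positions."""
    hit = set()
    for start in burst_starts:
        hit.update(range(start, min(start + t, len(labels))))
    losses = {}
    for pos in hit:
        losses[labels[pos]] = losses.get(labels[pos], 0) + 1
    return max(losses.values(), default=0)


T = 4  # assumed burst size for this example
labels = interleave_1d(T, 24)

# Every window of length t holds t distinct codewords.
windows_ok = all(len(set(labels[i:i + T])) == T
                 for i in range(len(labels) - T + 1))

# Two bursts of length t cost each codeword at most two symbols.
worst = max_hits_per_codeword(labels, burst_starts=[1, 13], t=T)
```

With two bursts, no codeword loses more than two symbols, so a code with $\tau = 2$ would still recover every codeword in this example.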
In two dimensions the situation is more complex. The paper defines a burst error in $n$ dimensions as a _cluster_ in the array, which intuitively is a connected group of cells: you can walk from any cell in the group to any other by stepping between neighbors. In two dimensions, the neighbors of $(x, y)$ are $(x \pm 1, y)$ and $(x, y \pm 1)$.
![](https://hackmd.io/_uploads/Bk4Ju8Ee6.png)
**Figure 1.** A (fault) cluster of size $7$.
We can then define a _path_ between elements $e_0$ and $e_{n}$ to be a sequence $P_{0,n} = \{e_0, e_1, \cdots, e_n\}$ of $n+1$ elements such that $e_i$ is a neighbor of $e_{i+1}$ for $i = 0,\cdots,n - 1$.
**Clusters.** A cluster, then, is a subset $\mathcal{S}$ of the array in which every pair of elements is joined by a path that lies entirely within $\mathcal{S}$. The _size_ of the cluster is given by $|\mathcal{S}|$.
As before, we can depict an interleaved array packing a set of codewords as an integer-valued array, where an integer represents a symbol in the codeword that is identified by that integer. Figure 2, for instance, represents an array that packs $5$ codewords.
![](https://hackmd.io/_uploads/rkkccINe6.png)
**Figure 2.** $3$-interleaved array of degree $5$.
**Definition of $t$-interleaved array.** An array $A$ is said to be $t$-interleaved if no cluster of size $t$ in the array contains repeated integers. Because clusters can have arbitrary shapes, a $t$-interleaved array is resilient to any burst of errors of that size.
**Degree of interleaving.** The degree $\deg A$ of an interleaved array is given by the number of distinct codewords present in the array.
As before, an error correcting code that can tolerate up to $\tau$ errors will allow the $t$-interleaved array to tolerate up to $\tau$ error clusters of size $t$. The goal of the authors is to construct $t$-interleaved arrays of minimum degree, in two and three dimensions.
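The definitions above can be turned into a brute-force checker. This is a minimal sketch under one assumption that I am making explicit: in a rectangular array with at least $t$ cells, two cells can share a cluster of size $t$ exactly when their Manhattan distance is at most $t - 1$, so it suffices to compare pairs of equal integers. The example array (integers assigned as $(c + 2r) \bmod 5$) is my own illustration and is not claimed to be the array shown in the figure.

```python
from itertools import combinations


def is_t_interleaved(array, t):
    """True if no two cells holding the same integer are at Manhattan
    distance below t (assumes a rectangular array with at least t cells)."""
    cells = [(r, c, v) for r, row in enumerate(array) for c, v in enumerate(row)]
    for (r1, c1, v1), (r2, c2, v2) in combinations(cells, 2):
        if v1 == v2 and abs(r1 - r2) + abs(c1 - c2) < t:
            return False
    return True


def degree(array):
    """Number of distinct codewords (integers) present in the array."""
    return len({v for row in array for v in row})


# Hypothetical 6 x 6 example using five codewords.
example = [[(c + 2 * r) % 5 + 1 for c in range(6)] for r in range(6)]

ok_for_3 = is_t_interleaved(example, 3)
ok_for_4 = is_t_interleaved(example, 4)
example_degree = degree(example)
```

The example passes the check for $t = 3$ and fails it for $t = 4$, because cells $(0, 0)$ and $(2, 1)$ hold the same integer at distance $3$.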
## The Two-Dimensional Case
**Bound.** If we transform the array into a lattice graph $G = (V, E)$ where vertices represent array elements and edges represent the array neighborhood relationship, the problem becomes that of finding the minimum number of colors that we can assign to vertices such that no cluster of size $t$ has two vertices of the same color.
We then construct the power graph $G^t = (V, E^t)$ of $G$ by adding an edge to $E^t$ between every pair of vertices that are joined in $G$ by a path containing at most $t$ vertices (that is, vertices at distance at most $t - 1$, which is exactly when they can share a cluster of size $t$). The problem then becomes that of finding the chromatic number of $G^t$, $\chi(G^t)$; i.e., the minimum number of colors that need to be used such that no two neighbors in $G^t$ are of the same color.
Because $G^t$ connects every such pair, a proper coloring of $G^t$ with $\chi(G^t)$ colors guarantees that no two vertices that can share a cluster of size $t$ in $G$ are colored the same.
The authors then prove a lower bound on the chromatic number, which depends on the parity of $t$:
$$
\chi(G^t) \geq
\begin{cases}
\frac{t^2}{2}, &\text{if $t$ is even}\\
\frac{t^2 + 1}{2}, &\text{otherwise}
\end{cases}
$$
The proof follows from arguing that any two elements in a two-dimensional sphere $\mathcal{S}_{2,t}$ of "diameter" $t$ must belong to some cluster of size at most $t$. All elements of the sphere must therefore carry distinct integers, and the bound is given by the area (number of elements) of such spheres.
Here a "sphere" is constructed recursively, as follows:
* $\mathcal{S}_{2,1}$ is a single element (tile) in the array;
* $\mathcal{S}_{2,2}$ is a $1 \times 2$ subarray;
* $\mathcal{S}_{2,t+2}$ is constructed by taking each element $e \in \mathcal{S}_{2,t}$ and adding its neighbors that are not already in $\mathcal{S}_{2,t}$; i.e. you add one more "layer" to the sphere.
![](https://hackmd.io/_uploads/ryVnzo4ga.png)
**Figure 3.** Two-dimensional spheres.
The proof is by induction. By the induction hypothesis, any two elements of $\mathcal{S}_{2,t}$ are joined by a path of length at most $t - 1$ (Figure 3). By construction, any two elements of $\mathcal{S}_{2,t + 2}$ are then joined by a path of length at most $t + 1$: either both endpoints are in $\mathcal{S}_{2,t}$, in which case the path connecting them is of length at most $t - 1$; or they are not, in which case each endpoint is at distance at most $1$ from some element of $\mathcal{S}_{2,t}$, which adds at most $2$ to the length.
One point of confusion is that the authors are actually referring to shortest paths. As can be seen from Figure 4, the sphere $\mathcal{S}_{2,8}$ can contain paths that are longer than $8$.
![](https://hackmd.io/_uploads/B1jb7jVg6.png)
**Figure 4.** $\mathcal{S}_{2,8}$ and a path of length $10$.
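The recursive sphere construction and the shortest-path reading can be sanity-checked numerically. The sketch below is my own, with hypothetical function names; it uses Manhattan distance as the shortest-path length in the lattice, and compares the sphere area against the lower bound stated earlier.

```python
def neighbors(cell):
    """The four lattice neighbors of a cell (x, y)."""
    x, y = cell
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}


def sphere(t):
    """Build S_{2,t}: start from one cell (t odd) or a 1 x 2 block (t even),
    then add one layer of neighbors for every step of 2 in t."""
    cells = {(0, 0)} if t % 2 == 1 else {(0, 0), (1, 0)}
    for _ in range((t - 1) // 2):
        cells = cells | {n for c in cells for n in neighbors(c)}
    return cells


def lower_bound(t):
    """t^2 / 2 for even t, (t^2 + 1) / 2 for odd t."""
    return (t * t + 1) // 2


def max_distance(cells):
    """Largest Manhattan distance between any two cells of the set."""
    return max(abs(x1 - x2) + abs(y1 - y2)
               for (x1, y1) in cells for (x2, y2) in cells)


sizes = {t: len(sphere(t)) for t in range(1, 9)}
diameters = {t: max_distance(sphere(t)) for t in range(1, 9)}
```

For every $t$ from $1$ to $8$, the sphere has exactly as many elements as the bound, and its two farthest elements are at distance $t - 1$, which is consistent with the "shortest path" interpretation.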
**Construction.** Two-dimensional $t$-interleaved arrays of degree $m$ can be constructed fairly easily when $t$ is even.
If we let $t = 2k$, we just need to construct two $k \times k$ "tiles" that are composed of totally different numbers, and then lay those tiles out repeatedly in an alternating (checkerboard) pattern. The result is a $t$-interleaved array of degree $2k^2 = t^2/2$; i.e. optimal.
The paper puts forth other, more complicated constructions (e.g. a toroidal construction which relies on "lattice interleavers"), but I won't go into those now as the goal here is to gain a basic understanding.
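The even-$t$ construction can be sketched as follows. The function names, the row-major numbering inside each tile, and the checkerboard layout are my reading of "alternatingly"; treat them as assumptions rather than the paper's exact procedure.

```python
def checkerboard_interleaving(k, tiles_per_side):
    """Tile a square array with two k x k tiles that hold disjoint integers,
    alternating the two tiles like a checkerboard."""
    size = k * tiles_per_side
    array = []
    for r in range(size):
        row = []
        for c in range(size):
            tile = (r // k + c // k) % 2      # which of the two tiles we are in
            offset = (r % k) * k + (c % k)    # row-major position inside the tile
            row.append(tile * k * k + offset + 1)
        array.append(row)
    return array


def min_repeat_distance(array):
    """Smallest Manhattan distance between two cells holding the same integer."""
    seen = {}
    best = None
    for r, row in enumerate(array):
        for c, v in enumerate(row):
            for (r2, c2) in seen.get(v, []):
                d = abs(r - r2) + abs(c - c2)
                best = d if best is None else min(best, d)
            seen.setdefault(v, []).append((r, c))
    return best


k = 2                                  # so t = 2k = 4
grid = checkerboard_interleaving(k, tiles_per_side=4)
grid_degree = len({v for row in grid for v in row})
closest_repeat = min_repeat_distance(grid)
```

For $k = 2$ the array uses $8 = t^2/2$ distinct integers, and the closest repeated integers are at distance $4 = t$, so no cluster of size $t$ can contain the same integer twice.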
[^1]: Somewhat implicit is the fact that if the code can tolerate up to $\tau$ errors then each codeword must be more than $\tau$ symbols long, as otherwise the code would trivially not tolerate this many errors. This means that the interleaved array must contain more than $t \times \tau$ symbols.