"Toward a Mathematical Framework for Computation in Superposition" (TMF) [link] provides a proof of concept for how superposition can allow computation on many features in parallel. As a mathematical description, it serves as a starting point for understanding more complex types of computation in superposition.
Here we give a condensed version of the points made in the post.
Computing ANDs in superposition
Features: Say we have a set of $m$ sparse, boolean features (e.g., a set of topics that are either present or not in a text). Denote features by $f_{\alpha}$. Let $\ell \ll m$ be the number of features "on" at the same time.
Neurons: In the MLP layer, $d$ neurons each take a random subset of the features as inputs. A neuron's activation is nonzero only if at least two of its input features are "on".
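To make the setup concrete, here is a minimal NumPy sketch of the features and neurons described above. Several details are assumptions made for illustration and are not stated in this summary: each neuron is connected to each feature independently with a hypothetical probability `p`, connection weights are 0 or 1, and the "at least two features" condition is implemented as a ReLU with a bias of $-1$. The names `make_layer` and `neuron_activations` are likewise made up for this example.

```python
import numpy as np

def make_layer(m, d, p, rng):
    """Random 0/1 connectivity matrix of shape (d, m).

    Assumption: neuron i reads feature alpha independently with probability p.
    """
    return (rng.random((d, m)) < p).astype(float)

def neuron_activations(W, f):
    """ReLU(W f - 1): zero unless at least two connected features are on.

    Assumption: the threshold is implemented as a bias of -1 before the ReLU.
    """
    return np.maximum(W @ f - 1.0, 0.0)

rng = np.random.default_rng(0)
m, d, ell, p = 200, 50, 3, 0.1   # illustrative sizes, with ell << m

W = make_layer(m, d, p, rng)

# Sparse boolean feature vector with exactly ell features "on".
f = np.zeros(m)
on = rng.choice(m, size=ell, replace=False)
f[on] = 1.0

acts = neuron_activations(W, f)
counts = W @ f                   # number of active inputs seen by each neuron

print("active features:", sorted(on.tolist()))
print("neurons with nonzero activation:", np.flatnonzero(acts).tolist())
```

Under these assumptions, a neuron fires exactly when it is connected to two or more of the currently active features, so each firing neuron carries evidence about some pair of features being on together.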