# How build brain?
The hard part of tackling a hard problem is getting to the point of saying, "Enough, let's work this out."
The [CNRGlab @ UWaterloo](http://compneuro.uwaterloo.ca/about.html#frequently-asked-questions) said exactly that about the hard problem of understanding how a brain does cognition, and made...
- [spaun](https://www.youtube.com/watch?v=P_WRCyNQ9KY), the first large-scale brain model
- [nengo](https://www.nengo.ai) so you could also code it up
- two books ([Neural Engineering](https://mitpress.mit.edu/9780262550604/neural-engineering/) and [How to build a brain](https://academic.oup.com/book/6263)) so you think you know what you're doing
- a [summer school](https://www.nengo.ai/summer-school/)!
I think CNRG folks' work is insanely cool.
I want to write down what I learned in order of **biggest surprise**. Sometimes a surprise is a real "this is sci-fi" mindblow and I can't believe it; I'll attach colab experiments that sort of made me believe it.
The series of mindblows can triple as...
# Synopsis / ToC
[1. Custom dendrites. ](#1.)
Given the spiking activity in a group of neurons, you can use a custom dendrite to hear any signal.
[2. Vector binding.](#2.)
In fixed dimensionality, two information-carrying vectors can be compressed into one, with individual information retained up to the Shannon limit.
[3. The basal ganglia case.](#3.)
# 1.
**Given the spiking activity in a group of neurons, you can use a custom dendrite to hear any signal.**

A group of neurons will be able to represent a high-dimensional vector, each dimension being a different decoding of its spiking activity.
Every sufficiently interesting input (so not all neurons are like bleh, no reaction)\* will have a family of interesting signals accessible through custom dendrites.
\*I feel like there is a trade-off between population encoding and individual encoding (as in having 'grandma' neurons that fire whenever 'grandma' occurs); to build a brain right now, we need to assume population encoding wins. This is in accordance with the paradigm shift over the past 20 years.
- If there is an input such that every neuron reacts very differently, e.g. each has a Gaussian spiking rate ([**tuning curves**](https://www.nengo.ai/nengo/examples/usage/tuning-curves.html)) centered at a different value of the input, then it would be very easy to custom-dendrite any signal because the tuning curves form a Gaussian radial basis (see the sketch after this list).
- A tuning curve's input doesn't have to be the amount of current injected into the soma; it can be wavelength, or fatigue — anything in **state space**. (Spiking rate vs. soma current is called the **response function**.)
- Thus, it's natural to have **high-dimensional input**. The tuning curve would then be a scalar field, with nonzero partial derivative with respect to any input that matters to the neuron.
- On the other hand, consider a 'grandma' input with corresponding 'grandma' neurons. As soon as 'grandma' occurs, all the 'grandma' neurons get very excited and look the same. It would be hard to get a sine signal out of them but this ensemble is clearly onto something: they are the mythical 'grandma' neurons!
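To see what tuning curves actually look like, here's a minimal sketch (mine, not from the post) using nengo's `tuning_curves` helper from the linked example. With default LIF neurons and a 1D input you get a spread of monotone curves rather than the Gaussians of the thought experiment above; the ensemble size and labels are my choices:

```python
import matplotlib.pyplot as plt
import nengo
from nengo.utils.ensemble import tuning_curves

with nengo.Network() as model:
    # 32 LIF neurons representing a 1D value, as in the colab below
    ens = nengo.Ensemble(n_neurons=32, dimensions=1)

with nengo.Simulator(model) as sim:
    # firing rate of every neuron at each point of the represented state space
    inputs, rates = tuning_curves(ens, sim)

plt.plot(inputs, rates)
plt.xlabel("represented value (state space)")
plt.ylabel("firing rate (Hz)")
plt.show()
```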
---
## [`Order Custom Dendrites`](https://colab.research.google.com/drive/1SgPNHBRNGR7rZlsvuvehqBqv8KGpGtwd?usp=sharing)
... with **"principal component analysis"**, on some 32 neurons with given spiking activity:

The default `nengo.Connection` can be used as an interface to order custom dendrites. When the given ensemble was stimulated with some sufficiently interesting high-dimensional input $\mathbf{f}(t)=\langle f_{1}(t),f_{2}(t),\dots \rangle$, the `Connection` would know.
`nengo.Connection(given_ensemble[i], listener[j], function=g)` would solve for a dendrite (by default using least squares) that gathers the signal $g(f_{i}(t))$, and channel what the dendrite hears to the listener's dimension $j$.
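Here is a minimal sketch of what that call looks like in practice, under a made-up setup of my own (a 2D sinusoidal stimulus; `given_ensemble` and `listener` come from the description above, everything else is my choice):

```python
import numpy as np
import nengo

with nengo.Network() as model:
    # made-up 2D input f(t) = <f1(t), f2(t)>
    stim = nengo.Node(lambda t: [np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
    given_ensemble = nengo.Ensemble(n_neurons=200, dimensions=2)
    listener = nengo.Ensemble(n_neurons=100, dimensions=1)
    nengo.Connection(stim, given_ensemble)

    # "order a custom dendrite": decode g(f1(t)) = f1(t)**2 from the spikes
    # and channel it into dimension 0 of the listener; the decoding weights
    # are solved with regularized least squares by default
    nengo.Connection(given_ensemble[0], listener[0], function=lambda x: x ** 2)

    probe = nengo.Probe(listener, synapse=0.01)

with nengo.Simulator(model) as sim:
    sim.run(1.0)  # sim.data[probe] should roughly track sin(2*pi*t)**2
```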
---
If an ensemble of neurons can represent vectors, and custom dendrites can compute mathematical transforms, this framework suddenly becomes overpowered and turns into an ALU.
The [**Neural Engineering Framework (NEF)**](https://www.nengo.ai/nengo/examples/advanced/nef-summary.html) philosophy that `nengo` implements is **not** an ALU.
NEF is more like:
> Neural ensembles represent information; let us use state variables (vectors) to **represent** them.
>
> Information gets processed; let us **describe** that processing with math.
>
> We will put parts together (**systems design engineering**) while sticking to spiking neurons, neural anatomy, and neural resources (e.g. connections per $\text{mm}^{3}$).
I really recommend checking out NEF in [Neural Engineering](https://mitpress.mit.edu/9780262550604/neural-engineering/) pp. 15, 23, 230.
In my learning, high-dimensional objects didn't come easily, so I tried my best to justify them; I hope you find something interesting in flipping through Neural Engineering!
For a moment, allow me to indulge: could ALU stuff actually happen in our brains?
- A neuron can't have positive weights onto one dendrite and negative weights onto another, as we take into account for [the basal ganglia](#3.) — but this is not a real problem: there's the [`nengo` Parisien transform](https://forum.nengo.ai/t/how-to-realize-the-inhibitory-connection-between-neurons/1351/4) to get around it.
- To create redundancy, at least a couple of dendrites would have to grow the same way; but dendrites can be learned — a big thing to write about next.
As implemented in nengo, the level of customization that dendrites reach is really good when you use enough neurons. In step 0 of this [image encoding example](https://www.nengo.ai/nengo-extras/examples/mnist_single_layer.html), 28x28 MNIST images were used as 784D input vectors. Ten custom dendrites were ordered (using least squares) to have PSC = 0 or 1, in accordance with a one-hot encoding of the input label. This is much harder than a usual classification loss function! The 10 custom dendrites don't manage to hit exactly 0 or 1, but they do get the correct bin to come out largest 94% of the time with the default encoding. This is quite impressive, given that [best practice](https://forum.nengo.ai/t/mnist-value-range-for-ensemble-representation/2243) is to use 32 neurons per dimension/pixel!
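Here is a hedged sketch of that step 0, with random arrays standing in for the MNIST images so it runs self-contained (so the 94% figure obviously won't reproduce; this only shows the mechanics). The neuron count and regularization are my choices, not the notebook's:

```python
import numpy as np
import nengo

rng = np.random.RandomState(0)
n_images, n_pixels, n_classes = 500, 784, 10
images = rng.uniform(-1, 1, size=(n_images, n_pixels))  # stand-in for MNIST
labels = rng.randint(n_classes, size=n_images)
one_hot = np.eye(n_classes)[labels]                     # target output: 0 or 1

with nengo.Network() as model:
    ens = nengo.Ensemble(n_neurons=1000, dimensions=n_pixels)
    out = nengo.Node(size_in=n_classes)
    # ten "custom dendrites": least-squares decoders from this ensemble's
    # activity on each image to the one-hot label
    nengo.Connection(ens, out, eval_points=images, function=one_hot,
                     solver=nengo.solvers.LstsqL2(reg=0.01))

# building the simulator is what actually solves for the decoding weights
with nengo.Simulator(model) as sim:
    pass
```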
# 2.
**In fixed dimensionality, two information-carrying vectors can be bound into one with individual information retained up to the Shannon limit; the binding operation is achievable by custom dendrites.**
Suppose you want to represent a sentence
$$
\text{nengo eat cookies.}
$$
In word2vec, do they actually add the `nengo` vector and the `eat` vector and the `cookies` vector? There'd be no way of telling whether nengo eat cookies or cookies eat nengo.
If `nengo`, `eat`, and `cookies` were scalars, you'd tag them with distinct `subject`, `verb`, `object` basis vectors. Structured representation extends this: *bind* the vectors `nengo`, `eat`, and `cookies` with the vectors `subject`, `verb`, `object`.
The **tensor product** is a binding that preserves all information at the cost of *extremely* high dimensionality — you could implement neurons that bind `nengo` and `subject` in 64D, then divide the resulting 4096D (64×64) vector by `subject` to get `nengo` back perfectly.
The amount of information that can be packed into a given number of dimensions is bounded. Even so, you can use **circular convolution** ([beautiful book](https://press.uchicago.edu/ucp/books/book/distributed/H/bo3643252.html)) or **vector-derived transformation binding (VTB)** ([newer, better, paper](http://compneuro.uwaterloo.ca/files/publications/gosmann.2019b.pdf)) (both denoted $\otimes$ here) to compress `nengo` and `subject` in 64D into one new 64D vector, then divide it by `subject` to get a result closer to `nengo` than to the other members of the vocabulary, `subject`/`verb`/`object`/`eat`/`cookies`.
Structured representations can be nested, e.g. `I think nengo eat cookies` will let you query what I think, who thinks nengo eat cookies, etc.:
$$
\text{I}\otimes \text{subject}+\text{think}\otimes \text{verb}+(\text{nengo}\otimes \text{subject}+\text{eat}\otimes \text{verb}+\text{cookies}\otimes \text{object})\otimes \text{object}
$$
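Before the neural version, here is a minimal numpy sketch (mine) of binding and unbinding with circular convolution — the HRR flavor; the colab below hand-codes VTB instead. The vocabulary vectors and all names are my choices:

```python
import numpy as np

rng = np.random.RandomState(0)
D = 64

def vec():
    # random unit vector standing in for a vocabulary item
    v = rng.randn(D)
    return v / np.linalg.norm(v)

def bind(a, b):
    # circular convolution, computed in the Fourier domain
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(s, b):
    # approximate inverse of b: the involution (reverse all but element 0)
    b_inv = np.concatenate([b[:1], b[1:][::-1]])
    return bind(s, b_inv)

vocab = {w: vec() for w in
         ["nengo", "eat", "cookies", "subject", "verb", "object"]}

sentence = (bind(vocab["nengo"], vocab["subject"])
            + bind(vocab["eat"], vocab["verb"])
            + bind(vocab["cookies"], vocab["object"]))

# "Who eats?" -- unbind the subject role, see which word the result is closest to
guess = unbind(sentence, vocab["subject"])
sims = {w: np.dot(guess, v) for w, v in vocab.items()}
print(max(sims, key=sims.get))  # should come out as "nengo"
```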
---
## [`VTB from scratch`](https://colab.research.google.com/drive/1jyw6mMXQj_DL4ExV4YmgAIo38TOuYy6E?usp=sharing)
... hand-code VTB binding and unbinding, and then run neural simulation with `nengo`!
Also, check out the `nengo` [question answering](https://www.nengo.ai/nengo-examples/htbab/ch5-question.html) code to implement `I think nengo eat cookies` and query what I think, who thinks nengo eat cookies, etc.
This is the [How to build a brain](https://academic.oup.com/book/6263) ch. 5 tutorial, followed by question answering with memory and control!
---
This whole binding shenanigans is called the **Semantic Pointer Architecture (SPA)**.
In deep image nets, you can compress an image down to a few dimensions, then expand it back out to look the way it originally did. SPA extends this idea everywhere.
In [SPAUN](https://www.youtube.com/watch?v=P_WRCyNQ9KY) (Semantic Pointer Architecture Unified Network ☺), 6 muscles are at the bottom (highest-dimensional-end) of a motor hierarchy.
Picture of SPAUN, from [How to build a brain](https://academic.oup.com/book/6263) ch. 7 and [Eliasmith et al. 2012](https://doi.org/10.1126/science.1225266):

# 3.
**The basal ganglia** implements action selection in SPAUN and is pretty important.
In [a fun conversation with Tim](https://braininspired.co/podcast/90/), @celiasmith said something along the lines of —
> If SPAUN's drawing doesn't look like a human brain my primary suspect would be the basal ganglia. It's too important.
Anyways, I'm fond of this circuit of 1D ensembles and would like to write about it before going on to the 512D-ensemble bg-thalamus-cortex loop!
A problem instance of action selection \[ [1](http://compneuro.uwaterloo.ca/files/publications/stewart.2010.pdf), [2](https://pubmed.ncbi.nlm.nih.gov/11417052/) ] says:
Given a list of $n$ scalar 'salience' values, return the $\text{argmax}$ by constructing a feed-forward network such that each action has an input node and an output node. Salience and chosen-ness are represented by the activities of the input node and the output node, respectively.
We take this invitation to design a network that would do the job and observe biological serendipity along the way.
(I am reminded of XOR: make a network with two input nodes, range {`0`,`1`}, that outputs XOR. We heard its story in class: for twenty years people assumed neural networks couldn't do this, but that's not true. In computing a node, you can construct any function of the other nodes!)
1. **Order-preserving transfer** from input to output: this could be directly connecting inputs to respective outputs.
2. Apply a **squashing function** so that chosen looks very different from not-chosen.
Here is a biologically plausible network for $n=3$; connections for one action are shown (the others are like this too).

This network is carrying out order-preserving transfer and squashing as described above.
1. Higher input activity corresponds to lower output activity. The "average" is also offset from the output, so that a meh input results in output activity 0. Only a very high input results in output activity << 0, detectable after the squash.
    - Weights from striatum D2 to the output are $O\left( \frac{1}{n} \right)$. We'll see the basal ganglia has a way of making that happen.
2. Squashing occurs via the neuron response functions.
The basal ganglia components and connections shown are accurate:
- Since each node represents an ensemble of neurons, the connections coming out of it are either all inhibitory or all excitatory as per the ensemble.
- Input is forwarded directly from the cortex to striatum D1 and D2
- Striatum D2 has a diffuse connection, exciting all ensembles regardless of what action they represent
- Striatum D1 inhibits only its same-action family
There is a missing component of the basal ganglia, the globus pallidus external (GPe), which seems to implement a control system.
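For a quick feel of the whole circuit, here is a minimal sketch using nengo's built-in `nengo.networks.BasalGanglia` (it implements the model from [1] above). The salience values and sizes are my choices, and its sign conventions may not match the hand-drawn circuit, so just look for the output channel that separates from the others:

```python
import nengo

saliences = [0.3, 0.8, 0.5]  # action 1 should win

with nengo.Network() as model:
    bg = nengo.networks.BasalGanglia(dimensions=3)
    stim = nengo.Node(saliences)
    nengo.Connection(stim, bg.input)
    probe = nengo.Probe(bg.output, synapse=0.01)

with nengo.Simulator(model) as sim:
    sim.run(0.5)

# average output over the last 100 ms: the channel for the highest-salience
# action should clearly stand apart from the other two
print(sim.data[probe][-100:].mean(axis=0))
```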

In the **basal ganglia-thalamus-cortex loop**, we get to use SPA — writing about that is a big todo.
Code for implementing a basal ganglia-thalamus-cortex loop is [here](https://www.nengo.ai/nengo-examples/htbab/ch7-spa-sequence.html) ([How to build a brain](https://academic.oup.com/book/6263) ch. 7 tutorials).
(*I think* the discussion above sort of explained the basal ganglia ensembles like nowhere else :D I'll make notebooks to make it more apparent what each ensemble is doing: take out the globus pallidus external as well as striatum D2 and see what happens.)
# Conclusion
SPAUN runs cognitive tasks, with support for learning and memory.
> We tried to add emotion and that didn't work — no, that didn't happen.
— @celiasmith on [same podcast](https://braininspired.co/podcast/90/)
I think this has got to be a right way to go.
I am justly concerned that my biggest surprises were about the NEF ([§1](#1.)) and SPA ([§2](#2.)) frameworks, and might not give you the feel of building a car by looking at function and putting components together.
```
keep calm
&
nengo
```