# Some overview notes
## Interesting follow-up ideas we may want to pursue later
- Think carefully about doing the whole project as an explicitly "relative" discussion, e.g. in the context of having a pair
- The idea of replacing a unit by a bigger space, etc. (can we think of any nice mathematical analogues of this construction?)
- Can we directly relate the "functional" framework we're currently pursuing to a "probabilistic inference" framework? In what ways do they differ, if any? (There may be some interesting ways in which the probabilistic framework is more flexible or feels more natural, e.g. by directly incorporating things which are more like correspondences than functions.)
- Return to this idea from exploring Cotton's function, that "structure" has to do with singular strata with respect to an action
- Possibly related: our discussion of using entropy to characterize why two coins having equal odds is a "generic" or natural situation to assume as a prior (a one-line version of the argument is written out after this list)
- How to relate to DiCarlo-style neuro questions?
- This relates to the "realization question," which I think is really interesting and important
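For reference, a one-line reconstruction of the entropy argument (my sketch, not a record of the discussion): for a two-outcome variable with probability $p$ of heads,
$$ H(p) = -p\log p - (1-p)\log(1-p), \qquad H'(p) = \log\frac{1-p}{p} = 0 \iff p = \tfrac{1}{2}, $$
so equal odds is the unique maximum-entropy (and in that sense "generic") prior on two outcomes.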
## Sensor-currying theorem is true only if we are happy to leave the range fixed and just change the domain
## Implicit vs explicit constructions (e.g., charts vs. properties of function spaces)
## Comments which feel important to me but which I don't know what to do with at the moment
1. The collection of pixels/sensors with its intrinsic "2d metric" feels more real than the collection of sensors as a bare set, but the sensors still aren't "real" in the sense that they are a feature of the measurement device (of course, it might very well be that factoring out the contribution of the measurement device and learning about it end up being similar endeavors). Nevertheless, we might be worried about a framework which puts "too much" emphasis on the 2d metric without also thinking about going beyond it.
## Reference terms
1. *Kuratowski embedding*: given a metric space $(X,d)$, the map $$ K : (X,d) \to \mathbf{R}^X $$ defined by $x \mapsto d(x, -)$. $K$ is distance preserving for the sup-norm on $\mathbf{R}^X$. *Proof*: $\sup_y |d(x,y) - d(x',y)| \le d(x,x')$ by the triangle inequality, with equality attained at $y = x'$. (A small numerical sanity check is sketched after this list.)
2. "*Weyl Embedding*:" an analagous distance preserving map $$ W: X \to L^2(X).$$ To describe it, note that given a map $\phi : \mathbf{R} \to \mathbf{R}$, we can generalize $K$ by defining $$K_\phi(x) = \phi(d(x, -)).$$ Then a theorem of Weyl says that for a fixed complete (uses flat? where?) ...
## What is a ("vanilla") convnet?
1. *Most vanilla example* -- would a definition, e.g. from the Bengio-Goodfellow book, differ from the one in 2. below?
2. In the geometric deep learning paper, they describe it like this:
1. Think of an image $x \in I$ as a function $x : \Omega \to \mathbf{R}$, $\Omega = [0,1]^2$.
- More generally we can take $x : \Omega \to \mathbf{R}^p$, e.g. having multiple color channels per pixel.
    2. In the case of one convolution filter only -- a standard discrete formulation is sketched after this list.
3.
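To make item 2 concrete, here is a minimal discrete sketch of a single-filter convolutional layer, treating the image as $x : \Omega \to \mathbf{R}$ sampled on a grid (a sketch only; the cross-correlation convention, "valid" boundary handling, the particular filter, and the ReLU are my assumptions, not necessarily what the paper writes):

```python
import numpy as np

def conv_single_filter(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Single-filter layer: x is an image sampled on a grid (H, W),
    w is a small filter (kh, kw). Returns the 'valid' cross-correlation."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for u in range(out.shape[0]):
        for v in range(out.shape[1]):
            out[u, v] = np.sum(x[u:u + kh, v:v + kw] * w)
    return out

# Hypothetical usage: a random "image" and a 3x3 edge-like filter.
rng = np.random.default_rng(0)
x = rng.random((8, 8))            # discretization of x : Omega -> R
w = np.array([[1., 0., -1.],
              [1., 0., -1.],
              [1., 0., -1.]])     # a single filter
y = np.maximum(conv_single_filter(x, w), 0.0)  # optional pointwise nonlinearity (ReLU)
print(y.shape)  # (6, 6)
```

The multi-channel case $x : \Omega \to \mathbf{R}^p$ from item 1 would just sum the same construction over channels.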