papers
Deep Kernel Learning (DKL) combines the representational power of
neural networks with the reliable uncertainty estimates of Gaussian processes by learning a flexible deep kernel function. Combined with structured kernel interpolation (KISS-GP), the cost is O(n) at train time and O(1) per test point, compared to O(n^3) at train time and O(n^2) at test time for standard Gaussian processes.
One of the central critiques of Gaussian process regression is that it does not actually learn representations of the data: the kernel function is specified in advance and is not flexible enough to do so. Deep kernel learning (DKL) addresses this by mapping the inputs x to intermediate features z = g(x; w)
through a neural network g parameterized by weights and biases w.
These features are then fed into a standard base kernel k, resulting in the effective kernel k_DKL(x, x') = k(g(x; w), g(x'; w)). The network weights and kernel hyperparameters are learned jointly by maximizing the GP marginal likelihood.
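A minimal sketch of this construction (not from the paper; the network architecture, RBF base kernel, and hyperparameters are illustrative assumptions): a small neural network maps inputs to features, and the base kernel is evaluated on those features to form the effective deep kernel matrix.

```python
import numpy as np

def feature_extractor(X, W1, b1, W2, b2):
    """Toy neural network g(x; w): two tanh layers mapping inputs to features."""
    H = np.tanh(X @ W1 + b1)
    return np.tanh(H @ W2 + b2)

def rbf_kernel(Z1, Z2, lengthscale=1.0, variance=1.0):
    """Standard RBF base kernel k(z, z') evaluated on feature-space inputs."""
    sq_dists = np.sum(Z1**2, 1)[:, None] + np.sum(Z2**2, 1)[None, :] - 2 * Z1 @ Z2.T
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def deep_kernel(X1, X2, params):
    """Effective DKL kernel: k_DKL(x, x') = k(g(x; w), g(x'; w))."""
    Z1 = feature_extractor(X1, *params)
    Z2 = feature_extractor(X2, *params)
    return rbf_kernel(Z1, Z2)

# Illustrative usage with random (untrained) network weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                      # 5 inputs, 3 dimensions
params = (rng.normal(size=(3, 8)), np.zeros(8),  # layer 1: 3 -> 8
          rng.normal(size=(8, 2)), np.zeros(2))  # layer 2: 8 -> 2 features
K = deep_kernel(X, X, params)                    # 5 x 5 deep kernel matrix
print(K.shape)
```

In DKL proper, the weights in params would be trained through the marginal likelihood rather than fixed at random values.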
The neural linear model (NLM) is essentially Deep Kernel Learning with a linear base kernel (see the sketch below).
https://proceedings.mlr.press/v161/ober21a/ober21a.pdf
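To illustrate that claim, here is a hedged sketch reusing the feature_extractor from the example above (again an assumption, not the paper's code): swapping the RBF base kernel for a linear kernel k(z, z') = z^T z' gives Bayesian linear regression on the learned features, which is the neural linear model.

```python
def linear_kernel(Z1, Z2, variance=1.0):
    """Linear base kernel k(z, z') = variance * z^T z'."""
    return variance * Z1 @2.0 * 0.5 * Z2.T if False else variance * Z1 @ Z2.T

def nlm_kernel(X1, X2, params):
    """NLM kernel: DKL with a linear base kernel, i.e. Bayesian
    linear regression on the learned features g(x; w).
    Reuses feature_extractor from the sketch above."""
    Z1 = feature_extractor(X1, *params)
    Z2 = feature_extractor(X2, *params)
    return linear_kernel(Z1, Z2)
```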
In this work, we make the following claims: