# Paper draft

## Motivation

### Theoretical advantages of the NEAT algorithm

- **Flexibility**: automatic discovery of neural-network architectures, one more step towards complete automation (no hand-tuning of the NN architecture). This matters because near-term technologies are likely to change over time.
- **Efficiency**: the search starts with a low-complexity NN and grows it *only when needed* during learning. This can lead to succinct solutions (for instance, deep architectures, an assumption in many works, may not be necessary). It is therefore potentially more **scalable**, since shallower architectures are faster to execute. This is a very important point, as decoding time is crucial for error correction (if decoding takes too long, errors may proliferate). Moreover, it is known that larger neural networks require more training samples.
- **Interpretability**: ... and more interpretable. Moreover, optimizing in the space of policies allows one to inspect the NN policy directly and could yield insights into new correction strategies. Compare with Q-learning, where a deep network estimates $(s, a)\mapsto q_\pi(s,a)$ and the action taken is $a_\text{opt} = \underset{a}{\text{argmin}}\ q_\pi(s,a)$; in NEAT the networks approximate the function $s\mapsto a_\text{opt}$ directly. (Note that back-propagation applied to the NEAT networks is hard because it implies differentiating the argmin.)
- **Fast**: gradient-free methods are possibly faster (https://arxiv.org/abs/1712.06567) and can rival back-prop-trained ANNs. They are also highly parallelizable (independent individuals in the population are evaluated on independent puzzles).

The above advantages promote an approach that is both **flexible** and **scalable**.
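The contrast between the two action-selection schemes can be sketched in toy Python; `q_values` and `toy_policy` below are invented stand-ins for a trained Q-network and an evolved NEAT network, not anyone's actual implementation:

```python
# Schematic contrast: value-based action selection (Q-learning) vs the
# direct state -> action policy that NEAT evolves. All values are toy
# numbers for illustration only.

def q_learning_action(q_values):
    """Q-learning: score every action with q_pi(s, a), take the argmin."""
    # q_values: dict mapping action -> estimated q_pi(s, a)
    return min(q_values, key=q_values.get)

def neat_action(policy_net, state):
    """NEAT: the evolved network maps the state directly to an action."""
    return policy_net(state)

# Toy example: three candidate corrections for one syndrome state.
q = {"flip_up": 0.8, "flip_down": 0.2, "flip_left": 0.5}
toy_policy = lambda s: "flip_down"  # stand-in for an evolved network
print(q_learning_action(q))         # -> flip_down
print(neat_action(toy_policy, None))  # -> flip_down
```

The argmin over `q_values` is exactly the non-differentiable step mentioned above: the NEAT network outputs the action directly, so there is no inner argmin to back-propagate through.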
> Maybe systematically compare NEAT to Q-learning, as more people in the field are familiar with the latter method.

### Potential drawbacks

- **TODO**: quantify the number and importance of hyperparameters in the NEAT algorithm compared to Q-learning
- **TODO**: think about how difficult it would be to set up a procedure that tweaks the NEAT solution with a few gradient-descent steps

## Our NEAT approach

### Advantages

- **Unbiased towards MWPM**, i.e. no reward based on the number of bit flips, contrary to some previous studies.
- **Full use of symmetries**, as in Mats' paper. Discuss this point, because implementing rotational invariance is not actually beneficial.
- **Straightforward transfer learning** (start with good species from a smaller code distance): gives a great speed-up! It allows one to define decoders for large code distances; indeed, a $d=5$ decoder is also a $d=7$ decoder (obviously less efficient), in the same spirit as RG decoders.
- TO CHECK: the number of training samples seems much lower than in supervised settings; moreover, curriculum learning helps by concentrating on the hardest puzzles.

### Limitations

- No good definition of an algorithmic error threshold, since different code sizes do not share the same algorithm. This is actually also true for the other papers. -> this is fine if we want an adaptable decoder for near-term quantum devices
- Translation invariance is not very realistic.
- Compared to Q-learning, there is no memory mechanism (either enforced by experience replay or implicit in the $q$-function that accounts for future rewards), apart from the information on the qubits already flipped (given in the input data).

## Results

1. **Logical error rate competitive with MWPM.** Slightly superior performance in terms of error-rate threshold and logical error rates.
![](https://i.imgur.com/CBlZJFm.png)
> TODO: Push the results for $d=7$ and $d=9$. Try $d=11$?
2.
**Transfer learning is natural and provides a real boost** when training larger-code-distance decoders from smaller ones. [Figure showing correction success as a function of time (#generations) for different error rates [0.01, 0.05, 0.10, 0.15], for a $d=3$ decoder optimized from scratch, a $d=5$ decoder from scratch, and a $d=5$ decoder started (transplanted) from a $d=3$ decoder. Maybe show a larger code distance, since the good initialization is expected to help much more due to the larger search space.] (We will see that the transplanted $d=5$ performs well at small error rates.)
> Evaluate different initialization strategies (full, partial, one-random)
3. **TODO**: **Minimal structure enforced by NEAT allows examining the neural-network decoder directly**.
> TODO: Make a list of puzzles to test the strategy.
> -> Two syndrome points next to each other
> -> Two syndrome points with distance one
> -> From these, compare #moves with distance
> -> Four syndrome points, neighbouring pairs
> -> Etc.

> TODO: Look at the activation of each 'layer'/'neuron'
> -> As a function of the distance between syndrome points

> TODO: Can we figure out how the network determines not to flip a spin twice?

### Exploiting the symmetries even more

4. **Implementing rotation invariance reduces the action space even further** and provides equally good results, but slower training. Evaluating a network with 1 output on 4x as many samples takes longer than evaluating a network with 4 outputs on 4x fewer samples. Network evaluation is matrix multiplication, so perhaps not surprising? Fewer actions is good for RL search, less so for NEAT.
5. **hyperNEAT** could solve different problems: it can define a common algorithm across code distances, which makes the definition of the error threshold well posed, and could exploit interesting symmetry properties of the toric code in the decoders.
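One way to make the "a $d=5$ decoder is also a $d=7$ decoder" point concrete: under the translation-invariant setup, the network only ever sees a fixed-size window of the syndrome centred on a defect, so the same network can be evaluated at any code distance. A minimal sketch, where the helper `centred_patch` and the window size are our illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of distance-independence under translation invariance:
# the decoder network consumes a fixed-size window of the syndrome,
# recentred on a defect, so only the patch extraction depends on d.

def centred_patch(syndrome, centre, size):
    """Cut a size x size window out of a periodic syndrome lattice,
    recentred on `centre`. Works for any lattice dimension >= size."""
    d = len(syndrome)
    ci, cj = centre
    half = size // 2
    return [[syndrome[(ci + di) % d][(cj + dj) % d]
             for dj in range(-half, half + 1)]
            for di in range(-half, half + 1)]

# The same 3x3-input network can thus act on a d=5 or d=7 syndrome.
d7 = [[0] * 7 for _ in range(7)]
d7[2][3] = 1  # one syndrome defect
patch = centred_patch(d7, centre=(2, 3), size=3)
assert patch[1][1] == 1  # the defect sits at the centre of the window
```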
> TODO: maybe analyzing the NEAT decoders will allow us to explain why hyperNEAT struggles

## Future Ideas

- Depolarizing noise
- Fault tolerance (measurement errors)
- Decodoku (demo & fame)

# Paper layout

> We will focus on the NEAT side of the work, not on the Toric Code

## Introduction

1. Machine learning techniques are becoming more and more commonplace in quantum physics~\cite{Carrasquilla review}.
> Paragraph 1 will have more literature on examples of ML in physics
2. In particular, a physics problem that can be tackled with machine learning is that of quantum error decoding (QED) on 2D stabilizer codes~\cite{findref}. Previous works on the QED problem explore the application of supervised learning~\cite{DelftPaper?} and unsupervised learning~\cite{TorlaiMelko}. Yet other works formulate the QED problem as a single-player game, thereby allowing a solution attempt using reinforcement learning~\cite{Eisert,Mats1,Mats2}.
3. In this work we introduce the use of evolutionary algorithms, specifically the NEAT (NeuroEvolution of Augmenting Topologies) algorithm.
> Here we list the pros and cons from Hugo's analysis
> We do so because NEAT has been shown to perform well at game-playing RL problems.
> The application of neuroevolutionary techniques to quantum physics is largely unexplored; we hope their promising results will encourage further exploration.
4. We find that the NEAT algorithm is capable of producing a decoding strategy that performs similarly to the well-known MWPM result on the Toric code with bit-flip noise, but with much smaller networks than reported previously.
> [Hugo check:] Smaller computational cost because of transfer learning
> Mention HyperNEAT
5. The rest of this paper is structured as follows.

## Setting up QEC as a reinforcement learning problem

> [Figure] Showing a picture of the toric code with a syndrome, and with arrows indicating valid moves. Could be a wide figure showing moves as separate steps.
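The single-player game could be sketched as a minimal environment; the conventions, names, and lattice layout below are our assumptions for illustration, not the paper's implementation:

```python
import random

# Minimal sketch of the RL environment: a distance-d toric code under
# bit-flip noise. Qubits live on the edges of a d x d periodic lattice
# (one horizontal and one vertical edge per site); the syndrome is the
# parity of the four edges meeting at each vertex.

def sample_errors(d, p, rng):
    """i.i.d. X errors with probability p on each of the 2*d*d qubits."""
    flip = lambda: 1 if rng.random() < p else 0
    horiz = [[flip() for _ in range(d)] for _ in range(d)]
    vert = [[flip() for _ in range(d)] for _ in range(d)]
    return horiz, vert

def syndrome(horiz, vert):
    """Vertex (i, j) lights up iff an odd number of its four incident
    edges carry an error (periodic boundaries)."""
    d = len(horiz)
    return [[(horiz[i][j] + horiz[i][(j - 1) % d]
              + vert[i][j] + vert[(i - 1) % d][j]) % 2
             for j in range(d)]
            for i in range(d)]

rng = random.Random(0)
h, v = sample_errors(5, 0.1, rng)
s = syndrome(h, v)
# Each error flips exactly two stabilizers, so defects come in pairs
# and the total number of lit vertices is always even.
assert sum(map(sum, s)) % 2 == 0
```

An agent's move then flips one edge adjacent to a defect and recomputes the syndrome; the episode ends when the syndrome is empty.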
## Introducing NEAT to solve the RL problem

> [Figure] Figure explaining the essence of NEAT: a crossover

## Results & Comparisons

> Introduce MWPM as the benchmark for the threshold
> [Figure] Threshold result (mean and best) with variance between NEAT runs
> [Figure] Example of the resulting evolved policy network. Show the architecture?
> Complexity (# of parameters) as we evolve, as a comparison to the other papers on this. Table like in RL benchmark papers
> Compare to other papers
> [Figure] Training curves with transfer learning and different error rates
> [Possible Figure/Animation on github] Opening the black box: insert specific errors and show what the network would do.

## Discussion/Conclusion

> Machine-learned approaches to decoding the toric code all deal with small distances~\cite{Melko, Eisert, Mats}, and this work is no exception.
>
> Smaller networks will be faster, so they are better suited for hardware implementations.
>
> Even though these system sizes (distances) are small, if a toric code is ever constructed in a lab it will most likely be small, and this algorithm might then be relevant.
>
> HyperNEAT

## Acknowledgements

> Acknowledge INTERACTIONS EU
> CFM

## References