---
title: Reinforcement Learning Study Group Catch-up Presentations
---

:loudspeaker::mag::zap: **Reinforcement Learning Study Group Catch-up Presentations** :zap::mag::loudspeaker:

Please leave your constructive thoughts, comments and questions for each group presentation below. Add your name to your feedback if you feel comfortable, so the teams can reach out to you and discuss further.

Feedback could include:

What you found interesting :grey_question::question::grey_question:
What they might not have thought about, or suggestions :grey_question::question::grey_question:

**Running order:**

:large_blue_diamond: Robustness 2
:large_blue_diamond: Transferability 2
:large_blue_diamond: Robustness 1
:large_blue_diamond: Transferability 1

**Robustness 2 Questions & Comments**

*Leave your feedback and comments below:*

* The observations that the type of normalisation is very important for robustness are good learning points to pass back to DSTL. It may seem trivial, but it's the kind of message that we shouldn't lose. (David Leslie)

* * *

**Transferability 2 Questions & Comments**

*Leave your feedback and comments below:*

* I like this thinking, trying to look at sensitivity analysis as performance drops (?) when we move from pre-trained values. It's a high-dimensional space. Is there anything we could use from sequential experimental design and/or Bayesian optimisation? Try https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-12/issue-2/Bayes-linear-analysis-of-risks-in-sequential-optimal-design-problems/10.1214/18-EJS1496.full (David Leslie)

* * *

**Robustness 1 Questions & Comments**

*Leave your feedback and comments below:*

* How are you choosing to vary the parameters? Is it random, or are you using a (GAN-like) process where you choose the environmental parameters which cause the most "trouble" to the current agents? (David Leslie)

* * *

**Transferability 1 Questions & Comments**

*Leave your feedback and comments below:*

* Here is some current work on Bayes Opt with moderate numbers of dimensions, based on information theory: https://arxiv.org/abs/2102.03324 (shameless self-promotion!)

* * *