# Oralytics Paper Topics Brainstorm

**Goal:** Choose a topic that we can write a paper on, and design the RL algorithm for the Oralytics clinical trial so that it can run experiments for that topic.

## How to Build and Test Your RL Algorithm for Usage in Digital mHealth

[Todo] Outline Paper

* Data has no actions (raw dental data)
* Data has actions of the same kind (HeartSteps)

Challenges:

* evidence of non-stationarity
* sparse data

## Dealing with Zero-Inflated Values

* A majority of the labels (rewards) in the ROBAS 2 study are 0s, which indicate that the user did not brush.
* The reward approximating function is a combination of a Bernoulli and a zero-inflated Poisson, with the parameters estimated using a Gaussian Process (Generalized Linear Model)
    * (Johnson and Kotz, 1969; Lambert, 1992)
    * How do we make this a posterior distribution?
        * posterior over independent parameters $p$ and $\lambda$.
* Directly use a zero-inflated GP as the reward function [Hegde, 2019](https://arxiv.org/abs/1803.05036)
    * A novel zero-inflated Gaussian process formalism consisting of a latent Gaussian process and a separate "on-off" probit-linked Gaussian process that can zero out rows and columns of the model covariance.

![](https://i.imgur.com/xo95uqi.png)

## Model Selection For GPs With Sparse Data

* How do we initialize hyperparameter values at the beginning of the clinical trial?
    * One idea is to "throw away" the first, say, 5 days of the study and use that data to tune hyperparameters
* De facto methods of model selection for GPs are difficult to apply when there is only a small amount of data
    * sensitive to overfitting
    * e.g. in ROBAS 2, there were at most 56 data points per user (2 per day for 28 days)
* [Mohammed, 2017](https://ueaeprints.uea.ac.uk/id/eprint/67158/1/Chapter.pdf) suggests using cross-validation (CV) when there is limited data
    * "similarly to over-fitting in training, can be significantly harmful when the data sample is small and the population of hyper-parameters to be tuned is large"
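
To make the zero-inflated reward idea above concrete, here is a minimal sketch of a plain zero-inflated Poisson with fixed scalar parameters. It is not the Oralytics reward model: the notes propose estimating the parameters with a GP, which is omitted here. The function names are hypothetical, and the sketch assumes that $p$ is the probability of a structural zero (the user does not brush) and $\lambda$ is the Poisson rate, which the notes do not pin down.

```python
import math
import random


def zip_pmf(y, p, lam):
    """P(Y = y) under a zero-inflated Poisson.

    Assumption: p is the probability of a structural zero and
    lam is the rate of the Poisson count component.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # A zero arises from the structural-zero branch or from the Poisson.
        return p + (1.0 - p) * poisson
    return (1.0 - p) * poisson


def zip_sample(p, lam, rng):
    """Draw one zero-inflated Poisson sample using the generator rng."""
    if rng.random() < p:
        return 0  # structural zero
    # Knuth's multiplication method for a Poisson draw (fine for small lam).
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def zip_log_likelihood(ys, p, lam):
    """Log-likelihood of i.i.d. observations ys for fixed p and lam."""
    return sum(math.log(zip_pmf(y, p, lam)) for y in ys)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [zip_sample(0.6, 2.0, rng) for _ in range(1000)]
    print("fraction of zeros:", draws.count(0) / len(draws))
    print("log-likelihood:", zip_log_likelihood(draws, 0.6, 2.0))
```

With these assumptions the expected reward is $(1 - p)\lambda$, and a posterior over $p$ and $\lambda$ could be approximated by evaluating `zip_log_likelihood` together with a prior on a grid of parameter values.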