# Oralytics Paper Topics Brainstorm
**Goal:** Choose a topic that we can write a paper on and design the RL algorithm for the Oralytics clinical trial to run experiments for that topic.
## How to Build and Test Your RL Algorithm for Usage in Digital mHealth
[Todo] Outline Paper
* Data has no actions (raw dental data)
* Data has actions of the same kind (HeartSteps)
Challenges:
* evidence of non-stationarity
* sparse data
## Dealing with Zero-Inflated Values
* A majority of the labels (rewards) in the ROBAS 2 study are 0s, which indicate that the user did not brush.
* The reward-approximating function is a combination of a Bernoulli and a zero-inflated Poisson, with the parameter estimated using a Gaussian process (generalized linear model)
* (Johnson and Kotz, 1969; Lambert, 1992)
* how to make this a posterior distribution?
* posterior over independent parameters $p$ and $\lambda$.
* Directly use a zero-inflated GP as the reward function [Hegde, 2019](https://arxiv.org/abs/1803.05036)
* A novel zero-inflated Gaussian process formalism consisting of a latent Gaussian process and a separate ‘on-off’ probit-linked Gaussian process that can zero out rows and columns of the model covariance.
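
To make the Bernoulli plus Poisson idea above concrete, here is a minimal sketch of a zero-inflated Poisson likelihood and a grid approximation to the joint posterior over $p$ and $\lambda$. It assumes $p$ is the probability of a structural zero (the user did not brush), flat independent priors over the grid, and constant parameters; it does not include the Gaussian process part. The function names and the reward values are hypothetical, not ROBAS 2 data.

```python
import math


def zip_pmf(y, p, lam):
    """P(Y = y) under a zero-inflated Poisson.

    p   : probability of a structural zero (assumed: user did not brush)
    lam : Poisson rate for the count component
    """
    # Poisson pmf computed in log space for numerical stability.
    log_poisson = y * math.log(lam) - lam - math.lgamma(y + 1)
    poisson = math.exp(log_poisson)
    if y == 0:
        return p + (1.0 - p) * poisson
    return (1.0 - p) * poisson


def zip_log_likelihood(ys, p, lam):
    """Log-likelihood of i.i.d. observations ys."""
    return sum(math.log(zip_pmf(y, p, lam)) for y in ys)


def grid_posterior(ys, p_grid, lam_grid):
    """Normalized posterior weights on a (p, lam) grid, assuming flat priors."""
    log_post = {
        (p, lam): zip_log_likelihood(ys, p, lam)
        for p in p_grid
        for lam in lam_grid
    }
    top = max(log_post.values())
    weights = {key: math.exp(val - top) for key, val in log_post.items()}
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}


# Hypothetical rewards, mostly zeros.
rewards = [0, 0, 0, 3, 0, 2, 0, 0, 4, 0]
p_grid = [i / 20 for i in range(1, 20)]      # 0.05 ... 0.95
lam_grid = [0.5 * i for i in range(1, 21)]   # 0.5 ... 10.0

posterior = grid_posterior(rewards, p_grid, lam_grid)
p_mean = sum(p * w for (p, _), w in posterior.items())
lam_mean = sum(lam * w for (_, lam), w in posterior.items())
print(f"posterior mean p = {p_mean:.3f}, lambda = {lam_mean:.3f}")
```

Even with independent priors, the joint posterior under this likelihood generally does not factor into independent posteriors for $p$ and $\lambda$, because an observed 0 can come from either component. The grid keeps that dependence visible.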

## Model Selection For GPs With Sparse Data
* How to initialize hyperparameter values at the beginning of the clinical trial?
* One idea is to "throw away" the first few days of the study (say, 5) and use that data to tune hyperparameters
* De facto methods of model selection for GPs are difficult to apply when there is only a small amount of data
* sensitive to overfitting
* e.g. in ROBAS 2, there were at most 56 data points per user (2 per day for 28 days)
* [Mohammed, 2017](https://ueaeprints.uea.ac.uk/id/eprint/67158/1/Chapter.pdf) suggests using cross-validation (CV) when there is limited data
* "similarly to over-fitting in training, can be significantly harmful when the data sample is small and the population of hyper-parameters to be tuned is large"
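
As a rough sketch of the burn-in idea above, the snippet below tunes a single GP hyperparameter (the RBF lengthscale) on the first 5 days of data, i.e. 10 points at 2 per day, by maximizing the log marginal likelihood over a coarse grid. The RBF kernel, the fixed signal and noise variances, the grid, and the synthetic rewards are all assumptions made for illustration. Searching only one hyperparameter over a few values is one way to keep the search small when data is this limited.

```python
import math


def rbf_kernel(x1, x2, lengthscale, signal_var):
    """Squared-exponential kernel (an assumed choice)."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def log_marginal_likelihood(xs, ys, lengthscale, signal_var, noise_var):
    """log p(y | x, hyperparameters) for a zero-mean GP with Gaussian noise."""
    n = len(xs)
    cov = [
        [
            rbf_kernel(xs[i], xs[j], lengthscale, signal_var)
            + (noise_var if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    low = cholesky(cov)
    # Forward substitution: solve low @ z = ys, so that z.z = y^T K^{-1} y.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


# Hypothetical burn-in data: 5 days, 2 decision points per day.
burn_in_times = [0.5 * t for t in range(10)]        # in days
burn_in_rewards = [math.sin(t) for t in burn_in_times]

lengthscale_grid = [0.25, 0.5, 1.0, 2.0, 4.0]       # deliberately coarse
scores = {
    ell: log_marginal_likelihood(burn_in_times, burn_in_rewards, ell, 1.0, 0.1)
    for ell in lengthscale_grid
}
best_lengthscale = max(scores, key=scores.get)
print("selected lengthscale:", best_lengthscale)
```

The selected value would then initialize the GP for the remaining days of the trial. Whether 10 points are enough to trust this choice is exactly the over-fitting concern raised above.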