Reproduced results
Details of the results, ablations, methodology, and hyperparameters can be found in the following report.
Report on reproducibility
Introduction:
In this paper, the authors argue that the limited representational resources of model-based RL agents are better spent on building models that are directly useful for value-based planning than on optimising them to predict transition probabilities as accurately as possible.
Their main contribution is the principle of value equivalence:
Value equivalence: two models are value equivalent with respect to a set of functions and a set of policies if they yield the same Bellman updates for every function and policy in those sets.
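The definition can be illustrated with a minimal sketch, assuming a small tabular MDP in which a model is a pair of a reward array `r` with shape (S, A) and a transition array `p` with shape (S, A, S). The helper names `bellman_update` and `value_equivalent`, the toy 3-state models, and the discount `gamma=0.9` are assumptions made for illustration and are not taken from the paper or the report.

```python
import numpy as np

def bellman_update(r, p, pi, v, gamma=0.9):
    """Apply the Bellman operator of policy pi under model (r, p) to v.

    r: (S, A) rewards, p: (S, A, S) transition probabilities,
    pi: (S, A) action probabilities, v: (S,) value function.
    """
    q = r + gamma * (p @ v)       # (S, A): one-step lookahead values
    return (pi * q).sum(axis=1)   # (S,): expectation over actions

def value_equivalent(model_a, model_b, policies, functions, gamma=0.9):
    """Check that both models give the same Bellman update for every
    policy in `policies` and every function in `functions`."""
    (r_a, p_a), (r_b, p_b) = model_a, model_b
    return all(
        np.allclose(bellman_update(r_a, p_a, pi, v, gamma),
                    bellman_update(r_b, p_b, pi, v, gamma))
        for pi in policies
        for v in functions
    )

# Hypothetical toy problem: 3 states, 1 action, zero rewards.
r = np.zeros((3, 1))
p_true = np.array([[[0.0, 0.5, 0.5]],
                   [[0.0, 1.0, 0.0]],
                   [[0.0, 0.0, 1.0]]])
p_alt = p_true.copy()
p_alt[0, 0] = [0.0, 1.0, 0.0]     # wrong transition probabilities in state 0

true_model, alt_model = (r, p_true), (r, p_alt)
pi = np.ones((3, 1))              # the only possible policy with one action
v_sym = np.array([0.0, 1.0, 1.0])   # cannot tell states 1 and 2 apart
v_asym = np.array([0.0, 1.0, 0.0])  # distinguishes states 1 and 2

print(value_equivalent(true_model, alt_model, [pi], [v_sym]))          # True
print(value_equivalent(true_model, alt_model, [pi], [v_sym, v_asym]))  # False
```

In this toy case the second model has incorrect transition probabilities, yet it is value equivalent to the first as long as the function set contains only `v_sym`. Enlarging the function set with `v_asym` breaks the equivalence, which shows how the choice of functions and policies determines which models count as interchangeable.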