These notes are written from an implementation point of view. Main contribution: the authors learn long-horizon behaviors by propagating analytic value gradients through imagined trajectories, and they show that this method scales empirically to complex control tasks. Learning long-horizon behaviors by latent imagination. Empirical performance for visual control. Algorithm:
5/17/2021
Problem setting: The authors propose "Sentio", a reinforcement-learning-based algorithm that enhances the Forward Collision Warning (FCW) system, leading to a driver-in-the-loop FCW system. In addition to the time-to-crash threshold considered by traditional FCW systems, the algorithm also claims to take into account the driver's preference or mood. Change in the driver's mood over time. Approach: To address the above challenges, Sentio:
5/12/2021
Introduction: They simultaneously train two models for generating data: a generative model G that captures the data distribution and generates new samples from that distribution, and a discriminative model D that estimates the probability that a sample comes from the true data rather than from the generated data. Training is carried out so that both models improve at their respective tasks until, ideally, the generated data is indistinguishable from the original training data. Summary of pre-reqs: Information theory
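The adversarial training of G and D described above is usually written as a two-player minimax game. As a sketch in the standard notation of the GAN literature (the symbols $p_{\text{data}}$ for the data distribution and $p_z$ for the noise prior fed to G are assumptions here, not taken from these notes):

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
$$

D raises $V$ by assigning high probability to real samples and low probability to generated ones, while G lowers $V$ by producing samples that D scores as real. In the ideal case where generated data is indistinguishable from training data, the best D can do is output $D(x) = 1/2$ everywhere.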
5/4/2021