# Reinforcement Learning

###### tags: `Reinforcement Learning`

- [Policy Gradient](/8SuIRATlSlGjBOdaQ01hlQ)
- [Proximal Policy Optimization (PPO)](/f8-xjHBPSUS28GwsbmRO_g)
- [Q-learning (Basic Idea)](/bSzqZQ5KRvSmvQ8bQh2gUQ)
- [Q-learning (Advanced Tips)](/uKLA4H5JSteu45gRihq1HQ)
- [Q-learning (Continuous Action)](/J6YPq_wzQ0a4kUQY1kRZVw)
- [Actor-Critic](/9JTyIkUCS4OYqgWIBSjVsQ)
- [Sparse Reward](/HZa1TOJsTZueozjKjemLng)
- [Imitation Learning](/vE4cMXitT8yfRhjB3fdVyw)

Resources

- https://bigdatafinance.tw/index.php/tech/data-processing/528-2018-02-27-04-31-42
- https://github.com/tinyzqh/awesome-reinforcement-learning
- https://github.com/NeuronDance/DeepRL
- https://github.com/dennybritz/reinforcement-learning
- https://datawhalechina.github.io/easy-rl/
- https://www.inside.com.tw/article/12526-deep-reinforcementlearing-not-work