
# Introductory RL Resources (continuously updated)

This document collects the learning materials needed to get started with reinforcement learning (RL).

## 0. Basics

+ Unrestricted internet access (for example via a VPN), so that you can use Google, YouTube, Google Scholar, and similar sites
+ Operating system: Linux or macOS
  You should be proficient with basic Linux commands and able to set up the environment (software and libraries) that the code requires.

## 1. Programming Skills

### Python

Interactive coding tutorial: https://www.w3schools.com/python/

### git

Learn the basic git commands and manage your code with GitHub or GitLab.

### Deep Learning Frameworks

TensorFlow: learn the TensorFlow and TensorBoard usage relevant to deep reinforcement learning

PyTorch: https://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html

### Python IDE

+ PyCharm
+ Jupyter Notebook

## 2. Simulation Environments

+ **OpenAI Gym**
  Gym is an open source Python library for developing and comparing reinforcement learning algorithms by providing a standard API to communicate between learning algorithms and environments, as well as a standard set of environments compliant with that API.
  **Documentation**: https://www.gymlibrary.dev/
  **Tutorials**: https://www.gymlibrary.dev/content/tutorials/
  You should be able to build a Gym-based environment yourself.
+ **Safety Gym**
  It provides a standardized method of comparing algorithms and how well they avoid costly mistakes while learning. If deep reinforcement learning is applied to the real world, whether in robotics or internet-based tasks, it will be important to have algorithms that are safe even while learning, like a self-driving car that can learn to avoid accidents without actually having to experience them.
  https://openai.com/blog/safety-gym/
+ **Simulation environments for autonomous driving**
    + highway-env: https://github.com/eleurent/highway-env
    + Huawei SMARTS
        + NeurIPS 2022 competition (form a team and take part if interested): https://smarts-project.github.io/
        + Simulation environment: https://github.com/huawei-noah/SMARTS
    + SUMO (strong at traffic-flow simulation)
        + https://www.eclipse.org/sumo/
    + CommonRoad-RL (aimed at decision-making and planning for autonomous driving)
        + Website: https://commonroad.in.tum.de/
        + Paper: https://dl.acm.org/doi/abs/10.1109/ITSC48978.2021.9564898
    + CARLA: https://github.com/carla-simulator/carla
        + Demanding hardware requirements:
          CPU: Intel i7 gen 9th - 11th / Intel i9 gen 9th - 11th / AMD Ryzen 7 / AMD Ryzen 9
          RAM: 16 GB
          GPU: NVIDIA RTX 2070 / NVIDIA RTX 2080 / NVIDIA RTX 3070 / NVIDIA RTX 3080
          OS: Ubuntu 18.04

## 3. RL Algorithm Libraries

+ **spinningup**
  Widely used in academia and essential reading for beginners. For several representative algorithms, you should be able to understand the code and read the associated papers closely.
  https://spinningup.openai.com/en/latest/
+ OpenAI baselines
  https://github.com/openai/baselines
+ **stable-baselines3** (SB3)
  https://stable-baselines3.readthedocs.io/en/master/guide/install.html
  For hyperparameter tuning with SB3, see: https://github.com/DLR-RM/rl-baselines3-zoo
+ **tianshou, Tsinghua University's open-source library**
  https://github.com/thu-ml/tianshou
  A Chinese-language tutorial is available.
+ **Ray**
  Ray is a unified way to scale Python and AI applications from a laptop to a cluster.
  Commonly used in industry for parallel computing.
  https://github.com/ray-project/ray
+ **RLlib**
  RLlib is an open-source library for reinforcement learning (RL), offering support for production-level, highly distributed RL workloads while maintaining unified and simple APIs for a large variety of industry applications. Whether you would like to train your agents in a multi-agent setup, purely from offline (historic) datasets, or using externally connected simulators, RLlib offers a simple solution for each of your decision making needs.
  https://docs.ray.io/en/latest/rllib/index.html

## 4. Books / Papers

**Sutton and Barto's classic textbook, *Reinforcement Learning: An Introduction* (2nd edition)**
http://incompleteideas.net/book/bookdraft2017nov5.pdf

Read the papers behind several representative algorithms, such as DQN, PPO, TRPO, TD3, DDPG, and SAC. The algorithms and their papers are listed on the SB3 and spinningup websites.

## 5. Courses

**CS287**
https://www.youtube.com/watch?v=xWPViQ6LI-Q&list=PLwRJQ4m4UJjNBPJdt8WamRAt4XKc639wF
Also available on Bilibili.

**CS229**
https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU

## 6. Cloud Computing Resources

**Colab**
https://colab.research.google.com/

**Others**
Alibaba Cloud, Tencent Cloud, and Huawei Cloud are leading cloud providers. Their cloud servers are cost-effective, which makes them a common choice for individuals and small to medium-sized businesses.

## 7. Zhihu: How Best to Get Started with Reinforcement Learning?

https://www.zhihu.com/question/277325426

## 8. Research Groups in Autonomous-Driving Decision-Making and Planning

+ Technical University of Munich (TUM) https://www.epc.ed.tum.de/en/rt/research/automotive/motion-planning-autonomous-driving/

## 9. RL: Notable Chinese Researchers

+ Huazhe Xu: http://hxu.rocks/
+ Yao Mu: https://yaomarkmu.github.io/

## 10. RL: Research Directions

+ Curriculum learning for reinforcement learning https://lilianweng.github.io/posts/2020-01-29-curriculum-rl/

## 11. Target Journals and Conferences

### Journals

IEEE Transactions on Industrial Informatics

### Conferences

AAMAS

## 12. Experiments

Machine Learning: What Is Ablation Study? https://www.baeldung.com/cs/ml-ablation-study
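
**Example: a minimal Gym-style environment**

Section 2 asks you to be able to build a Gym-based environment. As a warm-up before the official tutorials, the sketch below shows the `reset` / `step` interaction loop that Gym standardizes, using the five-value `step` return (`observation, reward, terminated, truncated, info`) described in the Gym documentation linked above. To stay runnable with only the standard library it does not import `gym`; a real environment would subclass `gym.Env` and declare `observation_space` and `action_space`. The class `LineWorldEnv`, the helper `run_episode`, the corridor task, and the reward values are all hypothetical and chosen only for illustration.

```python
class LineWorldEnv:
    """Hypothetical 1-D corridor: the agent starts in cell 0 and the goal is the last cell.

    Mirrors the Gym interface (reset/step) without depending on the gym package.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length        # number of cells in the corridor
        self.max_steps = max_steps  # episode is truncated after this many steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None):
        # Gym-style reset returns (observation, info).
        # The seed argument is accepted for API parity; this environment is deterministic.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        # action: 0 = move left, 1 = move right
        if action not in (0, 1):
            raise ValueError(f"invalid action: {action!r}")
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1

        terminated = self.position == self.length - 1              # goal reached
        truncated = (not terminated) and self.steps >= self.max_steps  # time limit hit
        reward = 1.0 if terminated else -0.1                       # small per-step penalty
        return self.position, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Roll out one episode with `policy(observation) -> action`; return the total reward."""
    obs, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        done = terminated or truncated
    return total_reward


if __name__ == "__main__":
    env = LineWorldEnv(length=5, max_steps=20)
    # A trivial policy that always moves right, toward the goal.
    print("episode return:", run_episode(env, lambda obs: 1))
```

Keeping `terminated` (the task ended) separate from `truncated` (the time limit cut the episode short) matters later on, because many of the algorithms in Section 4 treat the two cases differently when bootstrapping value estimates.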