Curriculum Learning

# Curriculum Learning
###### tags: `學習紀錄`
[toc]

---
## Before Meeting
:::success
- Curriculum learning was proposed by Prof. Bengio's group at the University of Montreal at ICML 2009. The main idea is to imitate how humans learn, working through a curriculum from easy to hard (in machine learning, from easy-to-learn examples to hard-to-learn examples). This makes it easier for the model to find a better local optimum and also speeds up training.
- Contributions of the paper
  - Explore cases that show that curriculum learning benefits machine learning.
  - Offer hypotheses about when and why this happens.
  - Explore the relation of curriculum learning to other machine learning approaches.
:::
[refer](https://blog.csdn.net/ccbrid/article/details/82421806)
[refer](https://www.ijcai.org/proceedings/2018/0587.pdf)
[refer](https://www.sohu.com/a/236007345_100118081)
[refer](http://www.cs.cmu.edu/~epxing/papers/2016/Sachan_Xing_ACL16a.pdf)
[refer](https://blog.csdn.net/qq_25011449/article/details/82914803)
[refer](http://www.paperweekly.site/papers/notes/112)
[refer](http://ronan.collobert.com/pub/matos/2009_curriculum_icml.pdf)

---
## Recent Paper

---
### Curriculum Learning
:::success
#### Abstract
- Humans and animals learn better when the learning material is presented in order from easy to hard, and curriculum learning borrows this idea for machine learning. On non-convex problems, curriculum learning shows large performance gains and strong generalization. The authors argue that a curriculum strategy speeds up convergence and, in non-convex optimization, finds better local optima (it can be viewed as a continuation method).
:::
:::info
#### Detail
- Introduction
  - Curriculum Learning
    - When training machine learning models, start with easier subtasks and gradually increase the difficulty level of the tasks.
    - Motivation comes from the observation that humans and animals seem to learn better when trained with a curriculum-like strategy.
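  - The introduction presents the idea of curriculum learning and uses two examples, shaping in animal training and a recurrent network learning a grammar, to argue that learning should progress step by step from easy to hard.
  - Contributions:
    - Using vision and language tasks, the authors show that even a very simple multi-stage curriculum strategy improves generalization and speeds up convergence.
    - They also offer explanations for why curriculum learning has these advantages.
    - The experiments suggest that curriculum learning acts like a kind of regularizer.
- On the difficult optimization problem of training deep neural networks
  - In this section the authors discuss how a curriculum strategy deals with the local-optima problem in deep neural networks. A deep network has a hierarchical structure, and using several levels of abstract features lets the system infer the input-output mapping from the data automatically, which removes the need for hand-designed features. Training deep architectures is difficult, however. Some studies show that using an unsupervised pre-training strategy to initialize the parameters for supervised training helps a deep network reach a better test error (better generalization). The authors apply a curriculum strategy as a kind of pre-training, aiming to find better local optima and to speed up convergence.
- A curriculum as a continuation method
  - ![](https://i.imgur.com/icVpT1u.png)![](https://i.imgur.com/GjWtKre.png)
- Toy Experiments with a Convex Criterion
  - Cleaner Examples May Yield Better Generalization Faster
    - ![](https://i.imgur.com/dfeNR8v.png)
  - Introducing Gradually More Difficult Examples Speeds-up Online Training
    - ![](https://i.imgur.com/ziUI9nq.png)
:::
:::warning
#### Conclusion
- Experiments on shape recognition
  - This experiment is about recognizing three kinds of shapes: triangles, rectangles and ellipses. The authors use two datasets to separate easy examples from hard ones. One dataset contains equilateral triangles, squares and circles (BasicShapes); in the other the shapes are less regular (GeomShapes). To show the effect of curriculum learning, the authors proceed as follows:
    - Training on the GeomShapes dataset alone serves as the baseline.
    - Training starts on the BasicShapes dataset. To vary how much easy training the model gets, separate runs train on it for 0, 2, 4, ..., 128 epochs (0 epochs is the baseline). Training then continues on GeomShapes up to epoch 256, stopping early if the validation error reaches a preset minimum. The results are shown below:
    - ![](https://i.imgur.com/ysUUe1u.png)
  - This result could simply be because the curriculum runs saw more examples than the run without a curriculum. The authors therefore ran two more experiments. One trains on the data from both BasicShapes and GeomShapes without any curriculum strategy, so that the amount of data seen is the same. The other trains on BasicShapes only, without a curriculum, to check that the gain does not come from the BasicShapes data simply being better. Both control experiments gave poor results, which supports the effect of curriculum learning.
- Experiment on language modeling
  - ![](https://i.imgur.com/OA1dDPI.png)
  - ![](https://i.imgur.com/LvzjHvB.png)
- Discussion and Future Work
  - The authors suggest two explanations for why curriculum learning works:
    - Early in training, less time is spent on noisy examples and on examples that are hard to learn.
    - Training is guided toward better local optima and better generalization: curriculum learning can be seen as a special kind of continuation method.
  - How to find better curricula is left as a direction for future research.
- Advantages of Curriculum Learning
  - Faster training in the online setting, as the learner does not try to learn difficult examples when it is not ready.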
  - Guiding training towards better local minima in parameter space, specifically useful for non-convex methods.
- Criticism
  - Curriculum Learning is not well understood, making it difficult to define the curriculum.
  - In one of the examples, anti-curriculum performs better than no-curriculum. Given that curriculum learning is modeled on the idea that learning benefits when examples are presented in order of increasing difficulty, one would expect anti-curriculum to perform worse.
:::
[refer](https://gist.github.com/sty61010/be562775fb1c4dabb4bba93f930a7048)

---
:::success
#### Abstract
:::
:::info
#### Detail
:::
:::warning
#### Conclusion
:::
[refer]()

---
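
The two-stage schedule from the shape recognition experiment above (easy data first, then hard data, with early stopping) can be sketched roughly as follows. This is only an illustration under assumptions, not the authors' code: the function names `two_stage_schedule` and `run_curriculum`, the callbacks `train_epoch` and `validation_error`, and the `min_error` parameter are all hypothetical, and the dataset names are used as plain labels.

```python
from typing import Callable, Iterator, List, Tuple


def two_stage_schedule(switch_epoch: int,
                       total_epochs: int = 256) -> Iterator[Tuple[int, str]]:
    """Yield (epoch, dataset_name) pairs for a two-stage curriculum.

    Epochs before `switch_epoch` use the easy set ("BasicShapes");
    the remaining epochs use the hard set ("GeomShapes").
    switch_epoch == 0 corresponds to the no-curriculum baseline.
    """
    if not 0 <= switch_epoch <= total_epochs:
        raise ValueError("switch_epoch must lie in [0, total_epochs]")
    for epoch in range(total_epochs):
        yield epoch, ("BasicShapes" if epoch < switch_epoch else "GeomShapes")


def run_curriculum(train_epoch: Callable[[str], None],
                   validation_error: Callable[[], float],
                   switch_epoch: int,
                   total_epochs: int = 256,
                   min_error: float = 0.0) -> List[Tuple[int, str, float]]:
    """Train along the schedule; stop early once the validation error
    reaches `min_error` (a hypothetical preset threshold).

    `train_epoch(dataset_name)` and `validation_error()` are placeholders
    for whatever model and data pipeline is actually used.
    """
    history = []
    for epoch, dataset in two_stage_schedule(switch_epoch, total_epochs):
        train_epoch(dataset)
        err = validation_error()
        history.append((epoch, dataset, err))
        if err <= min_error:
            break  # early stopping
    return history
```

To compare several switch points, one would call `run_curriculum` once per value (for example 0, 2, 4 and 128) with a freshly initialized model each time, then compare the final errors against the `switch_epoch=0` baseline.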