# Lifelong/Continual Learning Lit Review

## Resources

* Page full of links and relevant (?) papers: https://paperswithcode.com/task/continual-learning

## Notes from specific papers

### Lifelong Machine Learning

* Book by Zhiyuan Chen, Bing Liu
* https://books.google.ca/books?id=JQ5pDwAAQBAJ&pg=PA21&source=gbs_toc_r&cad=4#v=onepage&q&f=true
* Definition
  * Lifelong learning (LL) is a continuous learning process.
  * The learner has performed a sequence of $N$ learning tasks $T_1, T_2, \ldots, T_N$
  * The tasks have corresponding datasets $D_1, D_2, \ldots, D_N$
  * Now we wish to learn $T_{N+1}$ with dataset $D_{N+1}$
  * The learner should leverage past knowledge in the knowledge base (KB) to help learn $T_{N+1}$
  * The KB should maintain the knowledge learned from previous tasks
  * (a toy sketch of this loop is at the end of these notes)
* Another way of defining LL (five key characteristics):
  * continuous learning process
  * knowledge accumulation and maintenance in the KB
  * ability to use accumulated past knowledge to help future learning
  * ability to discover new tasks
  * ability to learn while working, i.e. to learn on the job

---

* Transfer learning
  * Source domain: lots of labeled training data (usually just one domain)
  * Target domain: little to no labeled training data
  * Use the source domain's labeled data to help learning in the target domain
  * Differences from LL
    * not concerned with continuous learning or knowledge accumulation (one source to one target)
    * unidirectional
    * does not identify new tasks to be learned (no learning on the job)
* Multi-task learning (MTL)
  * Learn multiple related tasks simultaneously
  * "introduce inductive bias in the joint hypothesis space of all tasks by exploiting task relatedness"
  * Prevents overfitting on individual tasks, giving better generalization
  * Relation to LL
    * similar in that both MTL and LL use information shared across tasks to help learning
    * MTL is still in the "traditional" learning paradigm: if we treat the multiple tasks as one big task, it is just traditional optimization
    * no accumulation of knowledge over time
    * no continuous learning
* Online learning
  * training data samples arrive in sequential order
  * when new data arrives, the existing model is quickly updated to produce the best model so far (toy sketch at the end of these notes)
  * different from traditional batch learning, which requires the full training set up front
  * Differences from LL
    * same task over time (LL has multiple tasks)
    * no retention of knowledge
* Reinforcement learning (RL)
  * learn an optimal policy mapping states to actions so as to maximize the cumulative reward
  * TL and MTL can both be applied to RL
  * Differences from LL
    * one task, one environment
    * no accumulation of knowledge
* Meta learning
  * learn a new task from only a small number of training examples, using a model trained on many other, very similar tasks
  * used to solve one-shot or few-shot learning problems
  * two components: a base (quick) learner and a meta (slow) learner (toy sketch at the end of these notes)
    * base: trained within a task, quick updates
    * meta: operates in a task-agnostic meta space, with the goal of transferring knowledge across tasks
  * Differences from LL
    * training tasks and test tasks are assumed to come from the same distribution
  * most similar to LL in that it makes use of many tasks to help learn the new task
* Conclusions
  * we're probably not doing LL, but online learning
  * we might do better if we try some meta learning tricks?
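
## Toy code sketches (mine, not from the book)

To make the book's definition of LL concrete, here is a minimal Python sketch of the task-sequence/KB loop. Everything in it (`KnowledgeBase`, `learn_task`, the slope-fitting "tasks") is a hypothetical stand-in I made up, not the book's method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Toy setting: each task is fitting y ≈ slope * x from (x, y) pairs.
Dataset = List[Tuple[float, float]]
Model = Dict[str, float]


@dataclass
class KnowledgeBase:
    """Accumulates and maintains knowledge (here: per-task models)."""
    past_models: Dict[str, Model] = field(default_factory=dict)

    def retain(self, task: str, model: Model) -> None:
        self.past_models[task] = model

    def prior_slope(self) -> Optional[float]:
        # Toy "past knowledge": the average slope over previous tasks.
        if not self.past_models:
            return None
        return sum(m["slope"] for m in self.past_models.values()) / len(self.past_models)


def learn_task(data: Dataset, prior: Optional[float], strength: float = 0.5) -> Model:
    # Least-squares slope through the origin, shrunk toward the KB prior
    # (a stand-in for "leveraging past knowledge to help learn T_{N+1}").
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data) or 1.0
    slope = num / den
    if prior is not None:
        slope = (1 - strength) * slope + strength * prior
    return {"slope": slope}


kb = KnowledgeBase()
tasks = {  # T_1, ..., T_{N+1}, each with its dataset D_i
    "T1": [(1.0, 2.1), (2.0, 3.9)],
    "T2": [(1.0, 1.9), (3.0, 6.2)],
    "T3": [(2.0, 4.2)],  # the new task T_{N+1}: tiny dataset, KB helps
}
for name, data in tasks.items():
    model = learn_task(data, prior=kb.prior_slope())
    kb.retain(name, model)  # knowledge accumulation and maintenance
    print(name, model)
```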
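
The "quick update on each arriving sample" behavior of online learning, as opposed to batch learning, looks roughly like the following per-sample SGD loop. The stream and the one-weight model are illustrative, not from any paper.

```python
# Online learning sketch: one SGD step per arriving sample, so the
# "best model so far" is available at every point in the stream.
stream = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # (x, y) arrive one by one

w, lr = 0.0, 0.05  # single-weight model: predict y ≈ w * x
for t, (x, y) in enumerate(stream, start=1):
    grad = 2 * (w * x - y) * x  # gradient of the squared error on this one sample
    w -= lr * grad              # quick update; old samples are never revisited
    print(f"after sample {t}: w = {w:.3f}")
```

Note how this matches the "differences from LL" bullets: the task never changes (always the same regression), and nothing beyond the current weights is retained.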
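
For the base/meta learner split, here is a minimal Reptile-style sketch. Reptile is one concrete meta-learning algorithm, not the book's general formulation, and the task family and hyperparameters below are made up.

```python
import random

random.seed(0)
task_slopes = [1.5, 2.0, 2.5, 3.0]  # a family of very similar tasks: fit y = a * x


def base_learn(theta: float, slope: float, steps: int = 5, lr: float = 0.02) -> float:
    """Base (quick) learner: a few SGD steps within a single task."""
    w = theta
    for _ in range(steps):
        x = random.uniform(0.0, 2.0)
        y = slope * x
        w -= lr * 2 * (w * x - y) * x
    return w


# Meta (slow) learner: task-agnostic, nudges the shared initialization
# toward weights from which the base learner adapts quickly on any task.
theta, meta_lr = 0.0, 0.1
for _ in range(200):
    slope = random.choice(task_slopes)    # sample a training task
    adapted = base_learn(theta, slope)    # base learner adapts to it
    theta += meta_lr * (adapted - theta)  # Reptile-style meta update

print(f"meta-learned init: {theta:.2f} (mean task slope: {sum(task_slopes) / 4})")
```

In this toy, the meta-learned initialization ends up near a point from which a handful of base-learner steps handle any task in the family, which is why the notes flag meta learning as the paradigm closest to LL.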