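When optimized on a sequence of learning problems, convolutional neural networks suffer from catastrophic forgetting: as they fit the objective of the current training examples, their performance on previous tasks drops sharply. In this work, we introduce a novel framework that tackles this problem through conditional computation. We equip each convolutional layer with task-specific gating modules that select which filters to apply to a given input. This yields two appealing properties. First, the execution patterns of the gates make it possible to identify and protect important filters, ensuring no loss in model performance on previously learned tasks. Second, a sparsity objective encourages the selection of a limited set of kernels, which retains enough model capacity to absorb new tasks. Existing solutions require knowing, at test time, which task each example belongs to. In many practical scenarios, however, this knowledge may not be available. We therefore additionally introduce a task classifier that predicts the task label of each example, to handle settings in which no task oracle is available. We validate our proposal on four continual learning datasets. The results show that our model consistently outperforms existing methods both with and without a task oracle. Notably, on the Split SVHN and Imagenet-50 datasets, our model improves accuracy by up to 23.98% and 17.42%, respectively, over competing methods.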
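To make the gating idea concrete, here is a minimal pure-Python sketch of task-specific channel gating. It is not the authors' implementation: the function names (`apply_task_gate`, `sparsity_penalty`, `filters_to_protect`), the linear gate head on pooled features, the hard threshold at zero, and the "fired at least once" rule for picking filters to protect are all assumptions made for illustration. Training such gates would also need a differentiable relaxation, which is omitted here.

```python
def apply_task_gate(feature_maps, task_heads, task_id):
    """Zero out the channels that the gate for `task_id` switches off.

    feature_maps: list of channels, each a flat list of activations.
    task_heads:   dict task_id -> (weights, bias) of a small linear gate
                  head (hypothetical layout, one row of weights per channel).
    """
    # Global average pooling: one summary value per channel.
    pooled = [sum(ch) / len(ch) for ch in feature_maps]
    weights, bias = task_heads[task_id]
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    # Hard binary decision per channel (assumed threshold at 0).
    gates = [1 if z > 0 else 0 for z in logits]
    gated = [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
    return gated, gates


def sparsity_penalty(gates):
    """Fraction of active channels; adding it to the loss favours few filters."""
    return sum(gates) / len(gates)


def filters_to_protect(gate_history):
    """Channels that fired on at least one example of a finished task.

    Assumed rule: these filters would be frozen before the next task.
    """
    n_channels = len(gate_history[0])
    return [c for c in range(n_channels) if any(g[c] for g in gate_history)]


# Toy usage: three channels, one task whose gate head switches channel 1 off.
fmaps = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0]]
heads = {0: ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [-1.0, -3.0, -1.0])}
gated, gates = apply_task_gate(fmaps, heads, task_id=0)
print(gates)  # [1, 0, 1]
```

The sketch separates the three roles named in the abstract: the gate selects filters per input and per task, the penalty rewards using few of them, and the gate history identifies which filters a finished task relies on. When no task oracle is available, `task_id` would come from a task classifier instead of being given.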
Original title: Conditional Channel Gated Networks for Task-Aware Continual Learning
Original abstract: Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems: as they meet the objective of the current training examples, their performance on previous tasks drops drastically. In this work, we introduce a novel framework to tackle this problem with conditional computation. We equip each convolutional layer with task-specific gating modules, selecting which filters to apply on the given input. This way, we achieve two appealing properties. Firstly, the execution patterns of the gates allow to identify and protect important filters, ensuring no loss in the performance of the model for previously learned tasks. Secondly, by using a sparsity objective, we can promote the selection of a limited set of kernels, allowing to retain sufficient model capacity to digest new tasks. Existing solutions require, at test time, awareness of the task to which each example belongs to. This knowledge, however, may not be available in many practical scenarios. Therefore, we additionally introduce a task classifier that predicts the task label of each example, to deal with settings in which a task oracle is not available. We validate our proposal on four continual learning datasets. Results show that our model consistently outperforms existing methods both in the presence and the absence of a task oracle. Notably, on Split SVHN and Imagenet-50 datasets, our model yields up to 23.98% and 17.42% improvement in accuracy w.r.t. competing methods.
Original authors: Davide Abati, Jakub Tomczak, Tijmen Blankevoort, Simone Calderara, Rita Cucchiara, Babak Ehteshami Bejnordi
Original URL: https://arxiv.org/abs/2004.00070