Cross-Modal Alternating Learning with Task-Aware Representations for Continual Learning

IEEE TMM 2023  ·  Bin-Bin Gao

Continual learning is a research field that aims to give artificial neural networks the human ability to learn over a lifetime. Although a surge of recent work has achieved considerable performance, most methods rely only on the image modality for incremental image recognition tasks. In this paper, we propose a novel yet effective framework, cross-modal Alternating Learning with Task-Aware representations (ALTA), which makes good use of visual and linguistic information to achieve more effective continual learning. ALTA presents a cross-modal joint learning mechanism that learns image and text representations simultaneously to provide more effective supervision, and it mitigates forgetting by endowing task-aware representations with continual learning capability. To address the stability-plasticity dilemma, ALTA also proposes a cross-modal alternating learning strategy that alternately learns the task-aware cross-modal representations to better match image-text pairs between tasks, further enhancing continual learning ability. Extensive experiments on popular image classification benchmarks demonstrate that our approach achieves state-of-the-art performance, and systematic ablation studies and visualization analyses validate the effectiveness and rationality of our method. Our code for ALTA is available at https://github.com/vijaylee/ALTA.
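
The abstract does not spell out the objective used to match image-text pairs. As a rough illustration only, the sketch below shows a symmetric contrastive loss of the kind commonly used for image-text matching. It is an assumption, not ALTA's actual loss; the function names, the temperature value, and the toy feature vectors are all hypothetical. See the authors' repository for the real implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_feats, text_feats, temperature=0.07):
    """Hypothetical image-text matching loss (assumed, not taken from the paper).

    image_feats[i] and text_feats[i] are assumed to describe the same pair.
    """
    images = [l2_normalize(v) for v in image_feats]
    texts = [l2_normalize(v) for v in text_feats]
    # Similarity of every image to every text, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Transpose to score every text against every image.
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy features: matched pairs should score a much lower loss than swapped pairs.
matched = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is near zero when each image feature aligns with its own text feature and large when the pairs are swapped, which is the behaviour a pair-matching objective is meant to reward.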

Task                Dataset                   Model          Metric Name        Metric Value   Global Rank
Continual Learning  Cifar100 (10 tasks)       ALTA-ViTB/16   Average Accuracy   92.85          # 1
Continual Learning  Cifar100 (10 tasks)       ALTA-RN50x4    Average Accuracy   84.91          # 2
Continual Learning  Cifar100 (10 tasks)       ALTA-RN101     Average Accuracy   84.77          # 4
Continual Learning  Cifar100 (10 tasks)       ALTA-RN50      Average Accuracy   83.87          # 5
Continual Learning  Tiny-ImageNet (10tasks)   ALTA-ViTB/16   Average Accuracy   89.80          # 1
Continual Learning  Tiny-ImageNet (10tasks)   ALTA-RN50x4    Average Accuracy   84.73          # 2
Continual Learning  Tiny-ImageNet (10tasks)   ALTA-RN101     Average Accuracy   83.35          # 3
Continual Learning  Tiny-ImageNet (10tasks)   ALTA-RN50      Average Accuracy   81.07          # 4
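
This page does not define the "Average Accuracy" metric. In continual learning it is commonly taken to be the mean of the per-task test accuracies measured after the final task has been learned. The sketch below assumes that definition; the function name and the input numbers are hypothetical and are not results from the paper.

```python
def average_accuracy(per_task_accuracy):
    """Mean accuracy over all tasks, evaluated after training on the last task.

    Assumed definition; the source page does not state how the metric is computed.
    """
    if not per_task_accuracy:
        raise ValueError("need at least one task accuracy")
    return sum(per_task_accuracy) / len(per_task_accuracy)


# Hypothetical per-task accuracies (percent) for a 3-task run, for illustration only.
example = average_accuracy([90.0, 85.0, 80.0])
```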
