EfficientNetV2: Smaller Models and Faster Training

1 Apr 2021 · Mingxing Tan, Quoc V. Le

This paper introduces EfficientNetV2, a new family of convolutional networks that have faster training speed and better parameter efficiency than previous models. To develop this family of models, we use a combination of training-aware neural architecture search and scaling, to jointly optimize training speed and parameter efficiency. The models were searched from the search space enriched with new ops such as Fused-MBConv. Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. Our training can be further sped up by progressively increasing the image size during training, but it often causes a drop in accuracy. To compensate for this accuracy drop, we propose to adaptively adjust regularization (e.g., dropout and data augmentation) as well, such that we can achieve both fast training and good accuracy. With progressive learning, our EfficientNetV2 significantly outperforms previous models on ImageNet and CIFAR/Cars/Flowers datasets. By pretraining on the same ImageNet21k, our EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2.0% accuracy while training 5x-11x faster using the same computing resources. Code will be available at https://github.com/google/automl/tree/master/efficientnetv2.
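The progressive learning idea described in the abstract (small images with weak regularization early in training, larger images with stronger regularization later) can be sketched as a per-stage schedule. The snippet below is a minimal illustration, not the authors' implementation: it assumes training is split into a fixed number of stages and that image size, dropout rate, and augmentation magnitude are each linearly interpolated between a start and an end value. The names `StageSettings`, `lerp`, and `progressive_schedule`, as well as the numeric ranges, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StageSettings:
    """Training settings for one stage (hypothetical container)."""
    image_size: int
    dropout: float
    aug_magnitude: float


def lerp(start: float, end: float, stage: int, num_stages: int) -> float:
    """Linearly interpolate from start (first stage) to end (last stage)."""
    if num_stages == 1:
        return end
    return start + (end - start) * stage / (num_stages - 1)


def progressive_schedule(
    num_stages: int,
    size_range: Tuple[int, int],
    dropout_range: Tuple[float, float],
    magnitude_range: Tuple[float, float],
) -> List[StageSettings]:
    """Build one StageSettings per stage.

    Assumption: image size and regularization strength grow together,
    so early stages are cheap and lightly regularized, and late stages
    use full-size images with the strongest regularization.
    """
    schedule = []
    for stage in range(num_stages):
        schedule.append(
            StageSettings(
                image_size=int(round(lerp(size_range[0], size_range[1], stage, num_stages))),
                dropout=lerp(dropout_range[0], dropout_range[1], stage, num_stages),
                aug_magnitude=lerp(magnitude_range[0], magnitude_range[1], stage, num_stages),
            )
        )
    return schedule


if __name__ == "__main__":
    # Illustrative ranges only; not taken from this page.
    for i, s in enumerate(progressive_schedule(4, (128, 300), (0.1, 0.3), (5.0, 15.0))):
        print(f"stage {i}: size={s.image_size} dropout={s.dropout:.3f} aug={s.aug_magnitude:.1f}")
```

In a real training loop, each stage would resize its input batches to `image_size` and reconfigure the dropout and augmentation layers before continuing; how that is wired up depends on the framework in use.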

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Image Classification | Certificate Verification | EfficientNetV2-L | Percentage correct | 99.1 | #1 |
| | | | PARAMS | 121M | #6 |
| | | | Top-1 Accuracy | 99.1 | #1 |
| | | | Parameters | 121M | #1 |
| Image Classification | Certificate Verification | EfficientNetV2-S | Percentage correct | 98.7 | #3 |
| | | | PARAMS | 24M | #4 |
| | | | Top-1 Accuracy | 98.7 | #3 |
| | | | Parameters | 24M | #3 |
| Image Classification | Certificate Verification | EfficientNetV2-M | Percentage correct | 99.0 | #2 |
| | | | PARAMS | 55M | #5 |
| | | | Top-1 Accuracy | 99.0 | #2 |
| | | | Parameters | 55M | #2 |
| Image Classification | CIFAR-100 | EfficientNetV2-M | Percentage correct | 92.2 | #15 |
| Image Classification | CIFAR-100 | EfficientNetV2-L | Percentage correct | 92.3 | #14 |
| Image Classification | CIFAR-100 | EfficientNetV2-S | Percentage correct | 91.5 | #20 |
| Image Classification | Flowers-102 | EfficientNetV2-S | Accuracy | 97.9 | #29 |
| Image Classification | Flowers-102 | EfficientNetV2-L | Accuracy | 98.8 | #17 |
| Image Classification | Flowers-102 | EfficientNetV2-M | Accuracy | 98.5 | #22 |
| Image Classification | ImageNet | EfficientNetV2-M (21k) | Top 1 Accuracy | 86.1% | #170 |
| | | | Number of params | 55M | #742 |
| Image Classification | ImageNet | EfficientNetV2L | Top 1 Accuracy | 86.8% | #121 |
| | | | Number of params | 121M | #877 |
| | | | GFLOPs | 53 | #431 |
| Image Classification | ImageNet | EfficientNetV2-S | Top 1 Accuracy | 83.9% | #347 |
| | | | Number of params | 24M | #579 |
| | | | GFLOPs | 8.8 | #286 |
| Image Classification | ImageNet | EfficientNetV2-S (21k) | Top 1 Accuracy | 85.0% | #255 |
| Image Classification | ImageNet | EfficientNetV2-M | Top 1 Accuracy | 85.1% | #245 |
| Image Classification | ImageNet | EfficientNetV2-L | Top 1 Accuracy | 85.7% | #200 |
| Classification | InDL | EfficientNetV2 | Average Recall | 85.40% | #9 |
| Image Classification | Stanford Cars | EfficientNetV2-L | Accuracy | 95.1 | #3 |
| Image Classification | Stanford Cars | EfficientNetV2-M | Accuracy | 94.6 | #4 |
| Image Classification | Stanford Cars | EfficientNetV2-S | Accuracy | 93.8 | #8 |

Methods