The success of ImageNet highlighted that in the era of deep learning, data was at least as important as algorithms. Not only did the ImageNet dataset enable that very important 2012 demonstration of the power of deep learning, but it also allowed a breakthrough of similar importance in transfer learning: researchers soon realized that the weights learned in state of the art models for ImageNet could be used to initialize models for completely other datasets and improve performance significantly.

Pretrained ImageNet models have been used to achieve state-of-the-art results in tasks such as object detection, semantic segmentation, human pose estimation and video recognition. At the same time, they have enabled the application of Computer Vision to domains where the number of training examples is small and annotation is expensive. Transfer learning via pretraining on ImageNet is in fact so effective in CV that not using it is now considered foolhardy (Mahajan et al., 2018) 
NLP’s ImageNet moment has arrived, Sebastian Ruder 

< Prev Next >