Finding Winning Tickets with Limited (or No) Supervision
The lottery ticket hypothesis argues that neural networks contain sparse subnetworks, which, if appropriately initialized (the winning tickets), are capable of matching the accuracy of the full network when trained in isolation. Empirically made in different contexts, such an observation opens interesting questions about the dynamics of neural network optimization and the importance of their initializations. However, the properties of winning tickets are not well understood, especially the importance of supervision in the generating process. In this paper, we aim to answer the following open questions: can we find winning tickets with few data samples or few labels? can we even obtain good tickets without supervision? Perhaps surprisingly, we provide a positive answer to both, by generating winning tickets with limited access to data, or with self-supervision---thus without using manual annotations---and then demonstrating the transferability of the tickets to challenging classification tasks such as ImageNet.
PDF Abstract