PAUSE, or Positive and Annealed Unlabeled Sentence Embedding, is an approach for learning sentence embeddings from a partially labeled dataset. It builds on the dual-encoder scheme widely adopted in supervised sentence embedding training. Each sample $\mathbf{x}_i$ is a pair of hypothesis and premise sentences $(x_{i},x^{\prime}_{i})$, each of which is fed into a pretrained encoder (e.g. BERT). As shown in the figure, the two encoders are identical during training because they share their weights.
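The weight-sharing dual-encoder schema can be sketched as follows. This is a minimal illustration, not the PAUSE implementation: `TinyEncoder` is a hypothetical stand-in for a pretrained encoder such as BERT, and the key point is that one encoder object (one set of weights) is applied to both sentences in the pair.

```python
import numpy as np

class TinyEncoder:
    """Stand-in for a pretrained encoder such as BERT (illustration only)."""

    def __init__(self, vocab_size=100, dim=8, seed=0):
        rng = np.random.default_rng(seed)
        # a single embedding table plays the role of the encoder's weights
        self.emb = rng.normal(size=(vocab_size, dim))

    def encode(self, token_ids):
        # mean-pool token embeddings into one sentence vector
        return self.emb[token_ids].mean(axis=0)

def dual_encode(encoder, hypothesis_ids, premise_ids):
    # the SAME encoder object embeds both sentences, so the two
    # "towers" of the dual encoder share all of their weights
    u = encoder.encode(hypothesis_ids)
    v = encoder.encode(premise_ids)
    return u, v

enc = TinyEncoder()
u, v = dual_encode(enc, [1, 2, 3], [4, 5, 6])
same_u, same_v = dual_encode(enc, [1, 2, 3], [1, 2, 3])
# identical inputs yield identical embeddings, confirming shared weights
assert np.allclose(same_u, same_v)
```

In practice both towers would be the same pretrained Transformer instance, so gradients from either sentence update one shared set of parameters.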
Source: PAUSE: Positive and Annealed Unlabeled Sentence Embedding
Component | Type
---|---
Average Pooling | Pooling Operations
BERT | Language Models
Dense Connections | Feedforward Networks
ELU | Activation Functions