Transformers

Subformer is a Transformer that combines sandwich-style parameter sharing, which overcomes the shortcomings of naive cross-layer parameter sharing in generative models, with self-attentive embedding factorization (SAFE). In SAFE, a small self-attention layer is used to reduce the embedding parameter count.

Source: Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
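
The description above is high-level, so the following is a minimal PyTorch sketch of one plausible reading of the two techniques: SAFE embeds tokens into a reduced dimension and projects up through a small self-attention layer, and sandwich-style sharing keeps the first and last layers independent while reusing one shared block for every central layer. The class names, the d_emb/d_model split, and the exact wiring are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SAFEEmbedding(nn.Module):
    """Sketch of self-attentive embedding factorization (SAFE):
    embed into a small dimension d_emb, project up to d_model, then
    apply a small self-attention layer. This cuts embedding parameters
    from roughly V*d_model to roughly V*d_emb (plus small projections)."""
    def __init__(self, vocab_size, d_emb, d_model, n_heads=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_emb)
        self.up = nn.Linear(d_emb, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, token_ids):
        x = self.up(self.embed(token_ids))  # (B, T, d_model)
        out, _ = self.attn(x, x, x)         # small self-attention layer
        return out

class SandwichSharedEncoder(nn.Module):
    """Sketch of sandwich-style parameter sharing: the first and last
    layers keep their own weights; all central layers reuse one shared
    block instead of sharing weights across every layer."""
    def __init__(self, d_model, n_heads, n_layers):
        super().__init__()
        make = lambda: nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.first, self.last = make(), make()
        self.shared = make()                # reused for the central layers
        self.n_central = n_layers - 2

    def forward(self, x):
        x = self.first(x)
        for _ in range(self.n_central):
            x = self.shared(x)              # same weights on every pass
        return self.last(x)

# Usage: a 6-layer encoder with factorized embeddings (illustrative sizes).
emb = SAFEEmbedding(vocab_size=32000, d_emb=128, d_model=512)
enc = SandwichSharedEncoder(d_model=512, n_heads=8, n_layers=6)
h = enc(emb(torch.randint(0, 32000, (2, 16))))  # -> shape (2, 16, 512)
```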

Tasks

Task                            Papers  Share
Abstractive Text Summarization  2       20.00%
Language Modelling              2       20.00%
Machine Translation             2       20.00%
Translation                     2       20.00%
Graph Representation Learning   1       10.00%
Decoder                         1       10.00%
