no code implementations • EMNLP 2020 • Aakriti Budhraja, Madhura Pande, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
Given the success of Transformer-based models, two directions of study have emerged: interpreting the roles of individual attention heads and down-sizing the models for efficiency.
no code implementations • 26 Sep 2021 • Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra
Large multilingual models, such as mBERT, have shown promise in cross-lingual transfer.
1 code implementation • 22 Jan 2021 • Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
There are two main challenges with existing methods for classifying the functional roles of attention heads: (a) there are no standard scores across studies or across functional roles, and (b) these scores are often average quantities measured across sentences without capturing statistical significance.
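To make challenge (b) concrete, here is a minimal sketch of what a per-sentence significance test could look like, as opposed to reporting an average score. This is not the paper's method; the chance rate, threshold semantics, and function name are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's scoring method): test whether an
# attention head exhibits a functional role on significantly more
# sentences than chance, rather than averaging a score over sentences.
from scipy.stats import binomtest

def head_role_significance(hits, n_sentences, chance_rate=0.1, alpha=0.05):
    """hits: number of sentences where the head exhibited the role
    (e.g., placed its max attention on the next token).
    chance_rate: assumed baseline probability, not a value from the paper."""
    result = binomtest(hits, n_sentences, p=chance_rate, alternative="greater")
    return result.pvalue, result.pvalue < alpha

# Example: a head shows the role in 400 of 1000 sentences.
pval, significant = head_role_significance(400, 1000)
print(f"p-value = {pval:.3g}, significant = {significant}")
```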
no code implementations • 13 Aug 2020 • Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
We show that a larger fraction of heads have a locality bias as compared to a syntactic bias.
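As a rough illustration of what "locality bias" can mean for an attention head, the sketch below computes an attention-weighted mean query-key distance: heads whose attention mass concentrates near the diagonal score low. This is an assumed definition for illustration, not necessarily the exact metric used in the paper.

```python
# Illustrative sketch (assumed locality metric): a head is "local" if its
# attention probability mass sits close to the diagonal of the matrix.
import numpy as np

def mean_attention_distance(attn):
    """attn: (seq_len, seq_len) attention matrix for one head, rows sum
    to 1. Returns the average |query - key| offset, weighted by the
    attention probabilities."""
    n = attn.shape[0]
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])  # |i - j| for every query/key pair
    return float((attn * dist).sum() / n)

# Example: a sharply local head vs. a uniform head over 8 tokens.
n = 8
local = np.eye(n, k=1)          # each token attends to the next token
local[-1, -1] = 1.0             # last row has no next token; attend to self
uniform = np.full((n, n), 1.0 / n)
print(mean_attention_distance(local))    # small value -> locality bias
print(mean_attention_distance(uniform))  # larger value -> no locality bias
```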