no code implementations • EMNLP 2020 • Aakriti Budhraja, Madhura Pande, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
Given the success of Transformer-based models, two directions of study have emerged: interpreting the roles of individual attention heads and down-sizing the models for efficiency.
no code implementations • 26 Sep 2021 • Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra
Large multilingual models, such as mBERT, have shown promise in cross-lingual transfer.
1 code implementation • 22 Jan 2021 • Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
There are two main challenges with existing methods for classifying the functional roles of attention heads: (a) there are no standard scores across studies or across functional roles, and (b) these scores are often average quantities measured across sentences without capturing statistical significance.
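To make challenge (b) concrete, here is a minimal sketch of what a per-sentence significance test could look like, as opposed to reporting an average score. This is not the paper's method; the chance rate, threshold semantics, and function name are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's scoring method): test whether an
# attention head exhibits a functional role on significantly more
# sentences than chance, rather than averaging a score over sentences.
from scipy.stats import binomtest

def head_role_significance(hits, n_sentences, chance_rate=0.1, alpha=0.05):
    """hits: number of sentences where the head exhibited the role
    (e.g., placed its max attention on the next token).
    chance_rate: assumed baseline probability, not a value from the paper."""
    result = binomtest(hits, n_sentences, p=chance_rate, alternative="greater")
    return result.pvalue, result.pvalue < alpha

# Example: a head shows the role in 400 of 1000 sentences.
pval, significant = head_role_significance(400, 1000)
print(f"p-value = {pval:.3g}, significant = {significant}")
```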
no code implementations • 13 Aug 2020 • Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
We show that a larger fraction of heads have a locality bias as compared to a syntactic bias.
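As a rough illustration of what "locality bias" can mean for an attention head, the sketch below computes an attention-weighted mean query-key distance: heads whose attention mass concentrates near the diagonal score low. This is an assumed definition for illustration, not necessarily the exact metric used in the paper.

```python
# Illustrative sketch (assumed locality metric): a head is "local" if its
# attention probability mass sits close to the diagonal of the matrix.
import numpy as np

def mean_attention_distance(attn):
    """attn: (seq_len, seq_len) attention matrix for one head, rows sum
    to 1. Returns the average |query - key| offset, weighted by the
    attention probabilities."""
    n = attn.shape[0]
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])  # |i - j| for every query/key pair
    return float((attn * dist).sum() / n)

# Example: a sharply local head vs. a uniform head over 8 tokens.
n = 8
local = np.eye(n, k=1)          # each token attends to the next token
local[-1, -1] = 1.0             # last row has no next token; attend to self
uniform = np.full((n, n), 1.0 / n)
print(mean_attention_distance(local))    # small value -> locality bias
print(mean_attention_distance(uniform))  # larger value -> no locality bias
```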