no code implementations • 8 Mar 2024 • Aisha Khatun, Anisur Rahman, Md Saiful Islam, Hemayet Ahmed Chowdhury, Ayesha Tasnim
Moreover, we introduce the publicly available Bangla Authorship Attribution Dataset of 16 authors (BAAD16) containing 17,966 sample texts and 13.4+ million words to solve the standard dataset scarcity problem and release six variations of pre-trained language models for use in any Bangla NLP downstream task.
1 code implementation • 11 Jan 2020 • Aisha Khatun, Anisur Rahman, Md. Saiful Islam, Marium-E-Jannat
Characters are the smallest unit of text from which stylometric signals can be extracted to determine the author of a text.
no code implementations • 11 Jan 2020 • Hemayet Ahmed Chowdhury, Md. Azizul Haque Imon, Anisur Rahman, Aisha Khatun, Md. Saiful Islam
Language models are generally employed to estimate the probability distribution of various linguistic units, making them one of the fundamental parts of natural language processing.
no code implementations • 15 Nov 2019 • Aisha Khatun, Anisur Rahman, Hemayet Ahmed Chowdhury, Md. Saiful Islam, Ayesha Tasnim
Language models are at the core of natural language processing.