1 code implementation • 2 May 2024 • Samee Arif, Sualeha Farid, Awais Athar, Agha Ali Raza
This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers.
no code implementations • 14 Mar 2024 • Abdul Hameed Azeemi, Ihsan Ayyub Qazi, Agha Ali Raza
A weighted hybrid score that combines uncertainty and diversity is then used to select the top instances for annotation in each AL iteration.
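A weighted hybrid score of this kind can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min-max normalization, the `weight` parameter, and the assumption that per-instance uncertainty and diversity scores are already computed are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; constant or empty inputs map to zeros."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_for_annotation(
    uncertainty: Sequence[float],
    diversity: Sequence[float],
    k: int,
    weight: float = 0.5,  # hypothetical trade-off: 1.0 = uncertainty only
) -> List[int]:
    """Return indices of the k instances with the highest hybrid score."""
    if len(uncertainty) != len(diversity):
        raise ValueError("score lists must have the same length")
    u = min_max(uncertainty)
    d = min_max(diversity)
    # Assumed form of the hybrid score: a convex combination of the two.
    scores = [weight * a + (1.0 - weight) * b for a, b in zip(u, d)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    picked = select_for_annotation(
        uncertainty=[0.9, 0.1, 0.5, 0.7],
        diversity=[0.2, 0.8, 0.9, 0.1],
        k=2,
    )
    print(picked)
```

In an active learning loop, the selected indices would be sent for annotation, added to the labeled pool, and the scores recomputed in the next iteration.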
no code implementations • 18 Mar 2022 • Abdul Hameed Azeemi, Ihsan Ayyub Qazi, Agha Ali Raza
Self-supervised speech recognition models require considerable labeled training data for learning high-fidelity representations for Automatic Speech Recognition (ASR), which is computationally demanding and time-consuming.
no code implementations • LREC 2020 • Namoos Hayat Qasmi, Haris Bin Zia, Awais Athar, Agha Ali Raza
Because Urdu is a low-resource language in terms of standard linguistic resources, recent text simplification approaches that rely on manually crafted simplified corpora or lexicons such as WordNet are not applicable to it.
1 code implementation • COLING 2018 • Haris Bin Zia, Agha Ali Raza, Awais Athar
State-of-the-art Natural Language Processing algorithms rely heavily on efficient word segmentation.
no code implementations • LREC 2018 • Haris Bin Zia, Agha Ali Raza, Awais Athar
The tool predicts the pronunciation of words using an LSTM-based model trained on a handcrafted expert lexicon of around 39,000 words and shows an accuracy of 64% upon internal evaluation.