1 code implementation • 21 May 2024 • Hiba Maryam, Ling Fu, Jiajun Song, Tajrian ABM Shafayet, Qidi Luo, Xiang Bai, Yuliang Liu
The development of Urdu scene text detection, recognition, and Visual Question Answering (VQA) technologies is crucial for advancing accessibility, information retrieval, and linguistic diversity in digital content, facilitating better understanding and interaction with Urdu-language visual data.
1 code implementation • 19 May 2024 • Fadila Wendigoundi Douamba, Jianjun Song, Ling Fu, Yuliang Liu, Xiang Bai
We propose a comprehensive dataset of Swahili scene text images and evaluate different scene text detection and recognition models on it.
no code implementations • 28 Nov 2023 • Ling Fu, Zijie Wu, Yingying Zhu, Yuliang Liu, Xiang Bai
We contend that one main limitation of existing generation methods is the insufficient integration of foreground text with the background.
1 code implementation • 31 Jul 2022 • Xudong Xie, Ling Fu, Zhifei Zhang, Zhaowen Wang, Xiang Bai
Thirdly, we utilize a Transformer to learn image-level global features and model the global relationships among the corner points, with the assistance of a corner-query cross-attention mechanism.