1 code implementation • 29 Apr 2024 • Huy Quang Pham, Thang Kien-Bao Nguyen, Quan Van Nguyen, Dan Quang Tran, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
To this end, we introduce a novel dataset, ViOCRVQA (Vietnamese Optical Character Recognition - Visual Question Answering dataset), consisting of 28,000+ images and 120,000+ question-answer pairs.
Optical Character Recognition (OCR) +2
1 code implementation • 16 Apr 2024 • Quan Van Nguyen, Dan Quang Tran, Huy Quang Pham, Thang Kien-Bao Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images.
Multimodal Deep Learning • Optical Character Recognition (OCR) +5
no code implementations • 17 Jul 2023 • Nghia Hieu Nguyen, Kiet Van Nguyen
Based on these two novel modules, we introduce the Parallel Attention Transformer (PAT), which achieves the best accuracy on the ViVQA benchmark dataset, outperforming all baselines as well as other SOTA methods, including SAAA and MCAN.
1 code implementation • 7 May 2023 • Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The VQA task requires methods that can fuse information from questions and images to produce appropriate answers.
no code implementations • 7 May 2023 • Doanh C. Bui, Nghia Hieu Nguyen, Khang Nguyen
To contribute to low-resource research communities such as Vietnam's, we introduce a novel image captioning dataset in Vietnamese, the Open-domain Vietnamese Image Captioning dataset (UIT-OpenViIC).
no code implementations • 23 Feb 2023 • Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D. Vo, Khanh Quoc Tran, Kiet Van Nguyen
Visual Question Answering (VQA) is a challenging task at the intersection of natural language processing (NLP) and computer vision (CV) that has attracted significant attention from researchers.
1 code implementation • 10 Nov 2022 • Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen
Recognizing handwriting in images is challenging due to the wide variation in writing styles across individuals and the distinct linguistic characteristics of written languages.
no code implementations • 10 Nov 2022 • Nghia Hieu Nguyen, Duong T. D. Vo, Minh-Quan Ha
Image captioning is a challenging task that requires both understanding the visual information in an image and describing it in natural language.