no code implementations • 19 Apr 2024 • Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, Soyeon Caren Han
Document Question Answering (QA) presents a challenge in understanding visually-rich documents (VRD), particularly those dominated by lengthy textual content like research journal articles.
no code implementations • 28 Feb 2024 • Yihao Ding, Lorenzo Vaiani, Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero
This paper presents a groundbreaking multimodal, multi-task, multi-teacher joint-grained knowledge distillation model for visually-rich form document understanding.
no code implementations • 31 Jul 2023 • Soyeon Caren Han, Yihao Ding, Siwen Luo, Josiah Poon, HeeGuen Yoon, Zhe Huang, Paul Duuring, Eun Jung Holden
Document understanding and information extraction include different tasks to understand a document and extract valuable information automatically.
no code implementations • 23 Apr 2023 • Kunze Wang, Yihao Ding, Soyeon Caren Han
Text Classification is the most essential and fundamental problem in Natural Language Processing.
no code implementations • 13 Apr 2023 • Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han
Document-based Visual Question Answering examines how well document images are understood when conditioned on natural language questions.
1 code implementation • 4 Apr 2023 • Yihao Ding, Siqu Long, Jiabin Huang, Kaixuan Ren, Xingxiang Luo, Hyunsuk Chung, Soyeon Caren Han
Compared to general document analysis tasks, form document structure understanding and retrieval are challenging.
1 code implementation • COLING 2022 • Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, Soyeon Caren Han
Recognizing the layout of unstructured digital documents is crucial when parsing the documents into a structured, machine-readable format for downstream applications.
no code implementations • 27 May 2022 • Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han
We propose V-Doc, a question-answering tool using document images and PDFs, mainly for researchers and non-experts in deep learning who want to generate, process, and understand document visual question answering tasks.
no code implementations • CVPR 2022 • Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han
We propose V-Doc, a question-answering tool using document images and PDFs, mainly for researchers and non-experts in deep learning who want to generate, process, and understand document visual question answering tasks.