Search Results for author: Yihao Ding

Found 9 papers, 2 papers with code

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

no code implementations • 19 Apr 2024 • Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, Soyeon Caren Han

Document Question Answering (QA) presents a challenge in understanding visually-rich documents (VRD), particularly those dominated by lengthy textual content like research journal articles.

Information Retrieval • Machine Reading Comprehension • +3

M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding

no code implementations • 28 Feb 2024 • Yihao Ding, Lorenzo Vaiani, Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero

This paper presents a groundbreaking multimodal, multi-task, multi-teacher joint-grained knowledge distillation model for visually-rich form document understanding.

document understanding • Knowledge Distillation

Workshop on Document Intelligence Understanding

no code implementations • 31 Jul 2023 • Soyeon Caren Han, Yihao Ding, Siwen Luo, Josiah Poon, HeeGuen Yoon, Zhe Huang, Paul Duuring, Eun Jung Holden

Document understanding and information extraction include different tasks to understand a document and extract valuable information automatically.

document understanding • Visual Question Answering (VQA)

Graph Neural Networks for Text Classification: A Survey

no code implementations • 23 Apr 2023 • Kunze Wang, Yihao Ding, Soyeon Caren Han

Text Classification is the most essential and fundamental problem in Natural Language Processing.

graph construction • text-classification • +1

PDFVQA: A New Dataset for Real-World VQA on PDF Documents

no code implementations • 13 Apr 2023 • Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han

Document-based Visual Question Answering examines the document understanding of document images in conditions of natural language questions.

document understanding • Key Information Extraction • +2

Form-NLU: Dataset for the Form Natural Language Understanding

1 code implementation • 4 Apr 2023 • Yihao Ding, Siqu Long, Jiabin Huang, Kaixuan Ren, Xingxiang Luo, Hyunsuk Chung, Soyeon Caren Han

Compared to general document analysis tasks, form document structure understanding and retrieval are challenging.

4k • Key Information Extraction • +4

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

1 code implementation • COLING 2022 • Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, Soyeon Caren Han

Recognizing the layout of unstructured digital documents is crucial when parsing the documents into the structured, machine-readable format for downstream applications.

Component Classification • Document Layout Analysis

V-Doc : Visual questions answers with Documents

no code implementations • 27 May 2022 • Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han

We propose V-Doc, a question-answering tool using document images and PDF, mainly for researchers and general non-deep learning experts looking to generate, process, and understand the document visual question answering tasks.

Question Answering • Question Generation • +2

V-Doc: Visual Questions Answers With Documents

no code implementations • CVPR 2022 • Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han

Question Answering • Question Generation • +2
