TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Document Image Classification	Tobacco-3482	VGG	Memory	7.08	# 1
Document Image Classification	Tobacco-3482	Eff-GNN + Word2Vec [word2vec]	Accuracy	73.5	# 9
Document Image Classification	Tobacco-3482	BERT [BERT]	Accuracy	79	# 7
Document Image Classification	Tobacco-3482	DocBERT [DOCBERT]	Accuracy	82.3	# 6
Document Image Classification	Tobacco-3482	Eff-GNN + Word2Vec [word2vec] + Image Embedding	Accuracy	77.5	# 8
Document Image Classification	Tobacco-3482	Eff-GNN + Word2Vec [word2vec]	Accuracy	91	# 3
Document Image Classification	Tobacco-3482	DocBERT [DOCBERT]	Accuracy	91.95	# 2

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/efficient-document-image-classification-using/document-image-classification-on-tobacco-3482)](https://paperswithcode.com/sota/document-image-classification-on-tobacco-3482?p=efficient-document-image-classification-using)`

Efficient Document Image Classification Using Region-Based Graph Neural Network

25 Jun 2021 · Jaya Krishna Mandivarapu, Eric Bunch, Qian You, Glenn Fung

Document image classification remains a popular research area because it can be commercialized in many enterprise applications across different industries. Recent advances in large pre-trained computer vision and language models and in graph neural networks have given document image classification many new tools. However, using large pre-trained models usually requires substantial computing resources, which can defeat the cost-saving advantages of automatic document image classification. In this paper, we propose an efficient document image classification framework that uses graph convolutional neural networks and incorporates the textual, visual, and layout information of the document. We have rigorously benchmarked our proposed algorithm against several state-of-the-art vision and language models on both publicly available datasets and a real-life insurance document classification dataset. Empirical results on both publicly available and real-world data show that our methods achieve near-SOTA performance yet require far fewer computing resources and much less time for model training and inference. This results in solutions that offer better cost advantages, especially in scalable deployment for enterprise applications. We also provide comprehensive comparisons of computing resources, model sizes, and training and inference times between our proposed methods and the baselines. In addition, we delineate the cost per image for our method and for the baselines.
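To make the idea of graph convolution over document regions more concrete, here is a minimal sketch of one standard graph-convolution step (symmetric-normalized adjacency with self-loops, followed by a linear map and ReLU) and a mean-pooling readout. It is not the authors' implementation: the function names, the three-region chain graph, and the two-dimensional node features are all hypothetical, chosen only for illustration. In a region-based setup one might assume each node is a text region whose feature vector combines text, visual, and layout information, but the abstract does not specify how the graph or the features are built.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


def mean_pool(node_features):
    """Average node features into one document-level vector."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]


# Hypothetical example: three document regions connected in a chain.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # placeholder region features
weights = [[1.0, 0.0], [0.0, 1.0]]               # identity weights for clarity

hidden = gcn_layer(adj, features, weights)
doc_vector = mean_pool(hidden)
```

A real classifier would stack several such layers with learned weights and feed the pooled vector to a softmax over document classes; the pure-Python lists here simply keep the sketch dependency-free.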

Code

No code implementations yet.

Tasks

Classification

Document Classification

Document Image Classification

Image Classification

Datasets

RVL-CDIP Tobacco-3482

Results from the Paper

Ranked #1 on Document Image Classification on Tobacco-3482 (Memory metric)

Methods

Convolution • Graph Convolutional Networks
