Multi-modal Named Entity Recognition

5 papers with code • 5 benchmarks • 0 datasets

Multi-modal named entity recognition (MNER) aims to improve the accuracy of NER models by using image information alongside the text.

Most implemented papers

Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer

no code yet • ACL 2020

To tackle the first issue, we propose a multimodal interaction module to obtain both image-aware word representations and word-aware visual representations.
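A multimodal interaction module of this kind is commonly built from cross-attention. The following is a minimal sketch under that assumption, not the paper's implementation: words attend over image regions to produce image-aware word representations, and regions attend over words to produce word-aware visual representations. The function names, the shared embedding dimension, and the absence of learned projections are all simplifications made for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries: List[Vector], memory: List[Vector]) -> List[Vector]:
    """Scaled dot-product attention: each query becomes a weighted mix of memory.

    Hypothetical helper: no learned query/key/value projections are applied,
    and queries and memory are assumed to share the same dimension.
    """
    dim = len(memory[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, m)) / math.sqrt(dim) for m in memory]
        weights = softmax(scores)
        outputs.append(
            [sum(w * m[j] for w, m in zip(weights, memory)) for j in range(dim)]
        )
    return outputs


# Toy inputs (made up): two word vectors and three image-region vectors.
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

image_aware_words = cross_attend(words, regions)    # one vector per word
word_aware_regions = cross_attend(regions, words)   # one vector per region
```

Each output keeps the length of its query sequence, so the image-aware word representations can be fed to a token-level tagger in place of the original word vectors.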

RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER

Multimodal-NER/RpBERT • 5 Feb 2021

We integrate soft or hard gates to select visual clues and propose a multitask algorithm to train on the MNER datasets.
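The difference between a soft and a hard gate can be sketched as follows. This is an illustrative example rather than the RpBERT code: it assumes a text-image relevance score in [0, 1] is already available, and the function names and the 0.5 threshold are hypothetical.

```python
from typing import List


def soft_gate(visual: List[float], relevance: float) -> List[float]:
    """Scale every visual feature by the relevance score (assumed in [0, 1])."""
    return [relevance * v for v in visual]


def hard_gate(
    visual: List[float], relevance: float, threshold: float = 0.5
) -> List[float]:
    """Pass visual features through unchanged, or zero them out entirely.

    The threshold value is an assumption chosen for illustration.
    """
    keep = 1.0 if relevance > threshold else 0.0
    return [keep * v for v in visual]


features = [2.0, 4.0]
weakly_related = soft_gate(features, 0.5)   # features are attenuated
dropped = hard_gate(features, 0.3)          # below threshold: image ignored
kept = hard_gate(features, 0.9)             # above threshold: image used as-is
```

A soft gate lets a partially relevant image contribute in proportion to its score, while a hard gate makes a binary keep-or-drop decision.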

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

alibaba-nlp/kb-ner • NAACL 2022

As text representations take the most important role in MNER, in this paper, we propose Image-text Alignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.

Named Entity and Relation Extraction with Multi-Modal Retrieval

modelscope/adaseq • 3 Dec 2022

MoRe contains a text retrieval module and an image-based retrieval module, which retrieve related knowledge of the input text and image in the knowledge corpus respectively.
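As a rough illustration of retrieving related knowledge from a corpus, the sketch below ranks corpus entries by cosine similarity to a query embedding and returns the top k. It is not the MoRe implementation: the embeddings are toy values, the `retrieve` helper and corpus layout are hypothetical, and the same routine is simply called once with a text query and once with an image query to mirror the two modules.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: Vector, corpus: List[Tuple[str, Vector]], k: int = 2) -> List[str]:
    """Return the texts of the k corpus entries most similar to the query."""
    ranked = sorted(corpus, key=lambda item: cosine(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy knowledge corpus (made-up entries and embeddings).
corpus = [
    ("entry about a football club", [1.0, 0.0]),
    ("entry about a city", [0.0, 1.0]),
    ("entry about a stadium", [1.0, 1.0]),
]

text_query = [1.0, 0.0]    # assumed embedding of the input text
image_query = [0.0, 1.0]   # assumed embedding of the input image

text_knowledge = retrieve(text_query, corpus)
image_knowledge = retrieve(image_query, corpus)
```

The two result lists can then be attached to the input as additional context for the extraction model.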

Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge

jinyuanli0012/pgim • 20 May 2023

However, these methods either neglect the necessity of providing the model with external knowledge, or encounter issues of high redundancy in the retrieved knowledge.