Search Results for author: Taiki Miyanishi

Found 6 papers, 4 papers with code

Map-based Modular Approach for Zero-shot Embodied Question Answering

no code implementations • 26 May 2024 • Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe

We conduct comprehensive experiments on virtual environments (MP3D-EQA) and two real-world house environments and demonstrate that our method can perform EQA even in the real world.

Embodied Question Answering · Navigate +1

JDocQA: Japanese Document Question Answering Dataset for Generative Language Models

1 code implementation • 28 Mar 2024 • Eri Onami, Shuhei Kurita, Taiki Miyanishi, Taro Watanabe

Document question answering is the task of answering questions about given documents such as reports, slides, pamphlets, and websites; it is a truly demanding task because both paper and electronic documents are so common in our society.

Hallucination · Question Answering +1

Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction

no code implementations • 28 Feb 2024 • Koki Maeda, Shuhei Kurita, Taiki Miyanishi, Naoaki Okazaki

Given the accelerating progress of vision and language modeling, accurate evaluation of machine-generated image captions remains critical.

Image Captioning · Language Modelling

Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans

1 code implementation • 23 May 2023 • Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, Motoaki Kawanabe

We present a novel task for cross-dataset visual grounding in 3D scenes (Cross3DVG), which overcomes the limitations of existing 3D visual grounding models, namely their restricted 3D resources and the consequent tendency to overfit to a specific 3D dataset.

3D Reconstruction · 3D Visual Grounding
