Search Results for author: Yoshihiro Yamazaki

Found 4 papers, 0 papers with code

Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations

no code implementations • 21 Feb 2022 • Yoshihiro Yamazaki, Shota Orihashi, Ryo Masumura, Mihiro Uchida, Akihiko Takashima

There have been many attempts to build multimodal dialog systems that can respond to questions about given audio-visual information; the representative task for such systems is Audio Visual Scene-Aware Dialog (AVSD).

Answer Generation • Video Understanding

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages

no code implementations • 24 Nov 2021 • Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura

To this end, the proposed method pre-trains the encoder on a multilingual dataset that combines the resource-poor language's dataset with the resource-rich language's dataset, so that the encoder learns language-invariant knowledge for scene text recognition.

Decoder • Scene Text Recognition

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling

no code implementations • 22 Nov 2021 • Shota Orihashi, Yoshihiro Yamazaki, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Ryo Masumura

Dialogue sequence labeling is a supervised learning task that estimates a label for each utterance in a target dialogue document, and it is useful for many applications, such as dialogue act estimation.

Knowledge Distillation • Scene Segmentation

Construction and Analysis of a Multimodal Chat-talk Corpus for Dialog Systems Considering Interpersonal Closeness

no code implementations • LREC 2020 • Yoshihiro Yamazaki, Yuya Chiba, Takashi Nose, Akinori Ito

To facilitate research on such dialog systems, we are currently constructing a large-scale multimodal dialog corpus focusing on the relationship between speakers.

