no code implementations • COLING 2022 • Yuto Kuroda, Tomoyuki Kajiwara, Yuki Arase, Takashi Ninomiya
We propose a method to distill language-agnostic meaning embeddings from multilingual sentence encoders for unsupervised quality estimation of machine translation.
no code implementations • ACL (WAT) 2021 • YuTing Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
We introduce our TMEKU system submitted to the English-Japanese Multimodal Translation Task for WAT 2021.
no code implementations • WAT 2022 • Yuki Nakatani, Tomoyuki Kajiwara, Takashi Ninomiya
In text generation tasks such as machine translation, models are generally trained using cross-entropy loss.
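The standard objective mentioned here can be sketched as plain token-level cross-entropy. This is a toy illustration of that baseline loss (the vocabulary, probabilities, and function names are invented for the example), not the method proposed in the paper:

```python
import math

def cross_entropy(step_probs, target_ids):
    """Average negative log-likelihood of the reference tokens.

    step_probs: per-decoding-step distributions (dicts token_id -> prob);
    target_ids: the reference token sequence.
    """
    nll = 0.0
    for dist, tok in zip(step_probs, target_ids):
        nll += -math.log(dist[tok])
    return nll / len(target_ids)

# Two decoding steps over a three-token vocabulary:
steps = [{0: 0.7, 1: 0.2, 2: 0.1}, {0: 0.1, 1: 0.8, 2: 0.1}]
loss = cross_entropy(steps, [0, 1])
print(round(loss, 4))  # -> 0.2899
```

Minimizing this loss pushes the model to assign high probability to each reference token, which is exactly the training signal that sequence-level alternatives try to improve on.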
1 code implementation • sdp (COLING) 2022 • Hiroki Yamauchi, Tomoyuki Kajiwara, Marie Katsurai, Ikki Ohmukai, Takashi Ninomiya
We release a pretrained Japanese masked language model for an academic domain.
1 code implementation • Findings (EMNLP) 2021 • Junya Takayama, Tomoyuki Kajiwara, Yuki Arase
We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding the underlying intentions of users.
no code implementations • EMNLP 2021 • Han Huang, Tomoyuki Kajiwara, Yuki Arase
Definition generation techniques aim to generate a definition of a target word or phrase given a context.
no code implementations • LREC 2022 • Han Huang, Tomoyuki Kajiwara, Yuki Arase
This study investigates and releases JADE, a corpus for Japanese definition modelling, a technique that automatically generates definitions of a given target word or phrase.

1 code implementation • LREC 2022 • Kazuki Tani, Ryoya Yuasa, Kazuki Takikawa, Akihiro Tamura, Tomoyuki Kajiwara, Takashi Ninomiya, Tsuneo Kato
Therefore, we create a benchmark test dataset for Japanese-to-English MLCC-MT from the Newsela corpus by introducing automatic filtering of data with inappropriate sentence-level complexity, a manual check of parallel target-language sentences with different complexity levels, and manual translation.
no code implementations • WMT (EMNLP) 2020 • Akifumi Nakamachi, Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the TMUOU submission for the WMT20 Quality Estimation Shared Task 1: Sentence-Level Direct Assessment.
1 code implementation • EMNLP 2021 • Nattapong Tiyajamorn, Tomoyuki Kajiwara, Yuki Arase, Makoto Onizuka
Experimental results on both quality estimation of machine translation and cross-lingual semantic textual similarity tasks reveal that our method consistently outperforms the strong baselines using the original multilingual embedding.
1 code implementation • EMNLP (WNUT) 2020 • Sora Ohashi, Tomoyuki Kajiwara, Chenhui Chu, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We introduce the IDSOU submission for the WNUT-2020 task 2: identification of informative COVID-19 English Tweets.
1 code implementation • Findings (EMNLP) 2021 • Yuki Arase, Tomoyuki Kajiwara
The results confirm that our representations performed competitively with the state-of-the-art method that transforms contextualised representations on context-aware lexical semantic tasks, and outperformed it on STS estimation.
1 code implementation • LREC 2022 • Haruya Suzuki, Yuto Miyauchi, Kazuki Akiyama, Tomoyuki Kajiwara, Takashi Ninomiya, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We annotate 35,000 SNS posts with both the writer's subjective sentiment polarity labels and the reader's objective ones to construct a Japanese sentiment analysis dataset.
no code implementations • 9 Nov 2023 • Yuto Kuroda, Atsushi Fujita, Tomoyuki Kajiwara, Takashi Ninomiya
In this paper, we extensively investigate the usefulness of synthetic TQE data and pre-trained multilingual encoders in unsupervised sentence-level TQE, both of which have been proven effective in the supervised training scenarios.
1 code implementation • 21 Oct 2022 • Yuki Arase, Satoru Uchida, Tomoyuki Kajiwara
Controllable text simplification is a crucial assistive technique for language learning and teaching.
no code implementations • ACL 2021 • Sora Kadotani, Tomoyuki Kajiwara, Yuki Arase, Makoto Onizuka
Curriculum learning has improved the quality of neural machine translation, where only source-side features are considered in the metrics to determine the difficulty of translation.
1 code implementation • ACL 2021 • Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Yuki Arase
Few-shot text classification aims to classify inputs whose label has only a few examples.
1 code implementation • NAACL 2021 • Tomoyuki Kajiwara, Chenhui Chu, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We annotate 17,000 SNS posts with both the writer's subjective emotional intensity and the reader's objective one to construct a Japanese emotion analysis dataset.
no code implementations • AACL 2020 • Akifumi Nakamachi, Tomoyuki Kajiwara, Yuki Arase
We optimize rewards of reinforcement learning in text simplification using metrics that are highly correlated with human-perspectives.
1 code implementation • COLING 2020 • Ryoma Yoshimura, Masahiro Kaneko, Tomoyuki Kajiwara, Mamoru Komachi
We propose a reference-less metric trained on manual evaluations of system outputs for grammatical error correction (GEC).
no code implementations • COLING 2020 • Sora Ohashi, Mao Isogawa, Tomoyuki Kajiwara, Yuki Arase
We reduce the model size of pre-trained word embeddings by a factor of 200 while preserving their quality.
1 code implementation • EAMT 2020 • YuTing Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
In contrast, we propose the application of semantic image regions for MNMT by integrating visual and textual features using two individual attention mechanisms (double attention).
no code implementations • ACL 2020 • Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Chenhui Chu, Yuki Arase
Advanced pre-trained models for text representation have achieved state-of-the-art performance on various text classification tasks.
no code implementations • LREC 2020 • Masato Yoshinaka, Tomoyuki Kajiwara, Yuki Arase
To estimate the likelihood of phrase alignments, SAPPHIRE uses phrase embeddings that are hierarchically composed of word embeddings.
no code implementations • LREC 2020 • Yuki Arase, Tomoyuki Kajiwara, Chenhui Chu
The dataset we present in this paper is unique for the richness of annotated information, including detailed descriptions of drug reactions with full context.
no code implementations • LREC 2020 • Daiki Nishihara, Tomoyuki Kajiwara
We introduce three language resources for Japanese lexical simplification: 1) a large-scale word complexity lexicon, 2) the first synonym lexicon for converting complex words to simpler ones, and 3) the first toolkit for developing and benchmarking Japanese lexical simplification systems.
no code implementations • WS 2019 • Kazuki Ashihara, Tomoyuki Kajiwara, Yuki Arase, Satoru Uchida
Herein we propose a method that combines these two approaches to contextualize word embeddings for lexical substitution.
no code implementations • 29 Jul 2019 • Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the metric using BERT (Bidirectional Encoder Representations from Transformers) (Devlin et al., 2019) for automatic machine translation evaluation.
no code implementations • ACL 2019 • Daiki Nishihara, Tomoyuki Kajiwara, Yuki Arase
Our text simplification method succeeds in translating an input into a specific grade level by considering levels of both sentences and words.
no code implementations • ACL 2019 • Tomoyuki Kajiwara
Paraphrase generation can be regarded as monolingual translation.
no code implementations • 31 May 2019 • Tomoyuki Kajiwara, Chihiro Tanikawa, Yuujin Shimizu, Chenhui Chu, Takashi Yamashiro, Hajime Nagahara
We work on the task of automatically designing a treatment plan from the findings included in the medical certificate written by the dentist.
1 code implementation • WS 2018 • Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the RUSE metric for the WMT18 metrics shared task.
1 code implementation • WS 2018 • Masahiro Kaneko, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the TMU systems for the second language acquisition modeling shared task 2018 (Settles et al., 2018).
no code implementations • WS 2018 • Tomoyuki Kajiwara, Mamoru Komachi
We introduce the TMU systems for the Complex Word Identification (CWI) Shared Task 2018.
no code implementations • NAACL 2018 • Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
Sentence representations can capture a wide range of information that cannot be captured by local features based on character or word N-grams.
no code implementations • IJCNLP 2017 • Tomoyuki Kajiwara, Atsushi Fujita
This paper examines the usefulness of semantic features based on word alignments for estimating the quality of text simplification.
1 code implementation • IJCNLP 2017 • Tomoyuki Kajiwara, Mamoru Komachi, Daichi Mochihashi
We present a pointwise mutual information (PMI)-based approach to formalize paraphrasability and propose a variant of PMI, called MIPA, for paraphrase acquisition.
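Plain PMI, the quantity this work builds on, can be sketched from co-occurrence counts as follows. The toy corpus and pair structure are invented for illustration; MIPA itself is the paper's weighted variant and is not reproduced here:

```python
import math
from collections import Counter

# Toy corpus of (word, candidate-paraphrase) co-occurrence pairs.
pairs = [("buy", "purchase"), ("buy", "purchase"), ("buy", "sell"),
         ("acquire", "purchase"), ("acquire", "obtain")]

joint = Counter(pairs)
left = Counter(x for x, _ in pairs)
right = Counter(y for _, y in pairs)
n = len(pairs)

def pmi(x, y):
    """Pointwise mutual information: log p(x, y) / (p(x) * p(y))."""
    return math.log((joint[(x, y)] / n) / ((left[x] / n) * (right[y] / n)))

score = pmi("buy", "purchase")
print(round(score, 4))  # -> 0.1054
```

A positive PMI means the pair co-occurs more often than independence would predict, which is the intuition behind using it as a paraphrasability signal.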
no code implementations • WS 2017 • Yuuki Sekizawa, Tomoyuki Kajiwara, Mamoru Komachi
Neural machine translation (NMT) produces sentences that are more fluent than those produced by statistical machine translation (SMT).
no code implementations • COLING 2016 • Tomoyuki Kajiwara, Mamoru Komachi
To obviate the need for human annotation, we propose an unsupervised method that automatically builds the monolingual parallel corpus for text simplification using sentence similarity based on word embeddings.
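Sentence similarity from word embeddings, as used above to pair complex and simple sentences, can be sketched with averaged word vectors and cosine similarity. The tiny hand-made vectors below are purely illustrative; the paper's exact similarity measure may differ:

```python
import math

# Tiny hand-made word vectors; a real system would use pretrained embeddings.
vecs = {"the": [0.1, 0.0], "cat": [0.9, 0.3], "feline": [0.8, 0.4],
        "runs": [0.2, 0.9], "sleeps": [0.1, 0.8]}

def sent_vec(tokens):
    """Average the word vectors of a tokenized sentence."""
    dims = len(next(iter(vecs.values())))
    v = [0.0] * dims
    for t in tokens:
        for i, x in enumerate(vecs[t]):
            v[i] += x
    return [x / len(tokens) for x in v]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

sim = cosine(sent_vec(["the", "cat", "runs"]),
             sent_vec(["the", "feline", "sleeps"]))
print(sim > 0.9)  # -> True: near-paraphrases score high
```

High-similarity sentence pairs drawn from comparable complex and simple corpora can then serve as pseudo-parallel training data, removing the need for human-annotated alignments.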