no code implementations • COLING 2022 • Yuto Kuroda, Tomoyuki Kajiwara, Yuki Arase, Takashi Ninomiya
We propose a method to distill language-agnostic meaning embeddings from multilingual sentence encoders for unsupervised quality estimation of machine translation.
no code implementations • ACL (WAT) 2021 • YuTing Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
We introduce our TMEKU system submitted to the English-Japanese Multimodal Translation Task for WAT 2021.
no code implementations • WAT 2022 • Yuki Nakatani, Tomoyuki Kajiwara, Takashi Ninomiya
In text generation tasks such as machine translation, models are generally trained using cross-entropy loss.
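The standard objective mentioned here can be sketched as plain token-level cross-entropy. This is a toy illustration of that baseline loss (the vocabulary, probabilities, and function names are invented for the example), not the method proposed in the paper:

```python
import math

def cross_entropy(step_probs, target_ids):
    """Average negative log-likelihood of the reference tokens.

    step_probs: per-decoding-step distributions (dicts token_id -> prob);
    target_ids: the reference token sequence.
    """
    nll = 0.0
    for dist, tok in zip(step_probs, target_ids):
        nll += -math.log(dist[tok])
    return nll / len(target_ids)

# Two decoding steps over a three-token vocabulary:
steps = [{0: 0.7, 1: 0.2, 2: 0.1}, {0: 0.1, 1: 0.8, 2: 0.1}]
loss = cross_entropy(steps, [0, 1])
print(round(loss, 4))  # -> 0.2899
```

Minimizing this loss pushes the model to assign high probability to each reference token, which is exactly the training signal that sequence-level alternatives try to improve on.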
1 code implementation • sdp (COLING) 2022 • Hiroki Yamauchi, Tomoyuki Kajiwara, Marie Katsurai, Ikki Ohmukai, Takashi Ninomiya
We release a pretrained Japanese masked language model for an academic domain.
1 code implementation • Findings (EMNLP) 2021 • Junya Takayama, Tomoyuki Kajiwara, Yuki Arase
We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding the underlying intentions of users.
no code implementations • EMNLP 2021 • Han Huang, Tomoyuki Kajiwara, Yuki Arase
Definition generation techniques aim to generate a definition of a target word or phrase given a context.
no code implementations • LREC 2022 • Han Huang, Tomoyuki Kajiwara, Yuki Arase
This study investigates and releases JADE, a corpus for Japanese definition modelling, a technique that automatically generates definitions of a given target word or phrase.

1 code implementation • LREC 2022 • Kazuki Tani, Ryoya Yuasa, Kazuki Takikawa, Akihiro Tamura, Tomoyuki Kajiwara, Takashi Ninomiya, Tsuneo Kato
Therefore, we create a benchmark test dataset for Japanese-to-English MLCC-MT from the Newsela corpus by introducing automatic filtering of data with inappropriate sentence-level complexity, a manual check of parallel target-language sentences with different complexity levels, and manual translation.
no code implementations • WMT (EMNLP) 2020 • Akifumi Nakamachi, Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the TMUOU submission for the WMT20 Quality Estimation Shared Task 1: Sentence-Level Direct Assessment.
1 code implementation • EMNLP 2021 • Nattapong Tiyajamorn, Tomoyuki Kajiwara, Yuki Arase, Makoto Onizuka
Experimental results on both quality estimation of machine translation and cross-lingual semantic textual similarity tasks reveal that our method consistently outperforms the strong baselines using the original multilingual embedding.
1 code implementation • EMNLP (WNUT) 2020 • Sora Ohashi, Tomoyuki Kajiwara, Chenhui Chu, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We introduce the IDSOU submission for the WNUT-2020 task 2: identification of informative COVID-19 English Tweets.
1 code implementation • Findings (EMNLP) 2021 • Yuki Arase, Tomoyuki Kajiwara
The results confirm that our representations performed competitively with the state-of-the-art method that transforms contextualised representations on context-aware lexical semantic tasks, and outperformed it on STS estimation.
1 code implementation • LREC 2022 • Haruya Suzuki, Yuto Miyauchi, Kazuki Akiyama, Tomoyuki Kajiwara, Takashi Ninomiya, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We annotate 35,000 SNS posts with both the writer's subjective sentiment polarity labels and the reader's objective ones to construct a Japanese sentiment analysis dataset.
no code implementations • 9 Nov 2023 • Yuto Kuroda, Atsushi Fujita, Tomoyuki Kajiwara, Takashi Ninomiya
In this paper, we extensively investigate the usefulness of synthetic TQE data and pre-trained multilingual encoders in unsupervised sentence-level TQE, both of which have been proven effective in the supervised training scenarios.
1 code implementation • 21 Oct 2022 • Yuki Arase, Satoru Uchida, Tomoyuki Kajiwara
Controllable text simplification is a crucial assistive technique for language learning and teaching.
no code implementations • ACL 2021 • Sora Kadotani, Tomoyuki Kajiwara, Yuki Arase, Makoto Onizuka
Curriculum learning has improved the quality of neural machine translation, where only source-side features are considered in the metrics to determine the difficulty of translation.
1 code implementation • ACL 2021 • Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Yuki Arase
Few-shot text classification aims to classify inputs whose label has only a few examples.
1 code implementation • NAACL 2021 • Tomoyuki Kajiwara, Chenhui Chu, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We annotate 17,000 SNS posts with both the writer's subjective emotional intensity and the reader's objective one to construct a Japanese emotion analysis dataset.
no code implementations • AACL 2020 • Akifumi Nakamachi, Tomoyuki Kajiwara, Yuki Arase
We optimize rewards of reinforcement learning in text simplification using metrics that are highly correlated with human-perspectives.
1 code implementation • COLING 2020 • Ryoma Yoshimura, Masahiro Kaneko, Tomoyuki Kajiwara, Mamoru Komachi
We propose a reference-less metric trained on manual evaluations of system outputs for grammatical error correction (GEC).
no code implementations • COLING 2020 • Sora Ohashi, Mao Isogawa, Tomoyuki Kajiwara, Yuki Arase
We reduce the model size of pre-trained word embeddings by a factor of 200 while preserving their quality.
1 code implementation • EAMT 2020 • YuTing Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu
In contrast, we propose the application of semantic image regions for MNMT by integrating visual and textual features using two individual attention mechanisms (double attention).
no code implementations • ACL 2020 • Sora Ohashi, Junya Takayama, Tomoyuki Kajiwara, Chenhui Chu, Yuki Arase
Advanced pre-trained models for text representation have achieved state-of-the-art performance on various text classification tasks.
no code implementations • LREC 2020 • Masato Yoshinaka, Tomoyuki Kajiwara, Yuki Arase
To estimate the likelihood of phrase alignments, SAPPHIRE uses phrase embeddings that are hierarchically composed of word embeddings.
no code implementations • LREC 2020 • Yuki Arase, Tomoyuki Kajiwara, Chenhui Chu
The dataset we present in this paper is unique for the richness of annotated information, including detailed descriptions of drug reactions with full context.
no code implementations • LREC 2020 • Daiki Nishihara, Tomoyuki Kajiwara
We introduce three language resources for Japanese lexical simplification: 1) a large-scale word complexity lexicon, 2) the first synonym lexicon for converting complex words to simpler ones, and 3) the first toolkit for developing and benchmarking Japanese lexical simplification systems.
no code implementations • WS 2019 • Kazuki Ashihara, Tomoyuki Kajiwara, Yuki Arase, Satoru Uchida
Herein we propose a method that combines these two approaches to contextualize word embeddings for lexical substitution.
no code implementations • 29 Jul 2019 • Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the metric using BERT (Bidirectional Encoder Representations from Transformers) (Devlin et al., 2019) for automatic machine translation evaluation.
no code implementations • ACL 2019 • Daiki Nishihara, Tomoyuki Kajiwara, Yuki Arase
Our text simplification method succeeds in translating an input into a specific grade level by considering levels of both sentences and words.
no code implementations • ACL 2019 • Tomoyuki Kajiwara
Paraphrase generation can be regarded as monolingual translation.
no code implementations • 31 May 2019 • Tomoyuki Kajiwara, Chihiro Tanikawa, Yuujin Shimizu, Chenhui Chu, Takashi Yamashiro, Hajime Nagahara
We work on the task of automatically designing a treatment plan from the findings included in the medical certificate written by the dentist.
1 code implementation • WS 2018 • Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the RUSE metric for the WMT18 metrics shared task.
1 code implementation • WS 2018 • Masahiro Kaneko, Tomoyuki Kajiwara, Mamoru Komachi
We introduce the TMU systems for the second language acquisition modeling shared task 2018 (Settles et al., 2018).
no code implementations • WS 2018 • Tomoyuki Kajiwara, Mamoru Komachi
We introduce the TMU systems for the Complex Word Identification (CWI) Shared Task 2018.
no code implementations • NAACL 2018 • Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi
Sentence representations can capture a wide range of information that cannot be captured by local features based on character or word N-grams.
no code implementations • IJCNLP 2017 • Tomoyuki Kajiwara, Atsushi Fujita
This paper examines the usefulness of semantic features based on word alignments for estimating the quality of text simplification.
1 code implementation • IJCNLP 2017 • Tomoyuki Kajiwara, Mamoru Komachi, Daichi Mochihashi
We present a pointwise mutual information (PMI)-based approach to formalize paraphrasability and propose a variant of PMI, called MIPA, for paraphrase acquisition.
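Plain PMI, the quantity this work builds on, can be sketched from co-occurrence counts as follows. The toy corpus and pair structure are invented for illustration; MIPA itself is the paper's weighted variant and is not reproduced here:

```python
import math
from collections import Counter

# Toy corpus of (word, candidate-paraphrase) co-occurrence pairs.
pairs = [("buy", "purchase"), ("buy", "purchase"), ("buy", "sell"),
         ("acquire", "purchase"), ("acquire", "obtain")]

joint = Counter(pairs)
left = Counter(x for x, _ in pairs)
right = Counter(y for _, y in pairs)
n = len(pairs)

def pmi(x, y):
    """Pointwise mutual information: log p(x, y) / (p(x) * p(y))."""
    return math.log((joint[(x, y)] / n) / ((left[x] / n) * (right[y] / n)))

score = pmi("buy", "purchase")
print(round(score, 4))  # -> 0.1054
```

A positive PMI means the pair co-occurs more often than independence would predict, which is the intuition behind using it as a paraphrasability signal.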
no code implementations • WS 2017 • Yuuki Sekizawa, Tomoyuki Kajiwara, Mamoru Komachi
Neural machine translation (NMT) produces sentences that are more fluent than those produced by statistical machine translation (SMT).
no code implementations • COLING 2016 • Tomoyuki Kajiwara, Mamoru Komachi
To obviate the need for human annotation, we propose an unsupervised method that automatically builds the monolingual parallel corpus for text simplification using sentence similarity based on word embeddings.
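Sentence similarity from word embeddings, as used above to pair complex and simple sentences, can be sketched with averaged word vectors and cosine similarity. The tiny hand-made vectors below are purely illustrative; the paper's exact similarity measure may differ:

```python
import math

# Tiny hand-made word vectors; a real system would use pretrained embeddings.
vecs = {"the": [0.1, 0.0], "cat": [0.9, 0.3], "feline": [0.8, 0.4],
        "runs": [0.2, 0.9], "sleeps": [0.1, 0.8]}

def sent_vec(tokens):
    """Average the word vectors of a tokenized sentence."""
    dims = len(next(iter(vecs.values())))
    v = [0.0] * dims
    for t in tokens:
        for i, x in enumerate(vecs[t]):
            v[i] += x
    return [x / len(tokens) for x in v]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

sim = cosine(sent_vec(["the", "cat", "runs"]),
             sent_vec(["the", "feline", "sleeps"]))
print(sim > 0.9)  # -> True: near-paraphrases score high
```

High-similarity sentence pairs drawn from comparable complex and simple corpora can then serve as pseudo-parallel training data, removing the need for human-annotated alignments.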