Search Results for author: Changbing Yang

Found 14 papers, 3 papers with code

IGT2P: From Interlinear Glossed Texts to Paradigms

no code implementations EMNLP 2020 Sarah Moeller, Ling Liu, Changbing Yang, Katharina Kann, Mans Hulden

An intermediate step in the linguistic analysis of an under-documented language is to find and organize inflected forms that are attested in natural speech.

POS

Morphological Processing of Low-Resource Languages: Where We Are and What’s Next

no code implementations Findings (ACL) 2022 Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.

Penalizing Divergence: Multi-Parallel Translation for Low-Resource Languages of North America

no code implementations COLING 2022 Garrett Nicolai, Changbing Yang, Miikka Silfverberg

Experiments on very low-resourced Indigenous North American languages show that an initially deficient multilingual translator can improve by 4.9 BLEU through mBART pre-training, and 5.5 BLEU points with the strategic addition of monolingual data, and that a divergence penalty leads to further increases of 0.4 BLEU.

Machine Translation, Translation

An Inflectional Database for Gitksan

1 code implementation LREC 2022 Bruce Oliver, Clarissa Forbes, Changbing Yang, Farhan Samir, Edith Coates, Garrett Nicolai, Miikka Silfverberg

We use Gitksan data in interlinear glossed format, stemming from language documentation efforts, to build a database of partial inflection tables.

Data Augmentation, Hallucination +1

Unsupervised Paradigm Clustering Using Transformation Rules

no code implementations ACL (SIGMORPHON) 2021 Changbing Yang, Garrett Nicolai, Miikka Silfverberg

Secondly, we experiment with more general rules which can apply transformations inside the input strings in addition to prefix and suffix transformations.

Clustering, Task 2

Generalizing Morphological Inflection Systems to Unseen Lemmas

no code implementations NAACL (SIGMORPHON) 2022 Changbing Yang, Ruixin (Ray) Yang, Garrett Nicolai, Miikka Silfverberg

This paper presents experiments on morphological inflection using data from the SIGMORPHON-UniMorph 2022 Shared Task 0: Generalization and Typologically Diverse Morphological Inflection.

Hallucination, LEMMA +1

Embedded Translations for Low-resource Automated Glossing

no code implementations 13 Mar 2024 Changbing Yang, Garrett Nicolai, Miikka Silfverberg

Aided by these enhancements, our model demonstrates an average improvement of 3.97%-points over the previous state of the art on datasets from the SIGMORPHON 2023 Shared Task on Interlinear Glossing.

Translation

The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language

1 code implementation 14 Nov 2023 Jian Zhu, Changbing Yang, Farhan Samir, Jahurul Islam

In this project, we demonstrate that phoneme-based models for speech processing can achieve strong crosslinguistic generalizability to unseen languages.

Keyword Spotting

An Investigation of Noise in Morphological Inflection

1 code implementation 26 May 2023 Adam Wiemerslage, Changbing Yang, Garrett Nicolai, Miikka Silfverberg, Katharina Kann

We aim at closing this gap by investigating the types of noise encountered within a pipeline for truly unsupervised morphological paradigm completion and its impact on morphological inflection systems: First, we propose an error taxonomy and annotation pipeline for inflection training data.

Language Modelling, Masked Language Modeling +1

Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Underdocumented Languages

no code implementations 17 Mar 2022 Clarissa Forbes, Farhan Samir, Bruce Harold Oliver, Changbing Yang, Edith Coates, Garrett Nicolai, Miikka Silfverberg

With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available.

Morphological Processing of Low-Resource Languages: Where We Are and What's Next

no code implementations 16 Mar 2022 Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.

CLiMP: A Benchmark for Chinese Language Model Evaluation

no code implementations EACL 2021 Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann

CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phenomena.

Language Modelling
