Search Results for author: Changbing Yang

Found 14 papers, 3 papers with code

Unsupervised Paradigm Clustering Using Transformation Rules

no code implementations • ACL (SIGMORPHON) 2021 • Changbing Yang, Garrett Nicolai, Miikka Silfverberg

Secondly, we experiment with more general rules which can apply transformations inside the input strings in addition to prefix and suffix transformations.

Clustering Task 2

Morphological Processing of Low-Resource Languages: Where We Are and What’s Next

no code implementations • Findings (ACL) 2022 • Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.

Generalizing Morphological Inflection Systems to Unseen Lemmas

no code implementations • NAACL (SIGMORPHON) 2022 • Changbing Yang, Ruixin (Ray) Yang, Garrett Nicolai, Miikka Silfverberg

This paper presents experiments on morphological inflection using data from the SIGMORPHON-UniMorph 2022 Shared Task 0: Generalization and Typologically Diverse Morphological Inflection.

Hallucination LEMMA +1

An Inflectional Database for Gitksan

1 code implementation • LREC 2022 • Bruce Oliver, Clarissa Forbes, Changbing Yang, Farhan Samir, Edith Coates, Garrett Nicolai, Miikka Silfverberg

We use Gitksan data in interlinear glossed format, stemming from language documentation efforts, to build a database of partial inflection tables.

Data Augmentation Hallucination +1

IGT2P: From Interlinear Glossed Texts to Paradigms

no code implementations • EMNLP 2020 • Sarah Moeller, Ling Liu, Changbing Yang, Katharina Kann, Mans Hulden

An intermediate step in the linguistic analysis of an under-documented language is to find and organize inflected forms that are attested in natural speech.

POS

Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Under-Documented Languages

no code implementations • Findings (ACL) 2022 • Clarissa Forbes, Farhan Samir, Bruce Oliver, Changbing Yang, Edith Coates, Garrett Nicolai, Miikka Silfverberg

With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available.

Penalizing Divergence: Multi-Parallel Translation for Low-Resource Languages of North America

no code implementations • COLING 2022 • Garrett Nicolai, Changbing Yang, Miikka Silfverberg

Experiments on very low-resourced Indigenous North American languages show that an initially deficient multilingual translator can improve by 4.9 BLEU through mBART pre-training, and 5.5 BLEU points with the strategic addition of monolingual data, and that a divergence penalty leads to further increases of 0.4 BLEU.

Machine Translation Translation

Embedded Translations for Low-resource Automated Glossing

no code implementations • 13 Mar 2024 • Changbing Yang, Garrett Nicolai, Miikka Silfverberg

Aided by these enhancements, our model demonstrates an average improvement of 3.97 percentage points over the previous state of the art on datasets from the SIGMORPHON 2023 Shared Task on Interlinear Glossing.

Decoder Translation

The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language

1 code implementation • 14 Nov 2023 • Jian Zhu, Changbing Yang, Farhan Samir, Jahurul Islam

In this project, we demonstrate that phoneme-based models for speech processing can achieve strong crosslinguistic generalizability to unseen languages.

Keyword Spotting

An Investigation of Noise in Morphological Inflection

1 code implementation • 26 May 2023 • Adam Wiemerslage, Changbing Yang, Garrett Nicolai, Miikka Silfverberg, Katharina Kann

We aim at closing this gap by investigating the types of noise encountered within a pipeline for truly unsupervised morphological paradigm completion and its impact on morphological inflection systems: First, we propose an error taxonomy and annotation pipeline for inflection training data.

Language Modelling Masked Language Modeling +1

Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Underdocumented Languages

no code implementations • 17 Mar 2022 • Clarissa Forbes, Farhan Samir, Bruce Harold Oliver, Changbing Yang, Edith Coates, Garrett Nicolai, Miikka Silfverberg

With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available.

Morphological Processing of Low-Resource Languages: Where We Are and What's Next

no code implementations • 16 Mar 2022 • Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

CLiMP: A Benchmark for Chinese Language Model Evaluation

no code implementations • EACL 2021 • Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann

CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phenomena.

Language Modelling

Linguist vs. Machine: Rapid Development of Finite-State Morphological Grammars

no code implementations • WS 2020 • Sarah Beemer, Zak Boston, April Bukoski, Daniel Chen, Princess Dickens, Andrew Gerlach, Torin Hopkins, Parth Jawale, Chris Koski, Akanksha Malhotra, Piyush Mishra, Saliha Muradoglu, Lan Sang, Tyler Short, Sagarika Shreevastava, Elizabeth Spaulding, Testumichi Umada, Beilei Xiang, Changbing Yang, Mans Hulden

Sequence-to-sequence models have proven to be highly successful in learning morphological inflection from examples as the series of SIGMORPHON/CoNLL shared tasks have shown.

Morphological Inflection
