no code implementations • ACL (SIGMORPHON) 2021 • Changbing Yang, Garrett Nicolai, Miikka Silfverberg
Secondly, we experiment with more general rules which can apply transformations inside the input strings in addition to prefix and suffix transformations.
no code implementations • Findings (ACL) 2022 • Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann
Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.
no code implementations • NAACL (SIGMORPHON) 2022 • Changbing Yang, Ruixin (Ray) Yang, Garrett Nicolai, Miikka Silfverberg
This paper presents experiments on morphological inflection using data from the SIGMORPHON-UniMorph 2022 Shared Task 0: Generalization and Typologically Diverse Morphological Inflection.
1 code implementation • LREC 2022 • Bruce Oliver, Clarissa Forbes, Changbing Yang, Farhan Samir, Edith Coates, Garrett Nicolai, Miikka Silfverberg
We use Gitksan data in interlinear glossed format, stemming from language documentation efforts, to build a database of partial inflection tables.
no code implementations • EMNLP 2020 • Sarah Moeller, Ling Liu, Changbing Yang, Katharina Kann, Mans Hulden
An intermediate step in the linguistic analysis of an under-documented language is to find and organize inflected forms that are attested in natural speech.
no code implementations • Findings (ACL) 2022 • Clarissa Forbes, Farhan Samir, Bruce Oliver, Changbing Yang, Edith Coates, Garrett Nicolai, Miikka Silfverberg
With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available.
no code implementations • COLING 2022 • Garrett Nicolai, Changbing Yang, Miikka Silfverberg
Experiments on very low-resourced Indigenous North American languages show that an initially deficient multilingual translator can improve by 4.9 BLEU through mBART pre-training, and 5.5 BLEU points with the strategic addition of monolingual data, and that a divergence penalty leads to further increases of 0.4 BLEU.
no code implementations • 13 Mar 2024 • Changbing Yang, Garrett Nicolai, Miikka Silfverberg
Aided by these enhancements, our model demonstrates an average improvement of 3.97%-points over the previous state of the art on datasets from the SIGMORPHON 2023 Shared Task on Interlinear Glossing.
1 code implementation • 14 Nov 2023 • Jian Zhu, Changbing Yang, Farhan Samir, Jahurul Islam
In this project, we demonstrate that phoneme-based models for speech processing can achieve strong crosslinguistic generalizability to unseen languages.
1 code implementation • 26 May 2023 • Adam Wiemerslage, Changbing Yang, Garrett Nicolai, Miikka Silfverberg, Katharina Kann
We aim at closing this gap by investigating the types of noise encountered within a pipeline for truly unsupervised morphological paradigm completion and its impact on morphological inflection systems: First, we propose an error taxonomy and annotation pipeline for inflection training data.
no code implementations • 17 Mar 2022 • Clarissa Forbes, Farhan Samir, Bruce Harold Oliver, Changbing Yang, Edith Coates, Garrett Nicolai, Miikka Silfverberg
With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available.
no code implementations • 16 Mar 2022 • Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann
Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.
no code implementations • EACL 2021 • Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann
CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phenomena.
no code implementations • WS 2020 • Sarah Beemer, Zak Boston, April Bukoski, Daniel Chen, Princess Dickens, Andrew Gerlach, Torin Hopkins, Parth Jawale, Chris Koski, Akanksha Malhotra, Piyush Mishra, Saliha Muradoglu, Lan Sang, Tyler Short, Sagarika Shreevastava, Elizabeth Spaulding, Testumichi Umada, Beilei Xiang, Changbing Yang, Mans Hulden
Sequence-to-sequence models have proven to be highly successful in learning morphological inflection from examples as the series of SIGMORPHON/CoNLL shared tasks have shown.