13 Apr 2024 • Tomáš Sourada, Jana Straková, Rudolf Rosa
For testing in out-of-vocabulary (OOV) conditions, we automatically extracted a large dataset of nouns in the morphologically rich Czech language, with lemma-disjoint data splits, and we further manually annotated a real-world OOV dataset of neologisms.
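
A lemma-disjoint split can be pictured as follows: group all inflected forms by lemma, then assign whole lemmas to train, dev, or test, so that no lemma seen in training reappears at evaluation time. The sketch below illustrates that reading only and is not the authors' extraction pipeline. The `(lemma, tag, form)` record layout, the function name `lemma_disjoint_split`, the tag strings, and the split fractions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def lemma_disjoint_split(records, dev_frac=0.1, test_frac=0.1, seed=0):
    """Split (lemma, tag, form) records so that no lemma spans two splits.

    The record layout and default fractions are hypothetical.
    """
    # Collect every record under its lemma.
    by_lemma = defaultdict(list)
    for lemma, tag, form in records:
        by_lemma[lemma].append((lemma, tag, form))

    # Shuffle lemmas (not individual forms) with a fixed seed.
    lemmas = sorted(by_lemma)
    random.Random(seed).shuffle(lemmas)

    n_test = int(len(lemmas) * test_frac)
    n_dev = int(len(lemmas) * dev_frac)
    parts = {
        "test": lemmas[:n_test],
        "dev": lemmas[n_test:n_test + n_dev],
        "train": lemmas[n_test + n_dev:],
    }

    # Each split receives all forms of the lemmas assigned to it.
    return {
        name: [rec for lemma in chosen for rec in by_lemma[lemma]]
        for name, chosen in parts.items()
    }


# Toy usage with a few Czech nouns; the tag labels are illustrative only.
example = [
    ("hrad", "Gen;Sg", "hradu"),
    ("hrad", "Nom;Pl", "hrady"),
    ("žena", "Gen;Sg", "ženy"),
    ("žena", "Dat;Sg", "ženě"),
    ("město", "Gen;Sg", "města"),
    ("město", "Loc;Sg", "městě"),
]
splits = lemma_disjoint_split(example, dev_frac=0.34, test_frac=0.34)
```

Splitting by lemma rather than by individual form is what makes the evaluation an OOV setting: a model cannot succeed on a test item by recalling another form of the same lemma from training.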