Subjecthood and annotation: The cases of French and Wolof

This article considers the annotation of subjects in UD treebanks. The identification of the subject poses a particular problem in Wolof, due to pronominal indices whose status as a pronoun or a pronominal affix is uncertain. In the UD treebank available for Wolof (Dione, 2019), these have been annotated depending on the construction either as true subjects, or as morphosyntactic features agreeing with the verb. The study of this corpus of 40 000 words allows us to show that the problem is indeed difficult to solve, especially since Wolof has a rich system of auxiliaries and several basic constructions with different properties. Before addressing the case of Wolof, we will present the simpler, but partly comparable, case of French, where subject clitics also tend to behave like affixes, and subjecthood can move from the preverbal to the detached position. We will also make a several annotation recommendations that would avoid overwriting information regarding subjecthood.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here