1 code implementation • LREC 2022 • António Branco, João Ricardo Silva, Luís Gomes, João António Rodrigues
This paper presents a new collection of quality language resources for the computational processing of the Portuguese language under the Universal Dependencies framework (UD).
no code implementations • 8 Apr 2024 • Tomás Osório, Bernardo Leite, Henrique Lopes Cardoso, Luís Gomes, João Rodrigues, Rodrigo Santos, António Branco
Similarly, the respective fine-tuned neural language models, developed with a low-rank adaptation approach, are made available as baselines that can stimulate future work on the neural processing of Portuguese.
no code implementations • 4 Mar 2024 • Rodrigo Santos, João Rodrigues, Luís Gomes, João Silva, António Branco, Henrique Lopes Cardoso, Tomás Freitas Osório, Bernardo Leite
To foster the neural encoding of Portuguese, this paper contributes foundation encoder models that represent an expansion of the still very scarce ecosystem of large language models specifically developed for this language that are fully open, in the sense that they are open source and openly distributed for free under an open license for any purpose, thus including research and commercial usages.
no code implementations • 29 Feb 2024 • Rodrigo Santos, João Silva, Luís Gomes, João Rodrigues, António Branco
To advance the neural decoding of Portuguese, in this paper we present a fully open Transformer-based, instruction-tuned decoder model that sets a new state of the art in this respect.
no code implementations • 11 May 2023 • João Rodrigues, Luís Gomes, João Silva, António Branco, Rodrigo Santos, Henrique Lopes Cardoso, Tomás Osório
To advance the neural encoding of Portuguese (PT), and a fortiori the technological preparation of this language for the digital age, we developed a Transformer-based foundation model that sets a new state of the art in this respect for two of its variants, namely European Portuguese from Portugal (PT-PT) and American Portuguese from Brazil (PT-BR).