no code implementations • Findings (NAACL) 2022 • Jesin James, Vithya Yogarajan, Isabella Shields, Catherine Watson, Peter Keegan, Keoni Mahelona, Peter-Lucas Jones
We also show that a BiLSTM with pre-trained Māori-English sub-word embeddings outperforms large-scale contextual language models such as BERT on the downstream task of detecting Māori-language text.
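The paper's detector pairs sub-word embeddings with a bidirectional LSTM. A minimal NumPy sketch of that idea is below; the vocabulary, dimensions, and randomly initialised (untrained) weights are illustrative assumptions, not the authors' actual model, and real systems would learn the weights and use genuine pre-trained sub-word vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, HIDDEN = 16, 8

# Hypothetical sub-word vocabulary and "pre-trained" embedding table.
vocab = {"kia": 0, "ora": 1, "hello": 2, "world": 3}
emb = rng.normal(size=(len(vocab), EMB_DIM))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b):
    """Run one LSTM direction over x of shape (T, d_in); return all hidden states."""
    H = U.shape[1]
    h, c, hs = np.zeros(H), np.zeros(H), []
    for t in range(x.shape[0]):
        z = W @ x[t] + U @ h + b          # stacked pre-activations, shape (4H,)
        i, f, o, g = np.split(z, 4)       # input, forget, output gates + candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
        hs.append(h)
    return np.stack(hs)                   # (T, H)

def init_lstm(d_in, H):
    return (rng.normal(scale=0.1, size=(4 * H, d_in)),
            rng.normal(scale=0.1, size=(4 * H, H)),
            np.zeros(4 * H))

fwd, bwd = init_lstm(EMB_DIM, HIDDEN), init_lstm(EMB_DIM, HIDDEN)
W_out = rng.normal(scale=0.1, size=(2 * HIDDEN,))

def detect(tokens):
    """Per-token probability that the token is Māori (untrained, illustrative)."""
    x = emb[[vocab[t] for t in tokens]]           # (T, EMB_DIM)
    hf = lstm_forward(x, *fwd)                    # left-to-right pass
    hb = lstm_forward(x[::-1], *bwd)[::-1]        # right-to-left pass, re-aligned
    return sigmoid(np.concatenate([hf, hb], axis=1) @ W_out)

probs = detect(["kia", "ora", "hello"])           # one probability per token
```

Concatenating the two directions gives each token context from both sides, which is what lets the model disambiguate code-switched Māori-English text.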
no code implementations • 3 Dec 2023 • Vithya Yogarajan, Gillian Dobbie, Te Taka Keegan, Rostam J. Neuwirth
The importance and novelty of this survey lie in its exploration of the perspectives of under-represented societies.
no code implementations • 11 Sep 2023 • Vithya Yogarajan, Gillian Dobbie, Timothy Pistotti, Joshua Bensemann, Kobe Knowles
Recent advances in artificial intelligence, including the development of highly sophisticated large language models (LLMs), have proven beneficial in many real-world applications.
1 code implementation • 5 May 2023 • Kobe Knowles, Joshua Bensemann, Diana Benavides-Prado, Vithya Yogarajan, Michael Witbrock, Gillian Dobbie, Yang Chen
We introduce a novel architecture, the Neuromodulation Gated Transformer (NGT), which is a simple implementation of neuromodulation in transformers via a multiplicative effect.
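The snippet above describes the core mechanism as a multiplicative gate on a transformer sub-layer. A minimal NumPy sketch of that idea, assuming a single-head self-attention sub-layer whose output is scaled element-wise by a learned sigmoid gate (the weight shapes and placement of the gate are assumptions for illustration, not the paper's exact NGT configuration):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(X, Wq, Wk, Wv, Wg):
    """Self-attention whose output is modulated by a multiplicative sigmoid gate."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]))   # (T, T) attention weights
    out = A @ V                                  # standard attention output
    gate = sigmoid(X @ Wg)                       # neuromodulation signal in (0, 1)
    return gate * out                            # multiplicative modulation

rng = np.random.default_rng(1)
d = 8
X = rng.normal(size=(5, d))                      # 5 tokens, dimension d
Wq, Wk, Wv, Wg = (rng.normal(scale=0.3, size=(d, d)) for _ in range(4))
Y = gated_attention(X, Wq, Wk, Wv, Wg)
```

Because the gate lies in (0, 1), it can only attenuate the sub-layer output, letting the model suppress activations per token and per dimension.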
no code implementations • 17 Apr 2023 • Vithya Yogarajan, Gillian Dobbie, Henry Gouk
This paper presents an indigenous perspective on the effectiveness of debiasing techniques for pre-trained language models (PLMs).
no code implementations • 21 Aug 2022 • Jesin James, Isabella Shields, Vithya Yogarajan, Peter J. Keegan, Catherine Watson, Peter-Lucas Jones, Keoni Mahelona
Reports of the New Zealand Parliament Hansard debates were used to build the database.
1 code implementation • 3 Dec 2021 • Vithya Yogarajan, Bernhard Pfahringer, Tony Smith, Jacob Montiel
Improving tail-end label predictions in multi-label classification of medical text offers the potential to understand patients better and improve care.
1 code implementation • 1 Oct 2021 • Vithya Yogarajan, Jacob Montiel, Tony Smith, Bernhard Pfahringer
This study focuses on techniques used for the multi-label classification of medical text.
2 code implementations • 29 Mar 2020 • Vithya Yogarajan, Jacob Montiel, Tony Smith, Bernhard Pfahringer
We also show that high-dimensional embeddings pre-trained on health-related data yield a significant improvement in a multi-label setting, just as they do for binary classification.
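The multi-label setting in these papers means each clinical note can carry several diagnosis codes at once. A minimal NumPy sketch of the standard formulation, assuming mean-pooled pre-trained embeddings and one independent sigmoid per label (the vocabulary, label count, and untrained random weights are illustrative assumptions, not the authors' models):

```python
import numpy as np

rng = np.random.default_rng(2)
EMB_DIM, N_LABELS = 32, 5            # e.g. 5 ICD-style codes (illustrative)

# Stand-in for pre-trained word embeddings (health-domain vectors in the paper).
vocab = {"chest": 0, "pain": 1, "fever": 2, "cough": 3, "fracture": 4}
emb = rng.normal(size=(len(vocab), EMB_DIM))

W = rng.normal(scale=0.1, size=(EMB_DIM, N_LABELS))
b = np.zeros(N_LABELS)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_labels(tokens, threshold=0.5):
    """Embed a note by mean-pooling word vectors, then score each label independently."""
    x = emb[[vocab[t] for t in tokens]].mean(axis=0)
    probs = sigmoid(x @ W + b)       # one sigmoid per label: labels are not exclusive
    return probs, probs >= threshold

probs, preds = predict_labels(["chest", "pain", "fever"])
```

Using an independent sigmoid per label rather than one softmax is what makes the problem multi-label: any subset of codes can be active for a document.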
no code implementations • 27 Jan 2019 • Vithya Yogarajan, Bernhard Pfahringer, Michael Mayo
De-identification of electronic health records (EHR) is a vital step towards advancing health informatics research and maximising the use of available data.
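For readers new to the task, de-identification means removing or masking personal identifiers before records are shared. A toy rule-based scrubber using Python's standard `re` module is sketched below; the patterns are illustrative assumptions covering only a few identifier types, whereas the research above concerns far more robust machine-learned approaches.

```python
import re

# Toy patterns for a few identifier types (illustrative, not exhaustive).
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "[DATE]"),
    (re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.\s+[A-Z][a-z]+\b"), "[NAME]"),
]

def deidentify(text):
    """Replace matched identifiers with typed placeholders."""
    for pattern, tag in PATTERNS:
        text = pattern.sub(tag, text)
    return text

note = "Seen by Dr. Smith on 12/03/2019, SSN 123-45-6789."
clean = deidentify(note)
# clean == "Seen by [NAME] on [DATE], SSN [SSN]."
```

Typed placeholders (rather than blank deletions) preserve document structure, which keeps the de-identified text usable for downstream NLP research.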
no code implementations • 16 Oct 2018 • Vithya Yogarajan, Michael Mayo, Bernhard Pfahringer
The use of medical data, in the form of electronic health records, in research helps to develop and advance medical science.