no code implementations • 28 May 2024 • Gili Goldin, Nick Howell, Noam Ordan, Ella Rabinovich, Shuly Wintner
We present the Knesset Corpus, a corpus of Hebrew parliamentary proceedings containing over 30 million sentences (over 384 million tokens) from all plenary and committee protocols of the Israeli parliament between 1998 and 2022.
no code implementations • EMNLP 2018 • Gili Goldin, Ella Rabinovich, Shuly Wintner
We address the task of native language identification in the context of social media content, where authors are highly fluent, advanced nonnative speakers of English.