no code implementations • Findings (NAACL) 2022 • Peide Zhu, Claudia Hauff
We test our approaches on three large public datasets with different domain similarities, using a transformer-based pre-trained QG model.
1 code implementation • 23 May 2023 • Samarth Bhargav, Anne Schuth, Claudia Hauff
We present a study of Tip-of-the-tongue (ToT) retrieval for music, where a searcher is trying to find an existing music entity, but is unable to succeed as they cannot accurately recall important identifying information.
no code implementations • 21 Apr 2023 • Nirmal Roy, Agathe Balayn, David Maxwell, Claudia Hauff
The creation of relevance assessments by human assessors (often nowadays crowdworkers) is a vital step when building IR test collections.
no code implementations • 13 Apr 2023 • Guglielmo Faggioli, Laura Dietz, Charles Clarke, Gianluca Demartini, Matthias Hagen, Claudia Hauff, Noriko Kando, Evangelos Kanoulas, Martin Potthast, Benno Stein, Henning Wachsmuth
When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments, but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems.
1 code implementation • 13 Jan 2023 • Gustavo Penha, Claudia Hauff
A number of learned sparse and dense retrieval approaches have recently been proposed and proven effective in tasks such as passage retrieval and document retrieval.
no code implementations • 26 Jul 2022 • Nirmal Roy, David Maxwell, Claudia Hauff
The Search Engine Results Page (SERP) has evolved significantly over the last two decades, moving away from the simple ten blue links paradigm to considerably more complex presentations that contain results from multiple verticals and granularities of textual information.
1 code implementation • 17 May 2022 • Arthur Câmara, Claudia Hauff
We show that, when using the most popular libraries for neural ranker research (i.e., PyTorch and Hugging Face's Transformers), the practice of loading all documents into main memory is not always the fastest option and is not feasible for setups with more than a couple of GPUs.
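The alternative to holding an entire collection in main memory is to stream documents from disk on demand. A minimal sketch of that idea (the file path and tab-separated format are hypothetical, not the paper's setup):

```python
def iter_documents(path):
    """Lazily stream (doc_id, text) pairs from a tab-separated file on disk,
    one document at a time, so memory use stays flat regardless of corpus size.
    The file layout here is illustrative: one "doc_id<TAB>text" record per line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            doc_id, _, text = line.rstrip("\n").partition("\t")
            yield doc_id, text
```

A generator like this can back a PyTorch `IterableDataset`, trading random access for a constant memory footprint.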
1 code implementation • 22 Apr 2022 • Gustavo Penha, Claudia Hauff
Ranking responses for a given dialogue context is a popular benchmark in which the setup is to re-rank the ground-truth response over a limited set of $n$ responses, where $n$ is typically 10.
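The standard metric in this re-ranking setup is the fraction of dialogue contexts for which the ground-truth response is ranked first among the $n$ candidates (often written $R_n@1$). A minimal sketch of that evaluation, with illustrative inputs:

```python
def recall_at_1(ranked_lists, ground_truths):
    """R_n@1 for response re-ranking (illustrative sketch): the fraction of
    dialogue contexts whose ground-truth response is ranked first among the
    n candidate responses returned by the model."""
    hits = sum(1 for ranked, truth in zip(ranked_lists, ground_truths)
               if ranked[0] == truth)
    return hits / len(ranked_lists)
```

For example, with two contexts where the model ranks the true response first only once, the metric is 0.5.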
no code implementations • 26 Jan 2022 • Arthur Câmara, David Maxwell, Claudia Hauff
Complex search tasks, such as those from the Search as Learning (SAL) domain, often result in users developing an information need composed of several aspects.
1 code implementation • 12 Jan 2022 • Arthur Câmara, Claudia Hauff
This means that the axiomatic approach to IR (and its extension of diagnostic datasets created for retrieval heuristics) may, in its current form, not be applicable to large-scale corpora.
1 code implementation • 29 Nov 2021 • Arthur Câmara, Nirmal Roy, David Maxwell, Claudia Hauff
Search engines are considered the primary tool to assist and empower learners in finding information relevant to their learning goals, be it learning something new, improving their existing skills, or just fulfilling a curiosity.
1 code implementation • 25 Nov 2021 • Gustavo Penha, Arthur Câmara, Claudia Hauff
Our experimental results across two datasets for two IR tasks reveal that retrieval pipelines are not robust to these query variations, with effectiveness drops of $\approx20\%$ on average.
no code implementations • EACL 2021 • Gustavo Penha, Claudia Hauff
According to the Probability Ranking Principle (PRP), ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad-hoc retrieval.
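The PRP itself is simple to state operationally: given an estimated probability of relevance per document, the optimal ad-hoc ranking sorts documents in decreasing order of that probability. A minimal sketch (the document ids and probabilities are illustrative):

```python
def prp_rank(doc_probs):
    """Probability Ranking Principle as a sorting rule: return document ids
    ordered by decreasing estimated probability of relevance."""
    return [doc for doc, _ in sorted(doc_probs.items(),
                                     key=lambda kv: kv[1], reverse=True)]

ranking = prp_rank({"d1": 0.2, "d2": 0.9, "d3": 0.55})
# ranking == ["d2", "d3", "d1"]
```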
1 code implementation • 12 Jan 2021 • Gustavo Penha, Claudia Hauff
Our experimental results on the ad-hoc retrieval task of conversation response ranking reveal that (i) BERT-based rankers are not robustly calibrated and that stochastic BERT-based rankers yield better calibration; and (ii) uncertainty estimation is beneficial both for risk-aware neural ranking, i.e., taking into account the uncertainty when ranking documents, and for predicting unanswerable conversational contexts.
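One common way to obtain such uncertainty estimates from a stochastic ranker is to run several forward passes with stochasticity (e.g. dropout) left enabled and summarise the spread of the resulting scores. A rough sketch of that pattern, where `score_fn` is a hypothetical stand-in for a stochastic scoring model:

```python
import statistics

def mc_relevance(score_fn, query, doc, n_samples=10):
    """Uncertainty estimation via repeated stochastic forward passes
    (illustrative sketch of the MC-dropout idea, not the paper's exact model).
    score_fn(query, doc) is assumed to return a possibly different relevance
    score on each call; we return the mean score and the sample variance as
    an uncertainty estimate."""
    scores = [score_fn(query, doc) for _ in range(n_samples)]
    return statistics.mean(scores), statistics.variance(scores)
```

A high variance then flags query-document pairs (or unanswerable contexts) where the ranker's score should be trusted less.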
1 code implementation • 15 Dec 2020 • Gustavo Penha, Claudia Hauff
Inspired by our investigation of LS in the context of neural L2R models, we propose a novel technique called Weakly Supervised Label Smoothing (WSLS) that takes advantage of the retrieval scores of the negative sampled documents as a weak supervision signal in the process of modifying the ground-truth labels.
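The core idea can be contrasted with uniform label smoothing, which spreads a fixed mass $\epsilon$ evenly over all candidates. A rough sketch of weighting that mass by the negatives' retrieval scores instead (an illustrative formulation, not necessarily the paper's exact one):

```python
import math

def wsls_labels(labels, retrieval_scores, eps=0.1):
    """Weakly supervised label smoothing, sketched: rather than distributing
    the smoothing mass eps uniformly, spread it over the candidate documents
    in proportion to their softmax-normalised retrieval scores, so that
    higher-scored (harder) negatives receive more of the smoothed mass."""
    exp_scores = [math.exp(s) for s in retrieval_scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    return [(1 - eps) * y + eps * w for y, w in zip(labels, weights)]
```

With one-hot input labels the smoothed labels still sum to 1, but negatives with stronger retrieval scores carry larger targets than weak ones.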
1 code implementation • EMNLP (scai) 2020 • Gustavo Penha, Claudia Hauff
Understanding when and why neural ranking models fail for an IR task via error analysis is an important part of the research cycle.
1 code implementation • 30 Jul 2020 • Gustavo Penha, Claudia Hauff
Overall, our analyses and experiments show that: (i) BERT has knowledge stored in its parameters about the content of books, movies and music; (ii) it has more content-based knowledge than collaborative-based knowledge; and (iii) it fails on conversational recommendation when faced with adversarial data.
1 code implementation • 18 Dec 2019 • Gustavo Penha, Claudia Hauff
Curriculum learning has recently been shown to improve neural models' effectiveness by sampling batches non-uniformly, going from easy to difficult instances during training.
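The non-uniform sampling can be sketched with a simple pacing scheme: order the training instances by a difficulty score and linearly grow the pool from which batches are drawn, from the easiest fraction up to the full set. This is a generic illustration of the curriculum-learning idea, not the paper's specific pacing or difficulty functions:

```python
import random

def curriculum_batches(instances, difficulty, batch_size, num_steps):
    """Illustrative curriculum sampler: sort instances by a caller-supplied
    difficulty score, then at each training step draw a batch uniformly from
    a candidate pool that grows linearly from the easiest instances to the
    whole training set."""
    ordered = sorted(instances, key=difficulty)
    for step in range(1, num_steps + 1):
        pool_size = max(batch_size, int(len(ordered) * step / num_steps))
        yield random.sample(ordered[:pool_size], batch_size)
```

Early batches thus contain only easy instances, while later batches are drawn from the entire training set.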
2 code implementations • 10 Dec 2019 • Gustavo Penha, Alexandru Balan, Claudia Hauff
Conversational search is an approach to information retrieval (IR), where users engage in a dialogue with an agent in order to satisfy their information needs.
no code implementations • WS 2018 • Guanliang Chen, Claudia Hauff, Geert-Jan Houben
Knowledge tracing serves as a keystone in delivering personalized education.