no code implementations • ACL (NLP4Prog) 2021 • Xinyu Zhang, Ji Xin, Andrew Yates, Jimmy Lin
The task of semantic code search is to retrieve code snippets from a source code corpus based on an information need expressed in natural language.
no code implementations • EMNLP 2020 • Anna Tigunova, Andrew Yates, Paramita Mirza, Gerhard Weikum
Personal knowledge about users' professions, hobbies, favorite food, and travel preferences, among others, is a valuable asset for individualized AI, such as recommenders or chatbots.
no code implementations • EMNLP (sustainlp) 2020 • Xinyu Zhang, Andrew Yates, Jimmy Lin
Researchers have proposed simple yet effective techniques for the retrieval problem based on using BERT as a relevance classifier to rerank initial candidates from keyword search.
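The two-stage pipeline described above can be sketched in a few lines. This is only an illustrative toy: `keyword_candidates` stands in for a keyword search engine such as BM25, and `toy_relevance_score` is a hypothetical placeholder for the BERT relevance classifier; both function names are invented for this sketch.

```python
def keyword_candidates(query, corpus, k=3):
    """First stage: rank documents by simple query-term overlap."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc_id)
              for doc_id, doc in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

def toy_relevance_score(query, doc):
    """Second stage: stand-in for a BERT relevance classifier."""
    q_terms = query.lower().split()
    d_terms = doc.lower().split()
    # reward query terms that appear early in the document
    return sum(1.0 / (d_terms.index(t) + 1) for t in q_terms if t in d_terms)

def rerank(query, corpus, k=3):
    """Retrieve candidates cheaply, then rescore them with the classifier."""
    candidates = keyword_candidates(query, corpus, k)
    return sorted(candidates,
                  key=lambda d: toy_relevance_score(query, corpus[d]),
                  reverse=True)
```

The design point is that the expensive scorer only ever sees the small candidate set, which is what makes BERT-based reranking tractable at query time.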
no code implementations • EMNLP 2021 • Anna Tigunova, Paramita Mirza, Andrew Yates, Gerhard Weikum
Automatically extracting interpersonal relationships of conversation interlocutors can enrich personal knowledge bases to enhance personalized search, recommenders and chatbots.
no code implementations • 2 May 2024 • Ming Li, Yuanna Liu, Sami Jullien, Mozhdeh Ariannezhad, Mohammad Aliannejadi, Andrew Yates, Maarten de Rijke
So far, most NBR studies have focused on optimizing the accuracy of the recommendation, whereas optimizing for beyond-accuracy metrics, e.g., item fairness and diversity, remains largely unexplored.
1 code implementation • 28 Feb 2024 • Yibin Lei, Yu Cao, Tianyi Zhou, Tao Shen, Andrew Yates
Recent studies demonstrate that query expansions generated by large language models (LLMs) can considerably enhance information retrieval systems by generating hypothetical documents that answer the queries as expansions.
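A minimal sketch of this hypothetical-document style of expansion follows. In a real system the pseudo-document would come from prompting an LLM (e.g., "Write a passage answering: {query}"); here `generate_pseudo_doc` is a hypothetical stand-in with a canned answer, and both function names are invented for illustration.

```python
def generate_pseudo_doc(query):
    """Stand-in for an LLM call that drafts a passage answering the query."""
    canned = {
        "capital of france": "Paris is the capital and largest city of France.",
    }
    return canned.get(query.lower(), "")

def expand_query(query, weight=1):
    """Concatenate the query with its hypothetical answer document.

    The combined text is used as the retrieval query, so documents that
    share vocabulary with the likely answer are ranked higher.
    """
    pseudo = generate_pseudo_doc(query)
    # repeat the original query `weight` times to keep it dominant
    return " ".join([query] * weight + [pseudo]).strip()
```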
no code implementations • 28 Feb 2024 • Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates
In this work, we introduce a new unsupervised embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from Large Language Models (LLMs) without the need for model fine-tuning or task-specific engineering.
1 code implementation • 27 Feb 2024 • Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke
Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
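The transformation described above can be sketched as projecting a frozen dense vector onto a vocabulary, applying a ReLU, and keeping the top-k terms. Everything below is a toy illustration: the projection matrix is hypothetical (in the paper it would be learned while the dense encoder stays frozen), and the function name is invented.

```python
def dense_to_sparse(dense_vec, projection, vocab, k=2):
    """Map a dense vector to {term: weight} over a vocabulary.

    `projection` has one row per vocabulary term; each row is scored
    against the dense vector, negative scores are zeroed (ReLU), and
    only the k strongest positive terms are kept, yielding a sparse
    lexical representation usable with an inverted index.
    """
    scores = []
    for term, row in zip(vocab, projection):
        s = sum(d * w for d, w in zip(dense_vec, row))
        scores.append((max(0.0, s), term))  # ReLU: negatives drop to 0
    scores.sort(reverse=True)
    return {term: s for s, term in scores[:k] if s > 0}
```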
1 code implementation • 27 Feb 2024 • Maurits Bleeker, Mariya Hendriksen, Andrew Yates, Maarten de Rijke
Hence, contrastive losses are not sufficient to learn task-optimal representations, i.e., representations that contain all task-relevant information shared between the image and associated captions.
no code implementations • 12 Feb 2024 • Thong Nguyen, Mariya Hendriksen, Andrew Yates
Motivated by this, we explore the application of LSR in the multi-modal domain, i.e., we focus on Multi-Modal Learned Sparse Retrieval (MLSR).
no code implementations • 2 Nov 2023 • Ghazaleh Haratinezhad Torbati, Anna Tigunova, Andrew Yates, Gerhard Weikum
Recommender systems are most successful for popular items and users with ample interactions (likes, ratings, etc.).
no code implementations • 2 Oct 2023 • Andrew Yates, Michael Unterkalmsteiner
We replicate prior work on ranking domain-specific synonyms in the consumer health domain by applying the approach to a new language and domain: identifying Swedish language synonyms in the building construction domain.
1 code implementation • 2 Aug 2023 • Ming Li, Mozhdeh Ariannezhad, Andrew Yates, Maarten de Rijke
In next basket recommendation (NBR), it is useful to distinguish between repeat items, i.e., items that a user has consumed before, and explore items, i.e., items that a user has not consumed before.
no code implementations • 20 Jun 2023 • Thong Nguyen, Andrew Yates
Generative retrieval is a promising new neural retrieval paradigm that aims to optimize the retrieval pipeline by performing both indexing and retrieval with a single transformer model.
1 code implementation • 5 Jun 2023 • Yibin Lei, Liang Ding, Yu Cao, Changtong Zan, Andrew Yates, Dacheng Tao
Dense retrievers have achieved impressive performance, but their demand for abundant training data limits their application scenarios.
1 code implementation • 29 May 2023 • Thong Nguyen, Sean MacAvaney, Andrew Yates
We investigate existing aggregation approaches for adapting LSR to longer documents and find that proximal scoring is crucial for LSR to handle long documents.
1 code implementation • 22 May 2023 • Vaishali Pal, Andrew Yates, Evangelos Kanoulas, Maarten de Rijke
Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table.
1 code implementation • 23 Mar 2023 • Thong Nguyen, Sean MacAvaney, Andrew Yates
We then reproduce all prominent methods using a common codebase and re-train them in the same environment, which allows us to quantify how components of the framework affect effectiveness and efficiency.
1 code implementation • 28 Apr 2022 • Maurits Bleeker, Andrew Yates, Maarten de Rijke
We add an additional decoder to the contrastive ICR framework, to reconstruct the input caption in a latent space of a general-purpose sentence encoder, which prevents the image and caption encoder from suppressing predictive features.
1 code implementation • 22 Apr 2022 • Antonios Minas Krasakis, Andrew Yates, Evangelos Kanoulas
Current conversational passage retrieval systems cast conversational search into ad-hoc search by using an intermediate query resolution step that places the user's question in the context of the conversation.
1 code implementation • ACL 2022 • Thong Nguyen, Andrew Yates, Ayah Zirikly, Bart Desmet, Arman Cohan
In dataset-transfer experiments on three social media datasets, we find that grounding the model in the PHQ-9's symptoms substantially improves its ability to generalize to out-of-distribution data compared to a standard BERT-based approach.
no code implementations • 10 Oct 2021 • Simon Razniewski, Andrew Yates, Nora Kassner, Gerhard Weikum
Pre-trained language models (LMs) have recently gained attention for their potential as an alternative to (or proxy for) explicit knowledge bases (KBs).
no code implementations • 10 Sep 2021 • Ghazaleh Haratinezhad Torbati, Andrew Yates, Gerhard Weikum
Prior work on personalizing web search results has focused on considering query-and-click logs to capture users' individual interests.
no code implementations • 10 Sep 2021 • Ghazaleh Haratinezhad Torbati, Andrew Yates, Gerhard Weikum
The paper develops an expressive model and effective methods for personalizing search-based entity recommendations.
1 code implementation • 17 May 2021 • Iain Mackie, Jeffrey Dalton, Andrew Yates
Deep Learning Hard (DL-HARD) is a new annotated dataset designed to more effectively evaluate neural ranking models on complex topics.
no code implementations • 9 Mar 2021 • Shahrzad Naseri, Jeffrey Dalton, Andrew Yates, James Allan
We find that CEQE outperforms static embedding-based expansion methods on multiple collections (by up to 18% on Robust and 31% on Deep Learning in average precision) and also improves over proven probabilistic pseudo-relevance feedback (PRF) models.
1 code implementation • 3 Mar 2021 • Sean MacAvaney, Andrew Yates, Sergey Feldman, Doug Downey, Arman Cohan, Nazli Goharian
Managing the data for Information Retrieval (IR) experiments can be challenging.
1 code implementation • NAACL 2021 • Jimmy Lin, Rodrigo Nogueira, Andrew Yates
There are two themes that pervade our survey: techniques for handling long documents, beyond typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size).
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates
Query expansion aims to mitigate the mismatch between the language used in a query and in a document.
1 code implementation • 20 Aug 2020 • Canjia Li, Andrew Yates, Sean MacAvaney, Ben He, Yingfei Sun
In this work, we explore strategies for aggregating relevance signals from a document's passages into a final ranking score.
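The aggregation strategies explored in this line of work can be sketched as simple reducers over per-passage relevance scores. In practice the scores would come from a neural ranker such as BERT; here they are given as inputs, and the strategy names are illustrative rather than the paper's exact taxonomy.

```python
def aggregate(passage_scores, strategy="max"):
    """Collapse a document's per-passage scores into one document score."""
    if strategy == "max":        # the single best passage decides
        return max(passage_scores)
    if strategy == "mean":       # average evidence across all passages
        return sum(passage_scores) / len(passage_scores)
    if strategy == "first":      # lead passage only (title/intro bias)
        return passage_scores[0]
    raise ValueError(f"unknown strategy: {strategy}")
```

The choice matters: "max" is robust for long documents where relevance is concentrated in one passage, while "mean" rewards documents that are relevant throughout.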
Ranked #2 on Ad-Hoc Information Retrieval on TREC Robust04
no code implementations • LREC 2020 • Anna Tigunova, Paramita Mirza, Andrew Yates, Gerhard Weikum
To the best of our knowledge, RedDust is the first annotated language resource about Reddit users at large scale.
1 code implementation • IJCNLP 2019 • Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum
Controversial claims are abundant in online media and discussion forums.
1 code implementation • 24 Apr 2019 • Anna Tigunova, Andrew Yates, Paramita Mirza, Gerhard Weikum
Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation.
7 code implementations • 15 Apr 2019 • Sean MacAvaney, Andrew Yates, Arman Cohan, Nazli Goharian
We call this joint approach CEDR (Contextualized Embeddings for Document Ranking).
Ranked #3 on Ad-Hoc Information Retrieval on TREC Robust04
no code implementations • 11 Apr 2019 • Siddhant Arora, Andrew Yates
We consider algorithm selection in the context of ad-hoc information retrieval.
1 code implementation • WS 2019 • Michael A. Hedderich, Andrew Yates, Dietrich Klakow, Gerard de Melo
However, they typically cannot serve as a drop-in replacement for conventional single-sense embeddings, because the correct sense vector needs to be selected for each word.
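The selection problem described above can be sketched as follows: each word has several sense vectors, and a consumer must pick one per occurrence. A common heuristic, used here as an assumption rather than this paper's exact method, is to choose the sense closest to the average of the context word vectors; all vectors and function names below are toy illustrations.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def select_sense(sense_vecs, context_vecs):
    """Return the index of the sense vector closest to the mean context vector."""
    dim = len(context_vecs[0])
    mean_ctx = [sum(v[i] for v in context_vecs) / len(context_vecs)
                for i in range(dim)]
    scores = [dot(s, mean_ctx) for s in sense_vecs]
    return scores.index(max(scores))
```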
1 code implementation • EMNLP 2018 • Canjia Li, Yingfei Sun, Ben He, Le Wang, Kai Hui, Andrew Yates, Le Sun, Jungang Xu
Pseudo-relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches.
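PRF as just described can be sketched in a minimal, unweighted form: take the top-ranked documents, count their terms, and append the most frequent new terms to the query. Real PRF models (e.g., RM3) estimate weighted term distributions; the raw count below is only an illustrative sketch, and `prf_expand` is an invented name.

```python
from collections import Counter

def prf_expand(query, top_docs, n_terms=2):
    """Expand `query` with frequent terms from pseudo-relevant documents."""
    q_terms = set(query.lower().split())
    counts = Counter(t for doc in top_docs
                       for t in doc.lower().split()
                       if t not in q_terms)           # skip existing terms
    expansion = [t for t, _ in counts.most_common(n_terms)]
    return query + " " + " ".join(expansion)
```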
Ranked #9 on Ad-Hoc Information Retrieval on TREC Robust04
2 code implementations • EMNLP 2018 • Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum
Misinformation, such as fake news, is one of the major challenges facing our society.
no code implementations • WS 2018 • Sean MacAvaney, Bart Desmet, Arman Cohan, Luca Soldaini, Andrew Yates, Ayah Zirikly, Nazli Goharian
Self-reported diagnosis statements have been widely employed in studying language related to mental health in social media.
no code implementations • COLING 2018 • Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney, Nazli Goharian
Mental health is a significant and growing public health concern.
no code implementations • EMNLP 2017 • Andrew Yates, Arman Cohan, Nazli Goharian
We propose methods for identifying posts in support communities that may indicate a risk of self-harm, and demonstrate that our approach outperforms strong previously proposed methods for identifying such posts.
1 code implementation • 1 Jul 2017 • Sean MacAvaney, Andrew Yates, Kai Hui, Ophir Frieder
One challenge with neural ranking is the need for a large amount of manually-labeled relevance judgments for training.
3 code implementations • 30 Jun 2017 • Kai Hui, Andrew Yates, Klaus Berberich, Gerard de Melo
Neural IR models, such as DRMM and PACRR, have achieved strong results by successfully capturing relevance matching signals.
no code implementations • 27 Jun 2017 • Andrew Yates, Kai Hui
Recent neural IR models have demonstrated deep learning's utility in ad-hoc information retrieval.
3 code implementations • EMNLP 2017 • Kai Hui, Andrew Yates, Klaus Berberich, Gerard de Melo
In order to adopt deep learning for information retrieval, models are needed that can capture all relevant information required to assess the relevance of a document to a given user query.
no code implementations • 22 Feb 2017 • Arman Cohan, Sydney Young, Andrew Yates, Nazli Goharian
Our analysis of the interaction between the moderators and the users further indicates that, without an automatic way to identify critical content, it is indeed challenging for the moderators to provide timely responses to users in need.
no code implementations • LREC 2016 • Andrew Yates, Alek Kolcz, Nazli Goharian, Ophir Frieder
In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection.
no code implementations • LREC 2014 • Andrew Yates, Jon Parker, Nazli Goharian, Ophir Frieder
With the rapid growth of social media, there is increasing potential to augment traditional public health surveillance methods with data from social media.