Search Results for author: Moshe Berchansky

Found 5 papers, 2 papers with code

Distributed Speculative Inference of Large Language Models

no code implementations · 23 May 2024 · Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky, Oren Pereg, Moshe Wasserblat, Tomer Galanti, Michal Gordon, David Harel

In practice, off-the-shelf LLMs often do not have matching drafters that are sufficiently fast and accurate.
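The drafter/target split at the heart of speculative inference can be illustrated with a toy verification loop. This is a minimal single-node sketch of plain speculative decoding, not the paper's distributed method; both "models" below are hypothetical deterministic next-token functions over integer token ids, chosen only so the example is runnable.

```python
# Toy sketch of speculative decoding (NOT the paper's distributed variant).
# A fast "drafter" proposes k tokens; the slower "target" model verifies
# them and keeps the longest agreeing prefix, so the final output matches
# greedy decoding with the target alone.

def drafter_next(ctx):
    # Hypothetical cheap model: next token = last token + 1 (mod 100).
    return (ctx[-1] + 1) % 100

def target_next(ctx):
    # Hypothetical accurate model: disagrees with the drafter on every
    # fifth step, to exercise the rejection path.
    nxt = (ctx[-1] + 1) % 100
    return nxt if len(ctx) % 5 else (nxt + 1) % 100

def speculative_step(ctx, k=4):
    """Draft k tokens, verify them with the target, return accepted tokens."""
    draft, c = [], list(ctx)
    for _ in range(k):
        t = drafter_next(c)
        draft.append(t)
        c.append(t)
    accepted, c = [], list(ctx)
    for t in draft:
        want = target_next(c)
        if t == want:            # verified: keep the drafted token for free
            accepted.append(t)
            c.append(t)
        else:                    # first mismatch: take the target's token, stop
            accepted.append(want)
            c.append(want)
            break
    return accepted
```

Because a rejected draft is replaced by the target's own token, the accepted sequence is identical to decoding with the target model alone; the speedup comes from verifying several drafted tokens per target call.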

Accelerating Speculative Decoding using Dynamic Speculation Length

no code implementations · 7 May 2024 · Jonathan Mamou, Oren Pereg, Daniel Korat, Moshe Berchansky, Nadav Timor, Moshe Wasserblat, Roy Schwartz

Speculative decoding is a promising method for reducing the inference latency of large language models.
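A fixed speculation length wastes work when the drafter is rejected early and leaves speedup on the table when it is always accepted. The update rule below is an assumed simple heuristic for adapting the length between iterations (grow on full acceptance, back off on rejection), not the paper's actual method; the bounds `k_min`/`k_max` are illustrative.

```python
# Hedged sketch: adapt the speculation length k between decoding iterations.
# Heuristic (an assumption for illustration, not the paper's technique):
# double k when every drafted token was accepted, shrink toward the number
# of tokens that actually survived verification otherwise.

def adapt_k(k, n_drafted, n_accepted, k_min=1, k_max=16):
    if n_accepted >= n_drafted:        # full acceptance: draft more next time
        return min(k * 2, k_max)
    return max(n_accepted, k_min)      # rejection: back off to what stuck
```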

Optimizing Retrieval-augmented Reader Models via Token Elimination

1 code implementation · 20 Oct 2023 · Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat

Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across a variety of open-domain tasks, such as question answering and fact checking.
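The token-elimination idea can be sketched as ranking the encoded passage tokens by an importance score and keeping only the top fraction before further decoding. Both the scoring rule (e.g. decoder cross-attention mass) and the keep ratio below are assumptions for illustration, not the paper's exact configuration.

```python
# Hedged sketch of token elimination for a retrieval-augmented reader:
# drop low-importance passage tokens to shrink the decoder's input.
# The scores would come from the model (e.g. cross-attention); here they
# are plain floats so the example is self-contained.

def eliminate_tokens(tokens, scores, keep_ratio=0.5):
    """Keep the highest-scoring fraction of tokens, preserving order."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])     # restore original token order
    return [tokens[i] for i in kept]
```

Preserving the original order matters: the surviving tokens must still read as a (shortened) passage for the decoder to attend over.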

Tasks: Answer Generation, Decoder, +4 more

How to Train BERT with an Academic Budget

4 code implementations · EMNLP 2021 · Peter Izsak, Moshe Berchansky, Omer Levy

While large language models à la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.

Tasks: Language Modelling, Linguistic Acceptability, +4 more
