Search Results for author: Moshe Berchansky

Found 5 papers, 2 papers with code

Distributed Speculative Inference of Large Language Models

no code implementations · 23 May 2024 · Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky, Oren Pereg, Moshe Wasserblat, Tomer Galanti, Michal Gordon, David Harel

In practice, off-the-shelf LLMs often do not have matching drafters that are sufficiently fast and accurate.
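The drafter/target split at the heart of speculative inference can be illustrated with a toy verification loop. This is a minimal single-node sketch of plain speculative decoding, not the paper's distributed method; both "models" below are hypothetical deterministic next-token functions over integer token ids, chosen only so the example is runnable.

```python
# Toy sketch of speculative decoding (NOT the paper's distributed variant).
# A fast "drafter" proposes k tokens; the slower "target" model verifies
# them and keeps the longest agreeing prefix, so the final output matches
# greedy decoding with the target alone.

def drafter_next(ctx):
    # Hypothetical cheap model: next token = last token + 1 (mod 100).
    return (ctx[-1] + 1) % 100

def target_next(ctx):
    # Hypothetical accurate model: disagrees with the drafter on every
    # fifth step, to exercise the rejection path.
    nxt = (ctx[-1] + 1) % 100
    return nxt if len(ctx) % 5 else (nxt + 1) % 100

def speculative_step(ctx, k=4):
    """Draft k tokens, verify them with the target, return accepted tokens."""
    draft, c = [], list(ctx)
    for _ in range(k):
        t = drafter_next(c)
        draft.append(t)
        c.append(t)
    accepted, c = [], list(ctx)
    for t in draft:
        want = target_next(c)
        if t == want:            # verified: keep the drafted token for free
            accepted.append(t)
            c.append(t)
        else:                    # first mismatch: take the target's token, stop
            accepted.append(want)
            c.append(want)
            break
    return accepted
```

Because a rejected draft is replaced by the target's own token, the accepted sequence is identical to decoding with the target model alone; the speedup comes from verifying several drafted tokens per target call.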

Accelerating Speculative Decoding using Dynamic Speculation Length

no code implementations · 7 May 2024 · Jonathan Mamou, Oren Pereg, Daniel Korat, Moshe Berchansky, Nadav Timor, Moshe Wasserblat, Roy Schwartz

Speculative decoding is a promising method for reducing the inference latency of large language models.
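A fixed speculation length wastes work when the drafter is rejected early and leaves speedup on the table when it is always accepted. The update rule below is an assumed simple heuristic for adapting the length between iterations (grow on full acceptance, back off on rejection), not the paper's actual method; the bounds `k_min`/`k_max` are illustrative.

```python
# Hedged sketch: adapt the speculation length k between decoding iterations.
# Heuristic (an assumption for illustration, not the paper's technique):
# double k when every drafted token was accepted, shrink toward the number
# of tokens that actually survived verification otherwise.

def adapt_k(k, n_drafted, n_accepted, k_min=1, k_max=16):
    if n_accepted >= n_drafted:        # full acceptance: draft more next time
        return min(k * 2, k_max)
    return max(n_accepted, k_min)      # rejection: back off to what stuck
```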

Optimizing Retrieval-augmented Reader Models via Token Elimination

1 code implementation · 20 Oct 2023 · Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat

Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across a variety of open-domain tasks, such as question answering and fact checking.
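The token-elimination idea can be sketched as ranking the encoded passage tokens by an importance score and keeping only the top fraction before further decoding. Both the scoring rule (e.g. decoder cross-attention mass) and the keep ratio below are assumptions for illustration, not the paper's exact configuration.

```python
# Hedged sketch of token elimination for a retrieval-augmented reader:
# drop low-importance passage tokens to shrink the decoder's input.
# The scores would come from the model (e.g. cross-attention); here they
# are plain floats so the example is self-contained.

def eliminate_tokens(tokens, scores, keep_ratio=0.5):
    """Keep the highest-scoring fraction of tokens, preserving order."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])     # restore original token order
    return [tokens[i] for i in kept]
```

Preserving the original order matters: the surviving tokens must still read as a (shortened) passage for the decoder to attend over.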

Tasks: Answer Generation, Decoder, +4 more

How to Train BERT with an Academic Budget

4 code implementations · EMNLP 2021 · Peter Izsak, Moshe Berchansky, Omer Levy

While large language models à la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.

Tasks: Language Modelling, Linguistic Acceptability, +4 more
