no code implementations • NAACL (TeachingNLP) 2021 • David Gaddy, Daniel Fried, Nikita Kitaev, Mitchell Stern, Rodolfo Corona, John DeNero, Dan Klein
We present a set of assignments for a graduate-level NLP course.
1 code implementation • ACL 2022 • Nikita Kitaev, Thomas Lu, Dan Klein
We present an incremental syntactic representation that consists of assigning a single discrete label to each word in a sentence, where the label is predicted using strictly incremental processing of a prefix of the sentence, and the sequence of labels for a sentence fully determines a parse tree.
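A minimal Python sketch of this setup; `predict_label` is a hypothetical stand-in for the paper's learned classifier:

```python
# Hedged sketch: predict one discrete label per word from the prefix only.
# `predict_label` is a hypothetical stand-in for the paper's learned model.

def incremental_labels(words, predict_label):
    labels = []
    for i in range(len(words)):
        prefix = words[: i + 1]            # strictly incremental: no lookahead
        labels.append(predict_label(prefix))
    return labels                          # this sequence fully determines the tree
```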
1 code implementation • NeurIPS 2020 • Giannis Daras, Nikita Kitaev, Augustus Odena, Alexandros G. Dimakis
We propose a novel type of balanced clustering algorithm to approximate attention.
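A toy illustration of the balanced-clustering idea, assuming the sequence length divides evenly into clusters; SMYRF's actual asymmetric LSH clustering is more sophisticated than this stand-in:

```python
import numpy as np

def clustered_attention(Q, K, V, n_clusters):
    # Toy balanced clustering: sort queries and keys by a shared random
    # projection, split into equal-size groups, and attend within groups
    # only, so cost scales with cluster size rather than sequence length.
    n, d = Q.shape
    assert n % n_clusters == 0, "toy version needs equal-size clusters"
    proj = np.random.randn(d)
    q_order = np.argsort(Q @ proj)
    k_order = np.argsort(K @ proj)
    out = np.empty_like(V)
    size = n // n_clusters
    for c in range(n_clusters):
        qi = q_order[c * size:(c + 1) * size]
        ki = k_order[c * size:(c + 1) * size]
        scores = Q[qi] @ K[ki].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[qi] = w @ V[ki]
    return out
```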
1 code implementation • 11 Oct 2020 • Giannis Daras, Nikita Kitaev, Augustus Odena, Alexandros G. Dimakis
We also show that SMYRF can be used interchangeably with dense attention before and after training.
no code implementations • EMNLP 2020 • Steven Cao, Nikita Kitaev, Dan Klein
We propose a method for unsupervised parsing based on the linguistic notion of a constituency test.
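A hedged sketch of scoring a candidate span with constituency tests; `grammatical` (string to score) is a hypothetical stand-in for the paper's learned grammaticality model, and the two transformations shown are simplified versions of classic linguistic tests:

```python
# Hedged sketch: score words[i:j] as a constituent by applying simplified
# constituency tests and averaging a grammaticality judgment over the
# transformed sentences.

def span_score(words, i, j, grammatical):
    span, rest = words[i:j], words[:i] + words[j:]
    tests = [
        words[:i] + ["it"] + words[j:],    # proform substitution test
        span + [","] + rest,               # movement/fronting test (simplified)
    ]
    return sum(grammatical(" ".join(t)) for t in tests) / len(tests)
```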
no code implementations • ICLR 2020 • Steven Cao, Nikita Kitaev, Dan Klein
We propose procedures for evaluating and strengthening contextual embedding alignment and show that they are useful in analyzing and improving multilingual BERT.
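One standard baseline for this kind of alignment is an orthogonal (Procrustes) rotation fit on contextual embeddings of aligned word pairs; a minimal sketch follows, though the paper's proposed procedures go beyond this simple baseline:

```python
import numpy as np

def procrustes_align(X, Y):
    # Fit an orthogonal map W minimizing ||X @ W - Y||_F, where rows of X
    # and Y are contextual embeddings of word pairs aligned across languages.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt   # apply as X @ W to move source embeddings toward target
```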
15 code implementations • ICLR 2020 • Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya
Large Transformer models routinely achieve state-of-the-art results on a number of tasks, but training these models can be prohibitively costly, especially on long sequences; a toy version of the paper's LSH bucketing step is sketched after this entry.
Ranked #2 on Question Answering on Quasar-T
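Reformer replaces dense attention with locality-sensitive-hashing attention; here is a toy version of the angular-LSH bucketing it relies on (assuming an even number of buckets):

```python
import numpy as np

def lsh_buckets(x, n_buckets, rng=np.random):
    # Angular LSH as used in Reformer-style attention (toy version):
    # project onto random directions and take the argmax over the
    # concatenation [xR; -xR], so similar vectors tend to land in the
    # same bucket and attention can be restricted to within-bucket pairs.
    d = x.shape[-1]
    R = rng.randn(d, n_buckets // 2)       # n_buckets assumed even
    h = x @ R
    return np.argmax(np.concatenate([h, -h], axis=-1), axis=-1)
```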
1 code implementation • ACL 2019 • Daniel Fried, Nikita Kitaev, Dan Klein
Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing -- but to what degree do they generalize to other domains?
no code implementations • 4 Jun 2019 • William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, Jakob Uszkoreit
During training, one can feed KERMIT paired data $(x, y)$ to learn the joint distribution $p(x, y)$, and optionally mix in unpaired data $x$ or $y$ to refine the marginals $p(x)$ or $p(y)$; a minimal sketch of this data mixing follows the entry below.
Ranked #39 on Machine Translation on WMT2014 English-German
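A minimal sketch of the paired/unpaired mixing described above; `loss_fn` is a hypothetical stand-in for the insertion-based sequence model's training loss over a (possibly concatenated) token sequence:

```python
# Hedged sketch: paired examples train the joint over the concatenated
# sequence, unpaired examples train a marginal.

def batch_loss(batch, loss_fn):
    total = 0.0
    for ex in batch:
        if "x" in ex and "y" in ex:
            total += loss_fn(ex["x"] + ex["y"])   # joint p(x, y)
        elif "x" in ex:
            total += loss_fn(ex["x"])             # marginal p(x)
        else:
            total += loss_fn(ex["y"])             # marginal p(y)
    return total / len(batch)
```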
2 code implementations • ACL 2020 • Nikita Kitaev, Dan Klein
We present a constituency parsing algorithm that, like a supertagger, works by assigning labels to each word in a sentence.
Ranked #12 on Constituency Parsing on Penn Treebank
4 code implementations • ACL 2019 • Nikita Kitaev, Steven Cao, Dan Klein
We show that constituency parsing benefits from unsupervised pre-training across a variety of languages and a range of pre-training conditions.
Ranked #5 on Constituency Parsing on CTB5
4 code implementations • ACL 2018 • Nikita Kitaev, Dan Klein
We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser.
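A single-head, single-layer illustration of the kind of self-attention encoder involved; the projection matrices would be learned, and multi-head attention, positional information, feed-forward sublayers, and layer normalization are all omitted:

```python
import numpy as np

def self_attention_layer(X, Wq, Wk, Wv):
    # One single-head self-attention layer over word representations X
    # (n_words x d); Wq, Wk, Wv are d x d projections.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(X.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return X + w @ V                       # residual connection
```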
Ranked #8 on Constituency Parsing on CTB5
2 code implementations • ACL 2019 • Jin-Hwa Kim, Nikita Kitaev, Xinlei Chen, Marcus Rohrbach, Byoung-Tak Zhang, Yuandong Tian, Dhruv Batra, Devi Parikh
The game involves two players: a Teller and a Drawer.
1 code implementation • EMNLP 2017 • Nikita Kitaev, Dan Klein
We present a model for locating regions in space based on natural language descriptions.