no code implementations • 24 May 2023 • Zachary Ankner, Naomi Saphra, Davis Blalock, Jonathan Frankle, Matthew L. Leavitt
Most works on transformers trained with the Masked Language Modeling (MLM) objective use the original BERT model's fixed masking rate of 15%.
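The fixed-rate masking this abstract refers to can be sketched in a few lines. The function below is a simplified illustration of the MLM corruption step, not the paper's method: BERT's full recipe also replaces some selected tokens with random tokens or leaves them unchanged, which is omitted here, and the `mask_tokens` name and `[MASK]` string are illustrative assumptions.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Randomly replace a fraction of tokens with a mask token.

    Simplified MLM corruption sketch: each token is masked
    independently with probability `mask_rate` (BERT's default 0.15).
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append(mask_token)
            labels.append(tok)   # the model must predict the original token
        else:
            masked.append(tok)
            labels.append(None)  # position is not scored in the MLM loss
    return masked, labels
```

The masking rate is simply a hyperparameter here, which is exactly the knob the paper varies instead of fixing it at 15%.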
no code implementations • 13 Oct 2022 • Brian R. Bartoldson, Bhavya Kailkhura, Davis Blalock
To address this problem, there has been a great deal of research on *algorithmically-efficient deep learning*, which seeks to reduce training costs not at the hardware or implementation level, but through changes in the semantics of the training program.
1 code implementation • 2 Jun 2022 • Jacob Portes, Davis Blalock, Cory Stephenson, Jonathan Frankle
Benchmarking the tradeoff between neural network accuracy and training time is computationally expensive.
3 code implementations • 21 Jun 2021 • Davis Blalock, John Guttag
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning.
1 code implementation • 13 May 2021 • Maggie Makar, Ben Packer, Dan Moldovan, Davis Blalock, Yoni Halpern, Alexander D'Amour
Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning.
no code implementations • ICCV 2021 • Divya Shanmugam, Davis Blalock, Guha Balakrishnan, John Guttag
In this paper, we present 1) experimental analyses that shed light on cases in which the simple average is suboptimal and 2) a method to address these shortcomings.
1 code implementation • 6 Mar 2020 • Davis Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag
Neural network pruning---the task of reducing the size of a network by removing parameters---has been the subject of a great deal of work in recent years.
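As a concrete instance of "removing parameters," unstructured magnitude pruning zeros out the smallest-magnitude weights. This is only one common baseline among the many methods the paper surveys; the sketch below, including the `magnitude_prune` name, is an illustrative assumption rather than any specific paper's algorithm.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of smallest-magnitude weights.

    Minimal sketch of unstructured magnitude pruning on a single
    weight matrix; real pipelines typically prune iteratively and
    fine-tune between pruning steps.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
W_pruned = magnitude_prune(W, 0.9)  # keep roughly the largest 10% of weights
```

In practice the pruned matrix is stored or executed in a sparse format; zeroing alone does not reduce size or compute without sparse kernel support.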
no code implementations • 2 Dec 2018 • Divya Shanmugam, Davis Blalock, John Guttag
We focus on estimating a patient's risk of cardiovascular death after an acute coronary syndrome based on the patient's raw electrocardiogram (ECG) signal.
no code implementations • ICLR 2018 • Divya Shanmugam, Davis Blalock, John Guttag
Computing distances between examples is at the core of many learning algorithms for time series.