Search Results for author: Johan Bjorck

Found 14 papers, 3 papers with code

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Language Modelling


Language Is Not All You Need: Aligning Perception with Language Models

1 code implementation • NeurIPS 2023 • Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei

A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence.

Image Captioning Language Modelling +4


Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks

no code implementations • CVPR 2023 • Wenhui Wang, Hangbo Bao, Li Dong, Johan Bjorck, Zhiliang Peng, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, Furu Wei

A big convergence of language, vision, and multimodal pretraining is emerging.

Cross-Modal Retrieval Image Captioning +10


Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks

2 code implementations • 22 Aug 2022 • Wenhui Wang, Hangbo Bao, Li Dong, Johan Bjorck, Zhiliang Peng, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, Furu Wei

A big convergence of language, vision, and multimodal pretraining is emerging.

Ranked #1 on Visual Reasoning on NLVR2 Test

Cross-Modal Retrieval Image Captioning +11


Is High Variance Unavoidable in RL? A Case Study in Continuous Control

no code implementations • ICLR 2022 • Johan Bjorck, Carla P. Gomes, Kilian Q. Weinberger

In this paper, we investigate causes for the perceived instability (high variance) of reinforcement learning experiments.

Continuous Control Reinforcement Learning (RL)


Towards Deeper Deep Reinforcement Learning with Spectral Normalization

no code implementations • NeurIPS 2021 • Johan Bjorck, Carla P. Gomes, Kilian Q. Weinberger

In this paper we investigate how RL agents are affected by exchanging the small MLPs with larger modern networks with skip connections and normalization, focusing specifically on actor-critic algorithms.

reinforcement-learning Reinforcement Learning (RL)


Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

no code implementations • 26 Feb 2021 • Johan Bjorck, Xiangyu Chen, Christopher De Sa, Carla P. Gomes, Kilian Q. Weinberger

Low-precision training has become a popular approach to reduce compute requirements, memory footprint, and energy consumption in supervised learning.

Continuous Control reinforcement-learning +1


Dataset Curation Beyond Accuracy

no code implementations • 1 Jan 2021 • Johan Bjorck, Carla P Gomes

Neural networks are known to be data-hungry, and collecting large labeled datasets is often a crucial step in deep learning deployment.

Self-Driving Cars


Understanding Decoupled and Early Weight Decay

no code implementations • 27 Dec 2020 • Johan Bjorck, Kilian Weinberger, Carla Gomes

We also show how the growth of network weights is heavily influenced by the dataset and its generalization properties.


Star-Convexity in Non-Negative Matrix Factorization

no code implementations • 25 Sep 2019 • Johan Bjorck, Carla Gomes, Kilian Weinberger

Non-negative matrix factorization (NMF) is a highly celebrated algorithm for matrix decomposition that guarantees strictly non-negative factors.


Automatic Detection and Compression for Passive Acoustic Monitoring of the African Forest Elephant

no code implementations • 25 Feb 2019 • Johan Bjorck, Brendan H. Rappazzo, Di Chen, Richard Bernstein, Peter H. Wrege, Carla P. Gomes

In this work, we consider applying machine learning to the analysis and compression of audio signals in the context of monitoring elephants in sub-Saharan Africa.

Audio Compression


Understanding Batch Normalization

no code implementations • NeurIPS 2018 • Johan Bjorck, Carla Gomes, Bart Selman, Kilian Q. Weinberger

Batch normalization (BN) is a technique to normalize activations in intermediate layers of deep neural networks.


Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network

no code implementations • 18 Nov 2017 • Johan Bjorck, Yiwei Bai, Xiaojian Wu, Yexiang Xue, Mark C. Whitmore, Carla Gomes

Cascades represent rapid changes in networks.

Combinatorial Optimization Management


Phase-Mapper: An AI Platform to Accelerate High Throughput Materials Discovery

1 code implementation • 3 Oct 2016 • Yexiang Xue, Junwen Bai, Ronan Le Bras, Brendan Rappazzo, Richard Bernstein, Johan Bjorck, Liane Longpre, Santosh K. Suram, Robert B. van Dover, John Gregoire, Carla P. Gomes

A key problem in materials discovery, the phase map identification problem, involves the determination of the crystal phase diagram from the materials' composition and structural characterization data.

Vocal Bursts Intensity Prediction

