no code implementations • 23 Apr 2024 • Xiao Wang, Aristeidis Tsaris, Siyan Liu, Jong-Youl Choi, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash
As the largest model of its kind, ORBIT exceeds the size of current climate AI foundation models by a thousandfold.
no code implementations • 17 Apr 2024 • Aristeidis Tsaris, Philipe Ambrozio Dias, Abhishek Potnis, Junqi Yin, Feiyi Wang, Dalton Lunga
Although large FMs have demonstrated significant impact in natural language processing and computer vision, efforts toward FMs for geospatial applications have been restricted to smaller models, as pretraining larger ones requires very large computing resources equipped with state-of-the-art hardware accelerators.
1 code implementation • 25 Jan 2024 • Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar Panda
While GPUs are responsible for training the vast majority of state-of-the-art deep learning models, the implications of their architecture are often overlooked when designing new deep learning (DL) models.
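A minimal sketch of the kind of hardware-aware check this paper motivates: verifying that a transformer's key GEMM dimensions are multiples of a tensor-core-friendly tile size. The function, the tile multiple of 64, and the example configuration are illustrative assumptions, not the authors' tool.

```python
def tensor_core_friendly(d_model, n_heads, vocab_size, multiple=64):
    """Heuristic check that the GEMM dimensions dominating transformer
    training are multiples of a tensor-core-friendly tile size (assumed 64)."""
    head_dim = d_model // n_heads
    return {
        "d_model": d_model % multiple == 0,
        "head_dim": head_dim % multiple == 0,
        "vocab_size": vocab_size % multiple == 0,  # padded vocabularies keep the output GEMM efficient
    }

# e.g. a GPT-style configuration with the vocabulary padded from 50257 to 50304
print(tensor_core_friendly(d_model=4096, n_heads=32, vocab_size=50304))
```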
no code implementations • 20 Dec 2023 • Sajal Dash, Isaac Lyngaas, Junqi Yin, Xiao Wang, Romain Egele, Guojing Cong, Feiyi Wang, Prasanna Balaprakash
For the training of the 175-billion-parameter and 1-trillion-parameter models, we achieved $100\%$ weak scaling efficiency on 1024 and 3072 MI250X GPUs, respectively.
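For reference, weak scaling efficiency keeps the per-GPU workload fixed and compares step time against the single-device baseline; the standard definition is

$$ E_{\text{weak}}(N) = \frac{t(1)}{t(N)} \times 100\%, $$

where $t(N)$ is the time per training step on $N$ GPUs with the problem size scaled proportionally to $N$, so $100\%$ means no throughput is lost to communication as GPUs are added.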
no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.
1 code implementation • 27 Jul 2022 • Victor Fung, Shuyi Jia, Jiaxin Zhang, Sirui Bi, Junqi Yin, P. Ganesh
These methods would help identify or, in the case of generative models, even create novel crystal structures of materials with specified functional properties, which could then be synthesized or isolated in the laboratory.
no code implementations • 25 Jul 2022 • Massimiliano Lupo Pasini, Junqi Yin
We propose a stable, parallel approach to train Wasserstein Conditional Generative Adversarial Neural Networks (W-CGANs) under the constraint of a fixed computational budget.
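For context, the standard conditional Wasserstein GAN objective underlying W-CGANs is the minimax problem

$$ \min_{G} \max_{D \in \mathcal{D}} \; \mathbb{E}_{x \sim p_r}\!\left[ D(x \mid y) \right] - \mathbb{E}_{z \sim p_z}\!\left[ D(G(z \mid y) \mid y) \right], $$

where $y$ is the conditioning label, $p_r$ the data distribution, $p_z$ the latent prior, and $\mathcal{D}$ the set of 1-Lipschitz critics.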
1 code implementation • 26 Oct 2021 • Massimiliano Lupo Pasini, Junqi Yin, Viktor Reshniak, Miroslav Stoyanov
Anderson acceleration (AA) is an extrapolation technique designed to speed up fixed-point iterations like those arising from the iterative training of DL models.
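A minimal NumPy sketch of (Type-II) Anderson acceleration on a generic fixed-point map; this illustrates the technique the paper applies to DL training, not the authors' implementation.

```python
import numpy as np

def anderson_accelerate(g, x0, m=5, tol=1e-10, max_iter=100):
    """Type-II Anderson acceleration for the fixed-point iteration x = g(x),
    mixing a window of the last m residual differences."""
    x = np.asarray(x0, dtype=float)
    G_hist, F_hist = [], []                  # histories of g(x_k) and residuals f_k = g(x_k) - x_k
    for _ in range(max_iter):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            break
        G_hist.append(gx); F_hist.append(f)
        if len(F_hist) > m + 1:              # keep at most m differences
            G_hist.pop(0); F_hist.pop(0)
        if len(F_hist) == 1:
            x = gx                           # plain fixed-point step until history exists
            continue
        dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
        dG = np.column_stack([G_hist[i + 1] - G_hist[i] for i in range(len(G_hist) - 1)])
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)  # least-squares mixing weights
        x = gx - dG @ gamma                  # extrapolated update
    return x

# Example: x = cos(x) converges to the Dottie number ~0.739085
print(anderson_accelerate(np.cos, np.array([1.0])))
```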
no code implementations • 21 Oct 2021 • Steven Farrell, Murali Emani, Jacob Balma, Lukas Drescher, Aleksandr Drozd, Andreas Fink, Geoffrey Fox, David Kanter, Thorsten Kurth, Peter Mattson, Dawei Mu, Amit Ruhela, Kento Sato, Koichi Shirahata, Tsuguchika Tabaru, Aristeidis Tsaris, Jan Balewski, Ben Cumming, Takumi Danjo, Jens Domke, Takaaki Fukai, Naoto Fukumoto, Tatsuya Fukushi, Balazs Gerofi, Takumi Honda, Toshiyuki Imamura, Akihiko Kasagi, Kentaro Kawakami, Shuhei Kudo, Akiyoshi Kuroda, Maxime Martinasso, Satoshi Matsuoka, Henrique Mendonça, Kazuki Minami, Prabhat Ram, Takashi Sawada, Mallikarjun Shankar, Tom St. John, Akihiro Tabuchi, Venkatram Vishwanath, Mohamed Wahib, Masafumi Yamazaki, Junqi Yin
Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights.
no code implementations • 12 Sep 2021 • Junqi Yin, Zongrui Pei, Michael Gao
We propose that the Manhattan distance in the VAE latent space can serve as a generic order parameter for order-disorder phase transitions.
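A hedged sketch of the proposed order parameter: encode lattice configurations with a trained VAE and take the $L_1$ (Manhattan) distance of each latent vector from a reference point. The `encoder` interface and the choice of a fully ordered reference state are assumptions for illustration.

```python
import numpy as np

def manhattan_order_parameter(encoder, configs, ref_config):
    """L1 distance in VAE latent space from a reference (e.g. fully ordered) state.

    encoder    : maps a batch of configurations to latent mean vectors (assumed trained)
    configs    : array of lattice configurations, shape (n_samples, ...)
    ref_config : a single reference configuration
    """
    z = encoder(configs)                    # (n_samples, latent_dim)
    z_ref = encoder(ref_config[None, ...])  # (1, latent_dim)
    return np.abs(z - z_ref).sum(axis=1)    # one scalar order parameter per sample
```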
no code implementations • 21 Feb 2021 • Massimiliano Lupo Pasini, Vittorio Gabbi, Junqi Yin, Simona Perotto, Nouamane Laanait
We propose a distributed approach to train deep convolutional conditional generative adversarial network (DC-CGAN) models.
no code implementations • 16 Dec 2020 • Shubhankar Gahlot, Junqi Yin, Mallikarjun Shankar
Distributed training in deep learning (DL) is common practice as data and models grow.
no code implementations • 3 Dec 2020 • Jean-Roch Vlimant, Junqi Yin
Deep learning models are yielding increasingly better performance thanks to multiple factors.
no code implementations • 23 Nov 2020 • Rick Archibald, Edmond Chow, Eduardo D'Azevedo, Jack Dongarra, Markus Eisenbach, Rocco Febbo, Florent Lopez, Daniel Nichols, Stanimire Tomov, Kwai Wong, Junqi Yin
This paper discusses the requirements of an HPC deep learning framework and how those needs can be met (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, cuBLAS, cuDNN, MKL, and HIP.
no code implementations • 24 Sep 2019 • Nouamane Laanait, Joshua Romero, Junqi Yin, M. Todd Young, Sean Treichler, Vitalii Starchenko, Albina Borisevich, Alex Sergeev, Michael Matheson
We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors.
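A generic PyTorch sketch of the gradient-tensor grouping idea: fuse gradients into size-bounded buckets so each all-reduce moves one large message instead of many small ones. This is a plain illustration of bucketing, assuming an initialized `torch.distributed` process group; it is not the paper's decentralized orchestration.

```python
import torch
import torch.distributed as dist

def bucketed_allreduce(grads, bucket_bytes=4 * 2**20):
    """Average gradients across ranks, fusing them into ~4 MiB buckets."""
    def flush(bucket):
        flat = torch.cat([g.reshape(-1) for g in bucket])  # one fused message
        dist.all_reduce(flat)                # sum over all ranks
        flat /= dist.get_world_size()        # average
        offset = 0
        for g in bucket:                     # scatter results back in place
            n = g.numel()
            g.copy_(flat[offset:offset + n].view_as(g))
            offset += n

    bucket, nbytes = [], 0
    for g in grads:
        bucket.append(g)
        nbytes += g.numel() * g.element_size()
        if nbytes >= bucket_bytes:
            flush(bucket)
            bucket, nbytes = [], 0
    if bucket:                               # flush the remainder
        flush(bucket)
```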
no code implementations • NeurIPS Workshop Deep_Invers 2019 • Nouamane Laanait, Junqi Yin, Albina Borisevich
The phase problem in diffraction physics is one of the oldest inverse problems in all of science.
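Concretely, a detector records only intensities, so the complex phase of the scattered wave is lost:

$$ I(\mathbf{q}) = \left| F(\mathbf{q}) \right|^2, \qquad F(\mathbf{q}) = \left| F(\mathbf{q}) \right| e^{i\phi(\mathbf{q})}, $$

and recovering $\phi(\mathbf{q})$ from the measured $I(\mathbf{q})$ is the inverse problem addressed here with deep learning.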
no code implementations • 7 Sep 2019 • Massimiliano Lupo Pasini, Junqi Yin, Ying Wai Li, Markus Eisenbach
We propose a new scalable method to optimize the architecture of an artificial neural network.
no code implementations • 10 Aug 2019 • Jiaxin Zhang, Xianglin Liu, Sirui Bi, Junqi Yin, Guannan Zhang, Markus Eisenbach
In this study, a robust data-driven framework based on Bayesian approaches is proposed and demonstrated for the accurate and efficient prediction of the configurational energy of high-entropy alloys.
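As a hedged stand-in for the paper's Bayesian framework, a Gaussian process regressor gives energy predictions with calibrated uncertainty; the placeholder data, descriptors, and kernel below are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# X: descriptors of alloy configurations (e.g. pair-correlation features), y: energies
rng = np.random.default_rng(0)
X, y = rng.random((200, 10)), rng.random(200)      # placeholder training data

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-3),
                              normalize_y=True)
gp.fit(X, y)
mean, std = gp.predict(rng.random((5, 10)), return_std=True)  # energy and its uncertainty
```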
2 code implementations • 6 Nov 2018 • Drew Schmidt, Junqi Yin, Michael Matheson, Bronson Messer, Mallikarjun Shankar
The design and construction of high performance computing (HPC) systems relies on exhaustive performance analysis and benchmarking.