no code implementations • RaPID (LREC) 2022 • Ruihao Pan, Ziming Liu, Fengpei Yuan, Maryam Zare, Xiaopeng Zhao, Rebecca Jane Passonneau
An assistive robot, Pepper, has been designed to administer Referential Communication Tasks (RCTs) to human subjects without dementia as a step towards an agent to administer RCTs to dementia patients, potentially for earlier diagnosis.
no code implementations • 7 May 2024 • Subhash Kantamneni, Ziming Liu, Max Tegmark
Integrable partial differential equation (PDE) systems are of great interest in natural science, but are exceedingly rare and difficult to discover.
3 code implementations • 30 Apr 2024 • Ziming Liu, YiXuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark
Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).
1 code implementation • 15 Mar 2024 • Xuanlei Zhao, Shenggan Cheng, Zangwei Zheng, Zheming Yang, Ziming Liu, Yang You
Scaling large models with long sequences across applications like language generation, video generation and multimodal tasks requires efficient sequence parallelism.
no code implementations • 8 Feb 2024 • David D. Baek, Ziming Liu, Max Tegmark
We present GenEFT: an effective theory framework for shedding light on the statics and dynamics of neural network generalization, and illustrate it with graph learning examples.
no code implementations • 7 Feb 2024 • Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore
A task is usually composite and hence can be decomposed into many subtasks, which compete for resources (measured by the number of neurons allocated to subtasks).
1 code implementation • 7 Feb 2024 • Eric J. Michaud, Isaac Liao, Vedang Lad, Ziming Liu, Anish Mudide, Chloe Loughridge, Zifan Carl Guo, Tara Rezaei Kheirkhah, Mateja Vukelić, Max Tegmark
We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code.
no code implementations • 5 Feb 2024 • Qiyao Liang, Ziming Liu, Ila Fiete
Corresponding to each of these phases, we identify qualitatively different generation behaviors: 1) multiple bumps are generated, 2) one bump is generated but at inaccurate $x$ and $y$ locations, 3) a bump is generated at the correct $x$ and $y$ locations.
no code implementations • 19 Jan 2024 • Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You
The experiments demonstrate that AutoChunk can reduce over 80% of activation memory while maintaining speed loss within 10%, extend max sequence length by 3.2x to 11.7x, and outperform state-of-the-art methods by a large margin.
no code implementations • 15 Dec 2023 • Jingcai Guo, Qihua Zhou, Ruibing Li, Xiaocheng Lu, Ziming Liu, Junyang Chen, Xin Xie, Jie Zhang
Then, to facilitate the generalization of local linearities, we construct a maximal margin geometry on the learned features by enforcing low-rank constraints on intra-class samples and high-rank constraints on inter-class samples, resulting in orthogonal subspaces for different classes, with each subspace lying on a compact manifold.
no code implementations • 5 Dec 2023 • Isaac Liao, Ziming Liu, Max Tegmark
The hypernetwork is carefully designed such that it can control network complexity, leading to a diverse family of interpretable algorithms ranked by their complexity.
no code implementations • 11 Oct 2023 • Ziming Liu, Mikail Khona, Ila R. Fiete, Max Tegmark
Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity, in which neurons can be clustered by activity similarity and participation in shared computational subtasks.
no code implementations • 9 Oct 2023 • Ziming Liu, Ziqian Zhong, Max Tegmark
To do so, we define linear mapping number (LMN) to measure network complexity, which is a generalized version of linear region number for ReLU networks.
no code implementations • 3 Oct 2023 • Ziming Liu, Max Tegmark
Neural scaling laws (NSL) refer to the phenomenon where model performance improves with scale.
no code implementations • 2 Sep 2023 • Ziming Liu, Jingcai Guo, Xiaocheng Lu, Song Guo, Peiran Dong, Jiewei Zhang
That is, in the process of inferring unseen classes, global features represent the principal direction of the image in the feature space, while local features should maintain uniqueness within a certain range.
1 code implementation • NeurIPS 2023 • Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, Tommi Jaakkola
Restart not only outperforms the previous best SDE results, but also accelerates the sampling speed by 10-fold / 2-fold on CIFAR-10 / ImageNet $64 \times 64$.
1 code implementation • 31 May 2023 • Ziming Liu, Patrick Obin Sturm, Saketh Bharadwaj, Sam Silva, Max Tegmark
Discovering conservation laws for a given dynamical system is important but challenging.
1 code implementation • 4 May 2023 • Ziming Liu, Eric Gan, Max Tegmark
We introduce Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
no code implementations • 2 May 2023 • Xiaocheng Lu, Ziming Liu, Song Guo, Jingcai Guo, Fushuo Huo, Sikai Bai, Tao Han
Compositional Zero-shot Learning (CZSL) aims to recognize novel concepts composed of known knowledge without training samples.
no code implementations • 5 Apr 2023 • Ziming Liu, Di Luo, Yilun Xu, Tommi Jaakkola, Max Tegmark
We introduce a general family, Generative Models from Physical Processes (GenPhys), where we translate partial differential equations (PDEs) describing physical processes to generative models.
1 code implementation • NeurIPS 2023 • Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark
We tentatively find that the frequency at which these quanta are used in the training distribution roughly follows a power law corresponding with the empirical scaling exponent for language models, a prediction of our theory.
1 code implementation • 8 Feb 2023 • Yilun Xu, Ziming Liu, Yonglong Tian, Shangyuan Tong, Max Tegmark, Tommi Jaakkola
The new models reduce to PFGM when $D{=}1$ and to diffusion models when $D{\to}\infty$.
Ranked #1 on Image Generation on FFHQ 64x64 - 4x upscaling
no code implementations • CVPR 2023 • Ziming Liu, Song Guo, Xiaocheng Lu, Jingcai Guo, Jiewei Zhang, Yue Zeng, Fushuo Huo
Recent studies usually approach multi-label zero-shot learning (MLZSL) with visual-semantic mapping on spatial-class correlation, which can be computationally costly, and worse still, fails to capture fine-grained class-specific semantics.
1 code implementation • CVPR 2023 • Xiaocheng Lu, Ziming Liu, Song Guo, Jingcai Guo
Existing methods either learn the combined state-object representation, challenging the generalization of unseen compositions, or design two classifiers to identify state and object separately from image features, ignoring the intrinsic relationship between them.
no code implementations • 19 Nov 2022 • Fushuo Huo, Wenchao Xu, Song Guo, Jingcai Guo, Haozhao Wang, Ziming Liu, Xiaocheng Lu
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space, which induces a tremendously large output space containing all possible state-object compositions.
1 code implementation • 24 Oct 2022 • Eric J. Michaud, Ziming Liu, Max Tegmark
We explore unique considerations involved in fitting ML models to data with very high precision, as is often required for science applications.
1 code implementation • 3 Oct 2022 • Ziming Liu, Eric J. Michaud, Max Tegmark
Grokking, the unusual phenomenon for algorithmic datasets where generalization happens long after overfitting the training data, has remained elusive.
1 code implementation • 22 Sep 2022 • Yilun Xu, Ziming Liu, Max Tegmark, Tommi Jaakkola
We interpret the data points as electrical charges on the $z=0$ hyperplane in a space augmented with an additional dimension $z$, generating a high-dimensional electric field (the gradient of the solution to the Poisson equation).
Ranked #31 on Image Generation on CIFAR-10
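The charge picture in the abstract above can be illustrated with a minimal sketch. It assumes the standard Coulomb-type field in $N+1$ dimensions, where each data point $y_i$ contributes $(\tilde{x} - \tilde{y}_i)/\lVert \tilde{x} - \tilde{y}_i \rVert^{N+1}$, and it omits the normalization constant. The function name `poisson_field` and the plain-list data layout are hypothetical and not taken from the paper's code.

```python
import math

def poisson_field(x_aug, data):
    """Empirical field at an augmented point x_aug (length N+1).

    Each data point (length N) is treated as a unit charge placed on the
    z = 0 hyperplane. The constant prefactor is omitted (assumption).
    The field is undefined if x_aug coincides with a charge.
    """
    n = len(x_aug) - 1                      # data dimension N
    field = [0.0] * (n + 1)
    for y in data:
        y_aug = list(y) + [0.0]             # embed the point at z = 0
        diff = [a - b for a, b in zip(x_aug, y_aug)]
        dist = math.sqrt(sum(d * d for d in diff))
        scale = 1.0 / dist ** (n + 1)       # inverse-power falloff in N+1 dims
        for k in range(n + 1):
            field[k] += scale * diff[k]
    return [f / len(data) for f in field]   # average over the charges

# A single charge at the origin of a 2-D data space, probed at height z = 2.
single = poisson_field([0.0, 0.0, 2.0], [[0.0, 0.0]])

# Two symmetric charges: the in-plane components cancel on the axis.
pair = poisson_field([0.0, 0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
```

In this sketch the field points away from the charges, so following it from the data plane moves a point outward; the averaging over points is a choice made here for readability.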
no code implementations • 6 Sep 2022 • Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, Yang You
Although the AI community has expanded the model scale to the trillion parameter level, the practical deployment of 10-100 billion parameter models is still uncertain due to the latency, throughput, and memory constraints.
no code implementations • 21 Aug 2022 • Jingcai Guo, Song Guo, Jie Zhang, Ziming Liu
Concretely, we maintain an edge-agnostic hidden model in the cloud server to estimate a less accurate but direction-aware inversion of the global model.
no code implementations • 9 Aug 2022 • Ziming Liu, Andrew M. Stuart, YiXuan Wang
We propose a sampling method based on an ensemble approximation of second order Langevin dynamics.
1 code implementation • 13 Jun 2022 • Feijie Wu, Song Guo, Zhihao Qu, Shiqi He, Ziming Liu, Jing Gao
The lack of inactive clients' updates in partial client participation makes it more likely for the model aggregation to deviate from the aggregation based on full client participation.
1 code implementation • 20 May 2022 • Ziming Liu, Ouail Kitouni, Niklas Nolte, Eric J. Michaud, Max Tegmark, Mike Williams
We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set.
no code implementations • 23 Mar 2022 • Ziming Liu, Varun Madhavan, Max Tegmark
We present a machine learning algorithm that discovers conservation laws from differential equations, both numerically (parametrized as neural networks) and symbolically, ensuring their functional independence (a non-linear generalization of linear independence).
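Functional independence, as mentioned in the abstract above, can be made precise in one common way. This is the standard textbook definition, offered here as an assumption about what the abstract intends: conserved quantities $H_1, \dots, H_k$ of a state $\mathbf{x} \in \mathbb{R}^n$ are functionally independent when their gradients are linearly independent at generic points, so that the Jacobian has full row rank.

$$
\operatorname{rank}
\begin{pmatrix}
\nabla H_1(\mathbf{x})^\top \\
\vdots \\
\nabla H_k(\mathbf{x})^\top
\end{pmatrix}
= k \quad \text{for almost every } \mathbf{x}.
$$

For linear functions $H_i(\mathbf{x}) = \mathbf{a}_i^\top \mathbf{x}$ the gradients are the constant vectors $\mathbf{a}_i$, so the condition reduces to ordinary linear independence.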
no code implementations • 7 Mar 2022 • Ziming Liu, Song Guo, Jingcai Guo, Yuanyuan Xu, Fushuo Huo
We argue that disregarding the connection between major and minor classes, i.e., those corresponding to the global and local information, respectively, is the cause of the problem.
1 code implementation • 17 Dec 2021 • Feijie Wu, Song Guo, Haozhao Wang, Zhihao Qu, Haobo Zhang, Jie Zhang, Ziming Liu
In the setting of federated optimization, where a global model is aggregated periodically, step asynchronism occurs when participants conduct model training by efficiently utilizing their computational resources.
no code implementations • NeurIPS Workshop AI4Scien 2021 • Ziming Liu, Yunyue Chen, Yuanqi Du, Max Tegmark
Integrating physical inductive biases into machine learning can improve model generalizability.
no code implementations • 20 Sep 2021 • Ziming Liu, Max Tegmark
We present an automated method for finding hidden symmetries, defined as symmetries that become manifest only in a new coordinate system that must be discovered.
no code implementations • 31 May 2021 • Ziming Liu, Bohan Wang, Qi Meng, Wei Chen, Max Tegmark, Tie-Yan Liu
Energy conservation is a basic physics principle, the breakdown of which often implies new physics.
no code implementations • 9 Nov 2020 • Ziming Liu, Max Tegmark
We present AI Poincaré, a machine learning algorithm for auto-discovering conserved quantities using trajectory data from unknown dynamical systems.
no code implementations • 13 Jun 2020 • Ziming Liu, Guangyu Gao, Lin Sun, Zhiyuan Fang
By extracting various features from high to low resolutions, the MD-IPN is able to improve the performance of small object detection while maintaining the performance on middle and large objects.
no code implementations • 13 Jun 2020 • Ziming Liu, Guangyu Gao, A. K. Qin, Jinyang Li
Finally, the DTG-Net is evaluated in two ways: (i) the self-supervised DTG-Net to pre-train the supervised action recognition models with only unlabeled videos; (ii) the supervised DTG-Net to be jointly trained with the supervised action networks in an end-to-end way.
no code implementations • 8 Jun 2020 • Ziming Liu, Sitian Qian, Yi-Xuan Wang, Yuxuan Yan, Tianyi Yang
Counterintuitively, by drawing the connection between PCA and Schrödinger equation, we can not only attack the undersampling challenge but also compute in an efficient and decoupled way with the proposed algorithm called Schrödinger PCA.
no code implementations • 6 Dec 2019 • Ziming Liu, Yi-Xuan Wang, Zizhao Han, Dian Wu
Finally, both the original model and the perturbed model are tested on regional examples, as validations of our models.
1 code implementation • 4 Dec 2019 • Ziming Liu, Zheng Zhang
Hamiltonian Monte Carlo (HMC) is an efficient Bayesian sampling method that can make distant proposals in the parameter space by simulating a Hamiltonian dynamical system.
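A minimal sketch of plain HMC may help make "simulating a Hamiltonian dynamical system" concrete. It uses the textbook leapfrog integrator with a Metropolis correction on a one-dimensional standard Gaussian target. This is not the method proposed in the paper above, and the names `leapfrog`, `hmc_step`, and `sample`, as well as the step size and trajectory length, are illustrative assumptions.

```python
import math
import random

def leapfrog(q, p, grad_u, step_size, n_steps):
    """Integrate Hamiltonian dynamics for H = U(q) + p^2 / 2."""
    p = p - 0.5 * step_size * grad_u(q)        # initial half step for momentum
    for i in range(n_steps):
        q = q + step_size * p                  # full step for position
        if i < n_steps - 1:
            p = p - step_size * grad_u(q)      # full step for momentum
    p = p - 0.5 * step_size * grad_u(q)        # final half step for momentum
    return q, p

def hmc_step(q, u, grad_u, step_size, n_steps, rng):
    """One HMC transition: resample momentum, simulate, accept or reject."""
    p0 = rng.gauss(0.0, 1.0)
    q_new, p_new = leapfrog(q, p0, grad_u, step_size, n_steps)
    h_old = u(q) + 0.5 * p0 * p0
    h_new = u(q_new) + 0.5 * p_new * p_new
    # Metropolis correction for the integrator's energy error.
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new
    return q

def sample(n_samples, seed=0):
    """Draw samples from a standard Gaussian, U(q) = q^2 / 2 (toy target)."""
    rng = random.Random(seed)
    u = lambda q: 0.5 * q * q
    grad_u = lambda q: q
    q, out = 0.0, []
    for _ in range(n_samples):
        q = hmc_step(q, u, grad_u, step_size=0.2, n_steps=10, rng=rng)
        out.append(q)
    return out

samples = sample(1000)
```

Because the leapfrog integrator nearly conserves the Hamiltonian, a long trajectory can end far from its starting point and still be accepted with high probability, which is what allows the distant proposals the abstract refers to.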
no code implementations • 2 Dec 2019 • Ziming Liu, Guangyu Gao, Lin Sun, Li Fang
In this paper, besides the top-down combining of information for shallow layers, we propose a novel network called Image Pyramid Guidance Network (IPG-Net) to make sure both the spatial information and semantic information are abundant for each layer.
no code implementations • 21 Nov 2019 • Ziming Liu, Xiaobo Liu
The traditional PCA fault detection methods completely depend on the training data.