no code implementations • 16 Apr 2024 • Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
In this paper, we showcase HLAT: a 7 billion parameter decoder-only LLM pre-trained using trn1 instances over 1.8 trillion tokens.
1 code implementation • 4 Jul 2023 • Dongsheng Luo, Yuchen Bian, Yaowei Yan, Xiong Yu, Jun Huan, Xiao Liu, Xiang Zhang
To take advantage of rich information in multiple networks and make better inferences on entities, in this study, we propose random walk on multiple networks, RWM.
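As an illustration of the general idea, here is a minimal sketch of coupled random walks with restart on multiple networks. The coupling scheme (nudging each walker toward the average of the others' distributions, controlled by a hypothetical `couple` parameter) is a simplification for exposition, not the paper's actual RWM update rule.

```python
import numpy as np

def random_walk_with_restart(A, seed, restart=0.15, iters=100):
    """Visiting distribution of a restarting walker on a single network."""
    P = A / A.sum(axis=0, keepdims=True)      # column-stochastic transitions
    r = np.zeros(len(A)); r[seed] = 1.0       # restart at the seed node
    x = r.copy()
    for _ in range(iters):
        x = (1 - restart) * P @ x + restart * r
    return x

def rwm_scores(networks, seed, restart=0.15, couple=0.5, iters=100):
    """Run one walker per network; each step blends every walker's
    distribution with the average of the others, so evidence is shared
    across networks before the walk proceeds."""
    Ps = [A / A.sum(axis=0, keepdims=True) for A in networks]
    r = np.zeros(len(networks[0])); r[seed] = 1.0
    xs = [r.copy() for _ in Ps]
    for _ in range(iters):
        avg = np.mean(xs, axis=0)
        xs = [(1 - restart) * P @ ((1 - couple) * x + couple * avg) + restart * r
              for P, x in zip(Ps, xs)]
    return np.mean(xs, axis=0)                # aggregate relevance scores
```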
no code implementations • 20 Dec 2022 • Siyu Huang, Tianyang Wang, Haoyi Xiong, Bihan Wen, Jun Huan, Dejing Dou
Inspired by the fact that samples with higher loss are usually more informative to the model than samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur a high loss.
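The selection principle can be sketched as follows. The snippet uses predictive entropy as a crude stand-in for the (unknown) loss on unlabeled samples; the paper's actual loss-estimation mechanism is different, so treat this only as an illustration of querying by estimated loss.

```python
import numpy as np

def query_by_estimated_loss(probs, k):
    """Select the k unlabeled samples whose predictive entropy (a simple
    proxy for the model's loss on them) is largest.

    probs: (n_samples, n_classes) predicted class probabilities.
    Returns indices of the k samples to send to the oracle."""
    eps = 1e-12
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return np.argsort(-entropy)[:k]
```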
1 code implementation • 19 Jul 2022 • Linbo Liu, Youngsuk Park, Trong Nghia Hoang, Hilaf Hasson, Jun Huan
This work studies the threats of adversarial attack on multivariate probabilistic forecasting models and viable defense mechanisms.
1 code implementation • ICCV 2021 • Siyu Huang, Tianyang Wang, Haoyi Xiong, Jun Huan, Dejing Dou
To lower the cost of data annotation, active learning has been proposed to interactively query an oracle to annotate a small proportion of informative samples in an unlabeled dataset.
1 code implementation • 9 Nov 2020 • Dongsheng Luo, Yuchen Bian, Xiang Zhang, Jun Huan
A social recommendation system predicts unobserved user-item ratings by exploiting both user-user social relations and observed user-item ratings.
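A classic baseline for this setting is matrix factorization with a social regularizer that pulls each user's latent vector toward the average of their social neighbors. The sketch below shows that generic baseline, not the model proposed in the paper; all parameter names and values are illustrative.

```python
import numpy as np

def social_mf(R, S, rank=4, lam=0.1, beta=0.1, lr=0.01, epochs=200, seed=0):
    """Factorize the rating matrix R ~ U @ V.T on observed entries, while a
    social term pulls each user's latent vector toward the mean of their
    neighbors' vectors (neighbors given by the adjacency matrix S)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    mask = R > 0                               # observed ratings only
    deg = np.maximum(S.sum(axis=1, keepdims=True), 1)
    for _ in range(epochs):
        E = mask * (U @ V.T - R)               # error on observed entries
        social_pull = U - (S @ U) / deg        # deviation from neighbor mean
        U -= lr * (E @ V + lam * U + beta * social_pull)
        V -= lr * (E.T @ U + lam * V)
    return U, V
```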
1 code implementation • 17 Jul 2020 • Siyu Huang, Haoyi Xiong, Zhi-Qi Cheng, Qingzhong Wang, Xingran Zhou, Bihan Wen, Jun Huan, Dejing Dou
Generation of high-quality person images is challenging, due to the sophisticated entanglements among image factors, e.g., appearance, pose, foreground, background, local details, global structures, etc.
1 code implementation • 17 Mar 2020 • Siyu Huang, Haoyi Xiong, Tianyang Wang, Bihan Wen, Qingzhong Wang, Zeyu Chen, Jun Huan, Dejing Dou
This paper further presents a real-time feed-forward model to leverage Style Projection for arbitrary image style transfer, which includes a regularization term for matching the semantics between input contents and stylized outputs.
no code implementations • 5 Dec 2019 • Jie An, Haoyi Xiong, Jun Huan, Jiebo Luo
Our method consists of a construction step (C-step) to build a photorealistic stylization network and a pruning step (P-step) for acceleration.
no code implementations • 27 Nov 2019 • Zhi Feng, Haoyi Xiong, Chuanyuan Song, Sijia Yang, Baoxin Zhao, Licheng Wang, Zeyu Chen, Shengwen Yang, Li-Ping Liu, Jun Huan
Our experiments on real-world data showed that SecureGBM secures the communication and computation of the LightGBM training and inference procedures for both parties while losing less than 3% AUC, using the same number of gradient boosting iterations, on a wide range of benchmark datasets.
no code implementations • 18 Nov 2019 • Ruosi Wan, Haoyi Xiong, Xingjian Li, Zhanxing Zhu, Jun Huan
The empirical results show that the proposed descent direction estimation strategy, DTNH, consistently improves the performance of deep transfer learning tasks with all of the above regularizers, even when transferring pre-trained weights from an inappropriate network.
no code implementations • ICLR 2020 • Isaac Ahern, Adam Noack, Luis Guzman-Nateras, Dejing Dou, Boyang Li, Jun Huan
The problem of explaining deep learning models, and model predictions generally, has attracted intensive interest recently.
no code implementations • 6 Jul 2019 • Jie An, Haoyi Xiong, Jiebo Luo, Jun Huan, Jinwen Ma
Given a pair of images as the source of content and the reference of style, existing solutions usually first train an auto-encoder (AE) to reconstruct the image using deep features, and then embed pre-defined style transfer modules into the AE reconstruction procedure to transfer the style of the reconstructed image by modifying the deep features.
no code implementations • 25 Jun 2019 • Hanchao Wang, Jun Huan
Recent progress in Generative Adversarial Networks (GANs) has shown promising signs of improving GAN training via architectural change.
1 code implementation • ICML 2020 • Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu
The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning.
no code implementations • 6 Jun 2019 • Jie An, Haoyi Xiong, Jinwen Ma, Jiebo Luo, Jun Huan
Finally, compared to existing universal style transfer networks for photorealistic rendering, such as PhotoWCT, which stacks multiple well-trained auto-encoders and WCT transforms in a non-end-to-end manner, the architectures designed by StyleNAS produce better style-transferred images with details preserved, use a tiny number of operators/parameters, and enjoy around a 500x inference speed-up.
no code implementations • ICLR 2019 • Haoyi Xiong, Wenqing Hu, Zhanxing Zhu, Xinjian Li, Yunchao Zhang, Jun Huan
Derivative-free optimization (DFO) using trust region methods is frequently used in machine learning applications, such as (hyper-)parameter optimization, where the derivatives of the objective function are unavailable.
no code implementations • ICLR 2020 • Yingzhen Yang, Jiahui Yu, Nebojsa Jojic, Jun Huan, Thomas S. Huang
FSNet has the same architecture as the baseline CNN to be compressed, and in the forward process each convolution layer of FSNet obtains the same number of filters from the filter summary (FS) as the corresponding layer of the baseline CNN.
no code implementations • 3 Feb 2019 • Yingzhen Yang, Jiahui Yu, Xingjian Li, Jun Huan, Thomas S. Huang
In this paper, we investigate the role of Rademacher complexity in improving generalization of DNNs and propose a novel regularizer rooted in Local Rademacher Complexity (LRC).
2 code implementations • ICLR 2019 • Xingjian Li, Haoyi Xiong, Hanchao Wang, Yuxuan Rao, Li-Ping Liu, Zeyu Chen, Jun Huan
Instead of constraining the weights of neural network, DELTA aims to preserve the outer layer outputs of the target network.
no code implementations • 18 Jan 2019 • Wenqing Hu, Zhanxing Zhu, Haoyi Xiong, Jun Huan
We show in this case that the quasi-potential function is related to the noise covariance structure of SGD via a partial differential equation of Hamilton-Jacobi type.
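For reference, in the classical Freidlin-Wentzell large-deviation setting, which the statement above parallels (the paper's exact formulation may differ), the quasi-potential $Q$ of a gradient drift $b(x) = -\nabla f(x)$ perturbed by noise with covariance $\Sigma(x)$ solves a Hamilton-Jacobi equation of the form

$$\frac{1}{2}\,\nabla Q(x)^{\top}\,\Sigma(x)\,\nabla Q(x)\;-\;\nabla f(x)\cdot\nabla Q(x)\;=\;0,$$

so the noise covariance $\Sigma$ enters the equation for $Q$ directly, which is the sense in which the quasi-potential is tied to the noise structure of SGD.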
no code implementations • 8 Sep 2018 • Tianyang Wang, Jun Huan, Michelle Zhu
It makes use of pre-trained models learned from a source domain and applies these models to tasks in a target domain.
no code implementations • 1 Sep 2018 • Tianyang Wang, Jun Huan, Bo Li
In this paper, we demonstrate that deep learning models such as convolutional neural networks may not favor all training samples, and generalization accuracy can be further improved by dropping those unfavorable samples.
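One simple way to realize "dropping unfavorable samples" is to warm-fit a model, score every training sample by its loss, and discard the highest-loss fraction before retraining. The criterion below (post-warm-up loss on a tiny logistic model) is a stand-in for exposition, not necessarily the paper's selection rule.

```python
import numpy as np

def fit_logreg(X, y, lr=0.1, epochs=500):
    """Plain gradient-descent logistic regression (warm-up model)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def drop_unfavorable(X, y, frac=0.1):
    """Warm-fit on all samples, then drop the frac of samples whose
    per-sample log loss is highest; return the kept subset for retraining."""
    w = fit_logreg(X, y)
    p = np.clip(1 / (1 + np.exp(-X @ w)), 1e-12, 1 - 1e-12)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    keep = np.argsort(loss)[: int(len(y) * (1 - frac))]
    return X[keep], y[keep]
```

On data with a few mislabeled points, those points tend to sit at the top of the loss ranking and get removed first.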
no code implementations • 6 Apr 2018 • Hannah Kim, Denys Katerenchuk, Daniel Billet, Jun Huan, Haesun Park, Boyang Li
Understanding narrative content has become an increasingly popular topic.
no code implementations • 3 Jul 2017 • Chao Lan, Jun Huan
We observe standard transfer learning can improve prediction accuracies of target tasks at the cost of lowering their prediction fairness -- a phenomenon we named discriminatory transfer.
no code implementations • 3 Apr 2017 • Chao Lan, Sai Nivedita Chandrasekaran, Jun Huan
In cheminformatics, compound-target binding profiles have been a main source of data for research.
no code implementations • 16 Jul 2016 • Chao Lan, Yuhao Yang, Xiao-Li Li, Bo Luo, Jun Huan
Based on extensive automatic and manual experimental evaluations, we deliver two major findings. First, multi-view clustering techniques outperform common single-view clustering techniques, which either use only one view or naively integrate all views for detection. Second, the standard multi-view clustering technique is less robust than our modified technique, which selectively transfers information across views under the assumption that sparse network structures are (potentially) incomplete.