Search Results for author: Dian Jiao

Found 4 papers, 0 papers with code

Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

no code implementations · 23 May 2024 · Shuaipeng Li, Penghao Zhao, Hailin Zhang, Xingwu Sun, Hao Wu, Dian Jiao, Weiyan Wang, Chengjun Liu, Zheng Fang, Jinbao Xue, Yangyu Tao, Bin Cui, Di Wang

First, we derive a scaling law between batch size and optimal learning rate in the sign-of-gradient case, proving that the optimal learning rate first rises and then falls as the batch size increases.
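The rise-then-fall ("surge") shape can be illustrated with a toy curve. The functional form below is a hypothetical stand-in chosen only because it peaks at a noise-scale batch size `b_noise`, not the paper's exact expression; `b_noise` and `eta_max` are illustrative parameters.

```python
import math

def optimal_lr(batch_size, b_noise=256.0, eta_max=0.01):
    """Illustrative surge-shaped curve (hypothetical form, not the
    paper's derived expression): rises for B < b_noise, peaks at
    B = b_noise, then falls as B grows further."""
    r = batch_size / b_noise
    return eta_max * 2.0 * math.sqrt(r) / (1.0 + r)

# Sweep batch sizes: the optimal learning rate first rises, then falls.
curve = {b: optimal_lr(b) for b in (32, 128, 256, 1024, 4096)}
```

Evaluating the sweep shows the claimed non-monotonic behavior: the value at `B = 256` exceeds both the value at `B = 32` and the value at `B = 4096`.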

DuetRAG: Collaborative Retrieval-Augmented Generation

no code implementations · 12 May 2024 · Dian Jiao, Li Cai, Jingsheng Huang, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

Retrieval-Augmented Generation (RAG) methods augment the input of Large Language Models (LLMs) with relevant retrieved passages, reducing factual errors in knowledge-intensive tasks.
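A minimal sketch of the generic RAG pipeline the abstract describes (not DuetRAG's collaborative method): a toy bag-of-words cosine retriever ranks passages against the query, and the top-k hits are prepended to the prompt before it reaches the LLM. All names and the corpus here are illustrative.

```python
import math
from collections import Counter

def cosine(a, b):
    # cosine similarity between two bag-of-words Counters
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, passages, k=2):
    # rank passages by similarity to the query, return the top k
    q = Counter(query.lower().split())
    ranked = sorted(passages,
                    key=lambda p: cosine(q, Counter(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    # augment the LLM input with the retrieved passages
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Paris is the capital of France.",
    "The Transformer architecture relies on self-attention.",
    "RAG augments LLM inputs with retrieved passages.",
]
prompt = build_prompt("What is the capital of France?", corpus, k=1)
```

Grounding the prompt in retrieved text is what reduces factual errors: the model answers from supplied evidence rather than from parametric memory alone.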

GraphControl: Adding Conditional Control to Universal Graph Pre-trained Models for Graph Domain Transfer Learning

no code implementations · 11 Oct 2023 · Yun Zhu, Yaoke Wang, Haizhou Shi, Zhenshuo Zhang, Dian Jiao, Siliang Tang

These pre-trained models can be applied to various downstream Web applications, saving training time and improving downstream (target) performance.

Tasks: Attribute Specificity +1

Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent

no code implementations · 6 Mar 2023 · Xiaonan Nie, Yi Liu, Fangcheng Fu, Jinbao Xue, Dian Jiao, Xupeng Miao, Yangyu Tao, Bin Cui

Recent years have witnessed the unprecedented achievements of large-scale pre-trained models, especially the Transformer models.

Tasks: Management, Scheduling
