1 code implementation • 29 Mar 2024 • Peng Ding, Jiading Fang, Peng Li, Kangrui Wang, Xiaochen Zhou, Mo Yu, Jing Li, Matthew R. Walter, Hongyuan Mei
The task is question-answering: for each maze, a large language model reads the walkthrough and answers hundreds of mapping and navigation questions such as "How should you go to Attic from West of House?"
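A minimal sketch of this question-answering setup is given below. The prompt template and the `query_llm` helper are illustrative assumptions standing in for any LLM API, not the benchmark's actual code.

```python
# Hedged sketch of the mapping/navigation QA task: the model reads a game
# walkthrough and answers a navigation question about the implied map.

def query_llm(prompt: str) -> str:
    """Hypothetical placeholder for a call to a large language model."""
    raise NotImplementedError

def answer_navigation_question(walkthrough: str, question: str) -> str:
    prompt = (
        "You are given a walkthrough of a text game:\n"
        f"{walkthrough}\n\n"
        f"Question: {question}\n"
        "Answer with a sequence of movement actions."
    )
    return query_llm(prompt)

# Example query in the style of the benchmark:
# answer_navigation_question(walkthrough_text,
#                            "How should you go to Attic from West of House?")
```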
1 code implementation • 17 May 2023 • Mo Yu, Jiangnan Li, Shunyu Yao, Wenjie Pang, Xiaochen Zhou, Zhou Xiao, Fandong Meng, Jie Zhou
As readers engage with a story, their understanding of a character evolves with new events and information, and multiple fine-grained aspects of the character's personality can be perceived.

no code implementations • 9 May 2023 • Xiaochen Zhou, Bosheng Li, Bedrich Benes, Songlin Fei, Sören Pirk
We use a neural network pipeline to train a situated latent space that allows us to locally predict branch growth based only on a single node in the branch graph of a tree model.
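The sketch below illustrates the "predict locally from a single node" idea with a small PyTorch MLP. The 7-D node feature layout (e.g., position, direction, thickness) and the 3-D growth offset are assumptions for illustration; the paper's pipeline additionally learns a situated latent space.

```python
import torch
import torch.nn as nn

# Hedged sketch: map one branch-graph node's features to a growth offset
# for the next node. Feature dimensions are illustrative assumptions.

class LocalGrowthPredictor(nn.Module):
    def __init__(self, node_dim: int = 7, hidden: int = 64, out_dim: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(node_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),  # predicted 3-D growth offset
        )

    def forward(self, node_features: torch.Tensor) -> torch.Tensor:
        return self.net(node_features)

model = LocalGrowthPredictor()
offset = model(torch.randn(1, 7))  # growth prediction for a single node
```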
no code implementations • 6 Apr 2023 • Chen Feng Tsai, Xiaochen Zhou, Sierra S. Liu, Jing Li, Mo Yu, Hongyuan Mei
Large language models (LLMs) such as ChatGPT and GPT-4 have recently demonstrated remarkable abilities in communicating with human users.
no code implementations • 29 Sep 2021 • Xiaochen Zhou, Yuchuan Tian, Xudong Wang
Moreover, to prevent the compact model from forgetting knowledge of the source data during knowledge distillation, a collaborative knowledge distillation (Co-KD) method is developed that unifies the source data on the server with the target data on the edge device to train the compact model.
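A minimal sketch of this collaborative-distillation idea follows. It uses the standard soft-label distillation term (Hinton et al.); the temperature and the unweighted sum of the source-side and target-side terms are illustrative choices, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T: float = 2.0):
    # Standard soft-label knowledge-distillation term with temperature T.
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

def co_kd_loss(student, teacher, source_batch, target_batch):
    # Unify source data (server side) and target data (edge side) in one
    # objective so the compact student retains source knowledge.
    with torch.no_grad():  # teacher provides fixed soft targets
        t_src = teacher(source_batch)
        t_tgt = teacher(target_batch)
    return kd_loss(student(source_batch), t_src) + \
           kd_loss(student(target_batch), t_tgt)
```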
no code implementations • 1 Jan 2021 • Xiaochen Zhou, Xudong Wang
Theoretical analysis shows: 1) DSPGD with CEM converges at an $O(1/T)$ rate, where $T$ is the number of iterations; 2) the communication cost of DSPGD with CEM is independent of the number of data samples; 3) the clustering loss of the federated kernel $k$-means can approach that of the centralized kernel $k$-means.
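For reference, the clustering loss that the federated variant is said to approach is the standard centralized kernel $k$-means objective, written below in our own notation (not the paper's):

```latex
% Standard centralized kernel k-means objective; \phi maps points into
% feature space, C_c is cluster c, and m_c is its feature-space centroid.
L = \sum_{c=1}^{k} \sum_{x_i \in C_c} \left\| \phi(x_i) - m_c \right\|^2,
\qquad
m_c = \frac{1}{|C_c|} \sum_{x_j \in C_c} \phi(x_j).
% By the kernel trick, each squared distance needs only kernel evaluations:
\left\| \phi(x_i) - m_c \right\|^2
  = K(x_i, x_i)
  - \frac{2}{|C_c|} \sum_{x_j \in C_c} K(x_i, x_j)
  + \frac{1}{|C_c|^2} \sum_{x_j, x_l \in C_c} K(x_j, x_l).
```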