no code implementations • EMNLP 2020 • Zhengjue Wang, Zhibin Duan, Hao Zhang, Chaojie Wang, Long Tian, Bo Chen, Mingyuan Zhou
Abstractive document summarization is a comprehensive task that comprises document understanding and summary generation, an area in which Transformer-based models have achieved state-of-the-art performance.
no code implementations • 16 Mar 2023 • Xinyang Liu, Dongsheng Wang, Miaoge Li, Zhibin Duan, Yishi Xu, Bo Chen, Mingyuan Zhou
For downstream applications of pre-trained vision-language models, there has been significant interest in constructing effective prompts.
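As a rough illustration of what "constructing prompts" can mean here, the sketch below shows CoOp-style soft prompt learning, where a few context vectors are learned end-to-end and prepended to each class name's token embeddings; all names and shapes are illustrative assumptions, not this paper's actual method.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Hypothetical sketch of prompt learning for a vision-language model.

    Instead of a hand-written template ("a photo of a <class>"), a few
    context vectors are learned and prepended to each class name's token
    embeddings (CoOp-style; illustration only, not the paper's design).
    """
    def __init__(self, n_ctx=4, d_embed=512):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, d_embed) * 0.02)

    def forward(self, class_embed):
        # class_embed: (n_classes, n_tokens, d_embed) class-name token embeddings
        n_classes = class_embed.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        return torch.cat([ctx, class_embed], dim=1)  # learned context + class tokens

prompt = SoftPrompt()
class_embed = torch.randn(10, 3, 512)   # 10 classes, 3 tokens per class name
print(prompt(class_embed).shape)        # torch.Size([10, 7, 512])
```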
1 code implementation • 16 Oct 2022 • Yishi Xu, Dongsheng Wang, Bo Chen, Ruiying Lu, Zhibin Duan, Mingyuan Zhou
Thanks to the tree-like geometry of hyperbolic space, the underlying semantic hierarchy among words and topics can be better exploited to mine more interpretable topics.
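A minimal sketch of the geometric intuition, assuming a Poincaré-ball embedding of words and topics (the function and toy points below are illustrative, not the paper's code):

```python
import numpy as np

def poincare_distance(u, v, eps=1e-7):
    """Geodesic distance between points u, v in the Poincare ball (||x|| < 1).

    Distances grow without bound near the boundary, which is what makes the
    space tree-like: leaf words can sit near the boundary while more general
    topics stay near the origin, close to all of their descendants.
    """
    sq = np.sum((u - v) ** 2)
    den = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / (den + eps))

# Toy check: a general topic near the origin stays moderately close to all
# leaf words, while two leaf words near the boundary are far from each other.
root = np.array([0.01, 0.0])
leaf_a = np.array([0.95, 0.0])
leaf_b = np.array([-0.95, 0.0])
print(poincare_distance(root, leaf_a))    # ~3.6
print(poincare_distance(leaf_a, leaf_b))  # ~7.3, much larger
```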
1 code implementation • 20 Sep 2022 • Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen, Mingyuan Zhou
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
1 code implementation • NeurIPS 2021 • Zhibin Duan, Yishi Xu, Bo Chen, Dongsheng Wang, Chaojie Wang, Mingyuan Zhou
Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner and automatically organize them into a topic hierarchy.
no code implementations • 29 Sep 2021 • Shujian Zhang, Zhibin Duan, Huangjie Zheng, Pengcheng He, Bo Chen, Weizhu Chen, Mingyuan Zhou
Crossformer with state sharing not only provides the desired cross-layer guidance and regularization but also reduces memory requirements.
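A hypothetical sketch of what cross-layer state sharing can look like in a Transformer: the second attention layer reuses the first layer's input states as its keys and values, which couples the layers (guidance and regularization) and avoids caching a second set of key/value states (memory). This illustrates the general mechanism only, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SharedStateAttention(nn.Module):
    """Hypothetical sketch: two attention layers sharing key/value states."""

    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn1 = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # Layer 1: ordinary self-attention; x is query, key, and value.
        h, _ = self.attn1(x, x, x)
        # Layer 2: queries come from h, but keys/values are the *shared*
        # lower-layer states x, so no second K/V cache is needed.
        out, _ = self.attn2(h, x, x)
        return out

x = torch.randn(2, 10, 64)               # (batch, seq_len, d_model)
print(SharedStateAttention()(x).shape)   # torch.Size([2, 10, 64])
```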
1 code implementation • ACL 2021 • Zhibin Duan, Hao Zhang, Chaojie Wang, Zhengjue Wang, Bo Chen, Mingyuan Zhou
As a result, the backbone learns the knowledge shared among all clusters, while the modulated weights extract cluster-specific features.
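A minimal sketch of the shared-backbone-plus-modulation idea, assuming FiLM-style element-wise scaling of one shared weight matrix (the names and design below are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class ClusterModulatedLayer(nn.Module):
    """Hypothetical sketch: one shared weight matrix, modulated per cluster.

    The shared backbone weight W captures knowledge common to all clusters;
    a learned per-cluster scale gamma_k reweights W's input dimensions to
    extract cluster-specific features (illustration only).
    """
    def __init__(self, d_in, d_out, n_clusters):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d_out, d_in) * 0.02)   # shared backbone
        self.gamma = nn.Parameter(torch.ones(n_clusters, d_in))  # per-cluster modulation

    def forward(self, x, cluster_id):
        # Element-wise modulation of the shared weights for this cluster.
        W_k = self.W * self.gamma[cluster_id]  # broadcasts to (d_out, d_in)
        return x @ W_k.t()

layer = ClusterModulatedLayer(d_in=16, d_out=8, n_clusters=3)
x = torch.randn(4, 16)
print(layer(x, cluster_id=0).shape)  # torch.Size([4, 8])
```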
1 code implementation • 30 Jun 2021 • Zhibin Duan, Dongsheng Wang, Bo Chen, Chaojie Wang, Wenchao Chen, Yewen Li, Jie Ren, Mingyuan Zhou
However, they often assume in their priors that the topics at each layer are drawn independently from a Dirichlet distribution, ignoring dependencies between topics both within the same layer and across different layers.
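The independence assumption being criticized can be written out as follows; the notation is illustrative, with topic k at layer l given by a distribution over a V-word vocabulary:

```latex
% Typical prior in deep hierarchical topic models: every topic at every
% layer is drawn i.i.d. from a symmetric Dirichlet, so nothing couples
% topics within a layer or across layers.
\phi_k^{(l)} \overset{\text{iid}}{\sim}
    \operatorname{Dirichlet}\!\left(\eta \mathbf{1}_V\right),
\qquad l = 1, \dots, L, \quad k = 1, \dots, K_l
```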