no code implementations • 27 Apr 2024 • Kaixuan Huang, Yuanhao Qu, Henry Cousins, William A. Johnson, Di Yin, Mihir Shah, Denny Zhou, Russ Altman, Mengdi Wang, Le Cong
We showcase the potential of CRISPR-GPT for assisting non-expert researchers with gene-editing experiments from scratch and validate the agent's effectiveness in a real-world use case.
1 code implementation • 19 Feb 2024 • Junru Lu, Siyu An, Min Zhang, Yulan He, Di Yin, Xing Sun
In the quest to make the deep intelligence of Large Language Models (LLMs) accessible in end user-bot interactions, the art of prompt crafting emerges as a critical yet complex task for the average user.
2 code implementations • 19 Dec 2023 • Chaoyou Fu, Renrui Zhang, Zihan Wang, Yubo Huang, Zhengye Zhang, Longtian Qiu, Gaoxiang Ye, Yunhang Shen, Mengdan Zhang, Peixian Chen, Sirui Zhao, Shaohui Lin, Deqiang Jiang, Di Yin, Peng Gao, Ke Li, Hongsheng Li, Xing Sun
They endow Large Language Models (LLMs) with powerful capabilities in visual understanding, enabling them to tackle diverse multi-modal tasks.
1 code implementation • 18 Dec 2023 • Bing Wang, Changyu Ren, Jian Yang, Xinnian Liang, Jiaqi Bai, Linzheng Chai, Zhao Yan, Qian-Wen Zhang, Di Yin, Xing Sun, Zhoujun Li
Our framework comprises a core decomposer agent for Text-to-SQL generation with few-shot chain-of-thought reasoning, accompanied by two auxiliary agents that utilize external tools or models to acquire smaller sub-databases and refine erroneous SQL queries.
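The decompose-generate-refine loop described above can be sketched as a toy pipeline. All names below (`decompose`, `generate_sql`, `refine`) and the string-matching logic are hypothetical stand-ins: in the actual framework each agent is driven by few-shot chain-of-thought LLM prompts and real tool calls, not hand-written rules.

```python
# Hypothetical sketch of the multi-agent Text-to-SQL loop: a decomposer
# splits the question, a generator drafts SQL per sub-question, and a
# refiner patches queries when the executor reports an error.

def decompose(question: str) -> list[str]:
    """Toy decomposer: split a compound question into sub-questions."""
    return [q.strip() + "?" for q in question.rstrip("?").split(" and ")]

def generate_sql(sub_question: str) -> str:
    """Stand-in for the LLM generator: map a sub-question to a draft query."""
    table = "orders" if "order" in sub_question else "users"  # assumed schema
    return f"SELECT COUNT(*) FROM {table}"

def refine(sql: str, error: str) -> str:
    """Toy refiner agent: rewrite a query in response to an executor error."""
    if error and "no such table: users" in error:
        return sql.replace("users", "customers")  # assumed schema correction
    return sql

question = "How many users signed up and how many orders were placed"
drafts = [generate_sql(q) for q in decompose(question)]
final = [refine(sql, "no such table: users") for sql in drafts]
```

The design point is the separation of concerns: decomposition keeps each generation step small, while refinement uses executor feedback rather than asking the generator to be right on the first try.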
no code implementations • 18 Oct 2023 • Siyu An, Ye Liu, Haoyuan Peng, Di Yin
Extracting structured information from videos is critical for numerous downstream applications in the industry.
1 code implementation • 23 Aug 2023 • Zhen Zhao, Ye Liu, Meng Zhao, Di Yin, Yixuan Yuan, Luping Zhou
Studies on semi-supervised medical image segmentation (SSMIS) have seen fast progress recently.
1 code implementation • 16 Aug 2023 • Junru Lu, Siyu An, Mingbao Lin, Gabriele Pergola, Yulan He, Di Yin, Xing Sun, Yunsheng Wu
We propose MemoChat, a pipeline for refining instructions that enables large language models (LLMs) to effectively employ self-composed memos for maintaining consistent long-range open-domain conversations.
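The memo mechanism can be illustrated with a minimal sketch: summarize past conversation chunks into memos, then retrieve the most relevant memo before answering a new query. The keyword-overlap scoring and function names here are assumptions for illustration; in MemoChat the LLM itself composes and selects memos via the refined instructions.

```python
# Toy sketch of a self-composed memo store for long conversations.
# write_memo stands in for LLM summarization; retrieve stands in for
# the LLM's memo-selection step.

def write_memo(turns):
    """Stand-in for LLM summarization: keep the words of a chunk as keys."""
    words = " ".join(turns).lower().split()
    return {"keywords": set(words), "content": " / ".join(turns)}

def retrieve(memos, query):
    """Pick the memo with the largest keyword overlap with the new query."""
    q = set(query.lower().split())
    return max(memos, key=lambda m: len(m["keywords"] & q))

memos = [
    write_memo(["I adopted a cat named Miso", "Miso likes tuna"]),
    write_memo(["I work on compilers", "Mostly LLVM passes"]),
]
hit = retrieve(memos, "What does my cat eat?")  # recalls the Miso memo
```

The consistency benefit comes from conditioning each reply on a retrieved memo instead of the full (and eventually truncated) dialogue history.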
no code implementations • CVPR 2023 • Ye Liu, Lingfeng Qiao, Changchong Lu, Di Yin, Chen Lin, Haoyuan Peng, Bo Ren
An intuitive way to handle these two problems is to tackle them in two separate stages: aligning modalities followed by domain adaptation, or vice versa.
no code implementations • 14 Nov 2022 • Lingfeng Qiao, Chen Wu, Ye Liu, Haoyuan Peng, Di Yin, Bo Ren
In this paper, we propose a novel approach that grafts the video encoder of a pre-trained video-language model onto a generative pre-trained language model.
no code implementations • 9 Nov 2022 • Chen Lin, Ye Liu, Siyu An, Di Yin
In the scenario of unsupervised extractive summarization, learning high-quality sentence representations is essential to select salient sentences from the input document.
no code implementations • 10 Oct 2022 • Zhuoxuan Jiang, Lingfeng Qiao, Di Yin, Shanshan Feng, Bo Ren
Recent generative language models are mostly trained on large-scale datasets, while in some real-world scenarios the training datasets are expensive to obtain and consequently small-scale.
no code implementations • 4 Jul 2022 • Ye Liu, Lingfeng Qiao, Di Yin, Zhuoxuan Jiang, Xinghua Jiang, Deqiang Jiang, Bo Ren
In this paper, taking an alternate perspective on the above challenges, we unite these two tasks into one by predicting shot links: a link connects two adjacent shots, indicating that they belong to the same scene or category.
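The link formulation above can be made concrete with a short sketch: given a binary link prediction for each adjacent shot pair, runs of linked shots form scenes. The link probabilities and threshold below are hypothetical stand-ins for the model's actual outputs.

```python
# Sketch of the shot-link formulation: link i connects shots i and i+1;
# links above a threshold merge neighboring shots into the same scene.

def links_to_scenes(link_probs, threshold=0.5):
    """Group shot indices into scenes from per-pair link probabilities."""
    scenes, current = [], [0]
    for i, p in enumerate(link_probs):   # link i is between shot i and i+1
        if p >= threshold:
            current.append(i + 1)        # same scene: extend the current run
        else:
            scenes.append(current)       # link broken: start a new scene
            current = [i + 1]
    scenes.append(current)
    return scenes

# 5 shots with 4 adjacent links; the weak third link splits the video
print(links_to_scenes([0.9, 0.8, 0.2, 0.7]))  # → [[0, 1, 2], [3, 4]]
```

Framing both scene segmentation and categorization as link prediction lets one head serve both tasks: a broken link is a boundary, and linked shots share a label.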
1 code implementation • NAACL 2022 • Yuan Liang, Zhuoxuan Jiang, Di Yin, Bo Ren
To further leverage relation information, we introduce a separate event relation prediction task and adopt a multi-task learning method to explicitly enhance event extraction performance.
Ranked #1 on Document-level Event Extraction on ChFinAnn
no code implementations • 6 Jun 2022 • Ye Liu, Changchong Lu, Chen Lin, Di Yin, Bo Ren
However, to our knowledge, no existing work focuses on the second step of video text classification, which limits the guidance available to downstream tasks such as video indexing and browsing.