Search Results for author: Yao Wan

Found 46 papers, 20 papers with code

Modeling Hierarchical Syntax Structure with Triplet Position for Source Code Summarization

no code implementations • ACL 2022 • Juncai Guo, Jin Liu, Yao Wan, Li Li, Pingyi Zhou

In this paper, we propose CODESCRIBE to model the hierarchical syntax structure of code by introducing a novel triplet position for code summarization.

Code Summarization Position +1

Paper
Add Code

Are Pre-trained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection

no code implementations • NLP4ConvAI (ACL) 2022 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip Yu

Pre-trained Transformer-based models were reported to be robust in intent classification.

intent-classification Intent Classification +2

Paper
Add Code

Disentangled Code Representation Learning for Multiple Programming Languages

no code implementations • Findings (ACL) 2021 • Jingfeng Zhang, Haiwen Hong, Yin Zhang, Yao Wan, Ye Liu, Yulei Sui

Representation Learning

Paper
Add Code

Fix-Filter-Fix: Intuitively Connect Any Models for Effective Bug Fixing

no code implementations • EMNLP 2021 • Haiwen Hong, Jingfeng Zhang, Yin Zhang, Yao Wan, Yulei Sui

Obviously, unchanged fix is not the correct fix because it is the same as the buggy code that needs to be fixed.

Bug fixing Machine Translation +1

Paper
Add Code

Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study

1 code implementation • 26 Apr 2024 • Yang Wu, Yao Wan, Hongyu Zhang, Yulei Sui, Wucai Wei, Wei Zhao, Guandong Xu, Hai Jin

In particular, we first explore the ways of transforming structured tabular data into sequential text prompts, as to feed them into LLMs and analyze which table content contributes most to the NL2Vis.

Data Visualization In-Context Learning

242

Paper
Code

Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation

1 code implementation • 24 Apr 2024 • Zhaoyang Chu, Yao Wan, Qian Li, Yang Wu, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin

We argue that these factual reasoning-based explanations cannot answer critical what-if questions: What would happen to the GNN's decision if we were to alter the code graph into alternative structures?

counterfactual Counterfactual Explanation +2

Paper
Code

CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code

1 code implementation • 24 Apr 2024 • Batu Guan, Yao Wan, Zhangqian Bi, Zheng Wang, Hongyu Zhang, Yulei Sui, Pan Zhou, Lichao Sun

As Large Language Models (LLMs) are increasingly used to automate code generation, it is often desired to know if the code is AI-generated and by which model, especially for purposes like protecting intellectual property (IP) in industry and preventing academic misconduct in education.

Code Generation

242

Paper
Code

Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach

1 code implementation • 22 Apr 2024 • Yao Wan, Guanghua Wan, Shijie Zhang, Hongyu Zhang, Yulei Sui, Pan Zhou, Hai Jin, Lichao Sun

Subsequently, the membership classifier can be effectively employed to deduce the membership status of a given code sample based on the output of a target code completion model.

Code Completion Memorization

242

Paper
Code

VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs

no code implementations • 9 Apr 2024 • Yi Gui, Zhen Li, Yao Wan, Yemin Shi, Hongyu Zhang, Yi Su, Shaoling Dong, Xing Zhou, Wenbin Jiang

Automatically generating UI code from webpage design visions can significantly alleviate the burden of developers, enabling beginner developers or designers to directly generate Web pages from design diagrams.

Code Generation

Paper
Add Code

Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

no code implementations • 25 Mar 2024 • Zhangqian Bi, Yao Wan, Zheng Wang, Hongyu Zhang, Batu Guan, Fangxin Lu, Zili Zhang, Yulei Sui, Xuanhua Shi, Hai Jin

Large language models (LLMs) have shown remarkable progress in automated code generation.

Code Generation

Paper
Add Code

NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries

no code implementations • 20 Feb 2024 • Wei Zhao, Zhitao Hou, Siyuan Wu, Yan Gao, Haoyu Dong, Yao Wan, Hongyu Zhang, Yulei Sui, Haidong Zhang

Writing formulas on spreadsheets, such as Microsoft Excel and Google Sheets, is a widespread practice among users performing data analysis.

Natural Language Queries

Paper
Add Code

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

1 code implementation • 7 Feb 2024 • Dongping Chen, Ruoxi Chen, Shilin Zhang, Yinuo Liu, Yaochen Wang, Huichi Zhou, Qihui Zhang, Pan Zhou, Yao Wan, Lichao Sun

Multimodal Large Language Models (MLLMs) have gained significant attention recently, showing remarkable potential in artificial general intelligence.

Paper
Code

I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench

2 code implementations • 31 Jan 2024 • Yuan Li, Yue Huang, Yuli Lin, Siyuan Wu, Yao Wan, Lichao Sun

Do large language models (LLMs) exhibit any forms of awareness similar to humans?

Benchmarking Multiple-choice +1

294

Paper
Code

LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?

2 code implementations • 11 Jan 2024 • Qihui Zhang, Chujie Gao, Dongping Chen, Yue Huang, Yixin Huang, Zhenyang Sun, Shilin Zhang, Weiye Li, Zhengyan Fu, Yao Wan, Lichao Sun

With the rapid development and widespread application of Large Language Models (LLMs), the use of Machine-Generated Text (MGT) has become increasingly common, bringing with it potential risks, especially in terms of quality and integrity in fields like news, education, and science.

Paper
Code

Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

no code implementations • 30 Dec 2023 • Yao Wan, Yang He, Zhangqian Bi, JianGuo Zhang, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin, Philip S. Yu

We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models.

Representation Learning

Paper
Add Code

kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning

no code implementations • 17 Dec 2023 • Wenting Zhao, Ye Liu, Yao Wan, Yibo Wang, Qingyang Wu, Zhongfen Deng, Jiangshu Du, Shuaiqi Liu, Yunlong Xu, Philip S. Yu

Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language, transforming them into structured outputs that combine elements of both natural language and intent/slot tags.

In-Context Learning Prompt Engineering +1

Paper
Add Code

DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text

no code implementations • 31 Oct 2023 • Wenting Zhao, Ye Liu, Tong Niu, Yao Wan, Philip S. Yu, Shafiq Joty, Yingbo Zhou, Semih Yavuz

Moreover, a significant gap in the current landscape is the absence of a realistic benchmark for evaluating the effectiveness of grounding LLMs on heterogeneous knowledge sources (e. g., knowledge base and text).

Knowledge Graphs Open-Domain Question Answering +2

Paper
Add Code

MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use

1 code implementation • 4 Oct 2023 • Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, Yao Wan, Neil Zhenqiang Gong, Lichao Sun

However, in scenarios where LLMs serve as intelligent agents, as seen in applications like AutoGPT and MetaGPT, LLMs are expected to engage in intricate decision-making processes that involve deciding whether to employ a tool and selecting the most suitable tool(s) from a collection of available tools to fulfill user requests.

Decision Making

Paper
Code

Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach

1 code implementation • 20 Sep 2023 • Yibo Wang, Wenting Zhao, Yao Wan, Zhongfen Deng, Philip S. Yu

In this paper, we propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.

Machine Reading Comprehension Multi-Task Learning +3

Paper
Code

Localize, Retrieve and Fuse: A Generalized Framework for Free-Form Question Answering over Tables

no code implementations • 20 Sep 2023 • Wenting Zhao, Ye Liu, Yao Wan, Yibo Wang, Zhongfen Deng, Philip S. Yu

Furthermore, TAG-QA outperforms the end-to-end model T5 by 16% and 12% on BLEU-4 and PARENT F-score, respectively.

Question Answering TAG

Paper
Add Code

Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

no code implementations • 2 Jan 2023 • Jiahao Zhu, Daizong Liu, Pan Zhou, Xing Di, Yu Cheng, Song Yang, Wenzheng Xu, Zichuan Xu, Yao Wan, Lichao Sun, Zeyu Xiong

All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with query sentence for reasoning.

Sentence Temporal Sentence Grounding

Paper
Add Code

Diverse Title Generation for Stack Overflow Posts with Multiple Sampling Enhanced Transformer

1 code implementation • 24 Aug 2022 • Fengji Zhang, Jin Liu, Yao Wan, Xiao Yu, Xiao Liu, Jacky Keung

Stack Overflow is one of the most popular programming communities where developers can seek help for their encountered problems.

Paper
Code

Collaborative Knowledge Graph Fusion by Exploiting the Open Corpus

no code implementations • 15 Jun 2022 • Yue Wang, Yao Wan, Lu Bai, Lixin Cui, Zhuo Xu, Ming Li, Philip S. Yu, Edwin R Hancock

To alleviate the challenges of building Knowledge Graphs (KG) from scratch, a more general task is to enrich a KG using triples from an open corpus, where the obtained triples contain noisy entities and relations.

Event Extraction Knowledge Graphs

Paper
Add Code

CODE-MVP: Learning to Represent Source Code from Multiple Views with Contrastive Pre-Training

no code implementations • Findings (NAACL) 2022 • Xin Wang, Yasheng Wang, Yao Wan, Jiawei Wang, Pingyi Zhou, Li Li, Hao Wu, Jin Liu

Specifically, we first extract multiple code views using compiler tools, and learn the complementary information among them under a contrastive learning framework.

Contrastive Learning Defect Detection +2

Paper
Add Code

Compilable Neural Code Generation with Compiler Feedback

no code implementations • Findings (ACL) 2022 • Xin Wang, Yasheng Wang, Yao Wan, Fei Mi, Yitong Li, Pingyi Zhou, Jin Liu, Hao Wu, Xin Jiang, Qun Liu

Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering.

Code Completion Code Generation +4

Paper
Add Code

Reinforced MOOCs Concept Recommendation in Heterogeneous Information Networks

no code implementations • 8 Mar 2022 • Jibing Gong, Yao Wan, Ye Liu, Xuewen Li, Yi Zhao, Cheng Wang, YuTing Lin, Xiaohan Fang, Wenzheng Feng, Jingyi Zhang, Jie Tang

Despite the usefulness of this service, we consider that recommending courses to users directly may neglect their varying degrees of expertise.

Graph Attention reinforcement-learning +1

Paper
Add Code

Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots

1 code implementation • Findings (EMNLP) 2021 • Wenting Zhao, Ye Liu, Yao Wan, Philip S. Yu

Few-shot table-to-text generation is a task of composing fluent and faithful sentences to convey table content using limited data.

Table-to-Text Generation

Paper
Code

What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code

1 code implementation • 14 Feb 2022 • Yao Wan, Wei Zhao, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin

In this paper, we conduct a thorough structural analysis aiming to provide an interpretation of pre-trained language models for source code (e. g., CodeBERT, and GraphCodeBERT) from three distinctive perspectives: (1) attention analysis, (2) probing on the word embedding, and (3) syntax tree induction.

Code Completion Code Search +1

242

Paper
Code

Cross-Language Binary-Source Code Matching with Intermediate Representations

1 code implementation • 19 Jan 2022 • Yi Gui, Yao Wan, Hongyu Zhang, Huifang Huang, Yulei Sui, Guandong Xu, Zhiyuan Shao, Hai Jin

Binary-source code matching plays an important role in many security and software engineering related tasks such as malware detection, reverse engineering and vulnerability assessment.

Malware Detection

242

Paper
Code

DANets: Deep Abstract Networks for Tabular Data Classification and Regression

1 code implementation • 6 Dec 2021 • Jintai Chen, Kuanlun Liao, Yao Wan, Danny Z. Chen, Jian Wu

A special basic block is built using AbstLays, and we construct a family of Deep Abstract Networks (DANets) for tabular data classification and regression by stacking such blocks.

regression

Paper
Code

FedHM: Efficient Federated Learning for Heterogeneous Models via Low-rank Factorization

no code implementations • 29 Nov 2021 • Dezhong Yao, Wanning Pan, Michael J O'Neill, Yutong Dai, Yao Wan, Hai Jin, Lichao Sun

To this end, this paper proposes FedHM, a novel heterogeneous federated model compression framework, distributing the heterogeneous low-rank models to clients and then aggregating them into a full-rank model.

Distributed Computing Federated Learning +3

Paper
Add Code

HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization

1 code implementation • EMNLP 2021 • Ye Liu, Jian-Guo Zhang, Yao Wan, Congying Xia, Lifang He, Philip S. Yu

To capture the semantic graph structure from raw text, most existing summarization approaches are built on GNNs with a pre-trained model.

Document Summarization Extractive Summarization +1

Paper
Code

SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

no code implementations • 10 Aug 2021 • Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence.

Clone Detection Code Search +5

Paper
Add Code

Local-Global Knowledge Distillation in Heterogeneous Federated Learning with Non-IID Data

no code implementations • 30 Jun 2021 • Dezhong Yao, Wanning Pan, Yutong Dai, Yao Wan, Xiaofeng Ding, Hai Jin, Zheng Xu, Lichao Sun

Federated learning enables multiple clients to collaboratively learn a global model by periodically aggregating the clients' models without transferring the local data.

Federated Learning Knowledge Distillation

Paper
Add Code

Are Pretrained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection

1 code implementation • 8 Jun 2021 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip S. Yu

Pre-trained Transformer-based models were reported to be robust in intent classification.

intent-classification Intent Classification +2

124

Paper
Code

Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation

no code implementations • EACL 2021 • Ye Liu, Yao Wan, JianGuo Zhang, Wenting Zhao, Philip Yu

In this paper, we claim that the syntactic and semantic structures among natural language are critical for non-autoregressive machine translation and can further improve the performance.

Machine Translation Translation

Paper
Add Code

Enriching Non-Autoregressive Transformer with Syntactic and SemanticStructures for Neural Machine Translation

no code implementations • 22 Jan 2021 • Ye Liu, Yao Wan, Jian-Guo Zhang, Wenting Zhao, Philip S. Yu

In this paper, we claim that the syntactic and semantic structures among natural language are critical for non-autoregressive machine translation and can further improve the performance.

Machine Translation Translation

Paper
Add Code

Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

1 code implementation • EMNLP 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Wenhao Liu, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong

Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill.

Few-Shot Learning Goal-Oriented Dialog +3

Paper
Code

Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks

no code implementations • 13 Oct 2020 • Yue Wang, Zhuo Xu, Lu Bai, Yao Wan, Lixin Cui, Qian Zhao, Edwin R. Hancock, Philip S. Yu

To verify the effectiveness of our proposed method, we conduct extensive experiments on four real-world datasets as well as compare our method with state-of-the-art methods.

Event Extraction TAG

Paper
Add Code

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

1 code implementation • 26 Sep 2020 • Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu

To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph augmented pre-trained language generation model KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output.

Graph Attention Text Generation

158

Paper
Code

Find or Classify? Dual Strategy for Slot-Value Predictions on Multi-Domain Dialog State Tracking

1 code implementation • Joint Conference on Lexical and Computational Semantics 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong

Dialog state tracking (DST) is a core component in task-oriented dialog systems.

Ranked #4 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0

dialog state tracking Multi-domain Dialogue State Tracking +1

Paper
Code

Competitive Multi-Agent Deep Reinforcement Learning with Counterfactual Thinking

no code implementations • 13 Aug 2019 • Yue Wang, Yao Wan, Chenwei Zhang, Lixin Cui, Lu Bai, Philip S. Yu

During the iterations, our model updates the parallel policies and the corresponding scenario-based regrets for agents simultaneously.

counterfactual Decision Making +3

Paper
Add Code

Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce

no code implementations • NAACL 2019 • Jian-Guo Zhang, Pengcheng Zou, Zhao Li, Yao Wan, Xiuming Pan, Yu Gong, Philip S. Yu

To address this discrepancy, previous studies mainly consider textual information of long product titles and lacks of human-like view during training and evaluation process.

Attribute Generative Adversarial Network

Paper
Add Code

Improving Automatic Source Code Summarization via Deep Reinforcement Learning

2 code implementations • 17 Nov 2018 • Yao Wan, Zhou Zhao, Min Yang, Guandong Xu, Haochao Ying, Jian Wu, Philip S. Yu

To the best of our knowledge, most state-of-the-art approaches follow an encoder-decoder framework which encodes the code into a hidden space and then decode it into natural language space, suffering from two major drawbacks: a) Their encoders only consider the sequential content of code, ignoring the tree structure which is also critical for the task of code summarization, b) Their decoders are typically trained to predict the next word by maximizing the likelihood of next ground-truth word with previous ground-truth word given.

Code Summarization Decoder +4

Paper
Code

Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training

no code implementations • 12 Nov 2018 • Yao Wan, Wenqiang Yan, Jianwei Gao, Zhou Zhao, Jian Wu, Philip S. Yu

Dialogue Act (DA) classification is a challenging problem in dialogue interpretation, which aims to attach semantic labels to utterances and characterize the speaker's intention.

Ranked #5 on Dialogue Act Classification on Switchboard corpus

Classification Dialogue Act Classification +3

Paper
Add Code

Product Title Refinement via Multi-Modal Generative Adversarial Learning

no code implementations • 11 Nov 2018 • Jian-Guo Zhang, Pengcheng Zou, Zhao Li, Yao Wan, Ye Liu, Xiuming Pan, Yu Gong, Philip S. Yu

Nowadays, an increasing number of customers are in favor of using E-commerce Apps to browse and purchase products.

Attribute Generative Adversarial Network +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.