Search Results for author: Hao Yang

Found 181 papers, 53 papers with code

Unified Humor Detection Based on Sentence-pair Augmentation and Transfer Learning

no code implementations • EAMT 2020 • Minghan Wang, Hao Yang, Ying Qin, Shiliang Sun, Yao Deng

We propose a unified multilingual model for humor detection which can be trained under a transfer learning framework.

Paper
Add Code

HW-TSC’s Participation in the WMT 2021 News Translation Shared Task

no code implementations • WMT (EMNLP) 2021 • Daimeng Wei, Zongyao Li, Zhanglin Wu, Zhengzhe Yu, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Minghan Wang, Lizhi Lei, Min Zhang, Hao Yang, Ying Qin

This paper presents the submission of Huawei Translate Services Center (HW-TSC) to the WMT 2021 News Translation Shared Task.

Knowledge Distillation Translation

Paper
Add Code

Huawei’s Submissions to the WMT20 Biomedical Translation Task

no code implementations • WMT (EMNLP) 2020 • Wei Peng, Jianfeng Liu, Minghan Wang, Liangyou Li, Xupeng Meng, Hao Yang, Qun Liu

This paper describes Huawei’s submissions to the WMT20 biomedical translation shared task.

Machine Translation Transfer Learning +1

Paper
Add Code

HW-TSC’s Participation at WMT 2020 Automatic Post Editing Shared Task

no code implementations • WMT (EMNLP) 2020 • Hao Yang, Minghan Wang, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Lizhi Lei, Ying Qin, Shimin Tao, Shiliang Sun, Yimeng Chen

The paper presents the submission by HW-TSC in the WMT 2020 Automatic Post Editing Shared Task.

Automatic Post-Editing NMT +1

Paper
Add Code

HI-CMLM: Improve CMLM with Hybrid Decoder Input

no code implementations • INLG (ACL) 2021 • Minghan Wang, Guo Jiaxin, Yuxia Wang, Yimeng Chen, Su Chang, Daimeng Wei, Min Zhang, Shimin Tao, Hao Yang

Mask-predict CMLM (Ghazvininejad et al., 2019) has achieved stunning performance among non-autoregressive NMT models, but we find that the mechanism of predicting all of the target words only depending on the hidden state of [MASK] is not effective and efficient in initial iterations of refinement, resulting in ungrammatical repetitions and slow convergence.

Decoder NMT +1

Paper
Add Code

Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation

no code implementations • MTSummit 2021 • Minghan Wang, Jiaxin Guo, Yimeng Chen, Chang Su, Min Zhang, Shimin Tao, Hao Yang

Based on large-scale pretrained networks and the liability to be easily overfitting with limited labelled training data of multimodal translation (MMT) is a critical issue in MMT.

Multimodal Machine Translation NMT +2

Paper
Add Code

Efficient Transfer Learning for Quality Estimation with Bottleneck Adapter Layer

no code implementations • EAMT 2020 • Hao Yang, Minghan Wang, Ning Xie, Ying Qin, Yao Deng

Compared with the commonly used NuQE baseline, BAL-QE achieves 47% (En-Ru) and 75% (En-De) of performance promotions.

NMT Transfer Learning

Paper
Add Code

HW-TSC’s Participation in the WMT 2020 News Translation Shared Task

no code implementations • WMT (EMNLP) 2020 • Daimeng Wei, Hengchao Shang, Zhanglin Wu, Zhengzhe Yu, Liangyou Li, Jiaxin Guo, Minghan Wang, Hao Yang, Lizhi Lei, Ying Qin, Shiliang Sun

We also conduct experiment with similar language augmentation, which lead to positive results, although not used in our submission.

Knowledge Distillation Translation

Paper
Add Code

HwTscSU’s Submissions on WAT 2022 Shared Task

no code implementations • WAT 2022 • Yilun Liu, Zhen Zhang, Shimin Tao, Junhui Li, Hao Yang

In this paper we describe our submission to the shared tasks of the 9th Workshop on Asian Translation (WAT 2022) on NICT–SAP under the team name ”HwTscSU”.

Domain Adaptation NMT +1

Paper
Add Code

HW-TSC’s Participation in the IWSLT 2022 Isometric Spoken Language Translation

no code implementations • IWSLT (ACL) 2022 • Zongyao Li, Jiaxin Guo, Daimeng Wei, Hengchao Shang, Minghan Wang, Ting Zhu, Zhanglin Wu, Zhengzhe Yu, Xiaoyu Chen, Lizhi Lei, Hao Yang, Ying Qin

This paper presents our submissions to the IWSLT 2022 Isometric Spoken Language Translation task.

Decoder Translation

Paper
Add Code

The HW-TSC’s Offline Speech Translation System for IWSLT 2022 Evaluation

no code implementations • IWSLT (ACL) 2022 • Minghan Wang, Jiaxin Guo, Xiaosong Qiao, Yuxia Wang, Daimeng Wei, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, Hao Yang, Ying Qin

For machine translation part, we pretrained three translation models on WMT21 dataset and fine-tuned them on in-domain corpora.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

HW-TSC’s Participation at WMT 2021 Quality Estimation Shared Task

no code implementations • WMT (EMNLP) 2021 • Yimeng Chen, Chang Su, Yingtao Zhang, Yuxia Wang, Xiang Geng, Hao Yang, Shimin Tao, Guo Jiaxin, Wang Minghan, Min Zhang, Yujia Liu, ShuJian Huang

This paper presents our work in WMT 2021 Quality Estimation (QE) Shared Task.

Data Augmentation Sentence +1

Paper
Add Code

HW-TSC’s Submissions to the WMT21 Biomedical Translation Task

no code implementations • WMT (EMNLP) 2021 • Hao Yang, Zhanglin Wu, Zhengzhe Yu, Xiaoyu Chen, Daimeng Wei, Zongyao Li, Hengchao Shang, Minghan Wang, Jiaxin Guo, Lizhi Lei, Chuanfei Xu, Min Zhang, Ying Qin

This paper describes the submission of Huawei Translation Service Center (HW-TSC) to WMT21 biomedical translation task in two language pairs: Chinese↔English and German↔English (Our registered team name is HuaweiTSC).

Translation

Paper
Add Code

HW-TSC’s Participation in the WMT 2021 Efficiency Shared Task

no code implementations • WMT (EMNLP) 2021 • Hengchao Shang, Ting Hu, Daimeng Wei, Zongyao Li, Jianfei Feng, Zhengzhe Yu, Jiaxin Guo, Shaojun Li, Lizhi Lei, Shimin Tao, Hao Yang, Jun Yao, Ying Qin

This paper presents the submission of Huawei Translation Services Center (HW-TSC) to WMT 2021 Efficiency Shared Task.

Decoder Quantization +2

Paper
Add Code

HW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation Task

no code implementations • WMT (EMNLP) 2021 • Zhengzhe Yu, Daimeng Wei, Zongyao Li, Hengchao Shang, Xiaoyu Chen, Zhanglin Wu, Jiaxin Guo, Minghan Wang, Lizhi Lei, Min Zhang, Hao Yang, Ying Qin

This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the WMT 2021 Large-Scale Multilingual Translation Task.

Knowledge Distillation Translation

Paper
Add Code

HW-TSC’s Participation in the WMT 2021 Triangular MT Shared Task

no code implementations • WMT (EMNLP) 2021 • Zongyao Li, Daimeng Wei, Hengchao Shang, Xiaoyu Chen, Zhanglin Wu, Zhengzhe Yu, Jiaxin Guo, Minghan Wang, Lizhi Lei, Min Zhang, Hao Yang, Ying Qin

This paper presents the submission of Huawei Translation Service Center (HW-TSC) to WMT 2021 Triangular MT Shared Task.

Denoising Translation

Paper
Add Code

数字人文视角下的《史记》《汉书》比较研究(A Comparative Study of Shiji and Hanshu from the Perspective of Digital Humanities)

no code implementations • CCL 2022 • Zekun Deng, Hao Yang, Jun Wang

"《史记》和《汉书》具有经久不衰的研究价值。尽管两书异同的研究已经较为丰富, 但研究的全面性、完备性、科学性、客观性均仍显不足。在数字人文的视角下, 本文利用计算语言学方法, 通过对字、词、命名实体、段落等的多粒度、多角度分析, 开展对于《史》《汉》的比较研究。首先, 本文对于《史》《汉》中的字、词、命名实体的分布和特点进行对比, 以遍历穷举的考察方式提炼出两书在主要内容上的相同点与不同点, 揭示了汉武帝之前和汉武帝到西汉灭亡两段历史时期在政治、文化、思想上的重要变革与承袭。其次, 本文使用一种融入命名实体作为外部特征的文本相似度算法对于《史记》《汉书》的异文进行自动发现, 成功识别出过去研究者通过人工手段没有发现的袭用段落, 使得我们对于《史》《汉》的承袭关系形成更加完整和立体的认识。再次, 本文通过计算异文段落之间的最长公共子序列来自动得出两段异文之间存在的差异, 从宏观统计上证明了《汉书》文字风格《史记》的差别, 并从微观上进一步对二者语言特点进行了阐释, 为理解《史》《汉》异文特点提供了新的角度和启发。本研究站在数字人文的视域下, 利用先进的计算方法对于传世千年的中国古代经典进行了再审视、再发现, 其方法对于今人研究古籍有一定的借鉴价值。”

Paper
Add Code

基于预训练语言模型的繁体古文自动句读研究(Automatic Traditional Ancient Chinese Texts Segmentation and Punctuation Based on Pre-training Language Model)

no code implementations • CCL 2021 • Xuemei Tang, Qi Su, Jun Wang, Yuhang Chen, Hao Yang

Language Modelling

Paper
Add Code

HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task

no code implementations • AACL (WAT) 2020 • Zhengzhe Yu, Zhanglin Wu, Xiaoyu Chen, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Minghan Wang, Liangyou Li, Lizhi Lei, Hao Yang, Ying Qin

This paper describes our work in the WAT 2020 Indic Multilingual Translation Task.

Translation

Paper
Add Code

Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference

no code implementations • Findings (ACL) 2022 • Yuxia Wang, Minghan Wang, Yimeng Chen, Shimin Tao, Jiaxin Guo, Chang Su, Min Zhang, Hao Yang

Natural Language Inference (NLI) datasets contain examples with highly ambiguous labels due to its subjectivity.

Natural Language Inference

Paper
Add Code

HW-TSC at SemEval-2022 Task 7: Ensemble Model Based on Pretrained Models for Identifying Plausible Clarifications

no code implementations • SemEval (NAACL) 2022 • Xiaosong Qiao, Yinglu Li, Min Zhang, Minghan Wang, Hao Yang, Shimin Tao, Qin Ying

This paper describes the system for the identifying Plausible Clarifications of Implicit and Underspecified Phrases.

regression

Paper
Add Code

HW-TSC’s Participation at WMT 2020 Quality Estimation Shared Task

no code implementations • WMT (EMNLP) 2020 • Minghan Wang, Hao Yang, Hengchao Shang, Daimeng Wei, Jiaxin Guo, Lizhi Lei, Ying Qin, Shimin Tao, Shiliang Sun, Yimeng Chen, Liangyou Li

This paper presents our work in the WMT 2020 Word and Sentence-Level Post-Editing Quality Estimation (QE) Shared Task.

Sentence Transfer Learning

Paper
Add Code

The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation

no code implementations • IWSLT (ACL) 2022 • Minghan Wang, Jiaxin Guo, Yinglu Li, Xiaosong Qiao, Yuxia Wang, Zongyao Li, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, Hao Yang, Ying Qin

The cascade system is composed of a chunking-based streaming ASR model and the SimulMT model used in the T2T track.

Chunking Sentence +1

Paper
Add Code

The HW-TSC’s Speech to Speech Translation System for IWSLT 2022 Evaluation

no code implementations • IWSLT (ACL) 2022 • Jiaxin Guo, Yinglu Li, Minghan Wang, Xiaosong Qiao, Yuxia Wang, Hengchao Shang, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, Hao Yang, Ying Qin

The paper presents the HW-TSC’s pipeline and results of Offline Speech to Speech Translation for IWSLT 2022.

Machine Translation Speech-to-Speech Translation +1

Paper
Add Code

How Length Prediction Influence the Performance of Non-Autoregressive Translation?

no code implementations • EMNLP (BlackboxNLP) 2021 • Minghan Wang, Guo Jiaxin, Yuxia Wang, Yimeng Chen, Su Chang, Hengchao Shang, Min Zhang, Shimin Tao, Hao Yang

Length prediction is a special task in a series of NAT models where target length has to be determined before generation.

Language Modelling Translation

Paper
Add Code

An Effectiveness Study Across Baseline and Neural Network-based Force Estimation Methods on the da Vinci Research Kit Si System

no code implementations • 13 May 2024 • Hao Yang, Ayberk Acar, Keshuai Xu, Anton Deguet, Peter Kazanzides, Jie Ying Wu

In this study, we further investigate the robustness and generalization ability of an neural network (NN) based force estimation method, using the da Vinci Research Kit Si (dVRK-Si).

Paper
Add Code

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

no code implementations • 8 May 2024 • Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano Soatto

By evaluating a large selection of recent LVLMs using public datasets, we show that an improvement in existing metrics do not lead to a reduction in Type I hallucinations, and that established benchmarks for measuring Type I hallucinations are incomplete.

Attribute Data Augmentation +2

Paper
Add Code

Double Mixture: Towards Continual Event Detection from Speech

1 code implementation • 20 Apr 2024 • Jingqi Kang, Tongtong Wu, Jinming Zhao, Guitao Wang, Yinwei Wei, Hao Yang, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari

To address the challenges of catastrophic forgetting and effective disentanglement, we propose a novel method, 'Double Mixture.'

Continual Learning Disentanglement +1

Paper
Code

OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery

no code implementations • 16 Apr 2024 • Matthew Inkawhich, Nathan Inkawhich, Hao Yang, Jingyang Zhang, Randolph Linderman, Yiran Chen

Our method also excels in low-data settings, outperforming supervised baselines using a fraction of the training data.

Object object-detection +1

Paper
Add Code

Cross-Domain Audio Deepfake Detection: Dataset and Analysis

no code implementations • 7 Apr 2024 • Yuang Li, Min Zhang, Mengxin Ren, Miaomiao Ma, Daimeng Wei, Hao Yang

Audio deepfake detection (ADD) is essential for preventing the misuse of synthetic voices that may infringe on personal rights and privacy.

DeepFake Detection Face Swapping

Paper
Add Code

Mixed-Query Transformer: A Unified Image Segmentation Architecture

no code implementations • 6 Apr 2024 • Pei Wang, Zhaowei Cai, Hao Yang, Ashwin Swaminathan, R. Manmatha, Stefano Soatto

Existing unified image segmentation models either employ a unified architecture across multiple tasks but use separate weights tailored to each dataset, or apply a single set of weights to multiple datasets but are limited to a single task.

Data Augmentation Image Segmentation +2

Paper
Add Code

CHisIEC: An Information Extraction Corpus for Ancient Chinese History

no code implementations • 22 Mar 2024 • Xuemei Tang, Zekun Deng, Qi Su, Hao Yang, Jun Wang

Additionally, we have evaluated the capabilities of Large Language Models (LLMs) in the context of tasks related to ancient Chinese history.

named-entity-recognition Named Entity Recognition +3

Paper
Add Code

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation

no code implementations • 21 Mar 2024 • Haofei Zhao, Yilun Liu, Shimin Tao, Weibin Meng, Yimeng Chen, Xiang Geng, Chang Su, Min Zhang, Hao Yang

Machine Translation Quality Estimation (MTQE) is the task of estimating the quality of machine-translated text in real time without the need for reference translations, which is of great importance for the development of MT.

Machine Translation Sentence

Paper
Add Code

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

no code implementations • 18 Mar 2024 • Jiaxin Guo, Hao Yang, Zongyao Li, Daimeng Wei, Hengchao Shang, Xiaoyu Chen

Experimental results conducted using the Llama2 model, particularly on Chinese-Llama2 after monolingual augmentation, demonstrate the improved translation capabilities of LLMs.

Machine Translation Translation

Paper
Add Code

DeepSeek-VL: Towards Real-World Vision-Language Understanding

2 code implementations • 8 Mar 2024 • Haoyu Lu, Wen Liu, Bo Zhang, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun, Tongzheng Ren, Zhuoshu Li, Hao Yang, Yaofeng Sun, Chengqi Deng, Hanwei Xu, Zhenda Xie, Chong Ruan

The DeepSeek-VL family (both 1. 3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.

Ranked #31 on Visual Question Answering on MM-Vet

Chatbot Language Modelling +3

1,664

Paper
Code

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation

1 code implementation • 28 Feb 2024 • Yuan Ge, Yilun Liu, Chi Hu, Weibin Meng, Shimin Tao, Xiaofeng Zhao, Hongxia Ma, Li Zhang, Hao Yang, Tong Xiao

The second step involves preserving dataset diversity through a clustering process. In our experiment, CaR selected a subset containing only 1. 96% of Alpaca's IT data, yet the underlying AlpaCaR model trained on this subset outperforms Alpaca by an average of 32. 1% in GPT-4 evaluations.

Clustering

Paper
Code

DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators

no code implementations • 23 Feb 2024 • Xinglin Lyu, Junhui Li, Yanqing Zhao, Daimeng Wei, Shimin Tao, Hao Yang, Min Zhang

In this paper, we propose an alternative adaptation approach, named Decoding-enhanced Multi-phase Prompt Tuning (DeMPT), to make LLMs discriminately model and utilize the inter- and intra-sentence context and more effectively adapt LLMs to context-aware NMT.

Decoder Machine Translation +2

Paper
Add Code

Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

1 code implementation • 5 Feb 2024 • Jiarun Liu, Hao Yang, Hong-Yu Zhou, Yan Xi, Lequan Yu, Yizhou Yu, Yong Liang, Guangming Shi, Shaoting Zhang, Hairong Zheng, Shanshan Wang

However, it is challenging for existing methods to model long-range global information, where convolutional neural networks (CNNs) are constrained by their local receptive fields, and vision transformers (ViTs) suffer from high quadratic complexity of their attention mechanism.

Image Segmentation Medical Image Segmentation +1

158

Paper
Code

Rethinking Personalized Federated Learning with Clustering-based Dynamic Graph Propagation

no code implementations • 29 Jan 2024 • Jiaqi Wang, Yuzhong Chen, Yuhang Wu, Mahashweta Das, Hao Yang, Fenglong Ma

Subsequently, we design a precise personalized model distribution strategy to allow clients to obtain the most suitable model from the server side.

Clustering Personalized Federated Learning

Paper
Add Code

Consistency Enhancement-Based Deep Multiview Clustering via Contrastive Learning

no code implementations • 23 Jan 2024 • Hao Yang, Hua Mao, Wai Lok Woo, Jie Chen, Xi Peng

Furthermore, the representation process for clustering is enhanced through spectral clustering, and the consistency across multiple views is improved.

Clustering Contrastive Learning +2

Paper
Add Code

Using Large Language Model for End-to-End Chinese ASR and NER

no code implementations • 21 Jan 2024 • Yuang Li, Jiawei Yu, Yanqing Zhao, Min Zhang, Mengxin Ren, Xiaofeng Zhao, Xiaosong Qiao, Chang Su, Miaomiao Ma, Hao Yang

In this work, we connect the Whisper encoder with ChatGLM3 and provide in-depth comparisons of these two approaches using Chinese automatic speech recognition (ASR) and name entity recognition (NER) tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

Enhancing the vision-language foundation model with key semantic knowledge-emphasized report refinement

no code implementations • 21 Jan 2024 • Cheng Li, Weijian Huang, Hao Yang, Jiarun Liu, Shanshan Wang

Particularly, raw radiology reports are refined to highlight the key information according to a constructed clinical dictionary and two model-optimized knowledge-enhancement metrics.

Phrase Grounding Representation Learning

Paper
Add Code

A New Creative Generation Pipeline for Click-Through Rate with Stable Diffusion Model

1 code implementation • 17 Jan 2024 • Hao Yang, Jianxin Yuan, Shuai Yang, Linhe Xu, Shuo Yuan, Yifan Zeng

2) Prompt model is designed to generate individualized creatives for different user groups, which can further improve the diversity and quality.

Paper
Code

Deep Ensemble Shape Calibration: Multi-Field Post-hoc Calibration in Online Advertising

1 code implementation • 17 Jan 2024 • Shuai Yang, Hao Yang, Zhuang Zou, Linhe Xu, Shuo Yuan, Yifan Zeng

Shape calibration is defined as no over- or under-estimation for each subset of the pCTR within the specified range under condition of concerned fields.

Paper
Code

UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction

no code implementations • 11 Jan 2024 • Jiaxin Guo, Minghan Wang, Xiaosong Qiao, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhengzhe Yu, Yinglu Li, Chang Su, Min Zhang, Shimin Tao, Hao Yang

Previous works usually adopt end-to-end models and has strong dependency on Pseudo Paired Data and Original Paired Data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation

no code implementations • 11 Jan 2024 • Jiaxin Guo, Zhanglin Wu, Zongyao Li, Hengchao Shang, Daimeng Wei, Xiaoyu Chen, Zhiqiang Rao, Shaojun Li, Hao Yang

Moreover, these strategies are more suitable for end-to-end systems than cascade systems.

Translation

Paper
Add Code

Can AI Write Classical Chinese Poetry like Humans? An Empirical Study Inspired by Turing Test

no code implementations • 10 Jan 2024 • Zekun Deng, Hao Yang, Jun Wang

Some argue that the essence of humanity, such as creativity and sentiment, can never be mimicked by machines.

Paper
Add Code

Multi-modal vision-language model for generalizable annotation-free pathological lesions localization and clinical diagnosis

1 code implementation • 4 Jan 2024 • Hao Yang, Hong-Yu Zhou, Zhihuan Li, Yuanxu Gao, Cheng Li, Weijian Huang, Jiarun Liu, Hairong Zheng, Kang Zhang, Shanshan Wang

Defining pathologies automatically from medical images aids the understanding of the emergence and progression of diseases, and such an ability is crucial in clinical diagnostics.

Contrastive Learning Language Modelling

Paper
Code

Enhancing Representation in Medical Vision-Language Foundation Models via Multi-Scale Information Extraction Techniques

no code implementations • 3 Jan 2024 • Weijian Huang, Cheng Li, Hong-Yu Zhou, Jiarun Liu, Hao Yang, Yong Liang, Guangming Shi, Hairong Zheng, Shanshan Wang

The development of medical vision-language foundation models has attracted significant attention in the field of medicine and healthcare due to their promising prospect in various clinical applications.

Representation Learning

Paper
Add Code

MLIP: Medical Language-Image Pre-training with Masked Local Representation Learning

no code implementations • 3 Jan 2024 • Jiarun Liu, Hong-Yu Zhou, Cheng Li, Weijian Huang, Hao Yang, Yong Liang, Shanshan Wang

Existing contrastive language-image pre-training aims to learn a joint representation by matching abundant image-text pairs.

Contrastive Learning Representation Learning +1

Paper
Add Code

Multimodal self-supervised learning for lesion localization

no code implementations • 3 Jan 2024 • Hao Yang, Hong-Yu Zhou, Cheng Li, Weijian Huang, Jiarun Liu, Yong Liang, Shanshan Wang

Multimodal deep learning utilizing imaging and diagnostic reports has made impressive progress in the field of medical imaging diagnostics, demonstrating a particularly strong capability for auxiliary diagnosis in cases where sufficient annotation information is lacking.

Contrastive Learning Multimodal Deep Learning +1

Paper
Add Code

On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications

no code implementations • 23 Dec 2023 • Chenjiao Tan, Qian Cao, Yiwei Li, Jielu Zhang, Xiao Yang, Huaqin Zhao, Zihao Wu, Zhengliang Liu, Hao Yang, Nemin Wu, Tao Tang, Xinyue Ye, Lilong Chai, Ninghao Liu, Changying Li, Lan Mu, Tianming Liu, Gengchen Mai

The advent of large language models (LLMs) has heightened interest in their potential for multimodal applications that integrate language and vision.

Image Classification Land Cover Classification +5

Paper
Add Code

Language-driven All-in-one Adverse Weather Removal

no code implementations • 3 Dec 2023 • Hao Yang, Liyuan Pan, Yan Yang, Wei Liang

Then, with the guidance of degradation prior, we sparsely select restoration experts from a candidate list dynamically based on a Mixture-of-Experts (MoE) structure.

Paper
Add Code

INarIG: Iterative Non-autoregressive Instruct Generation Model For Word-Level Auto Completion

no code implementations • 30 Nov 2023 • Hengchao Shang, Zongyao Li, Daimeng Wei, Jiaxin Guo, Minghan Wang, Xiaoyu Chen, Lizhi Lei, Hao Yang

WLAC predicts a target word given a source sentence, translation context, and a human typed character sequence.

Machine Translation Sentence +1

Paper
Add Code

CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

2 code implementations • 22 Nov 2023 • Yilun Liu, Shimin Tao, Xiaofeng Zhao, Ming Zhu, Wenbing Ma, Junhao Zhu, Chang Su, Yutai Hou, Miao Zhang, Min Zhang, Hongxia Ma, Li Zhang, Hao Yang, Yanfei Jiang

Instruction tuning is crucial for enabling Language Learning Models (LLMs) in responding to human instructions.

Instruction Following

Paper
Code

An Early Evaluation of GPT-4V(ision)

1 code implementation • 25 Oct 2023 • Yang Wu, Shilong Wang, Hao Yang, Tian Zheng, Hongbo Zhang, Yanyan Zhao, Bing Qin

In this paper, we evaluate different abilities of GPT-4V including visual understanding, language understanding, visual puzzle solving, and understanding of other modalities such as depth, thermal, video, and audio.

Math

Paper
Code

Data-driven Traffic Simulation: A Comprehensive Review

no code implementations • 24 Oct 2023 • Di Chen, Meixin Zhu, Hao Yang, Xuesong Wang, Yinhai Wang

The primary objective of this paper is to review current research efforts and provide a futuristic perspective that will benefit future developments in the field.

Autonomous Driving Imitation Learning

Paper
Add Code

Qwen Technical Report

2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.

Ranked #3 on Multi-Label Text Classification on CC3M-TagMask

Language Modelling Large Language Model +2

11,485

Paper
Code

Unify word-level and span-level tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task

1 code implementation • 23 Sep 2023 • Xiang Geng, Zhejian Lai, Yu Zhang, Shimin Tao, Hao Yang, Jiajun Chen, ShuJian Huang

We generate pseudo MQM data using parallel data from the WMT translation task.

Sentence

Paper
Code

A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting

no code implementations • 18 Sep 2023 • Yuang Li, Yinglu Li, Min Zhang, Chang Su, Mengxin Ren, Xiaosong Qiao, Xiaofeng Zhao, Mengyao Piao, Jiawei Yu, Xinglin Lv, Miaomiao Ma, Yanqing Zhao, Hao Yang

End-to-end automatic speech recognition (ASR) systems often struggle to recognize rare name entities, such as personal names, organizations, and terminologies not frequently encountered in the training data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Masked Contrastive Learning

no code implementations • 12 Sep 2023 • Weijian Huang, Cheng Li, Hao Yang, Jiarun Liu, Shanshan Wang

Recently, multi-modal vision-language foundation models have gained significant attention in the medical field.

Contrastive Learning Representation Learning +1

Paper
Add Code

PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models

no code implementations • 11 Sep 2023 • Li Chen, Mengyi Zhao, Yiheng Liu, Mingxu Ding, Yangyang Song, Shizun Wang, Xu Wang, Hao Yang, Jing Liu, Kang Du, Min Zheng

Personalized text-to-image generation has emerged as a powerful and sought-after tool, empowering users to create customized images based on their specific concepts and prompts.

Text-to-Image Generation

Paper
Add Code

Hessian-aware Quantized Node Embeddings for Recommendation

no code implementations • 2 Sep 2023 • Huiyuan Chen, Kaixiong Zhou, Kwei-Herng Lai, Chin-Chia Michael Yeh, Yan Zheng, Xia Hu, Hao Yang

To address the gradient mismatch problem in STE, we further consider the quantized errors and its second-order derivatives for better stability.

Recommendation Systems Retrieval

Paper
Add Code

Tackling Diverse Minorities in Imbalanced Classification

no code implementations • 28 Aug 2023 • Kwei-Herng Lai, Daochen Zha, Huiyuan Chen, Mangesh Bendre, Yuzhong Chen, Mahashweta Das, Hao Yang, Xia Hu

Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.

Anomaly Detection Classification +2

Paper
Add Code

Translate Meanings, Not Just Words: IdiomKB's Role in Optimizing Idiomatic Translation with Language Models

1 code implementation • 26 Aug 2023 • Shuang Li, Jiangjie Chen, Siyu Yuan, Xinyi Wu, Hao Yang, Shimin Tao, Yanghua Xiao

To translate well, machine translation (MT) systems and general-purposed language models (LMs) need a deep understanding of both source and target languages and cultures.

Machine Translation Translation

Paper
Code

A Survey on Large Language Model based Autonomous Agents

2 code implementations • 22 Aug 2023 • Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, ZhiYuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen

In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of LLM-based autonomous agents from a holistic perspective.

Language Modelling Large Language Model

2,280

Paper
Code

UAV 3-D path planning based on MOEA/D with adaptive areal weight adjustment

no code implementations • 20 Aug 2023 • Yougang Xiao, Hao Yang, Huan Liu, Keyu Wu, Guohua Wu

Unmanned aerial vehicles (UAVs) are desirable platforms for time-efficient and cost-effective task execution.

Decision Making

Paper
Add Code

Enhancing Transformers without Self-supervised Learning: A Loss Landscape Perspective in Sequential Recommendation

no code implementations • 20 Aug 2023 • Vivian Lai, Huiyuan Chen, Chin-Chia Michael Yeh, Minghua Xu, Yiwei Cai, Hao Yang

Despite their success, Transformer-based models often require the optimization of a large number of parameters, making them difficult to train from sparse data in sequential recommendation.

Self-Supervised Learning Sequential Recommendation

Paper
Add Code

Adversarial Collaborative Filtering for Free

no code implementations • 20 Aug 2023 • Huiyuan Chen, Xiaoting Li, Vivian Lai, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Mahashweta Das, Hao Yang

In this paper, we present Sharpness-aware Collaborative Filtering (SharpCF), a simple yet effective method that conducts adversarial training without extra computational cost over the base optimizer.

Collaborative Filtering

Paper
Add Code

Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies

1 code implementation • 15 Aug 2023 • Yilun Liu, Shimin Tao, Weibin Meng, Jingyu Wang, Wenbing Ma, Yanqing Zhao, Yuhang Chen, Hao Yang, Yanfei Jiang, Xun Chen

LogPrompt employs large language models (LLMs) to perform online log analysis tasks via a suite of advanced prompt strategies tailored for log tasks, which enhances LLMs' performance by up to 380. 7% compared with simple prompts.

Anomaly Detection Log Parsing +1

Paper
Code

Collective Human Opinions in Semantic Textual Similarity

1 code implementation • 8 Aug 2023 • Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, Karin Verspoor

Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard.

Semantic Textual Similarity Sentence +1

Paper
Code

Flocking control against the malicious agent

no code implementations • 8 Aug 2023 • Chencheng Zhang, Hao Yang, Bin Jiang, Ming Cao

This paper investigates the flocking control of a swarm with a malicious agent that falsifies its controller parameters to cause collision, division, and escape of agents in the swarm.

Paper
Add Code

AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose

1 code implementation • 7 Aug 2023 • Huichao Zhang, Bowen Chen, Hao Yang, Liao Qu, Xu Wang, Li Chen, Chao Long, Feida Zhu, Kang Du, Min Zheng

We present AvatarVerse, a stable pipeline for generating expressive high-quality 3D avatars from nothing but text descriptions and pose guidance.

Text-to-3D-Human Generation

Paper
Code

LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network

no code implementations • 19 Jul 2023 • Hao Yang, Liyuan Pan, Yan Yang, Richard Hartley, Miaomiao Liu

In this paper, we propose, to the best of our knowledge, the first framework that introduces the contrastive language-image pre-training framework (CLIP) to accurately estimate the blur map from a DP pair unsupervisedly.

Deblurring Image Defocus Deblurring

Paper
Add Code

Sharpness-Aware Graph Collaborative Filtering

no code implementations • 18 Jul 2023 • Huiyuan Chen, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Junpeng Wang, Vivian Lai, Mahashweta Das, Hao Yang

Graph Neural Networks (GNNs) have achieved impressive performance in collaborative filtering.

Collaborative Filtering

Paper
Add Code

Knowledge-Driven Resource Allocation for D2D Networks: A WMMSE Unrolled Graph Neural Network Approach

no code implementations • 12 Jul 2023 • Hao Yang, Nan Cheng, Ruijin Sun, Wei Quan, Rong Chai, Khalid Aldubaikhy, Abdullah Alqasir, Xuemin Shen

This paper proposes an novel knowledge-driven approach for resource allocation in device-to-device (D2D) networks using a graph neural network (GNN) architecture.

Management

Paper
Add Code

Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment

no code implementations • 13 Jun 2023 • Hao Yang, Min Zhang, Shimin Tao, Minghan Wang, Daimeng Wei, Yanfei Jiang

Cross-lingual Machine Translation (MT) quality estimation plays a crucial role in evaluating translation performance.

Machine Translation Sentence +1

Paper
Add Code

User Behavior Simulation with Large Language Model based Agents

1 code implementation • 5 Jun 2023 • Lei Wang, Jingsen Zhang, Hao Yang, ZhiYuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Jun Xu, Zhicheng Dou, Jun Wang, Ji-Rong Wen

Simulating high quality user behavior data has always been a fundamental problem in human-centered applications, where the major difficulty originates from the intricate mechanism of human decision process.

Language Modelling Large Language Model +2

216

Paper
Code

Text Style Transfer Back-Translation

1 code implementation • 2 Jun 2023 • Daimeng Wei, Zhanglin Wu, Hengchao Shang, Zongyao Li, Minghan Wang, Jiaxin Guo, Xiaoyu Chen, Zhengzhe Yu, Hao Yang

To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer model to modify the source side of BT data.

Data Augmentation Domain Adaptation +4

Paper
Code

Investigating Pre-trained Audio Encoders in the Low-Resource Condition

1 code implementation • 28 May 2023 • Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi

Pre-trained speech encoders have been central to pushing state-of-the-art results across various speech understanding and generation tasks.

Paper
Code

Integrating Action Knowledge and LLMs for Task Planning and Situation Handling in Open Worlds

1 code implementation • 27 May 2023 • Yan Ding, Xiaohan Zhang, Saeid Amiri, Nieqing Cao, Hao Yang, Andy Kaminski, Chad Esselink, Shiqi Zhang

Each situation corresponds to a state instance wherein a robot is potentially unable to complete a task using a solution that normally works.

World Knowledge

Paper
Code

UNIMO-3: Multi-granularity Interaction for Vision-Language Representation Learning

no code implementations • 23 May 2023 • Hao Yang, Can Gao, Hao Líu, Xinyan Xiao, Yanyan Zhao, Bing Qin

The experimental results show that our model achieves state-of-the-art performance in various downstream tasks, and through ablation study can prove that effective cross-layer learning improves the model's ability of multimodal representation.

Representation Learning

Paper
Add Code

Imbalanced Aircraft Data Anomaly Detection

no code implementations • 17 May 2023 • Hao Yang, Junyu Gao, Yuan Yuan, Xuelong Li

Anomaly detection in temporal data from sensors under aviation scenarios is a practical but challenging task: 1) long temporal data is difficult to extract contextual information with temporal correlation; 2) the anomalous data are rare in time series, causing normal/abnormal imbalance in anomaly detection, making the detector classification degenerate or even fail.

Anomaly Detection Time Series

Paper
Add Code

Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts

1 code implementation • 11 May 2023 • Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto

We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks which may interfere with each other, resulting in a single model which we named Musketeer.

Language Modelling

Paper
Code

Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge

no code implementations • 11 May 2023 • Aneeq Zia, Kiran Bhattacharyya, Xi Liu, Max Berniker, Ziheng Wang, Rogerio Nespolo, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, Bo Liu, David Austin, Yiheng Wang, Michal Futrega, Jean-Francois Puget, Zhenqiang Li, Yoichi Sato, Ryo Fujii, Ryo Hachiuma, Mana Masuda, Hideo Saito, An Wang, Mengya Xu, Mobarakol Islam, Long Bai, Winnie Pang, Hongliang Ren, Chinedu Nwoye, Luca Sestini, Nicolas Padoy, Maximilian Nielsen, Samuel Schüttler, Thilo Sentker, Hümeyra Husseini, Ivo Baltruschat, Rüdiger Schmitz, René Werner, Aleksandr Matsun, Mugariya Farooq, Numan Saaed, Jose Renato Restom Viera, Mohammad Yaqub, Neil Getty, Fangfang Xia, Zixuan Zhao, Xiaotian Duan, Xing Yao, Ange Lou, Hao Yang, Jintong Han, Jack Noble, Jie Ying Wu, Tamer Abdulbaki Alshirbaji, Nour Aldeen Jalal, Herag Arabian, Ning Ding, Knut Moeller, Weiliang Chen, Quan He, Muhammad Bilal, Taofeek Akinosho, Adnan Qayyum, Massimo Caputo, Hunaid Vohra, Michael Loizou, Anuoluwapo Ajayi, Ilhem Berrou, Faatihah Niyi-Odumosu, Lena Maier-Hein, Danail Stoyanov, Stefanie Speidel, Anthony Jarc

Unfortunately, obtaining the annotations needed to train machine learning models to identify and localize surgical tools is a difficult task.

Paper
Add Code

Context-aware Domain Adaptation for Time Series Anomaly Detection

no code implementations • 15 Apr 2023 • Kwei-Herng Lai, Lan Wang, Huiyuan Chen, Kaixiong Zhou, Fei Wang, Hao Yang, Xia Hu

We formulate context sampling into the Markov decision process and exploit deep reinforcement learning to optimize the time series domain adaptation process via context sampling and design a tailored reward function to generate domain-invariant features that better align two domains for anomaly detection.

Anomaly Detection Domain Adaptation +3

Paper
Add Code

Few-shot Class-incremental Learning for Cross-domain Disease Classification

no code implementations • 12 Apr 2023 • Hao Yang, Weijian Huang, Jiarun Liu, Cheng Li, Shanshan Wang

The ability to incrementally learn new classes from limited samples is crucial to the development of artificial intelligence systems for real clinical application.

Cross-Domain Few-Shot Data Augmentation +2

Paper
Add Code

InterFormer: Real-time Interactive Image Segmentation

1 code implementation • ICCV 2023 • You Huang, Hao Yang, Ke Sun, Shengchuan Zhang, Liujuan Cao, Guannan Jiang, Rongrong Ji

Interactive image segmentation enables annotators to efficiently perform pixel-level annotation for segmentation tasks.

Computational Efficiency Image Segmentation +3

Paper
Code

ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

no code implementations • CVPR 2023 • Hao Yang, Lanqing Hong, Aoxue Li, Tianyang Hu, Zhenguo Li, Gim Hee Lee, LiWei Wang

In this work, we first investigate the effects of synthetic data in synthetic-to-real novel view synthesis and surprisingly observe that models trained with synthetic data tend to produce sharper but less accurate volume densities.

Contrastive Learning Generalizable Novel View Synthesis +2

Paper
Add Code

MGA: Medical generalist agent through text-guided knowledge transformation

no code implementations • 15 Mar 2023 • Weijian Huang, Hao Yang, Cheng Li, Mingtong Dai, Rui Yang, Shanshan Wang

To this end, we propose a novel medical generalist agent, MGA, that can address three kinds of common clinical tasks via clinical reports knowledge transformation.

Clinical Knowledge Inductive Bias

Paper
Add Code

A Meta-Learning Approach to Predicting Performance and Data Requirements

no code implementations • CVPR 2023 • Achin Jain, Gurumurthy Swaminathan, Paolo Favaro, Hao Yang, Avinash Ravichandran, Hrayr Harutyunyan, Alessandro Achille, Onkar Dabeer, Bernt Schiele, Ashwin Swaminathan, Stefano Soatto

The PPL improves the performance estimation on average by 37% across 16 classification and 33% across 10 detection datasets, compared to the power law.

Meta-Learning

Paper
Add Code

KG-BERTScore: Incorporating Knowledge Graph into BERTScore for Reference-Free Machine Translation Evaluation

no code implementations • 30 Jan 2023 • Zhanglin Wu, Min Zhang, Ming Zhu, Yinglu Li, Ting Zhu, Hao Yang, Song Peng, Ying Qin

BERTScore is an effective and robust automatic metric for referencebased machine translation evaluation.

Machine Translation Translation

Paper
Add Code

SwiftAvatar: Efficient Auto-Creation of Parameterized Stylized Character on Arbitrary Avatar Engines

no code implementations • 19 Jan 2023 • Shizun Wang, Weihong Zeng, Xu Wang, Hao Yang, Li Chen, Yi Yuan, Yunzhao Zeng, Min Zheng, Chuang Zhang, Ming Wu

To this end, we propose SwiftAvatar, a novel avatar auto-creation framework that is evidently superior to previous works.

Paper
Add Code

Local and Global Logit Adjustments for Long-Tailed Learning

no code implementations • ICCV 2023 • Yingfan Tao, Jingna Sun, Hao Yang, Li Chen, Xu Wang, Wenming Yang, Daniel Du, Min Zheng

LGLA consists of two core components: a Class-aware Logit Adjustment (CLA) strategy and an Adaptive Angular Weighted (AAW) loss.

Paper
Add Code

Guided Recommendation for Model Fine-Tuning

no code implementations • CVPR 2023 • Hao Li, Charless Fowlkes, Hao Yang, Onkar Dabeer, Zhuowen Tu, Stefano Soatto

With thousands of historical training jobs, a recommendation system can be learned to predict the model selection score given the features of the dataset and the model as input.

Model Selection Transfer Learning

Paper
Add Code

FreeEnricher: Enriching Face Landmarks without Additional Cost

no code implementations • 19 Dec 2022 • Yangyu Huang, Xi Chen, Jongyoo Kim, Hao Yang, Chong Li, Jiaolong Yang, Dong Chen

To evaluate our method, we manually label the dense landmarks on 300W testset.

Ranked #1 on Face Alignment on 300W

Face Alignment

Paper
Add Code

P-Transformer: Towards Better Document-to-Document Neural Machine Translation

no code implementations • 12 Dec 2022 • Yachao Li, Junhui Li, Jing Jiang, Shimin Tao, Hao Yang, Min Zhang

To alleviate this problem, we propose a position-aware Transformer (P-Transformer) to enhance both the absolute and relative position information in both self-attention and cross-attention.

Machine Translation NMT +3

Paper
Add Code

Denoising Self-attentive Sequential Recommendation

no code implementations • 8 Dec 2022 • Huiyuan Chen, Yusan Lin, Menghai Pan, Lan Wang, Chin-Chia Michael Yeh, Xiaoting Li, Yan Zheng, Fei Wang, Hao Yang

Transformer-based sequential recommenders are very powerful for capturing both short-term and long-term sequential item dependencies.

Denoising Sequential Recommendation

Paper
Add Code

TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems

no code implementations • 8 Dec 2022 • Huiyuan Chen, Xiaoting Li, Kaixiong Zhou, Xia Hu, Chin-Chia Michael Yeh, Yan Zheng, Hao Yang

We found that our TinyKG with INT2 quantization aggressively reduces the memory footprint of activation maps with $7 \times$, only with $2\%$ loss in accuracy, allowing us to deploy KGNNs on memory-constrained devices.

Knowledge Graphs Quantization +1

Paper
Add Code

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation • 8 Dec 2022 • Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

142

Paper
Code

SMARTQUERY: An Active Learning Framework for Graph Neural Networks through Hybrid Uncertainty Reduction

no code implementations • 2 Dec 2022 • Xiaoting Li, Yuhang Wu, Vineeth Rakesh, Yusan Lin, Hao Yang, Fei Wang

Graph neural networks have achieved significant success in representation learning.

Active Learning Graph Learning +1

Paper
Add Code

Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing

1 code implementation • 24 Oct 2022 • Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi

Pre-trained speech Transformers have facilitated great success across various speech processing tasks.

Self-Supervised Learning

Paper
Code

Towards Generating Adversarial Examples on Mixed-type Data

no code implementations • 17 Oct 2022 • Han Xu, Menghai Pan, Zhimeng Jiang, Huiyuan Chen, Xiaoting Li, Mahashweta Das, Hao Yang

The existence of adversarial attacks (or adversarial examples) brings huge concern about the machine learning (ML) model's safety issues.

Anomaly Detection Vocal Bursts Type Prediction

Paper
Add Code

RedApt: An Adaptor for wav2vec 2 Encoding \\ Faster and Smaller Speech Translation without Quality Compromise

no code implementations • 16 Oct 2022 • Jinming Zhao, Hao Yang, Gholamreza Haffari, Ehsan Shareghi

Pre-trained speech Transformers in speech translation (ST) have facilitated state-of-the-art (SotA) results; yet, using such encoders is computationally expensive.

Translation

Paper
Add Code

Robot Task Planning and Situation Handling in Open Worlds

no code implementations • 4 Oct 2022 • Yan Ding, Xiaohan Zhang, Saeid Amiri, Nieqing Cao, Hao Yang, Chad Esselink, Shiqi Zhang

This paper introduces a novel algorithm (COWP) for open-world task planning and situation handling that dynamically augments the robot's action knowledge with task-oriented common sense.

Common Sense Reasoning Robot Task Planning +1

Paper
Add Code

ComplETR: Reducing the cost of annotations for object detection in dense scenes with vision transformers

no code implementations • 13 Sep 2022 • Achin Jain, Kibok Lee, Gurumurthy Swaminathan, Hao Yang, Bernt Schiele, Avinash Ravichandran, Onkar Dabeer

Combined with a matching loss, it can effectively find objects that are similar to the input patch and complete the missing annotations.

Decoder Object +2

Paper
Add Code

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining

no code implementations • CVPR 2023 • Xiaoyi Dong, Jianmin Bao, Yinglin Zheng, Ting Zhang, Dongdong Chen, Hao Yang, Ming Zeng, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Second, masked self-distillation is also consistent with vision-language contrastive from the perspective of training objective as both utilize the visual encoder for feature aligning, and thus is able to learn local semantics getting indirect supervision from the language.

Representation Learning

Paper
Add Code

Prompt Tuning for Generative Multimodal Pretrained Models

1 code implementation • 4 Aug 2022 • Hao Yang, Junyang Lin, An Yang, Peng Wang, Chang Zhou, Hongxia Yang

Prompt tuning has become a new paradigm for model tuning and it has demonstrated success in natural language pretraining and even vision pretraining.

Ranked #2 on Visual Entailment on SNLI-VE test

Image Captioning Visual Entailment +1

2,339

Paper
Code

Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark

1 code implementation • 22 Jul 2022 • Kibok Lee, Hao Yang, Satyaki Chakraborty, Zhaowei Cai, Gurumurthy Swaminathan, Avinash Ravichandran, Onkar Dabeer

Most existing works on few-shot object detection (FSOD) focus on a setting where both pre-training and few-shot learning datasets are from a similar domain.

Few-Shot Learning Few-Shot Object Detection +1

Paper
Code

Boosting 3D Object Detection via Object-Focused Image Fusion

1 code implementation • 21 Jul 2022 • Hao Yang, Chen Shi, Yihong Chen, LiWei Wang

Given a set of point features and image feature maps, DeMF adaptively aggregates image features by taking the projected 2D location of the 3D point as reference.

Ranked #5 on 3D Object Detection on SUN-RGBD val

3D Object Detection Object +1

Paper
Code

M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation

1 code implementation • 3 Jul 2022 • Jinming Zhao, Hao Yang, Ehsan Shareghi, Gholamreza Haffari

End-to-end speech-to-text translation models are often initialized with pre-trained speech encoder and pre-trained text decoder.

Decoder Speech-to-Text Translation +1

Paper
Code

Reliable Representations Make A Stronger Defender: Unsupervised Structure Refinement for Robust GNN

1 code implementation • 30 Jun 2022 • Kuan Li, Yang Liu, Xiang Ao, Jianfeng Chi, Jinghua Feng, Hao Yang, Qing He

However, both strategies are faced with some immediate problems: raw features cannot represent various properties of nodes (e. g., structure information), and representations learned by supervised GNN may suffer from the poor performance of the classifier on the poisoned graph.

Paper
Code

MACSA: A Multimodal Aspect-Category Sentiment Analysis Dataset with Multimodal Fine-grained Aligned Annotations

no code implementations • 28 Jun 2022 • Hao Yang, Yanyan Zhao, Jianwei Liu, Yang Wu, Bing Qin

In this paper, we propose a new dataset, the Multimodal Aspect-Category Sentiment Analysis (MACSA) dataset, which contains more than 21K text-image pairs.

Aspect Category Sentiment Analysis Sentiment Analysis

Paper
Add Code

Traffic-Twitter Transformer: A Nature Language Processing-joined Framework For Network-wide Traffic Forecasting

no code implementations • 19 Jun 2022 • Meng-Ju Tsai, Zhiyong Cui, Hao Yang, Cole Kopca, Sophie Tien, Yinhai Wang

To better manage future roadway capacity and accommodate social and human impacts, it is crucial to propose a flexible and comprehensive framework to predict physical-aware long-term traffic conditions for public users and transportation agencies.

Management Time Series +2

Paper
Add Code

Instance-wise Prompt Tuning for Pretrained Language Models

no code implementations • 4 Jun 2022 • Yuezihan Jiang, Hao Yang, Junyang Lin, Hanyu Zhao, An Yang, Chang Zhou, Hongxia Yang, Zhi Yang, Bin Cui

Prompt Learning has recently gained great popularity in bridging the gap between pretraining tasks and various downstream tasks.

Paper
Add Code

Exploring Entity Interactions for Few-Shot Relation Learning (Student Abstract)

no code implementations • 4 May 2022 • Yi Liang, Shuai Zhao, Bo Cheng, Yuwei Yin, Hao Yang

Few-shot relation learning refers to infer facts for relations with a limited number of observed triples.

Metric Learning Relation

Paper
Add Code

Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints

1 code implementation • NAACL 2022 • Chun Zeng, Jiangjie Chen, Tianyi Zhuang, Rui Xu, Hao Yang, Ying Qin, Shimin Tao, Yanghua Xiao

To this end, we propose a plug-in algorithm for this line of work, i. e., Aligned Constrained Training (ACT), which alleviates this problem by familiarizing the model with the source-side context of the constraints.

Translation

Paper
Code

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

1 code implementation • 25 Apr 2022 • Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen

We propose pose-guided multiplane image (MPI) synthesis which can render an animatable character in real scenes with photorealistic quality.

Image-to-Image Translation Neural Rendering +1

Paper
Code

Omni-DETR: Omni-Supervised Object Detection with Transformers

1 code implementation • CVPR 2022 • Pei Wang, Zhaowei Cai, Hao Yang, Gurumurthy Swaminathan, Nuno Vasconcelos, Bernt Schiele, Stefano Soatto

This is enabled by a unified architecture, Omni-DETR, based on the recent progress on student-teacher framework and end-to-end transformer based object detection.

Ranked #14 on Semi-Supervised Object Detection on COCO 2% labeled data

Object object-detection +2

Paper
Code

Large-Scale Pre-training for Person Re-identification with Noisy Labels

2 code implementations • CVPR 2022 • Dengpan Fu, Dongdong Chen, Hao Yang, Jianmin Bao, Lu Yuan, Lei Zhang, Houqiang Li, Fang Wen, Dong Chen

Since theses ID labels automatically derived from tracklets inevitably contain noises, we develop a large-scale Pre-training framework utilizing Noisy Labels (PNL), which consists of three learning modules: supervised Re-ID learning, prototype-based contrastive learning, and label-guided contrastive learning.

Ranked #7 on Person Re-Identification on CUHK03

Contrastive Learning Multi-Object Tracking +3

221

Paper
Code

Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors

1 code implementation • Findings (ACL) 2022 • Yang Wu, Yanyan Zhao, Hao Yang, Song Chen, Bing Qin, Xiaohuan Cao, Wenting Zhao

Through further analysis of the ASR outputs, we find that in some cases the sentiment words, the key sentiment elements in the textual modality, are recognized as other words, which makes the sentiment of the text change and hurts the performance of multimodal sentiment models directly.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Code

Forecast-based Multi-aspect Framework for Multivariate Time-series Anomaly Detection

no code implementations • 13 Jan 2022 • Lan Wang, Yusan Lin, Yuhang Wu, Huiyuan Chen, Fei Wang, Hao Yang

Today's cyber-world is vastly multivariate.

Time Series Time Series Anomaly Detection +1

Paper
Add Code

Rethinking Feature Uncertainty in Stochastic Neural Networks for Adversarial Robustness

no code implementations • 1 Jan 2022 • Hao Yang, Min Wang, Zhengfei Yu, Yun Zhou

Extensive experiments on well-known white- and black-box attacks show that MFDV-SNN achieves a significant improvement over existing methods, which indicates that it is a simple but effective method to improve model robustness.

Adversarial Robustness

Paper
Add Code

Joint-training on Symbiosis Networks for Deep Nueral Machine Translation models

no code implementations • 22 Dec 2021 • Zhengzhe Yu, Jiaxin Guo, Minghan Wang, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhanglin Wu, Yuxia Wang, Yimeng Chen, Chang Su, Min Zhang, Lizhi Lei, Shimin Tao, Hao Yang

Deep encoders have been proven to be effective in improving neural machine translation (NMT) systems, but it reaches the upper bound of translation quality when the number of encoder layers exceeds 18.

Machine Translation NMT +1

Paper
Add Code

Diformer: Directional Transformer for Neural Machine Translation

no code implementations • EAMT 2022 • Minghan Wang, Jiaxin Guo, Yuxia Wang, Daimeng Wei, Hengchao Shang, Chang Su, Yimeng Chen, Yinglu Li, Min Zhang, Shimin Tao, Hao Yang

In this paper, we aim to close the gap by preserving the original objective of AR and NAR under a unified framework.

Language Modelling Machine Translation +1

Paper
Add Code

Self-Distillation Mixup Training for Non-autoregressive Neural Machine Translation

no code implementations • 22 Dec 2021 • Jiaxin Guo, Minghan Wang, Daimeng Wei, Hengchao Shang, Yuxia Wang, Zongyao Li, Zhengzhe Yu, Zhanglin Wu, Yimeng Chen, Chang Su, Min Zhang, Lizhi Lei, Shimin Tao, Hao Yang

An effective training strategy to improve the performance of AT models is Self-Distillation Mixup (SDM) Training, which pre-trains a model on raw data, generates distilled data by the pre-trained model itself and finally re-trains a model on the combination of raw data and distilled data.

Knowledge Distillation Machine Translation +1

Paper
Add Code

General Facial Representation Learning in a Visual-Linguistic Manner

2 code implementations • CVPR 2022 • Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen, Yangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, Fang Wen

In this paper, we study the transfer performance of pre-trained models on face analysis tasks and introduce a framework, called FaRL, for general Facial Representation Learning in a visual-linguistic manner.

Ranked #1 on Face Parsing on CelebAMask-HQ (using extra training data)

Face Alignment Face Parsing +1

337

Paper
Code

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems

1 code implementation • NeurIPS 2021 • Wenqing Zheng, Qiangqiang Guo, Hao Yang, Peihao Wang, Zhangyang Wang

This paper presents the Delayed Propagation Transformer (DePT), a new transformer-based model that specializes in the global modeling of CPS while taking into account the immutable constraints from the physical world.

Inductive Bias

Paper
Code

Few-shot graph link prediction with domain adaptation

no code implementations • 29 Sep 2021 • Hao Zhu, Mahashweta Das, Mangesh Bendre, Fei Wang, Hao Yang, Soha Hassoun

In this work, we propose an adversarial training based modification to the current state-of-the-arts link prediction method to solve this problem.

Domain Adaptation Few-Shot Learning +1

Paper
Add Code

ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment

1 code implementation • ICCV 2021 • Yangyu Huang, Hao Yang, Chong Li, Jongyoo Kim, Fangyun Wei

On the other hand, AAM is an attention module which can get anisotropic attention mask focusing on the region of point and its local edge connected by adjacent points, it has a stronger response in tangent than in normal, which means relaxed constraints in the tangent.

Ranked #5 on Face Alignment on 300W

Face Alignment

Paper
Code

How Does Adversarial Fine-Tuning Benefit BERT?

no code implementations • 31 Aug 2021 • Javid Ebrahimi, Hao Yang, Wei zhang

Adversarial training (AT) is one of the most reliable methods for defending against adversarial attacks in machine learning.

Continual Learning Dependency Parsing +3

Paper
Add Code

Event2Graph: Event-driven Bipartite Graph for Multivariate Time-series Anomaly Detection

no code implementations • 15 Aug 2021 • Yuhang Wu, Mengting Gu, Lan Wang, Yusan Lin, Fei Wang, Hao Yang

Modeling inter-dependencies between time-series is the key to achieve high performance in anomaly detection for multivariate time-series data.

Anomaly Detection Time Series +1

Paper
Add Code

The HW-TSC's Offline Speech Translation Systems for IWSLT 2021 Evaluation

no code implementations • 9 Aug 2021 • Minghan Wang, Yuxia Wang, Chang Su, Jiaxin Guo, Yingtao Zhang, Yujia Liu, Min Zhang, Shimin Tao, Xingshan Zeng, Liangyou Li, Hao Yang, Ying Qin

This paper describes our work in participation of the IWSLT-2021 offline speech translation task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

Task and Situation Structures for Service Agent Planning

no code implementations • 27 Jul 2021 • Hao Yang, Tavan Eftekhar, Chad Esselink, Yan Ding, Shiqi Zhang

Everyday tasks are characterized by their varieties and variations, and frequently are not clearly specified to service agents.

Paper
Add Code

Patch-Wise Spatial-Temporal Quality Enhancement for HEVC Compressed Video

1 code implementation • journal 2021 • Qing Ding, Liquan Shen, Liangwei Yu, Hao Yang, Mai Xu

To overcome these limitations, we propose a patch-wise spatial-temporal quality enhancement network which firstly extracts spatial and temporal features, then recalibrates and fuses the obtained spatial and temporal features.

Quantization Video Enhancement

Paper
Code

Normalization of Language Embeddings for Cross-Lingual Alignment

1 code implementation • NeurIPS 2021 • Prince Osei Aboagye, Jeff Phillips, Yan Zheng, Chin-Chia Michael Yeh, Junpeng Wang, Wei zhang, Liang Wang, Hao Yang

Learning a good transfer function to map the word vectors from two languages into a shared cross-lingual word vector space plays a crucial role in cross-lingual NLP.

Translation

Paper
Code

Pick and Choose: A GNN-based Imbalanced Learning Approach for Fraud Detection

1 code implementation • The Web Conference 2021 • Yang Liu1, Xiang Ao, Zidi Qin, Jianfeng Chi, Jinghua Feng, Hao Yang, Qing He

Graph-based fraud detection approaches have escalated lots of attention recently due to the abundant relational information of graph-structured data, which may be beneficial for the detection of fraudsters.

Ranked #4 on Fraud Detection on Amazon-Fraud

Fraud Detection Node Classification

Paper
Code

Learning from Noisy Labels via Dynamic Loss Thresholding

no code implementations • 1 Apr 2021 • Hao Yang, Youzhi Jin, Ziyin Li, Deng-Bao Wang, Lei Miao, Xin Geng, Min-Ling Zhang

During the training process, DLT records the loss value of each sample and calculates dynamic loss thresholds.

Paper
Add Code

Integrating Subgraph-aware Relation and DirectionReasoning for Question Answering

no code implementations • 1 Apr 2021 • Xu Wang, Shuai Zhao, Bo Cheng, Jiale Han, Yingting Li, Hao Yang, Ivan Sekulic, Guoshun Nan

Question Answering (QA) models over Knowledge Bases (KBs) are capable of providing more precise answers by utilizing relation information among entities.

Question Answering Relation

Paper
Add Code

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

1 code implementation • CVPR 2021 • Chulin Xie, Chuxin Wang, Bo Zhang, Hao Yang, Dong Chen, Fang Wen

In this paper, we proposed a novel Style-based Point Generator with Adversarial Rendering (SpareNet) for point cloud completion.

Ranked #1 on Point Cloud Completion on ShapeNet (Earth Mover's Distance metric)

Point Cloud Completion

135

Paper
Code

Large-Scale Training System for 100-Million Classification at Alibaba

no code implementations • 9 Feb 2021 • Liuyihan Song, Pan Pan, Kang Zhao, Hao Yang, Yiming Chen, Yingya Zhang, Yinghui Xu, Rong Jin

In the last decades, extreme classification has become an essential topic for deep learning.

Classification General Classification

Paper
Add Code

Adversarial Example Detection Using Latent Neighborhood Graph

no code implementations • ICCV 2021 • Ahmed Abusnaina, Yuhang Wu, Sunpreet Arora, Yizhen Wang, Fei Wang, Hao Yang, David Mohaisen

We present the first graph-based adversarial detection method that constructs a Latent Neighborhood Graph (LNG) around an input example to determine if the input example is adversarial.

Adversarial Attack Graph Attention

Paper
Add Code

On Position Embeddings in BERT

no code implementations • ICLR 2021 • Benyou Wang, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Grue Simonsen

Various Position Embeddings (PEs) have been proposed in Transformer based architectures~(e. g. BERT) to model word order.

General Classification Position +1

Paper
Add Code

Beating Attackers At Their Own Games: Adversarial Example Detection Using Adversarial Gradient Directions

no code implementations • 31 Dec 2020 • Yuhang Wu, Sunpreet S. Arora, Yanhong Wu, Hao Yang

Adversarial examples are input examples that are specifically crafted to deceive machine learning classifiers.

Paper
Add Code

Unsupervised Pre-training for Person Re-identification

1 code implementation • CVPR 2021 • Dengpan Fu, Dongdong Chen, Jianmin Bao, Hao Yang, Lu Yuan, Lei Zhang, Houqiang Li, Dong Chen

In this paper, we present a large scale unlabeled person re-identification (Re-ID) dataset "LUPerson" and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation.

Ranked #1 on Person Re-Identification on Market-1501 (using extra training data)

Data Augmentation Person Re-Identification +1

221

Paper
Code

Modelling Long-distance Node Relations for KBQA with Global Dynamic Graph

no code implementations • COLING 2020 • Xu Wang, Shuai Zhao, Jiale Han, Bo Cheng, Hao Yang, Jianchang Ao, Zhenzi Li

The structural information of Knowledge Bases (KBs) has proven effective to Question Answering (QA).

Question Answering Vocal Bursts Type Prediction

Paper
Add Code

Gaussian State-Based Quantum Illumination with Simple Photodetection

no code implementations • 27 Nov 2020 • Hao Yang, Wojciech Roga, Jonathan D. Pritchard, John Jeffers

We use the continuous-variable Gaussian quantum information formalism to show that quantum illumination is better for object detection compared with coherent states of the same mean photon number, even for simple direct photodetection.

Object Detection Quantum Physics

Paper
Add Code

A coarse-to-fine framework for unsupervised multi-contrast MR image deformable registration with dual consistency constraint

no code implementations • 5 Aug 2020 • Weijian Huang, Hao Yang, Xinfeng Liu, Cheng Li, Ian Zhang, Rongpin Wang, Hairong Zheng, Shan-Shan Wang

Multi-contrast magnetic resonance (MR) image registration is useful in the clinic to achieve fast and accurate imaging-based disease diagnosis and treatment planning.

Image Registration

Paper
Add Code

Edge Computing for Real-Time Near-Crash Detection for Smart Transportation Applications

no code implementations • 2 Aug 2020 • Ruimin Ke, Zhiyong Cui, Yanlong Chen, Meixin Zhu, Hao Yang, Yinhai Wang

It is among the first efforts in applying edge computing for real-time traffic video analytics and is expected to benefit multiple sub-fields in smart transportation research and applications.

Autonomous Driving Edge-computing +2

Paper
Add Code

The HW-TSC Video Speech Translation System at IWSLT 2020

no code implementations • WS 2020 • Minghan Wang, Hao Yang, Yao Deng, Ying Qin, Lizhi Lei, Daimeng Wei, Hengchao Shang, Ning Xie, Xiaochun Li, Jiaxian Guo

The paper presents details of our system in the IWSLT Video Speech Translation evaluation.

NMT Translation

Paper
Add Code

Category-Specific CNN for Visual-aware CTR Prediction at JD.com

no code implementations • 18 Jun 2020 • Hu Liu, Jing Lu, Hao Yang, Xiwei Zhao, Sulong Xu, Hao Peng, Zehua Zhang, Wenjie Niu, Xiaokun Zhu, Yongjun Bao, Weipeng Yan

Existing algorithms usually extract visual features using off-the-shelf Convolutional Neural Networks (CNNs) and late fuse the visual and non-visual features for the finally predicted CTR.

Click-Through Rate Prediction

Paper
Add Code

GroupIM: A Mutual Information Maximization Framework for Neural Group Recommendation

1 code implementation • 5 Jun 2020 • Aravind Sankar, Yanhong Wu, Yuhang Wu, Wei zhang, Hao Yang, Hari Sundaram

We study the problem of making item recommendations to ephemeral groups, which comprise users with limited or no historical activities together.

Paper
Code

Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation

no code implementations • 21 May 2020 • Adit Krishnan, Mahashweta Das, Mangesh Bendre, Hao Yang, Hari Sundaram

The rapid proliferation of new users and items on the social web has aggravated the gray-sheep user/long-tail item challenge in recommender systems.

Clustering Collaborative Filtering +2

Paper
Add Code

Fashion Recommendation and Compatibility Prediction Using Relational Network

no code implementations • 13 May 2020 • Maryam Moosaei, Yusan Lin, Hao Yang

There are a few approaches that consider an entire outfit, but these approaches have limitations such as requiring rich semantic information, category labels, and fixed order of items.

Relation Network

Paper
Add Code

Adversarial Light Projection Attacks on Face Recognition Systems: A Feasibility Study

no code implementations • 24 Mar 2020 • Dinh-Luan Nguyen, Sunpreet S. Arora, Yuhang Wu, Hao Yang

While feasible, digital attacks have limited applicability in attacking deployed systems, including face recognition systems, where an adversary typically has access to the input and not the transmission channel.

Face Recognition

Paper
Add Code

Feedback Graph Convolutional Network for Skeleton-based Action Recognition

no code implementations • 17 Mar 2020 • Hao Yang, Dan Yan, Li Zhang, Dong Li, YunDa Sun, ShaoDi You, Stephen J. Maybank

It transmits the high-level semantic features to the low-level layers and flows temporal information stage by stage to progressively model global spatial-temporal features for action recognition; (3) The FGCN model provides early predictions.

Ranked #34 on Skeleton Based Action Recognition on NTU RGB+D 120

Action Recognition Skeleton Based Action Recognition

Paper
Add Code

Rethinking the Hyperparameters for Fine-tuning

1 code implementation • ICLR 2020 • Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

Our findings challenge common practices of fine-tuning and encourages deep learning practitioners to rethink the hyperparameters for fine-tuning.

Transfer Learning

197

Paper
Code

Multi-Task Incremental Learning for Object Detection

no code implementations • 13 Feb 2020 • Xialei Liu, Hao Yang, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

For the difficult cases, where the domain gaps and especially category differences are large, we explore three different exemplar sampling methods and show the proposed adaptive sampling method is effective to select diverse and informative samples from entire datasets, to further prevent forgetting.

Incremental Learning Object +2

Paper
Add Code

Face X-ray for More General Face Forgery Detection

4 code implementations • CVPR 2020 • Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, Baining Guo

For this reason, face X-ray provides an effective way for detecting forgery generated by most existing face manipulation algorithms.

DeepFake Detection Face Swapping

Paper
Code

FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping

10 code implementations • 31 Dec 2019 • Lingzhi Li, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen

We propose a novel attributes encoder for extracting multi-level target face attributes, and a new generator with carefully designed Adaptive Attentional Denormalization (AAD) layers to adaptively integrate the identity and the attributes for face synthesis.

Face Generation Face Swapping +1

592

Paper
Code

motif2vec: Motif Aware Node Representation Learning for Heterogeneous Networks

no code implementations • 22 Aug 2019 • Manoj Reddy Dareddy, Mahashweta Das, Hao Yang

Supervised machine learning tasks in networks such as node classification and link prediction require us to perform feature engineering that is known and agreed to be the key to success in applied machine learning.

BIG-bench Machine Learning Feature Engineering +4

Paper
Add Code

Detecting 11K Classes: Large Scale Object Detection without Fine-Grained Bounding Boxes

no code implementations • ICCV 2019 • Hao Yang, Hao Wu, Hao Chen

However, these methods require fully annotated object bounding boxes for training, which are incredibly hard to scale up due to the high annotation cost.

Object object-detection +2

Paper
Add Code

Position Focused Attention Network for Image-Text Matching

1 code implementation • 23 Jul 2019 • Yaxiong Wang, Hao Yang, Xueming Qian, Lin Ma, Jing Lu, Biao Li, Xin Fan

Then, an attention mechanism is proposed to model the relations between the image region and blocks and generate the valuable position feature, which will be further utilized to enhance the region expression and model a more reliable relationship between the visual image and the textual sentence.

Image-text matching Position +2

Paper
Code

X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies

1 code implementation • 16 Jul 2019 • Kehan Qi, Hao Yang, Cheng Li, Zaiyi Liu, Meiyun Wang, Qiegen Liu, Shan-Shan Wang

Recently, approaches based on deep learning and methods for contextual information extraction have served in many image segmentation tasks.

Ranked #1 on Lesion Segmentation on Anatomical Tracings of Lesions After Stroke (ATLAS)

Image Segmentation Lesion Segmentation +2

Paper
Code

Predicting Next-Season Designs on High Fashion Runway

no code implementations • 16 Jul 2019 • Yusan Lin, Hao Yang

Fashion is a large and fast-changing industry.

Vocal Bursts Intensity Prediction

Paper
Add Code

CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke

2 code implementations • 16 Jul 2019 • Hao Yang, Weijian Huang, Kehan Qi, Cheng Li, Xinfeng Liu, Meiyun Wang, Hairong Zheng, Shan-Shan Wang

To address these challenges, this paper proposes a Cross-Level fusion and Context Inference Network (CLCI-Net) for the chronic stroke lesion segmentation from T1-weighted MR images.

Decoder Image Segmentation +2

Paper
Code

pNovo 3: precise de novo peptide sequencing using a learning-to-rank framework

1 code implementation • Bioinformatics 2019 • Hao Yang, Hao Chi, Wen-Feng Zeng, Wen-Jing Zhou, Si-Min He

In order to solve this problem, we developed pNovo 3, which used a learning-to-rank framework to distinguish similar peptide candidates for each spectrum.

de novo peptide sequencing Learning-To-Rank

Paper
Code

Face Parsing with RoI Tanh-Warping

2 code implementations • CVPR 2019 • Jinpeng Lin, Hao Yang, Dong Chen, Ming Zeng, Fang Wen, Lu Yuan

It uses hierarchical local based method for inner facial components and global methods for outer facial components.

Face Parsing

376

Paper
Code

The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding Distillation with Ensemble Learning

1 code implementation • 31 May 2019 • Bonggun Shin, Hao Yang, Jinho D. Choi

Recent advances in deep learning have facilitated the demand of neural models for real applications.

Ranked #2 on Sentiment Analysis on MPQA

Document Classification Ensemble Learning +2

Paper
Code

Mask-Guided Portrait Editing with Conditional GANs

no code implementations • CVPR 2019 • Shuyang Gu, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen, Lu Yuan

Portrait editing is a popular subject in photo manipulation.

Data Augmentation Face Generation +3

Paper
Add Code

Real-Time Steganalysis for Stream Media Based on Multi-channel Convolutional Sliding Windows

no code implementations • 4 Feb 2019 • Zhongliang Yang, Hao Yang, Yuting Hu, Yongfeng Huang, Yu-Jin Zhang

To solve these two challenges, in this paper, combined with the sliding window detection algorithm and Convolution Neural Network we propose a real-time VoIP steganalysis method which based on multi-channel convolution sliding windows.

Steganalysis

Paper
Add Code

Dynamic Graph Representation Learning via Self-Attention Networks

2 code implementations • 22 Dec 2018 • Aravind Sankar, Yanhong Wu, Liang Gou, Wei zhang, Hao Yang

Learning latent representations of nodes in graphs is an important and ubiquitous task with widespread applications such as link prediction, node classification, and graph visualization.

General Classification Graph Embedding +3

272

Paper
Code

An End-to-End Multi-task Learning Model for Fact Checking

no code implementations • WS 2018 • Sizhen Li, Shuai Zhao, Bo Cheng, Hao Yang

With huge amount of information generated every day on the web, fact checking is an important and challenging task which can help people identify the authenticity of most claims as well as providing evidences selected from knowledge source like Wikipedia.

Common Sense Reasoning Entity Linking +4

Paper
Add Code

Zero-Annotation Object Detection with Web Knowledge Transfer

no code implementations • ECCV 2018 • Qingyi Tao, Hao Yang, Jianfei Cai

Object detection is one of the major problems in computer vision, and has been extensively studied.

Domain Adaptation Object +3

Paper
Add Code

Exploiting Web Images for Weakly Supervised Object Detection

no code implementations • 27 Jul 2017 • Qingyi Tao, Hao Yang, Jianfei Cai

Object detection without bounding box annotations, i. e, weakly supervised detection methods, are still lagging far behind.

Ranked #17 on Weakly Supervised Object Detection on PASCAL VOC 2012 test (using extra training data)

Object object-detection +2

Paper
Add Code

MIML-FCN+: Multi-instance Multi-label Learning via Fully Convolutional Networks with Privileged Information

no code implementations • CVPR 2017 • Hao Yang, Joey Tianyi Zhou, Jianfei Cai, Yew Soon Ong

As the proposed PI loss is convex and SGD compatible and the framework itself is a fully convolutional network, MIML-FCN+ can be easily integrated with state of-the-art deep learning networks.

Image Captioning Multi-Label Learning +1

Paper
Add Code

Improving Multi-label Learning with Missing Labels by Structured Semantic Correlations

no code implementations • 4 Aug 2016 • Hao Yang, Joey Tianyi Zhou, Jianfei Cai

Experimental results demonstrate the effectiveness of the proposed semantic descriptor and the usefulness of incorporating the structured semantic correlations.

Missing Labels Object Recognition

Paper
Add Code

Efficient 3D Room Shape Recovery From a Single Panorama

no code implementations • CVPR 2016 • Hao Yang, HUI ZHANG

We propose a method to recover the shape of a 3D room from a full-view indoor panorama.

3D Reconstruction Superpixels

Paper
Add Code

A Comparative Study of Object Trackers for Infrared Flying Bird Tracking

no code implementations • 18 Jan 2016 • Ying Huang, Hong Zheng, Haibin Ling, Erik Blasch, Hao Yang

Bird strikes present a huge risk for aircraft, especially since traditional airport bird surveillance is mainly dependent on inefficient human observation.

Paper
Add Code

Exploit Bounding Box Annotations for Multi-label Object Recognition

no code implementations • CVPR 2016 • Hao Yang, Joey Tianyi Zhou, Yu Zhang, Bin-Bin Gao, Jianxin Wu, Jianfei Cai

With strong labels, our framework is able to achieve state-of-the-art results in both datasets.

Ranked #16 on Multi-Label Classification on PASCAL VOC 2007

Multi-Label Classification Object +1

Paper
Add Code

A Parallel Way to Select the Parameters of SVM Based on the Ant Optimization Algorithm

no code implementations • 19 May 2014 • Chao Zhang, Hong-cen Mei, Hao Yang

A large number of experimental data shows that Support Vector Machine (SVM) algorithm has obvious advantages in text classification, handwriting recognition, image classification, bioinformatics, and some other fields.

General Classification Handwriting Recognition +3

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.