no code implementations • 22 Apr 2024 • Man Tik Ng, Hui Tung Tse, Jen-tse Huang, Jingjing Li, Wenxuan Wang, Michael R. Lyu
However, existing studies focus on imitating well-known public figures or fictional characters, overlooking the potential for simulating ordinary individuals.
1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei LI, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, huimin zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where the submitted solutions were evaluated on KVQ, a dataset collected from the popular short-form video platform, i.e., the Kuaishou/Kwai platform.
1 code implementation • 18 Mar 2024 • Jen-tse Huang, Eric John Li, Man Ho Lam, Tian Liang, Wenxuan Wang, Youliang Yuan, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Michael R. Lyu
Decision-making, a complicated task requiring various types of abilities, presents an excellent framework for assessing Large Language Models (LLMs).
no code implementations • 17 Feb 2024 • Wenxuan Wang, Yihang Su, Jingyuan Huan, Jie Liu, WenTing Chen, Yudi Zhang, Cheng-Yi Li, Kao-Jung Chang, Xiaohan Xin, Linlin Shen, Michael R. Lyu
However, these models are often evaluated on benchmarks that are unsuitable for the Med-MLLMs due to the intricate nature of the real-world diagnostic frameworks, which encompass diverse medical specialties and involve complex clinical decisions.
1 code implementation • 17 Feb 2024 • Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu
Previous datasets and methods for the classic VG task mainly rely on the prior assumption that the given expression must literally refer to the target object, which greatly impedes the practical deployment of agents in real-world scenarios.
no code implementations • 1 Jan 2024 • Wenxuan Wang, Haonan Bai, Jen-tse Huang, Yuxuan Wan, Youliang Yuan, Haoyi Qiu, Nanyun Peng, Michael R. Lyu
BiasPainter uses a diverse range of seed images of individuals and prompts the image generation models to edit these images using gender, race, and age-neutral queries.
no code implementations • 1 Jan 2024 • Yuxuan Wan, Wenxuan Wang, Yiliu Yang, Youliang Yuan, Jen-tse Huang, Pinjia He, Wenxiang Jiao, Michael R. Lyu
In addition, the test cases of LogicAsker can be further used to design demonstration examples for in-context learning, which effectively improves the logical reasoning ability of LLMs, e.g., by 10% for GPT-4.
no code implementations • 1 Jan 2024 • Wenxuan Wang, Juluan Shi, Zhaopeng Tu, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu
Current methods for evaluating LLMs' veracity are limited by test data leakage or the need for extensive human labor, hindering efficient and accurate error detection.
1 code implementation • 13 Dec 2023 • Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu
To foster future research into fine-grained visual grounding, our benchmark RefCOCOm, the MRES-32M dataset and model UniRES will be publicly available at https://github.com/Rubics-Xuan/MRES.
no code implementations • 6 Nov 2023 • Xujie Song, Tong Liu, Shengbo Eben Li, Jingliang Duan, Wenxuan Wang, Keqiang Li
This paper proposes an Ising learning algorithm to train quantized neural networks (QNNs) by incorporating two essential techniques, namely binary representation of the network topology and order reduction of the loss function.
1 code implementation • 31 Oct 2023 • Tian Liang, Zhiwei He, Jen-tse Huang, Wenxuan Wang, Wenxiang Jiao, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi, Xing Wang
Ideally, an advanced agent should possess the ability to accurately describe a given word using an aggressive description while concurrently maximizing confusion in the conservative description, enhancing its participation in the game.
no code implementations • 31 Oct 2023 • Kunyu Wang, Juluan Shi, Wenxuan Wang
In this work, we present a novel approach to generate transferable targeted adversarial examples by exploiting the vulnerability of deep neural networks to perturbations on high-frequency components of images.
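As a rough, self-contained illustration of the underlying idea (not the paper's actual attack pipeline), a signal can be split into low- and high-frequency components with a discrete Fourier transform, after which a perturbation can be confined to the high-frequency band. The `split_bands` helper and its cutoff are hypothetical names chosen for this sketch:

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform of a real-valued sequence.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    # Inverse DFT, returning the real part (input was real-valued).
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]

def split_bands(x, cutoff):
    # Partition the spectrum: bins below `cutoff` (and their conjugate
    # mirror) form the low-frequency part; everything else is high-frequency.
    X = dft(x)
    N = len(X)
    low_bins = [X[k] if (k < cutoff or k > N - cutoff) else 0 for k in range(N)]
    high_bins = [0 if (k < cutoff or k > N - cutoff) else X[k] for k in range(N)]
    return idft(low_bins), idft(high_bins)
```

Because the two bands partition the spectrum, `low + high` reconstructs the original signal exactly; an attack in this spirit would add its perturbation only to the high-frequency part before recombining.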
no code implementations • 28 Oct 2023 • Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li
As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i.e., Med-DANet) has achieved a favorable accuracy-efficiency trade-off by dynamically selecting a suitable 2D candidate model from a pre-defined model bank for different slices.
no code implementations • 19 Oct 2023 • Wenxuan Wang, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang, Zhaopeng Tu, Michael R. Lyu
This paper identifies a cultural dominance issue within large language models (LLMs) due to the predominant use of English data in model training (e.g., ChatGPT).
1 code implementation • 9 Oct 2023 • Jingliang Duan, Wenxuan Wang, Liming Xiao, Jiaxin Gao, Shengbo Eben Li
Reinforcement learning (RL) has proven to be highly effective in tackling complex decision-making and control tasks.
1 code implementation • 2 Oct 2023 • Wenxuan Wang, Zhaopeng Tu, Chang Chen, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu
In this work, we build the first multilingual safety benchmark for LLMs, XSafety, in response to the global deployment of LLMs in practice.
1 code implementation • 2 Oct 2023 • Jen-tse Huang, Wenxuan Wang, Eric John Li, Man Ho Lam, Shujie Ren, Youliang Yuan, Wenxiang Jiao, Zhaopeng Tu, Michael R. Lyu
Large Language Models (LLMs) have recently showcased their remarkable capacities, not only in natural language processing tasks but also across diverse domains such as clinical medicine, legal consultation, and education.
2 code implementations • 20 Aug 2023 • Kunyu Wang, Xuanran He, Wenxuan Wang, Xiaosen Wang
In this work, we observe that existing input transformation based attacks, one of the mainstream transfer-based attacks, result in different attention heatmaps on various models, which might limit the transferability.
no code implementations • 18 Aug 2023 • Wenxuan Wang, Jingyuan Huang, Jen-tse Huang, Chang Chen, Jiazhen Gu, Pinjia He, Michael R. Lyu
Moreover, by retraining the models with the test cases generated by OASIS, the robustness of the moderation model can be improved without performance degradation.
no code implementations • 18 Aug 2023 • Yichen Yan, Xingjian He, Wenxuan Wang, Sihan Chen, Jing Liu
In previous approaches, fused vision-language features are directly fed into a decoder and pass through a convolution with a fixed kernel to obtain the result, which follows a similar pattern as traditional image segmentation.
1 code implementation • 12 Aug 2023 • Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Pinjia He, Shuming Shi, Zhaopeng Tu
We propose a novel framework CipherChat to systematically examine the generalizability of safety alignment to non-natural languages -- ciphers.
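CipherChat encodes queries with ciphers before sending them to the model; as a minimal illustrative sketch of one simple encoding of that kind (a Caesar shift — shown here only as an example of a cipher, not as CipherChat's implementation), consider:

```python
def caesar(text, shift=3):
    # Shift each letter by `shift` positions in the alphabet,
    # wrapping around; non-letter characters pass through unchanged.
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)
```

Decoding uses the negative shift, so a query can be round-tripped: `caesar(caesar(q, 3), -3) == q`.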
1 code implementation • 7 Aug 2023 • Jen-tse Huang, Man Ho Lam, Eric John Li, Shujie Ren, Wenxuan Wang, Wenxiang Jiao, Zhaopeng Tu, Michael R. Lyu
Evaluating Large Language Models' (LLMs) anthropomorphic capabilities has become increasingly important in contemporary discourse.
1 code implementation • 31 May 2023 • Jen-tse Huang, Wenxuan Wang, Man Ho Lam, Eric John Li, Wenxiang Jiao, Michael R. Lyu
Recent research has extended beyond assessing the performance of Large Language Models (LLMs) to examining them from a psychological standpoint, acknowledging the necessity of understanding their behavioral characteristics.
no code implementations • 23 May 2023 • Wenxuan Wang, Jingyuan Huang, Chang Chen, Jiazhen Gu, Jianping Zhang, Weibin Wu, Pinjia He, Michael Lyu
To this end, content moderation software has been widely deployed on these platforms to detect and block toxic content.
1 code implementation • 21 May 2023 • Yuxuan Wan, Wenxuan Wang, Pinjia He, Jiazhen Gu, Haonan Bai, Michael Lyu
Particularly, it is hard to generate inputs that comprehensively trigger potential bias, due to the lack of data containing both social groups and biased properties.
no code implementations • 19 May 2023 • Wenxuan Wang, Jing Liu, Xingjian He, Yisi Zhang, Chen Chen, Jiachen Shen, Yan Zhang, Jiangyun Li
Referring image segmentation (RIS) is a fundamental vision-language task that intends to segment a desired object from an image based on a given natural language expression.
1 code implementation • 21 Apr 2023 • Wenxuan Wang, Jing Wang, Chen Chen, Jianbo Jiao, Yuanxiu Cai, Shanshan Song, Jiangyun Li
The research community has witnessed the powerful potential of self-supervised Masked Image Modeling (MIM), which enables models to learn visual representations from unlabeled data.
no code implementations • 21 Apr 2023 • Wenxuan Wang, Jiachen Shen, Chen Chen, Jianbo Jiao, Jing Liu, Yan Zhang, Shanshan Song, Jiangyun Li
In this paper, we present the study on parameter-efficient transfer learning for medical volumetric segmentation and propose a new framework named Med-Tuning based on intra-stage feature enhancement and inter-stage feature interaction.
1 code implementation • 5 Apr 2023 • Wenxiang Jiao, Jen-tse Huang, Wenxuan Wang, Zhiwei He, Tian Liang, Xing Wang, Shuming Shi, Zhaopeng Tu
Therefore, we propose ParroT, a framework to enhance and regulate the translation abilities during chat based on open-source LLMs (e.g., LLaMA), human-written translation and feedback data.
1 code implementation • CVPR 2023 • Jianping Zhang, Jen-tse Huang, Wenxuan Wang, Yichen Li, Weibin Wu, Xiaosen Wang, Yuxin Su, Michael R. Lyu
However, such methods select the image augmentation path heuristically and may augment images that are semantically inconsistent with the target images, which harms the transferability of the generated adversarial samples.
no code implementations • 15 Mar 2023 • Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
ChatGPT is a cutting-edge artificial intelligence language model developed by OpenAI, which has attracted a lot of attention due to its surprisingly strong ability in answering follow-up questions.
1 code implementation • 11 Feb 2023 • Wenxuan Wang, Jen-tse Huang, Weibin Wu, Jianping Zhang, Yizhan Huang, Shuqing Li, Pinjia He, Michael Lyu
In addition, we leverage the test cases generated by MTTM to retrain the model we explored, which largely improves model robustness (0% to 5.9% EFR) while maintaining the accuracy on the original test set.
1 code implementation • 20 Jan 2023 • Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Xing Wang, Shuming Shi, Zhaopeng Tu
By evaluating on a number of benchmark test sets, we find that ChatGPT performs competitively with commercial translation products (e.g., Google Translate) on high-resource European languages but lags behind significantly on low-resource or distant languages.
no code implementations • 3 Dec 2022 • Yangang Ren, Yao Lyu, Wenxuan Wang, Shengbo Eben Li, Zeyang Li, Jingliang Duan
In this paper, we propose the smoothing policy iteration (SPI) algorithm to solve the zero-sum MGs approximately, where the maximum operator is replaced by the weighted LogSumExp (WLSE) function to obtain the nearly optimal equilibrium policies.
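The weighted LogSumExp is a standard smooth surrogate for the max operator; one common form (the paper's exact weighting scheme may differ) can be sketched as:

```python
import math

def weighted_logsumexp(values, weights=None, k=10.0):
    # Smooth approximation of max(values); larger k tightens the
    # approximation toward the true maximum.
    if weights is None:
        weights = [1.0] * len(values)
    m = max(values)  # shift by the max for numerical stability
    s = sum(w * math.exp(k * (v - m)) for v, w in zip(values, weights))
    return m + math.log(s) / k
```

Replacing the hard max with such a function makes the Bellman backup differentiable, which is the property the smoothing exploits.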
1 code implementation • 18 Oct 2022 • Wenxiang Jiao, Zhaopeng Tu, Jiarui Li, Wenxuan Wang, Jen-tse Huang, Shuming Shi
This paper describes Tencent's multilingual machine translation systems for the WMT22 shared task on Large-Scale Machine Translation Evaluation for African Languages.
no code implementations • 14 Jul 2022 • Jinjing Shi, Ren-xin Zhao, Wenxuan Wang, Shichao Zhang, Xuelong Li
The Self-Attention Mechanism (SAM) is good at capturing the internal connections of features and greatly improves the performance of machine learning models, especially those requiring efficient characterization and feature extraction of high-dimensional data.
no code implementations • 4 Jul 2022 • Jing Wang, Jiangyun Li, Wei Li, Lingfei Xuan, Tianxiang Zhang, Wenxuan Wang
Contextual information is critical for various computer vision tasks; previous works commonly design plug-and-play modules and structural losses to effectively extract and aggregate the global context.
no code implementations • 14 Jun 2022 • Wenxuan Wang, Chen Chen, Jing Wang, Sen Zha, Yan Zhang, Jiangyun Li
For 3D medical image (e.g., CT and MRI) segmentation, the difficulty of segmenting each slice in a clinical case varies greatly.
no code implementations • 20 May 2022 • Wenxuan Wang, Wenxiang Jiao, Shuo Wang, Zhaopeng Tu, Michael R. Lyu
Zero-shot translation is a promising direction for building a comprehensive multilingual neural machine translation (MNMT) system.
1 code implementation • 13 May 2022 • Jen-tse Huang, Jianping Zhang, Wenxuan Wang, Pinjia He, Yuxin Su, Michael R. Lyu
However, in practice, many of the generated test cases fail to preserve similar semantic meaning and are unnatural (e.g., grammar errors), which leads to a high false alarm rate and unnatural test cases.
no code implementations • CVPR 2022 • Wenxuan Wang, Xuelin Qian, Yanwei Fu, xiangyang xue
With the wide applications of deep neural network models in various computer vision tasks, more and more works study the model vulnerability to adversarial examples.
2 code implementations • CVPR 2022 • Jianping Zhang, Weibin Wu, Jen-tse Huang, Yizhan Huang, Wenxuan Wang, Yuxin Su, Michael R. Lyu
Deep neural networks (DNNs) are known to be vulnerable to adversarial examples.
no code implementations • ACL 2022 • Wenxuan Wang, Wenxiang Jiao, Yongchang Hao, Xing Wang, Shuming Shi, Zhaopeng Tu, Michael Lyu
In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation~(NMT).
1 code implementation • 30 Jan 2022 • Jiangyun Li, Wenxuan Wang, Chen Chen, Tianxiang Zhang, Sen Zha, Jing Wang, Hong Yu
Different from TransBTS, the proposed TransBTSV2 is not limited to brain tumor segmentation (BTS) but focuses on general medical image segmentation, providing a stronger and more efficient 3D baseline for volumetric segmentation of medical images.
no code implementations • 3 Dec 2021 • Yuting Yang, Binbin Du, Yingxin Zhang, Wenxuan Wang, Yuke Li
We propose a Mandarin keyword spotting system (KWS) with several novel and effective improvements, including a big backbone (B) model, a keyword biasing (B) mechanism and the introduction of syllable modeling units (S).
Automatic Speech Recognition (ASR) +2
1 code implementation • 30 Jul 2021 • Ben Zhai, Yanli Wang, Wenxuan Wang, Bing Wu
This study developed an optimal VSL control strategy under fog conditions with full consideration of the factors that affect traffic safety risks.
no code implementations • 25 Jun 2021 • Shuo Wang, Zhaopeng Tu, Zhixing Tan, Wenxuan Wang, Maosong Sun, Yang Liu
Inspired by the recent progress of large-scale pre-trained language models on machine translation in a limited scenario, we first demonstrate that a single language model (LM4MT) can achieve performance comparable with strong encoder-decoder NMT models on standard machine translation benchmarks, using the same training data and a similar number of model parameters.
1 code implementation • 7 May 2021 • Bangjie Yin, Wenxuan Wang, Taiping Yao, Junfeng Guo, Zelun Kong, Shouhong Ding, Jilin Li, Cong Liu
Deep neural networks, particularly face recognition models, have been shown to be vulnerable to both digital and physical adversarial examples.
no code implementations • CVPR 2021 • Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, xiangyang xue
Previous substitute training approaches focus on stealing the knowledge of the target model based on real training data or synthetic data, without exploring what kind of data can further improve the transferability between the substitute and target models.
2 code implementations • 7 Mar 2021 • Wenxuan Wang, Chen Chen, Meng Ding, Jiangyun Li, Hong Yu, Sen Zha
To capture the local 3D context information, the encoder first utilizes 3D CNN to extract the volumetric spatial feature maps.
no code implementations • 23 Feb 2021 • Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Qi Sun, Bo Cheng
This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems.
no code implementations • 20 Feb 2021 • Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Bo Cheng
This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems.
no code implementations • COLING 2020 • Wenxuan Wang, Zhaopeng Tu
The Transformer has become the state-of-the-art translation model, yet it is not well studied how each intermediate component contributes to model performance, which poses significant challenges for designing optimal architectures.
no code implementations • 10 Sep 2020 • Jiali Liu, Wenxuan Wang, Tianyao Guan, Ningbo Zhao, Xiaoguang Han, Zhen Li
An indicator-guided learning mechanism is further proposed to ease the training of the proposed model.
no code implementations • 4 Sep 2020 • Yanwei Fu, Feng Li, Wenxuan Wang, Haicheng Tang, Xuelin Qian, Mengwei Gu, xiangyang xue
After more than four months of study, we found that confirmed COVID-19 cases present consistent ocular pathological symptoms, and we propose a new screening method: analyzing eye-region images captured by common CCD and CMOS cameras can reliably provide rapid COVID-19 risk screening with very high accuracy.
no code implementations • CVPR 2020 • Wenxuan Wang, Yanwei Fu, Xuelin Qian, Yu-Gang Jiang, Qi Tian, Xiangyang Xue
It is challenging in learning a makeup-invariant face verification model, due to (1) insufficient makeup/non-makeup face training pairs, (2) the lack of diverse makeup faces, and (3) the significant appearance changes caused by cosmetics.
no code implementations • 26 May 2020 • Xuelin Qian, Wenxuan Wang, Li Zhang, Fangrui Zhu, Yanwei Fu, Tao Xiang, Yu-Gang Jiang, xiangyang xue
Specifically, we consider that under cloth-changes, soft-biometrics such as body shape would be more reliable.
no code implementations • 10 May 2020 • Long Huang, Ruoming Li, Peng Xiang, Pan Dai, Wenxuan Wang, Mi Li, Xiangfei Chen, Yuechun Shi
Theoretical analysis shows that the SNR is a function of the center frequency of the passband, the modulation index, the chromatic dispersion, and the shape of the IBOS.
no code implementations • 17 Jan 2020 • Wenxuan Wang, Yanwei Fu, Qiang Sun, Tao Chen, Chenjie Cao, Ziqi Zheng, Guoqiang Xu, Han Qiu, Yu-Gang Jiang, xiangyang xue
Considering that uneven data distribution and sample scarcity are common in real-world scenarios, we further evaluate several few-shot expression learning tasks on our F2ED, which recognize facial expressions given only a few training instances.
Facial Expression Recognition (FER) +1
no code implementations • 25 Sep 2019 • Qiang Sun, Zhinan Cheng, Yanwei Fu, Wenxuan Wang, Yu-Gang Jiang, xiangyang xue
Instead of learning the cross features directly, DeepEnFM adopts the Transformer encoder as a backbone to align the feature embeddings with the clues of other fields.
no code implementations • 25 Jul 2019 • Wenxuan Wang, Qiang Sun, Tao Chen, Chenjie Cao, Ziqi Zheng, Guoqiang Xu, Han Qiu, Yanwei Fu
First, we create a new facial expression dataset of more than 200k images with 119 persons, 4 poses and 54 expressions.
Facial Expression Recognition (FER) +2
2 code implementations • 12 Dec 2018 • Dabiao Ma, Zhiba Su, Wenxuan Wang, Yuhao Lu
End-to-end text-to-speech (TTS) systems can greatly improve the quality of synthesised speech.
2 code implementations • ECCV 2018 • Xuelin Qian, Yanwei Fu, Tao Xiang, Wenxuan Wang, Jie Qiu, Yang Wu, Yu-Gang Jiang, xiangyang xue
Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations.