Search Results for author: Li Zhang

Found 309 papers, 136 papers with code

Label Definitions Improve Semantic Role Labeling

1 code implementation • NAACL 2022 • Li Zhang, Ishan Jindal, Yunyao Li

Given a sentence and the predicate, a semantic role label is assigned to each argument of the predicate.

Is “My Favorite New Movie” My Favorite Movie? Probing the Understanding of Recursive Noun Phrases

no code implementations • NAACL 2022 • Qing Lyu, Zheng Hua, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch

We introduce the Recursive Noun Phrase Challenge (RNPC), a dataset of three textual inference tasks involving textual entailment and event plausibility comparison, precisely targeting the understanding of recursive NPs.

Common Sense Reasoning Natural Language Inference

SmartCiteCon: Implicit Citation Context Extraction from Academic Literature Using Supervised Learning

no code implementations • WOSP 2020 • Chenrui Guo, Haoran Cui, Li Zhang, Jiamin Wang, Wei Lu, Jian Wu

The tool is built on a Support Vector Machine (SVM) model trained on a set of 7,058 manually annotated citation context sentences, curated from 34,000 papers from the ACL Anthology.

Multi-Level Gazetteer-Free Geocoding

no code implementations • ACL (splurobonlp) 2021 • Sayali Kulkarni, Shailee Jain, Mohammad Javad Hosseini, Jason Baldridge, Eugene Ie, Li Zhang

We present a multi-level geocoding model (MLG) that learns to associate texts to geographic coordinates.

Toponym Resolution

PromptCIR: Blind Compressed Image Restoration with Prompt Learning

1 code implementation • 26 Apr 2024 • Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen

Existing works on blind CIR often seek assistance from a quality factor prediction network to facilitate their network to restore compressed images.

Image Enhancement Image Restoration

ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images

no code implementations • 25 Apr 2024 • Weiqi Li, Shijie Zhao, Bin Chen, Xinhua Cheng, Junlin Li, Li Zhang, Jian Zhang

With the advent of virtual reality technology, omnidirectional image (ODI) rescaling techniques are increasingly embraced for reducing transmitted and stored file sizes while preserving high image quality.

ERP

MDDD: Manifold-based Domain Adaptation with Dynamic Distribution for Non-Deep Transfer Learning in Cross-subject and Cross-session EEG-based Emotion Recognition

no code implementations • 24 Apr 2024 • Ting Luo, Jing Zhang, Yingwei Qiu, Li Zhang, Yaohua Hu, Zhuliang Yu, Zhen Liang

The proposed MDDD includes four main modules: manifold feature transformation, dynamic distribution alignment, classifier learning, and ensemble learning.

Domain Adaptation EEG +4

LaneCorrect: Self-supervised Lane Detection

no code implementations • 23 Apr 2024 • Ming Nie, Xinyue Cai, Hang Xu, Li Zhang

Lane detection has evolved to help highly functional autonomous driving systems understand driving scenes, even in complex environments.

Autonomous Driving Lane Detection

LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Automation Task Evaluation

1 code implementation • 12 Apr 2024 • Li Zhang, Shihe Wang, Xianqing Jia, Zhihan Zheng, Yunhe Yan, Longxi Gao, Yuanchun Li, Mengwei Xu

The emergent large language/multimodal models facilitate the evolution of mobile agents, especially in the task of mobile UI automation.

Effective Lymph Nodes Detection in CT Scans Using Location Debiased Query Selection and Contrastive Query Representation in Transformer

no code implementations • 4 Apr 2024 • Qinji Yu, Yirui Wang, Ke Yan, Haoshen Li, Dazhou Guo, Li Zhang, Le Lu, Na Shen, Qifeng Wang, Xiaowei Ding, Xianghua Ye, Dakai Jin

Lymph node (LN) assessment is a critical, indispensable yet very challenging task in the routine clinical workflow of radiology and oncology.

Contrastive Learning Lesion Detection

Diffusion²: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models

1 code implementation • 2 Apr 2024 • Zeyu Yang, Zijie Pan, Chun Gu, Li Zhang

Recent advancements in 3D generation are predominantly propelled by improvements in 3D-aware image diffusion models which are pretrained on Internet-scale image data and fine-tuned on massive 3D data, offering the capability of producing highly consistent multi-view images.

3D Generation 4D reconstruction +1

Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments

no code implementations • 2 Apr 2024 • Qianhui Zhao, Fang Liu, Li Zhang, Yang Liu, Zhen Yan, Zhenghao Chen, Yufei Zhou, Jing Jiang, Ge Li

Automated generation of feedback on programming assignments holds significant benefits for programming education, especially when it comes to advanced assignments.

Language Modelling Large Language Model +1

Exploring and Evaluating Hallucinations in LLM-Powered Code Generation

no code implementations • 1 Apr 2024 • Fang Liu, Yang Liu, Lin Shi, Houkun Huang, Ruifeng Wang, Zhen Yang, Li Zhang

The rise of Large Language Models (LLMs) has significantly advanced many applications on software engineering tasks, particularly in code generation.

Code Generation Hallucination +2

InternLM2 Technical Report

1 code implementation • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)

4k Long-Context Understanding

STEntConv: Predicting Disagreement with Stance Detection and a Signed Graph Convolutional Network

1 code implementation • 23 Mar 2024 • Isabelle Lorge, Li Zhang, Xiaowen Dong, Janet B. Pierrehumbert

The rise of social media platforms has led to an increase in polarised online discussions, especially on political and socio-cultural topics such as elections and climate change.

Stance Detection

Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts

1 code implementation • 19 Mar 2024 • Sai Ashish Somayajula, Youwei Liang, Abhishek Singh, Li Zhang, Pengtao Xie

Pretrained Language Models (PLMs) have advanced Natural Language Processing (NLP) tasks significantly, but finetuning PLMs on low-resource datasets poses significant challenges such as instability and overfitting.

Urban Scene Diffusion through Semantic Occupancy Map

no code implementations • 18 Mar 2024 • Junge Zhang, Qihang Zhang, Li Zhang, Ramana Rao Kompella, Gaowen Liu, Bolei Zhou

Generating unbounded 3D scenes is crucial for large-scale scene understanding and simulation.

Image Generation Scene Understanding

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation

no code implementations • 18 Mar 2024 • Haochen Jiang, Yueming Xu, Yihan Zeng, Hang Xu, Wei Zhang, Jianfeng Feng, Li Zhang

We model the geometric structure of the scene with occupancy representation and distill the pre-trained open vocabulary model into a 3D language field via volume rendering for zero-shot inference.

3D Reconstruction 3D Scene Reconstruction +3

FrameQuant: Flexible Low-Bit Quantization for Transformers

no code implementations • 10 Mar 2024 • Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang, Vikas Singh

If quantization is interpreted as the addition of noise, our casting of the problem allows invoking an extensive body of known consistent recovery and noise robustness guarantees.

Quantization
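
The "quantization as added noise" view in the snippet above can be illustrated with a small sketch. It uses plain uniform rounding, not the FrameQuant method itself, and the helper name `uniform_quantize`, the 2-bit setting, and the random weights are assumptions chosen only for illustration.

```python
import random


def uniform_quantize(values, bits):
    """Round each value to the nearest of 2**bits evenly spaced levels.

    Hypothetical helper for illustration; not taken from FrameQuant.
    """
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1          # number of intervals between levels
    step = (hi - lo) / levels       # spacing between adjacent levels
    quantized = [lo + round((v - lo) / step) * step for v in values]
    return quantized, step


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in weights

quantized, step = uniform_quantize(weights, bits=2)

# Quantized weight = original weight + noise, with |noise| <= step / 2.
noise = [q - w for q, w in zip(quantized, weights)]
max_noise = max(abs(n) for n in noise)
```

Because the perturbation is bounded by half a quantization step, results about recovery under bounded noise can in principle be applied to the quantized weights, which is the connection the snippet alludes to.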

Implicit Image-to-Image Schrodinger Bridge for CT Super-Resolution and Denoising

no code implementations • 10 Mar 2024 • Yuang Wang, Siyeop Yoon, Pengfei Jin, Matthew Tivnan, Zhennong Chen, Rui Hu, Li Zhang, Zhiqiang Chen, Quanzheng Li, Dufan Wu

As a promising alternative, the Image-to-Image Schrödinger Bridge (I2SB) initializes the generative process from corrupted images and integrates training techniques from conditional diffusion models.

Denoising Image Restoration +1

Modular Blind Video Quality Assessment

1 code implementation • 29 Feb 2024 • Wen Wen, Mu Li, Yabin Zhang, Yiting Liao, Junlin Li, Li Zhang, Kede Ma

Blind video quality assessment (BVQA) plays a pivotal role in evaluating and improving the viewing experience of end-users across a wide range of video-based platforms and services.

Video Quality Assessment

CAMixerSR: Only Details Need More "Attention"

1 code implementation • 29 Feb 2024 • Yan Wang, Yi Liu, Shijie Zhao, Junlin Li, Li Zhang

To meet the rapidly increasing demand for large-image (2K-8K) super-resolution (SR), prevailing methods follow two independent tracks: 1) accelerating existing networks through content-aware routing, and 2) designing better super-resolution networks by refining the token mixer.

2k 8k +1

PROC2PDDL: Open-Domain Planning Representations from Texts

no code implementations • 29 Feb 2024 • Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon

Planning in a text-based environment continues to be a major challenge for AI systems.

RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation

no code implementations • 29 Feb 2024 • Jie Zhang, Xubing Yang, Rui Jiang, Wei Shao, Li Zhang

Because directly applying SAM to remote sensing image segmentation tasks does not yield satisfactory results, we propose RSAM-Seg (Remote Sensing SAM with Semantic Segmentation), a tailored modification of SAM for the remote sensing field that eliminates the need for manual intervention to provide prompts.

Cloud Detection Image Segmentation +2

From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs

no code implementations • 28 Feb 2024 • Yulong Liu, Yunlong Yuan, Chunwei Wang, Jianhua Han, Yongqiang Ma, Li Zhang, Nanning Zheng, Hang Xu

In this work, we introduce a novel tool invocation pipeline designed to control massive real-world APIs.

In-Context Learning

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation

1 code implementation • 28 Feb 2024 • Yuan Ge, Yilun Liu, Chi Hu, Weibin Meng, Shimin Tao, Xiaofeng Zhao, Hongxia Ma, Li Zhang, Hao Yang, Tong Xiao

The second step involves preserving dataset diversity through a clustering process. In our experiment, CaR selected a subset containing only 1.96% of Alpaca's IT data, yet the underlying AlpaCaR model trained on this subset outperforms Alpaca by an average of 32.1% in GPT-4 evaluations.

Clustering

Data Interpreter: An LLM Agent For Data Science

1 code implementation • 28 Feb 2024 • Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Wenyi Wang, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zongze Xu, Chenglin Wu

Large Language Model (LLM)-based agents have demonstrated remarkable effectiveness.

Language Modelling Large Language Model +1

BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM

no code implementations • 26 Feb 2024 • Li Zhang, Youwei Liang, Ruiyi Zhang, Amirhosein Javadi, Pengtao Xie

Secondly, SAM faces challenges in excelling at specific downstream tasks, like medical imaging, due to a disparity between the distribution of its pretraining data, which predominantly consists of general-domain images, and the data used in downstream tasks.

Image Segmentation Segmentation +1

A First Look at GPT Apps: Landscape and Vulnerability

no code implementations • 23 Feb 2024 • Zejun Zhang, Li Zhang, Xin Yuan, Anlan Zhang, Mengwei Xu, Feng Qian

With the advancement of Large Language Models (LLMs), increasingly sophisticated and powerful GPTs are entering the market.

FrameNeRF: A Simple and Efficient Framework for Few-shot Novel View Synthesis

no code implementations • 22 Feb 2024 • Yan Xing, Pan Wang, Ligang Liu, Daolun Li, Li Zhang

We present a novel framework, called FrameNeRF, designed to apply off-the-shelf fast high-fidelity NeRF models with fast training speed and high rendering quality for few-shot novel view synthesis tasks.

Novel View Synthesis

Calibrating Large Language Models with Sample Consistency

no code implementations • 21 Feb 2024 • Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch

Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application.

A Neural-network Enhanced Video Coding Framework beyond ECM

no code implementations • 13 Feb 2024 • Yanchen Zhao, Wenxuan He, Chuanmin Jia, Qizhe Wang, Junru Li, Yue Li, Chaoyi Lin, Kai Zhang, Li Zhang, Siwei Ma

In this paper, a hybrid video compression framework is proposed that serves as a demonstrative showcase of deep learning-based approaches extending beyond the confines of traditional coding methodologies.

Video Compression

Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach

2 code implementations • 13 Feb 2024 • Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei Zhang, Li Zhang

Instead, our work establishes a unified representation of both types of data domain by projecting both Euclidean and non-Euclidean data into an integer series called RoadNet Sequence.

RA-Rec: An Efficient ID Representation Alignment Framework for LLM-based Recommendation

no code implementations • 7 Feb 2024 • Xiaohan Yu, Li Zhang, Xin Zhao, Yue Wang, Zhongrui Ma

To address this limitation, we propose a new paradigm, ID representation, which incorporates pre-trained ID embeddings into LLMs in a complementary manner.

Recommendation Systems

S-Agents: Self-organizing Agents in Open-ended Environments

1 code implementation • 7 Feb 2024 • Jiaqi Chen, Yuxian Jiang, Jiachen Lu, Li Zhang

Leveraging large language models (LLMs), autonomous agents have significantly improved, gaining the ability to handle a variety of tasks.

TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling

no code implementations • 4 Feb 2024 • Jiaxiang Dong, Haixu Wu, Yuxuan Wang, Yunzhong Qiu, Li Zhang, Jianmin Wang, Mingsheng Long

To emphasize temporal correlation modeling, this paper proposes TimeSiam as a simple but effective self-supervised pre-training framework for Time series based on Siamese networks.

Contrastive Learning Data Augmentation +1

S-NeRF++: Autonomous Driving Simulation via Neural Reconstruction and Generation

no code implementations • 3 Feb 2024 • Yurui Chen, Junge Zhang, Ziyang Xie, Wenye Li, Feihu Zhang, Jiachen Lu, Li Zhang

Autonomous driving simulation systems play a crucial role in enhancing self-driving data and simulating complex and rare traffic scenarios, ensuring navigation safety.

Autonomous Driving

LVC-LGMC: Joint Local and Global Motion Compensation for Learned Video Compression

no code implementations • 1 Feb 2024 • Wei Jiang, Junru Li, Kai Zhang, Li Zhang

To validate the effectiveness of our proposed LGMC, we integrate it with DCVC-TCM and obtain learned video compression with joint local and global motion compensation (LVC-LGMC).

Motion Compensation Video Compression

LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement

2 code implementations • 31 Jan 2024 • Renyuan Peng, Xinyue Cai, Hang Xu, Jiachen Lu, Feng Wen, Wei Zhang, Li Zhang

Accurate extraction of lane graphs relies on precisely estimating vertex and edge information within the DAG.

Autonomous Driving Language Modelling

A Survey of Resource-efficient LLM and Multimodal Foundation Models

1 code implementation • 16 Jan 2024 • Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, QiPeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu

Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment.

Fast Dynamic 3D Object Generation from a Single-view Video

no code implementations • 16 Jan 2024 • Zijie Pan, Zeyu Yang, Xiatian Zhu, Li Zhang

Generating dynamic 3D objects from a single-view video is challenging due to the lack of 4D labeled data.

Image Generation Image to 3D +3

UPDP: A Unified Progressive Depth Pruner for CNN and Vision Transformer

no code implementations • 12 Jan 2024 • Ji Liu, Dehua Tang, Yuanxian Huang, Li Zhang, Xiaocheng Zeng, Dong Li, Mingjie Lu, Jinzhang Peng, Yu Wang, Fan Jiang, Lu Tian, Ashish Sirasao

Our method also achieves state-of-the-art pruning performance on the vision transformer model.

Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation

no code implementations • 9 Jan 2024 • Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang

In this paper, we propose to directly predict the optimal transcoding resolution at each preset bitrate for efficient bitrate ladder construction.

DGDNN: Decoupled Graph Diffusion Neural Network for Stock Movement Prediction

1 code implementation • 3 Jan 2024 • Zinuo You, Zijian Shi, Hongbo Bo, John Cartlidge, Li Zhang, Yan Ge

Moreover, the ablation study and sensitivity study further illustrate the effectiveness of the proposed method in modeling the time-evolving inter-stock and intra-stock dynamics.

Graph Learning Representation Learning

FGENet: Fine-Grained Extraction Network for Congested Crowd Counting

no code implementations • 2 Jan 2024 • Hao-Yuan Ma, Li Zhang, Xiang-Yi Wei

Crowd counting has gained significant popularity due to its practical applications.

Ranked #1 on Crowd Counting on ShanghaiTech A

Crowd Counting

Harnessing Diffusion Models for Visual Perception with Meta Prompts

1 code implementation • 22 Dec 2023 • Qiang Wan, Zilong Huang, Bingyi Kang, Jiashi Feng, Li Zhang

Our key insight is to introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.

Ranked #2 on Semantic Segmentation on Cityscapes test (using extra training data)

Monocular Depth Estimation Pose Estimation +1

Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization

no code implementations • 7 Dec 2023 • Huan Zhao, Li Zhang, Yue Li, Yannan Wang, Hongji Wang, Wei Rao, Qing Wang, Lei Xie

The scarcity of labeled audio-visual datasets is a constraint for training superior audio-visual speaker diarization systems.

Speaker Diarization

Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving

1 code implementation • 6 Dec 2023 • Ming Nie, Renyuan Peng, Chunwei Wang, Xinyue Cai, Jianhua Han, Hang Xu, Li Zhang

Large vision-language models (VLMs) have garnered increasing interest in autonomous driving areas, due to their advanced capabilities in complex reasoning tasks essential for highly autonomous vehicle behavior.

Autonomous Driving Decision Making

WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation

1 code implementation • 5 Dec 2023 • Jiachen Lu, Ze Huang, Zeyu Yang, Jiahui Zhang, Li Zhang

Generating multi-camera street-view videos is critical for augmenting autonomous driving datasets, addressing the urgent demand for extensive and varied data.

Autonomous Driving Scene Generation +1

Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering

no code implementations • 30 Nov 2023 • Yurui Chen, Chun Gu, Junzhe Jiang, Xiatian Zhu, Li Zhang

To address this challenge, we present a unified representation model, called Periodic Vibration Gaussian (PVG).

Novel View Synthesis Optical Flow Estimation +1

Demonstration of Programmable Brain-Inspired Optoelectronic Neuron in Photonic Spiking Neural Network with Neural Heterogeneity

no code implementations • 27 Nov 2023 • Yun-jhu Lee, Mehmet Berkay On, Luis El Srouji, Li Zhang, Mahmoud Abdelghany, S. J. Ben Yoo

Photonic Spiking Neural Networks (PSNNs) composed of co-integrated CMOS and photonic elements can offer low-loss, low-power, highly parallel, and high-throughput computing for brain-inspired neuromorphic systems.

Relightable 3D Gaussian: Real-time Point Cloud Relighting with BRDF Decomposition and Ray Tracing

no code implementations • 27 Nov 2023 • Jian Gao, Chun Gu, Youtian Lin, Hao Zhu, Xun Cao, Li Zhang, Yao Yao

We present a novel differentiable point-based rendering framework for material and lighting decomposition from multi-view images, enabling editing, ray-tracing, and real-time relighting of the 3D point cloud.

BRDF estimation Lighting Estimation

CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

2 code implementations • 22 Nov 2023 • Yilun Liu, Shimin Tao, Xiaofeng Zhao, Ming Zhu, Wenbing Ma, Junhao Zhu, Chang Su, Yutai Hou, Miao Zhang, Min Zhang, Hongxia Ma, Li Zhang, Hao Yang, Yanfei Jiang

Instruction tuning is crucial for enabling large language models (LLMs) to respond to human instructions.

Instruction Following

One Size Does Not Fit All: Customizing Open-Domain Procedures

no code implementations • 16 Nov 2023 • Yash Kumar Lal, Li Zhang, Faeze Brahman, Bodhisattwa Prasad Majumder, Peter Clark, Niket Tandon

Our approach is to test several simple multi-LLM-agent architectures for customization, as well as an end-to-end LLM, using a new evaluation set, called CustomPlans, of over 200 WikiHow procedures each with a customization need.

Improved Dense Nested Attention Network Based on Transformer for Infrared Small Target Detection

1 code implementation • 15 Nov 2023 • Chun Bao, Jie Cao, Yaqian Ning, Tianhua Zhao, Zhijun Li, Zechen Wang, Li Zhang, Qun Hao

To address this issue, we propose a novel method for detecting infrared small targets called improved dense nested attention network (IDNANet), which is based on the transformer architecture.

Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video

no code implementations • 6 Nov 2023 • Yanqin Jiang, Li Zhang, Jin Gao, Weimin Hu, Yao Yao

This is achieved by leveraging the object-level 3D-aware image diffusion model as the primary supervision signal for training Dynamic Neural Radiance Fields (DyNeRF).

3D Generation Camera Calibration +3

Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review

1 code implementation • 3 Nov 2023 • Mingze Yuan, Peng Bao, Jiajia Yuan, Yunhao Shen, ZiFan Chen, Yi Xie, Jie Zhao, Yang Chen, Li Zhang, Lin Shen, Bin Dong

This has sparked significant interest in applying LLMs to enhance various aspects of healthcare, ranging from medical education to clinical decision support.

Private Learning with Public Features

no code implementations • 24 Oct 2023 • Walid Krichene, Nicolas Mayoraz, Steffen Rendle, Shuang Song, Abhradeep Thakurta, Li Zhang

We study a class of private learning problems in which the data is a join of private and public features.

Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping

1 code implementation • 19 Oct 2023 • Zijie Pan, Jiachen Lu, Xiatian Zhu, Li Zhang

In this framework, a significant challenge arises: To compute gradients for individual image pixels, it is necessary to backpropagate gradients from the designated latent space through the frozen components of the image model, such as the VAE encoder used within LDM.

3D Generation Transfer Learning

Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting

1 code implementation • 16 Oct 2023 • Zeyu Yang, Hongye Yang, Zijie Pan, Li Zhang

Reconstructing dynamic 3D scenes from 2D images and generating diverse views over time is challenging due to scene complexity and temporal dynamics.

CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

no code implementations • 16 Oct 2023 • Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning.

CBARF: Cascaded Bundle-Adjusting Neural Radiance Fields from Imperfect Camera Poses

no code implementations • 15 Oct 2023 • Hongyu Fu, Xin Yu, Lincheng Li, Li Zhang

Existing volumetric neural rendering techniques, such as Neural Radiance Fields (NeRF), face limitations in synthesizing high-quality novel views when the camera poses of input images are imperfect.

3D Reconstruction Neural Rendering +1

Multi-Depth Branch Network for Efficient Image Super-Resolution

1 code implementation • 29 Sep 2023 • Huiyuan Tian, Li Zhang, Shijian Li, Min Yao, Gang Pan

We visualize this process using feature maps, and further demonstrate the rationality and effectiveness of this design using proposed novel Fourier spectral analysis methods.

Image Super-Resolution

Choice-75: A Dataset on Decision Branching in Script Learning

no code implementations • 21 Sep 2023 • Zhaoyi Joey Hou, Li Zhang, Chris Callison-Burch

Script learning studies how stereotypical events unfold, enabling machines to reason about narratives with implicit information.

Descriptive

Private Matrix Factorization with Public Item Features

no code implementations • 17 Sep 2023 • Mihaela Curmei, Walid Krichene, Li Zhang, Mukund Sundararajan

It can be applied to different types of public item data, including: (1) categorical item features; (2) item-item similarities learned from public sources; and (3) publicly available user feedback.

Collaborative Filtering

SAMUS: Adapting Segment Anything Model for Clinically-Friendly and Generalizable Ultrasound Image Segmentation

1 code implementation • 13 Sep 2023 • Xian Lin, Yangyang Xiang, Li Zhang, Xin Yang, Zengqiang Yan, Li Yu

Segment anything model (SAM), an eminent universal image segmentation model, has recently gathered considerable attention within the domain of medical image segmentation.

Image Segmentation Medical Image Segmentation +2

Designs and Implementations in Neural Network-based Video Coding

no code implementations • 11 Sep 2023 • Yue Li, Junru Li, Chaoyi Lin, Kai Zhang, Li Zhang, Franck Galpin, Thierry Dumas, Hongtao Wang, Muhammed Coban, Jacob Ström, Du Liu, Kenneth Andersson

The past decade has witnessed the huge success of deep learning in well-known artificial intelligence applications such as face recognition, autonomous driving, and large language models like ChatGPT.

Autonomous Driving Face Recognition +3

Semi-Supervised Dual-Stream Self-Attentive Adversarial Graph Contrastive Learning for Cross-Subject EEG-based Emotion Recognition

no code implementations • 13 Aug 2023 • Weishan Ye, Zhiguo Zhang, Min Zhang, Fei Teng, Li Zhang, Linling Li, Gan Huang, Jianhong Wang, Dong Ni, Zhen Liang

In this paper, a semi-supervised Dual-stream Self-Attentive Adversarial Graph Contrastive learning framework (termed as DS-AGC) is proposed to tackle the challenge of limited labeled data in cross-subject EEG-based emotion recognition.

Contrastive Learning Domain Adaptation +2

PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection

1 code implementation • ICCV 2023 • Ming Nie, Yujing Xue, Chunwei Wang, Chaoqiang Ye, Hang Xu, Xinge Zhu, Qingqiu Huang, Michael Bi Mi, Xinchao Wang, Li Zhang

Recently, polar-based representation has shown promising properties in perceptual tasks.

3D Object Detection object-detection

Improved Prognostic Prediction of Pancreatic Cancer Using Multi-Phase CT by Integrating Neural Distance and Texture-Aware Transformer

no code implementations • 1 Aug 2023 • Hexin Dong, Jiawen Yao, Yuxing Tang, Mingze Yuan, Yingda Xia, Jian Zhou, Hong Lu, Jingren Zhou, Bin Dong, Le Lu, Li Zhang, Zaiyi Liu, Yu Shi, Ling Zhang

Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal cancer in which the tumor-vascular involvement greatly affects the resectability and, thus, overall survival of patients.

Scale-aware Test-time Click Adaptation for Pulmonary Nodule and Mass Segmentation

1 code implementation • 28 Jul 2023 • Zhihao Li, Jiancheng Yang, Yongchao Xu, Li Zhang, Wenhui Dong, Bo Du

Extensive experiments on both open-source and in-house datasets consistently demonstrate the effectiveness of the proposed method over some CNN and Transformer-based segmentation methods.

Image Segmentation Management +4

Deep neural network improves the estimation of polygenic risk scores for breast cancer

no code implementations • 24 Jul 2023 • Adrien Badré, Li Zhang, Wellington Muchero, Justin C. Reynolds, Chongle Pan

In the test cohort with 50% prevalence, the Area Under the receiver operating characteristic Curve (AUC) was 67.4% for DNN, 64.2% for BLUP, 64.5% for BayesA, and 62.4% for LDpred.

Cluster-Induced Mask Transformers for Effective Opportunistic Gastric Cancer Screening on Non-contrast CT Scans

no code implementations • 10 Jul 2023 • Mingze Yuan, Yingda Xia, Xin Chen, Jiawen Yao, Junli Wang, Mingyan Qiu, Hexin Dong, Jingren Zhou, Bin Dong, Le Lu, Li Zhang, Zaiyi Liu, Ling Zhang

In our experiments, the proposed method achieves a sensitivity of 85.0% and specificity of 92.6% for detecting gastric tumors on a hold-out test set consisting of 100 patients with cancer and 148 normal.

Specificity
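
As a rough check on the figures quoted in this snippet, the reported rates can be turned back into approximate confusion-matrix counts. The helper name `confusion_counts` is hypothetical, and the counts are inferred by rounding the published percentages; they are not reported in the snippet itself.

```python
def confusion_counts(n_pos, n_neg, sensitivity, specificity):
    """Infer approximate counts from cohort sizes and reported rates.

    Hypothetical helper: the counts are reconstructed by rounding,
    so they are estimates rather than published numbers.
    """
    tp = round(n_pos * sensitivity)   # cancer cases correctly flagged
    tn = round(n_neg * specificity)   # normal cases correctly cleared
    return {"tp": tp, "fn": n_pos - tp, "tn": tn, "fp": n_neg - tn}


# 100 patients with cancer and 148 normal, as stated in the snippet.
counts = confusion_counts(n_pos=100, n_neg=148,
                          sensitivity=0.850, specificity=0.926)

# Recompute the rates from the inferred counts as a consistency check.
sensitivity_check = counts["tp"] / (counts["tp"] + counts["fn"])
specificity_check = counts["tn"] / (counts["tn"] + counts["fp"])
```

Under these assumptions the rates correspond to about 85 of 100 tumors detected and about 137 of 148 normal cases correctly classified, leaving roughly 11 false positives.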

Towards Efficient In-memory Computing Hardware for Quantized Neural Networks: State-of-the-art, Open Challenges and Perspectives

no code implementations • 8 Jul 2023 • Olga Krestinskaya, Li Zhang, Khaled Nabil Salama

Limited energy and computational resources on edge push the transition from traditional von Neumann architectures to In-memory Computing (IMC), especially for machine learning and neural network applications.

Quantization

Paper
Add Code

SUIT: Learning Significance-guided Information for 3D Temporal Detection

no code implementations • 4 Jul 2023 • Zheyuan Zhou, Jiachen Lu, Yihan Zeng, Hang Xu, Li Zhang

To this end, we propose to learn Significance-gUided Information for 3D Temporal detection (SUIT), which simplifies temporal information as sparse features for information fusion across frames.

3D Object Detection Autonomous Driving +2

Paper
Add Code

Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach

no code implementations • 3 Jul 2023 • Dongyang Yu, Yunshi Xie, Wangpeng An, Li Zhang, YuFeng Yao

We introduce a novel one-stage end-to-end multi-person 2D pose estimation algorithm, known as Joint Coordinate Regression and Association (JCRA), that produces human pose joints and associations without requiring any post-processing.

2D Pose Estimation Multi-Person Pose Estimation

Paper
Add Code

MCPI: Integrating Multimodal Data for Enhanced Prediction of Compound Protein Interactions

no code implementations • 15 Jun 2023 • Li Zhang, Wenhao Li, Haotian Guan, Zhiquan He, Mingjun Cheng, Han Wang

The identification of compound-protein interactions (CPI) plays a critical role in drug screening, drug repurposing, and combination therapy studies.

Paper
Add Code

CD-CTFM: A Lightweight CNN-Transformer Network for Remote Sensing Cloud Detection Fusing Multiscale Features

no code implementations • 12 Jun 2023 • Wenxuan Ge, Xubing Yang, Li Zhang

In the decoder part, we utilize a lightweight network combining CNN and Transformer as backbone, which is conducive to extracting local and global features simultaneously.

Cloud Detection

Paper
Add Code

Answering Compositional Queries with Set-Theoretic Embeddings

no code implementations • 7 Jun 2023 • Shib Dasgupta, Andrew McCallum, Steffen Rendle, Li Zhang

The need to compactly and robustly represent item-attribute relations arises in many important tasks, such as faceted browsing and recommendation systems.

Attribute Recommendation Systems +1

Paper
Add Code

Video Compression with Arbitrary Rescaling Network

no code implementations • 7 Jun 2023 • Mengxi Guo, Shijie Zhao, Hao Jiang, Junlin Li, Li Zhang

Most video platforms provide video streaming services with different qualities, and the quality of the services is usually adjusted by the resolution of the videos.

Video Compression

Paper
Add Code

Probabilistic computation and uncertainty quantification with emerging covariance

1 code implementation • 30 May 2023 • Hengyuan Ma, Yang Qi, Li Zhang, Wenlian Lu, Jianfeng Feng

Building robust, interpretable, and secure AI systems requires quantifying and representing uncertainty under a probabilistic perspective to mimic human cognitive abilities.

Uncertainty Quantification

Paper
Code

propnet: Propagating 2D Annotation to 3D Segmentation for Gastric Tumors on CT Scans

no code implementations • 29 May 2023 • ZiFan Chen, Jiazheng Li, Jie Zhao, Yiting Liu, Hongfeng Li, Bin Dong, Lei Tang, Li Zhang

This model consists of a proposing stage for coarse segmentation and a refining stage for improved segmentation, using two-way branches for enhanced performance and an up-down strategy for efficiency.

Segmentation Tumor Segmentation

Paper
Add Code

OpenPI2.0: An Improved Dataset for Entity Tracking in Texts

1 code implementation • 24 May 2023 • Li Zhang, Hainiu Xu, Abhinav Kommula, Chris Callison-Burch, Niket Tandon

An earlier dataset, OpenPI, provided crowdsourced annotations of entity state changes in text.

Question Answering

Paper
Code

DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

1 code implementation • 18 May 2023 • Youwei Liang, Ruiyi Zhang, Li Zhang, Pengtao Xie

The DrugChat system consists of a graph neural network (GNN), a large language model (LLM), and an adaptor.

Drug Discovery Language Modelling +1

Paper
Code

Hybrid Transformer and CNN Attention Network for Stereo Image Super-resolution

no code implementations • 9 May 2023 • Ming Cheng, Haoyu Ma, Qiufang Ma, Xiaopeng Sun, Weiqi Li, Zhenyu Zhang, Xuhan Sheng, Shijie Zhao, Junlin Li, Li Zhang

Multi-stage strategies are frequently employed in image restoration tasks.

Data Augmentation Image Enhancement +2

Paper
Add Code

NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields

1 code implementation • 28 Apr 2023 • Junge Zhang, Feihu Zhang, Shaochen Kuang, Li Zhang

We verify the effectiveness of our NeRF-LiDAR by training different 3D segmentation models on the generated LiDAR point clouds.

Autonomous Driving Novel View Synthesis +2

Paper
Code

OPDN: Omnidirectional Position-aware Deformable Network for Omnidirectional Image Super-Resolution

no code implementations • 26 Apr 2023 • Xiaopeng Sun, Weiqi Li, Zhenyu Zhang, Qiufang Ma, Xuhan Sheng, Ming Cheng, Haoyu Ma, Shijie Zhao, Jian Zhang, Junlin Li, Li Zhang

Model A aims to enhance the feature extraction ability of 360° image positional information, while Model B further focuses on the high-frequency information of 360° images.

Image Super-Resolution Position

Paper
Add Code

Exploring the Curious Case of Code Prompts

1 code implementation • 26 Apr 2023 • Li Zhang, Liam Dugan, Hainiu Xu, Chris Callison-Burch

Furthermore, we show that the style of code prompt has a large effect on performance for some but not all tasks and that fine-tuning on text instructions leads to better relative performance of code prompts.

Paper
Code

Federated Learning of Shareable Bases for Personalization-Friendly Image Classification

no code implementations • 16 Apr 2023 • Hong-You Chen, Jike Zhong, Mingda Zhang, Xuhui Jia, Hang Qi, Boqing Gong, Wei-Lun Chao, Li Zhang

FedBasis learns a small set of shareable "basis" models, which can be linearly combined to form personalized models for clients.

Image Classification Personalized Federated Learning

Paper
Add Code

Learning by Grouping: A Multilevel Optimization Framework for Improving Fairness in Classification without Losing Accuracy

no code implementations • 2 Apr 2023 • Ramtin Hosseini, Li Zhang, Bhanu Garg, Pengtao Xie

Our proposed framework involves three stages of learning, which are formulated as a three-level optimization problem: (i) learning to group problems into different subgroups; (ii) learning group-specific sub-models for problem-solving; and (iii) updating group assignments of training examples by minimizing the validation loss.

Decision Making Domain Adaptation +2

Paper
Add Code

Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization

no code implementations • CVPR 2023 • Mingze Yuan, Yingda Xia, Hexin Dong, ZiFan Chen, Jiawen Yao, Mingyan Qiu, Ke Yan, Xiaoli Yin, Yu Shi, Xin Chen, Zaiyi Liu, Bin Dong, Jingren Zhou, Le Lu, Ling Zhang, Li Zhang

Real-world medical image segmentation has tremendous long-tailed complexity of objects, among which tail conditions correlate with relatively rare diseases and are clinically significant.

Image Segmentation Medical Image Segmentation +2

Paper
Add Code

EEGMatch: Learning with Incomplete Labels for Semi-Supervised EEG-based Cross-Subject Emotion Recognition

1 code implementation • 27 Mar 2023 • Rushuang Zhou, Weishan Ye, Zhiguo Zhang, Yanyang Luo, Li Zhang, Linling Li, Gan Huang, Yining Dong, Yuan-Ting Zhang, Zhen Liang

The results show the proposed EEGMatch performs better than the state-of-the-art methods under different incomplete label conditions (with 6.89% improvement on SEED and 1.44% improvement on SEED-IV), which demonstrates the effectiveness of the proposed EEGMatch in dealing with the label scarcity problem in emotion recognition using EEG signals.

Data Augmentation Domain Adaptation +3

Paper
Code

Generative Semantic Segmentation

2 code implementations • CVPR 2023 • Jiaqi Chen, Jiachen Lu, Xiatian Zhu, Li Zhang

To that end, the segmentation mask is expressed with a special type of image (dubbed as maskige).

Segmentation Semantic Segmentation

188

Paper
Code

Single-view Neural Radiance Fields with Depth Teacher

no code implementations • 17 Mar 2023 • Yurui Chen, Chun Gu, Feihu Zhang, Li Zhang

Moreover, it generalizes poorly to new scenes and requires retraining or fine-tuning on each scene.

Novel View Synthesis

Paper
Add Code

RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

1 code implementation • 13 Mar 2023 • Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, Chengqi Lyu, Yining Li, Kai Chen

Recent studies on 2D pose estimation have achieved excellent performance on public benchmarks, yet its application in the industrial community still suffers from heavy model parameters and high latency.

Ranked #3 on Pose Estimation on OCHuman (using extra training data)

2D Human Pose Estimation 2D Pose Estimation +1

5,023

Paper
Code

QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression

6 code implementations • 10 Mar 2023 • Kedeng Tong, Yaojun Wu, Yue Li, Kai Zhang, Li Zhang, Xin Jin

In this paper, we present a Quantization-error-aware Variable Rate Framework (QVRF) that utilizes a univariate quantization regulator $a$ to achieve wide-range variable rates within a single model.

Image Compression Quantization

Paper
Code

Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling

1 code implementation • 4 Mar 2023 • Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang

In this paper, we propose the Self-Asymmetric Invertible Network (SAIN) for compression-aware image rescaling.

Image Compression

Paper
Code

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

1 code implementation • CVPR 2023 • Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang

In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image captioning.

Cross-Modal Retrieval Image Captioning +4

Paper
Code

S-NeRF: Neural Radiance Fields for Street Views

no code implementations • 1 Mar 2023 • Ziyang Xie, Junge Zhang, Wenye Li, Feihu Zhang, Li Zhang

Specifically, we improve the scene parameterization function and the camera poses for learning better neural representations from street views.

Novel View Synthesis Self-Driving Cars

Paper
Add Code

Nonlinear Intensity, Scale and Rotation Invariant Matching for Multimodal Images

1 code implementation • 28 Feb 2023 • Zhongli Fan, Li Zhang, Yuxuan Liu

We present an effective method for the matching of multimodal images.

Image Registration Template Matching

Paper
Code

Human-in-the-Loop Schema Induction

no code implementations • 25 Feb 2023 • Tianyi Zhang, Isaac Tham, Zhaoyi Hou, Jiaxuan Ren, Liyang Zhou, Hainiu Xu, Li Zhang, Lara J. Martin, Rotem Dror, Sha Li, Heng Ji, Martha Palmer, Susan Brown, Reece Suchocki, Chris Callison-Burch

Schema induction builds a graph representation explaining how events unfold in a scenario.

Information Retrieval Retrieval

Paper
Add Code

Multi-Task Differential Privacy Under Distribution Skew

no code implementations • 15 Feb 2023 • Walid Krichene, Prateek Jain, Shuang Song, Mukund Sundararajan, Abhradeep Thakurta, Li Zhang

We study the problem of multi-task learning under user-level differential privacy, in which $n$ users contribute data to $m$ tasks, each involving a subset of users.

Multi-Task Learning

Paper
Add Code

Preconditioned Score-based Generative Models

1 code implementation • 13 Feb 2023 • Hengyuan Ma, Li Zhang, Xiatian Zhu, Jianfeng Feng

Compared with the latest generative models (e.g., CLD-SGM, DDIM, and Analytic-DDIM), PDS can achieve the best sampling quality on CIFAR-10 at a FID score of 1.99.

Image Generation

Paper
Code

Syntax and Domain Aware Model for Unsupervised Program Translation

no code implementations • 8 Feb 2023 • Fang Liu, Jia Li, Li Zhang

The experimental results on function translation tasks between Python, Java, and C++ show that SDA-Trans outperforms many large-scale pre-trained models, especially for unseen language translation.

Cross-Lingual Transfer Translation

Paper
Add Code

SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling

1 code implementation • NeurIPS 2023 • Jiaxiang Dong, Haixu Wu, Haoran Zhang, Li Zhang, Jianmin Wang, Mingsheng Long

By relating masked modeling to manifold learning, SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold, which eases the reconstruction task by assembling ruined but complementary temporal variations from multiple masked series.

Representation Learning Time Series +1

Paper
Code

Faithful Chain-of-Thought Reasoning

1 code implementation • 31 Jan 2023 • Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka.

Math Multi-hop Question Answering +1

146

Paper
Code

SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation

1 code implementation • 30 Jan 2023 • Qiang Wan, Zilong Huang, Jiachen Lu, Gang Yu, Li Zhang

Coupled with a light segmentation head, we achieve the best trade-off between segmentation accuracy and latency on the ARM-based mobile devices on the ADE20K and Cityscapes datasets.

Image Classification Segmentation +1

243

Paper
Code

Causal Reasoning of Entities and Events in Procedural Texts

1 code implementation • 26 Jan 2023 • Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch

By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1.

Paper
Code

LB-SimTSC: An Efficient Similarity-Aware Graph Neural Network for Semi-Supervised Time Series Classification

no code implementations • 12 Jan 2023 • Wenjie Xi, Arnav Jain, Li Zhang, Jessica Lin

Recently, Similarity-aware Time Series Classification (SimTSC) was proposed to address this problem by using a graph neural network classification model on the graph generated from pairwise Dynamic Time Warping (DTW) distance of batch data.

Classification Dynamic Time Warping +3

Paper
Add Code

PMP: Privacy-Aware Matrix Profile against Sensitive Pattern Inference for Time Series

1 code implementation • 4 Jan 2023 • Li Zhang, Jiahao Ding, Yifeng Gao, Jessica Lin

During the process, data sharing is often involved to allow third-party modelers to perform specific time series data mining (TSDM) tasks based on the needs of the data owner.

Privacy Preserving Time Series +1

Paper
Code

Language Models are Drummers: Drum Composition with Natural Language Pre-Training

1 code implementation • 3 Jan 2023 • Li Zhang, Chris Callison-Burch

Automatic music generation with artificial intelligence typically requires a large amount of data which is hard to obtain for many less common genres and musical instruments.

Music Generation Transfer Learning

Paper
Code

Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach

no code implementations • ICCV 2023 • Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei Zhang, Li Zhang

The extraction of road networks is essential for the generation of high-definition maps since it enables the precise localization of road landmarks and their interconnections.

Paper
Add Code

Train-Once-for-All Personalization

no code implementations • CVPR 2023 • Hong-You Chen, Yandong Li, Yin Cui, Mingda Zhang, Wei-Lun Chao, Li Zhang

We study the problem of how to train a "personalization-friendly" model such that given only the task descriptions, the model can be adapted to different end-users' needs, e.g., for accurately classifying different subsets of objects.

Paper
Add Code

MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages

no code implementations • 30 Nov 2022 • Yue Li, Li Zhang, Namin Wang, Jie Liu, Lei Xie

Specifically, the weight transfer fine-tuning aims to constrain the distance of the weights between the pre-trained model and the fine-tuned model, which takes advantage of the previously acquired discriminative ability from the large-scale out-domain datasets and avoids catastrophic forgetting and overfitting at the same time.

Speaker Verification

Paper
Add Code

Panoramic Video Salient Object Detection with Ambisonic Audio Guidance

no code implementations • 26 Nov 2022 • Xiang Li, Haoyuan Cao, Shijie Zhao, Junlin Li, Li Zhang, Bhiksha Raj

In this paper, we aim to tackle the video salient object detection problem for panoramic videos, with their corresponding ambisonic audios.

Object object-detection +2

Paper
Add Code

Robust Time Series Chain Discovery with Incremental Nearest Neighbors

no code implementations • 3 Nov 2022 • Li Zhang, Yan Zhu, Yifeng Gao, Jessica Lin

Inspired by a recent work that tracks how the nearest neighbor of a time series subsequence changes over time, we introduce a new TSC definition which is much more robust to noise in the data, in the sense that it can better locate the evolving patterns while excluding the non-evolving ones.

Time Series Time Series Analysis

Paper
Add Code

TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge

no code implementations • 26 Oct 2022 • Bowen Pang, Huan Zhao, Gaosheng Zhang, Xiaoyue Yang, Yang Sun, Li Zhang, Qing Wang, Lei Xie

In this challenge, we explore three kinds of typical speaker diarization systems: spectral clustering (SC) based diarization, target-speaker voice activity detection (TS-VAD), and end-to-end neural diarization (EEND).

Action Detection Activity Detection +2

Paper
Add Code

Generative Model Watermarking Based on Human Visual System

no code implementations • 30 Sep 2022 • Li Zhang, Yong Liu, Shaoteng Liu, Tianshu Yang, Yexin Wang, Xinpeng Zhang, Hanzhou Wu

Intellectual property protection of deep neural networks is receiving attention from more and more researchers, and the latest research applies model watermarking to generative models for image processing.

Paper
Add Code

NWPU-ASLP System for the VoicePrivacy 2022 Challenge

no code implementations • 24 Sep 2022 • Jixun Yao, Qing Wang, Li Zhang, Pengcheng Guo, Yuhao Liang, Lei Xie

Our system consists of four modules, including feature extractor, acoustic model, anonymization module, and neural vocoder.

Speaker Verification

Paper
Add Code

Dynamic Graph Message Passing Networks for Visual Recognition

2 code implementations • 20 Sep 2022 • Li Zhang, Mohan Chen, Anurag Arnab, Xiangyang Xue, Philip H. S. Torr

A fully-connected graph, such as the self-attention operation in Transformers, is beneficial for such modelling; however, its computational overhead is prohibitive.

Image Classification object-detection +3

Paper
Code

Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction

1 code implementation • 15 Sep 2022 • Gang Yang, Li Zhang, Man Zhou, Aiping Liu, Xun Chen, Zhiwei Xiong, Feng Wu

Interpretable neural network models are of significant interest since they enhance the trustworthiness required in clinical practice when dealing with medical images.

Super-Resolution

Paper
Code

Data-Driven Deep Supervision for Skin Lesion Classification

no code implementations • 4 Sep 2022 • Suraj Mishra, Yizhe Zhang, Li Zhang, Tianyu Zhang, X. Sharon Hu, Danny Z. Chen

Specifically, we analyze the convolutional network's behavior (field-of-view) to find the location of deep supervision for improved feature extraction.

Classification Lesion Classification +2

Paper
Add Code

Scalable Nanophotonic-Electronic Spiking Neural Networks

no code implementations • 28 Aug 2022 • Luis El Srouji, Yun-jhu Lee, Mehmet Berkay On, Li Zhang, S. J. Ben Yoo

Photonic devices are ideal for the design of high-bandwidth, parallel architectures matching the SNN computational paradigm.

Paper
Add Code

Hierarchical Reinforcement Learning Based Video Semantic Coding for Segmentation

no code implementations • 24 Aug 2022 • Guangqi Xie, Xin Li, Shiqi Lin, Li Zhang, Kai Zhang, Yue Li, Zhibo Chen

In this paper, we take a step forward to video semantic compression and propose the Hierarchical Reinforcement Learning based task-driven Video Semantic Coding, named as HRLVSC.

Hierarchical Reinforcement Learning reinforcement-learning +3

Paper
Add Code

DeepInteraction: 3D Object Detection via Modality Interaction

2 code implementations • 23 Aug 2022 • Zeyu Yang, Jiaqi Chen, Zhenwei Miao, Wei Li, Xiatian Zhu, Li Zhang

Existing top-performance 3D object detectors typically rely on the multi-modal fusion strategy.

3D Object Detection Object +2

190

Paper
Code

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

1 code implementation • 25 Jul 2022 • Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy

Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.

Ranked #1 on Unconditional Video Generation on CelebV-HQ

Attribute Face Generation +1

353

Paper
Code

Vision Transformers: From Semantic Segmentation to Dense Prediction

3 code implementations • 19 Jul 2022 • Li Zhang, Jiachen Lu, Sixiao Zheng, Xinxuan Zhao, Xiatian Zhu, Yanwei Fu, Tao Xiang, Jianfeng Feng, Philip H. S. Torr

In this work, for the first time we explore the global context learning potentials of ViTs for dense visual prediction (e.g., semantic segmentation).

Image Classification Instance Segmentation +5

1,015

Paper
Code

RCLane: Relay Chain Prediction for Lane Detection

no code implementations • 19 Jul 2022 • Shenghua Xu, Xinyue Cai, Bin Zhao, Li Zhang, Hang Xu, Yanwei Fu, Xiangyang Xue

This is because most existing lane detection methods treat lane detection as either a dense prediction or a detection task; few of them consider the unique topologies (Y-shape, Fork-shape, nearly horizontal lane) of the lane markers, which leads to sub-optimal solutions.

Lane Detection

Paper
Add Code

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

1 code implementation • 17 Jul 2022 • Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

We thus propose a Multi-View Contrastive Learning task for pulling closer the visual representation of one image to the compositional multimodal representation of another image+text.

Contrastive Learning Image Retrieval +2

Paper
Code

What Makes for Automatic Reconstruction of Pulmonary Segments

1 code implementation • 7 Jul 2022 • Kaiming Kuang, Li Zhang, Jingyu Li, Hongwei Li, Jiajun Chen, Bo Du, Jiancheng Yang

The automatic reconstruction of pulmonary segments by ImPulSe is accurate in metrics and visually appealing.

3D Reconstruction

Paper
Code

Softmax-free Linear Transformers

1 code implementation • 5 Jul 2022 • Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang

With linear complexity, much longer token sequences are permitted by SOFT, resulting in superior trade-off between accuracy and complexity.

Computational Efficiency

293

Paper
Code

SiamMask: A Framework for Fast Online Object Tracking and Segmentation

no code implementations • 5 Jul 2022 • Weiming Hu, Qiang Wang, Li Zhang, Luca Bertinetto, Philip H. S. Torr

In this paper we introduce SiamMask, a framework to perform both visual object tracking and video object segmentation, in real-time, with the same simple method.

Multiple Object Tracking Object +5

Paper
Add Code

Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling

1 code implementation • 5 Jul 2022 • Hengyuan Ma, Li Zhang, Xiatian Zhu, Jianfeng Feng

However, a fundamental limitation is that their inference is very slow due to a need for many (e.g., 2000) iterations of sequential computations.

Image Generation

Paper
Code

PolarFormer: Multi-camera 3D Object Detection with Polar Transformer

1 code implementation • 30 Jun 2022 • Yanqin Jiang, Li Zhang, Zhenwei Miao, Xiatian Zhu, Jin Gao, Weiming Hu, Yu-Gang Jiang

3D object detection in autonomous driving aims to reason about "what" and "where" the objects of interest are in a 3D world.

Ranked #2 on Robust Camera Only 3D Object Detection on nuScenes-C

3D Object Detection Autonomous Driving +5

153

Paper
Code

Knowledge-aware Neural Collective Matrix Factorization for Cross-domain Recommendation

no code implementations • 27 Jun 2022 • Li Zhang, Yan Ge, Jun Ma, Jianmo Ni, Haiping Lu

In this paper, we propose to incorporate the knowledge graph (KG) for CDR, which enables items in different domains to share knowledge.

General Knowledge

Paper
Add Code

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Paper
Add Code

Intra Encoding Complexity Control with a Time-Cost Model for Versatile Video Coding

no code implementations • 13 Jun 2022 • Yan Huang, Jizheng Xu, Li Zhang, Yan Zhao, Li Song

Inspired by rate control algorithms, we propose a scheme to precisely control the intra encoding complexity of VVC.

Paper
Add Code

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. 
Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. 
Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. 
Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

2,660

Paper
Code

Learning Ego 3D Representation as Ray Tracing

1 code implementation • 8 Jun 2022 • Jiachen Lu, Zheyuan Zhou, Xiatian Zhu, Hang Xu, Li Zhang

A self-driving perception model aims to extract 3D semantic representations from multiple cameras collectively into the bird's-eye-view (BEV) coordinate frame of the ego car in order to ground the downstream planner.

3D Object Detection Computational Efficiency +4

104

Paper
Code

Accelerating Score-based Generative Models for High-Resolution Image Synthesis

no code implementations • 8 Jun 2022 • Hengyuan Ma, Li Zhang, Xiatian Zhu, Jingfeng Zhang, Jianfeng Feng

To ensure stability of convergence in sampling and generation quality, however, this sequential sampling process has to take a small step size and many sampling iterations (e.g., 2000).

Image Generation Vocal Bursts Intensity Prediction

Paper
Add Code

Region-Aware Metric Learning for Open World Semantic Segmentation via Meta-Channel Aggregation

1 code implementation • 17 May 2022 • Hexin Dong, ZiFan Chen, Mingze Yuan, Yutong Xie, Jie Zhao, Fei Yu, Bin Dong, Li Zhang

Therefore, we propose a method called region-aware metric learning (RAML), which first separates the regions of the images and generates region-aware features for further metric learning.

Few-Shot Learning Metric Learning +2

Paper
Code

Reasoning about Procedures with Natural Language Processing: A Tutorial

no code implementations • 16 May 2022 • Li Zhang

This tutorial provides a comprehensive and in-depth view of the research on procedures, primarily in Natural Language Processing.

Paper
Add Code

ONCE-3DLanes: Building Monocular 3D Lane Detection

2 code implementations • CVPR 2022 • Fan Yan, Ming Nie, Xinyue Cai, Jianhua Han, Hang Xu, Zhen Yang, Chaoqiang Ye, Yanwei Fu, Michael Bi Mi, Li Zhang

We present ONCE-3DLanes, a real-world autonomous driving dataset with lane layout annotation in 3D space.

3D Lane Detection Autonomous Driving

395

Paper
Code

In Defense of Subspace Tracker: Orthogonal Embedding for Visual Tracking

no code implementations • 17 Apr 2022 • Yao Sui, Guanghui Wang, Li Zhang

The paper focuses on a classical tracking model, subspace learning, grounded on the fact that the targets in successive frames are considered to reside in a low-dimensional subspace or manifold due to the similarity in their appearances.

Visual Tracking

Paper
Add Code

Bidirectional Self-Training with Multiple Anisotropic Prototypes for Domain Adaptive Semantic Segmentation

1 code implementation • 16 Apr 2022 • Yulei Lu, Yawei Luo, Li Zhang, Zheyang Li, Yi Yang, Jun Xiao

A thriving trend for domain adaptive segmentation endeavors to generate high-quality pseudo labels for the target domain and retrain the segmentor on them.

Ranked #12 on Unsupervised Domain Adaptation on GTAV-to-Cityscapes Labels

Pseudo Label Semantic Segmentation +2

Paper
Code

CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

6 code implementations • 10 Apr 2022 • Chinedu Innocent Nwoye, Deepak Alapatt, Tong Yu, Armine Vardazaryan, Fangfang Xia, Zixuan Zhao, Tong Xia, Fucang Jia, Yuxuan Yang, Hao Wang, Derong Yu, Guoyan Zheng, Xiaotian Duan, Neil Getty, Ricardo Sanchez-Matilla, Maria Robu, Li Zhang, Huabin Chen, Jiacheng Wang, Liansheng Wang, Bokai Zhang, Beerend Gerats, Sista Raviteja, Rachana Sathish, Rong Tao, Satoshi Kondo, Winnie Pang, Hongliang Ren, Julian Ronald Abbing, Mohammad Hasan Sarhan, Sebastian Bodenstedt, Nithya Bhasker, Bruno Oliveira, Helena R. Torres, Li Ling, Finn Gaida, Tobias Czempiel, João L. Vilaça, Pedro Morais, Jaime Fonseca, Ruby Mae Egging, Inge Nicole Wijma, Chen Qian, GuiBin Bian, Zhen Li, Velmurugan Balasubramanian, Debdoot Sheet, Imanol Luengo, Yuanbo Zhu, Shuai Ding, Jakob-Anton Aschenbrenner, Nicolas Elini van der Kar, Mengya Xu, Mobarakol Islam, Lalithkumar Seenivasan, Alexander Jenke, Danail Stoyanov, Didier Mutter, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Nicolas Padoy

In this paper, we present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.

Ranked #1 on Action Triplet Recognition on CholecT50 (Challenge) (using extra training data)

Action Detection Action Triplet Recognition +1

Paper
Code

UIGR: Unified Interactive Garment Retrieval

1 code implementation • 6 Apr 2022 • Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang

In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.

Retrieval

Paper
Code

ImpDet: Exploring Implicit Fields for 3D Object Detection

no code implementations • 31 Mar 2022 • Xuelin Qian, Li Wang, Yi Zhu, Li Zhang, Yanwei Fu, Xiangyang Xue

Conventional 3D object detection approaches concentrate on bounding boxes representation learning with several parameters, i.e., localization, dimension, and orientation.

3D Object Detection Object +2

Paper
Add Code

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

2 code implementations • 24 Mar 2022 • Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue

Multiple datasets and open challenges for object detection have been introduced in recent years.

Ranked #1 on Object Detection on BigDetection val

Object object-detection +1

380

Paper
Code

Multi-Scale Context-Guided Lumbar Spine Disease Identification with Coarse-to-fine Localization and Classification

1 code implementation • 16 Mar 2022 • ZiFan Chen, Jie Zhao, Hao Yu, Yue Zhang, Li Zhang

Accurate and efficient lumbar spine disease identification is crucial for clinical diagnosis.

Paper
Code

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data

1 code implementation • ACL 2022 • Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig

To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB.

Retrieval Video Retrieval

Paper
Code

A general framework for adaptive two-index fusion attribute weighted naive Bayes

no code implementations • 24 Feb 2022 • Xiaoliang Zhou, Dongyang Wu, Zitong You, Li Zhang, Ning Ye

In addition, the ATFNB framework can improve the existing two-index NB model by introducing the adaptive switching factor $\beta$.

Attribute

Paper
Add Code

CrossMoDA 2021 challenge: Benchmark of Cross-Modality Domain Adaptation techniques for Vestibular Schwannoma and Cochlea Segmentation

3 code implementations • 8 Jan 2022 • Reuben Dorent, Aaron Kujawa, Marina Ivory, Spyridon Bakas, Nicola Rieke, Samuel Joutard, Ben Glocker, Jorge Cardoso, Marc Modat, Kayhan Batmanghelich, Arseniy Belkov, Maria Baldeon Calisto, Jae Won Choi, Benoit M. Dawant, Hexin Dong, Sergio Escalera, Yubo Fan, Lasse Hansen, Mattias P. Heinrich, Smriti Joshi, Victoriya Kashtanova, Hyeon Gyu Kim, Satoshi Kondo, Christian N. Kruse, Susana K. Lai-Yuen, Hao Li, Han Liu, Buntheng Ly, Ipek Oguz, Hyungseob Shin, Boris Shirokikh, Zixian Su, Guotai Wang, Jianghao Wu, Yanwu Xu, Kai Yao, Li Zhang, Sebastien Ourselin, Jonathan Shapey, Tom Vercauteren

The aim was to automatically perform unilateral VS and bilateral cochlea segmentation on hrT2 as provided in the testing set (N=137).

Brain Segmentation Domain Adaptation +4

110

Paper
Code

The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection

no code implementations • ICCV 2021 • Zhikang Zou, Xiaoqing Ye, Liang Du, Xianhui Cheng, Xiao Tan, Li Zhang, Jianfeng Feng, Xiangyang Xue, Errui Ding

Low-cost monocular 3D object detection plays a fundamental role in autonomous driving, whereas its accuracy is still far from satisfactory.

Autonomous Driving Monocular 3D Object Detection +4

Paper
Add Code

Is "My Favorite New Movie" My Favorite Movie? Probing the Understanding of Recursive Noun Phrases

1 code implementation • 15 Dec 2021 • Qing Lyu, Hua Zheng, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch

Common Sense Reasoning Natural Language Inference

Paper
Code

Persistent Animal Identification Leveraging Non-Visual Markers

2 code implementations • 13 Dec 2021 • Michael P. J. Camilleri, Li Zhang, Rasneer S. Bains, Andrew Zisserman, Christopher K. I. Williams

Our objective is to locate and provide a unique identifier for each mouse in a cluttered home-cage environment through time, as a precursor to automated behaviour recognition for biological research.

Visual Tracking

Paper
Code

SGM3D: Stereo Guided Monocular 3D Object Detection

1 code implementation • 3 Dec 2021 • Zheyuan Zhou, Liang Du, Xiaoqing Ye, Zhikang Zou, Xiao Tan, Li Zhang, Xiangyang Xue, Jianfeng Feng

Monocular 3D object detection aims to predict the object location, dimension and orientation in 3D space alongside the object category given only a monocular image.

Autonomous Driving Depth Estimation +4

Paper
Code

ALX: Large Scale Matrix Factorization on TPUs

no code implementations • 3 Dec 2021 • Harsh Mehta, Steffen Rendle, Walid Krichene, Li Zhang

We present ALX, an open-source library for distributed matrix factorization using Alternating Least Squares, written in JAX.

Link Prediction

Paper
Add Code

Learning from Mistakes -- A Framework for Neural Architecture Search

1 code implementation • 11 Nov 2021 • Bhanu Garg, Li Zhang, Pradyumna Sridhara, Ramtin Hosseini, Eric Xing, Pengtao Xie

We propose a novel machine learning method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by focusing more on the mistakes during revision.

BIG-bench Machine Learning Neural Architecture Search

Paper
Code

iALS++: Speeding up Matrix Factorization with Subspace Optimization

1 code implementation • 26 Oct 2021 • Steffen Rendle, Walid Krichene, Li Zhang, Yehuda Koren

However, iALS does not scale well with large embedding dimensions, d, due to its cubic runtime dependency on d. Coordinate descent variations, iCD, have been proposed to lower the complexity to quadratic in d. In this work, we show that iCD approaches are not well suited for modern processors and can be an order of magnitude slower than a careful iALS implementation for small to mid scale embedding sizes (d ~ 100) and only perform better than iALS on large embeddings d ~ 1000.

32,870

Paper
Code

Revisiting the Performance of iALS on Item Recommendation Benchmarks

1 code implementation • 26 Oct 2021 • Steffen Rendle, Walid Krichene, Li Zhang, Yehuda Koren

Matrix factorization learned by implicit alternating least squares (iALS) is a popular baseline in recommender system research publications.

Collaborative Filtering Recommendation Systems

32,870

Paper
Code

SOFT: Softmax-free Transformer with Linear Complexity

2 code implementations • NeurIPS 2021 • Jiachen Lu, Jinghan Yao, Junge Zhang, Xiatian Zhu, Hang Xu, Weiguo Gao, Chunjing Xu, Tao Xiang, Li Zhang

Crucially, with a linear complexity, much longer token sequences are permitted in SOFT, resulting in superior trade-off between accuracy and complexity.

Computational Efficiency

293

Paper
Code

Text-Based Person Search with Limited Data

1 code implementation • 20 Oct 2021 • Xiao Han, Sen He, Li Zhang, Tao Xiang

Firstly, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch.

Ranked #10 on Text based Person Retrieval on CUHK-PEDES (using extra training data)

Benchmarking Contrastive Learning +7

Paper
Code

Unsupervised Domain Adaptation in Semantic Segmentation Based on Pixel Alignment and Self-Training

no code implementations • 29 Sep 2021 • Hexin Dong, Fei Yu, Jie Zhao, Bin Dong, Li Zhang

This paper proposes an unsupervised cross-modality domain adaptation approach based on pixel alignment and self-training.

Segmentation Semantic Segmentation +1

Paper
Add Code

Multi-Frequency Wireless Channel Measurements and Characteristics Analysis in Indoor Corridor Scenarios

no code implementations • 14 Aug 2021 • ZiHao Zhou, Li Zhang, Xinyue Chen, Cheng-Xiang Wang, Jie Huang

In this paper, we conduct wireless channel measurements in indoor corridor scenarios at 2.4, 5 and 6 GHz bands with bandwidth of 320 MHz.

Paper
Add Code

Progressive Coordinate Transforms for Monocular 3D Object Detection

1 code implementation • NeurIPS 2021 • Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue

Recognizing and localizing objects in the 3D space is a crucial ability for an AI agent to perceive its surrounding environment.

Monocular 3D Object Detection Object +2

Paper
Code

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer

1 code implementation • ICCV 2021 • Zhihe Lu, Sen He, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier (separating foreground and background pixels).

Ranked #9 on Few-Shot Semantic Segmentation on COCO-20i -> Pascal VOC (5-shot)

Few-Shot Semantic Segmentation Meta-Learning +1

126

Paper
Code

A Unified Efficient Pyramid Transformer for Semantic Segmentation

no code implementations • 29 Jul 2021 • Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo Wu, Yanwei Fu, Mu Li

Semantic segmentation is a challenging problem due to difficulties in modeling context in complex scenes and class confusions along boundaries.

Segmentation Semantic Segmentation

Paper
Add Code

Goal-Oriented Script Construction

1 code implementation • INLG (ACL) 2021 • Qing Lyu, Li Zhang, Chris Callison-Burch

The knowledge of scripts, common chains of events in stereotypical scenarios, is a valuable asset for task-oriented natural language understanding systems.

Language Modelling Natural Language Understanding +1

Paper
Code

Global Aggregation then Local Distribution for Scene Parsing

1 code implementation • 28 Jul 2021 • Xiangtai Li, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Xiatian Zhu, Tao Xiang

Modelling long-range contextual relationships is critical for pixel-wise prediction tasks such as semantic segmentation.

Scene Parsing Segmentation +1

344

Paper
Code

Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates

no code implementations • 20 Jul 2021 • Steve Chien, Prateek Jain, Walid Krichene, Steffen Rendle, Shuang Song, Abhradeep Thakurta, Li Zhang

We study the problem of differentially private (DP) matrix completion under user-level privacy.

Matrix Completion

Paper
Add Code

Oneshot Differentially Private Top-k Selection

no code implementations • 18 May 2021 • Gang Qiao, Weijie J. Su, Li Zhang

Being able to efficiently and accurately select the top-$k$ elements with differential privacy is an integral component of various private data analysis tasks.

Paper
Add Code

Composite Localization for Human Pose Estimation

no code implementations • 15 May 2021 • ZiFan Chen, Xin Qin, Chao Yang, Li Zhang

This work proposes a novel deep learning framework for human pose estimation called composite localization to divide the complex learning objective into two simpler ones: a sparse heatmap to find the keypoint's approximate location and two short-distance offsetmaps to obtain its final precise coordinates.

Distance regression Pose Estimation

Paper
Add Code

BasisNet: Two-stage Model Synthesis for Efficient Inference

no code implementations • 7 May 2021 • Mingda Zhang, Chun-Te Chu, Andrey Zhmoginov, Andrew Howard, Brendan Jou, Yukun Zhu, Li Zhang, Rebecca Hwa, Adriana Kovashka

With early termination, the average cost can be further reduced to 198M MAdds while maintaining accuracy of 80.0% on ImageNet.

Ranked #664 on Image Classification on ImageNet

Efficient Neural Network Image Classification +1

Paper
Add Code

Prediction of clinical tremor severity using Rank Consistent Ordinal Regression

no code implementations • 3 May 2021 • Li Zhang, Vijay Yadav, Vidya Koesmahargyo, Anzar Abbas, Isaac Galatzer-Levy

The videos are coupled with clinician assessed TETRAS scores, which are used as ground truth labels to train the DNN.

regression Transfer Learning

Paper
Add Code

Delving into Data: Effectively Substitute Training for Black-box Attack

no code implementations • CVPR 2021 • Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, Xiangyang Xue

Previous substitute training approaches focus on stealing the knowledge of the target model based on real training data or synthetic data, without exploring what kind of data can further improve the transferability between the substitute and target models.

Adversarial Attack

Paper
Add Code

Optimize Neural Fictitious Self-Play in Regret Minimization Thinking

no code implementations • 22 Apr 2021 • Yuxuan Chen, Li Zhang, Shijian Li, Gang Pan

Optimization of deep learning algorithms to approach Nash Equilibrium remains a significant problem in imperfect information games, e.g., StarCraft and poker.

Starcraft

Paper
Add Code

Improving Weakly-supervised Object Localization via Causal Intervention

1 code implementation • 21 Apr 2021 • Feifei Shao, Yawei Luo, Li Zhang, Lu Ye, Siliang Tang, Yi Yang, Jun Xiao

The recently emerged weakly supervised object localization (WSOL) methods can learn to localize an object in the image using only image-level labels.

Object Weakly-Supervised Object Localization

Paper
Code

Visual Goal-Step Inference using wikiHow

1 code implementation • EMNLP 2021 • Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch

Understanding what sequence of steps are needed to complete a goal can help artificial intelligence systems reason about human activities.

Ranked #1 on VGSI on wikiHow-image

Multimodal Reasoning VGSI

Paper
Code

BEFD: Boundary Enhancement and Feature Denoising for Vessel Segmentation

no code implementations • 8 Apr 2021 • Mo Zhang, Fei Yu, Jie Zhao, Li Zhang, Quanzheng Li

Blood vessel segmentation is crucial for many diagnostic and research applications.

Denoising Image Segmentation +3

Paper
Add Code

Hierarchical Road Topology Learning for Urban Map-less Driving

no code implementations • 31 Mar 2021 • Li Zhang, Faezeh Tafazzoli, Gunther Krehl, Runsheng Xu, Timo Rehfeld, Manuel Schier, Arunava Seal

The majority of current approaches in autonomous driving rely on High-Definition (HD) maps which detail the road geometry and surrounding area.

Autonomous Driving

Paper
Add Code

Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

1 code implementation • CVPR 2021 • Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue, Jianfeng Feng, Li Zhang

The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.

Ranked #13 on Monocular 3D Object Detection on KITTI Cars Moderate

Monocular 3D Object Detection object-detection

Paper
Code

Learning Dynamic Alignment via Meta-filter for Few-shot Learning

1 code implementation • CVPR 2021 • Chengming Xu, Chen Liu, Li Zhang, Chengjie Wang, Jilin Li, Feiyue Huang, Xiangyang Xue, Yanwei Fu

Our insight is that these methods would lead to poor adaptation with redundant matching, and leveraging channel-wise adjustment is the key to well adapting the learned knowledge to new classes.

Few-Shot Learning Position

Paper
Code

Robust and Accurate Object Detection via Adversarial Learning

1 code implementation • CVPR 2021 • Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong

Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection.

Ranked #17 on Object Detection on COCO-O

AutoML Data Augmentation +3

6,156

Paper
Code

Complementary Evidence Identification in Open-Domain Question Answering

no code implementations • EACL 2021 • Xiangyang Mou, Mo Yu, Shiyu Chang, Yufei Feng, Li Zhang, Hui Su

This paper proposes a new problem of complementary evidence identification for open-domain question answering (QA).

Evidence Selection Open-Domain Question Answering

Paper
Add Code

MoViNets: Mobile Video Networks for Efficient Video Recognition

3 code implementations • CVPR 2021 • Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

Ranked #3 on Action Classification on Charades

Action Classification Action Recognition +4

76,614

Paper
Code

Automatically detecting the conflicts between software requirements based on finer semantic analysis

1 code implementation • 3 Mar 2021 • Weize Guo, Li Zhang, Xiaoli Lian

Besides, our approach is capable of transforming natural language functional requirements into eight semantic tuples, which is useful not only for detecting conflicts between requirements but also for other tasks, such as constructing associations between requirements.

Paper
Code

The NPU System for the 2020 Personalized Voice Trigger Challenge

1 code implementation • 26 Feb 2021 • Jingyong Hou, Li Zhang, Yihui Fu, Qing Wang, Zhanheng Yang, Qijie Shao, Lei Xie

This paper describes the system developed by the NPU team for the 2020 personalized voice trigger challenge.

Small-Footprint Keyword Spotting Speaker Verification

10,168

Paper
Code

EEGFuseNet: Hybrid Unsupervised Deep Feature Characterization and Fusion for High-Dimensional EEG with An Application to Emotion Recognition

no code implementations • 7 Feb 2021 • Zhen Liang, Rushuang Zhou, Li Zhang, Linling Li, Gan Huang, Zhiguo Zhang, Shin Ishii

The performance of the extracted deep and low-dimensional features by EEGFuseNet is carefully evaluated in an unsupervised emotion recognition application based on three public emotion databases.

EEG Emotion Recognition +2

Paper
Add Code

Failure Prediction in Production Line Based on Federated Learning: An Empirical Study

no code implementations • 25 Jan 2021 • Ning Ge, Guanghao Li, Li Zhang, Yi Liu

Data protection across organizations is limiting the application of centralized learning (CL) techniques.

Federated Learning

Paper
Add Code

Few-shot Action Recognition with Prototype-centered Attentive Learning

1 code implementation • 20 Jan 2021 • Xiatian Zhu, Antoine Toisoul, Juan-Manuel Perez-Rua, Li Zhang, Brais Martinez, Tao Xiang

Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with the improvement particularly significant (10+%) on the most challenging fine-grained action recognition benchmark.

Contrastive Learning Few-Shot action recognition +3

Paper
Code

TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control

1 code implementation • 1 Jan 2021 • Hongyu Zang, Xin Li, Li Zhang, Peiyao Zhao, Mingzhong Wang

Trust region methods and maximum entropy methods are two state-of-the-art branches used in reinforcement learning (RL) for the benefits of stability and exploration in continuous environments, respectively.

Continuous Control Reinforcement Learning (RL)

Paper
Code

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

5 code implementations • CVPR 2021 • Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H. S. Torr, Li Zhang

In this paper, we aim to provide an alternative perspective by treating semantic segmentation as a sequence-to-sequence prediction task.

Ranked #2 on Semantic Segmentation on FoodSeg103 (using extra training data)

Medical Image Segmentation Segmentation +1

8,265

Paper
Code

Hop-Hop Relation-aware Graph Neural Networks

no code implementations • 21 Dec 2020 • Li Zhang, Yan Ge, Haiping Lu

Graph Neural Networks (GNNs) are widely used in graph representation learning.

Knowledge Graph Embedding Relation

Paper
Add Code

Unifying Homophily and Heterophily Network Transformation via Motifs

no code implementations • 21 Dec 2020 • Yan Ge, Jun Ma, Li Zhang, Haiping Lu

Because H2NT can sparsify networks with motif structures, it can also improve the computational efficiency of existing network embedding methods when integrated.

Computational Efficiency Network Embedding +1

Paper
Add Code

Rankmax: An Adaptive Projection Alternative to the Softmax Function

no code implementations • NeurIPS 2020 • Weiwei Kong, Walid Krichene, Nicolas Mayoraz, Steffen Rendle, Li Zhang

Several machine learning models involve mapping a score vector to a probability vector.

Paper
Add Code

A Systematic Literature Review on Federated Learning: From A Model Quality Perspective

no code implementations • 1 Dec 2020 • Yi Liu, Li Zhang, Ning Ge, Guanghao Li

In this process, the server uses an incentive mechanism to encourage clients to contribute high-quality and large-volume data to improve the global model.

Federated Learning

Paper
Add Code

Boundary-sensitive Pre-training for Temporal Localization in Videos

1 code implementation • ICCV 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Victor Escorcia, Brais Martinez, Xiatian Zhu, Li Zhang, Bernard Ghanem, Tao Xiang

However, most existing models developed for these tasks are pre-trained on general video action classification tasks.

Ranked #23 on Temporal Action Localization on ActivityNet-1.3

Action Classification Classification +3

Paper
Code

Direct Classification of Emotional Intensity

no code implementations • 15 Nov 2020 • Jacob Ouyang, Isaac R Galatzer-Levy, Vidya Koesmahargyo, Li Zhang

In this paper, we present a model that can directly predict emotion intensity score from video inputs, instead of deriving from action units.

Classification General Classification

Paper
Add Code

Skin disease diagnosis with deep learning: a review

no code implementations • 11 Nov 2020 • Hongfeng Li, Yini Pan, Jie Zhao, Li Zhang

As an important part of this article, we then review the literature involving deep learning methods for skin disease diagnosis from several aspects according to the specific tasks.

Paper
Add Code
