Search Results for author: Chunyu Wang

Found 37 papers, 17 papers with code

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations • 22 Apr 2024 • Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra, Xiyang Dai, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Victor Fragoso, Dan Iter, Mei Gao, Min Gao, Jianfeng Gao, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Ce Liu, Mengchen Liu, Weishung Liu, Eric Lin, Zeqi Lin, Chong Luo, Piyush Madan, Matt Mazzola, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Xin Wang, Lijuan Wang, Chunyu Wang, Yu Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Haiping Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Sonali Yadav, Fan Yang, Jianwei Yang, ZiYi Yang, Yifan Yang, Donghan Yu, Lu Yuan, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Language Modelling

GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling

no code implementations • 28 Mar 2024 • BoWen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo

We introduce a radiance representation that is both structured and fully explicit and thus greatly facilitates 3D generative modeling.

Decoder Text to 3D

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

1 code implementation • 23 Jan 2024 • Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng

Thus, we reformulate the two-branch Siamese tracking as a conceptually simple, fully transformer-based Single-Branch Tracking pipeline, dubbed SBT.

Feature Correlation Visual Object Tracking

Plan, Posture and Go: Towards Open-World Text-to-Motion Generation

no code implementations • 22 Dec 2023 • Jinpeng Liu, Wenxun Dai, Chunyu Wang, Yiji Cheng, Yansong Tang, Xin Tong

Some works use the CLIP model to align the motion space and the text space, aiming to enable motion generation from natural language motion descriptions.

VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder

1 code implementation • 18 Dec 2023 • Zhicong Tang, Shuyang Gu, Chunyu Wang, Ting Zhang, Jianmin Bao, Dong Chen, Baining Guo

The 3D volumes are then trained on a diffusion model for text-to-3D generation using a 3D U-Net.

3D Generation Object +1

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

no code implementations • 30 Nov 2023 • Yanhui Wang, Jianmin Bao, Wenming Weng, Ruoyu Feng, Dacheng Yin, Tao Yang, Jingxu Zhang, Qi Dai, Zhiyuan Zhao, Chunyu Wang, Kai Qiu, Yuhui Yuan, Chuanxin Tang, Xiaoyan Sun, Chong Luo, Baining Guo

We present MicroCinema, a straightforward yet effective framework for high-quality and coherent text-to-video generation.

Text-to-Image Generation Text-to-Video Generation +1

ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models

no code implementations • 30 Nov 2023 • Wenming Weng, Ruoyu Feng, Yanhui Wang, Qi Dai, Chunyu Wang, Dacheng Yin, Zhiyuan Zhao, Kai Qiu, Jianmin Bao, Yuhui Yuan, Chong Luo, Yueyi Zhang, Zhiwei Xiong

Second, it preserves the high-fidelity generation ability of the pre-trained image diffusion models by making only minimal network modifications.

Text-to-Video Generation Video Generation

GAIA: Zero-shot Talking Avatar Generation

no code implementations • 26 Nov 2023 • Tianyu He, Junliang Guo, Runyi Yu, Yuchi Wang, Jialiang Zhu, Kaikai An, Leyi Li, Xu Tan, Chunyu Wang, Han Hu, HsiangTao Wu, Sheng Zhao, Jiang Bian

Zero-shot talking avatar generation aims at synthesizing natural talking videos from speech and a single portrait image.

Multiple View Geometry Transformers for 3D Human Pose Estimation

no code implementations • 18 Nov 2023 • Ziwei Liao, Jialiang Zhu, Chunyu Wang, Han Hu, Steven L. Waslander

In this work, we aim to improve the 3D reasoning ability of Transformers in multi-view 3D human pose estimation.

3D Human Pose Estimation

V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection

1 code implementation • 8 Aug 2023 • Yichao Shen, Zigang Geng, Yuhui Yuan, Yutong Lin, Ze Liu, Chunyu Wang, Han Hu, Nanning Zheng, Baining Guo

We introduce a highly performant 3D object detector for point clouds using the DETR framework.

Ranked #1 on 3D Object Detection on ScanNetV2

3D Object Detection Decoder +2

Human Pose as Compositional Tokens

1 code implementation • CVPR 2023 • Zigang Geng, Chunyu Wang, Yixuan Wei, Ze Liu, Houqiang Li, Han Hu

Human pose is typically represented by a coordinate vector of body joints or their heatmap embeddings.

Ranked #1 on Pose Estimation on MPII Human Pose

Decoder Pose Estimation

VMarker-Pro: Probabilistic 3D Human Mesh Estimation from Virtual Markers

2 code implementations • CVPR 2023 • Xiaoxuan Ma, Jiajun Su, Yuan Xu, Wentao Zhu, Chunyu Wang, Yizhou Wang

Monocular 3D human mesh estimation faces challenges due to depth ambiguity and the complexity of mapping images to complex parameter spaces.

Ranked #1 on 3D Human Pose Estimation on Surreal

3D Human Pose Estimation 3D Pose Estimation

Robust Multi-Object Tracking by Marginal Inference

no code implementations • 7 Aug 2022 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu

To address the problem, we present an efficient approach to compute a marginal probability for each pair of objects in real time.

Multi-Object Tracking Object

One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement

no code implementations • 31 Jul 2022 • Zihao Yin, Ping Gong, Chunyu Wang, Yizhou Yu, Yizhou Wang

As an important upstream task for many medical applications, supervised landmark localization still requires non-negligible annotation costs to achieve desirable performance.

Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

1 code implementation • 22 Jul 2022 • Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, Yizhou Wang

While the voxel-based methods have achieved promising results for multi-person 3D pose estimation from multi-cameras, they suffer from heavy computation burdens, especially for large scenes.

Ranked #5 on 3D Multi-Person Pose Estimation on Campus

3D Multi-Person Pose Estimation 3D Pose Estimation

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data

1 code implementation • 20 Jul 2022 • Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang

While monocular 3D pose estimation seems to have achieved very accurate results on public datasets, the generalization ability of these models is largely overlooked.

Ranked #5 on 3D Multi-Person Pose Estimation (absolute) on MuPoTS-3D

3D Multi-Person Pose Estimation (absolute) 3D Pose Estimation

Correlation-Aware Deep Tracking

1 code implementation • CVPR 2022 • Fei Xie, Chunyu Wang, Guangting Wang, Yue Cao, Wankou Yang, Wenjun Zeng

In contrast to the Siamese-like feature extraction, our network deeply embeds cross-image feature correlation in multiple layers of the feature network.

Feature Correlation Visual Object Tracking

Learning Tracking Representations via Dual-Branch Fully Transformer Networks

1 code implementation • 5 Dec 2021 • Fei Xie, Chunyu Wang, Guangting Wang, Wankou Yang, Wenjun Zeng

We present a Siamese-like Dual-branch network based on solely Transformers for tracking.

Object Tracking

MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark

no code implementations • 30 Nov 2021 • Xiaotian Han, Quanzeng You, Chunyu Wang, Zhizheng Zhang, Peng Chu, Houdong Hu, Jiang Wang, Zicheng Liu

This dataset provides a more reliable benchmark of multi-camera, multi-object tracking systems in cluttered and crowded environments.

Ranked #2 on Object Tracking on MMPTRACK

Multi-Object Tracking Multiple People Tracking +1

Relational Self-Attention: What's Missing in Attention for Video Understanding

1 code implementation • NeurIPS 2021 • Manjin Kim, Heeseung Kwon, Chunyu Wang, Suha Kwak, Minsu Cho

Convolution has been arguably the most important feature transform for modern neural networks, leading to the advance of deep learning.

Ranked #11 on Action Recognition on Diving-48

Action Recognition Temporal Action Localization +1

VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the Wild

no code implementations • 5 Aug 2021 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wenjun Zeng

We estimate 3D poses from the voxel representation by predicting whether each voxel contains a particular body joint.

Ranked #7 on 3D Multi-Person Pose Estimation on Campus

3D Multi-Person Pose Estimation 3D Pose Estimation

Context Modeling in 3D Human Pose Estimation: A Unified Perspective

1 code implementation • CVPR 2021 • Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Hai Ci, Yizhou Wang

By comparing the two methods, we found that the end-to-end training scheme in GNN and the limb length constraints in PSM are two complementary factors to improve results.

Ranked #60 on 3D Human Pose Estimation on MPI-INF-3DHP (AUC metric)

3D Human Pose Estimation

A Multi-task Joint Framework for Real-time Person Search

no code implementations • 11 Dec 2020 • Ye Li, Kangning Yin, Jie Liang, Chunyu Wang, Guangqiang Yin

To solve these problems, we propose a Multi-task Joint Framework for real-time person search (MJF), which optimizes the person detection, feature extraction and identity comparison respectively.

Human Detection Person Search

An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

1 code implementation • ICCV 2021 • Rongchang Xie, Chunyu Wang, Wenjun Zeng, Yizhou Wang

The state-of-the-art methods are consistency-based: they learn about unlabeled images by encouraging the model to give consistent predictions for images under different augmentations.

Pose Estimation Semi-Supervised Human Pose Estimation

AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild

2 code implementations • 26 Oct 2020 • Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, Wenjun Zeng

To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views.

Ranked #1 on 3D Human Pose Estimation on Total Capture

3D Human Pose Estimation

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

2 code implementations • ECCV 2020 • Hanyue Tu, Chunyu Wang, Wen-Jun Zeng

In contrast to previous efforts, which require establishing cross-view correspondence based on noisy and incomplete 2D pose estimations, we present an end-to-end solution which operates directly in the 3D space and therefore avoids making incorrect decisions in the 2D space.

Ranked #6 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking

32 code implementations • 4 Apr 2020 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wen-Jun Zeng, Wenyu Liu

Formulating MOT as multi-task learning of object detection and re-ID in a single network is appealing since it allows joint optimization of the two tasks and enjoys high computation efficiency.

Ranked #1 on Multi-Object Tracking on 2DMOT15 (using extra training data)

Fairness Multi-Object Tracking +4

MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation

no code implementations • CVPR 2020 • Rongchang Xie, Chunyu Wang, Yizhou Wang

Cross view feature fusion is the key to address the occlusion problem in human pose estimation.

Meta-Learning Pose Estimation

Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach

1 code implementation • CVPR 2020 • Zhe Zhang, Chunyu Wang, Wenhu Qin, Wen-Jun Zeng

Then we lift the multi-view 2D poses to the 3D space by an Orientation Regularized Pictorial Structure Model (ORPSM) which jointly minimizes the projection error between the 3D and 2D poses, along with the discrepancy between the 3D pose and IMU orientations.

Ranked #1 on 3D Absolute Human Pose Estimation on Total Capture

2D Pose Estimation 3D Absolute Human Pose Estimation

Cross View Fusion for 3D Human Pose Estimation

1 code implementation • ICCV 2019 • Haibo Qiu, Chunyu Wang, Jingdong Wang, Naiyan Wang, Wen-Jun Zeng

It consists of two separate steps: (1) estimating the 2D poses in multi-view images and (2) recovering the 3D poses from the multi-view 2D poses.

Ranked #6 on 3D Human Pose Estimation on Total Capture

2D Pose Estimation 3D Human Pose Estimation

Video Object Segmentation by Learning Location-Sensitive Embeddings

no code implementations • ECCV 2018 • Hai Ci, Chunyu Wang, Yizhou Wang

We address the problem of video object segmentation which outputs the masks of a target object throughout a video given only a bounding box in the first frame.

Object Semantic Segmentation +2

Online Dictionary Learning for Approximate Archetypal Analysis

no code implementations • ECCV 2018 • Jieru Mei, Chunyu Wang, Wen-Jun Zeng

The archetypes generally correspond to the extremal points in the dataset and are learned by requiring them to be convex combinations of the training data.

Dictionary Learning

Object Detection in Videos by High Quality Object Linking

no code implementations • 30 Jan 2018 • Peng Tang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wen-Jun Zeng, Jingdong Wang

In particular, our method improves results by 8.8% over the static image detector for fast moving objects.

General Classification Object +3

Mining 3D Key-Pose-Motifs for Action Recognition

no code implementations • CVPR 2016 • Chunyu Wang, Yizhou Wang, Alan L. Yuille

Recognizing an action from a sequence of 3D skeletal poses is a challenging task.

Action Recognition Quantization +1

Representing Data by a Mixture of Activated Simplices

no code implementations • 12 Dec 2014 • Chunyu Wang, John Flynn, Yizhou Wang, Alan L. Yuille

We show that under this restriction, building a model with simplices amounts to constructing a convex hull inside the sphere whose boundary facets are close to the data.

Robust Estimation of 3D Human Poses from a Single Image

no code implementations • CVPR 2014 • Chunyu Wang, Yizhou Wang, Zhouchen Lin, Alan L. Yuille, Wen Gao

We address the challenges in three ways: (i) We represent a 3D pose as a linear combination of a sparse set of bases learned from 3D human skeletons.

Ranked #27 on 3D Human Pose Estimation on HumanEva-I

3D Human Pose Estimation 3D Pose Estimation +2

An Approach to Pose-Based Action Recognition

no code implementations • CVPR 2013 • Chunyu Wang, Yizhou Wang, Alan L. Yuille

We start by improving a state-of-the-art method for estimating human joint locations from videos.

Action Recognition In Videos Temporal Action Localization
