no code implementations • ICCV 2023 • Junyu Bi, Daixuan Cheng, Ping Yao, Bochen Pang, Yuefeng Zhan, Chuanguang Yang, Yujing Wang, Hao Sun, Weiwei Deng, Qi Zhang
Vision-Language Pretraining (VLP) has significantly improved the performance of various vision-language tasks with the matching of images and texts.
no code implementations • 13 Dec 2020 • Kun Zhang, Rui Wu, Ping Yao, Kai Deng, Ding Li, Renbiao Liu, Chuanguang Yang, Ge Chen, Min Du, Tianyao Zheng
We note that 2D pose estimation task is highly dependent on the contextual relationship between image patches, thus we introduce a self-supervised method for pretraining 2D pose estimation networks.
no code implementations • 11 Sep 2019 • Kun Zhang, Peng He, Ping Yao, Ge Chen, Rui Wu, Min Du, Huimin Li, Li Fu, Tianyao Zheng
Specifically, RAM learns a group of weights to represent the different importance of feature maps across resolutions, and the GPR gradually merges every two feature maps from low to high resolutions to regress final human keypoint heatmaps.