Search Results for author: Hongjie Zhang

Found 12 papers, 4 papers with code

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

1 code implementation • 24 Mar 2024 • Yifei HUANG, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, LiMin Wang, Yu Qiao

Along with the videos we record high-quality gaze data and provide detailed multimodal annotations, formulating a playground for modeling the human ability to bridge asynchronous procedural actions from different viewpoints.

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

2 code implementations • 22 Mar 2024 • Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei HUANG, Yu Qiao, Yali Wang, LiMin Wang

We introduce InternVideo2, a new video foundation model (ViFM) that achieves state-of-the-art performance in action recognition, video-text tasks, and video-centric dialogue.

Ranked #1 on Audio Classification on ESC-50 (using extra training data)

Action Classification Action Recognition +12

MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding

no code implementations • 8 Dec 2023 • Hongjie Zhang, Yi Liu, Lu Dong, Yifei HUANG, Zhen-Hua Ling, Yali Wang, LiMin Wang, Yu Qiao

While several long-form VideoQA datasets have been introduced, the lengths of both the videos used to curate questions and the sub-clips of clues leveraged to answer those questions have not yet reached the criteria for genuine long-form video understanding.

Question Answering Video Question Answering +1

Multi-view Feature Extraction based on Triple Contrastive Heads

no code implementations • 22 Mar 2023 • Hongjie Zhang

Multi-view feature extraction is an efficient approach for alleviating the issue of dimensionality in high-dimensional multi-view data.

Contrastive Learning Self-Supervised Learning

Multi-view Feature Extraction based on Dual Contrastive Head

no code implementations • 8 Feb 2023 • Hongjie Zhang

Multi-view feature extraction is an efficient approach for alleviating the issue of dimensionality in high-dimensional multi-view data.

Contrastive Learning Self-Supervised Learning

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

2 code implementations • 6 Dec 2022 • Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, LiMin Wang, Yu Qiao

Specifically, InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives, and selectively coordinates video representations of these two complementary frameworks in a learnable manner to boost various video applications.

Ranked #1 on Action Recognition on Something-Something V1 (using extra training data)

Action Classification Contrastive Learning +8

AcceRL: Policy Acceleration Framework for Deep Reinforcement Learning

no code implementations • 28 Nov 2022 • Hongjie Zhang

Inspired by the redundancy of neural networks, we propose a lightweight parallel training framework based on neural network compression, AcceRL, to accelerate the policy learning while ensuring policy quality.

Decision Making General Reinforcement Learning +3

InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges

2 code implementations • 17 Nov 2022 • Guo Chen, Sen Xing, Zhe Chen, Yi Wang, Kunchang Li, Yizhuo Li, Yi Liu, Jiahao Wang, Yin-Dong Zheng, Bingkun Huang, Zhiyu Zhao, Junting Pan, Yifei HUANG, Zun Wang, Jiashuo Yu, Yinan He, Hongjie Zhang, Tong Lu, Yali Wang, LiMin Wang, Yu Qiao

In this report, we present our champion solutions to five tracks at the Ego4D challenge.

Ranked #1 on State Change Object Detection on Ego4D

Future Hand Prediction Moment Queries +7

Feature Extraction Framework based on Contrastive Learning with Adaptive Positive and Negative Samples

no code implementations • 11 Jan 2022 • Hongjie Zhang

In this study, we propose a feature extraction framework based on contrastive learning with adaptive positive and negative samples (CL-FEFA) that is suitable for unsupervised, supervised, and semi-supervised single-view feature extraction.

Contrastive Learning

Unified Framework for Feature Extraction based on Contrastive Learning

no code implementations • 25 Jan 2021 • Hongjie Zhang

In this study, we proposed a unified framework based on a new perspective of contrastive learning (CL) that is suitable for both unsupervised and supervised feature extraction.

Contrastive Learning Graph Embedding +1

Hybrid Models for Open Set Recognition

no code implementations • ECCV 2020 • Hongjie Zhang, Ang Li, Jie Guo, Yanwen Guo

We propose the OpenHybrid framework, which is composed of an encoder to encode the input data into a joint embedding space, a classifier to classify samples to inlier classes, and a flow-based density estimator to detect whether a sample belongs to the unknown category.

Ranked #6 on Out-of-Distribution Detection on CIFAR-10 vs CIFAR-100

Open Set Learning Out-of-Distribution Detection

Viewpoint Selection for Photographing Architectures

no code implementations • 6 Mar 2017 • Jingwu He, Linbo Wang, Wenzhe Zhou, Hongjie Zhang, Xiufen Cui, Yanwen Guo

Unlike previous efforts devoted to photo quality assessment, which mainly rely on 2D image features, we show in this paper that combining 2D image features extracted from images with 3D geometric features computed on the 3D models can result in more reliable evaluation of viewpoint quality.

2k Clustering

