Search Results for author: Dongyoon Han

Found 42 papers, 28 papers with code

HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts

no code implementations • 26 Apr 2024 • Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun

In an era where the volume of data drives the effectiveness of self-supervised learning, the specificity and clarity of data semantics play a crucial role in model training.

Self-Supervised Learning Specificity

Paper
Add Code

Leveraging Temporal Contextualization for Video Action Recognition

no code implementations • 15 Apr 2024 • Minji Kim, Dongyoon Han, Taekyung Kim, Bohyung Han

We propose Temporal Contextualization (TC), a novel layer-wise temporal information infusion mechanism for video that extracts core information from each frame, interconnects relevant information across the video to summarize into context tokens, and ultimately leverages the context tokens during the feature encoding process.

Action Recognition Temporal Action Localization +1

Paper
Add Code

Model Stock: All we need is just a few fine-tuned models

2 code implementations • 28 Mar 2024 • Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han

This paper introduces an efficient fine-tuning method for large pre-trained models, offering strong in-distribution (ID) and out-of-distribution (OOD) performance.

3,435

Paper
Code

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

1 code implementation • 28 Mar 2024 • Donghyun Kim, Byeongho Heo, Dongyoon Han

This paper revives Densely Connected Convolutional Networks (DenseNets) and reveals the underrated effectiveness over predominant ResNet-style architectures.

Instance Segmentation object-detection +2

Paper
Code

Rotary Position Embedding for Vision Transformer

1 code implementation • 20 Mar 2024 • Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun

This study provides a comprehensive analysis of RoPE when applied to ViTs, utilizing practical implementations of RoPE for 2D vision data.

Position

Paper
Code

Masked Image Modeling via Dynamic Token Morphing

no code implementations • 30 Dec 2023 • Taekyung Kim, Dongyoon Han, Byeongho Heo

Masked Image Modeling (MIM) arises as a promising option for Vision Transformers among various self-supervised learning (SSL) methods.

Fine-Grained Image Classification Self-Supervised Learning

Paper
Add Code

SeiT++: Masked Token Modeling Improves Storage-efficient Training

1 code implementation • 15 Dec 2023 • Minhyun Lee, Song Park, Byeongho Heo, Dongyoon Han, Hyunjung Shim

A recent breakthrough by SeiT proposed the use of Vector-Quantized (VQ) feature vectors (i. e., tokens) as network inputs for vision classification.

Classification Data Augmentation +2

Paper
Code

Match me if you can: Semantic Correspondence Learning with Unpaired Images

no code implementations • 30 Nov 2023 • Jiwon Kim, Byeongho Heo, Sangdoo Yun, Seungryong Kim, Dongyoon Han

Recent approaches for semantic correspondence have focused on obtaining high-quality correspondences using a complicated network, refining the ambiguous or noisy matching points.

Semantic correspondence

Paper
Add Code

Gramian Attention Heads are Strong yet Efficient Vision Learners

1 code implementation • ICCV 2023 • Jongbin Ryu, Dongyoon Han, Jongwoo Lim

We introduce a novel architecture design that enhances expressiveness by incorporating multiple head classifiers (\ie, classification heads) instead of relying on channel expansion or additional building blocks.

Fine-Grained Image Classification Instance Segmentation +2

Paper
Code

Learning with Unmasked Tokens Drives Stronger Vision Learners

no code implementations • 20 Oct 2023 • Taekyung Kim, Sanghyuk Chun, Byeongho Heo, Dongyoon Han

MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly masking input tokens for the encoder to process, with the decoder reconstructing the masked tokens to the input.

Attribute Fine-Grained Image Classification +2

Paper
Add Code

Masking Augmentation for Supervised Learning

1 code implementation • 20 Jun 2023 • Byeongho Heo, Taekyung Kim, Sangdoo Yun, Dongyoon Han

In this paper, we propose a novel way to involve masking augmentations dubbed Masked Sub-model (MaskSub).

Ranked #9 on Self-Supervised Image Classification on ImageNet (finetuned)

Self-Supervised Image Classification Transfer Learning

Paper
Code

GeNAS: Neural Architecture Search with Better Generalization

1 code implementation • 15 May 2023 • JoonHyun Jeong, Joonsang Yu, Geondo Park, Dongyoon Han, Youngjoon Yoo

Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.

Neural Architecture Search object-detection +2

Paper
Code

Neglected Free Lunch -- Learning Image Classifiers Using Annotation Byproducts

3 code implementations • 30 Mar 2023 • Dongyoon Han, Junsuk Choe, Seonghyeok Chun, John Joon Young Chung, Minsuk Chang, Sangdoo Yun, Jean Y. Song, Seong Joon Oh

We refer to the new paradigm of training models with annotation byproducts as learning using annotation byproducts (LUAB).

Time Series

Paper
Code

The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation

1 code implementation • CVPR 2023 • Beomyoung Kim, JoonHyun Jeong, Dongyoon Han, Sung Ju Hwang

In this paper, we introduce a novel learning scheme named weakly semi-supervised instance segmentation (WSSIS) with point labels for budget-efficient and high-performance instance segmentation.

Instance Segmentation Semantic Segmentation +1

Paper
Code

Neglected Free Lunch - Learning Image Classifiers Using Annotation Byproducts

1 code implementation • ICCV 2023 • Dongyoon Han, Junsuk Choe, Seonghyeok Chun, John Joon Young Chung, Minsuk Chang, Sangdoo Yun, Jean Y. Song, Seong Joon Oh

We refer to the new paradigm of training models with annotation byproducts as learning using annotation byproducts (LUAB).

Time Series

Paper
Code

Generating Instance-level Prompts for Rehearsal-free Continual Learning

no code implementations • ICCV 2023 • Dahuin Jung, Dongyoon Han, Jihwan Bang, Hwanjun Song

However, we observe that the use of a prompt pool creates a domain scalability problem between pre-training and continual learning.

Continual Learning

Paper
Add Code

Can We Find Strong Lottery Tickets in Generative Models?

no code implementations • 16 Dec 2022 • Sangyeop Yeo, Yoojin Jang, Jy-yong Sohn, Dongyoon Han, Jaejun Yoo

To the best of our knowledge, we are the first to show the existence of strong lottery tickets in generative models and provide an algorithm to find it stably.

Model Compression Network Pruning

Paper
Add Code

Similarity of Neural Architectures using Adversarial Attack Transferability

no code implementations • 20 Oct 2022 • Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, Sanghyuk Chun, Jong-Seok Lee

In recent years, many deep neural architectures have been developed for image classification.

Adversarial Attack Feature Importance +2

Paper
Add Code

Time Is MattEr: Temporal Self-supervision for Video Transformers

1 code implementation • 19 Jul 2022 • Sukmin Yun, Jaehyung Kim, Dongyoon Han, Hwanjun Song, Jung-Woo Ha, Jinwoo Shin

Understanding temporal dynamics of video is an essential aspect of learning better video representations.

Action Recognition Temporal Action Localization

Paper
Code

Loss-based Sequential Learning for Active Domain Adaptation

no code implementations • 25 Apr 2022 • Kyeongtak Han, Youngeun Kim, Dongyoon Han, Sungeun Hong

To solve these, we fully utilize pseudo labels of the unlabeled target domain by leveraging loss prediction.

Domain Adaptation

Paper
Add Code

An Extendable, Efficient and Effective Transformer-based Object Detector

1 code implementation • 17 Apr 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang

Transformers have been widely used in numerous vision problems especially for visual recognition and detection.

Image Classification Instance Segmentation +4

299

Paper
Code

Frequency Selective Augmentation for Video Representation Learning

no code implementations • 8 Apr 2022 • Jinhyung Kim, Taeoh Kim, Minho Shim, Dongyoon Han, Dongyoon Wee, Junmo Kim

FreqAug stochastically removes specific frequency components from the video so that learned representation captures essential features more from the remaining information for various downstream tasks.

Action Recognition Data Augmentation +3

Paper
Add Code

Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?

1 code implementation • CVPR 2022 • Jisoo Mok, Byunggook Na, Ji-Hoon Kim, Dongyoon Han, Sungroh Yoon

To take such non-linear characteristics into account, we introduce Label-Gradient Alignment (LGA), a novel NTK-based metric whose inherent formulation allows it to capture the large amount of non-linear advantage present in modern neural architectures.

Neural Architecture Search

Paper
Code

Learning Features with Parameter-Free Layers

1 code implementation • ICLR 2022 • Dongyoon Han, Youngjoon Yoo, Beomyoung Kim, Byeongho Heo

We aim to break the stereotype of organizing the spatial operations of building blocks into trainable layers.

Paper
Code

OCR-free Document Understanding Transformer

4 code implementations • 30 Nov 2021 • Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park

Current Visual Document Understanding (VDU) methods outsource the task of reading text to off-the-shelf Optical Character Recognition (OCR) engines and focus on the understanding task with the OCR outputs.

Ranked #10 on Document Image Classification on RVL-CDIP

Document Image Classification document understanding +3

5,315

Paper
Code

Contrastive Vicinal Space for Unsupervised Domain Adaptation

1 code implementation • 26 Nov 2021 • Jaemin Na, Dongyoon Han, Hyung Jin Chang, Wonjun Hwang

In the contrastive space, inter-domain discrepancy is mitigated by constraining instances to have contrastive views and labels, and the consensus space reduces the confusion between intra-domain categories.

Ranked #1 on Unsupervised Domain Adaptation on PACS

Unsupervised Domain Adaptation

Paper
Code

ViDT: An Efficient and Effective Fully Transformer-based Object Detector

1 code implementation • ICLR 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang

Transformers are transforming the landscape of computer vision, especially for recognition tasks.

Ranked #12 on Object Detection on COCO 2017 val

Image Classification Object +2

299

Paper
Code

Rethinking Spatial Dimensions of Vision Transformers

10 code implementations • ICCV 2021 • Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh

We empirically show that such a spatial dimension reduction is beneficial to a transformer architecture as well, and propose a novel Pooling-based Vision Transformer (PiT) upon the original ViT model.

Ranked #336 on Image Classification on ImageNet

Dimensionality Reduction Image Classification +2

29,826

Paper
Code

Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels

2 code implementations • CVPR 2021 • Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun

However, they have not fixed the training set, presumably because of a formidable annotation cost.

Ranked #20 on Image Classification on OmniBenchmark

Image Classification Instance Segmentation +4

395

Paper
Code

VideoMix: Rethinking Data Augmentation for Video Classification

2 code implementations • 7 Dec 2020 • Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Jinhyung Kim

Recent data augmentation strategies have been reported to address the overfitting problems in static image classifiers.

Action Localization Action Recognition +5

Paper
Code

Rethinking Channel Dimensions for Efficient Model Design

10 code implementations • CVPR 2021 • Dongyoon Han, Sangdoo Yun, Byeongho Heo, Youngjoon Yoo

We then investigate the channel configuration of a model by searching network architectures concerning the channel configuration under the computational cost restriction.

Ranked #293 on Image Classification on ImageNet (using extra training data)

Image Classification Instance Segmentation +4

5,263

Paper
Code

AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

4 code implementations • ICLR 2021 • Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha

Because of the scale invariance, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers.

Audio Classification Image Classification +3

29,826

Paper
Code

An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods

no code implementations • 9 Mar 2020 • Sanghyuk Chun, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo

Despite apparent human-level performances of deep neural networks (DNN), they behave fundamentally differently from humans.

Bayesian Inference

Paper
Add Code

EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse

2 code implementations • 15 Jun 2019 • YoungJoon Yoo, Dongyoon Han, Sangdoo Yun

In this paper, we propose a new multi-scale face detector having an extremely tiny number of parameters (EXTD), less than 0. 1 million, as well as achieving comparable performance to deep heavy detectors.

Ranked #23 on Face Detection on WIDER Face (Hard)

Face Detection

187

Paper
Code

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

30 code implementations • ICCV 2019 • Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo

Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W

Domain Generalization Image Captioning +5

29,826

Paper
Code

ACE: Artificial Checkerboard Enhancer to Induce and Evade Adversarial Attacks

no code implementations • ICLR 2019 • Jisung Hwang, Younghoon Kim, Sanghyuk Chun, Jaejun Yoo, Ji-Hoon Kim, Dongyoon Han, Jung-Woo Ha

The checkerboard phenomenon is one of the well-known visual artifacts in the computer vision field.

Paper
Add Code

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis

13 code implementations • ICCV 2019 • Jeonghun Baek, Geewook Kim, Junyeop Lee, Sungrae Park, Dongyoon Han, Sangdoo Yun, Seong Joon Oh, Hwalsuk Lee

Many new proposals for scene text recognition (STR) models have been introduced in recent years.

Ranked #7 on Scene Text Recognition on ICDAR 2003

Image Matching Scene Text Recognition

3,629

Paper
Code

Character Region Awareness for Text Detection

18 code implementations • CVPR 2019 • Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee

Scene text detection methods based on neural networks have emerged recently and have shown promising results.

Ranked #1 on Scene Text Detection on ICDAR 2013 (Precision metric)

Scene Text Detection Text Detection

22,004

Paper
Code

C3: Concentrated-Comprehensive Convolution and its application to semantic segmentation

2 code implementations • 12 Dec 2018 • Hyojin Park, Youngjoon Yoo, Geonseok Seo, Dongyoon Han, Sangdoo Yun, Nojun Kwak

To resolve this problem, we propose a new block called Concentrated-Comprehensive Convolution (C3) which applies the asymmetric convolutions before the depth-wise separable dilated convolution to compensate for the information loss due to dilated convolution.

Semantic Segmentation

Paper
Code

Deep Pyramidal Residual Networks

9 code implementations • CVPR 2017 • Dongyoon Han, Jiwhan Kim, Junmo Kim

This design, which is discussed in depth together with our new insights, has proven to be an effective means of improving generalization ability.

Ranked #98 on Image Classification on CIFAR-10

General Classification Image Classification

2,918

Paper
Code

Unsupervised Simultaneous Orthogonal Basis Clustering Feature Selection

no code implementations • CVPR 2015 • Dongyoon Han, Junmo Kim

Unlike the recent unsupervised feature selection methods, SOCFS does not explicitly use the pre-computed local structure information for data points represented as additional terms of their objective functions, but directly computes latent cluster information by the target matrix conducting orthogonal basis clustering in a single unified term of the proposed objective function.

Clustering feature selection

Paper
Add Code

Salient Region Detection via High-Dimensional Color Transform

no code implementations • CVPR 2014 • Jiwhan Kim, Dongyoon Han, Yu-Wing Tai, Junmo Kim

By mapping a low dimensional RGB color to a feature vector in a high-dimensional color space, we show that we can linearly separate the salient regions from the background by finding an optimal linear combination of color coefficients in the high-dimensional color space.

Vocal Bursts Intensity Prediction

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.