Search Results for author: Vlad I. Morariu

Found 28 papers, 6 papers with code

TutoAI: A Cross-domain Framework for AI-assisted Mixed-media Tutorial Creation on Physical Tasks

no code implementations • 12 Mar 2024 • Yuexi Chen, Vlad I. Morariu, Anh Truong, Zhicheng Liu

Mixed-media tutorials, which integrate videos, images, text, and diagrams to teach procedural skills, offer more browsable alternatives than timeline-based videos.

LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents

no code implementations • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023 • Puneet Mathur, Rajiv Jain, Ashutosh Mehra, Jiuxiang Gu, Franck Dernoncourt, Anandhavelu N, Quan Tran, Verena Kaynig-Fittkau, Ani Nenkova, Dinesh Manocha, Vlad I. Morariu

Experiments show that our approach outperforms competitive baselines by 10-15% on three diverse datasets of forms and mobile app screen layouts for the tasks of spatial region classification, higher-order group identification, layout hierarchy extraction, reading order detection, and word grouping.

Reading Order Detection

MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding

no code implementations • 27 Nov 2022 • Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu

In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features.

Unified Pretraining Framework for Document Understanding

no code implementations • 22 Apr 2022 • Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, Rajiv Jain, Ani Nenkova, Tong Sun

Document intelligence automates the extraction of information from documents and supports many business applications.

Ranked #7 on Document Layout Analysis on PubLayNet val

Document Layout Analysis document understanding +1

SelfDoc: Self-Supervised Document Representation Learning

no code implementations • CVPR 2021 • Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu

For downstream usage, we propose a novel modality-adaptive attention mechanism for multimodal feature fusion by adaptively emphasizing language and vision signals.

Representation Learning

RPCL: A Framework for Improving Cross-Domain Detection with Auxiliary Tasks

no code implementations • 18 Apr 2021 • Kai Li, Curtis Wigington, Chris Tensmeyer, Vlad I. Morariu, Handong Zhao, Varun Manjunatha, Nikolaos Barmpalios, Yun Fu

Contrasted with prior work, this paper provides a complementary solution to align domains by learning the same auxiliary tasks in both domains simultaneously.

Black-box Explanation of Object Detectors via Saliency Maps

2 code implementations • CVPR 2021 • Vitali Petsiuk, Rajiv Jain, Varun Manjunatha, Vlad I. Morariu, Ashutosh Mehra, Vicente Ordonez, Kate Saenko

We propose D-RISE, a method for generating visual explanations for the predictions of object detectors.

Object object-detection +1

Cross-Domain Document Object Detection: Benchmark Suite and Method

1 code implementation • CVPR 2020 • Kai Li, Curtis Wigington, Chris Tensmeyer, Handong Zhao, Nikolaos Barmpalios, Vlad I. Morariu, Varun Manjunatha, Tong Sun, Yun Fu

We establish a benchmark suite consisting of different types of PDF document datasets that can be utilized for cross-domain DOD model training and evaluation.

object-detection Object Detection

Referring to Objects in Videos using Spatio-Temporal Identifying Descriptions

no code implementations • WS 2019 • Peratham Wiriyathammabhum, Abhinav Shrivastava, Vlad I. Morariu, Larry S. Davis

This paper presents a new task, the grounding of spatio-temporal identifying descriptions in videos.

Learning Rich Features for Image Manipulation Detection

2 code implementations • CVPR 2018 • Peng Zhou, Xintong Han, Vlad I. Morariu, Larry S. Davis

Image manipulation detection is different from traditional semantic object detection because it pays more attention to tampering artifacts than to image content, which suggests that richer features need to be learned.

Image Manipulation Image Manipulation Detection +3

Fused Deep Neural Networks for Efficient Pedestrian Detection

no code implementations • 2 May 2018 • Xianzhi Du, Mostafa El-Khamy, Vlad I. Morariu, Jungwon Lee, Larry Davis

The classification system further classifies the generated candidates based on opinions of multiple deep verification networks and a fusion network which utilizes a novel soft-rejection fusion method to adjust the confidence in the detection results.

Ensemble Learning General Classification +2

Layout-induced Video Representation for Recognizing Agent-in-Place Actions

no code implementations • ICCV 2019 • Ruichi Yu, Hongcheng Wang, Ang Li, Jingxiao Zheng, Vlad I. Morariu, Larry S. Davis

We address the recognition of agent-in-place actions, which are associated with agents who perform them and places where they occur, in the context of outdoor home surveillance.

Two-Stream Neural Networks for Tampered Face Detection

no code implementations • 29 Mar 2018 • Peng Zhou, Xintong Han, Vlad I. Morariu, Larry S. Davis

We propose a two-stream network for face tampering detection.

Face Detection Face Swapping +2

NISP: Pruning Networks using Neuron Importance Score Propagation

no code implementations • CVPR 2018 • Ruichi Yu, Ang Li, Chun-Fu Chen, Jui-Hsin Lai, Vlad I. Morariu, Xintong Han, Mingfei Gao, Ching-Yung Lin, Larry S. Davis

In contrast, we argue that it is essential to prune neurons in the entire neural network jointly based on a unified goal: minimizing the reconstruction error of important responses in the "final response layer" (FRL), which is the second-to-last layer before classification, for a pruned network to retain its predictive power.

Network Pruning

Dynamic Zoom-in Network for Fast Object Detection in Large Images

no code implementations • CVPR 2018 • Mingfei Gao, Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects with varied sizes appear in high resolution images.

object-detection Real-Time Object Detection

C-WSL: Count-guided Weakly Supervised Localization

no code implementations • ECCV 2018 • Mingfei Gao, Ang Li, Ruichi Yu, Vlad I. Morariu, Larry S. Davis

We introduce count-guided weakly supervised localization (C-WSL), an approach that uses per-class object count as a new form of supervision to improve weakly supervised localization (WSL).

Object

Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation

no code implementations • ICCV 2017 • Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis

Understanding visual relationships involves identifying the subject, the object, and a predicate relating them.

Ranked #1 on Visual Relationship Detection on VRD Predicate Detection

Knowledge Distillation Relationship Detection +1

Generalized Deep Image to Image Regression

1 code implementation • CVPR 2017 • Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis

We present a Deep Convolutional Neural Network architecture which serves as a generic image-to-image regressor that can be trained end-to-end without any further machinery.

Colorization Denoising +1

Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition

1 code implementation • CVPR 2018 • Yaming Wang, Vlad I. Morariu, Larry S. Davis

Compared to earlier multistage frameworks using CNN features, recent end-to-end deep approaches for fine-grained recognition essentially enhance the mid-level learning capability of CNNs.

Ranked #20 on Fine-Grained Image Classification on CUB-200-2011

Representation Learning

Generating Holistic 3D Scene Abstractions for Text-based Image Retrieval

no code implementations • CVPR 2017 • Ang Li, Jin Sun, Joe Yue-Hei Ng, Ruichi Yu, Vlad I. Morariu, Larry S. Davis

Since interactions between objects can be reduced to a limited set of atomic spatial relations in 3D, we study the possibility of inferring 3D structure from a text description rather than an image, applying physical relation models to synthesize holistic 3D abstract object layouts satisfying the spatial constraints present in a textual description.

Image Retrieval Object +3

The Role of Context Selection in Object Detection

no code implementations • 9 Sep 2016 • Ruichi Yu, Xi Chen, Vlad I. Morariu, Larry S. Davis

We investigate the reasons why context in object detection has limited utility by isolating and evaluating the predictive power of different context cues under ideal conditions in which context is provided by an oracle.

Object object-detection +1

Modeling Context Between Objects for Referring Expression Understanding

1 code implementation • 1 Aug 2016 • Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis

Our approach uses an LSTM to learn the probability of a referring expression, with input features from a region and a context region.

Multiple Instance Learning Object +1

Mining Discriminative Triplets of Patches for Fine-Grained Classification

no code implementations • CVPR 2016 • Yaming Wang, Jonghyun Choi, Vlad I. Morariu, Larry S. Davis

Fine-grained classification involves distinguishing between similar sub-categories based on subtle differences in highly localized regions; therefore, accurate localization of discriminative regions remains a major challenge.

Classification General Classification

VRFP: On-the-fly Video Retrieval using Web Images and Fast Fisher Vector Products

no code implementations • 10 Dec 2015 • Xintong Han, Bharat Singh, Vlad I. Morariu, Larry S. Davis

VRFP is a real-time video retrieval framework based on short text input queries, which obtains weakly labeled training images from the web after the query is known.

Re-Ranking Retrieval +2

Searching for Objects using Structure in Indoor Scenes

no code implementations • 24 Nov 2015 • Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis

However, we can use structure in the scene to search for objects without processing the entire image.

Imitation Learning Object

Selecting Relevant Web Trained Concepts for Automated Event Retrieval

no code implementations • ICCV 2015 • Bharat Singh, Xintong Han, Zhe Wu, Vlad I. Morariu, Larry S. Davis

Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos.

Domain Adaptation Retrieval

Planar Structure Matching Under Projective Uncertainty for Geolocation

no code implementations • ECCV 2014 • Ang Li, Vlad I. Morariu, Larry S. Davis

Image based geolocation aims to answer the question: where was this ground photograph taken?

Geometric Matching

Automatic online tuning for fast Gaussian summation

no code implementations • NeurIPS 2008 • Vlad I. Morariu, Balaji V. Srinivasan, Vikas C. Raykar, Ramani Duraiswami, Larry S. Davis

To solve the second problem, we present an online tuning approach that results in a black box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth.

BIG-bench Machine Learning
