Search Results for author: Quan Tran

Found 17 papers, 5 papers with code

Transfer Learning and Prediction Consistency for Detecting Offensive Spans of Text

no code implementations • Findings (ACL) 2022 • Amir Pouran Ben Veyseh, Ning Xu, Quan Tran, Varun Manjunatha, Franck Dernoncourt, Thien Nguyen

Toxic span detection is the task of recognizing offensive spans in a text snippet.

Transfer Learning

Paper
Add Code

DocTime: A Document-level Temporal Dependency Graph Parser

no code implementations • NAACL 2022 • Puneet Mathur, Vlad Morariu, Verena Kaynig-Fittkau, Jiuxiang Gu, Franck Dernoncourt, Quan Tran, Ani Nenkova, Dinesh Manocha, Rajiv Jain

We introduce DocTime - a novel temporal dependency graph (TDG) parser that takes as input a text document and produces a temporal dependency graph.

Paper
Add Code

Multimodal Intent Discovery from Livestream Videos

no code implementations • Findings (NAACL) 2022 • Adyasha Maharana, Quan Tran, Franck Dernoncourt, Seunghyun Yoon, Trung Bui, Walter Chang, Mohit Bansal

We construct and present a new multimodal dataset consisting of software instructional livestreams and containing manual annotations for both detailed and abstract procedural intent that enable training and evaluation of joint video and text understanding models.

Intent Discovery Video Summarization +1

Paper
Add Code

Fine-tuning CLIP Text Encoders with Two-step Paraphrasing

no code implementations • 23 Feb 2024 • Hyunjae Kim, Seunghyun Yoon, Trung Bui, Handong Zhao, Quan Tran, Franck Dernoncourt, Jaewoo Kang

Contrastive language-image pre-training (CLIP) models have demonstrated considerable success across various vision-language tasks, such as text-to-image retrieval, where the model is required to effectively process natural language input to produce an accurate visual output.

Image Captioning Image Retrieval +3

Paper
Add Code

Multi-Modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation

no code implementations • 30 Nov 2023 • Linzi Xing, Quan Tran, Fabian Caba, Franck Dernoncourt, Seunghyun Yoon, Zhaowen Wang, Trung Bui, Giuseppe Carenini

Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks.

Contrastive Learning Segmentation +2

Paper
Add Code

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

no code implementations • 24 Jul 2023 • Viet Dac Lai, Abel Salinas, Hao Tan, Trung Bui, Quan Tran, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen

Punctuation restoration is an important task in automatic speech recognition (ASR) which aim to restore the syntactic structure of generated ASR texts to improve readability.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Generating Adversarial Examples with Task Oriented Multi-Objective Optimization

1 code implementation • 26 Apr 2023 • Anh Bui, Trung Le, He Zhao, Quan Tran, Paul Montague, Dinh Phung

The key factor for the success of adversarial training is the capability to generate qualified and divergent adversarial examples which satisfy some objectives/goals (e. g., finding adversarial examples that maximize the model losses for simultaneously attacking multiple models).

Paper
Code

LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents

no code implementations • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023 • Puneet Mathur, Rajiv Jain, Ashutosh Mehra, Jiuxiang Gu, Franck Dernoncourt, Anandhavelu N, Quan Tran, Verena Kaynig-Fittkau, Ani Nenkova, Dinesh Manocha, Vlad I. Morariu

Experiments show that our approach outperforms competitive baselines by 10-15% on three diverse datasets of forms and mobile app screen layouts for the tasks of spatial region classification, higher-order group identification, layout hierarchy extraction, reading order detection, and word grouping.

Reading Order Detection

Paper
Add Code

A Unified Wasserstein Distributional Robustness Framework for Adversarial Training

1 code implementation • ICLR 2022 • Tuan Anh Bui, Trung Le, Quan Tran, He Zhao, Dinh Phung

We introduce a new Wasserstein cost function and a new series of risk functions, with which we show that standard AT methods are special cases of their counterparts in our framework.

Paper
Code

Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images

1 code implementation • ICCV 2021 • Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, Alan Yuille

Our experiments show CCO substantially boosts the performance of neural symbolic methods on real images.

Question Answering Visual Question Answering

Paper
Code

Learning to Predict Visual Attributes in the Wild

no code implementations • CVPR 2021 • Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava

In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances.

Attribute Contrastive Learning +2

Paper
Add Code

Explain by Evidence: An Explainable Memory-based Neural Network for Question Answering

no code implementations • COLING 2020 • Quan Tran, Nhan Dam, Tuan Lai, Franck Dernoncourt, Trung Le, Nham Le, Dinh Phung

Interpretability and explainability of deep neural networks are challenging due to their scale, complexity, and the agreeable notions on which the explaining process rests.

Question Answering

Paper
Add Code

Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

1 code implementation • ECCV 2020 • Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li

We propose a novel algorithm, named Open-Edit, which is the first attempt on open-domain image manipulation with open-vocabulary instructions.

Decoder Image Manipulation

Paper
Code

Context-Aware Group Captioning via Self-Attention and Contrastive Features

no code implementations • CVPR 2020 • Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.

Image Captioning