Search Results for author: Davide Moltisanti

Found 13 papers, 8 papers with code

Coarse or Fine? Recognising Action End States without Labels

1 code implementation • 13 May 2024 • Davide Moltisanti, Hakan Bilen, Laura Sevilla-Lara, Frank Keller

We use our synthetic data to train a model based on UNet and test it on real images showing coarsely/finely cut objects.

Action Recognition Object

Paper
Code

Efficient Pre-training for Localized Instruction Generation of Videos

no code implementations • 27 Nov 2023 • Anil Batra, Davide Moltisanti, Laura Sevilla-Lara, Marcus Rohrbach, Frank Keller

The resulting dataset is three orders of magnitude smaller than current web-scale datasets but enables efficient training of large-scale models.

Paper
Add Code

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

1 code implementation • CVPR 2023 • Davide Moltisanti, Frank Keller, Hakan Bilen, Laura Sevilla-Lara

The goal of this work is to understand the way actions are performed in videos.

Ranked #2 on Video-Adverb Retrieval on HowTo100M Adverbs

Video-Adverb Retrieval

Paper
Code

An Action Is Worth Multiple Words: Handling Ambiguity in Action Recognition

2 code implementations • 10 Oct 2022 • Kiyoon Kim, Davide Moltisanti, Oisin Mac Aodha, Laura Sevilla-Lara

In practice, a given video can contain multiple valid positive annotations for the same action.

Action Recognition valid

Paper
Code

BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis

1 code implementation • 20 Jul 2022 • Davide Moltisanti, Jinyi Wu, Bo Dai, Chen Change Loy

Estimating human keypoints from these videos is difficult due to the complexity of the dance, as well as the multiple moving cameras recording setup.

Motion Synthesis Pose Estimation

Paper
Code

Rescaling Egocentric Vision

7 code implementations • 23 Jun 2020 • Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Antonino Furnari, Evangelos Kazakos, Jian Ma, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray

This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS.

Ranked #7 on Action Anticipation on EPIC-KITCHENS-100

Action Anticipation Action Detection +4

118

Paper
Code

The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines

2 code implementations • 29 Apr 2020 • Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Sanja Fidler, Antonino Furnari, Evangelos Kazakos, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray

Our dataset features 55 hours of video consisting of 11. 5M frames, which we densely labelled for a total of 39. 6K action segments and 454. 2K object bounding boxes.

Object

Paper
Code

Action Recognition from Single Timestamp Supervision in Untrimmed Videos

1 code implementation • CVPR 2019 • Davide Moltisanti, Sanja Fidler, Dima Damen

We propose a method that is supervised by single timestamps located around each action instance, in untrimmed videos.

Action Recognition Temporal Action Localization

Paper
Code

Towards an Unequivocal Representation of Actions

no code implementations • 10 May 2018 • Michael Wray, Davide Moltisanti, Dima Damen

This work introduces verb-only representations for actions and interactions; the problem of describing similar motions (e. g. 'open door', 'open cupboard'), and distinguish differing ones (e. g. 'open door' vs 'open bottle') using verb-only labels.

Action Recognition Retrieval +1

Paper
Add Code

Scaling Egocentric Vision: The EPIC-KITCHENS Dataset

2 code implementations • ECCV 2018 • Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Sanja Fidler, Antonino Furnari, Evangelos Kazakos, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray

First-person vision is gaining interest as it offers a unique viewpoint on people's interaction with objects, their attention, and even intention.

Ranked #6 on Action Anticipation on EPIC-KITCHENS-55 (Unseen test set (S2)

Action Anticipation

148

Paper
Code

Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

no code implementations • ICCV 2017 • Davide Moltisanti, Michael Wray, Walterio Mayol-Cuevas, Dima Damen

Manual annotations of temporal bounds for object interactions (i. e. start and end times) are typical training input to recognition, localization and detection algorithms.

Object

Paper
Add Code

Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition

no code implementations • 24 Mar 2017 • Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen

This work deviates from easy-to-define class boundaries for object interactions.

General Classification Object

Paper
Add Code

SEMBED: Semantic Embedding of Egocentric Action Videos

no code implementations • 28 Jul 2016 • Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen

We present SEMBED, an approach for embedding an egocentric object interaction video in a semantic-visual graph to estimate the probability distribution over its potential semantic labels.

General Classification Object

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.