no code implementations • 30 Nov 2023 • Kyungho Bae, Geo Ahn, Youngrae Kim, Jinwoo Choi
Disentangled action and scene representations could benefit both in-context and out-of-context video understanding.
Action Recognition • Temporal Action Localization • +1