no code implementations • 15 Mar 2024 • Xiangtian Xue, Jiasong Wu, Youyong Kong, Lotfi Senhadji, Huazhong Shu
We overcome the limitation of traditional attention mechanisms, which focus only on existing visual features, by introducing deformable feature alignment to hierarchically refine spatial positioning fused with multi-scale visual and linguistic information.
no code implementations • 14 Mar 2024 • Xiangtian Xue, Jiasong Wu, Youyong Kong, Lotfi Senhadji, Huazhong Shu
Referring object removal is the task of removing a specific object in an image referred to by a natural language expression and filling the missing region with reasonable semantics.