3 Mar 2024 • Yunzhuo Sun, Yifang Xu, Zien Xie, Yukun Shu, Sidan Du
First, MiniGPT-4 is employed to generate a detailed description of the video frame and to rewrite the query statement; both outputs are then fed into the encoder as new features.
Decoder • Highlight Detection