no code implementations • 28 May 2021 • Chang-Hwan Son, Pung-Hwi Ye
To achieve this, a target encoder is initially trained in an encoder-decoder framework to associate visual features with semantic words.
Decoder Image Captioning +1