Exploring Target Driven Image Classification

1 Jan 2021  ·  Aditya Singh, Alessandro Bay, Andrea Mirabile

For a given image, traditional supervised image classification using deep neural networks is akin to answering the question 'What object category does this image belong to?'. The model takes an image as input and produces the most likely label for it. However, there is an alternate way to arrive at the final answer, which we investigate in this paper. We argue that, for any arbitrary category $\mathit{\tilde{y}}$, the composed question 'Is this an image of object category $\mathit{\tilde{y}}$?' serves as a viable approach to image classification via deep neural networks. The difference lies in the additional information supplied in the form of the target category along with the image. Motivated by the curiosity to unravel the advantages and limitations of this approach, we propose Indicator Neural Networks (INNs). An INN takes an image-label pair as input and produces a boolean response. It consists of two encoding components, a label encoder and an image encoder, which learn latent representations for labels and images respectively. The predictor, the third component, combines the learnt label and image representations to make the final yes/no prediction. The network is trained end-to-end. We perform evaluations on image classification and fine-grained image classification datasets against strong baselines. We also investigate the various components of INNs to understand their contributions to the final prediction of the model. Our probing of the modules reveals that, as opposed to its traditionally trained deep counterpart, an INN attends to much larger regions of the input image when generating the image features. The generated image features are further refined by the label encoding prior to the final prediction.
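The abstract describes a three-part architecture: a label encoder, an image encoder, and a predictor that fuses their outputs into a yes/no answer. The following is a minimal, illustrative PyTorch-style sketch of that idea, not the paper's implementation; the class name `IndicatorNetwork`, the small CNN backbone, the embedding-based label encoder, the layer sizes, and the concatenation-based fusion in the predictor are all assumptions made for the example.

```python
# Hedged sketch of an indicator-style classifier: given (image, candidate label),
# output a logit for "yes, the image belongs to this category".
# Architecture details below are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn


class IndicatorNetwork(nn.Module):
    def __init__(self, num_classes: int, embed_dim: int = 128):
        super().__init__()
        # Label encoder: maps a candidate category index to a latent vector.
        self.label_encoder = nn.Embedding(num_classes, embed_dim)
        # Image encoder: a small CNN producing an image representation
        # (any backbone could be substituted here).
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Predictor: combines the two representations into a single yes/no logit.
        self.predictor = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1),
        )

    def forward(self, image: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
        z_img = self.image_encoder(image)   # (B, embed_dim)
        z_lbl = self.label_encoder(label)   # (B, embed_dim)
        logit = self.predictor(torch.cat([z_img, z_lbl], dim=-1))
        return logit.squeeze(-1)            # higher => "yes, it is this category"


# Usage: ask "is this an image of category 3?" for a batch of two images,
# trained end-to-end with a binary cross-entropy objective.
model = IndicatorNetwork(num_classes=10)
images = torch.randn(2, 3, 64, 64)
labels = torch.tensor([3, 3])
targets = torch.tensor([1.0, 0.0])  # first image is category 3, second is not
loss = nn.BCEWithLogitsLoss()(model(images, labels), targets)
probs = torch.sigmoid(model(images, labels))
```

At inference time, classification in this formulation would amount to querying the network once per candidate category and picking the category with the highest "yes" score.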

