Caption Generation

89 papers with code • 1 benchmark • 1 dataset

Benchmarks

These leaderboards are used to track progress in Caption Generation.

Dataset	Best Model
Concadia	VLIS (BLIP-2)

Libraries

Use these libraries to find Caption Generation models and implementations.

rakshithShetty/captionGAN

2 papers

Datasets

Concadia

Most implemented papers

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning • 10 Feb 2015

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

adityac94/Grad_CAM_plus_plus • 30 Oct 2017

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems.

Recurrent Neural Network Regularization

wojzaremba/lstm • 8 Sep 2014

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units.

Microsoft COCO Captions: Data Collection and Evaluation Server

tylin/coco-caption • 1 Apr 2015

In this paper we describe the Microsoft COCO Caption dataset and evaluation server.

Where to put the Image in an Image Caption Generator

mtanti/where-image2 • 27 Mar 2017

When a recurrent neural network language model is used for caption generation, the image information can be fed to the neural network either by directly incorporating it in the RNN (conditioning the language model by 'injecting' image features) or in a layer following the RNN (conditioning the language model by 'merging' image features).
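
The two placements described in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy tanh RNN, the helper names (matvec, rand_matrix, rnn_step, inject_features, merge_features), the vector sizes, and the choice to inject the image at every time step by concatenation are all hypothetical and chosen only to show where the image features enter.

```python
import math
import random


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    # Small random weights; a stand-in for trained parameters.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_step(W_in, W_rec, x, h):
    # One step of a toy tanh RNN: h' = tanh(W_in x + W_rec h).
    a = matvec(W_in, x)
    b = matvec(W_rec, h)
    return [math.tanh(u + v) for u, v in zip(a, b)]


def inject_features(word_vecs, image_vec, W_in, W_rec, hidden):
    # Inject (assumed variant): the image vector is concatenated to every
    # word vector, so the RNN state depends on both words and image.
    h = [0.0] * hidden
    for w in word_vecs:
        h = rnn_step(W_in, W_rec, w + image_vec, h)
    return h


def merge_features(word_vecs, image_vec, W_in, W_rec, hidden):
    # Merge: the RNN encodes the words only; the image vector is joined
    # with the final RNN state in a layer that follows the RNN.
    h = [0.0] * hidden
    for w in word_vecs:
        h = rnn_step(W_in, W_rec, w, h)
    return h + image_vec


if __name__ == "__main__":
    rng = random.Random(0)
    words = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    image = [0.5, -0.2, 0.1]
    W_rec = rand_matrix(5, 5, rng)
    # Inject needs input weights sized for word + image (4 + 3 columns).
    print(len(inject_features(words, image, rand_matrix(5, 7, rng), W_rec, 5)))
    # Merge needs input weights sized for the word vector only (4 columns).
    print(len(merge_features(words, image, rand_matrix(5, 4, rng), W_rec, 5)))
```

In this sketch the merge output keeps the RNN state independent of the image, while the inject output mixes the two inside the recurrence; a real captioner would pass either result through an output layer to predict the next word.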

Scalable Bayesian Optimization Using Deep Neural Networks

automl/pybnn • 19 Feb 2015

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations.

Sequence to Sequence -- Video to Text

nasib-ullah/video-captioning-models-in-Pytorch • 3 May 2015

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

An Actor-Critic Algorithm for Sequence Prediction

rizar/actor-critic-public • 24 Jul 2016

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL).

Deep Reinforcement Learning For Sequence to Sequence Models

yaserkl/RLSeq2Seq • 24 May 2018

In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with sequence-to-sequence models that enable remembering long-term memories.

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

google-research-datasets/conceptual-12m • CVPR 2021

The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training.
