Paper tables with annotated results for Evaluation of Text Generation: A Survey

Paper

Evaluation of Text Generation: A Survey

The paper surveys evaluation methods for natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two examples of task-specific NLG evaluation, for automatic text summarization and long text generation, and conclude the paper by proposing future research directions.
