CCPM (Chinese Classical Poetry Matching)

Introduced by Li et al. in CCPM: A Chinese Classical Poetry Matching Dataset

Introduction

CCPM is a large Chinese classical poetry matching dataset that can be used for poetry matching, understanding and translation.

The main task of this dataset is: given a description in modern Chinese, the model must select, from four candidate lines of Chinese classical poetry, the one line that best matches the description semantically.
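
Framed this way, the task amounts to scoring each of the four candidates against the description and returning the index of the highest score. The sketch below illustrates that selection step only. The character-overlap scorer is a toy stand-in chosen for illustration, not the method of Li et al., and the description and candidate lines are made-up examples rather than dataset instances.

```python
from typing import Callable, Sequence


def char_overlap(description: str, candidate: str) -> float:
    """Toy scorer (an assumption, not the paper's method): the fraction of
    the candidate's distinct characters that also occur in the description."""
    cand_chars = set(candidate)
    if not cand_chars:
        return 0.0
    return len(cand_chars & set(description)) / len(cand_chars)


def pick_candidate(
    description: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float] = char_overlap,
) -> int:
    """Return the index of the best-scoring candidate (first one on ties)."""
    scores = [score(description, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


# Illustrative input only; not taken from the dataset.
description = "明亮的月光洒在床前"
candidates = ["床前明月光", "春眠不觉晓", "白日依山尽", "红豆生南国"]
predicted = pick_candidate(description, candidates)
```

Any function mapping a (description, candidate) pair to a number can be passed as `score`, so a trained matching model could replace the toy scorer without changing the selection logic.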

Size

It contains 27,218 instances in total, which are split into training (21,778), validation (2,720) and test (2,720) sets.
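
As a quick consistency check, the three split sizes sum to the stated total and correspond to roughly an 80/10/10 split. The snippet uses only the counts given above.

```python
# Split sizes as stated in this section.
splits = {"train": 21778, "validation": 2720, "test": 2720}
total = sum(splits.values())

# Share of each split relative to the whole dataset.
shares = {name: count / total for name, count in splits.items()}

for name, count in splits.items():
    print(f"{name}: {count} ({shares[name]:.1%})")
```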

Format

Each instance is composed of translation (the description in modern Chinese, a string), choice (four candidate lines of Chinese classical poetry, a list) and answer (the index of the correct line, an integer between 0 and 3).
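
A minimal sketch of how one instance could be parsed and checked against this description, with an accuracy helper for evaluation. It assumes each instance is stored as one JSON object per line and that the keys are spelled exactly as written above (`translation`, `choice`, `answer`); both are assumptions that should be checked against the files in the source repository. The sample record is invented for illustration.

```python
import json
from typing import Sequence


def parse_instance(line: str) -> dict:
    """Parse one JSON-encoded instance and check it against the described format."""
    item = json.loads(line)
    translation = item["translation"]
    choice = item["choice"]
    answer = item["answer"]

    if not isinstance(translation, str):
        raise ValueError("translation must be a string")
    if not (isinstance(choice, list) and len(choice) == 4
            and all(isinstance(c, str) for c in choice)):
        raise ValueError("choice must be a list of four strings")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(answer, bool) or not isinstance(answer, int) or not 0 <= answer <= 3:
        raise ValueError("answer must be an integer between 0 and 3")
    return item


def accuracy(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of instances where the predicted index equals the gold index."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Invented sample record, serialized the way the assumed file format would store it.
sample_line = json.dumps(
    {
        "translation": "明亮的月光洒在床前",
        "choice": ["床前明月光", "春眠不觉晓", "白日依山尽", "红豆生南国"],
        "answer": 0,
    },
    ensure_ascii=False,
)
instance = parse_instance(sample_line)
```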

Source: https://github.com/THUNLP-AIPoet/CCPM

Similar Datasets

AlignBench

IFEval

YACLC

CUGE

License

Unknown

Modalities

Texts

Languages

Chinese