TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Math Word Problem Solving	MATH	Llemma-34B-KPMath-Plus	Accuracy	48.6	# 32
Math Word Problem Solving	MATH	Llemma-34B-KPMath-Plus	Parameters (Billions)	34	# 27
Math Word Problem Solving	MATH	Llama2-13B-KPMath-Plus	Accuracy	41	# 53
Math Word Problem Solving	MATH	Llama2-13B-KPMath-Plus	Parameters (Billions)	13	# 39
Math Word Problem Solving	MATH	DeepSeekMath-7B-KPMath-Plus	Accuracy	48.8	# 30
Math Word Problem Solving	MATH	DeepSeekMath-7B-KPMath-Plus	Parameters (Billions)	7	# 58
Math Word Problem Solving	MATH	Mistral-7B-KPMath-Plus	Accuracy	46.8	# 37
Math Word Problem Solving	MATH	Mistral-7B-KPMath-Plus	Parameters (Billions)	7	# 58

Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning

4 Mar 2024 · Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen

Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality and reasoning-focused training datasets. Addressing this challenge, we propose Key-Point-Driven Data Synthesis (KPDDS), a novel data synthesis framework that synthesizes question-answer pairs by leveraging key points and exemplar practices from authentic data sources. KPDDS ensures the generation of novel questions with rigorous quality control and substantial scalability. As a result, we present KPMath, an extensive synthetic dataset tailored for mathematical reasoning, comprising over 800K question-answer pairs. Utilizing KPMath and augmenting it with additional reasoning-intensive corpora, we create the comprehensive KPMath-Plus dataset. The Qwen1.5-72B model, fine-tuned on KPMath-Plus, achieves 87.0% PASS@1 accuracy on GSM8K and 58.3% on MATH, surpassing competitors in the 7B to 70B range and the best commercial models like GPT-4 across multiple math reasoning datasets.
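For readers unfamiliar with the metric, the PASS@1 accuracy quoted in the abstract can be read as the fraction of problems whose single generated answer matches the reference. The sketch below is only an illustration under that reading: the function names `normalize` and `pass_at_1`, the string normalization rules, and the sample answers are all assumptions made here, not the paper's evaluation code.

```python
def normalize(answer: str) -> str:
    """Hypothetical normalization: trim whitespace, drop a trailing period
    and thousands separators before comparing final answers."""
    return answer.strip().rstrip(".").replace(",", "")


def pass_at_1(predictions: list[str], references: list[str]) -> float:
    """Fraction of problems whose single prediction matches the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one problem")
    correct = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return correct / len(references)


# Made-up answers for four problems; three of the four match.
predictions = ["42", " 3.5 ", "1,000", "7"]
references = ["42", "3.5", "1000", "8"]
score = pass_at_1(predictions, references)
print(f"pass@1 = {score:.2%}")
```

Real evaluations of MATH-style answers usually need more careful equivalence checks (for example, comparing fractions or LaTeX expressions), which this exact-match sketch does not attempt.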

Code

No code implementations yet.

Tasks

GSM8K

Math

Mathematical Reasoning

Math Word Problem Solving

Datasets

GSM8K

MATH

SVAMP

ASDiv

MAWPS

Results from the Paper

Ranked #30 on Math Word Problem Solving on MATH

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • GPT-4 • Layer Normalization • Scaled Dot-Product Attention • Softmax • Transformer
