The ARC Direct Answer Questions (ARC-DA) dataset consists of 2,985 grade-school level, direct-answer ("open response", "free form") science questions derived from the ARC multiple-choice question set released as part of the AI2 Reasoning Challenge in 2018.
How the dataset was built
The ARC Easy and ARC Challenge question sets from the original dataset were combined and then filtered and modified through the following process:
- Turking: Each multiple-choice question was presented as a direct-answer question to five crowdsourced workers to gather additional answers.
- Heuristic filtering: Questions were filtered using the following heuristics:
- Questions with at least a threshold number of turker answers, as a proxy for the concreteness of the question.
- Questions with at least two turker-provided answers sharing word overlap, as a measure of confidence in the correctness of the answers and of the straightforwardness of the question.
- Other heuristics to identify questions that only make sense in multiple-choice form, such as questions starting with the phrase “Which of the following”.
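The heuristic filters above could be sketched roughly as follows. This is a minimal illustration only: the threshold value, the word-overlap check, and the phrase list are assumptions for the sketch, not the actual values or implementation used to build the dataset.

```python
# Illustrative sketch of the three heuristic filters described above.
# MIN_ANSWERS and MC_ONLY_PHRASES are assumed values, not the dataset
# authors' actual configuration.

MC_ONLY_PHRASES = ("which of the following", "which of these")
MIN_ANSWERS = 3  # assumed threshold, as a proxy for concreteness


def word_set(text):
    """Lowercased word set, for a crude word-overlap check."""
    return set(text.lower().split())


def answers_overlap(answers):
    """True if at least two answers share at least one word."""
    for i, a in enumerate(answers):
        for b in answers[i + 1:]:
            if word_set(a) & word_set(b):
                return True
    return False


def keep_question(question, turker_answers):
    """Apply all three heuristic filters to a single question."""
    # 1. Enough turker answers to suggest the question is concrete.
    if len(turker_answers) < MIN_ANSWERS:
        return False
    # 2. At least two answers agree (overlap in wording).
    if not answers_overlap(turker_answers):
        return False
    # 3. Drop questions that only make sense as multiple choice.
    q = question.lower()
    if any(q.startswith(p) for p in MC_ONLY_PHRASES):
        return False
    return True
```

For example, a question opening with “Which of the following” would be dropped regardless of its answers, while a concrete question with several overlapping turker answers would be kept.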
- Further manual vetting: In-house volunteers did another vetting pass in which they:
- Marked for removal highly open-ended questions with too many possible answers, such as “Name an insect”, as well as otherwise invalid questions.
- Removed some of the bad answers gathered from turking.
- Reworded questions to better suit the direct-answer format. For example, a question such as “What element is contained in table salt?”, which makes sense as a multiple-choice question, needs to be reworded to something like “Name an element present in table salt”.
- Added any additional answers they could think of that were not among the turker-provided answers.