January 21, 2022
Measurement Incorporated, the University of Massachusetts-Amherst, Cambium Assessment, and the University of Duisburg-Essen win scoring challenge for NAEP Reading items using advanced natural language processing
WASHINGTON (January 21, 2022)—The National Center for Education Statistics (NCES), which administers the National Assessment of Educational Progress (NAEP), also known as The Nation’s Report Card, awarded four grand prizes and recognized four runner-up teams in its first automated scoring challenge. The winners used advanced natural language processing methods that promise to reduce scoring costs while maintaining accuracy similar to human scoring. Those honored described their technical approaches prior to scoring and met requirements for transparency, interpretability, and fairness.
“The winning approaches represent current best practices in natural language processing and demonstrate evidence of similar reliability to human scoring with certain types of items,” said Peggy G. Carr, NCES Commissioner. “All of the winning teams conducted fairness analyses showing that their models were not biased due to demographics or family background. These results suggest a promising path for NAEP to use automated scoring in the near future.”
The challenge had two parts: item-specific, where competitors created a separate model for each item, and generic, where teams applied a single model built from responses to other items. Challenge winners came from different sectors, including assessment companies, university researchers, and a student team.
Natural language processing uses computer algorithms to identify patterns in language; automated scoring applies these patterns to analyze student responses and assign scores. Those scores are then compared to the scores that human graders gave each response. The most accurate submissions used advanced machine learning approaches based on what are called “transformer network architectures,” such as BERT (or “Bidirectional Encoder Representations from Transformers”). These models used NAEP data to fine-tune pre-trained language models that were created by analyzing language consistencies and patterns across billions of words of general text.
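As an illustration of the comparison step described above, the following minimal Python sketch measures how closely automated scores match human scores. The announcement does not say which agreement statistic the challenge used; exact agreement and quadratic weighted kappa are shown here only as an assumption, because both are common in automated scoring research. The function names and the sample scores are hypothetical.

```python
from collections import Counter


def exact_agreement(human, machine):
    """Share of responses where the automated score equals the human score."""
    if not human or len(human) != len(machine):
        raise ValueError("score lists must be non-empty and the same length")
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Chance-corrected agreement that penalizes large disagreements more.

    Scores are assumed to be integers in [min_score, max_score].
    """
    n = len(human)
    if n == 0 or n != len(machine):
        raise ValueError("score lists must be non-empty and the same length")
    k = max_score - min_score + 1  # number of score categories
    if k < 2:
        raise ValueError("need at least two score categories")

    observed = Counter(zip(human, machine))  # joint counts of (human, machine)
    human_totals = Counter(human)            # marginal counts for human scores
    machine_totals = Counter(machine)        # marginal counts for machine scores

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * human_totals[i] * machine_totals[j] / n ** 2

    if weighted_expected == 0:
        # Both raters gave every response the same single score.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical scores on a 0-2 scale for eight responses.
    human_scores = [0, 1, 2, 2, 1, 0, 2, 1]
    machine_scores = [0, 1, 2, 1, 1, 0, 2, 2]
    print("exact agreement:", exact_agreement(human_scores, machine_scores))
    print("quadratic weighted kappa:",
          quadratic_weighted_kappa(human_scores, machine_scores, 0, 2))
```

A kappa of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. The same statistic computed between two human graders gives a natural baseline for judging an automated scorer.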
“This challenge illustrates the importance of using current machine learning approaches for NAEP’s technological infrastructure,” said Mark Schneider, Director of the Institute of Education Sciences, which includes NCES. Director Schneider also highlighted the importance of transparency. “The awardees and runners-up used cutting-edge approaches that they were willing to describe in detail,” he stated. “We did not accept ‘black box’ solutions.”
This challenge is a key component in modernization efforts to incorporate data science and machine learning into operational activities at NCES. It is the first in a series of challenges that use NAEP data. Participants were eligible to win up to $15,000 in prizes.
Grand prize winners:
Arianto Wibowo, Measurement Incorporated (Item-Specific Model)
Andrew Lan, UMass-Amherst (Item-Specific Model)
Susan Lottridge, Cambium Assessment (Item-Specific Model)
Torsten Zesch, University of Duisburg-Essen (Generic Model)
Runners-up:
Fabian Zehner, DIPF | Leibniz Institute for Research and Information in Education, Centre for Technology-Based Assessment (Item-Specific Model)
Scott Crossley, Georgia State University (Item-Specific Model)
Prathic Sundararajan, Georgia Institute of Technology and Suraj Rajendran, Weill Cornell Medical College (Item-Specific Model)
Susan Lottridge, Cambium Assessment (Generic Model)
Additional details on the challenge are available at https://www.challenge.gov/?challenge=naep-automated-scoring-challenge
The Institute of Education Sciences (IES) is the independent and nonpartisan statistics, research, and evaluation arm of the U.S. Department of Education. Its mission is to provide scientific evidence on which to ground education practice and policy and to share this information in formats that are useful and accessible to educators, parents, policymakers, researchers, and the public. Learn more at ies.ed.gov/.
The National Center for Education Statistics, a principal agency of the U.S. Federal Statistical System, is the statistical center of the U.S. Department of Education and the primary federal entity for collecting and analyzing data related to education in the U.S. and other nations. NCES fulfills a congressional mandate to collect, collate, and report complete statistics on the condition of American education; conduct and publish reports; and review and report on education activities internationally.
The National Assessment of Educational Progress (NAEP) is a congressionally authorized project sponsored by the U.S. Department of Education. The National Center for Education Statistics, within the Institute of Education Sciences, administers NAEP. The commissioner of the National Center for Education Statistics is responsible by law for carrying out the NAEP project. Policy for the NAEP program is set by the National Assessment Governing Board (NAGB), an independent, bipartisan board whose members include governors, state legislators, local and state school officials, educators, business representatives, and members of the general public.