
NAEP Technical Documentation: Estimation of NAEP Score Scales


NAEP Scales

Item Scaling Models

Estimation of IRT Item Parameters

Treatment of Missing Responses in NAEP

Assumptions of the Item Scaling Models

Checks for Violations of Assumptions

Treatment of Items to Avoid Violations of Assumptions

NAEP Assessment IRT Parameters


NAEP score scales are used to report summary statistics that describe the performance of groups of students on the collection of items representing the content specified in the NAEP frameworks. For each subject area, the framework determines the number of Item Response Theory (IRT) scales.

IRT models describe the relationships between the item responses provided by students and the underlying score scales. IRT provides a common scale on which the performance of students who receive different blocks of items can be placed. For each item, the item parameters used in the models are estimated from student response data. Different IRT models, with different types of item parameters, are used for multiple-choice items, constructed-response items that are scored right or wrong, and constructed-response items that have more than two score categories. All of these item types contribute to NAEP score scales. Model fit is evaluated during the item parameter estimation procedure.
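As an illustration of how different item types can share one scale, the sketch below implements three common IRT response functions in Python: a three-parameter logistic (3PL) model of the kind typically applied to multiple-choice items, a two-parameter logistic (2PL) model for items scored right or wrong, and a generalized partial credit model (GPCM) for items with more than two score categories. The text above does not name the models, so this pairing, the function names, the parameter names (`a`, `b`, `c`, `d`), and the scaling constant `D = 1.7` are assumptions made for illustration; see the Item Scaling Models page for the definitions NAEP actually uses.

```python
import math

# Conventional logistic scaling constant; assumed here for illustration.
D = 1.7


def p_3pl(theta, a, b, c):
    """Probability of a correct response under a 3PL model.

    theta: student proficiency, a: discrimination,
    b: difficulty, c: lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def p_2pl(theta, a, b):
    """2PL model: the 3PL model with the lower asymptote fixed at zero."""
    return p_3pl(theta, a, b, 0.0)


def p_gpcm(theta, a, b, d):
    """Score-category probabilities under a generalized partial credit model.

    d is a list of category offsets, one per score category,
    with d[0] = 0 by convention.
    """
    # Cumulative sums of D * a * (theta - b + d_v), one per category.
    z = []
    total = 0.0
    for d_v in d:
        total += D * a * (theta - b + d_v)
        z.append(total)
    # Subtract the maximum before exponentiating for numerical stability.
    z_max = max(z)
    weights = [math.exp(v - z_max) for v in z]
    norm = sum(weights)
    return [w / norm for w in weights]


if __name__ == "__main__":
    theta = 0.5  # hypothetical proficiency value
    print(p_3pl(theta, a=1.0, b=0.0, c=0.2))
    print(p_2pl(theta, a=1.0, b=0.0))
    print(p_gpcm(theta, a=1.0, b=0.0, d=[0.0, 0.5, -0.5]))
```

Because every function takes the same `theta`, responses to all three item types provide information about the same underlying proficiency. With only two score categories and zero offsets, `p_gpcm` gives the same probability of the higher score as `p_2pl`.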

Last updated 05 October 2023 (SK)