NAEP Technical Documentation: NAEP Scales



Content of the Subject Area Scales

Number of Items in Each Subject Area Scale

Scale Composition by Item Type for New and Trend Blocks

Definition of Composite Scales

Ranges for the Final Subject Area Scales

Correlations Among NAEP Subject Area Subscales

NAEP Item Response Theory (IRT) scales are determined a priori by grouping items into content domains for which overall performance is deemed to be of interest. The content domains are defined by NAEP frameworks, which have been the responsibility of the National Assessment Governing Board since 1988. Frameworks for some subject areas (e.g., mathematics and reading) specify multiple content-related subscales, while others (e.g., writing and all long-term trend assessment subjects) specify a single scale.

For all of the IRT scales, there is a linear indeterminacy between the values of item parameters and proficiency parameters. That is, any linear transformation of the proficiency scale, paired with the corresponding transformation of the item parameters, yields different parameter values that are mathematically equivalent. This linear indeterminacy can be resolved by setting the origin and unit size of the proficiency scale to arbitrary constants, such as a mean of 0 with a standard deviation of 1. The indeterminacy is most apparent when the scale is set for the first time.
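
The indeterminacy can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) item with discrimination a and difficulty b and omits any scaling constant; this is a simplification chosen for illustration, not a description of the specific models NAEP fits. The function names and the numeric values (including the slope and intercept) are hypothetical. Rescaling proficiency as theta* = slope × theta + intercept while setting a* = a / slope and b* = slope × b + intercept leaves the response probability unchanged.

```python
import math


def p_correct(theta, a, b):
    """Probability of a correct response under a 2PL model (illustrative)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def rescale(theta, a, b, slope, intercept):
    """Move proficiency to a new linear scale and adjust item parameters to match.

    theta* = slope * theta + intercept
    a*     = a / slope
    b*     = slope * b + intercept
    """
    return slope * theta + intercept, a / slope, slope * b + intercept


# Hypothetical values on a provisional scale (mean 0, SD 1).
theta, a, b = 0.5, 1.2, -0.3

# Hypothetical linear transformation to another scale.
theta_new, a_new, b_new = rescale(theta, a, b, slope=35.0, intercept=250.0)

p_original = p_correct(theta, a, b)
p_rescaled = p_correct(theta_new, a_new, b_new)

# The two probabilities agree up to floating-point rounding.
print(p_original, p_rescaled)
```

Because the data cannot distinguish between the two parameterizations, the origin and unit of the scale have to be fixed by convention.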

Final results for each subject area are linearly transformed from the original scale to a 0–500 or a 0–300 scale.
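
As a rough sketch of such a linear transformation, the constants can be chosen so that a set of provisional scale values takes on a target mean and standard deviation. The function name, the sample values, and the targets (150 and 35) are hypothetical and are not NAEP's actual transformation constants.

```python
from statistics import fmean, pstdev


def linear_transform_constants(values, target_mean, target_sd):
    """Return (slope, intercept) so that slope * v + intercept has the target mean and SD."""
    slope = target_sd / pstdev(values)
    intercept = target_mean - slope * fmean(values)
    return slope, intercept


# Hypothetical provisional values on the original scale.
provisional = [-1.5, -0.5, 0.0, 0.5, 1.5]

# Hypothetical reporting targets, chosen only for illustration.
slope, intercept = linear_transform_constants(provisional, target_mean=150.0, target_sd=35.0)

reported = [slope * v + intercept for v in provisional]
print(reported)
```

The transformation changes only the origin and unit of the scale; the ordering of values and their relative distances are preserved.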

When a framework specifies multiple content area scales, a composite scale is usually created from them, using the weights that the framework assigns to each content area scale.
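
A minimal sketch of a composite, assuming it is formed as a weighted average of content area scale scores with weights that sum to 1. The subscale names, scores, and weights below are hypothetical placeholders, not values from any NAEP framework.

```python
def composite_score(subscale_scores, weights):
    """Weighted average of subscale scores; weights are expected to sum to 1."""
    if set(subscale_scores) != set(weights):
        raise ValueError("scores and weights must cover the same subscales")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * subscale_scores[name] for name in weights)


# Hypothetical subscale scores and framework weights.
scores = {"subscale_a": 240.0, "subscale_b": 260.0, "subscale_c": 250.0}
weights = {"subscale_a": 0.5, "subscale_b": 0.3, "subscale_c": 0.2}

composite = composite_score(scores, weights)
print(composite)
```

Checking that the weights sum to 1 keeps the composite on the same scale as the content area scales it combines.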

Last updated 02 November 2022 (SK)