NAEP Technical Documentation: Treatment of Items in NAEP Assessments

Items whose responses do not appear to fit the item response models used to describe the data may receive special treatment. In rare cases, such items are deleted from NAEP scales. For instance, dichotomous items with non-monotonic responses, that is, items for which the likelihood of a correct response is lower for students who do well on the other items than for students who do poorly on them, would be deleted from NAEP scales. Items that cannot be modified to fit the item response models and whose empirical and theoretical item response functions differ a great deal might also be deleted.
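
As a rough illustration (a minimal sketch, not NAEP's operational procedure), the Python code below flags a dichotomous item whose empirical response function is non-monotonic: students are grouped by their score on the remaining items, and the item is flagged if the proportion answering it correctly ever declines from a lower-scoring group to a higher-scoring one. The grouping rule and all names are illustrative assumptions.

```python
import numpy as np

def flag_non_monotonic(responses: np.ndarray, item: int, n_groups: int = 5) -> bool:
    """Flag a 0/1-scored item whose empirical response function decreases.

    responses: 0/1 matrix of shape (n_students, n_items).
    """
    # Score on the other items ("rest score") serves as a proficiency proxy.
    rest_score = responses.sum(axis=1) - responses[:, item]
    # Split students into roughly equal groups from low to high rest score.
    groups = np.array_split(np.argsort(rest_score), n_groups)
    p_correct = [responses[g, item].mean() for g in groups]
    # Flag if proportion correct ever drops between adjacent groups.
    # (An operational screen would allow for sampling error, e.g., by smoothing.)
    return any(later < earlier for earlier, later in zip(p_correct, p_correct[1:]))

rng = np.random.default_rng(0)
data = (rng.random((1000, 10)) < 0.6).astype(int)
print(flag_non_monotonic(data, item=0))
```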

Information about the way specific items are treated in NAEP scales is carefully tracked. Related items might be combined into a cluster item and treated as a single item during scaling. Related items can be identified by test developers when items are written or scored, or by analysis staff when item responses are prepared for scaling or when items are scaled. Combining two or more items into a single item reduces dependencies among them, making the assumption of local independence more realistic.
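
A minimal sketch of this kind of combination, assuming dichotomous items and a simple sum score (not NAEP's operational procedure): two locally dependent items are replaced by one three-category cluster item that enters scaling as a single polytomous item.

```python
import numpy as np

def combine_into_cluster(responses: np.ndarray, related: list[int]) -> np.ndarray:
    """Sum the 0/1 scores of related items into one polytomous cluster score."""
    return responses[:, related].sum(axis=1)

rng = np.random.default_rng(0)
data = (rng.random((6, 4)) < 0.5).astype(int)
# Suppose items 1 and 2 share a stimulus and show local dependence.
cluster = combine_into_cluster(data, related=[1, 2])
print(cluster)  # one score per student, in {0, 1, 2}
```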

Polytomous items whose response categories do not fit the generalized partial credit model, as judged by comparing empirical and theoretical item response functions, may fit the model more closely if some response categories are combined. Some categories of polytomous items might therefore be recoded to reflect the combination of response categories.
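
The sketch below illustrates both pieces under assumed, illustrative parameter values: a direct implementation of the generalized partial credit model's category probabilities, followed by the kind of recode the text describes, collapsing two adjacent categories of a four-category item.

```python
import numpy as np

def gpcm_probs(theta: float, a: float, b: np.ndarray) -> np.ndarray:
    """GPCM category probabilities for one item; b holds steps b_1..b_m."""
    steps = np.concatenate(([0.0], a * (theta - b)))  # term for category 0 is 0
    logits = np.cumsum(steps)                         # cumulative step terms
    z = np.exp(logits - logits.max())                 # subtract max for stability
    return z / z.sum()

print(gpcm_probs(theta=0.5, a=1.2, b=np.array([-1.0, 0.0, 1.0])))

# Collapse sparsely used categories 2 and 3 of a 0-3 item into a 0-2 item.
recode = {0: 0, 1: 1, 2: 2, 3: 2}
scores = np.array([0, 3, 2, 1, 3])
print(np.vectorize(recode.get)(scores))  # [0 2 2 1 2]
```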

Finally, when an item administered in more than one assessment year draws responses that differ across those years, the item might be scaled separately by assessment year. Such items are most often identified by empirical item response functions for the different assessment years that differ markedly from one another and from the overall theoretical item response function.
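
A minimal sketch of this kind of cross-year comparison (the 0.10 threshold and all names are illustrative assumptions, not NAEP's operational criterion): the item's empirical response functions from two years are compared group by group, and the item is flagged for separate scaling when they differ markedly.

```python
import numpy as np

def empirical_irf(responses: np.ndarray, item: int, n_groups: int = 5) -> np.ndarray:
    """Proportion correct on `item` within rest-score groups, low to high."""
    rest = responses.sum(axis=1) - responses[:, item]
    groups = np.array_split(np.argsort(rest), n_groups)
    return np.array([responses[g, item].mean() for g in groups])

def flag_for_separate_scaling(year1, year2, item, threshold=0.10):
    """Flag if the two years' empirical IRFs differ, on average, by more
    than `threshold` across ability groups (an illustrative criterion)."""
    diff = np.abs(empirical_irf(year1, item) - empirical_irf(year2, item))
    return diff.mean() > threshold

rng = np.random.default_rng(1)
year_a = (rng.random((800, 12)) < 0.6).astype(int)
year_b = (rng.random((800, 12)) < 0.4).astype(int)  # item behaves differently
print(flag_for_separate_scaling(year_a, year_b, item=3))
```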

 

Treatment of items in NAEP assessments: Various years, 2000–2019
Year   Subject area
2019   Mathematics; Reading; Science
2018   Civics; Geography; Technology and engineering literacy (TEL); U.S. history
2017   Mathematics; Reading
2016   Arts
2015   Mathematics; Reading; Science; Vocabulary
2014   Civics; Geography; Technology and engineering literacy (TEL); U.S. history
2013   Mathematics; Reading
2012   Economics
2011   Mathematics; Reading; Reading vocabulary; Science
2010   Civics; Geography; U.S. history
2009   Mathematics; Reading; Reading vocabulary; Science
2008   Arts
2007   Mathematics; Reading; Writing
2006   Civics; Economics; U.S. history
2005   Mathematics; Reading; Science
2003   Mathematics; Reading
2002   Reading; Writing
2001   Geography (R3/R2); U.S. history (R3/R2)
2000   Mathematics (R3/R2); Reading (R3/R2); Science (R3/R2)

NOTE: Because preliminary analyses of students' writing performance in the 2017 NAEP writing assessments at grades 4 and 8 revealed potentially confounding factors in measuring performance, results will not be publicly reported. In NAEP, vocabulary, reading vocabulary, and meaning vocabulary refer to the same reporting scale. R2 is the non-accommodated reporting sample; R3 is the accommodated reporting sample. If sampled students are classified as students with disabilities (SD) or English learners (EL), and school officials, using NAEP guidelines, determine that they can meaningfully participate in the NAEP assessment with accommodation, those students are included in the NAEP assessment with accommodation along with other sampled students including SD/EL students who do not need accommodations. The R3 sample is more inclusive than the R2 sample and excludes a smaller proportion of sampled students. The R3 sample is the only reporting sample used in NAEP after 2001. The block naming conventions used in the 2018 civics, geography, and U.S. history assessments are described in the document 2018 Block Naming Conventions in Data Products and TDW. The block naming conventions used in the 2019 mathematics, reading, and science assessments are described in the document 2019 Block Naming Conventions in Data Products and TDW. 
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), various years, 2000–2019 Assessments.

 

Treatment of items in NAEP long-term trend assessments: 2004, 2008, and 2012
Year   Subject area
2012   Mathematics long-term trend; Reading long-term trend
2008   Mathematics long-term trend; Reading long-term trend
2004   Mathematics long-term trend; Reading long-term trend
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2004, 2008, and 2012 Long-Term Trend Assessments.




Last updated 04 January 2024 (ML)