A defining feature of NAEP is its ability to provide data about student achievement that can be used to track changes over time. For this to occur, assessment results must be placed on the same scale every time the assessment is administered. Because constructed-response items are scored by human scorers (rather than by machine, as multiple-choice items are), special procedures are applied to these items to ensure that scoring remains consistent across assessment years, so that any given written response receives the same score regardless of the year in which the item is scored. In NAEP, items with both current-year (i.e., within-year) responses and previous-assessment-year responses are called "trend items."
In 2006, NAEP began using an Interspersed Trend model. Trend responses from earlier NAEP administrations are interspersed among current-year responses within the electronic scoring system, so scorers do not know whether the response they are currently scoring is a current-year response or a trend response. Trend responses are delivered to scorers at a consistent rate throughout the scoring of an item. Under the previous trend model, used by NAEP through 2005, trend scoring occurred in discrete time periods, and scorers were well aware of whether they were scoring trend responses or current-year responses.
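As an informal sketch of the delivery mechanics, the fragment below mixes trend responses into a current-year scoring queue at a roughly constant rate. The function, its `trend_rate` parameter, and the probabilistic mixing scheme are assumptions for illustration; the actual scoring system's delivery logic is not specified here.

```python
import random

def intersperse(current_responses, trend_responses, trend_rate=0.1, seed=0):
    """Yield a scoring queue in which trend responses appear at a roughly
    constant rate, indistinguishable to scorers from current-year work."""
    rng = random.Random(seed)
    trend_iter = iter(trend_responses)
    for response in current_responses:
        yield response
        # Draw a trend response with fixed probability so the delivery rate
        # stays consistent throughout the scoring of the item.
        if rng.random() < trend_rate:
            trend_response = next(trend_iter, None)
            if trend_response is not None:
                yield trend_response
```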
The cross-year t-statistics for individual item scores should fall between -1.5 and +1.5. The cross-year interrater agreement rate should be within 8 percentage points of the prior-year agreement rate for 2- and 3-point items, and within 10 percentage points for 4- and 5-point items.
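As a rough illustration of how these criteria might be checked, the sketch below computes a paired t-statistic on the scores assigned to the same set of trend responses in the prior and current years, along with the agreement comparison. The paired formulation, function names, and inputs are illustrative assumptions, not NAEP's published procedure.

```python
import math

def paired_t_statistic(prior_scores, current_scores):
    """Paired t-statistic for the mean difference between the scores a set
    of trend responses received originally and the scores they receive in
    the current year. Assumed formulation; NAEP's exact statistic may differ."""
    n = len(prior_scores)
    diffs = [c - p for p, c in zip(prior_scores, current_scores)]
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)

def exact_agreement(scores_a, scores_b):
    """Percent of responses receiving identical scores in two scorings."""
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)

def within_criteria(t, cross_year_agreement, prior_year_agreement, max_points):
    """Apply the criteria above: |t| <= 1.5, and the agreement difference
    within 8 points (2- and 3-point items) or 10 points (4- and 5-point items)."""
    tolerance = 8 if max_points <= 3 else 10
    return (abs(t) <= 1.5 and
            abs(cross_year_agreement - prior_year_agreement) <= tolerance)
```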
On a daily basis, scoring staff create and disseminate a comprehensive report of results for all items at all grades in each subject. The report is a spreadsheet workbook that includes the date and time the report was run, t-statistics, and agreement data (current-year scoring agreement, trend-year scoring agreement, and a comparison of trend-year and current-year agreement on the trend set responses). Scoring staff ensure that the psychometricians receive sufficient data for each item to make informed judgments about the acceptability of the scoring statistics.
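The sketch below assembles one row per item of such a daily report and writes it out. The column names and CSV format are illustrative assumptions; the actual report is a spreadsheet workbook covering all items at all grades for each subject.

```python
import csv
from datetime import datetime

# Hypothetical column layout based on the fields described above.
FIELDS = ["report_run", "subject", "grade", "item_id", "t_statistic",
          "current_year_agreement", "trend_year_agreement",
          "trend_vs_current_agreement_difference"]

def write_daily_report(path, item_stats):
    """Write one report row per item. `item_stats` is an iterable of dicts
    whose keys match FIELDS (minus report_run, which is stamped here)."""
    run_stamp = datetime.now().isoformat(timespec="minutes")
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for row in item_stats:
            writer.writerow({"report_run": run_stamp, **row})
```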
For more information on the cross-year analyses performed on constructed-response items, see Analysis and Scaling.
It should be noted that the term "trend scoring" is not related to the long-term trend assessment. Trend scoring looks at changes over time using main NAEP item responses (e.g., scores for an item on the 2000 reading assessment compared to scores for the same item on the 1998 reading assessment). View a table that lists the differences between the main NAEP assessment and the long-term trend NAEP assessment.