In February of each assessment cycle, the local scoring center or a print vendor copies the trend sets for items that are to be replicated from prior assessment years for short-term trend rescoring. For new items, scoring center staff photocopy student responses when the booklets are returned to the NAEP materials and scoring center for processing. These papers are sorted by item and numbered. Once 25 to 150 responses have been gathered for an item, a photocopy of the set is made and sent to the NAEP test development subject area specialist. The original set is kept at the NAEP scoring center, where the sets are compiled according to instructions from the test development staff. Rangefinding takes place at the test development offices.
After review by each subject area's coordinator, the test development staff send the keys and/or the training sets for the new items to the materials-processing staff, who label them in the standard format and reproduce the sets of papers from the original copies. Training sets for trend extended constructed-response items are augmented with qualification sets created during this process. Correct scores are written on all anchor papers, but only the scoring supervisors and trainers have keys for the practice, calibration, and qualification sets. Trainers also keep annotations explaining the rationale for each score assigned. If any of these scores changes during training or scoring, the scoring supervisor's and trainer's notes provide the explanation.
Initial training sets are developed for each item during the pilot stage. The number of students assessed for pilot items is much smaller than for operational items, and items may be changed after pilot scoring. For these reasons, new training sets are created when items become operational. From then on, the training sets remain the same so that scorers are trained on each item the same way every year.