NAEP Scoring

NAEP assessments include multiple-choice items, which are machine-scored by optical mark reflex scanning, and constructed-response items, which are scored by trained scoring staff. These trained scorers ("raters") use an image-based scoring system that routes student responses directly to each rater. Focused, explicit scoring guides are developed to match the criteria emphasized in the assessment frameworks. Consistency of scoring between raters is monitored during the process through ongoing reliability checks and frequent backreading.
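As an illustration of what a reliability check between raters can look like, the sketch below computes two common agreement measures for a set of double-scored responses: the exact-agreement rate and Cohen's kappa. The text does not say which statistics NAEP uses, so the choice of measures, the function names, and the sample scores are all assumptions made for illustration.

```python
from collections import Counter


def exact_agreement(first, second):
    """Fraction of responses on which two raters assigned the same score."""
    if len(first) != len(second) or not first:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohens_kappa(first, second):
    """Agreement corrected for chance (Cohen's kappa).

    Assumption: kappa is used here only as a familiar example of an
    interrater statistic; the source does not name a specific measure.
    """
    n = len(first)
    observed = exact_agreement(first, second)
    counts_first = Counter(first)
    counts_second = Counter(second)
    # Chance agreement: probability both raters pick the same score
    # if each scored independently with their own score distribution.
    expected = sum(counts_first[s] * counts_second[s] for s in counts_first) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores from a first and a second rater on eight responses.
first_scores = [1, 2, 3, 2, 1, 3, 2, 2]
second_scores = [1, 2, 3, 1, 1, 3, 2, 3]

agreement = exact_agreement(first_scores, second_scores)  # 6 of 8 match: 0.75
kappa = cohens_kappa(first_scores, second_scores)
```

Exact agreement is easy to read but ignores how often two raters would match by chance on a short score scale, which is why a chance-corrected measure is often reported alongside it.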
Throughout the scoring process, three types of personnel make up individual scoring teams:
Team members are required to have, at a minimum, a baccalaureate degree from a four-year college or university. An advanced degree, scoring experience, and/or teaching experience is preferred. Scoring teams use the training process to determine whether each individual rater is sufficiently prepared to score. Following training, each rater is given a pre-scored "qualification set" and is expected to attain 80 percent correct in order to proceed.

All scoring is carried out via image processing. To assign a score, raters click a button displayed in a scoring window. Because buttons are included only for valid scores, out-of-range scores cannot be entered and no editing for them is needed.

Two significant advantages of the image-scoring system are the ease of regulating the flow of work to raters and the ease of monitoring scoring. The image system provides scoring supervisors with tools to determine rater qualification, to backread raters, to determine rater calibration, to reset trend rescore items, to monitor trend rescore items through t-statistics reports, to monitor interrater reliability, and to gauge the rate at which scoring is being completed.

The scoring supervisors monitor work flow for each item using a status tool that displays the number of responses scored, the number of responses first-scored that still need to be second-scored, the number of responses remaining to be first-scored, and the total number of responses remaining to be scored. This allows the scoring directors and project leads to accurately monitor the rate of scoring and to estimate the time needed for completion of the various phases of scoring.

Last updated 30 November 2007 (OT)
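The qualification rule and the status-tool counts described above can be sketched in a few lines. This is a minimal illustration, not the NAEP system: the 80 percent threshold comes from the text, while the names `qualifies`, `ItemStatus`, and its fields are hypothetical, and treating "total remaining" as the sum of the two pending counts is an assumption.

```python
from dataclasses import dataclass

QUALIFICATION_PERCENT = 80  # from the text: 80 percent correct to proceed


def qualifies(rater_scores, key_scores, percent=QUALIFICATION_PERCENT):
    """Return True if the rater matches the pre-scored key often enough."""
    if len(rater_scores) != len(key_scores) or not key_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    correct = sum(r == k for r, k in zip(rater_scores, key_scores))
    # Integer comparison avoids floating-point edge cases at exactly 80%.
    return correct * 100 >= percent * len(key_scores)


@dataclass
class ItemStatus:
    """Hypothetical per-item counters behind a work-flow status display."""
    total_responses: int
    first_scored: int
    flagged_for_second: int  # first-scored responses that need a second score
    second_scored: int

    @property
    def awaiting_second(self):
        return self.flagged_for_second - self.second_scored

    @property
    def awaiting_first(self):
        return self.total_responses - self.first_scored

    @property
    def total_remaining(self):
        # Assumption: remaining work is the sum of both pending counts.
        return self.awaiting_first + self.awaiting_second


# Hypothetical qualification set of ten pre-scored responses.
key = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2]
rater_a = [1, 2, 3, 4, 1, 2, 3, 4, 2, 1]  # 8 of 10 correct
rater_b = [1, 2, 3, 4, 1, 2, 3, 1, 2, 1]  # 7 of 10 correct

status = ItemStatus(total_responses=1000, first_scored=600,
                    flagged_for_second=150, second_scored=100)
```

With these sample values, `rater_a` meets the threshold and `rater_b` does not, and the status object reports 400 responses awaiting a first score and 50 awaiting a second score.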