NAEP assessments include cognitive items and non-cognitive items. Cognitive items are designed to assess what students know and can do, and are based on the framework and specifications documents for each assessment subject. These types of items include multiple-choice items, constructed-response items scored dichotomously, and constructed-response items scored polytomously. Non-cognitive items are contextual questions that are administered to students, teachers, and school administrators via survey questionnaires that collect additional information about students' demographics and experiences in and out of school.
The item-development steps for each subject area are as follows:
The National Assessment Governing Board (the Governing Board) provides content frameworks and item specifications in each subject area.
The instrument development committee in each subject area provides guidance to NAEP staff about how the objectives described in the framework can be measured given the constraints of resources and the feasibility of measurement technology. The committee makes recommendations about priorities for the assessment (within the context of the assessment framework) and the types of items to be developed.
Specialists with subject-matter expertise and experience in creating items according to specifications develop and review the assessment questions.
NAEP test development staff and external test specialists review and revise the items and accompanying scoring guides.
Editorial and fairness reviews are conducted as required by NCES.
Pilot test materials are prepared, and those that require clearance are sent to the federal Office of Management and Budget (OMB). Non-cognitive items are submitted to the OMB for clearance, while cognitive items do not need to be approved. In addition, materials such as recruitment and communication documents that would be sent to the field (e.g., Facts for Teachers and Facts for Districts) may also be included in clearance packages.
A pilot test is conducted in many of the states and jurisdictions slated to participate in the next operational assessment.
Based on the pilot test analyses, items are selected for inclusion in the operational assessment.
Each subject-area instrument development committee approves the selection of items to include in the next operational assessment.
Each subject-area instrument is submitted to the Governing Board for approval.
Operational materials, namely the non-cognitive items and documentation showing which non-cognitive items were removed, added, or revised, are sent to the OMB to secure clearance.
After a final review, the booklets are printed or packaged as digital test forms for computer delivery.
Each administration of the NAEP assessment requires a new configuration of the booklets given to students and of how those booklets are distributed to schools. To allow for wide content coverage within the limited testing time for each student, the instrument configuration entails a three-step design process for the subject areas to be assessed:
In the first step, NAEP uses a focused balanced incomplete block (BIB) or partially balanced incomplete block (pBIB) design to assign blocks or groups of cognitive items to student booklets. The "focused" aspect of NAEP's booklet design requires that each student answer questions from only one subject area. In a BIB design, the cognitive blocks are balanced; each cognitive block appears an equal number of times in every possible position. Each cognitive block is also paired with every other cognitive block in a test booklet exactly the same number of times. In a pBIB design, cognitive blocks may not appear an equal number of times in each position, or may not be paired with every other cognitive block an equal number of times.
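The two balance conditions described above (equal appearances in every position, and equal pairings between blocks) can be checked mechanically. The following is a minimal sketch, not NAEP's actual design: it assumes a small textbook layout of 7 cognitive blocks in 7 booklets of 3 blocks each, built by cyclically shifting the base booklet (0, 1, 3). The function names `cyclic_booklets`, `balance_report`, and `is_balanced` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def cyclic_booklets(n_blocks=7, base=(0, 1, 3)):
    """Build booklets by shifting a base booklet: booklet i holds (b + i) mod n_blocks.

    The defaults are an illustrative assumption (a classic 7-block layout),
    not a design taken from any NAEP assessment.
    """
    return [tuple((b + i) % n_blocks for b in base) for i in range(n_blocks)]


def balance_report(booklets):
    """Count how often each block sits in each position and how often each pair co-occurs."""
    position_counts = Counter()
    pair_counts = Counter()
    for booklet in booklets:
        for position, block in enumerate(booklet):
            position_counts[(block, position)] += 1
        for pair in combinations(sorted(booklet), 2):
            pair_counts[pair] += 1
    return position_counts, pair_counts


def is_balanced(booklets, n_blocks):
    """True when every block appears equally often in every position
    and every pair of blocks shares a booklet equally often (the BIB conditions)."""
    position_counts, pair_counts = balance_report(booklets)
    n_positions = len(booklets[0])
    all_slots = [(b, p) for b in range(n_blocks) for p in range(n_positions)]
    all_pairs = list(combinations(range(n_blocks), 2))
    positions_equal = len({position_counts[slot] for slot in all_slots}) == 1
    pairs_equal = len({pair_counts[pair] for pair in all_pairs}) == 1
    return positions_equal and pairs_equal


if __name__ == "__main__":
    design = cyclic_booklets()
    print(is_balanced(design, 7))       # full cyclic design
    print(is_balanced(design[:-1], 7))  # one booklet dropped: only partially balanced
```

In this toy layout, removing a single booklet breaks both conditions, which is the kind of relaxation a pBIB design tolerates.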
Second, the spiraling scheme is designed. Spiraling refers to interleaving booklets systematically so that when they are handed out in the specified order, any group of students will receive approximately the target proportions of different types of booklets. This same process is applied in the development of digital test forms and contextual questionnaires for NAEP's computer-delivered assessments.
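One way to picture spiraling is as a deterministic interleaving rule. The sketch below is an illustrative assumption, not NAEP's documented procedure: at each step it hands out the booklet type that is furthest behind its target share, so that any run of consecutive booklets stays close to the target proportions. The function name `spiral_order` and the booklet labels are hypothetical.

```python
def spiral_order(target_weights, n_booklets):
    """Return a handout order of booklet types that tracks the target proportions.

    target_weights maps a booklet type to its relative weight (an assumed input
    format). At every step the type with the largest deficit (expected count so
    far minus count already issued) is chosen; ties go to the type that sorts first.
    """
    total = sum(target_weights.values())
    issued = {booklet: 0 for booklet in target_weights}
    order = []
    for step in range(1, n_booklets + 1):
        pick = max(
            sorted(target_weights),
            key=lambda b: target_weights[b] * step / total - issued[b],
        )
        issued[pick] += 1
        order.append(pick)
    return order


if __name__ == "__main__":
    # Booklet type "A" should be handed out twice as often as "B" or "C".
    print(spiral_order({"A": 2, "B": 1, "C": 1}, 8))
```

With equal weights the rule reduces to a plain round-robin through the booklet types, which matches the intuition of handing out an interleaved stack in order.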
The third step is the bundling design. In 2003, NAEP test developers introduced an enhanced bundling design, referred to as vertical bundling. Vertical bundling offers flexibility in bundle length, reduces the required number of different bundles, decreases booklet wastage, and improves the balance of within-session booklet pairings.
Note: Until the 1984 assessment, NAEP was administered using matrix sampling and tape recorders; that is, by administering booklets of exercises using paced audio tapes that walked groups of students through the individual assessment exercises in a common booklet. In the 1984 assessment, a balanced incomplete block booklet design, which does not include audio tape pacing, was introduced in place of taped matrix sampling.