The national main assessment in 2000 selected fourth-, eighth-, and twelfth-grade students from public and private schools in the 50 states and the District of Columbia for assessment in various subjects. Samples were selected using complex multistage sample designs: geographic primary sampling units (PSUs) were sampled at the first stage, schools within PSUs were sampled at the second stage, sample types and session types were assigned to schools at the third stage, and students were sampled at the fourth stage. The goal was to secure a sample from which estimates of population and student group characteristics could be obtained with reasonably high precision, that is, with low sampling variability.
The sample designs included provisions to sample particular groups of students at higher rates than students who were not in these groups. Nonresponse adjustment and poststratification were two of the estimation techniques employed to improve precision. To account for the different sampling rates and the various weighting adjustments, each student was assigned a sampling weight. Sampling weights were necessary to make valid inferences from the student sample to the respective target populations.
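To illustrate why differential sampling rates call for weights, the following is a minimal sketch, not the assessment's actual weighting procedure. It assumes a student's base weight is the inverse of the overall selection probability across stages, multiplied by adjustment factors. The class `StudentSelection`, the function `sampling_weight`, and all probabilities and factors are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StudentSelection:
    """Hypothetical selection probabilities and adjustment factors for one student."""
    p_psu: float                   # probability the PSU was selected
    p_school: float                # probability the school was selected within the PSU
    p_student: float               # probability the student was selected within the school
    nonresponse_adj: float = 1.0   # assumed nonresponse adjustment factor
    poststrat_adj: float = 1.0     # assumed poststratification factor


def sampling_weight(s: StudentSelection) -> float:
    """Inverse of the overall selection probability, times the adjustment factors."""
    base_weight = 1.0 / (s.p_psu * s.p_school * s.p_student)
    return base_weight * s.nonresponse_adj * s.poststrat_adj


# A student from a group sampled at twice the rate of other students
# receives half the weight, so the group is not overrepresented in estimates.
oversampled = StudentSelection(p_psu=0.5, p_school=0.2, p_student=0.5)
regular = StudentSelection(p_psu=0.5, p_school=0.2, p_student=0.25)

w_over = sampling_weight(oversampled)
w_reg = sampling_weight(regular)
```

In this sketch the oversampled student represents fewer students in the population than the regularly sampled one, which is how weighting offsets the higher sampling rate.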
The NAEP national main assessment provides two types of weights for analysis purposes: modular weights and reporting weights. Both are calculated for each student sample (i.e., each grade/subject combination) in the national main assessment, and each set of weights is calculated to represent the target population. The target population consists of students in the target grade (4, 8, or 12) in public and private schools in the 50 states and the District of Columbia. The two types of weights are motivated by the sample types and reporting populations used in the assessment.
For each student classified as a student with disabilities (SD) or as limited English proficient (LEP), the modular weight equals the reporting weight. For each non-SD/LEP student, however, the reporting weight is approximately equal to half of the modular weight. This is because non-SD/LEP students from both sample types are included in each reporting population.
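The relationship between the two weights can be sketched as follows. This is an illustration under a simplifying assumption: the text says the reporting weight is only approximately half of the modular weight for non-SD/LEP students, so the exact factor of 0.5 used here, like the function name `reporting_weight`, is hypothetical.

```python
def reporting_weight(modular_weight: float, is_sd_lep: bool,
                     non_sd_lep_factor: float = 0.5) -> float:
    """Approximate a reporting weight from a modular weight.

    SD/LEP students keep their modular weight. Non-SD/LEP students appear
    in each reporting population from both sample types, so their weight
    is scaled down (assumed here to be exactly one half).
    """
    if is_sd_lep:
        return modular_weight
    return modular_weight * non_sd_lep_factor


sd_lep_weight = reporting_weight(40.0, is_sd_lep=True)
non_sd_lep_weight = reporting_weight(40.0, is_sd_lep=False)
```

Halving the weight keeps the combined non-SD/LEP students from the two sample types from counting twice toward the same target population.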
The national main assessment's sample design affects the estimation of sampling variability. Because of the effects of cluster sampling, observations of different students from the same school or geographical area are not assumed to be independent of one another. The term "cluster sampling" refers to the process of selecting many students within the same schools, and many schools within the same geographically defined primary sampling units. As a result of cluster sampling, ordinary formulas for the estimation of the variance of sample statistics, based on assumptions of independence, tend to underestimate true sampling variability and should not be used. Instead, jackknife replication methods are used to calculate variance estimates for assessment data.
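The effect of clustering on variance estimates can be seen in a small sketch. The code below uses a generic delete-one-cluster jackknife, which is not necessarily the specific replication scheme used for the assessment, and compares it with the ordinary formula that assumes independent observations. The data, the function names, and the use of equal weights are all assumptions made for illustration.

```python
import statistics

# Each record is (cluster_id, weight, score). Scores are deliberately
# similar within a cluster, as students in the same school often are.
records = [
    ("A", 1.0, 10.0), ("A", 1.0, 10.0), ("A", 1.0, 10.0),
    ("B", 1.0, 20.0), ("B", 1.0, 20.0), ("B", 1.0, 20.0),
    ("C", 1.0, 30.0), ("C", 1.0, 30.0), ("C", 1.0, 30.0),
    ("D", 1.0, 40.0), ("D", 1.0, 40.0), ("D", 1.0, 40.0),
]


def weighted_mean(rows):
    """Weighted mean of the scores."""
    total_weight = sum(w for _, w, _ in rows)
    return sum(w * y for _, w, y in rows) / total_weight


def jackknife_variance(rows):
    """Delete-one-cluster jackknife variance of the weighted mean."""
    clusters = sorted({c for c, _, _ in rows})
    g = len(clusters)
    full_estimate = weighted_mean(rows)
    squared_deviations = 0.0
    for dropped in clusters:
        kept = [r for r in rows if r[0] != dropped]
        squared_deviations += (weighted_mean(kept) - full_estimate) ** 2
    return (g - 1) / g * squared_deviations


def naive_variance(rows):
    """Ordinary variance of the mean, treating every student as independent.

    Ignores weights, which is harmless here only because all weights are equal.
    """
    scores = [y for _, _, y in rows]
    return statistics.variance(scores) / len(scores)


jk = jackknife_variance(records)
naive = naive_variance(records)
```

In this constructed example the jackknife estimate is several times larger than the independence-based estimate, because the twelve students carry little more information than four distinct clusters.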