Prior to school sampling, NAEP school frames are stratified to increase the efficiency and ensure the representativeness of school samples in terms of important school-level characteristics, such as geography (e.g., census divisions or states), urbanicity, and race/ethnicity composition. The NAEP school frames are typically stratified using two types of stratification, explicit and implicit.
Explicit stratification partitions a sampling frame into mutually exclusive groupings called sampling strata. Samples are selected from these strata independently, meaning that each sampling stratum has its own target sample size, and, if systematic sampling is used, its own unique implicit stratification scheme and random start.
Implicit stratification involves sorting the sampling frame, as opposed to grouping the frame. For NAEP, schools are sorted in serpentine fashion by key school characteristics within sampling strata and sampled systematically using this ordering. This type of stratification ensures the representativeness of the school samples with respect to the key school characteristics.
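The serpentine sort and systematic selection described above can be sketched as follows. This is only an illustration under simplifying assumptions: the field names `school_type` and `median_income` are hypothetical, and the sample is drawn with equal probabilities, whereas an operational design may select schools with probability proportional to a measure of size.

```python
import random
from itertools import groupby


def serpentine_sort(records, keys):
    """Sort records hierarchically, flipping the direction of each
    lower-level key every time a new higher-level group starts."""
    ascending = [True] * len(keys)  # current sort direction per level

    def sort_level(rows, level):
        if level == len(keys):
            return list(rows)
        key = keys[level]
        rows = sorted(rows, key=lambda r: r[key], reverse=not ascending[level])
        ascending[level] = not ascending[level]  # flip for the next group
        out = []
        for _, group in groupby(rows, key=lambda r: r[key]):
            out.extend(sort_level(list(group), level + 1))
        return out

    return sort_level(records, 0)


def systematic_sample(ordered, n, rng):
    """Equal-probability systematic sample of n records from an ordered
    list, using a fixed interval and a random start (a simplification)."""
    interval = len(ordered) / n
    start = rng.random() * interval
    last = len(ordered) - 1
    return [ordered[min(int(start + k * interval), last)] for k in range(n)]


# Hypothetical frame: two school types crossed with three income levels.
frame = [
    {"school_type": t, "median_income": m}
    for t in ("A", "B")
    for m in (1, 2, 3)
]
ordered = serpentine_sort(frame, ["school_type", "median_income"])
sample = systematic_sample(ordered, 3, random.Random(2015))
```

In the sorted list, income rises within the first school type and falls within the second, so neighboring schools at the boundary between groups are similar. That similarity is what lets a fixed-interval selection spread the sample across the sort variables.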
Explicit stratification was not used in the sampling of the twelfth-grade public schools for NAEP 2015; only implicit stratification was used. The grade 12 school frame was implicitly stratified by:
Census division strata are generally based on census divisions. Most of the time, a census division stratum consists of a single census division, but sometimes it consists of combined census divisions or part of a single census division. In 2015, each census division except the Pacific Census Division formed its own census division stratum. The Pacific Census Division was split into two parts: California in one part and Alaska, Hawaii, Oregon, and Washington in the other. This was done so that California could use a different implicit stratification scheme than the other states, specifically a different last stratification variable. The stratum of California schools used grade 12 achievement data as the final sort variable because those data were available. Because grade 12 achievement data are generally not available for other states, median income was used as a proxy.
The urbanicity classification strata were derived from the NCES urban-centric locale variable from the Common Core of Data (CCD), which classifies schools based on location (city, suburb, town, rural) and proximity to urbanized areas. Urban-centric locale has 12 possible values.
The urbanicity classification cells were created by starting with the original 12 NCES urban-centric locale categories within each census division stratum. Any cell with an expected school sample size less than four was combined with a neighboring cell within the same census division stratum. Collapsing was first done among the subcategories within a location class. (For example, the subcategories for location class city are 1:large, 2:mid-size, and 3:small. If one of these subcategories was deficient, that is, had an expected school sample size less than four, then 1:large was collapsed with 2:mid-size, 3:small was collapsed with 2:mid-size, or 2:mid-size was collapsed with the smaller of 1:large or 3:small.) If the collapsed cell was still too small, all three subcategories within a location class were combined.
If a collapsed location class still had an expected school sample size less than four, then it was collapsed with a neighboring collapsed location class. That is, 1:city would be collapsed with 2:suburb, or 3:town would be collapsed with 4:rural. If additional collapsing was necessary, all location classes were combined. No collapsing across census division strata was allowed or necessary.
The final result of this was a set of census division-urbanicity strata with all strata having expected school sample sizes of at least four schools.
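The collapsing rules above can be sketched in a few lines. This is a minimal illustration, not the production procedure: the function names are hypothetical, and the handling of cases the text does not spell out (for example, more than one deficient subcategory in a class) is an assumption.

```python
MIN_EXPECTED = 4.0  # minimum expected school sample size per cell


def collapse_within_class(sizes):
    """sizes: expected sample sizes of the three subcategories of one
    location class, in order [first, middle, last].
    Returns groups of subcategory positions after collapsing."""
    first, middle, last = sizes
    if min(sizes) >= MIN_EXPECTED:
        return [[0], [1], [2]]
    if middle < MIN_EXPECTED:
        # The middle cell joins the smaller of its two neighbors.
        groups = [[0, 1], [2]] if first <= last else [[0], [1, 2]]
    elif first < MIN_EXPECTED:
        groups = [[0, 1], [2]]
    else:
        groups = [[0], [1, 2]]
    # Assumption: if any resulting cell is still too small, combine all three.
    if all(sum(sizes[i] for i in g) >= MIN_EXPECTED for g in groups):
        return groups
    return [[0, 1, 2]]


def collapse_classes(class_totals):
    """class_totals: expected sizes for [city, suburb, town, rural] after
    within-class collapsing. Returns groups of class positions."""
    groups = []
    for pair in ([0, 1], [2, 3]):  # city with suburb, town with rural
        if min(class_totals[i] for i in pair) < MIN_EXPECTED:
            groups.append(pair)
        else:
            groups.extend([i] for i in pair)
    if all(sum(class_totals[i] for i in g) >= MIN_EXPECTED for g in groups):
        return groups
    return [[0, 1, 2, 3]]  # last resort: one cell for the whole stratum


# A deficient middle cell joins the smaller neighbor (position 2).
print(collapse_within_class([10.0, 2.0, 5.0]))
# A deficient suburb class is merged with city; town and rural stay apart.
print(collapse_classes([10.0, 3.0, 8.0, 9.0]))
```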
Schools within the urbanicity classification strata were further stratified into race/ethnicity classification strata. The first division was a dichotomization of each urbanicity stratum into a low and a high Black/Hispanic stratum (the cutoff was 15 percent Black and Hispanic students). If the expected school sample size of a resulting stratum was less than or equal to 8.0, then that stratum was a final urbanicity-race/ethnicity stratum. If the expected school sample size exceeded 8.0, a further division was made.
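As a rough sketch of this first division, the snippet below splits one urbanicity stratum at the 15 percent cutoff and checks whether each part needs further division. The field names are hypothetical, and two details are assumptions rather than facts from this page: schools exactly at 15 percent are placed in the high stratum, and the expected school sample size of a stratum is taken to be the sum of its schools' expected hits.

```python
CUTOFF_PCT = 15.0   # percent Black and Hispanic students
MAX_UNSPLIT = 8.0   # strata above this expected size are divided further


def split_low_high(schools):
    """Split schools into low and high Black/Hispanic strata.
    Assumption: a school exactly at the cutoff goes to the high stratum."""
    low = [s for s in schools if s["pct_black_hispanic"] < CUTOFF_PCT]
    high = [s for s in schools if s["pct_black_hispanic"] >= CUTOFF_PCT]
    return {"low": low, "high": high}


def needs_further_division(stratum):
    """Assumption: expected sample size is the sum of expected hits."""
    return sum(s["expected_hits"] for s in stratum) > MAX_UNSPLIT


# Hypothetical urbanicity stratum with four schools.
schools = [
    {"pct_black_hispanic": 5.0, "expected_hits": 3.0},
    {"pct_black_hispanic": 12.0, "expected_hits": 4.0},
    {"pct_black_hispanic": 40.0, "expected_hits": 6.0},
    {"pct_black_hispanic": 75.0, "expected_hits": 5.0},
]
strata = split_low_high(schools)
flags = {name: needs_further_division(s) for name, s in strata.items()}
```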
For the low Black/Hispanic stratum, only five urbanicity strata had a large enough expected school sample size, and these were divided into two or three groups of states. The table below describes the grouping.
| Census division stratum | Urbanicity stratum | Group 1 states | Group 2 states | Group 3 states |
|---|---|---|---|---|
| Middle Atlantic | Suburb large city | New Jersey | New York | Pennsylvania |
| Middle Atlantic | Town fringe | New Jersey, Pennsylvania | New York | -- |
| East North Central | Suburb large city | Ohio | Illinois, Indiana | Michigan, Wisconsin |
| East North Central | Town fringe | Michigan, Ohio | Illinois, Indiana, Wisconsin | -- |
| East North Central | Rural distant | Illinois, Michigan | Indiana, Ohio, Wisconsin | -- |

SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2015 National Assessment.
Within the high Black/Hispanic stratum, the number of substrata was based on the expected school sample size. If the expected sample size was between 8.0 and 12.0, there were two substrata; if the expected sample size was between 12.0 and 16.0, there were three substrata; and if the expected sample size was over 16.0, there were four substrata.
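The rule for the number of substrata amounts to a simple lookup on the expected school sample size. The sketch below uses a hypothetical function name, and because the text does not say which range includes the boundary values 12.0 and 16.0, placing them in the lower range is an assumption.

```python
def number_of_substrata(expected_sample_size):
    """Number of high Black/Hispanic substrata for a given expected
    school sample size. Boundary handling at 12.0 and 16.0 is assumed."""
    if expected_sample_size <= 8.0:
        return 1  # no further division
    if expected_sample_size <= 12.0:
        return 2
    if expected_sample_size <= 16.0:
        return 3
    return 4


for size in (6.5, 10.0, 14.2, 21.0):
    print(size, number_of_substrata(size))
```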
The substrata were defined by percent Black and Hispanic students, with the cutoffs for substrata defined by weighted percentiles (with the weight equal to expected hits for each school). For two substrata, the cutoff was the weighted median; for three substrata, the weighted 33rd and 67th percentiles; for four substrata, the weighted median and quartiles.
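The weighted percentile cutoffs can be illustrated as below. Several definitions of a weighted percentile are in use; this sketch assumes the simplest one (the smallest value at which the cumulative weight reaches the target share), which may differ from the definition used in production. Function names are hypothetical.

```python
def weighted_percentile(values, weights, p):
    """Smallest value whose cumulative weight reaches p percent of the
    total weight (one of several possible definitions)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    target = p / 100.0 * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]


def substratum_cutoffs(pct_black_hispanic, expected_hits, k):
    """Cutoffs that split a stratum into k substrata (k = 2, 3, or 4),
    weighting each school by its expected hits."""
    percentiles = {2: [50], 3: [33, 67], 4: [25, 50, 75]}[k]
    return [
        weighted_percentile(pct_black_hispanic, expected_hits, p)
        for p in percentiles
    ]


# Hypothetical stratum: percent Black/Hispanic and expected hits per school.
pct = [20.0, 30.0, 40.0, 50.0]
hits = [1.0, 1.0, 1.0, 1.0]
median_cutoff = substratum_cutoffs(pct, hits, 2)
quartile_cutoffs = substratum_cutoffs(pct, hits, 4)
```

Weighting by expected hits means the cutoffs divide the expected sample, not the count of schools, into roughly equal parts, so a few large schools can shift a cutoff noticeably.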
The implicit stratification within these census division-urbanicity-race/ethnicity status strata was based on school type (public, BIE, DoDEA) and median income of the ZIP code area containing the school.