NCES 2012-001, May 2012

## Appendix A.1. Sampling Errors

The samples used in surveys are selected from large numbers of possible samples of the same size that could have been selected using the same sample design. Estimates derived from the different samples would differ from each other. The difference between a sample estimate and the average of all possible samples is called the sampling deviation. The standard, or sampling, error of a survey estimate is a measure of the variation among the estimates from all possible samples and thus is a measure of the precision with which an estimate from a particular sample approximates the average result of all possible samples.

Based on the sample estimate and an estimate of its standard error, we can construct a confidence interval (with a lower and upper bound) such that, if we drew repeated samples from the same population and constructed an interval from each, we would expect a certain percentage of those intervals to contain the average result of all possible samples. A commonly used confidence interval is a 95 percent confidence interval. If we used identical sampling procedures to draw 100 samples and computed a 95 percent confidence interval around the estimate from each of those samples, we would expect about 95 of the 100 intervals to contain the average result of all possible samples. The 95 percent confidence interval is the interval extending from approximately two (1.96) standard errors below the estimate to approximately two standard errors above the estimate.
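The repeated-sampling idea behind a 95 percent confidence interval can be checked with a small simulation. The sketch below is illustrative only: it assumes simple random sampling from a normal population, whereas real surveys typically use more complex designs, and the function name `coverage_rate` and all parameter values are hypothetical.

```python
import random
import statistics


def coverage_rate(pop_mean=50.0, pop_sd=10.0, n=200, n_samples=1000,
                  z=1.96, seed=0):
    """Return the fraction of simulated intervals that contain pop_mean.

    Assumption: simple random samples from a normal population. All
    default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        estimate = statistics.fmean(sample)
        # Estimated standard error of the sample mean.
        se = statistics.stdev(sample) / n ** 0.5
        if estimate - z * se <= pop_mean <= estimate + z * se:
            covered += 1
    return covered / n_samples


rate = coverage_rate()
print(f"Share of intervals containing the population mean: {rate:.3f}")
```

Under these assumptions, the printed share should be close to 0.95, though it will vary with the seed and the number of simulated samples.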

To illustrate this concept, consider the data and standard errors appearing in table 116. For the 2009 estimate that 8.1 percent of 16- to 24-year-olds were high school dropouts, the table shows that the standard error is 0.20 percentage points. The sampling error above and below the stated figure, corresponding to a 95 percent confidence interval, is approximately double (1.96 times) the standard error, or about 0.40 percentage points. Therefore, we can create a 95 percent confidence interval of approximately 7.7 to 8.5 percent (8.1 percent ± 1.96 × 0.20 percentage points).
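The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch using the figures quoted from table 116; the function name `confidence_interval` is a hypothetical helper, not part of any NCES tool.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) bounds of the interval estimate ± z * standard_error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


# 2009 dropout estimate and standard error as quoted from table 116.
lower, upper = confidence_interval(8.1, 0.20)
print(f"95 percent confidence interval: {lower:.1f} to {upper:.1f} percent")
```

Passing a different `z` value gives intervals at other confidence levels.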

Analysis of standard errors can help assess how valid a comparison between two estimates might be. The standard error of a difference between two independent sample estimates is equal to the square root of the sum of the squared standard errors of the estimates. The standard error (se) of the difference between independent sample estimates a and b is

$$se_{a-b} = \left(se_a^2 + se_b^2\right)^{1/2}$$
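As a worked illustration of this formula, the sketch below computes the standard error of a difference between two independent estimates. The input standard errors (0.3 and 0.4) are hypothetical values chosen for easy arithmetic, not figures from the Digest, and `se_of_difference` is an illustrative helper name.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of the difference between two independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical standard errors for two independent sample estimates.
se_diff = se_of_difference(0.3, 0.4)
print(f"Standard error of the difference: {se_diff:.2f}")
```

The formula applies only when the two estimates come from independent samples, as the text states.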

It should be noted that most of the standard error estimates presented in the Digest and in the original documents are approximations. That is, to derive estimates of standard errors that would be applicable to a wide variety of items and could be prepared at a moderate cost, a number of approximations were required. As a result, the standard error estimates provide a general order of magnitude rather than the exact standard error for any specific item.

The preceding discussion on sampling variability was directed toward a situation concerning one or two estimates. Determining the accuracy of statistical projections is more difficult. In general, the further away the projection date is from the date of the actual data being used for the projection, the greater the probable error in the projections. If, for instance, annual data from 1970 to 2008 are being used to project enrollment in institutions of higher education, the further beyond 2008 one projects, the more variability there is in the projection. One will be less sure of the 2018 enrollment projection than of the 2008 projection. A detailed discussion of the projections methodology is contained in Projections of Education Statistics to 2020.
