Initial Results From the 2005 NHES Early Childhood Program Participation Survey

NCES 2006-075
May 2006

Appendix A: Data Reliability and Validity

Estimates produced using data from the survey are subject to two types of error: sampling error and nonsampling error. Nonsampling errors are errors made in the collection and processing of data. Sampling errors occur because the data are collected from a sample rather than the whole population.

Nonsampling Errors

Nonsampling error is the term used to describe variations in the estimates that may be caused by population coverage limitations and data collection, processing, and reporting procedures. The sources of nonsampling errors are typically problems like unit and item nonresponse, the differences in respondents' interpretations of the meaning of the questions, response differences related to the particular time the survey was conducted, and mistakes in data preparation. In NHES:2005, efforts were made to minimize nonsampling error through cognitive testing in the survey design stage, a field test of the surveys, online data edits and postinterview edits, and a comparison of the survey estimates with similar estimates from previous surveys. Weighting adjustments (in particular, nonresponse adjustments and poststratification/raking adjustments) were also used to minimize the potential effects of nonsampling error.

An important source of nonsampling error for a telephone survey is the failure to include persons who do not live in households with telephones. This is particularly problematic in random digit dial (RDD) surveys because so little is known about the sampled telephone numbers with which contact has not been made. Independent tabulations of the U.S. Census Bureau's March 2005 Current Population Survey (CPS) show that 92.9 percent of all children ages 6 and younger live in households with at least one landline telephone. Estimation procedures were used to help reduce the bias in the estimates associated with excluding the 6.5 percent of children who do not live in households with landline telephones.

A study conducted by Montaquila, Brick, and Brock (1997) examined telephone coverage bias for subsamples of the population in NHES:1996. This study found that with very few exceptions, the adjusted weights yielded estimates with absolute telephone coverage bias of 2 percent or less. Undercoverage bias for some subgroups may have been large due to larger proportions of persons in these subgroups residing in nontelephone households.

Another potential source of nonsampling error is respondent bias. Respondent bias occurs when respondents systematically misreport (intentionally or unintentionally) information in a study. There are many different forms of respondent bias. One of the best known is social desirability bias, which occurs when respondents give what they believe is the socially desirable response (DeMaio 1984). For example, surveys that ask whether respondents voted in the most recent election typically obtain a higher estimate of the number of people who voted than do voting records. Although respondent bias may affect the accuracy of the results, it does not necessarily invalidate other results from a survey. If there are no systematic differences among specific groups under study in their tendency to give socially desirable responses, then comparisons of the different groups will accurately reflect differences among the groups.


Response Rates

In the 2005 survey, Screener interviews were completed with 58,140 households, with a weighted Screener unit response rate of 66.9 percent. A screener was used to collect information on household composition and interview eligibility. ECPP interviews were completed for 7,209 children, for a weighted unit response rate of 84.4 percent and an overall estimated unit response rate (the product of the Screener unit response rate and the ECPP unit response rate) of 56.4 percent.
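The overall rate quoted above is the product of the two stage-level rates. The following is a minimal sketch of that arithmetic; the function name is illustrative only. Because it starts from the rounded published rates, its result can differ from the published 56.4 percent in the last digit.

```python
def overall_response_rate(screener_rate_pct, interview_rate_pct):
    """Overall unit response rate (percent) as the product of two stage rates."""
    return screener_rate_pct * interview_rate_pct / 100.0


# Rounded rates as published in this appendix.
overall = overall_response_rate(66.9, 84.4)
print(f"{overall:.2f}")  # 56.46
```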

An extensive unit nonresponse bias analysis was undertaken for NHES:2001 (see Brick et al. forthcoming). It is informative to consider the results of that analysis, since the NHES-ECPP:2001 survey was similar to NHES-ECPP:2005 in content, data collection procedures, and target population. The study involved an analysis of the effect of weighting on estimates, as well as an examination of the effect of various data collection procedures (refusal conversion, second refusal conversion, and varying numbers of call attempts) on the estimates. For each hypothetical data collection scenario considered in this study, the sample was reweighted, and the estimates were compared across scenarios. The analysis of unit nonresponse bias showed no evidence of bias in the estimates as the data collection "effort" was varied. While such an analysis is unable to directly examine bias due to the exclusion of cases that did not respond under any of the scenarios studied, other approaches have been used in NHES to evaluate that bias. In NHES:2001, these other approaches involved an examination of unit response rates as a whole and for various subgroups, an analysis to determine characteristics that are associated with Screener unit nonresponse, and a comparison of estimates based on adjusted and unadjusted weights. These investigations revealed no evidence of unit nonresponse bias. However, all such studies are limited in the variables that can be included; unit nonresponse bias may still be present in other variables that were not studied.

Item nonresponse (i.e., the failure to complete some items in an otherwise completed interview) was very low for most items in the NHES-ECPP:2005. The item nonresponse rate for most variables included in this report was 3 percent or lower. The only item with a nonresponse rate larger than 10 percent was household income (with an item response rate of 89.7 percent). Items with missing data were imputed using a hot-deck procedure (Rao and Shao 1992) in which cells are formed that contain cases with similar characteristics and a donor value is used to impute the missing value. The estimates included in this report are based on the imputed data. Users can employ the imputation flags to delete the imputed values, use alternative imputation procedures, or account for the imputation in computation of the reliability of the estimates produced from the dataset. For example, some users might wish to analyze the data with the missing values rather than the imputed values.
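The general idea of hot-deck imputation with imputation flags can be sketched as follows. This is an illustration only, not the NHES procedure: the choice of a random donor within each cell, the function name, and all variable names and values in the example records are assumptions made for the sketch.

```python
import random


def hot_deck_impute(records, target, cell_keys, seed=0):
    """Fill missing `target` values with a donor value from the same cell.

    Returns new records plus a parallel list of imputation flags, so that
    imputed values can later be identified or set back to missing.
    """
    rng = random.Random(seed)

    # Collect observed (donor) values for each imputation cell.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            cell = tuple(rec[k] for k in cell_keys)
            donors.setdefault(cell, []).append(rec[target])

    imputed, flags = [], []
    for rec in records:
        new_rec = dict(rec)  # leave the input records untouched
        flag = False
        if rec[target] is None:
            pool = donors.get(tuple(rec[k] for k in cell_keys))
            if pool:  # the value stays missing if the cell has no donor
                new_rec[target] = rng.choice(pool)
                flag = True
        imputed.append(new_rec)
        flags.append(flag)
    return imputed, flags


# Hypothetical records; names and values are invented for illustration.
records = [
    {"region": "NE", "parent_ed": "college", "income_cat": 5},
    {"region": "NE", "parent_ed": "college", "income_cat": None},
    {"region": "S", "parent_ed": "hs", "income_cat": 2},
    {"region": "S", "parent_ed": "hs", "income_cat": None},
]
imputed, flags = hot_deck_impute(records, "income_cat", ["region", "parent_ed"])
```

Keeping the flags in a separate list mirrors the use described above: an analyst who prefers the unimputed data can reset every flagged value to missing.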


Sampling Errors

The sample of telephone households selected for the 2005 survey is just one of many possible samples that could have been selected. Therefore, estimates produced from this sample may differ from estimates that would have been produced from other samples. This type of variability is called sampling error because it arises from using a sample of households with telephones, rather than surveying all households with telephones.

The standard error is a measure of the variability due to sampling when estimating a statistic; standard errors for estimates presented in this report were computed using a jackknife replication method. Standard errors can be used as a measure of the precision expected from a particular sample. The probability that a sample estimate would differ from the population parameter obtained from a complete census count by less than 1 standard error is about 68 percent. The chance that the difference would be less than 1.65 standard errors is about 90 percent, and that the difference would be less than 1.96 standard errors, about 95 percent.
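The 68, 90, and 95 percent figures correspond to the coverage of a normal distribution. The sketch below reproduces them under the assumption that the sampling distribution of the estimate is approximately normal; the function name is illustrative.

```python
import math


def normal_coverage(z):
    """Probability that a standard normal variable falls within +/- z."""
    return math.erf(z / math.sqrt(2.0))


for z in (1.0, 1.65, 1.96):
    print(f"within {z} standard errors: {normal_coverage(z):.3f}")
```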

Standard errors for all of the estimates are presented in the tables. These standard errors can be used to produce confidence intervals. For example, an estimated 60 percent of children from birth through age 5 and not yet in kindergarten have at least one regular nonparental care arrangement or early childhood program. This figure has an estimated standard error of 0.8 percent. Therefore, the estimated 95 percent confidence interval for this statistic is approximately 58 to 62 percent (60 ± 1.96 × 0.8). That is, if the processes of selecting a sample, collecting the data, and constructing the confidence interval were repeated, it would be expected that in 95 out of 100 samples from the same population, the confidence interval would contain the true participation rate.
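The interval in this example can be checked with a few lines of code. The sketch below uses the estimate and standard error quoted above; the function name is illustrative, and the same calculation applies to any estimate and standard error taken from the tables.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Return (lower, upper) bounds of estimate +/- z standard errors."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


lower, upper = confidence_interval(60.0, 0.8)
print(f"{lower:.1f} to {upper:.1f}")  # 58.4 to 61.6
```

Rounded to whole percentages, these bounds give the 58 to 62 percent interval reported in the text.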



All of the estimates in this report are based on weighting the observations using the probabilities of selection of the respondents and other adjustments to partially account for nonresponse and coverage bias. Weights were developed to produce unbiased and consistent estimates of national totals. The weight used in this E.D. TAB is FEWT, the weight variable used to estimate the characteristics of infants/toddlers and preschoolers. In addition to weighting the responses properly, special procedures for estimating the statistical significance of the estimates were employed because the NHES:2005 data were collected using a complex sample design. Complex sample designs result in data that violate some of the assumptions that are normally made when assessing the statistical significance of results from a simple random sample. For example, the standard errors of the estimates from these surveys may vary from those that would be expected if the sample were a simple random sample and the observations were independent and identically distributed random variables. Eighty replicate weights, FEWT1 to FEWT80, were used to produce estimates of the sampling errors of estimates. The estimates and standard errors presented in this report were produced using WesVar Complex Samples software and the jackknife 1 option as a replication procedure (Westat 2000).
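The replication approach can be sketched as follows. This is a simplified illustration of a JK1-style calculation, not a reproduction of the WesVar implementation: the data are synthetic, the delete-a-group construction of the replicate weights is an assumption (in practice the replicate weights FEWT1 to FEWT80 are supplied on the data file), and the function names are hypothetical.

```python
import random


def weighted_percent(values, weights):
    """Weighted percentage of records whose 0/1 indicator equals 1."""
    return 100.0 * sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jk1_standard_error(values, full_weights, replicate_weights):
    """Return (estimate, standard error) using the JK1 replication formula.

    variance = (G - 1) / G * sum over replicates of (replicate_est - full_est)^2
    """
    full_est = weighted_percent(values, full_weights)
    rep_ests = [weighted_percent(values, rw) for rw in replicate_weights]
    g = len(replicate_weights)
    variance = (g - 1) / g * sum((r - full_est) ** 2 for r in rep_ests)
    return full_est, variance ** 0.5


# Synthetic data standing in for survey records (not NHES data).
rng = random.Random(2005)
n, g = 400, 80
in_care = [1 if rng.random() < 0.6 else 0 for _ in range(n)]
fewt = [rng.uniform(500, 1500) for _ in range(n)]  # stand-in for FEWT

# Assumed construction: replicate r zeroes out one group of records and
# scales the remaining weights by G / (G - 1).
group = [i % g for i in range(n)]
replicates = [
    [0.0 if group[i] == r else fewt[i] * g / (g - 1) for i in range(n)]
    for r in range(g)
]

estimate, se = jk1_standard_error(in_care, fewt, replicates)
print(f"estimate = {estimate:.1f} percent, standard error = {se:.2f}")
```

Each replicate estimate is computed exactly like the full-sample estimate, only with a different weight column, which is why the same code path serves both.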