The target population for NHES is noninstitutionalized, civilian members of households in the 50 states and the District of Columbia. Because the topical surveys change from one NHES to the next, the specific age or grade criteria for the target populations also change. In general, there are three educational populations of interest: (1) children from birth through age 6, not yet enrolled in kindergarten; (2) school-aged children enrolled in kindergarten through grade 12, or homeschooled for the equivalent grades; and (3) adults not enrolled in 12th grade or below. The respondent is usually one of the following: the parent or guardian who is most knowledgeable about the education or care of the sampled child, the sampled youth, or the sampled adult.
NHES:2016 used a nationally representative address-based sample covering the 50 states and the District of Columbia. The survey was conducted by the U.S. Census Bureau from January through August 2016. The 2016 administration of NHES included a screener survey and three topical surveys: the Parent and Family Involvement in Education Survey, the Adult Training and Education Survey, and the Early Childhood Program Participation Survey. The screener survey asked for an enumeration of household members and was used to select an eligible household member for a topical survey.
All sampled households received initial contact by mail. While the majority of respondents completed paper questionnaires, a small sample of cases (n=35,000) was part of a Web experiment with mailed invitations to complete the survey online.
NHES:2012 also used an address-based sample; however, there was no option to complete the survey online. Prior to 2012, NHES used random digit dial (RDD) samples of landline telephones. Due to changes in the survey mode and item wording over the last few administrations, readers should use caution when comparing estimates with prior NHES administrations.
The 2016 household screener instrument was revised from the 2012 NHES to include a complete listing of all household members (up to 10) rather than just of children in the household.
Sampling Households. Several general sampling approaches have been taken with NHES, with the most recent being a two-stage address-based sampling approach. From 1995 to 2007 NHES used list-assisted RDD sampling, and the earliest administrations in 1991 and 1993 used a modified version of the Mitofsky-Waksberg RDD procedure.
NHES:2016. The first sampling stage selected 206,000 residential addresses; to increase the number of Black and Hispanic individuals in the sample, Black and Hispanic households were sampled at a higher rate than other households. Also, since ECPP-eligible children comprise a smaller portion of the population than PFI-eligible children, differential sampling in households with children in both domains was applied to ensure a sufficient sample size for the ECPP survey. The differential probabilities of selection (for households overall and also within households) are accounted for in the NHES weighting methodology.
Previous Administrations. For more details regarding the sampling of households in NHES:2012 and earlier, please refer to the corresponding Data File User’s Manual provided by the National Center for Education Statistics.
Approaches to household enumeration. The approach to screening households has changed over the course of the NHES program. Changes have been made in the methods of enumerating members of households that are contacted and the amount of information collected in the screener about the household and its members. In 2016, NHES screener questionnaires were sent by mail. The first question on the screener asked, “How many people live in this household?” Respondents were then asked to provide the name, birth month and year, sex, current education enrollment status, and current grade equivalent for each person in the household. A total of 115,342 completed screener questionnaires were returned, for a weighted response rate of 66.4 percent. In 2012, NHES screener questionnaires were also sent by mail. The first question on the screener asked, “Are there any youth or children age 20 or younger living in this household?” If the answer was no, respondents were instructed to return the questionnaire. If the answer was yes, respondents were instructed to provide name, age, sex, enrollment status, and grade level for each child or youth in the household. In prior administrations, household members were fully enumerated by phone interviewers.
Sampling within households. The within-household sample designs for the NHES collections are determined by the specific goals of the surveys administered and by the combination of surveys administered in a specific year. The number of people sampled per household varies across NHES administrations and modes, but no more than one person per survey was sampled within a household. Differential probabilities of selection in the within-household sampling are accounted for in the survey weights. Brief summaries of the within-household sampling for the various NHES administrations are given below, by year.
2016 NHES surveys. The NHES:2016 sample was selected using a two-stage address-based sampling frame. The first sampling stage selected residential addresses, and the second sampling stage selected an eligible sample person from information provided on the household mail screener. For all screeners and topical surveys, multiple follow-up attempts were made to obtain completed questionnaires from nonrespondents, and questionnaires were sent in both English and Spanish.
The respondent to the ATES questionnaire was the target respondent chosen after the screener survey. ATES questionnaires were completed for 47,744 adults, for a weighted response rate of 73.1 percent and an overall estimated weighted unit response rate (the product of the screener weighted unit response rate and the ATES unit weighted response rate) of 48.5 percent.
The respondent to the ECPP questionnaire was a parent or guardian in the household who knew about the sampled child. ECPP questionnaires were completed for 5,844 children, for a weighted unit response rate of 73.4 percent and an overall estimated weighted unit response rate (the product of the screener weighted unit response rate and the ECPP unit weighted response rate) of 48.7 percent.
The respondent to the PFI questionnaire was also a parent or guardian in the household who knew about the sampled child. The total number of completed PFI questionnaires was 14,075, for a weighted unit response rate of 74.3 percent and an overall estimated weighted unit response rate (the product of the screener weighted unit response rate and the PFI unit weighted response rate) of 49.3 percent, representing 53.2 million students when weighted to reflect national totals.
2012 NHES surveys. NHES:2012 used a similar sampling selection to that used in 2016. Screener questionnaires were completed by 99,590 households, for a weighted screener unit response rate of 73.8 percent. ECPP questionnaires were completed for 7,893 children, for a weighted unit response rate of 78.7 percent and an overall estimated weighted unit response rate (the product of the screener weighted unit response rate and the ECPP unit weighted response rate) of 58.1 percent. The total number of completed PFI questionnaires was 17,563, representing a population of 53.4 million students when weighted to reflect national totals.
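The overall response rates quoted above follow directly from the stated definition (screener rate multiplied by topical rate). The short Python sketch below reproduces that arithmetic from the rounded percentages given in the text. The function and variable names are illustrative only, and published rates may have been computed from unrounded inputs, so agreement to one decimal place is not guaranteed in general.

```python
def overall_response_rate(screener_rate: float, topical_rate: float) -> float:
    """Overall unit response rate in percent, given two stage rates in percent."""
    return screener_rate * topical_rate / 100.0


# Weighted screener unit response rate for NHES:2016, as reported in the text.
SCREENER_2016 = 66.4

# Weighted topical unit response rates for NHES:2016, as reported in the text.
topical_2016 = {"ATES": 73.1, "ECPP": 73.4, "PFI": 74.3}

overall_2016 = {
    survey: round(overall_response_rate(SCREENER_2016, rate), 1)
    for survey, rate in topical_2016.items()
}

# NHES:2012 ECPP: screener 73.8 percent, topical 78.7 percent.
overall_2012_ecpp = round(overall_response_rate(73.8, 78.7), 1)

print(overall_2016)
print(overall_2012_ecpp)
```

Because the overall rate is a product, a decline in the screener rate lowers every topical survey's overall rate, even if the topical rates themselves are unchanged.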
Prior to the NHES:2012 data collection, NHES program surveys were collected by Westat and used computer-assisted telephone interviewing (CATI). For the 2012 NHES survey, data collection was conducted by the U.S. Census Bureau utilizing printed mail surveys. The 2016 NHES survey data collection was also conducted by the U.S. Census Bureau, utilizing both printed mail surveys and online web surveys. A user’s manual for the 2016 administration is forthcoming.
Reference dates. Most questions in NHES that ask respondents to reference a period of time refer to the time of data collection or to the interval of time between the data collection and September of the school year for school-related activities, or the past 12 months for employment-related activities. Other items are asked retrospectively for recent time frames. For example, respondents may be asked about activities in the past week or past month.
Data collection. Data collection for the NHES surveys typically takes place over a 4- to 8-month period beginning in January of each survey year. The 2016 NHES data collection was conducted from January to September 2016 using mailed surveys and online web surveys, while data collections prior to 2012 used CATI, which required a shorter time frame. For NHES:2016, an address-based sample covering the 50 states and the District of Columbia was used. All sampled households received initial contact by mail. NHES screeners were then completed by adults at sampled addresses. An eligible household member, if any, was chosen from each returned screener. Then a topical survey was mailed either to the sampled adult or to the parent or guardian of the sampled child. Although the majority of respondents completed paper-and-pencil questionnaires, a small sample of households participating in NHES:2016 was part of a web experiment with mailed invitations to complete the survey online.
Editing. Intensive data editing is a feature of both the data collection and file preparation phases of the NHES collections. Data from the 2012 and 2016 mail surveys underwent a series of data processing procedures after receipt of the keyed questionnaire data. These procedures were data capture and imaging; the reformatting of keyed data; a preliminary interview status classification; a series of computer edits (to check that the data were in range, were consistent throughout a questionnaire record, and followed the correct skip pattern); school coding (where applicable); a final interview status classification; and a set of imputation procedures used to generate values for all appropriate questionnaire items with missing information. After imputation was completed, the editing procedures were repeated to ensure that no errors were introduced during imputation. Prior to NHES:2012, range checks for allowable values and logic checks for consistency between items were included in the online CATI interview so that many unlikely values or inconsistent responses could be resolved while the interviewer was speaking with the respondent.
The NHES surveys use weighting to adjust for the fact that the sampling method used is not simple random sampling. It is also used to adjust for potential undercoverage bias and potential unit nonresponse bias. Imputation is performed to compensate for item nonresponse. This section contains details on the 2012 and the 2016 data collections, which are designed similarly and use similar estimation methods.
Weighting. The objective of the NHES surveys is to make inferences about the entire noninstitutionalized, U.S. civilian population and about subgroups of interest. To accomplish this, weighting occurs in multiple stages: household-level weighting and person-level weighting, as described below.
Information from the screener was used to create the household-level base weights, including the probability of sampling each address from the sampling frame based on the race/ethnicity stratum and the probability of selection based on PO Boxes that were designated by the United States Postal Service (USPS) as the only way to get mail (OWGM) versus those PO Boxes that were not OWGM. The household weight was then adjusted for screener nonresponse. The PO Box adjustment was not used in the 2016 weighting adjustments; however, the race/ethnicity adjustments were.
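As a rough illustration of how a household base weight and a screener nonresponse adjustment fit together, the Python sketch below takes the base weight as the inverse of the address selection probability and then redistributes the weight of nonrespondents to respondents within the same stratum. This is a generic weighting-class adjustment and is an assumption for illustration: the text does not specify the adjustment cells or formula used in NHES, and the strata labels, selection probabilities, and function name are hypothetical.

```python
from collections import defaultdict


def household_weights(households):
    """Return nonresponse-adjusted household weights (0.0 for nonrespondents).

    Each household is a dict with keys:
      'stratum'   - sampling stratum label (hypothetical)
      'p_select'  - probability that the address was sampled
      'responded' - True if the screener was returned
    """
    total_base = defaultdict(float)  # sum of base weights, all sampled addresses
    resp_base = defaultdict(float)   # sum of base weights, respondents only
    for h in households:
        base = 1.0 / h["p_select"]   # base weight = inverse selection probability
        total_base[h["stratum"]] += base
        if h["responded"]:
            resp_base[h["stratum"]] += base

    weights = []
    for h in households:
        if h["responded"]:
            base = 1.0 / h["p_select"]
            # Inflate respondents so they also carry the nonrespondents' weight.
            factor = total_base[h["stratum"]] / resp_base[h["stratum"]]
            weights.append(base * factor)
        else:
            weights.append(0.0)
    return weights


# Hypothetical sample: stratum "A" is sampled at twice the rate of stratum "B".
sample = [
    {"stratum": "A", "p_select": 0.002, "responded": True},
    {"stratum": "A", "p_select": 0.002, "responded": False},
    {"stratum": "B", "p_select": 0.001, "responded": True},
    {"stratum": "B", "p_select": 0.001, "responded": True},
]
weights = household_weights(sample)
print(weights)
```

In this sketch the adjusted weights sum to the same total as the base weights within each stratum, so oversampled strata do not end up overrepresented in weighted estimates.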
Starting in 2012, a within-household sampling scheme was developed to control the number of persons sampled for topical questionnaires in each household, to limit respondent burden. Eligible children were selected to receive either the ECPP survey or the PFI-Enrolled or PFI-Homeschooled survey, with no household receiving more than one survey. Responses were then weighted using the probabilities of selection of the respondents and other adjustments to account for nonresponse and coverage bias; the weight used for PFI estimates represented the characteristics of the school-age children, and the weight used for ECPP estimates represented the characteristics of the children not yet enrolled in kindergarten.
The person-level weight was computed to account for five factors: the probability of sampling the person’s domain (ECPP, PFI, or ATES) in a given household, the probability of sampling the person from among all eligible persons in the household for the given domain (ECPP, PFI, or ATES), the probability of sampling a child in a joint custody arrangement at both parents’ addresses, nonresponse, and raking the nonresponse-adjusted person-level weights to national totals obtained using the number of children from the annual American Community Survey (ACS). ACS 2011 estimates were used for NHES:2012 and ACS 2015 estimates were used for NHES:2016. The Current Population Survey (CPS) was used for raking in prior NHES administrations, but ACS was used for NHES:2012 and NHES:2016 because its sample size was larger than that of CPS, allowing for more accurate control totals and greater precision in the NHES estimates. Please see NHES:2012 Data File User’s Manual (McPhee et al., 2015) for additional information.
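Raking, the final factor listed above, can be illustrated with a minimal Python sketch. It repeatedly scales the weights so that their sums match a set of control totals for each raking variable in turn, stopping when the adjustment factors are all close to 1. The raking variables ("sex" and "region"), the starting weights, and the control totals below are hypothetical; the actual NHES raking dimensions and ACS control totals are not given in this text.

```python
def rake(records, weights, margins, max_iter=100, tol=1e-10):
    """Rake weights to control totals by iterative proportional fitting.

    records - list of dicts holding each person's categories
    weights - starting (e.g., nonresponse-adjusted) weights
    margins - {variable: {category: control_total}}
    Assumes every category in `margins` has at least one record.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            sums = {cat: 0.0 for cat in targets}
            for rec, wi in zip(records, w):
                sums[rec[var]] += wi
            # Scale each weight so the category totals hit the controls.
            for i, rec in enumerate(records):
                factor = targets[rec[var]] / sums[rec[var]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return w


# Hypothetical persons, starting weights, and control totals.
records = [
    {"sex": "F", "region": "N"},
    {"sex": "F", "region": "S"},
    {"sex": "M", "region": "N"},
    {"sex": "M", "region": "S"},
]
start = [10.0, 20.0, 30.0, 40.0]
controls = {
    "sex": {"F": 40.0, "M": 60.0},
    "region": {"N": 45.0, "S": 55.0},
}
raked = rake(records, start, controls)
print(raked)
```

After raking, the weighted totals agree with the control totals on every raking variable at once, which is why larger and more precise control sources yield more stable final weights.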
For NHES surveys prior to 2012, only households with landline telephones were sampled. So that estimates would still represent the full target population, they were then adjusted to totals of persons living in both telephone and nontelephone households, derived from the CPS. CPS is an annual household survey conducted by the U.S. Bureau of the Census for the U.S. Bureau of Labor Statistics. Because of this adjustment, any undercoverage in CPS for special populations, such as the homeless, is also reflected in NHES estimates. The potential for bias due to sampling only telephone households had been examined for virtually all the population groups sampled in NHES. Generally, the bias in the estimates due to excluding nontelephone households was small in 2007 and earlier.
Imputation. Item response rates for most data items collected in NHES surveys are very high. Nevertheless, virtually all items with missing data (including “don’t know” and “refused” responses) are imputed in NHES surveys. For more extensive information on item response rates, etc., please refer to the NHES:2012 Data File User’s Manual (McPhee et al., 2015).
Imputations are done in the NHES program for three reasons. First, complete responses are needed for the variables used in developing the sampling weights. Second, data users compute estimates employing a variety of methods, and complete responses should aid their analysis. Third, imputation may reduce bias due to item nonresponse, by obtaining imputed values from donors that are similar to the recipients. The procedures for imputing missing data are discussed below.
A standard (random within-class) hot-deck procedure has been used to impute missing responses in every NHES data collection. In the hot-deck approach, the entire file is sorted into cells defined by characteristics of the respondents. The variables used in the sorting are general descriptors of the interview and include any variables involved in the skip pattern for the items. All of the observations are sorted into cells defined by the responses to the sort variables, and then divided into two classes within the cell depending on whether or not the item being imputed is missing. For an observation with a missing value, a value from a randomly selected donor (with the item completed) is used to replace the missing value. After the imputation is completed, edit programs are run to ensure that the imputed responses do not violate edit rules.
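The random within-class hot-deck described above can be sketched in a few lines of Python. Records are grouped into cells by their sort variables, and each missing value is replaced by the value from a randomly chosen donor in the same cell. The item name ("hours"), the sort variables, and the function name are hypothetical, and the sketch omits the post-imputation edit checks. Cases with no donor in their cell are left missing here, since the text indicates such cases are handled separately.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, item, sort_vars, rng):
    """Random within-class hot-deck imputation for one item.

    Returns (new_records, flags), where flags[i] is True if the value of
    `item` in record i was imputed. Missing values are represented as None.
    """
    # Build the donor pool for each cell defined by the sort variables.
    donors = defaultdict(list)
    for rec in records:
        if rec[item] is not None:
            donors[tuple(rec[v] for v in sort_vars)].append(rec[item])

    new_records, flags = [], []
    for rec in records:
        new = dict(rec)  # copy so the input records are not modified
        imputed = False
        if rec[item] is None:
            pool = donors.get(tuple(rec[v] for v in sort_vars))
            if pool:  # leave the value missing when the cell has no donor
                new[item] = rng.choice(pool)
                imputed = True
        new_records.append(new)
        flags.append(imputed)
    return new_records, flags


# Hypothetical records; "hours" is the item being imputed.
records = [
    {"enrolled": "yes", "grade_band": "K-5", "hours": 3},
    {"enrolled": "yes", "grade_band": "K-5", "hours": None},
    {"enrolled": "yes", "grade_band": "6-8", "hours": 5},
    {"enrolled": "no", "grade_band": "6-8", "hours": None},
]
imputed, flags = hot_deck_impute(
    records, "hours", ["enrolled", "grade_band"], random.Random(0)
)
print(imputed)
print(flags)
```

Passing an explicit random number generator keeps the imputation reproducible, and returning a flag per record mirrors the practice of marking imputed values so that users can identify them.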
For some items, the missing values are imputed manually rather than using the hot-deck procedure, for example, (1) to impute certain person-level demographic characteristics; (2) to correct for a small number of inconsistent imputed values; and (3) to impute for a few cases when no donors with matching sort variable values could be found.
Some person-level characteristics (age confirmation, household relationships, and child and parent language) were imputed manually because they typically involve complex relationships and/or constraints that require special attention to ensure consistency and reasonableness.
After values have been imputed for all observations with missing values, the distribution of the item prior to imputation (i.e., the respondents’ distribution) is compared to the post-imputation distribution of the imputed values alone and of the imputed values together with the observed values. This comparison is an important step in assessing the potential impact of item nonresponse bias and ensuring that the imputation procedure reduces this bias, particularly for items with relatively low response rates (less than 90 percent).
For each data item for which any values are imputed, an imputation flag variable is created so that users can identify imputed values. Users can employ the imputation flag to delete the imputed values, use alternative imputation procedures, or account for the imputation in computation of the reliability of the estimates produced from the dataset.
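A simple use of an imputation flag is to compare the distribution of an item among observed values only with the distribution among observed and imputed values combined. The Python sketch below does this for a small, made-up item. The values and flags are hypothetical, and the comparison is unweighted for brevity, whereas an actual analysis of NHES data would apply the survey weights.

```python
from collections import Counter


def distribution(values):
    """Return the share of each category among the given values."""
    counts = Counter(values)
    n = sum(counts.values())
    return {category: count / n for category, count in counts.items()}


# Hypothetical item values and their imputation flags (True = imputed).
values = ["yes", "no", "yes", "yes", "no"]
flags = [False, False, True, False, True]

observed = [v for v, f in zip(values, flags) if not f]
imputed_only = [v for v, f in zip(values, flags) if f]

dist_observed = distribution(observed)      # respondents' distribution
dist_imputed = distribution(imputed_only)   # imputed values alone
dist_all = distribution(values)             # observed and imputed together

print(dist_observed)
print(dist_imputed)
print(dist_all)
```

Large differences between the observed-only and combined distributions indicate that conclusions about the item are sensitive to how the missing values were filled in.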
As a result of declining response rates for all telephone surveys, and the increase in households that only or mostly use cellphones instead of landlines, the data collection method for 2012 was changed to a mail survey. The new design utilizes an address-based sample (ABS) and primarily collects data using a self-administered paper questionnaire that is mailed to sampled households. For more information about the mail data collection and ABS design, see NHES:2012 Data File User’s Manual (McPhee et al., 2015).
The next NHES data collection is planned for 2019.