Homeschooling in the United States: 2003

NCES 2006-042
February 2006

Methodology and Technical Notes

The National Household Education Surveys Program (NHES) is a telephone survey data collection program conducted by the U.S. Department of Education’s National Center for Education Statistics (NCES). Data collections have taken place from January through early May in 1991 and January through April in 1993, 1995, 1996, 1999, 2001, and 2003. When appropriately weighted, each sample is nationally representative of all persons in the target population in the 50 states and District of Columbia. The samples were selected using random-digit-dialing (RDD) methods, and the data were collected using computer-assisted telephone interviewing (CATI) technology.

Data from the 1999 and 2003 administrations of the NHES were used in this report. A screening interview administered to a member of the household age 18 or older was used to determine whether any children of the appropriate age lived in the household, to collect information on each child, and to identify the appropriate parent or guardian to respond for the sampled child. If one or two eligible children resided in the household, a parent interview was conducted about each child. If more than two eligible children resided in the household, generally two were sampled for extended interviews. Each interview was conducted with the parent or guardian most knowledgeable about the care and education of the sampled child. In about 80 percent of the cases in both years, this respondent was the child’s mother or female guardian. This report is based on a subset of the total sample, specifically children ages 5 to 17 with a grade equivalent of kindergarten to grade 12. The unweighted number of homeschooled students used in this analysis was 275 in 1999 and 239 in 2003. The unweighted number of nonhomeschooled students was 16,833 in 1999 and 11,755 in 2003.

Response Rates

Screening interviews were administered to all households and were completed with 57,278 households in 1999 and 32,049 households in 2003, yielding screener response rates of 74 percent and 65 percent, respectively. During the screener, children were identified and sampled for the parent interview. Parent interviews were completed for 88 percent of the sampled children in 1999 and 83 percent of sampled children in 2003. The response rate for the entire sample is calculated by taking the product of the proportion of completed screeners and the proportion of completed parent interviews for sampled children: 65 percent in 1999 (0.74 × 0.88 = 0.65) and 54 percent in 2003 (0.65 × 0.83 = 0.54).
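
The overall response rate calculation can be reproduced with a few lines of Python. This is a minimal sketch; the function name is illustrative and not part of any NHES software.

```python
def overall_response_rate(screener_rate: float, interview_rate: float) -> float:
    """Multiply the screener completion proportion by the
    parent-interview completion proportion."""
    return screener_rate * interview_rate


# Proportions quoted in the text for each survey year.
rate_1999 = overall_response_rate(0.74, 0.88)
rate_2003 = overall_response_rate(0.65, 0.83)

print(f"1999 overall response rate: {rate_1999:.2f}")
print(f"2003 overall response rate: {rate_2003:.2f}")
```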

The estimated response rate for homeschooled students in 1999 was 63 percent compared to 65 percent for nonhomeschooled students. In 2003, the estimated response rate for homeschooled students was 51 percent, compared to 54 percent for nonhomeschooled students. A response bias analysis comparing weights adjusted for nonresponse to those not adjusted for nonresponse showed no evidence of bias among key NHES estimates and subgroups (U.S. Department of Education, forthcoming).

Item nonresponse (the failure to complete some items in an otherwise completed interview) was very low. The item nonresponse rates for most variables in this report were less than 2 percent, except for household income, which was about 10 percent in both survey years. All items with missing responses (i.e., don’t know, refused, or not ascertained) were imputed using an imputation method called a hot-deck procedure (Kalton and Kasprzyk 1986). As a result, no missing values remain in the data.

Data Reliability

NHES estimates are subject to two types of errors: sampling errors and nonsampling errors. Nonsampling errors are errors made in the collection and processing of data. Sampling errors occur because the data are collected from a sample rather than a census of the population.

Nonsampling Errors

Nonsampling error is the term used to describe variations in the estimates that may be caused by population coverage limitations and data collection, processing, and reporting procedures. The sources of nonsampling errors are typically problems like unit and item nonresponse, the differences in respondents’ interpretations of the meaning of the questions, response differences related to the particular time the survey was conducted, the tendency for respondents to give socially desirable responses, and mistakes in data preparation.

In general, it is difficult to identify and estimate either the amount of nonsampling error or the bias caused by this error. For each NHES survey, efforts were made to prevent such errors from occurring and to compensate for them where possible. For instance, during the survey design phase, cognitive interviews were conducted to assess respondents’ knowledge of the topics, comprehension of questions and terms, and the sensitivity of items. The design phase also entailed extensive staff testing of the CATI instrument and a pretest in which several hundred interviews were conducted.

An important nonsampling error for a telephone survey is failure to include persons who do not live in households with home telephones. The NHES only samples households with home telephones (i.e., fixed, or land-line, telephones for home use). As of 2000, approximately 5 percent of households in the United States did not have a home telephone (footnote 3). Weighting adjustments using characteristics related to telephone coverage were used to reduce the bias in the estimates associated with children who do not live in households with telephones. Weighting adjustments were also used to adjust for nonresponse and for the oversampling of households with Blacks and Hispanics. Finally, the person-level weights are developed using a cross between race/ethnicity of the child and household income categories; a cross between Census region and urbanicity; and a cross between home tenure (own or rent) and age or grade of child.

Sampling Errors

The sample of households with home telephones selected for each NHES survey is just one of many possible samples that could have been selected from all households with telephones. Therefore, estimates produced from each NHES survey may differ from estimates that would have been produced from other samples. This type of variability is called sampling error because it arises from using a sample of households with telephones rather than all households with telephones.

The standard error is a measure of the variability due to sampling when estimating a statistic; standard errors for estimates presented in this report were computed using a jackknife replication method. Standard errors can be used as a measure of the precision expected from a particular sample. The probability that the sample estimate would differ from a complete census count by less than 1 standard error is about 68 percent. The chance that the difference would be less than 1.65 standard errors is about 90 percent; and that the difference would be less than 1.96 standard errors, about 95 percent.
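
The percentages quoted above follow from treating the sampling distribution of an estimate as approximately normal, which is an assumption of this sketch rather than something the report states explicitly. Under that assumption, the probability that an estimate falls within z standard errors of the census value can be computed from the error function in Python's standard library. The function name is illustrative.

```python
import math


def normal_coverage(z: float) -> float:
    """Probability that a standard normal variable lies within
    plus or minus z (assumes an approximately normal sampling distribution)."""
    return math.erf(z / math.sqrt(2))


for z in (1.0, 1.65, 1.96):
    print(f"within {z} standard errors: {normal_coverage(z):.1%}")
```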

Standard errors for all of the estimates presented in this report are available in tabular form in appendix A (see List of Tables for links to the Appendix A Standard Error tables). These standard errors can be used to produce confidence intervals. For example, an estimated 1.7 percent of students were homeschooled in 1999. This percentage has an estimated standard error of 0.14 percent. Therefore, the estimated 95 percent confidence interval for this statistic is 1.42 to 1.97 percent (1.7 +/- 1.96*0.14). That is, in 95 out of 100 samples using the same sample design, the estimated participation rate, rounded to the nearest tenth, should fall between 1.4 and 2.0 percent.
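
The confidence interval arithmetic can be sketched as follows. The function name is illustrative. Note that with the rounded inputs shown in the text (1.7 and 0.14), the lower bound comes out near 1.43 rather than the reported 1.42; the reported figure may reflect unrounded inputs, which is an assumption here and not something the report confirms.

```python
def confidence_interval(estimate: float, standard_error: float, z: float = 1.96):
    """Return (lower, upper) bounds: estimate +/- z * standard_error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Rounded 1999 homeschooling rate and standard error from the text.
low, high = confidence_interval(1.7, 0.14)
print(f"95 percent CI: {low:.2f} to {high:.2f} percent")
print(f"Rounded to the nearest tenth: {low:.1f} to {high:.1f} percent")
```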

Statistical Tests

The tests of significance used in this analysis are based on two-tailed Student’s t statistics. All differences cited in this report are significant at the 0.05 level of significance. In addition, tests for effect size were used for most distributions. Except for the comparison of homeschooling rates, which are used to describe a unique subset of the population and are central to the analysis, differences of less than 5 percent are not reported. For the logistic regression, effect size was calculated by dividing the log odds beta coefficient by 1.81 (Chinn 2000). Logistic regression findings that are significant at the 0.05 level of significance with an effect size of 0.2 and greater are reported.
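
The reporting rule for logistic regression findings can be expressed as a short sketch. The function names are illustrative, and taking the absolute value of the effect size (so that negative coefficients are treated symmetrically) is an assumption of this example, not something the text specifies.

```python
LOG_ODDS_DIVISOR = 1.81  # divisor cited in the text (Chinn 2000)


def logistic_effect_size(beta: float) -> float:
    """Convert a log odds coefficient to an effect size by dividing by 1.81."""
    return beta / LOG_ODDS_DIVISOR


def is_reported(beta: float, p_value: float,
                alpha: float = 0.05, min_effect: float = 0.2) -> bool:
    """True if the finding is significant at alpha and its effect size
    is at least min_effect in magnitude (absolute value is an assumption)."""
    return p_value < alpha and abs(logistic_effect_size(beta)) >= min_effect


print(is_reported(beta=0.5, p_value=0.01))  # effect size of about 0.28
print(is_reported(beta=0.2, p_value=0.01))  # effect size of about 0.11
```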


Derived Variables

Many of the variables used in the analysis for this report were derived from other variables in the public-use data files. In most cases, variables that had more than four response categories were collapsed into four or fewer categories to accommodate the small number of sampled homeschoolers. This procedure of collapsing response categories ensured that the number of sampled homeschoolers was appropriate for statistical analysis.


Homeschooled

In this report, students are defined as being homeschooled if: 1) their parents reported them being schooled at home instead of a public or private school, 2) their enrollment in a public or private school did not exceed 25 hours a week if they were being homeschooled part-time, and 3) they were not being homeschooled solely because of a temporary illness (footnote 4). The construction of this measure combines answers from the questions listed below with answers to questions about reasons for homeschooling. The definition of homeschooling used in this report was intended to include rather than exclude students based on the data available to identify homeschooled students. Researchers wishing to apply different criteria to define homeschooled students may produce different results.

PB2. Some parents decide to educate their children at home rather than send them to school. Is (CHILD) being schooled at home?
  Yes (GO TO PB3)
PB3. So (CHILD) is being schooled at home instead of at school for at least some classes or subjects?
  Yes (GO TO PB4)
PB4. Is (CHILD) getting all of (his/her) instruction at home, or is (he/she) getting some at school and some at home?
  All at home
  Some at school & some at home (GO TO PB5)
PB5. How many hours each week does (CHILD) usually go to a school for instruction? Please do not include time spent in extracurricular activities.
  Hours: ____
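
The three criteria in the definition, applied to the answers to the questions above, can be sketched as a single function. The parameter names are hypothetical and do not correspond to variable names in the public-use data files.

```python
MAX_SCHOOL_HOURS = 25  # weekly cut-off for part-time homeschoolers, from the text


def is_homeschooled(schooled_at_home: bool,
                    all_instruction_at_home: bool,
                    weekly_school_hours: float = 0.0,
                    only_temporary_illness: bool = False) -> bool:
    """Apply the report's three criteria (parameter names are hypothetical)."""
    # Criterion 1: parent reports the child is schooled at home (PB2/PB3).
    if not schooled_at_home:
        return False
    # Criterion 3: exclude children homeschooled solely for a temporary illness.
    if only_temporary_illness:
        return False
    # Criterion 2: part-time homeschoolers may attend school at most 25 hours (PB4/PB5).
    if not all_instruction_at_home and weekly_school_hours > MAX_SCHOOL_HOURS:
        return False
    return True


print(is_homeschooled(True, False, weekly_school_hours=10))
print(is_homeschooled(True, False, weekly_school_hours=30))
```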

Grade equivalent

If students were enrolled in school for 9 or more hours per week, parents were asked to identify what grade or year their child was attending. If students were homeschooled and not attending school for 9 or more hours per week, or if parents responded that students were ungraded or in special education, then parents were asked to identify their child’s grade equivalent. In this report, a student’s grade equivalent is either the actual grade in which the student was enrolled or the grade equivalent reported by the parent. One measure of grade equivalent used in the report had three categories: kindergarten to grade 5, grades 6 to 8, and grades 9 to 12. Another measure had five categories to show more detail in the kindergarten to grade 5 category: kindergarten, grades 1 to 3, grades 4 to 5, grades 6 to 8, and grades 9 to 12. In 2003, 0.02 percent of students had a grade equivalent of “ungraded.” This statistic was 0.03 percent in 1999.
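
The two grade-equivalent groupings can be sketched as one function with a switch for the more detailed measure. Coding kindergarten as 0 is an assumption made for this example, not the coding used in the data files.

```python
def grade_category(grade: int, detailed: bool = False) -> str:
    """Collapse a grade equivalent (0 = kindergarten, an assumed coding)
    into the three-category or five-category measure."""
    if not 0 <= grade <= 12:
        raise ValueError("grade must be between 0 (kindergarten) and 12")
    if grade >= 9:
        return "grades 9 to 12"
    if grade >= 6:
        return "grades 6 to 8"
    if not detailed:
        return "kindergarten to grade 5"
    # The detailed measure splits kindergarten to grade 5 into three groups.
    if grade >= 4:
        return "grades 4 to 5"
    if grade >= 1:
        return "grades 1 to 3"
    return "kindergarten"


print(grade_category(3))
print(grade_category(3, detailed=True))
```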

Race and ethnicity

Parents were asked to identify the race and ethnicity of sampled children. In 1999 and 2003, race was a mutually exclusive variable. Hispanic ethnicity was determined separately. The categories of race and ethnicity used in this report are Black, non-Hispanic, meaning the child was identified as Black but not Hispanic; White, non-Hispanic, meaning the child was identified as White but not Hispanic; Hispanic, meaning the child was identified as Hispanic and of any race; and Other, meaning the child was not identified as Hispanic and not identified as Black or White.
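
The combined race and ethnicity categories can be sketched as follows. The input values ("Black", "White", and so on) are illustrative labels, not the codes used in the data files.

```python
def race_ethnicity_category(race: str, hispanic: bool) -> str:
    """Combine a single race value with a separate Hispanic indicator."""
    # Hispanic children are grouped together regardless of race.
    if hispanic:
        return "Hispanic"
    if race == "Black":
        return "Black, non-Hispanic"
    if race == "White":
        return "White, non-Hispanic"
    return "Other"


print(race_ethnicity_category("White", hispanic=True))
print(race_ethnicity_category("Asian", hispanic=False))
```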

Number of children in the household

The number of children in the household was derived by adding the sampled child (one) to the total number of other children in the household. This report collapsed the number of children into three categories: One child, meaning the sampled child was the only child in the household; two children, meaning the household contained the sampled child and another child; and three or more children, meaning the household contained the sampled child plus two or more other children.
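
The derivation above amounts to adding one to the count of other children and collapsing the total. A minimal sketch, with an illustrative function name:

```python
def children_category(other_children: int) -> str:
    """Add the sampled child to the other children and collapse the total."""
    if other_children < 0:
        raise ValueError("other_children cannot be negative")
    total = 1 + other_children  # the sampled child counts as one
    if total == 1:
        return "One child"
    if total == 2:
        return "Two children"
    return "Three or more children"


print(children_category(0))
print(children_category(4))
```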

Number of parents in the household

Parents include birth, adoptive, step or foster parents in the household. If two such parents were in the household, the number of parents living in the household was two. If one such parent was in the household, the number of parents living in the household was one. If no such parents were in the household, the number of parents was none and any adult responsible for the sampled child was referred to as a nonparent guardian.

Parents’ participation in the labor force

Parents include birth, adoptive, step or foster parents in the household or nonparent guardians in the household. Parents were considered to be in the labor force if they were working full-time (35 hours or more per week) or part-time (less than 35 hours per week) or if they were actively looking for work during the time of the interview. If parents did not meet these criteria, they were classified as not in the labor force.
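
The labor force rule for a single parent or guardian can be sketched as below. The parameter names are hypothetical, and representing employment as weekly hours worked is an assumption of this example.

```python
FULL_TIME_HOURS = 35  # threshold from the text


def labor_force_status(weekly_hours: float, looking_for_work: bool) -> str:
    """Classify one parent or guardian (parameter names are hypothetical)."""
    if weekly_hours >= FULL_TIME_HOURS:
        return "In labor force: full-time"
    if weekly_hours > 0:
        return "In labor force: part-time"
    if looking_for_work:
        return "In labor force: looking for work"
    return "Not in labor force"


print(labor_force_status(40, looking_for_work=False))
print(labor_force_status(0, looking_for_work=False))
```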

Household income

Household income is reported as a range. These ranges were collapsed into the following four categories for this report.
$25,000 or less
$25,001 to $50,000
$50,001 to $75,000
$75,001 or more
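
The four income categories can be sketched as a lookup on category boundaries. The survey records income as a range, so passing a single whole-dollar amount is an assumption made here for illustration.

```python
def income_category(income_dollars: int) -> str:
    """Map a whole-dollar household income (an assumed input) to the
    four categories used in the report."""
    if income_dollars <= 25_000:
        return "$25,000 or less"
    if income_dollars <= 50_000:
        return "$25,001 to $50,000"
    if income_dollars <= 75_000:
        return "$50,001 to $75,000"
    return "$75,001 or more"


print(income_category(25_000))
print(income_category(62_500))
```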

Parents’ highest educational attainment

Parents’ highest educational attainment is a composite variable that indicates the highest level of education for the students’ parents (birth, adoptive, or step) or nonparent guardians who reside in the household. The variable used in this report has four attainment categories: High school diploma or less, which includes high school equivalency degrees; Voc/tech degree or some college, which includes associate’s degrees; Bachelor’s degree; and Graduate/professional school, which includes some graduate coursework in addition to degree completion.


Urbanicity

This variable categorizes the household ZIP Code as urban or rural. The definitions for these categories are taken from the 1990 and 2000 Census of Population. An urban place comprises densely settled territory that has a minimum population of 50,000 people. The specific density and distance requirements are defined in the Federal Register, Vol. 67, No. 84. Areas not classified as urban are classified as rural. Since a ZIP Code can cut across geographic areas, the urbanicity variable is classified into the category that has the largest number of persons. For example, if a ZIP Code has 5,000 persons in the category “urban” and 1,200 persons in the category “rural,” it is classified as “urban.”
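
The largest-population rule for ZIP Codes that span categories can be sketched as follows. The function name and input layout are illustrative. The text does not say how ties are resolved; this sketch simply returns the first category with the maximum count.

```python
from typing import Dict


def classify_zip(persons_by_category: Dict[str, int]) -> str:
    """Return the category containing the largest number of persons.
    Tie handling is not specified in the text; max() keeps the first maximum."""
    if not persons_by_category:
        raise ValueError("at least one category is required")
    return max(persons_by_category, key=persons_by_category.get)


# Example from the text: 5,000 urban persons and 1,200 rural persons.
print(classify_zip({"urban": 5000, "rural": 1200}))
```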


Region

This variable identifies the Census region in which the subject child lives. The variable was created by linking states and telephone area codes of sampled numbers and then grouping the states into regions. The following states and the District of Columbia are in each Census region:

Northeast: Connecticut, Massachusetts, Maine, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont
South: Alabama, Arkansas, District of Columbia, Delaware, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia
Midwest: Iowa, Illinois, Indiana, Kansas, Michigan, Minnesota, Missouri, North Dakota, Nebraska, Ohio, South Dakota, Wisconsin
West: Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, New Mexico, Nevada, Oregon, Utah, Washington, Wyoming
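
The grouping of states into regions can be represented as a lookup table. This sketch covers only the state-to-region step; the linkage from telephone area codes to states is not shown, and the names used are illustrative.

```python
CENSUS_REGIONS = {
    "Northeast": ["Connecticut", "Massachusetts", "Maine", "New Hampshire",
                  "New Jersey", "New York", "Pennsylvania", "Rhode Island",
                  "Vermont"],
    "South": ["Alabama", "Arkansas", "District of Columbia", "Delaware",
              "Florida", "Georgia", "Kentucky", "Louisiana", "Maryland",
              "Mississippi", "North Carolina", "Oklahoma", "South Carolina",
              "Tennessee", "Texas", "Virginia", "West Virginia"],
    "Midwest": ["Iowa", "Illinois", "Indiana", "Kansas", "Michigan",
                "Minnesota", "Missouri", "North Dakota", "Nebraska", "Ohio",
                "South Dakota", "Wisconsin"],
    "West": ["Alaska", "Arizona", "California", "Colorado", "Hawaii", "Idaho",
             "Montana", "New Mexico", "Nevada", "Oregon", "Utah",
             "Washington", "Wyoming"],
}

# Invert the table so each state (or DC) maps to its region.
STATE_TO_REGION = {
    state: region
    for region, states in CENSUS_REGIONS.items()
    for state in states
}


def census_region(state: str) -> str:
    """Look up the Census region for a state name as spelled in the list."""
    return STATE_TO_REGION[state]


print(census_region("Ohio"))
```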

3 Special tabulations from the Current Population Survey, 2000.
4 The 25-hour cut-off translates to about 80 percent of the average school week, according to the 1999–2000 Schools and Staffing Survey. In 1999, about 1 percent of students who were homeschooled part time were excluded from the homeschooled student category because of the 25-hour cut-off. In 2003, no sampled students who were homeschooled part time were excluded from the homeschooled student category because of the 25-hour cut-off.