
Dropout Rates in the United States, 1996


Definition of Who Is a Dropout

      There are variations in the dropout definitions in the existing data sources, including the Current Population Survey (CPS), the High School and Beyond Study (HS&B), and the National Education Longitudinal Study of 1988 (NELS:88). In addition, the age or grade span examined and the type of dropout rate (status, event, or cohort) vary across the data sources. Furthermore, there were potentially significant changes in CPS procedures in 1986, 1992, and 1994.

      The dropout collection through the NCES Common Core of Data (CCD) is designed to be consistent with the current CPS procedures. However, the CCD collection includes all dropouts in grades 7 through 12 versus only grades 10 through 12 in CPS; it is based on administrative records rather than a household survey as in CPS; and it counts anyone receiving a GED outside of a regular (approved) secondary education program as a dropout, as opposed to the CPS approach of counting GED certificate holders as high school completers.

      One of the concerns addressed in the NCES CCD data collection on dropouts is the development and implementation of a nationally consistent definition of a dropout to be used in school districts and state departments of education. Currently, there is considerable variation across local, state, and federal data collections on such issues as:

There will, no doubt, be some discontinuities in dropout reporting as the new and more consistent data become available.

Defining and Calculating Event Dropout Rates Using the CCD

      The Common Core of Data (CCD) administered by NCES is an annual survey of the state-level education agencies in the 50 states, the District of Columbia, and the outlying areas. Statistical information is collected on public schools, staff, students, and finance.

      A dropout data collection component was field tested during the 1989-90 school year. The participants were in approximately 300 school districts that included representatives from 27 states and two territories. The data were gathered through administrative records maintained at school districts and schools. The field test data were used to inform the design of a dropout statistics component for CCD.

      In the CCD dropout data collection, the event of dropping out is the focus of the collection. A school dropout is defined as an individual who was enrolled in school at some time during the previous year, was not enrolled at the beginning of the current school year, had not graduated from high school or completed an approved educational program, and did not meet any of the following exclusionary conditions:

For the purpose of this definition:

      This new collection was initiated with a set of instructions to state CCD coordinators in the summer of 1991. Those instructions specified the details of dropout data to be collected during the 1991-92 school year. Dropouts, like graduates, are reported for the preceding school year. The 1991-92 data were submitted to NCES as a component of the 1992-93 CCD data collection. Most recently, the 1994-95 data were submitted as a component of the 1995-96 CCD.

Defining and Calculating Dropout Rates Using the CPS

Event Rates

      The October Supplement to the CPS is the only current national data source that can be used to estimate annual national dropout rates. As a measure of recent dropout experiences, the event rate measures the proportion of students who dropped out over a one-year interval of time.

      The numerator of the event rate for 1996 is the number of persons 15 through 24 years old surveyed in 1996 (grades 10-12) who were enrolled in high school in October 1995, were not enrolled in high school in October 1996, and who also did not complete high school (that is, had not received a high school diploma or an equivalency certificate) between October 1995 and October 1996.

      The denominator of the event rate is the sum of the dropouts (that is, the numerator) and the number of all persons 15 through 24 years old who attended grades 10, 11, and 12 last year who are still enrolled or who graduated or completed high school last year.

      The dropout interval is defined to include the previous summer and the current school year; so that once a grade is completed, the student is then at risk of dropping out of the next grade. Given that the data collection is tied to each young adult's enrollment status in October of two consecutive years, any students who drop out and return within the 12-month period are not counted as dropouts.
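      The numerator and denominator described above can be combined as in the following minimal sketch. The function name and the weighted counts are hypothetical and are used only to illustrate the arithmetic; they are not CPS estimates.

```python
def event_dropout_rate(dropouts, still_enrolled, completers):
    """Return the event dropout rate as a percentage.

    dropouts       -- weighted count for the numerator
    still_enrolled -- enrolled last October and still enrolled this October
    completers     -- enrolled last October and graduated or completed since
    """
    # The denominator is the dropouts plus everyone from last year's
    # grades 10-12 who is still enrolled or who completed high school.
    denominator = dropouts + still_enrolled + completers
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * dropouts / denominator


# Hypothetical weighted counts in thousands (illustrative only).
example_event_rate = event_dropout_rate(dropouts=50, still_enrolled=700, completers=250)
print(f"{example_event_rate:.1f} percent")
```

      Because the dropouts appear in both the numerator and the denominator, the rate is the share of last year's grade 10-12 students who left without completing.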

Status Rates

      The status dropout rate is a cumulative rate that estimates the proportion of young adults who are dropouts, regardless of when they dropped out.

      The numerator of the status rate for 1996 is the number of young adults ages 16 through 24 years of age who, as of October 1996, have not completed high school and are not currently enrolled. The denominator is the total number of 16- through 24-year-olds in October 1996.
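      As an illustration of the status rate calculation, the following sketch applies the definition to a small set of person records. The record layout, field names, and weights are assumptions made for this example and do not reflect the CPS file format.

```python
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    completed_high_school: bool  # diploma or equivalency certificate
    enrolled: bool               # currently enrolled in school
    weight: float = 1.0          # hypothetical survey weight


def status_dropout_rate(people):
    """Percentage of 16- through 24-year-olds who have not completed
    high school and are not currently enrolled."""
    in_scope = [p for p in people if 16 <= p.age <= 24]
    total = sum(p.weight for p in in_scope)
    if total <= 0:
        raise ValueError("no 16- through 24-year-olds in the data")
    dropouts = sum(
        p.weight
        for p in in_scope
        if not p.completed_high_school and not p.enrolled
    )
    return 100.0 * dropouts / total


# Hypothetical records; the 15-year-old falls outside the age range.
sample_people = [
    Person(17, completed_high_school=False, enrolled=False),
    Person(18, completed_high_school=True, enrolled=False),
    Person(20, completed_high_school=False, enrolled=True),
    Person(24, completed_high_school=True, enrolled=False),
    Person(15, completed_high_school=False, enrolled=False),
]
example_status_rate = status_dropout_rate(sample_people)
print(f"{example_status_rate:.1f} percent")
```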

CPS Design

      CPS is a nationally representative sample survey of all households. The survey is conducted in approximately 60,000 dwelling units in 729 primary sampling units. Dwelling units are in-sample for four successive monthly interviews, out-of-sample for the next 8 months, and then returned to the sample for the following four months. The sample frame is a complete list of dwelling-unit addresses at the Census updated by demolitions and new construction and field listings. The population surveyed excludes members of the Armed Forces, inmates of correctional institutions, and patients in long-term medical or custodial facilities; it is referred to as the civilian, non-institutionalized population. Typically, about 4 percent of dwelling units are not interviewed, because occupants are not at home after repeated callbacks, or for some other reason.
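      The rotation pattern described above (four months in sample, eight months out, four months in) can be sketched as follows. Counting months from zero at the first interview is a convention chosen for this example.

```python
def in_sample(months_since_entry):
    """True if a dwelling unit is interviewed in the given month,
    where month 0 is the month of its first interview."""
    first_spell = 0 <= months_since_entry <= 3     # four successive interviews
    second_spell = 12 <= months_since_entry <= 15  # after eight months out
    return first_spell or second_spell


interview_months = [m for m in range(16) if in_sample(m)]
print(interview_months)
```

      One consequence of this pattern is that a unit interviewed in a given calendar month during its first spell is interviewed again in the same calendar month one year later.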

      An adult member of each household serves as the informant for that household, supplying data for each member of the household. In addition, supplementary questions regarding school enrollment are asked about eligible household members 3 years old and over. Some interviews are conducted by phone using computer-assisted telephone interviewing.

CPS Dropout Data Collection

      CPS data on educational attainment and enrollment status in the current year and prior year are used to identify dropouts; and additional CPS data are used to describe some basic characteristics of dropouts. The CPS provides the only source of national time series data on dropout rates. However, because CPS collects no information on school characteristics and experiences, its uses in addressing dropout issues are primarily for providing some insights into who drops out. In addition, the sample design of the CPS yields estimates for Hispanics that tend to have large standard errors, which make it difficult to understand patterns in Hispanic dropout rates.

Changes Introduced in 1986

      In an effort to improve data quality, in 1986 the Bureau of the Census instituted new editing procedures for cases with missing data on school enrollment items. The effect of the editing changes was evaluated for data from 1986 by applying both the old and new editing procedures. The result was an increase in the number of students enrolled in school and a decrease in the number of students enrolled last year but not enrolled in the current year. The new editing procedures lowered, but not significantly, the 1986 event rate for grades 10-12, ages 14 through 24, by about 0.4 percentage points, from 4.69 percent to 4.28 percent. The changes in the editing procedures made even less of a difference in the status dropout rates for 16- through 24-year-olds (12.2 percent based on the old procedures and 12.1 percent based on the new).

Changes Introduced in 1992

      Prior to 1992, educational attainment was based on the control card questions on highest grade attended and completed. Identification as a high school graduate was derived based on attendance and completion of grade 12.

      The control card items used to identify educational attainment were:

      The 1992 redesign of the CPS introduced a change in the data used to identify high school completers. Dropout data from the CPS are now based on a combination of control card data on educational attainment and October Supplement data on school enrollment and educational attainment. In 1992 the Census Bureau changed the items on the control card that measured each individual's educational attainment.

      The October CPS Supplement items used to identify dropouts include the following:

      The new control card educational attainment item is as follows:

      Educational attainment status is now based on the response to the control card item. The following response categories are used for high school:

      Students whose highest grade completed is the 9th, 10th, or 11th grade are assumed to have dropped out in the next grade.

      The following response categories are used to identify high school completers:

      Although the response categories are not automatically read to each respondent, they can be used as a prompt to help clarify the meaning of a question or a response. Identification as a high school completer is based on the direct response to the new control card educational item.

      Differences in the pre- and post-1992 methods of identifying high school completers come from the observation that not all 12th-grade completers receive a high school diploma or equivalent, and not all holders of a high school diploma or certificate complete the 12th grade. These differences have an impact on the numbers and proportions of event and status dropouts.

      Differences in event rates. In the case of the event rate, in prior years students who completed 12th grade and left school without graduation or certification were counted as completers when they were in fact dropouts. On the other hand, some students who left school because they completed high school before the 12th grade were identified as dropouts when they were really early completers (e.g., those who passed the California Challenge Exam, received a GED certificate, or were admitted early to college) [41]. The current use of actual graduation or completion status includes the first group as dropouts and the second group as completers.

      Compared with the earlier procedure, the numerator of the event dropout rate now includes 12th graders who did not receive a credential of some sort and excludes the early completers. The denominator is not changed.

      The net effect of these changes is small, resulting in an increase in the aggregate event dropout rate that is not significant. In 1992, the October CPS included both versions of the educational attainment items: the old items based on the number of years of school completed and the new one based on the more accurate response categories [42]. Using the old items, the estimated event rate for 1992 was 4.0 percent, compared with a rate of 4.4 percent using the new educational attainment item.

      Differences in the status rate. The status rate involves a third group of students who were miscoded prior to 1992. These students leave high school before completing the 12th grade, never complete the 12th grade, but later graduate or complete high school by some alternative means, such as an equivalency exam. Prior to 1992 these young adults were coded as dropouts. Since 1992, members of this group have been coded as graduates or completers. Furthermore, the explicit inclusion of high school graduation or completion, including the GED (e.g., "GED") as a response category may have increased the likelihood of identifying late completers.

      Under the procedures introduced in 1992, the 12th graders who do not complete high school or the equivalent are added to the numerator of the status dropout rate and early and late completers are subtracted from the numerator. The denominator is not changed. These changes, especially the identification and removal of late completers from the dropout count, contributed to a decrease in the status dropout rate. Indeed, using years of school completed rather than the new educational attainment item, the status rate in 1992 rises to 11.4 percent rather than the 11.0 percent based on the new educational attainment item. However, the estimate of 11.4 percent is still much lower than the status rate for 1991 (12.5 percent). While this could represent real change in the status dropout rate, the fact that this would be the largest decrease in the status dropout rate seen in the time series data from 1972 to 1995, coupled with the fact that the rate for 1993 also was 11.0 percent, leads one to speculate that the introduction of the new educational attainment item resulted in more accurate data on educational attainment throughout the survey, including the variables that had been used to calculate the number of years of school completed.

      Special education students. One exception to the procedures to identify dropouts in CPS is the categorization of special education students. In principle, efforts are made by the Census Bureau to identify special education students in special schools and treat them as not enrolled. However, if special education students are not identified, they may be reported as completing 12th grade with no diploma. If this happens, they will, by definition, be counted as dropouts.

Changes Introduced in 1994

      During the 1994 data collection and processing two additional changes were implemented in the CPS. Computer-assisted telephone interviewing was introduced, resulting in higher completion rates for each individual data item and thus less reliance on allocation of missing responses. If the allocation procedures yielded a distribution different from the 1994 reported patterns, there is the potential for a change in the distribution of the high school completion status.

      In 1994 there were also changes introduced in the processing and computing phase of data preparation. The benchmarking year for these survey estimates was changed from the 1980 Census to the 1990 Census, and adjustments for undercount in the 1990 Census were included. Thus, any age, sex, or racial-ethnic groups that were found to be under-represented in the 1990 Census are given increased weights. An analysis of the effect of the changes in the benchmarking year using the 1993 data indicates that the change especially affected the weights assigned to Hispanic young adults (table B1).

These changes have the potential for affecting both the numerator and denominator of the dropout rates. Analyses of the 1993 data show that the change in the benchmark year for the sample weights increased the male and Hispanic status and event dropout rates, while having little effect on the white or black rates (table B2).

      Table B2 also shows that overall the change in control years had a larger impact on status rates than on event rates. Using the 1990 controls increases the event rate by only 1.3 percent, but raises the status rate by 3.2 percent, from 11.0 percent to 11.4 percent.

Defining and Calculating High School Completion Rates Using the CPS

      The educational attainment and high school completion status data from the October CPS are also used to measure the high school graduation and completion rates.

      In years prior to 1974, completion rates were reported in a series of separate two-year age groups, but no overall rates comparable to the event and status dropout rates were computed. The completion rate computed and published first in 1994 is for the young adult population in the years beyond high school, that is, the 18- to 24-year-old population. These rates are reported nationally by race-ethnicity. At the state level, three-year moving averages are computed to yield more stable estimates.
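      A three-year moving average of the kind mentioned for the state-level rates can be sketched as follows. The text does not say whether the annual rates are averaged directly or the underlying samples are pooled, so the simple unweighted average over the three years ending in a given year is an assumption, and the rates shown are hypothetical.

```python
def three_year_average(rates_by_year, end_year):
    """Unweighted mean of the annual rates for the three years
    ending in end_year (an assumed form of the moving average)."""
    years = (end_year - 2, end_year - 1, end_year)
    missing = [y for y in years if y not in rates_by_year]
    if missing:
        raise KeyError(f"no rate available for {missing}")
    return sum(rates_by_year[y] for y in years) / 3


# Hypothetical annual completion rates (percent) for one state.
state_rates = {1994: 84.0, 1995: 86.0, 1996: 88.0}
example_average = three_year_average(state_rates, end_year=1996)
print(f"{example_average:.1f} percent")
```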

As was noted in the text, the state completion rates reflect the experiences of the 18- to 24-year-olds living in the state at the time of the interview; thus, movements in and out of states to accommodate employment and postsecondary education may be evident in some states. For example, a state with a relatively large unskilled labor workplace sector might have a lower high school completion rate than anticipated, due to an influx of young workers. Conversely, a state with a disproportionate number of colleges and universities might have a higher high school completion rate than anticipated, due to an influx of postsecondary students.

Increases in GED rates

      The section on completion indicated that there was a substantial increase in the last few years in the estimate of the percentage of 18- to 24-year-olds getting GEDs. In 1993 it was only 4.9 percent, but it rose to 7.0 percent in 1994, 7.7 percent in 1995, and 9.8 percent in 1996. Although the standard errors on these estimates are fairly large, the absolute change is also quite large. The large increases in 1994 and 1995 came at the time that CPS instituted CATI in 1994.

      The American Council on Education, which administers the GED, produces annual reports on the number of persons taking the GED and the number of persons who were issued a GED credential. From these reports it is possible to calculate the number of 18- to 24-year-olds who received a GED in the past year for 1990 through 1995. It is also possible to estimate the same quantity from the CPS data for 1990 to 1995 by looking at only those who were reported to have completed a GED last year and using this, along with the GED item, to calculate how many 18- to 24-year-olds obtained GEDs each year. The CPS numbers for 1994 and 1995 are much closer to the estimates from the American Council on Education than previous years (figure B1).

Definition of Family Income in CPS

      Family income is derived from a single question asked of the household respondent. Income includes money income from all sources including jobs, business, interest, rent, social security payments, and so forth. The income of nonrelatives living in the household is excluded, but the income of all family members 14 years old and over, including those temporarily living away, is included. Family income refers to receipts over a 12-month period.

      Income for families from which no income information was obtained (about 5 percent of families) was imputed. A sequential hot deck procedure was used. A total of 200 imputation classes were created: 5 levels of the age of head of household by 5 levels of the education of the head of household by 2 levels for the employment status of the head of household by 4 levels of the number of workers in the household. To minimize the multiple use of a single donor, up to 5 donors were placed in each imputation class. A donor was selected at random from these when a family with missing income information was encountered. In a few instances (about 10 of 50,000 families in each year) an imputation class had no donors but a family from the class with missing income information was encountered. In these cases a donor was selected by collapsing similar classes until a non-empty imputation class was created.
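      The general shape of a sequential hot deck with a limited donor pool can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the record layout, the function name, and the rule that the pool keeps the five most recent donors are hypothetical, and the collapsing of similar classes for an empty pool is not implemented.

```python
import random
from collections import defaultdict, deque
from itertools import product

# 5 age levels x 5 education levels x 2 employment levels x 4 worker levels.
N_CLASSES = len(list(product(range(5), range(5), range(2), range(4))))


def hot_deck(records, max_donors=5, seed=0):
    """Fill missing incomes from donors in the same imputation class.

    records -- iterable of (imputation_class, income), income None if missing
    """
    rng = random.Random(seed)
    # Each class keeps at most max_donors recently seen reported incomes.
    donors = defaultdict(lambda: deque(maxlen=max_donors))
    filled = []
    for cls, income in records:
        if income is None:
            pool = donors[cls]
            # An empty pool would call for collapsing similar classes;
            # that step is omitted here and the value is left missing.
            income = rng.choice(list(pool)) if pool else None
        else:
            donors[cls].append(income)
        filled.append((cls, income))
    return filled


# Hypothetical records: class tuples are (age, education, employment, workers).
example_records = [
    ((0, 0, 0, 0), 3),
    ((0, 0, 0, 0), None),  # filled from the only donor in its class
    ((1, 0, 0, 0), None),  # no donor yet in this class
]
example_filled = hot_deck(example_records)
print(N_CLASSES, example_filled)
```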

      To facilitate comparisons over time, the categorical family income information was transformed into a continuous family income variable. The transformation was accomplished by randomly assigning for each family an income value from the income interval to which their income belonged. For intervals below the median a rectangular probability density function was used; for those above the median a Pareto probability density function was used. The methodology has a feature that if the continuous family income variable were transformed back to a categorical family income variable, the value for each family would be identical to the original data. Based on the continuous family income variable, a family income percentile variable is calculated for each person in the survey which represents that person's position in the family income distribution. For example, if 25 percent of all persons have a lower value of family income (and 75 percent have a higher value), then the person's family income percentile variable has a value of 25. The methodology gives all persons in the same household the same value of both the categorical and continuous versions of family income.

      There are several issues that affect the interpretation of dropout rates by family income using the CPS. First, it is possible that the family income of the students at the time they dropped out was somewhat different than their current family income. (The problem is potentially greatest with status dropouts who could have dropped out several years ago.)

      Furthermore, family income is from a single question asked of the household respondent in the October CPS. In some cases, there are persons 15 through 24 years old living in the household who are unrelated to the household respondent, yet whose family income is defined as the income of the family of the household respondent. Therefore, the current household income of the respondent may not accurately reflect that person's family background. In particular, in 1991 some of the dropouts in the 15- through 24-year age range were not still living in a family unit with a parent present. However, an analysis of 1991 status dropout rates by family income, race-ethnicity, and family status (presence of parent in the household) indicates that the bias introduced by persons not living in their parents' household is small (table B3). The status dropout rates for black and white persons were similar with or without the parent present. For example, 20.6 percent of low income blacks without a parent present were dropouts compared with 21.3 percent of those living in their parents' household. In addition, the relationship between dropout rates and income held within each racial category regardless of whether the person was living in a household with his or her parent. That is, blacks and whites within income levels dropped out at similar levels-with or without the parent present. However, this was not true of Hispanics. Hispanics in upper income levels not residing with either parent were more likely than upper income Hispanics with parents present to be status dropouts.
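      The transformation from categorical to continuous family income described earlier in this section, and the percentile variable derived from it, can be sketched as follows. The interval boundaries, the Pareto shape parameter, the use of a Pareto density truncated to each interval, and the treatment of the interval containing the median are all assumptions made for this example; none of them is given in the text.

```python
import bisect
import math
import random

# Hypothetical interval lower bounds in dollars; the last interval is open-ended.
BOUNDS = [0, 10_000, 20_000, 40_000, 75_000]


def category(income):
    """Index of the income interval that contains the given value."""
    return bisect.bisect_right(BOUNDS, income) - 1


def draw_income(cat, median_cat, rng, alpha=2.0):
    """Draw a continuous income inside interval cat.

    Intervals below median_cat use a rectangular (uniform) density; the
    others use a Pareto density truncated to the interval. median_cat
    must be at least 1 so that Pareto intervals have a positive lower bound.
    """
    lo = BOUNDS[cat]
    hi = BOUNDS[cat + 1] if cat + 1 < len(BOUNDS) else math.inf
    u = rng.random()  # uniform on [0, 1)
    if cat < median_cat:
        return lo + u * (hi - lo)
    tail = (lo / hi) ** alpha  # 0.0 for the open-ended top interval
    # Inverse CDF of a Pareto distribution truncated to [lo, hi).
    return lo / (1 - u * (1 - tail)) ** (1 / alpha)


def percentile(value, all_values):
    """Percentage of persons whose family income is lower than value."""
    return 100.0 * sum(v < value for v in all_values) / len(all_values)


rng = random.Random(1)
draws = [(cat, draw_income(cat, median_cat=2, rng=rng)) for cat in range(5)]
# Converting each draw back to a category recovers the original category.
round_trip_ok = all(category(x) == cat for cat, x in draws)
print(round_trip_ok, percentile(2, [1, 2, 3, 4]))
```

      Because every draw stays inside its own interval, converting the continuous value back to a category reproduces the original data, which is the property noted in the text.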

Definition of Geographic Regions in CPS

      There are four Census regions used in this report: Northeast, Midwest, South, and West. The Northeast consists of Maine, New Hampshire, Vermont, Massachusetts, Connecticut, Rhode Island, New York, New Jersey, and Pennsylvania. The Midwest consists of Ohio, Indiana, Illinois, Michigan, Wisconsin, Iowa, Minnesota, Missouri, North Dakota, South Dakota, Nebraska, and Kansas. The South consists of Delaware, Maryland, Washington D.C., Virginia, West Virginia, North Carolina, South Carolina, Georgia, Florida, Kentucky, Tennessee, Alabama, Mississippi, Arkansas, Louisiana, Oklahoma, and Texas. The West consists of Montana, Idaho, Wyoming, Colorado, New Mexico, Arizona, Utah, Nevada, Washington, Oregon, California, Alaska, and Hawaii.

Definition of Immigration Status in CPS

      Immigration status was derived from a variable on the control card inquiring about the citizenship status of the reference person:

Citizen Status:

1 = Native, born in the U.S.

2 = Native, born in Puerto Rico or U.S. outlying area

3 = Native, born abroad of American parent or parents

4 = Foreign born, U.S. citizen by naturalization

5 = Foreign born, not a citizen of the U.S.

Those coded "1" above (Native, born in U.S.) were considered born in United States. All others were considered foreign born. (Less than 1 percent of Hispanics were born abroad of American parents.)
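The recoding rule above can be written as a small lookup. The dictionary and function names are hypothetical; the codes and labels are those listed above.

```python
CITIZEN_STATUS = {
    1: "Native, born in the U.S.",
    2: "Native, born in Puerto Rico or U.S. outlying area",
    3: "Native, born abroad of American parent or parents",
    4: "Foreign born, U.S. citizen by naturalization",
    5: "Foreign born, not a citizen of the U.S.",
}


def born_in_united_states(code):
    """Only code 1 is treated as born in the United States;
    codes 2 through 5 are treated as foreign born."""
    if code not in CITIZEN_STATUS:
        raise ValueError(f"unknown citizen status code: {code}")
    return code == 1


recoded = {code: born_in_united_states(code) for code in CITIZEN_STATUS}
print(recoded)
```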

Imputation for Item Non-Response

      For many key items in the October CPS, the Bureau of the Census imputes data for cases with missing data due to item non-response. However, for some of the items that were used in this report, item non-response was not imputed by the Bureau of the Census. Special imputations were conducted for these items using a sequential hot deck procedure implemented through the PROC IMPUTE computer program developed by the American Institutes for Research [43]. Three categories of age, two categories of race, two categories of sex, and two categories of citizenship were used as imputation cells.

Defining and Calculating Cohort Dropout Rates Using NELS:88

      The NELS:88 baseline comprised a national probability sample of all regular public and private 8th-grade schools in the 50 states and District of Columbia in the 1987-88 school year. Excluded from the NELS:88 sample were Bureau of Indian Affairs schools, special education schools for the handicapped, area vocational schools that do not enroll students directly, and schools for dependents of U.S. personnel overseas; such school-level exclusions have a very small impact on national estimates.

      NELS:88 started with the base-year data collection in which students, parents, teachers, and school administrators were selected to participate in the survey. NELS:88 began with a target sample of 1,032 sample schools, of which 30 were deemed ineligible. Some 698 of the 1,002 eligible schools agreed to participate in the study. Given the longitudinal nature of the study, the initial school response rate of 69.7 percent was deemed too low to yield acceptable levels of schools, administrators, teachers, parents, and most importantly, students. To address this concern, a sample of sister schools was selected and 359 replacement schools were identified and added to the study. Responses were obtained from 1,057 schools, thus increasing the school response rate to 77.7 percent (1,057/(1,002+359)). Usable student data were received for 1,052 of the schools.
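      The school response rates quoted above follow directly from the counts in the paragraph, as this short check shows. Only the variable names are introduced here.

```python
target_schools = 1032
ineligible_schools = 30
eligible_schools = target_schools - ineligible_schools  # 1,002

initial_participants = 698
replacement_schools = 359
responding_schools = 1057

# Initial rate: participating schools over eligible schools.
initial_rate = 100 * initial_participants / eligible_schools
# Final rate: responding schools over eligible plus replacement schools.
final_rate = 100 * responding_schools / (eligible_schools + replacement_schools)

print(f"{initial_rate:.1f} percent, {final_rate:.1f} percent")
```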

      The total eighth-grade enrollment for the 1,052 NELS:88 sample schools was 202,996. During the listing procedures (before 24-26 students were selected per school), 5.35 percent of the students were excluded because they were identified by school staff as being incapable of completing the NELS:88 instruments owing to limitations in their language proficiency or to mental or physical disabilities. Ultimately, 93 percent or 24,599 of the sample students participated in the base-year survey in the spring of 1988.

      The NELS:88 first follow-up survey was conducted in the spring of 1990. Students, dropouts, teachers, and school administrators participated in the followup, with a successful data collection effort for approximately 93 percent of the base-year student respondents. In addition, because the characteristics and education outcomes of the students excluded from the base year may differ from those of students who participated in the base-year data collection, a special study was initiated to identify the enrollment status of a representative sample of the base-year ineligible students. Data from this sample were then combined with first and second follow-up data for the computation of 8th- to 10th-grade, 10th- to 12th-grade, and 8th- to 12th-grade cohort dropout rates.

      The second follow-up survey was conducted in the spring of 1992. Students, dropouts, parents, teachers, and school administrators participated in this followup. Approximately 91 percent of the sample of students participated in the second follow-up survey, with 88 percent of the dropouts responding.

      The second follow-up High School Transcript Study was conducted in the fall of 1992. Transcript data spanning the three or four years of high school (9th or 10th through 12th grades) were collected for 1) students attending, in the spring of 1992, schools sampled for the second follow-up school administrator and teacher surveys [44]; 2) all dropouts and dropouts in alternative programs who had attended high school for a minimum of one term; 3) all early graduates, regardless of school contextual sample type; and 4) triple ineligibles enrolled in the twelfth grade in the spring of 1992, regardless of school affiliation. Triple ineligibles are sample members who were ineligible (due to mental or physical handicap or language barrier) for the base year, first follow-up, and second follow-up surveys. The transcript data collected from schools included student-level data (e.g., number of days absent per school year, standardized test scores) and complete course-taking histories. Complete high school course-taking records were, of course, obtained only for those transcript survey sample members who graduated by the end of the spring term of 1992; incomplete records were collected for sample members who had dropped out of school, had fallen behind the modal progression sequence, or were enrolled in a special education program requiring or allowing more than twelve years of schooling.

      A total of 1,287 contextual schools and 256 non-contextual schools responded to the request for transcripts. Reasons cited by school staff for not complying with the request included: inadequate permission for transcript release (some schools required parental permission for the release of minors' transcripts); no record of the sample member or no course-taking record because of brevity of enrollment; insufficient staff for transcript preparation (despite offers of remuneration for preparation costs); and archiving or transfer of sample member records. Student coverage rates were 89.5 percent for the total transcript sample and 74.2 percent for the dropout/alternative completers.

      Missing from the cohort rates from NELS:88 is anyone who had dropped out prior to the spring of their eighth-grade year. Thus, the overall cohort rates reported here may be lower than they would have been if a younger cohort were used. This may be particularly important for Hispanics, given that CPS data show that Hispanic dropouts tend to have completed less schooling than other dropouts. The cohort rates also reflect the school enrollment status of both eligible and ineligible non-participants and participants, to the extent that this information could be obtained.

The following definition of a dropout was employed in NELS:88:

1. an individual who, according to the school (if the sample member could not be located) or according to the school and home, is not attending school (i.e., has not been in school for 4 consecutive weeks or more and is not absent due to accident or illness); or

2. a student who has been in school less than 2 weeks after a period in which he or she was classified as a dropout.

      Thus, a student who was a temporary dropout (stopout) who was found by the study to be out of school for 4 consecutive school weeks or more and had returned to school (that is, had been back in school for a period of at least 2 weeks at the time of survey administration in the spring of 1990) would not be classified as a dropout for purposes of the cohort dropout rates reported here.
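      The two-part definition and the treatment of stopouts can be expressed as a simple rule. The function and parameter names are hypothetical, and reducing a sample member's history to these few fields is a simplification made for illustration.

```python
def is_nels_dropout(in_school, weeks_out=0, excused=False,
                    prior_dropout_spell=False, weeks_back=0):
    """Apply the two-part NELS:88 dropout definition.

    in_school           -- attending school at the time of the survey
    weeks_out           -- consecutive weeks out of school (if not attending)
    excused             -- absence is due to accident or illness
    prior_dropout_spell -- previously classified as a dropout
    weeks_back          -- weeks back in school since that spell
    """
    if not in_school:
        # Part 1: out for 4 consecutive weeks or more, not for accident or illness.
        return weeks_out >= 4 and not excused
    # Part 2: returned after a dropout spell but back for less than 2 weeks.
    return prior_dropout_spell and weeks_back < 2


# A stopout who has been back for at least 2 weeks is not counted.
stopout = is_nels_dropout(in_school=True, prior_dropout_spell=True, weeks_back=2)
recent_return = is_nels_dropout(in_school=True, prior_dropout_spell=True, weeks_back=1)
long_absence = is_nels_dropout(in_school=False, weeks_out=6)
print(stopout, recent_return, long_absence)
```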

      The basic NELS:88 procedure for identification of a dropout was to confirm school-reported dropout status with the student's household. For the first followup, dropout status was obtained first from the school and then confirmed with the household for 96.4 percent of the dropouts. Thus only 3.6 percent of the dropouts were identified by only school-reported information. For the second followup, 4.9 percent of the dropouts were identified by only school-reported information.

      The 1988-1990 dropout rate requires data from both 1988 and 1990. As a result, the size of the sample used in computing the 1988 to 1990 rate is tied to the size of the sample in 1990. Many students changed schools between 1988 and 1990. Because of the costs associated with following small numbers of students to many schools, a subsampling operation was conducted at the time of the first followup. Of the 24,599 students who participated in 1988, 20,263 students were sampled, and 130 were found to be out of scope (due to death or migration out of the country). The dropout rates from 1988-1990 reflect the experiences of 20,133 sample cases. Some 1,088 sample cases dropped out and 19,045 sample cases continued in school.

      The 1990-1992 rate starts from the 19,045 student sample cases. Some 91 of the student sample cases from 1990 were identified as out of scope in 1992. The dropout rates from 1990 to 1992 reflect the experiences of 18,954 student sample cases.

      The 1988-1992 rates reflect the experiences of 20,070 student sample cases. These cases result from the 20,263 student cases subsampled in 1990, less the 92 cases that were out of scope in both 1990 and 1992, less the 91 student sample cases identified as out of scope in 1992, less the 10 dropout sample cases identified as out of scope in 1992. Note that 24 student sample cases who were out of the country in 1990 returned to school in the U.S. by spring 1992, and an additional 14 student sample cases who were out of the country in spring 1990 returned to the U.S. by spring 1992 but did not reenroll (dropouts). In addition, 354 student sample cases who dropped out between 1988 and 1990 returned to school by spring 1992.
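      The sample counts behind the three cohort rates can be checked with simple arithmetic. The short sketch below uses only the counts quoted in the text; the variable names are illustrative.

```python
# Counts quoted in the text (unweighted sample cases).
subsampled_1990 = 20_263          # sampled for the first followup
out_of_scope_1990 = 130           # death or migration out of the country

# Base for the 1988-1990 rate.
base_1988_1990 = subsampled_1990 - out_of_scope_1990
dropouts_1988_1990 = 1_088
continuing_1990 = 19_045

# Base for the 1990-1992 rate: students still enrolled in 1990,
# less those found out of scope in 1992.
base_1990_1992 = continuing_1990 - 91

# Base for the 1988-1992 rate.
base_1988_1992 = subsampled_1990 - 92 - 91 - 10

print(base_1988_1990, base_1990_1992, base_1988_1992)
```

      The three bases come to 20,133, 18,954, and 20,070, matching the figures reported above, and the 1,088 dropouts plus the 19,045 continuing students account for all 20,133 cases in the 1988-1990 base.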

HS&B Calculation of Cohort Dropout Rates

      In HS&B, students are reported as having either a regular diploma or some alternative credential, described as the equivalent of a high school diploma. The estimate that 7 percent of the high school completers from the class of 1982 held alternative credentials by 1986 refers to a comparison of alternative completers with all regular diploma recipients. The estimates of a 16.6 percent dropout rate and an 8.2 percent alternative completion rate by 1986 are based on a comparison of on-time regular diploma recipients versus all other completers. The difference in the last two estimates is due to the fact that they are computed from two differently derived variables on the public use data files.

Accuracy of Estimates

      The estimates in this report are derived from samples and are subject to two broad classes of error: sampling and nonsampling error. Sampling errors occur because the data are collected from a sample of a population rather than from the entire population. Estimates based on a sample will differ somewhat from the values that would have been obtained from a universe survey using the same instruments, instructions, and procedures. Nonsampling errors come from a variety of sources and affect all types of surveys, universe as well as sample surveys. Examples of sources of nonsampling error include design, reporting, and processing errors, and errors due to nonresponse. The effects of nonsampling errors are more difficult to evaluate than those that result from sampling variability. As much as possible, procedures are built into surveys in order to minimize nonsampling errors.

      In reporting sample survey data, estimates based on unweighted sample sizes less than 30 are not displayed. The standard error is a measure of the variability due to sampling when estimating a parameter. It indicates how much variance there is in the population of possible estimates of a parameter for a given sample size. Standard errors can be used as a measure of the precision expected from a particular sample. The probability that a complete census would differ from the sample by less than the standard error is about 68 out of 100. The chances that the difference would be less than 1.65 times the standard error are about 90 out of 100; that the difference would be less than 1.96 times the standard error, about 95 out of 100.
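      The 68, 90, and 95 out of 100 figures follow from the normal distribution. The sketch below assumes that the sampling distribution of an estimate is approximately normal; the function names are illustrative and the interval example uses made-up numbers rather than figures from this report.

```python
from statistics import NormalDist


def coverage(multiplier):
    """Probability that a normal estimate falls within
    `multiplier` standard errors of the true value."""
    return 2.0 * NormalDist().cdf(multiplier) - 1.0


def confidence_interval(estimate, std_error, multiplier=1.96):
    """Interval of estimate plus or minus multiplier * standard error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


for m in (1.0, 1.65, 1.96):
    print(m, round(100 * coverage(m)))

# Hypothetical estimate of 10.0 percent with a standard error of 0.5.
print(confidence_interval(10.0, 0.5, multiplier=2.0))
```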

Standard errors for rates and number of persons based on CPS data were calculated using the following formulas:

Standard errors for the estimates in the tables appear in appendix A.

      In October of 1991, the Bureau of the Census released new b parameters for 1988 and 1990. With the release of the new parameters, the Bureau of the Census also made adjustments to the parameters for earlier years. Therefore, for some years, the standard errors presented in the appendix tables here differ from the standard errors presented in earlier reports.

Methodology and Statistical Procedures

      The comparisons in the text have all been tested for statistical significance to ensure that the differences are larger than those that might be expected due to sampling variation. Two types of comparisons have been made in the text.

      Differences in two estimated percentages. The Student's t statistic can be used to test the likelihood that the differences between two percentages are larger than would be expected by sampling error.

$$ t = \frac{P_1 - P_2}{\sqrt{se_1^2 + se_2^2}} $$

where $P_1$ and $P_2$ are the estimates to be compared and $se_1$ and $se_2$ are their corresponding standard errors.
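      For two independent estimates, the t statistic is commonly computed as the difference between the estimates divided by the square root of the sum of their squared standard errors. The sketch below assumes that form and that the two estimates are independent; the function name and the input values are hypothetical.

```python
import math


def t_statistic(p1, p2, se1, se2):
    """t for the difference between two independent percentages."""
    return (p1 - p2) / math.hypot(se1, se2)  # hypot = sqrt(se1^2 + se2^2)


# Made-up example: rates of 12.0 and 9.0 percent with
# standard errors of 0.3 and 0.4 percentage points.
t = t_statistic(12.0, 9.0, 0.3, 0.4)
print(t, abs(t) > 1.96)
```

      In this made-up case the combined standard error is 0.5, so a 3-point difference is well beyond the 1.96 threshold for a single comparison.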

      As the number of comparisons on the same set of data increases, the likelihood that the t value for at least one of the comparisons will exceed 1.96 simply due to sampling error increases. For a single comparison, there is a 5 percent chance that the t value will exceed 1.96 due to sampling error. For five tests, the risk of getting at least one t value that high increases to 23 percent and for 20 comparisons, 64 percent.
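      The 23 percent and 64 percent figures can be reproduced if the comparisons are treated as independent, which is an assumption of this sketch. The chance of at least one false positive across k tests is then 1 - (1 - alpha)^k.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance that at least one of `num_tests` independent tests
    exceeds the critical value through sampling error alone."""
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 20):
    print(k, round(100 * familywise_error_rate(k)))
```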

      One way to compensate for this danger when making multiple comparisons is to adjust the alpha level to take into account the number of comparisons being made. For example, rather than establishing an alpha level of 0.05 for a single comparison, the alpha level is set to ensure that the likelihood is less than 0.05 that the t value for any of the comparisons exceeds the critical value by chance alone when there are truly no differences for any of the comparisons. This Bonferroni adjustment is calculated by taking the desired alpha level and dividing by the number of possible comparisons, based on the variable(s) being compared. The t value corresponding to the revised, lower alpha level must be exceeded in order for any of the comparisons to be considered significant. For example, to test for differences in dropout rates between whites, blacks, and Hispanics, the following steps would be involved:
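      One way to carry out such an adjustment for a three-group example is sketched below. It assumes that every pairwise comparison among the groups is counted, that the tests are two-sided, and that the sample is large enough to use the standard normal distribution in place of the t distribution; the function name is illustrative.

```python
from statistics import NormalDist


def bonferroni_critical_value(num_groups, alpha=0.05):
    """Return (number of pairwise comparisons, adjusted alpha,
    two-sided critical value under a normal approximation)."""
    num_comparisons = num_groups * (num_groups - 1) // 2
    adjusted_alpha = alpha / num_comparisons
    critical = NormalDist().inv_cdf(1.0 - adjusted_alpha / 2.0)
    return num_comparisons, adjusted_alpha, critical


# Three groups give three pairwise comparisons.
comparisons, adjusted_alpha, critical = bonferroni_critical_value(3)
print(comparisons, round(adjusted_alpha, 4), round(critical, 2))
```

      With three groups the alpha level for each comparison drops from 0.05 to about 0.0167, and the value that t must exceed rises from 1.96 to roughly 2.39.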

      All comparisons in this report were tested using the Bonferroni adjustment for the t tests. Where categories of two variables were involved, the number of comparisons used to make the Bonferroni adjustment was based on the relationship(s) being tested.

      Trends. Regression analysis was used to test for trends across age groups and over time. Regression analysis assesses the degree to which one variable (the dependent variable) is related to a set of other variables (the independent variables). The estimation procedure most commonly used in regression analysis is ordinary least squares (OLS). While some of the trends span the entire period from 1972 to 1995, many of the rates reached a high point during the late 1970s. Thus, most of the descriptions that refer to "since the late 1970s" use 1978 as a starting point.

      The analyses in this report were conducted on the event rates, status rates, and completion rates. The event rate and status rate estimates were used as dependent measures in the analysis with a variable representing time and a dummy variable controlling for changes in the editing procedure (0 = years 1968 to 1986, 1 = 1987 to 1995) used as independent variables. However, in these data some of the observations were less reliable than others (i.e., some years' standard errors were larger than other years'). In such cases OLS estimation procedures do not apply and it is necessary to modify the regression procedures to obtain unbiased regression parameters. The modification that is usually recommended transforms the observations to variables which satisfy the usual assumptions of ordinary least squares regression and then applies the usual OLS analysis to these variables.

      This modification was carried out using the data manipulation and regression capabilities of Microsoft Excel. Each of the variables in the analysis was transformed by dividing it by the standard error of the relevant year's rate (event or status). The new dependent variable was then regressed on the new time variable and the new editing-change dummy variable. All statements about trends in this report are statistically significant at the 0.05 level.
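      The transformation described above is a form of weighted least squares. The following is a minimal pure-Python sketch of the idea, not a reproduction of the spreadsheet analysis. It assumes that the intercept column is also divided by the standard error, that time is measured in years since the first year, and that the editing-change dummy switches to 1 in 1987; the function names and the demonstration data are made up.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_trend_fit(years, rates, std_errors, edit_change_year=1987):
    """Divide every variable by that year's standard error, then run
    OLS on the transformed variables (normal equations).
    Returns [intercept, time slope, editing-change effect]."""
    rows, targets = [], []
    for year, rate, se in zip(years, rates, std_errors):
        time = year - years[0]
        dummy = 1.0 if year >= edit_change_year else 0.0
        rows.append([1.0 / se, time / se, dummy / se])
        targets.append(rate / se)
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up, noise-free data: a rate falling 0.1 points a year with a
# 0.3-point shift after the editing change, and uneven standard errors.
years = list(range(1972, 1996))
std_errors = [0.2 + 0.01 * (i % 5) for i in range(len(years))]
rates = [6.5 - 0.1 * (y - 1972) + (0.3 if y >= 1987 else 0.0)
         for y in years]
coefficients = weighted_trend_fit(years, rates, std_errors)
print([round(c, 3) for c in coefficients])
```

      Because the made-up data contain no noise, the fit recovers the intercept, slope, and editing-change effect used to generate them. With real estimates, years with smaller standard errors would carry more weight in the fit.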


[41]  Although prior to 1992 the questionnaire did not have the words "high school diploma or equivalency certificate," the interviewer instructions included an instruction to record 12th grade for people who completed high school with a GED or other certificate although they had dropped out earlier. The specific inclusion of these words on the questionnaire appears to have made a difference in the quality of responses from the household informant.

[42]  Unlike prior years, however, data for individuals missing on the variables representing years of school completed ("What is the highest grade or year...has attended?" and "Did...complete that grade?") were not imputed by the Census Bureau. For this analysis we imputed missing data on these variables based on the grade they attended last year (if enrolled last year). For those individuals who were missing data and were not enrolled last year, we imputed their highest grade completed by examining the responses to the new educational attainment variable.

[43]  D. H. McLaughlin, Imputation for Non-Response Adjustment, American Institutes for Research, October 1991, updated: February 1994.

[44]  Schools selected for the contextual components of the second followup (the school administrator and teacher surveys) are referred to as contextual schools. Sample members enrolled in those schools are referred to as contextual students.

