Dropout Rates in the United States: 1995

Appendix B
Technical Notes

Definition of Who Is a Dropout

There are variations in the dropout definitions in the existing data sources, including the Current Population Survey (CPS), the High School and Beyond Study (HS&B), and the National Education Longitudinal Study of 1988 (NELS:88). In addition, the age or grade span examined and the type of dropout rate (status, event, or cohort) vary across the data sources. Furthermore, there were potentially significant changes in CPS procedures in 1986, 1992, and 1994.

The new collection through the NCES Common Core of Data (CCD) is designed to be consistent with the current CPS procedures. However, the CCD collection includes all dropouts in grades 7 through 12 versus only grades 10 through 12 in CPS, it is based on administrative records rather than a household survey as in CPS, and it counts anyone receiving a GED outside of a regular (approved) secondary education program as a dropout, as opposed to the CPS approach of counting GED certificate holders as high school completers.

One of the concerns addressed in the new NCES Common Core of Data (CCD) collection on dropouts is the development and implementation of a nationally consistent definition of a dropout to be used in school districts and state departments of education. Currently, there is considerable variation across local, state, and federal data collections on such issues as:

There will, no doubt, be some discontinuities in dropout reporting as the new and more consistent data become available.

Defining and Calculating Event Dropout Rates Using the CCD

The Common Core of Data (CCD) administered by NCES is an annual survey of the state-level education agencies in the 50 states, the District of Columbia, and the outlying areas. Statistical information is collected on public schools, staff, students, and finance.

A dropout data collection component was field tested during the 1989-90 school year. The participants were in approximately 300 school districts that included representatives from 27 states and two territories. The data were gathered through administrative records maintained at school districts and schools. The field test data were used to inform the design of a dropout statistics component for CCD.

In the CCD dropout data collection the event of dropping out is the focus of the collection. A school dropout is defined as an individual who was enrolled in school at some time during the previous year, was not enrolled at the beginning of the current school year, had not graduated from high school or completed an approved educational program, and did not meet any of the following exclusionary conditions:

For the purpose of this definition:

This new collection was initiated with a set of instructions to state CCD coordinators in the summer of 1991. Those instructions specified the details of dropout data to be collected during the 1991-92 school year. Dropouts, like graduates, are reported for the preceding school year. The 1991-92 data were submitted to NCES as a component of the 1992-93 CCD data collection. Most recently, the 1993-94 data were submitted as a component of the 1994-95 CCD.

Fifteen states reported 1991-92 data that are consistent with the specified definition. State data submissions for the 1994-95 CCD show that 43 states and the District of Columbia submitted dropout data for 1993-94, and in 24 of the states and the District of Columbia the data are consistent with the specified definition.

Defining and Calculating Dropout Rates Using the CPS

Event Rates

The October Supplement to the CPS is the only current national data source that can be used to estimate annual national dropout rates. As a measure of recent dropout experiences, the event rate measures the proportion of students who dropped out over a one-year interval.

The numerator of the event rate for 1995 is the number of persons 15- through 24-years-old surveyed in 1995 (grades 10-12) who were enrolled in high school in October 1994, were not enrolled in high school in October 1995, and who also did not complete high school (that is, had not received a high school diploma or an equivalency certificate) between October 1994 and October 1995.

The denominator of the event rate is the sum of the dropouts (that is, the numerator) and the number of all persons 15- through 24-years-old who attended grades 10, 11, and 12 last year who are still enrolled or who graduated or completed high school last year.

The dropout interval is defined to include the previous summer and the current school year, so that once a grade is completed, the student is then at risk of dropping out of the next grade. Given that the data collection is tied to each young adult's enrollment status in October of two consecutive years, any students who drop out and return within the 12-month period are not counted as dropouts.
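The numerator and denominator described above can be combined in a short calculation. The following Python sketch is illustrative only: the function name and the counts are hypothetical, not CPS estimates, and the inputs are assumed to be weighted counts of persons 15 through 24 years old who were in grades 10-12 the previous October.

```python
def event_dropout_rate(dropouts, still_enrolled, completed):
    """Event dropout rate in percent.

    dropouts:       enrolled last October, not enrolled this October,
                    and did not complete high school in between
    still_enrolled: attended grades 10-12 last year and still enrolled
    completed:      attended grades 10-12 last year and graduated or
                    completed high school
    """
    # The denominator is the dropouts plus everyone else who was at risk.
    denominator = dropouts + still_enrolled + completed
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * dropouts / denominator


# Hypothetical weighted counts (in thousands), for illustration only.
rate = event_dropout_rate(dropouts=50, still_enrolled=700, completed=250)
print(f"{rate:.1f}")  # 5.0
```

Because dropouts appear in both the numerator and the denominator, the rate is bounded between 0 and 100 percent.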

Status Rates

The status dropout rate is a cumulative rate that estimates the proportion of young adults who are dropouts, regardless of when they dropped out.

The numerator of the status rate for 1995 is the number of young adults 16 through 24 years of age who, as of October 1995, have not completed high school and are not currently enrolled. The denominator is the total number of 16- through 24-year-olds in October 1995.
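As a rough illustration of this definition, the status rate can be computed from person-level records with survey weights. The record layout and field names below are assumptions made for the sketch and do not correspond to actual CPS variable names.

```python
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    completed_high_school: bool  # diploma or equivalency certificate
    enrolled: bool               # currently enrolled in school
    weight: float                # survey weight (hypothetical values)


def status_dropout_rate(people):
    """Weighted percent of 16- through 24-year-olds who have not
    completed high school and are not currently enrolled."""
    in_scope = [p for p in people if 16 <= p.age <= 24]
    total = sum(p.weight for p in in_scope)
    if total == 0:
        raise ValueError("no 16- through 24-year-olds in the input")
    dropouts = sum(
        p.weight
        for p in in_scope
        if not p.completed_high_school and not p.enrolled
    )
    return 100.0 * dropouts / total


sample = [
    Person(17, False, True, 2.0),   # still enrolled: not a dropout
    Person(19, False, False, 1.0),  # status dropout
    Person(22, True, False, 2.0),   # completer
    Person(30, False, False, 5.0),  # outside the age range, ignored
]
print(status_dropout_rate(sample))  # 20.0
```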

CPS Design

CPS is a nationally representative sample survey of all households. The survey is conducted in approximately 60,000 dwelling units in 729 primary sampling units. Dwelling units are in-sample for four successive monthly interviews, out-of-sample for the next eight months, and then returned to the sample for the following four months. The sample frame is a complete list of dwelling-unit addresses from the Census, updated for demolitions, new construction, and field listings. The population surveyed excludes members of the Armed Forces, inmates of correctional institutions, and patients in long-term medical or custodial facilities; it is referred to as the civilian, non-institutionalized population. Typically, about 4 percent of dwelling units are not interviewed, because occupants are not at home after repeated callbacks, or for some other reason.
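The four-months-in, eight-months-out, four-months-in rotation can be visualized with a small helper. This is only a sketch of the pattern as described in the paragraph above; the function name and the zero-based month index are conventions chosen for the example.

```python
def in_sample(months_since_entry):
    """Return True if a dwelling unit is interviewed in the given month,
    counting from 0 for its first month in the sample."""
    if months_since_entry < 0:
        return False
    # Months 0-3: first four interviews; months 4-11: out of sample;
    # months 12-15: final four interviews.
    return months_since_entry < 4 or 12 <= months_since_entry < 16


schedule = "".join("X" if in_sample(m) else "." for m in range(16))
print(schedule)  # XXXX........XXXX
```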

An adult member of each household serves as the informant for that household, supplying data for each member of the household. In addition, supplementary questions regarding school enrollment are asked about eligible household members 3 years old and over. Some interviews are conducted by phone using computer assisted telephone interviewing.

CPS Dropout Data Collection

CPS data on educational attainment and enrollment status in the current year and prior year are used to identify dropouts, and additional CPS data are used to describe some basic characteristics of dropouts. The CPS provides the only source of national time series data on dropout rates. However, because CPS collects no information on school characteristics and experiences, its uses in addressing dropout issues are primarily for providing some insights into who drops out. In addition, the sample design of the CPS yields estimates for Hispanics that tend to have large standard errors, which makes it difficult to understand patterns in Hispanic dropout rates.

Changes Introduced in 1986

In an effort to improve data quality, in 1986 the Bureau of the Census instituted new editing procedures for cases with missing data on school enrollment items. The effect of the editing changes was evaluated for data from 1986 by applying both the old and new editing procedures. The result was an increase in the number of students enrolled in school and a decrease in the number of students enrolled last year but not enrolled in the current year. The new editing procedures lowered, but not significantly, the 1986 event rate for grades 10-12, ages 14 through 24, by about 0.4 percentage points, from 4.69 to 4.28. The changes in the editing procedures made even less of a difference in the status dropout rates for 16- through 24-year-olds (12.2 percent based on the old procedures and 12.1 percent based on the new).

Changes Introduced in 1992

Prior to 1992, educational attainment was based on the control card questions on highest grade attended and completed. Identification as a high school graduate was derived based on attendance and completion of grade 12.

The control card items used to identify educational attainment were:

The 1992 redesign of the CPS introduced a change in the data used to identify high school completers. Dropout data from the CPS are now based on a combination of control card data on educational attainment and October Supplement data on school enrollment and educational attainment. In 1992 the Census Bureau changed the items on the control card that measured each individual's educational attainment.

The October CPS Supplement items used to identify dropouts include the following:

The new control card educational attainment item is as follows:

Educational attainment status is now based on the response to the control card item. The following response categories are used for high school:

Students whose highest grade completed was the 9th, 10th, or 11th grade are assumed to have dropped out in the next grade.

The following response categories are used to identify high school completers:

Although the response categories are not automatically read to each respondent, they can be used as a prompt to help clarify the meaning of a question or a response. Identification as a high school completer is based on the direct response to the new control card educational item.

Differences in the pre- and post-1992 methods of identifying high school completers come from the observation that not all 12th grade completers receive a high school diploma or equivalent, and not all holders of a high school diploma or certificate complete the 12th grade. These differences have an impact on the numbers and proportions of event and status dropouts.

Differences in event rates. In the case of the event rate, in prior years students who completed 12th grade and left school without graduation or certification were counted as completers when they were in fact dropouts. On the other hand, some students who left school because they completed high school before the 12th grade were identified as dropouts when they were really early completers (e.g. those who passed the California Challenge Exam, received a GED certificate, or were admitted early to college).\50\ The current use of actual graduation or completion status includes the first group as dropouts and the second group as completers.

Compared to before, the event dropout rate includes 12th graders who did not receive a credential of some sort in the numerator count of dropouts and the early completers are subtracted from the numerator. The denominator is not changed.

The net effect of these changes is small, resulting in an increase in the aggregate event dropout rate that is not significant. In 1992, the October CPS included both versions of the educational attainment items: the old items based on the number of years of school completed and the new one based on the more accurate response categories.\51\ Using the old items, the estimated event rate for 1992 was 4.0 percent, compared with a rate of 4.4 percent in 1992 using the new educational attainment item.

Differences in the status rate. The status rate involves a third group of students who were miscoded prior to 1992. These students leave high school before completing the 12th grade, never complete the 12th grade, but later graduate or complete high school by some alternative means, such as an equivalency exam. Prior to 1992 these young adults were coded as dropouts. Since 1992 members of this group have been coded as graduates or completers. Furthermore, the explicit inclusion of high school graduation or completion, including the GED, as a response category may have increased the likelihood of identifying late completers.

Under the procedures introduced in 1992, the 12th graders who do not complete high school or the equivalent are added to the numerator of the status dropout rate and early and late completers are subtracted from the numerator. The denominator is not changed. These changes, especially the identification and removal of late completers from the dropout count, contributed to a decrease in the status dropout rate. Indeed, using years of school completed rather than the new educational attainment item, the status rate in 1992 rises to 11.4 percent rather than the 11.0 percent based on the educational attainment item. However, the estimate of 11.4 percent is still much lower than the status rate for 1991 (12.5 percent). While this could represent real change in the status dropout rate, the fact that this would be the largest decrease in the status dropout rate seen in the time series data from 1972 to 1995, coupled with the fact that the rate for 1993 also was 11.0 percent, leads one to speculate that the introduction of the new educational attainment item resulted in more accurate data on educational attainment throughout the survey, including the variables that had been used to calculate the number of years of school completed.

Special education students. One exception to the procedures to identify dropouts in CPS is the categorization of special education students. In principle, efforts are made by the Census Bureau to identify special education students in special schools and treat them as not enrolled. However, if special education students are not identified, they may be reported as completing 12th grade with no diploma. If this happens, they will, by definition, be counted as dropouts.

Changes Introduced in 1994

During the 1994 data collection and processing two additional changes were implemented in the CPS. Computer assisted telephone interviewing was introduced, resulting in higher completion rates for each individual data item and thus less reliance on allocation of missing responses. If the allocation procedures yielded a distribution different from the 1994 reported patterns, there is the potential for a change in the distribution of the high school completion status.

In 1994 there were also changes introduced in the processing and computing phase of data preparation. The benchmarking year for these survey estimates was changed from the 1980 Census to the 1990 Census, and adjustments for undercount in the 1990 Census were included. Thus, any age, sex, or race/ethnicity groups that were found to be under-represented in the 1990 Census are given increased weights. An analysis of the effect of the changes in the benchmarking year using the 1993 data indicates that the change especially affected the weights assigned to Hispanic young adults (table B1).

Table B1-Average weight and number in population using 1980 and 1990 Census based weights, by race-ethnicity: October 1993

                            1980 Based                       1990 Based
                      -------------------------         ------------------------
                                      Number                           Number
                      Average      in thousands         Average     in thousands    Percentage
                       weight      (population)         weight      (population)      change
 White non-Hispanic     1.79          23,911             1.84          24,611          2.7
 Black non-Hispanic     2.25          5,087              2.33          5,285           3.4
 Hispanic               2.09          3,998              2.48          4,747           15.7
 Other                  1.32          1,351              1.51          1,541           12.6

These changes have the potential for affecting both the numerator and denominator of the dropout rates. Analyses of the 1993 data show that the change in the benchmark year for the sample weights increased the male and Hispanic status and event dropout rates, while having little effect on the white or black rates (table B2).

Table B2. Estimated event and status rates based on 1980 Census controls and 1990 Census controls: October 1993

                              1980 based             1990 based
                               weights                 weights         Differences in rates
                          ------------------     ------------------    --------------------
                          Events      Status     Events      Status     Events      Status
Total                      4.46       11.01       4.52       11.36       1.3%        3.2%
  Male                     4.58       11.17       4.65       11.61        1.5        4.0
  Female                   4.34       10.85       4.38       11.10        1.0        2.3
  White non-Hispanic       3.93        7.94       3.95        7.96        0.5        0.3
  Black non-Hispanic       5.83       13.56       5.81       13.52       -0.3       -0.3
  Hispanic                 6.72       27.52       6.90       27.88        2.8        1.3
  Other                    2.79        7.01       2.87        7.04        2.9        0.4
Family income
  Low income level        12.32       23.88      12.44       24.38        1.0        2.1
  Middle income level      4.33        9.90       4.36       10.22        0.7        3.2
  High income level        1.34        2.72       1.36        2.75        1.1        1.3

Table B2 also shows that overall the change in control years had a larger impact on status rates than on event rates. Using the 1990 controls increases the event rate by only 1.3 percent, but raises the status rate by 3.2 percent, from 11.0 percent to 11.4 percent.
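The "Differences in rates" columns of table B2 are relative (percent) changes rather than percentage-point changes. The sketch below reproduces the Total row from the rounded rates shown in the table; the function name is hypothetical, and other rows may differ by a tenth of a point if the published differences were computed from unrounded rates.

```python
def relative_difference(rate_1980_based, rate_1990_based):
    """Percent change in a rate when moving from 1980-based to
    1990-based weights."""
    return 100.0 * (rate_1990_based - rate_1980_based) / rate_1980_based


# Total row of table B2 (October 1993).
event_diff = relative_difference(4.46, 4.52)
status_diff = relative_difference(11.01, 11.36)
print(round(event_diff, 1), round(status_diff, 1))  # 1.3 3.2
```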

Summary of Changes Since 1992

Figures B1 and B2 display the actual event and status rates from 1990 to 1995 and also event and status rates which attempt to adjust for the various changes in CPS since 1992. The details of these adjustments are described in the next section.

Figure B1-Event rates, actual and adjusted: 1990 to 1995

SOURCE: U.S. Department of Commerce, Bureau of the Census, Current Population Survey, various years, unpublished data

Figure B2-Status rates, actual and adjusted: 1990 to 1995

SOURCE: U.S. Department of Commerce, Bureau of the Census, Current Population Survey, various years, unpublished data

These figures of adjusted rates reflect the fact that we can adjust for the increase in the status rates associated with the weighting changes in 1994 and for the increase in the event rates associated with the change to the new educational attainment item in 1992. What is more difficult to account for is the drop in status rates between 1991 and 1992 and the increase in event rates from 1993 to 1995. It is plausible that the decrease in the status rate in 1992 was due to the fact that the introduction of the educational attainment item to the control card resulted in the collection of better data throughout the survey, including the data on years of school completed. While we cannot rule out that this drop in the status rate represented real change in the proportion of 16- to 24-year-olds who were dropouts in 1992, if true, it would be the largest one-year drop in the history of the time series.

There also appears to be an increase in the event rate from 1993 through 1995, although the year-to-year increases are not statistically significant. The adjustment for the definitional changes between 1991 and 1992 "corrects" for or "explains" the increase observed in the actual data from 1991 to 1992. In addition, the ratio adjustment of the 1990-based to the 1980-based rates is assumed to be constant over time. However, changes over time in the relative size of each subpopulation and their respective contributions to the pool of dropouts are likely to result in a change in the ratio of 1990-based to 1980-based rates over time; thus, intercensal estimates based on 1990 data are likely to yield a different pattern of year-to-year change than was observed in intercensal estimates based on 1980 data. This factor was not taken into consideration in the adjusted rates in figures B1 and B2. Thus, the apparent increases may be due to changing population dynamics that were beyond the scope of this analysis, or they may represent the beginning of a real increase in event dropout rates (which would eventually affect the status rates as well). Alternatively, this apparent increase may be due to mere statistical fluctuations owing to sampling error and may not represent real change. Several more years of data are needed in order to answer these questions.

Details of Adjustment Procedures

Changes in 1992. In prior years students who completed 12th grade and left school without graduation or certification were counted as completers when they were in fact dropouts. It was fairly simple to subtract these students from the numerators of both the status and the event rates, thus treating them as before, as completers. The SPSS code is listed below:

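* Start from the current event and status dropout indicators.
compute event2=event.
compute status2=status.
* edat 38 is taken here to mean 12th grade completed without a diploma.
* Recode these non-enrolled cases as completers, as before 1992.
if (enlastyr eq 1 and edat eq 38 and enroll eq 2) event2=0.
if (edat eq 38 and enroll eq 2) status2=0.
variable labels
   status2 "Status without 12th grade non completers"
   event2 "Event without 12th grade non completers".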

As stated earlier, the status rate involves two other groups of students who were miscoded prior to 1992. On the one hand, students who left school because they completed high school before the 12th grade were identified as dropouts when they were really early completers. On the other hand, another group of students leave high school before completing the 12th grade, never complete the 12th grade, but later graduate or complete high school by some alternative means, such as an equivalency exam. Prior to 1992 these young adults were coded as dropouts. Since 1992 members of this group have been coded as graduates or completers. In order to adjust the post-1992 rates to be equivalent to prior years, these cases had to be identified and then added back into the numerator and counted as dropouts.

In principle this should have been straightforward. In 1992, the survey asked both the items that are used to create the "years completed" variable and the new educational attainment item. Those former students with an educational attainment code of 39 (high school diploma or GED) or more and with less than 12 years completed should be the late completers who prior to 1992 were counted as dropouts. Unfortunately, there were missing data on the "years completed" variable. To compensate, we took the proportion of status completers who had a code greater than 38 on the educational attainment item (graduate or more) and had less than 12 years completed and applied that ratio to the number of status completers. For example, of the 10,606,000 (weighted) cases who 1) had a code of 39 or above on the educational attainment item, 2) were not currently enrolled in school, and 3) had a non-missing value on the "years completed" variable, 2.3 percent had been reported to have completed less than 12 years of school. We therefore took this proportion and applied it to the number of all young persons who had a code of 39 or above and were not enrolled in school (11,647,000). This resulted in 279,000 (2.3 percent of 11,647,000) persons that we subtracted out of the numerator of the status dropout rate.
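The proportional estimate for late completers amounts to one multiplication, sketched here in Python. The function name is hypothetical. Note that with the rounded figure of 2.3 percent the product comes to roughly 268,000; the 279,000 reported in the text presumably reflects an unrounded proportion, which is an assumption on our part.

```python
def estimate_late_completers(share_under_12_years, completers_not_enrolled):
    """Apply the share observed among cases with non-missing
    'years completed' to all non-enrolled completers."""
    return share_under_12_years * completers_not_enrolled


# 2.3 percent (rounded, from the text) applied to 11,647,000 persons.
estimate = estimate_late_completers(0.023, 11_647_000)
print(round(estimate))  # 267881
```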

Changes in 1994. The only change that we could estimate was the impact of the weighting change on the estimates for 1994 and 1995. The Census Bureau provided us with 1990 control weights for the 1993 data. The ratio of the rates calculated with the 1980-based weights in 1993 to the rates calculated with the 1990-based weights was then applied to the 1994 and 1995 event and status rates. As shown in table B2, changes in the benchmarking year had little effect on overall rates, but did have an effect on Hispanic rates.
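A minimal sketch of this ratio adjustment follows. The 1993 rates are the Total status rates from table B2; the later-year rate of 12.0 is a made-up input used only to show the mechanics, and the function name is hypothetical.

```python
def adjust_to_1980_basis(rate_1990_based, rate_1993_1980w, rate_1993_1990w):
    """Scale a rate computed with 1990-based weights by the 1993 ratio
    of the 1980-based rate to the 1990-based rate."""
    ratio = rate_1993_1980w / rate_1993_1990w
    return rate_1990_based * ratio


# Sanity check: adjusting the 1993 rate itself recovers the 1980-based rate.
print(round(adjust_to_1980_basis(11.36, 11.01, 11.36), 2))  # 11.01

# Hypothetical later-year status rate of 12.0 percent on the 1990 basis.
adjusted = adjust_to_1980_basis(12.0, 11.01, 11.36)
```

The adjustment holds the 1993 ratio fixed, so it cannot capture any drift in that ratio in later years.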

We had no way of gauging the impact of CPS implementing computer assisted interviews in 1994. Clearly, this innovation has resulted in better, cleaner, and more accurate data. This can be concluded from the absence of outliers seen in previous waves of the survey (e.g. 24-year-olds enrolled in the 3rd grade). The estimates for both event and status rates went up (though not significantly) in 1994, leading one to believe that CATI may have had some influence on the estimates. The estimates in 1995 also showed apparent (though not statistically significant) increases; again, these changes are likely to be linked to the changes associated with CATI, since in this second year, preloaded data were based on the "improved" data from the first year of the CATI operation. Alternatively, the apparent changes could represent the beginning of a real increase in dropout rates, or they could be nothing more than statistical fluctuations owing to sampling error. Again, several more years of data are needed in order to answer these questions.

Defining and Calculating High School Completion Rates Using the CPS

The educational attainment and high school completion status data from the October CPS are also used to measure the high school graduation and completion rates.

In years prior to 1974, completion rates were reported in a series of separate two-year age groups, but no overall rates comparable to the event and status dropout rates were computed. The completion rate computed and published first in 1994, and now again for 1995, is for the young adult population in the years beyond high school, that is, the 18- to 24-year-old population. These rates are reported nationally by race-ethnicity. At the state level, three-year moving averages are computed to yield more stable estimates.
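For illustration, a three-year moving average of state completion rates could be computed as follows. The text does not say whether the average is centered or trailing, or whether it is weighted; the sketch assumes a simple unweighted average of each year and the two preceding years, and the rates shown are invented.

```python
def three_year_moving_average(rates):
    """Unweighted average of each value with the two values before it.
    The result has two fewer entries than the input."""
    return [sum(rates[i - 2:i + 1]) / 3 for i in range(2, len(rates))]


# Hypothetical completion rates (percent) for four consecutive years.
yearly = [84.0, 87.0, 84.0, 90.0]
print(three_year_moving_average(yearly))  # [85.0, 87.0]
```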

As was noted in the text, the state completion rates reflect the experiences of the 18- to 24-year-olds living in the state at the time of the interview; thus, movements in and out of states to accommodate employment and post-secondary education may be evident in some states. For example, a state with a relatively large unskilled labor workplace sector might have a lower high school completion rate than anticipated, due to an influx of young workers. Conversely, a state with a disproportionate number of colleges and universities might have a higher high school completion rate than anticipated, due to an influx of post-secondary students.

Increases in GED rates

The section on completion indicated that there was a substantial increase in the last couple of years in the estimate of the percentage of 18- to 24-year-olds getting GEDs. In 1990 it was only 4.0 percent, but it went from 4.9 percent in 1993 to 6.4 percent in 1994 and 7.4 percent in 1995. Although the standard errors on these estimates are fairly large, the absolute change is also quite large. The large increases in 1994 and 1995 came at the time that CPS instituted CATI in 1994. The figure below shows this increase:

Figure B3-Percentage of 18- to 24-year-olds completing high school by earning a GED

SOURCE: U.S. Department of Commerce, Bureau of the Census, Current Population Survey, various years, unpublished data

The American Council on Education, which administers the GED, produces annual reports on the number of persons taking the GED and the number of persons who were issued a GED credential. From these reports it is possible to calculate the number of 18- to 24-year-olds who received a GED in the past year for 1990 through 1995. It is also possible to estimate the same quantity from the CPS data for 1990 to 1995 by looking at only those who were reported to have completed a GED last year and using this, along with the GED item, to calculate how many 18- to 24-year-olds obtained GEDs each year. This results in the following figure:

Figure B4-Number of 18- to 24-year-olds who received a GED in given year

SOURCE: U.S. Department of Commerce, Bureau of the Census, Current Population Survey, various years, unpublished data; and American Council on Education, GED Testing Service, GED Statistical Report 1990 to 1995.

The CPS numbers for 1994 and 1995 are much closer to the estimates from the American Council on Education than those for previous years. It seems reasonable that, since the institution of CATI, better data on the number of GEDs are being collected in the CPS and that the increases seen in 1994 and 1995 reflect better, more accurate data collection and not a change in the actual number of young people getting GEDs.

Definition of Family Income in CPS

Family income is derived from a single question asked of the household respondent. Income includes money income from all sources including jobs, business, interest, rent, social security payments, and so forth. The income of nonrelatives living in the household is excluded, but the income of all family members 14 years old and over, including those temporarily living away, is included. Family income refers to receipts over a 12-month period.

Income for families from which no income information was obtained (about 5 percent of families) was imputed. A sequential hotdeck procedure was used. A total of 200 imputation classes were created: 5 levels of the age of head of household by 5 levels of the education of the head of household by 2 levels for the employment status of the head of household by 4 levels of the number of workers in the household. To minimize the multiple use of a single donor, up to 5 donors were placed in each imputation class. A donor was selected at random from these when a family with missing income information was encountered. In a few instances (about 10 of 50,000 families in each year) an imputation class had no donors but a family from the class with missing income information was encountered. In these cases a donor was selected by collapsing similar classes until a non-empty imputation class was created.
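The imputation scheme can be sketched in Python as below. This is not the procedure actually used: the record layout, the rule for refreshing donors (keeping the five most recent), and the collapsing rule (dropping class dimensions from the right until a donor is found) are all assumptions made to keep the example short.

```python
import random
from collections import deque

N_CLASSES = 5 * 5 * 2 * 4  # age x education x employment x workers = 200
MAX_DONORS = 5             # donors retained per imputation class


def hotdeck_impute(families, seed=0):
    """families: list of (cls, income) pairs in file order, where cls is a
    4-tuple of class levels and income is None when missing.
    Returns the list of incomes with missing values filled where possible."""
    rng = random.Random(seed)
    donors = {}  # class tuple -> up to MAX_DONORS most recent incomes
    imputed = []
    for cls, income in families:
        if income is not None:
            # A reporting family becomes a donor for its class.
            donors.setdefault(cls, deque(maxlen=MAX_DONORS)).append(income)
            imputed.append(income)
            continue
        pool = list(donors.get(cls, ()))
        keep = len(cls) - 1
        # Assumed collapsing rule: match on fewer leading dimensions
        # until some donor is available.
        while not pool and keep >= 0:
            pool = [v for c, d in donors.items()
                    if c[:keep] == cls[:keep] for v in d]
            keep -= 1
        imputed.append(rng.choice(pool) if pool else None)
    return imputed


# Hypothetical income categories; the third family needs class collapsing.
data = [((1, 1, 1, 1), 7), ((1, 1, 1, 1), None), ((1, 1, 1, 2), None)]
print(hotdeck_impute(data))  # [7, 7, 7]
```

A production hot deck would typically seed each class with starting donors so that early records with missing income can still be filled; here such records are left as None.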

To facilitate comparisons over time, the categorical family income information was transformed into a continuous family income variable. The transformation was accomplished by randomly assigning for each family an income value from the income interval to which their income belonged. For intervals below the median a rectangular probability density function was used; for those above the median a Pareto probability density function was used. The methodology has the feature that if the continuous family income variable were transformed back to a categorical family income variable, the value for each family would be identical to the original data. Based on the continuous family income variable, a family income percentile variable is calculated for each person in the survey, which represents that person's position in the family income distribution. For example, if 25 percent of all persons have a lower value of family income (and 75 percent have a higher value), then the person's family income percentile variable has a value of 25. The methodology gives all persons in the same household the same value of both the categorical and continuous versions of family income.

There are several issues that affect the interpretation of dropout rates by family income using the CPS. First, it is possible that the family income of the students at the time they dropped out was somewhat different from their current family income. (The problem is potentially greatest with status dropouts who could have dropped out several years ago.)

Furthermore, family income is derived from a single question asked of the household respondent in the October CPS. In some cases, there are persons 15 through 24 years old living in the household who are unrelated to the household respondent, yet whose family income is defined as the income of the family of the household respondent. Therefore, the current household income of the respondent may not accurately reflect that person's family background. In particular, in 1991 some of the dropouts in the 15- through 24-year age range were no longer living in a family unit with a parent present. However, an analysis of 1991 status dropout rates by family income, race-ethnicity, and family status (presence of parent in the household) indicates that the bias introduced by persons not living in their parent's household is small (table B3). For example, while only 62 percent of 16- through 24-year-olds lived with at least one parent, the status dropout rates for black and white persons were similar with or without the parent present. For instance, 20.6 percent of low income blacks without a parent present were dropouts compared with 21.3 percent of those living in their parent's household. In addition, the relationship between dropout rates and income held within each racial category regardless of whether the person was living in a household with his or her parent. That is, blacks and whites within income levels dropped out at similar levels, with or without the parent present. However, this was not true of Hispanics. Hispanics in upper income levels not residing with either parent were more likely than upper income Hispanics with parents present to be status dropouts.

Table B3-Percentage of status dropouts by household type by race-ethnicity and income: October 1991

                                          Parent           Parent
                           Total        not present        present
     Total                 100.0           38.0             62.0
White, non-Hispanic        100.0           37.1             62.9
  Low income                19.9           20.5             18.1
  Middle income             7.9            10.0              6.6
  High income               2.1             7.7              1.6
Black, non-Hispanic        100.0           33.9             66.1
  Low income                21.0           20.6             21.3
  Middle income             7.6             9.1              7.1
  High income               3.0             4.1              2.7
Hispanic                   100.0           48.7             51.3
  Low income                45.8           59.6             26.2
  Middle income             28.4           46.0             15.4
  High income               12.8           28.4              8.3
SOURCE: U.S. Department of Commerce, Bureau of the Census, Current
Population Survey, October 1991, unpublished data.
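
The income transformation described at the start of this section can be illustrated with a short Python sketch. It is not the procedure used for the report: the Pareto shape parameter `alpha`, the function names, and the unweighted percentile calculation are all assumptions made for illustration. The sketch draws a value inside a family's reported interval (uniform below the median, truncated Pareto above it) and then computes each person's percentile as the share of persons with a lower value.

```python
import bisect
import random


def assign_income(lo, hi, above_median, rng, alpha=2.0):
    """Draw one income value from the interval [lo, hi).

    alpha is a hypothetical Pareto shape; the report does not state one.
    """
    u = rng.random()  # uniform draw on [0, 1)
    if not above_median:
        # Rectangular (uniform) density across the interval.
        return lo + u * (hi - lo)
    # Pareto density truncated to the interval, sampled by inverting its CDF.
    tail = 1.0 - (lo / hi) ** alpha
    return lo * (1.0 - u * tail) ** (-1.0 / alpha)


def percentile_rank(values):
    """Percent of persons with a strictly lower value, one entry per person."""
    ordered = sorted(values)
    n = len(values)
    return [100.0 * bisect.bisect_left(ordered, v) / n for v in values]


rng = random.Random(1995)  # fixed seed so the sketch is repeatable
low = assign_income(10_000, 15_000, above_median=False, rng=rng)
high = assign_income(50_000, 75_000, above_median=True, rng=rng)
```

Because every draw stays inside its own interval, recoding the continuous value back into categories returns the original category, which is the property noted in the text.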

Definition of Geographic Regions in CPS

There are four Census regions used in this report: Northeast, Midwest, South, and West. The Northeast consists of Maine, New Hampshire, Vermont, Massachusetts, Connecticut, Rhode Island, New York, New Jersey, and Pennsylvania. The Midwest consists of Ohio, Indiana, Illinois, Michigan, Wisconsin, Iowa, Minnesota, Missouri, North Dakota, South Dakota, Nebraska, and Kansas. The South consists of Delaware, Maryland, Washington D.C., Virginia, West Virginia, North Carolina, South Carolina, Georgia, Florida, Kentucky, Tennessee, Alabama, Mississippi, Arkansas, Louisiana, Oklahoma, and Texas. The West consists of Montana, Idaho, Wyoming, Colorado, New Mexico, Arizona, Utah, Nevada, Washington, Oregon, California, Alaska, and Hawaii.

Definition of Immigration Status in CPS

Immigration status was derived from a variable on the control card inquiring about the citizenship status of the reference person:

Citizen Status:

Those coded '1' above (Native, born in US) were considered born in the US. All others were considered foreign born. (Less than one percent of Hispanics were born abroad of American parents.)

Definition of English Language Ability in CPS

English language ability for Hispanic 16- through 24-year-olds was derived from several items in the CPS October Supplement. The first item was:

1= Yes
2= No - Speaks only English
Those who answered "yes" to this item were asked the following question:

1= Spanish
2= Asian (e.g. Chinese, Japanese, Vietnamese)
3= Other European (e.g. French, German, Polish)
4= Other
For Hispanics, 98.9 percent of those who spoke another language at home spoke Spanish. Also, for persons who spoke a language other than English at home, the respondent was asked:

1= Very well
2= Well
3= Not well
4= Not at all

For some of the tables in this report, the first two categories of the above item were collapsed into "Speak English well" and the last two categories were collapsed into "Speak English not well."

Imputation for Item Non-Response

For many key items in the October CPS, the Bureau of the Census imputes data for cases with missing data due to item non-response. However, for some of the items used in this report, the Bureau of the Census did not impute for item non-response. Special imputations were conducted for these items using a sequential hot deck procedure implemented through the PROC IMPUTE computer program developed by the American Institutes for Research.\52\ Three categories of age, two categories of race, two categories of sex, and two categories of citizenship were used as imputation cells. The following table shows the variables for which missing data were imputed and the number of unweighted cases imputed:

Table B4-Imputed variables and unweighted counts of imputed cases

                                                                         Age group
                                                                15-24               16-24
                                                           -----------------   -----------------
                                                           Number    Percent   Number    Percent
Using age, sex, race:
Completed high school by equivalency test (hsbyged)          780       8.4       780       8.4
Repeated a grade (repeat)                                  1,400       8.2     1,262       8.4
Years attended U.S. school (Foreign-born, not enrolled)
     (schyrus1)                                               28       7.8        28       7.8
Years attended U.S. school (Foreign-born, enrolled)
     (schyrus2)                                              169      19.3       144      19.4
Using age, sex, race, citizen:
Ever taken course to read/write English
     as a second language (esl)                              309      11.3       118       8.8
How well speak English (spkeng)                              267       9.8       172       8.5
Does disability affect ability to learn (learndis)            11       0.9         8       0.8
Speak other than English at home (spkothr)                 1,152       6.8       907       6.4
Ever attended a U.S. school (schatusa)                       155       0.9       150       0.9
Imputations were conducted for all 15- through 24-year-olds, but only for the 16- through 24-year-olds for status calculations. Also, esl and spkeng were imputed based on spkothr=1 (not English speaker).
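
A sequential hot deck fills a missing item with the value most recently reported by a record in the same imputation cell. The Python sketch below shows that general idea only. It is not PROC IMPUTE, whose internals are not described here, and the field names `age_group`, `race`, and `sex` are hypothetical. The item name `repeat` is taken from table B4.

```python
def sequential_hot_deck(records, item, cell_keys):
    """Fill missing `item` values in place; return the number of cases imputed."""
    last_seen = {}  # imputation cell -> most recent reported value
    imputed = 0
    for rec in records:  # records are processed in file order
        cell = tuple(rec[k] for k in cell_keys)
        if rec[item] is not None:
            last_seen[cell] = rec[item]   # this record becomes the donor
        elif cell in last_seen:
            rec[item] = last_seen[cell]   # borrow from the latest donor
            imputed += 1
        # A missing value with no earlier donor in its cell stays missing.
    return imputed


records = [
    {"age_group": 1, "race": 1, "sex": 1, "repeat": 2},
    {"age_group": 1, "race": 1, "sex": 1, "repeat": None},
    {"age_group": 2, "race": 1, "sex": 1, "repeat": None},
    {"age_group": 2, "race": 1, "sex": 1, "repeat": 1},
]
n_imputed = sequential_hot_deck(records, "repeat", ["age_group", "race", "sex"])
```

In this made-up file the second record receives the first record's value, while the third record has no earlier donor in its cell and is left missing.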

Defining and Calculating Cohort Dropout Rates Using NELS:88

The NELS:88 baseline comprised a national probability sample of all regular public and private 8th-grade schools in the 50 states and District of Columbia in the 1987-88 school year. Excluded from the NELS:88 sample were Bureau of Indian Affairs schools, special education schools for the handicapped, area vocational schools that do not enroll students directly, and schools for dependents of U.S. personnel overseas; such school-level exclusions have a quite small impact on national estimates.

NELS:88 started with the base-year data collection in which students, parents, teachers, and school administrators were selected to participate in the survey. NELS:88 began with a target sample of 1,032 sample schools, of which 30 were deemed ineligible. Some 698 of the 1,002 eligible schools agreed to participate in the study. Given the longitudinal nature of the study, the initial school response rate of 69.7 percent was deemed too low to yield acceptable levels of schools, administrators, teachers, parents, and most importantly, students. To address this concern, a sample of sister schools was selected and 359 replacement schools were identified and added to the study. Responses were obtained from 1,057 schools, thus increasing the school response rate to 77.7 percent (1,057/(1,002+359)). Usable student data were received for 1,052 of the schools.

The total eighth-grade enrollment for the 1,052 NELS:88 sample schools was 202,996. During the listing procedures (before 24-26 students were selected per school), 5.35 percent of the students were excluded because they were identified by school staff as being incapable of completing the NELS:88 instruments owing to limitations in their language proficiency or to mental or physical disabilities. Ultimately, 93 percent or 24,599 of the sample students participated in the base-year survey in the spring of 1988.

The NELS:88 first follow-up survey was conducted in the spring of 1990. Students, dropouts, teachers, and school administrators participated in the follow-up, with a successful data collection effort for approximately 93 percent of the base-year student respondents. In addition, because the characteristics and education outcomes of the students excluded from the base year may differ from those of students who participated in the base-year data collection, a special study was initiated to identify the enrollment status of a representative sample of the base-year ineligible students. Data from this sample were then combined with first and second follow-up data for the computation of 8th- to 10th-grade, 10th- to 12th-grade, and 8th- to 12th-grade cohort dropout rates.

The second follow-up survey was conducted in the spring of 1992. Students, dropouts, parents, teachers, and school administrators participated in this follow-up. Approximately 91 percent of the sample of students participated in the second follow-up survey, with 88 percent of the dropouts responding.

The second follow-up High School Transcript Study was conducted in the fall of 1992. Transcript data spanning the three or four years of high school (ninth or tenth through twelfth grades) were collected for 1) students attending, in the spring of 1992, schools sampled for the second follow-up school administrator and teacher surveys;\53\ 2) all dropouts and dropouts in alternative programs who had attended high school for a minimum of one term; 3) all early graduates, regardless of school contextual sample type; and 4) triple ineligibles enrolled in the twelfth grade in the spring of 1992, regardless of school affiliation. Triple ineligibles are sample members who were ineligible (due to mental or physical handicap or language barrier) for the base year, first follow-up, and second follow-up surveys. The transcript data collected from schools included student-level data (e.g., number of days absent per school year, standardized test scores) and complete course-taking histories. Complete high school course-taking records were, of course, obtained only for those transcript survey sample members who graduated by the end of the spring term of 1992; incomplete records were collected for sample members who had dropped out of school, had fallen behind the modal progression sequence, or were enrolled in a special education program requiring or allowing more than twelve years of schooling.

A total of 1,287 contextual schools and 256 non-contextual schools responded to the request for transcripts. Reasons cited by school staff for not complying with the request included: inadequate permission for transcript release (some schools required parental permission for the release of minors' transcripts); no record of the sample member, or no course-taking record because of brevity of enrollment; insufficient staff for transcript preparation (despite offers of remuneration for preparation costs); and archiving or transfer of sample member records. Student coverage rates were 89.5 percent for the total transcript sample and 74.2 percent for the dropout/alternative completers.

Missing from the cohort rates from NELS:88 is anyone who had dropped out prior to the spring of their eighth-grade year. Thus, the overall cohort rates reported here may be lower than they would have been if a younger cohort were used. This may be particularly important for Hispanics, given that CPS data show that Hispanic dropouts tend to have completed less schooling than other dropouts. The cohort rates also reflect the school enrollment status of both eligible and ineligible non-participants and participants, to the extent that this information could be obtained.

The following definition of a dropout was employed in NELS:88:

1. an individual who, according to the school (if the sample member could not be located), or according to the school and home, is not attending school (i.e., has not been in school for 4 consecutive weeks or more and is not absent due to accident or illness); or

2. a student who has been in school less than 2 weeks after a period in which he or she was classified as a dropout.

Thus, a student who was a temporary dropout (stopout) who was found by the study to be out of school for 4 consecutive school weeks or more and had returned to school (that is, had been back in school for a period of at least 2 weeks at the time of survey administration in the spring of 1990) would not be classified as a dropout for purposes of the cohort dropout rates reported here.
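
The two-part definition and the stopout rule above can be written as a small decision function. This is a sketch of the stated rules only; the function and argument names are hypothetical, and measuring time in whole weeks is an assumption.

```python
def is_dropout(weeks_out, excused_absence, weeks_back=None):
    """Apply the dropout definition stated above.

    weeks_out: consecutive weeks the student was out of school.
    excused_absence: True if the absence was due to accident or illness.
    weeks_back: weeks back in school since returning, or None if not returned.
    """
    if weeks_out < 4 or excused_absence:
        return False              # never met the 4-week criterion
    if weeks_back is None:
        return True               # still out of school (part 1)
    return weeks_back < 2         # returned under 2 weeks ago (part 2)


still_out = is_dropout(weeks_out=6, excused_absence=False)
stopout = is_dropout(weeks_out=6, excused_absence=False, weeks_back=3)
```

Here `still_out` is a dropout, while `stopout` (back in school for at least 2 weeks) is not counted as one.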

The basic NELS:88 procedure for identification of a dropout was to confirm school reported dropout status with the student's household. For the first follow-up, dropout status was obtained first from the school and then confirmed with the household for 96.4 percent of the dropouts. Thus only 3.6 percent of the dropouts were identified by only school-reported information. For the second follow-up, 4.9 percent of the dropouts were identified by only school-reported information.

The 1988-1990 dropout rate requires data from both 1988 and 1990. As a result, the size of the sample used in computing the 1988 to 1990 rate is tied to the size of the sample in 1990. Many students changed schools between 1988 and 1990. Because of the costs associated with following small numbers of students to many schools, a subsampling operation was conducted at the time of the first follow-up (figure B1). Of the 24,599 students who participated in 1988, 20,263 students were sampled, and 130 were found to be out of scope (due to death or migration out of the country). The dropout rates from 1988-1990 reflect the experiences of 20,133 sample cases. Some 1,088 sample cases dropped out and 19,045 sample cases continued in school.

The 1990-1992 rate starts from the 19,045 student sample cases. Some 91 of the student sample cases from 1990 were identified as out of scope in 1992. The dropout rates from 1990 to 1992 reflect the experiences of 18,954 student sample cases.

The 1988-1992 rates reflect the experiences of the 20,070 student sample cases. These cases result from the 20,263 subsampled student cases in 1990, less the 92 cases that were out of scope in both 1990 and 1992, less the 91 student sample cases identified as out of scope in 1992, less the 10 dropout sample cases identified as out of scope in 1992. Note that 24 student sample cases who were out of the country in 1990 returned to school in the U.S. by spring 1992, and an additional 14 student sample cases who were out of the country in spring 1990 returned to the U.S. by spring 1992 but did not reenroll (dropouts). In addition, another 354 student sample cases who dropped out between 1988 and 1990 returned to school by spring 1992.
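
The sample counts in the three preceding paragraphs can be checked with simple arithmetic. The sketch below uses only the unweighted counts quoted in the text; the variable names are illustrative, and no dropout rates are computed because the published rates are weighted.

```python
subsampled_1990 = 20_263          # base-year participants retained in the subsample
out_of_scope_1990 = 130           # death or migration out of the country

cases_1988_1990 = subsampled_1990 - out_of_scope_1990       # 20,133
dropouts_1990 = 1_088
students_1990 = cases_1988_1990 - dropouts_1990             # 19,045

students_out_of_scope_1992 = 91
cases_1990_1992 = students_1990 - students_out_of_scope_1992  # 18,954

# Out of scope in 1990: 24 returned to school and 14 returned without
# reenrolling, which leaves the cases out of scope in both years.
out_of_scope_both = out_of_scope_1990 - 24 - 14             # 92
dropouts_out_of_scope_1992 = 10
cases_1988_1992 = (subsampled_1990 - out_of_scope_both
                   - students_out_of_scope_1992
                   - dropouts_out_of_scope_1992)            # 20,070
```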

HS&B Calculation of Cohort Dropout Rates

In HS&B, students are reported as having either a regular diploma or some alternative credential-described as the equivalent of a class of 1982 held alternative credentials by 1986 refers to a comparison of alternative completers with all regular diploma recipients. The estimates of a 16.6 percent dropout rate and an 8.2 percent alternative completion rate by 1986 are based on a comparison of on-time regular diploma recipients versus all other completers. The difference in the last two estimates is due to the fact that they are computed from two differently derived variables on the public use data files.

Variables Used in Comparison of HS&B and NELS:88

Listed below are the definitions for the poverty and family composition variables used in the section comparing 10th- to 12th-grade dropout rates in HS&B and NELS:88.



Poverty

HS&B

1. Below poverty line:

If family size (famsize) is 1 to 3 and family income (bb101) is $7,000 or less or;
If family size is 4 to 6 and income is $11,999 or less or;
If family size is 7 or more and income is under $15,999

2. Not below poverty line:

All other cases.

NELS:88

1. Below poverty line:

If family size (byfamsize) is 1 or 2 and family income (byfaminc) is $7,499 or less or;
If family size is 3 and family income is $9,999 or less or;
If family size is 4 or 5 and family income is $14,999 or less or;
If family size is 6 or 7 and family income is $19,999 or less or;
If family size is 8 and family income is $24,999 or less or;
If family size is 9 or more and family income is $34,999 or less;

2. Not below poverty line:

All other cases.
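
The two sets of poverty rules above (the first using famsize and bb101, the second using byfamsize and byfaminc) can be expressed as threshold lookups. The Python sketch below is illustrative only: the function names are hypothetical, income is treated as a dollar amount (the survey items themselves may be categorical), and "under $15,999" in the first set is coded as printed.

```python
def hsb_below_poverty(famsize, income):
    """First rule set (variables famsize and bb101)."""
    if famsize <= 3:
        return income <= 7_000
    if famsize <= 6:
        return income <= 11_999
    return income < 15_999        # "under $15,999" as printed


# (largest family size covered, income limit) for the second rule set
NELS_LIMITS = [(2, 7_499), (3, 9_999), (5, 14_999), (7, 19_999), (8, 24_999)]


def nels_below_poverty(famsize, income):
    """Second rule set (variables byfamsize and byfaminc)."""
    for max_size, limit in NELS_LIMITS:
        if famsize <= max_size:
            return income <= limit
    return income <= 34_999       # family size of 9 or more


example = nels_below_poverty(famsize=4, income=14_000)  # below the $14,999 limit
```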

Family composition


HS&B

1. Intact:

If father in household (bb036b=1) and mother in HH (bb036d=1)
2. Parent plus step parent
If father not in HH (bb036b=0) and mother in HH (bb036d=1) and male guardian in HH (bb036c=1) or;
If mother not in HH (bb036d=0) and father in HH (bb036b=1) and female guardian in HH (bb036e=1)
3. Single parent
If father is in HH (bb036b=1) and no other adult partner is in HH (bb036d to bb036e=0) or;
If mother is in HH (bb036d=1) and no other adult partner is in HH (bb036b to bb036c=0)
4. Other
All other cases.

NELS:88

1. Intact:

If father in household (f1s92a=1) and mother in HH (f1s92d=1)
2. Parent plus step parent
If father not in HH (f1s92a=0) and mother in HH (f1s92d=1) and male guardian or stepfather in HH (f1s92c=1 or f1s92b=1) or;
If mother not in HH (f1s92d=0) and father in HH (f1s92a=1) and female guardian or stepmother in HH (f1s92e=1 or f1s92f=1)
3. Single parent
If father is in HH (f1s92a=1) and no other adult partner is in HH (f1s92d to f1s92f=0) or;
If mother is in HH (f1s92d=1) and no other adult partner is in HH (f1s92a to f1s92c=0).
4. Other
All other cases.
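
As an illustration, the family composition rules that use the f1s92a to f1s92f items can be coded as an ordered set of checks. The sketch assumes each item is 1 when that adult is in the household and 0 otherwise, and it takes f1s92d as the mother item, as in the Intact rule. The function name is hypothetical.

```python
def nels_family_composition(h):
    """h maps the item names f1s92a..f1s92f to 1 (in household) or 0."""
    father = h["f1s92a"] == 1
    mother = h["f1s92d"] == 1
    male_partner = h["f1s92b"] == 1 or h["f1s92c"] == 1    # stepfather or male guardian
    female_partner = h["f1s92e"] == 1 or h["f1s92f"] == 1  # female guardian or stepmother
    if father and mother:
        return "Intact"
    if mother and male_partner:       # father is absent at this point
        return "Parent plus step parent"
    if father and female_partner:     # mother is absent at this point
        return "Parent plus step parent"
    if father or mother:              # one parent and no partner of the other sex
        return "Single parent"
    return "Other"


empty = {k: 0 for k in ("f1s92a", "f1s92b", "f1s92c", "f1s92d", "f1s92e", "f1s92f")}
mother_and_stepfather = nels_family_composition({**empty, "f1s92d": 1, "f1s92b": 1})
```

Checking the categories in this order means the later rules do not need to restate the "not in HH" conditions of the earlier ones.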

Variables used in NELS:88

High School Completion Status

1. High school graduate:

If individual has received a high school diploma (f3diplom=1);
2. Received alternative credential
If individual has received a GED (f3diplom=2) or received a certificate of attendance (f3diplom=3);
3. Still enrolled in high school
If individual is currently in high school (f3diplom=4) or is working toward an equivalent (f3diplom=5);
4. Dropout
If individual is not a graduate or GED/certificate holder (f3diplom = 6)
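
The completion status recode above amounts to a lookup on f3diplom. A minimal sketch follows; the dictionary and function names are hypothetical, and codes not listed above are returned as `None` rather than guessed.

```python
F3DIPLOM_STATUS = {
    1: "High school graduate",
    2: "Received alternative credential",   # GED
    3: "Received alternative credential",   # certificate of attendance
    4: "Still enrolled in high school",     # currently in high school
    5: "Still enrolled in high school",     # working toward an equivalent
    6: "Dropout",
}


def completion_status(f3diplom):
    """Return the completion category for a code, or None if it is not listed."""
    return F3DIPLOM_STATUS.get(f3diplom)
```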

Accuracy of Estimates

The estimates in this report are derived from samples and are subject to two broad classes of error-sampling and nonsampling error. Sampling errors occur because the data are collected from a sample of a population rather than from the entire population. Estimates based on a sample will differ somewhat from the values that would have been obtained from a universe survey using the same instruments, instructions, and procedures. Nonsampling errors come from a variety of sources and affect all types of surveys, universe as well as sample surveys. Examples of sources of nonsampling error include design, reporting, and processing errors, and errors due to nonresponse. The effects of nonsampling errors are more difficult to evaluate than those that result from sampling variability. As much as possible, procedures are built into surveys in order to minimize nonsampling errors.

In reporting sample survey data, estimates based on unweighted sample sizes less than 30 are not displayed. The standard error is a measure of the variability due to sampling when estimating a parameter. It indicates how much variance there is in the population of possible estimates of a parameter for a given sample size. Standard errors can be used as a measure of the precision expected from a particular sample. The probability that a complete census would differ from the sample by less than the standard error is about 68 out of 100. The chances that the difference would be less than 1.65 times the standard error are about 90 out of 100; that the difference would be less than 1.96 times the standard error, about 95 out of 100.

Standard errors for rates and number of persons based on CPS data were calculated using the following formulas:

Dropout rate:

where p = the percentage (0 < p < 100),

Number of persons:

where x = the number of persons (i.e., dropouts),

Standard errors for the estimates in the tables appear in appendix A.
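
The formulas themselves are missing from this copy of the text. As a hedged illustration only, the sketch below uses the generalized variance form that CPS source-and-accuracy documentation commonly gives for a percentage, se = sqrt((b/N) * p * (100 - p)). Here N (the population on which the percentage is based) is a symbol assumed for this sketch, b is a published parameter, and the values in the example are made up. The formula for a number of persons is not reconstructed.

```python
import math


def se_percentage(p, n, b):
    """Standard error of a percentage p (0 < p < 100).

    n: population on which the percentage is based (assumed symbol).
    b: generalized variance parameter (value below is a placeholder).
    """
    return math.sqrt((b / n) * p * (100.0 - p))


# Placeholder inputs, not actual CPS parameters or estimates.
example_se = se_percentage(p=50.0, n=10_000, b=2_500)
```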

In October of 1991, the Bureau of the Census released new b parameters for 1988 and 1990. With the release of the new parameters, the Bureau of the Census also made adjustments to the parameters for earlier years. Therefore, for some years, the standard errors presented in the appendix tables here are different than the standard errors presented in earlier reports.

Methodology and Statistical Procedures

The comparisons in the text have all been tested for statistical significance to ensure that the differences are larger than those that might be expected due to sampling variation. Two types of comparisons have been made in the text.

Differences in two estimated percentages. The Student's t statistic can be used to test the likelihood that the differences between two percentages are larger than would be expected by sampling error.

where P1 and P2 are the estimates to be compared and se1 and se2 are their corresponding standard errors.
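
The formula is not shown in this copy. Assuming the usual statistic for two independent estimates, t = (P1 - P2) / sqrt(se1^2 + se2^2), the calculation can be sketched in Python as follows. The function name and the input values are hypothetical.

```python
import math


def t_statistic(p1, se1, p2, se2):
    """t for the difference between two independent estimated percentages."""
    return (p1 - p2) / math.sqrt(se1 ** 2 + se2 ** 2)


# Made-up estimates and standard errors, for illustration only.
t = t_statistic(p1=12.0, se1=0.3, p2=11.0, se2=0.4)
significant_single_test = abs(t) > 1.96
```

With these made-up inputs the difference of 1.0 percentage point is divided by a combined standard error of 0.5, so the single-comparison threshold of 1.96 is just exceeded.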

As the number of comparisons on the same set of data increases, the likelihood that the t value for at least one of the comparisons will exceed 1.96 simply due to sampling error increases. For a single comparison, there is a 5 percent chance that the t value will exceed 1.96 due to sampling error. For five tests, the risk of getting at least one t value that high increases to 23 percent and for 20 comparisons, 64 percent.
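
The percentages quoted above follow from 1 - 0.95^k for k comparisons, assuming the comparisons are independent and each is tested at the 0.05 level. A short Python check (the function name is illustrative):

```python
def chance_of_false_positive(n_comparisons, alpha=0.05):
    """Probability that at least one of n independent tests exceeds the
    critical value through sampling error alone."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


risks = {k: round(100 * chance_of_false_positive(k)) for k in (1, 5, 20)}
```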

One way to compensate for this danger when making multiple comparisons is to adjust the alpha level to take into account the number of comparisons being made. For example, rather than establishing an alpha level of 0.05 for a single comparison, the alpha level is set to ensure that the likelihood is less than 0.05 that the t value for any of the comparisons exceeds the critical value by chance alone when there are truly no differences for any of the comparisons. This Bonferroni adjustment is calculated by taking the desired alpha level and dividing by the number of possible comparisons, based on the variable(s) being compared. The t value corresponding to the revised, lower alpha level must be exceeded in order for any of the comparisons to be considered significant. For example, to test for differences in dropout rates between whites, blacks, and Hispanics, the following steps would be involved:

All comparisons in this report were tested using the Bonferroni adjustment for the t tests. Where categories of two variables were involved, the number of comparisons used to make the Bonferroni adjustment was based on the relationship(s) being tested.
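
For the three-group example (whites, blacks, and Hispanics), the Bonferroni adjustment can be sketched in Python. The sketch assumes that all pairwise comparisons among the groups are tested and uses the normal distribution as a large-sample stand-in for a t table; the function name is hypothetical.

```python
from statistics import NormalDist


def bonferroni_critical_value(n_groups, alpha=0.05):
    """Return (number of pairwise comparisons, adjusted alpha, critical value).

    Requires n_groups >= 2. Uses a two-sided test and a normal approximation.
    """
    n_comparisons = n_groups * (n_groups - 1) // 2
    adjusted_alpha = alpha / n_comparisons
    critical = NormalDist().inv_cdf(1.0 - adjusted_alpha / 2.0)
    return n_comparisons, adjusted_alpha, critical


n_comparisons, adjusted_alpha, critical = bonferroni_critical_value(3)
```

Three groups give three pairwise comparisons, an adjusted alpha of about 0.0167, and a critical value of roughly 2.39 in place of 1.96.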

Trends. Regression analysis was used to test for trends across age groups and over time. Regression analysis assesses the degree to which one variable (the dependent variable) is related to a set of other variables (the independent variables). The estimation procedure most commonly used in regression analysis is ordinary least squares (OLS). While some of the trends span the entire period from 1972 to 1995, many of the rates reached a high point during the late 1970s. Thus, most of the descriptions that refer to "since the late 1970s" use 1978 as a starting point.

The analyses in this report were conducted on the event rates, status rates, and completion rates. The event rate and status rate estimates were used as dependent measures in the analysis with a variable representing time and a dummy variable controlling for changes in the editing procedure (0 = years 1968 to 1986, 1 = 1987 to 1995) used as independent variables. However, in these data some of the observations were less reliable than others (i.e., some years' standard errors were larger than other years'). In such cases OLS estimation procedures do not apply and it is necessary to modify the regression procedures to obtain unbiased regression parameters. The modification that is usually recommended transforms the observations to variables which satisfy the usual assumptions of ordinary least squares regression and then applies the usual OLS analysis to these variables.

This was done in this analysis using the data manipulation and regression capability of Microsoft Excel. Each of the variables in the analysis was transformed by dividing each by the standard error of the relevant year's rate (event or status). The new dependent variable was then regressed on the new time variable and new editing-change dummy variable. All statements about trends in this report are statistically significant at the 0.05 level.
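
The transformation described above is a form of weighted least squares. The Python sketch below is a hedged illustration, not the spreadsheet procedure actually used: it assumes the constant term is also divided by the standard error, measures time in years since the first observation, and leaves out the significance test. All names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b for a small square system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def weighted_trend(years, rates, std_errors, edit_change_year=1987):
    """Return [intercept, slope per year, editing-change effect]."""
    rows, targets = [], []
    for year, rate, se in zip(years, rates, std_errors):
        dummy = 1.0 if year >= edit_change_year else 0.0
        time = year - years[0]            # years since the first observation
        # Divide every term, including the constant, by that year's standard error.
        rows.append([1.0 / se, time / se, dummy / se])
        targets.append(rate / se)
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)                # OLS on the transformed variables
```

Dividing by the standard error gives less weight to years with less precise estimates, which is the purpose of the modification described in the text.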


50/  Although prior to 1992 the questionnaire did not have the words "high school diploma or equivalency certificate", the interviewer instruction included an instruction to record 12th grade for people who completed high school with a GED or other certificate although they had dropped out earlier. The specific inclusion of these words on the questionnaire appears to have made a difference in the quality of responses from the household informant.

51/  Unlike prior years, however, data for individuals missing on the variables representing years of school completed ("What is the highest grade or year ... has attended?"; and "Did ... complete that grade?") were not imputed by the Census Bureau. For this analysis we imputed missing values on these variables based on the grade they attended last year (if enrolled last year). For those individuals who were missing data and were not enrolled last year, we imputed their highest grade completed by examining the responses to the new educational attainment variable.

52/  D. H. McLaughlin, Imputation for Non-Response Adjustment, American Institutes for Research, October 1991, updated: February 1994.

53/  Schools selected for the contextual components of the second follow-up (the school administrator and teacher surveys) are referred to as contextual schools. Sample members enrolled in those schools are referred to as contextual students.
