This special analysis draws upon various NCES datasets, which are summarized in exhibit A. Many of the findings in this special analysis come from previously published NCES reports; however, the findings in part A and part B of section 2 come from special analyses of ELS:2002, NELS:88, and BPS:04/06. These three datasets were obtained from statistical samples of the entire population of target students. These technical notes describe various issues that are important to keep in mind when interpreting sampled data as well as the sample populations and the variables created for these special analyses. For detailed information about any of the NCES datasets, see http://nces.ed.gov/surveys/.

ESTIMATES FROM SAMPLED DATA

Estimating the size of the total population or subpopulations from a data source based on a sample of the entire population requires consideration of several factors before the estimates become meaningful. However conscientious an organization may be in collecting data from a sample of a population, there will always be some margin of error in estimating the size of the actual total population or subpopulation because the data are available from only a portion of the total population. Consequently, data from samples can provide only an estimate of the true or actual value. The margin of error or the range of the estimate depends on several factors, such as the amount of variation in the responses, the size and representativeness of the sample, and the size of the subgroup for which the estimate is computed. The magnitude of this margin of error is measured by what statisticians call the “standard error” of an estimate.

Standard Errors

The standard error for each estimate in this special analysis was calculated in order to determine the “margin of error” for these estimates. The standard errors for all the estimated means and percentages reported in the figures and tables of the special analysis can be found on this website.

An estimate with a smaller standard error provides a more reliable estimate of the true value than an estimate with a larger standard error. Standard errors tend to diminish in size as the size of the sample (or subsample) increases. Consequently, for the same data, such as the percentage of students who enrolled immediately in a community college, standard errors will almost always be larger for American Indian/Alaska Native students than for White students because White students represent a larger proportion of the population and therefore of the sample.
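
The relationship between sample size and standard error can be illustrated with a short Python sketch. It uses the textbook formula for the standard error of a percentage under simple random sampling, which is an assumption made only for illustration: the NCES datasets use complex sample designs, and the standard errors published for this special analysis were not necessarily computed this way. The function name se_of_percentage and the sample sizes are hypothetical.

```python
import math


def se_of_percentage(percent: float, n: int) -> float:
    """Standard error, in percentage points, of an estimated percentage.

    Assumes simple random sampling (an illustrative simplification).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


# Same estimated percentage, increasingly large (sub)samples.
for n in (100, 400, 1600, 6400):
    print(f"n = {n:>5}: standard error = {se_of_percentage(30.0, n):.2f}")
```

Under this simplified formula, quadrupling the sample size halves the standard error, which is why estimates for small subgroups are less precise than estimates for large ones.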

Analysis and Interpretation

Due to standard errors, caution is warranted when drawing conclusions about the size of one population estimate in comparison to another or whether a time series of population estimates is increasing, decreasing, or staying about the same. Although one estimate of the population size may be larger than another, a statistical test may reveal that there is no measurable difference between the two estimates due to their uncertainty. Whether differences in means or percentages are statistically significant can be determined by using the standard errors of the estimates. When differences are statistically significant, the probability that the difference occurred by chance is usually small; for example, it might be about 5 times out of 100. For this special analysis, differences between means or percentages (including increases or decreases) are stated only when they are statistically significant. To determine whether differences reported are statistically significant, two-tailed t tests, at the .05 level, were used. The t test formula for determining statistical significance was adjusted when the samples being compared were dependent.
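
A comparison of two estimates can be sketched in Python as follows. The sketch uses the standard large-sample t statistic for the difference between two estimates and a critical value of 1.96 for a two-tailed test at the .05 level. The covariance term is one common way to adjust the formula for dependent samples; the source does not state which adjustment was used, so that term, the function names, and the example values are all assumptions.

```python
import math

# Critical value for a two-tailed test at the .05 level
# (large-sample approximation).
CRITICAL_VALUE = 1.96


def t_statistic(est1, se1, est2, se2, covariance=0.0):
    """t statistic for the difference between two estimates.

    With covariance=0.0 the two samples are treated as independent.
    A nonzero covariance is an assumed adjustment for dependent samples.
    """
    variance_of_difference = se1 ** 2 + se2 ** 2 - 2.0 * covariance
    if variance_of_difference <= 0:
        raise ValueError("variance of the difference must be positive")
    return (est1 - est2) / math.sqrt(variance_of_difference)


def is_significant(est1, se1, est2, se2, covariance=0.0):
    """True when the difference is significant at the .05 level."""
    return abs(t_statistic(est1, se1, est2, se2, covariance)) > CRITICAL_VALUE


# Hypothetical percentages: the same 5-point gap, different standard errors.
print(is_significant(52.0, 1.5, 47.0, 1.6))  # smaller standard errors
print(is_significant(52.0, 2.5, 47.0, 2.6))  # larger standard errors
```

The two calls compare the same pair of percentages; only the standard errors differ, which is enough to change whether the difference counts as measurable.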

Rounding and Other Considerations

Although values reported in the supplemental tables are rounded to one decimal place (e.g., 76.5 percent), values reported in this special analysis are rounded to whole numbers (with any value of 0.5 or above rounded to the next highest whole number). Due to rounding, total percentages sometimes differ from the sum of the reported parts, which may, for example, equal 99 or 101 percent, rather than the percentage distribution's total of 100 percent.
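
The rounding rule and its effect on totals can be illustrated with a minimal Python sketch. The percentage distributions below are made up for illustration, and the function name round_half_up is hypothetical. The decimal module is used because Python's built-in round() rounds ties to the nearest even number rather than upward.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: str) -> int:
    """Round to a whole number, with any value of .5 or above rounded up."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Two hypothetical distributions, each summing to exactly 100.0 percent.
distributions = (
    ["33.5", "33.5", "33.0"],
    ["33.4", "33.3", "33.3"],
)

for parts in distributions:
    rounded = [round_half_up(p) for p in parts]
    print(parts, "->", rounded, "sum of rounded parts:", sum(rounded))
```

Values are passed as strings so that they are represented exactly, as they would appear in a table rounded to one decimal place.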

Some values reported in supplemental tables are flagged with an exclamation mark (!) to alert readers that those values have relatively large standard errors in relation to the estimated value, or, in other words, have wider confidence intervals around them than unflagged estimates. Specifically, this special analysis has flagged values with standard errors greater than 30 percent of the estimated value. For example, an estimate of 15.3 percent with a standard error of 4.9 is flagged because 4.9 / 15.3 equals 0.32. (With this standard error, at a confidence level of .95, the estimate may differ from the actual value by ±9.6; in other words, the confidence interval runs from 5.7 to 24.9.) In contrast, an estimate of 15.3 percent with a standard error of 1.8 is not flagged because 1.8 / 15.3 equals 0.12. (With this standard error, at a confidence level of .95, the estimate may differ from the actual value by ±3.5; in other words, the confidence interval runs from 11.8 to 18.8.)
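
The flagging rule and the confidence intervals in the example can be checked with a short Python sketch. The 30 percent threshold comes from the text; the multiplier of 1.96 for a .95 confidence level is the usual normal approximation and is an assumption here, as are the function names.

```python
def is_flagged(estimate: float, standard_error: float,
               threshold: float = 0.30) -> bool:
    """True when the standard error exceeds 30 percent of the estimate."""
    return standard_error / estimate > threshold


def confidence_interval(estimate: float, standard_error: float,
                        z: float = 1.96) -> tuple:
    """Lower and upper bounds of the interval (normal approximation)."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


for se in (4.9, 1.8):
    low, high = confidence_interval(15.3, se)
    print(f"SE = {se}: ratio = {se / 15.3:.2f}, "
          f"flagged = {is_flagged(15.3, se)}, "
          f"interval = {low:.1f} to {high:.1f}")
```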

EDUCATION LONGITUDINAL STUDY OF 2002 (ELS:2002)

The ELS:2002 base year conducted a baseline survey of high school sophomores in spring term 2002, surveying almost 15,400 students in 752 schools (out of 17,600 students and 1,268 schools selected for the sample).1 The ELS:02/04 first follow-up surveyed 15,000 of the participants (out of 16,500 eligible for the sample) in the spring of 2004, when most sample members were seniors (though some were dropouts, some early graduates, and some retained in an earlier grade).2 The first follow-up survey collected high school course offerings and student transcripts (coursetaking records at the student level for grades 9–12) for all sample members. Transcript information was obtained for about 14,900 participants, or for about 91 percent (weighted) of the ELS:2002 student sample. The ELS:02/06 second follow-up surveyed 15,000 participants (out of 16,400 eligible for the sample) between January and September 2006, about 2 years after most sample members had completed high school.

For the special analysis in section 2, part A, only students in the spring-term 2004 senior cohort (or “2004 seniors”) were used.3 The 2004 seniors who enrolled in a postsecondary institution some time between July and December 2004 were classified as “immediate enrollees.” The 2004 seniors who first enrolled in a postsecondary institution after December 2004 and before the second follow-up in 2006 were classified as “delayed enrollees.” The 2004 seniors who had never enrolled in a postsecondary institution before the second follow-up in 2006 were considered to have had “no postsecondary education through 2006.”
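
The three enrollment categories can be expressed as a simple classification rule. The Python sketch below is an illustration only: the function, its parameters, and the example dates are hypothetical and are not ELS variables. The text does not say how a first enrollment before July 2004 is treated, so the sketch leaves such cases unclassified.

```python
from datetime import date
from typing import Optional

# Window for "immediate" enrollment, as described in the text.
IMMEDIATE_START = date(2004, 7, 1)
IMMEDIATE_END = date(2004, 12, 31)


def classify_2004_senior(first_enrollment: Optional[date],
                         followup_date: date) -> str:
    """Classify a 2004 senior by the date of first postsecondary enrollment.

    followup_date is the (hypothetical) date of the student's second
    follow-up interview in 2006.
    """
    if first_enrollment is None or first_enrollment >= followup_date:
        return "no postsecondary education through 2006"
    if IMMEDIATE_START <= first_enrollment <= IMMEDIATE_END:
        return "immediate enrollee"
    if first_enrollment > IMMEDIATE_END:
        return "delayed enrollee"
    # Enrollment before July 2004 is not covered by the definitions above.
    return "unclassified"


interview = date(2006, 3, 1)  # hypothetical follow-up date
print(classify_2004_senior(date(2004, 9, 1), interview))
print(classify_2004_senior(date(2005, 1, 15), interview))
print(classify_2004_senior(None, interview))
```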

To determine the type of postsecondary institution into which 2004 immediate enrollees enrolled, this special analysis used the ELS variables F2PS1 and F2IORDER to identify the first postsecondary institution a student attended.4 The postsecondary institution used to classify delayed enrollees by institution type was the last institution in which the delayed enrollee was enrolled before the second follow-up in 2006.5

The special analysis used the panel weight F2F1WT, which generalizes to the spring 2004 senior cohort who also participated in the second follow-up survey.
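
To show what applying a panel weight means in practice, the following Python sketch computes a weighted percentage. The records and weight values are invented for illustration and do not come from ELS; the function name is hypothetical. Each sample member counts in proportion to the number of students in the population that he or she represents.

```python
def weighted_percentage(flags, weights):
    """Percentage of the weighted population for which the flag is true."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    flagged_weight = sum(w for f, w in zip(flags, weights) if f)
    return 100.0 * flagged_weight / total_weight


# Hypothetical sample of four students: whether each enrolled immediately,
# and a made-up panel weight for each (standing in for F2F1WT).
enrolled_immediately = [True, False, True, False]
panel_weights = [100.0, 300.0, 200.0, 400.0]

print(weighted_percentage(enrolled_immediately, panel_weights))
```

In this made-up sample, half of the respondents enrolled immediately, but they carry less than half of the total weight, so the weighted percentage is lower than the unweighted one.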

For more information on ELS:2002, including sampling design, data collection methodology, data processing and procedures, response rates, imputation, weighting, and the construction of specific variables, see Ingels et al. (2007).

NATIONAL EDUCATION LONGITUDINAL STUDY OF 1988 (NELS:88)

The NELS:88 base year conducted a baseline survey of 8th-graders in spring term 1988, surveying almost 24,600 students in 1,052 schools (out of about 26,400 students and 1,032 schools selected for the sample).6 The NELS:88/92 second follow-up surveyed about 16,800 of the participants (out of 18,200 eligible for the sample7) in the spring of 1992, when most sample members were in their final semester of high school (though some were dropouts, some early graduates, and some retained in an earlier grade).8 The second follow-up survey collected high school course offerings and student transcripts (coursetaking records at the student level for grades 9–12) for all sample members. Transcript information was obtained for about 17,300 participants, or for about 88 percent (weighted) of the NELS:88 student sample. The NELS:88/94 third follow-up surveyed the students during the spring of 1994, about 2 years after most sample members had completed high school. This follow-up included 14,900 participants (out of almost 16,000 eligible for the sample).9

For the special analysis in section 2, part A, only students in the spring-term 1992 senior cohort (or “1992 seniors”) were used.10 The 1992 seniors who enrolled in a postsecondary institution some time between May and December 1992 were classified as “immediate enrollees.”11 The 1992 seniors who first enrolled in a postsecondary institution after December 1992 and before the third follow-up in 1994 were classified as “delayed enrollees.” The 1992 seniors who had never enrolled in a postsecondary institution before the third follow-up in 1994 were considered to have had “no postsecondary education through 1994.”12 The 1992 seniors who enrolled in a postsecondary institution outside the United States (24 cases, incode = -10) and those who reported attending a “postsecondary” institution for which no identifying IPEDS data existed (500 cases, incode = -12) were dropped from the analysis.

Because NELS does not include information identifying all types of postsecondary institutions, data on the level (i.e., 2-year or 4-year) and control (e.g., public or private) of all postsecondary institutions in IPEDS 1993/94 were merged with the NELS data. To determine the type of postsecondary institution into which 1992 immediate enrollees enrolled, this special analysis used the IPEDS variables CONTROL and LEVEL associated with the first “real” postsecondary institution a student attended. The first “real” postsecondary institution that a student attended was either (1) the first institution in which the student enrolled after August 1992 or (2) the institution in which the student enrolled before September 1992 and in which he or she remained a student through at least September 1992. The postsecondary institution used to classify delayed enrollees by institution type was the last institution in which the delayed enrollee was enrolled before the third follow-up in 1994.13
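
The rule for identifying the first “real” institution can be sketched in Python as follows. The Spell record, its fields, and the institution identifiers are hypothetical. The text does not say which condition takes precedence when more than one enrollment qualifies, so the sketch assumes that the earliest-starting qualifying enrollment is chosen.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

YearMonth = Tuple[int, int]  # (year, month), compared in calendar order
SEPT_1992: YearMonth = (1992, 9)


@dataclass(frozen=True)
class Spell:
    """One enrollment spell at one institution (hypothetical record)."""
    institution_id: str
    start: YearMonth
    end: YearMonth


def qualifies(spell: Spell) -> bool:
    # (1) Enrolled after August 1992.
    started_after_august = spell.start >= SEPT_1992
    # (2) Enrolled before September 1992 and still enrolled in September 1992.
    carried_into_fall = spell.start < SEPT_1992 and spell.end >= SEPT_1992
    return started_after_august or carried_into_fall


def first_real_institution(spells: Sequence[Spell]) -> Optional[Spell]:
    """Earliest qualifying spell (assumed tie-break), or None if none qualify."""
    candidates = [s for s in spells if qualifies(s)]
    return min(candidates, key=lambda s: s.start, default=None)


spells = [
    Spell("summer_program", (1992, 6), (1992, 7)),  # ended before September
    Spell("fall_college", (1992, 9), (1993, 5)),
]
print(first_real_institution(spells).institution_id)
```

In practice, the CONTROL and LEVEL values for the selected institution would then be looked up in the merged IPEDS data; that step is omitted here.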

The special analysis used the panel weight F3F2PNWT, which generalizes to the spring 1992 senior cohort who also participated in the third follow-up survey.

For more information on NELS:88, including sampling design, data collection methodology, data processing and procedures, response rates, imputation, weighting, and the construction of specific variables, see Curtin et al. (2002).

ELS was designed to permit comparisons with NELS data; however, some variables used in this special analysis are not directly comparable between the two datasets. The following bullets describe these variables and how they are treated in the special analysis:

• Data on grade point averages (GPAs) were collected for ELS from schools on a standardized scale of 0.0 to 4.0; however, for NELS, schools reported student GPAs on a scale of 0 to 104. Given the different scales, no comparison is made and the NELS GPA data are not reported. For the analysis of the GPAs in ELS, a cutpoint of 2.5 was used to distinguish the top half of the grade distribution from the bottom half: grades above 2.5 (typically equated to a C+ or better) fall into 3 categories (2.51-3.0, 3.01-3.5, and 3.51-4.0) and grades of 2.5 or below fall into 3 categories for which credit is earned (2.01-2.5, 1.51-2.0, and 1.0-1.5).

• Data on family income in ELS come from a base year 2 years prior to the data collection for 2004 seniors; however, the NELS data on family income come from a base year 4 years prior to the data collection for 1992 seniors. In addition, the cutpoints used to collect the information varied slightly: ELS used categories that began with “001” and ended in “000” (e.g., “\$20,001–35,000”) while NELS used categories that began with “000” and ended in “999” (e.g., “\$20,000–34,999”). These data are reported as they are without any adjustments to make them more comparable, given that no adjustments for inflation (which is a greater source of incompatibility) are possible with such categorical data.

• Data on race/ethnicity in ELS were collected using the Office of Management and Budget (OMB) standard racial and ethnic classifications for the 2000 Census. These superseded the prior OMB classifications, used in NELS, which did not include the category “more than one race.” As a result of the new category, the race categories in ELS and NELS are not directly comparable. These data are reported as they are.

• Data on the 2004 seniors' post-high school plans in ELS identify the type of postsecondary institution (e.g., at a 2-year or 4-year institution) that a senior plans to attend, if he or she reported having plans to get a postsecondary education. In contrast, data on 1992 seniors in NELS only report if seniors had plans to get a postsecondary education (not the type of postsecondary institution they planned to attend). These data are reported as they are, with a total shown for 2004 seniors who had plans to get any type of postsecondary education.

Transcript data collected as part of ELS and NELS were classified into the coursetaking categories using the same coding scheme. For more information on how these coursetaking categories were created and on the courses assigned to each category, see the Technical Notes and Methodology in Planty, Provasnik, and Daniel (2007).

BEGINNING POSTSECONDARY STUDENTS LONGITUDINAL STUDY OF 2006 (BPS:04/06)

BPS:04/06 surveyed a subsample of the 89,500 undergraduates who participated in the National Postsecondary Student Aid Study (NPSAS) of all postsecondary students in academic year 2003–04.14 NPSAS:04 selected a sample of postsecondary students from some 1,600 postsecondary institutions that were stratified to be nationally representative of the entire postsecondary universe of institutions. The students were initially interviewed for NPSAS in 2004; the BPS:04/06 study is the first follow-up of these students 2 years later in 2006. BPS:04/06 surveyed about 18,600 NPSAS:04 participants who began their postsecondary education in the academic year 2003–04.

The BPS:04/06 data for this special analysis were analyzed using NCES' Data Analysis System (DAS). BPS data in the DAS pertain to the experiences of students over 3 academic years and provide information about rates of program completion, transfer, and attrition for students who first enrolled at various types of postsecondary institutions. The DAS may be accessed at http://nces.ed.gov/das/.

For more information about BPS:04/06 including sampling design, data collection methodology, imputation, and weighting, see appendix B of Berkner et al. (2007).

1 Schools were the first-stage unit of selection, with sophomores randomly selected within schools.

2 At this time, the survey was "freshened" to ensure a nationally representative spring-term 2004 senior cohort. This freshening procedure is a method for producing a representative sample of students who were enrolled in 12th grade in 2004 but were not enrolled in 10th grade in 2002 (e.g., students held back in the 11th or 12th grades or who were not in school in the United States in 2002).

3 The filter used for this population was G12COHRT > 0.

4 The ELS variable F2PS1 takes into account the fact that some 2004 seniors took college classes over the summer before entering their "real" college of intended matriculation in the fall. In most cases, the first "real" institution (what F2PS1 identifies) is the postsecondary institution with the earliest start date (and will therefore appear first on the ELS institution file, i.e., F2IORDER = 1). This is not the case, however, if (1) the first chronological institution (as opposed to the first "real" institution) is a summer school (defined as an institution with a start date of May, June, or July, and a same-year end date of May, June, July, or August); (2) the summer school was attended in the same year as high school completion/exit; and (3) a second postsecondary institution (with longer total enrollment) was also started in August, September, or October of that same year. If all of the above conditions are met, the post-summer school institution is identified in F2PS1. If the earliest start date is shared by more than one institution, the one with the longest enrollment period is identified in F2PS1.

5 For delayed enrollees who were enrolled concurrently in more than one institution, the institution in which the student was enrolled the longest amount of time was used to determine the institution type.

6 More schools ended up participating than were in the original selected sample.

7 Excludes dropouts.

8 The sample was also "freshened" to ensure a nationally representative spring-term high school senior class of 1992.

9 To control costs in the third follow-up, subsampling was instituted to reduce the second follow-up sample of 21,600 participating students and dropouts to some 16,000 sample members. See Curtin et al. (2002), p. 38.

10 The filter used for this population was G12COHRT > 0.

11 For some 1992 seniors, the month that they enrolled in their first postsecondary institution was missing or unknown. If these seniors reported that they enrolled in their first postsecondary institution in 1992, they were classified as "immediate enrollees." Students initially classified as "immediate enrollees" who reported an end date of their postsecondary education before September 1992 were reclassified. Those who enrolled in another postsecondary institution before January 1993 were reclassified as "immediate enrollees" at the control and level of the institution enrolled at in the fall. Those who did not enroll in another postsecondary institution before January 1993, but enrolled sometime before the third follow-up in 1994, were reclassified as "delayed enrollees" at the control and level of that later institution.

12 The category "no postsecondary education through 1994" also includes a few students who had enrolled in a postsecondary institution before graduating high school (e.g., dual-credit course students) but who did not enroll in a postsecondary institution again before the third follow-up in 1994.

13 For delayed enrollees who were enrolled concurrently in more than one institution, the institution in which the student was enrolled the longest amount of time was used to determine the institution type.

14 BPS also surveyed first-time graduate students, but these data were not used for this special analysis.

