The National Center for Education Statistics (NCES) was directed by Congress to produce the Higher Education: Gaps in Access and Persistence Study (Higher Ed:GAPS), a statistical report that documents the scope and nature of gaps in educational participation and attainment between male Blacks, Hispanics, Native Hawaiians/Pacific Islanders, and American Indians/Alaska Natives and their female counterparts, as well as gaps between males in these racial/ethnic groups and White males. The primary focus of the Higher Ed:GAPS report is to examine differences among males and females both overall and within racial/ethnic groups. The secondary focus of the report is to examine overall racial/ethnic differences. In addition to descriptive indicators, this report also includes descriptive multivariate analyses of variables that are associated with male and female postsecondary attendance and attainment.

The congressional language specified that the Higher Ed:GAPS report track males of underrepresented racial/ethnic groups (i.e., Blacks, Hispanics, Native Hawaiians/Pacific Islanders, and American Indians/Alaska Natives) through the college attainment pipeline, including college preparation, college access, college attainment, and graduation in fields where they are underrepresented in the labor force. Indicators appearing in past NCES reports, such as The Condition of Education and Status and Trends in the Education of Racial/Ethnic Groups, have addressed these topics in a general way. However, the intent of Congress indicated that a comprehensive examination of these specific topics was necessary to inform the development of prospective policies that will address these gaps. In order to select the most appropriate indicators to support this goal, NCES convened an expert group of researchers, policy analysts, and relevant U.S. Department of Education staff to review and discuss potential indicators, analysis methodologies, and data sources.

Organization of the Report

The Higher Ed:GAPS report presents indicators that include the most recently available, nationally representative data from NCES, other federal agencies, and selected items from ACT and the College Board. These measures are examined in seven chapters: Demographic Context, Characteristics of Schools, Student Behaviors and Afterschool Activities, Academic Preparation and Achievement, College Knowledge, Postsecondary Education, and Postsecondary Outcomes and Employment. Each indicator consists of text describing key findings, technical notes, one or more figures, and one or more tables. The indicator text generally describes overall differences by sex, differences by sex within racial/ethnic groups, differences between minority males and White and Asian males, and overall racial/ethnic differences. Chapter 8 presents findings from multivariate analyses of relationships between student, postsecondary enrollment and degree attainment. Appendix A is the technical appendix for the logistic regression analysis and imputation procedures, and appendix B provides a guide to sources at the end of the report. Standard error tables are available on the NCES website (

Definitions of Race and Ethnicity

The Office of Management and Budget (OMB) is responsible for the standards that govern the categories used to collect and present federal data on race and ethnicity. The OMB revised the guidelines on racial/ethnic categories used by the federal government in October 1997, with a January 2003 deadline for implementation (Office of Management and Budget 1997). The revised standards require a minimum of these five categories for data on race: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. The standards also require the collection of data on the ethnicity categories Hispanic or Latino and Not Hispanic or Latino. It is important to note that Hispanic origin is an ethnicity rather than a race, and therefore persons of Hispanic origin may be of any race. Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person's parents or ancestors before their arrival in the United States. The race categories White, Black, Asian, Native Hawaiian or Other Pacific Islander, and American Indian or Alaska Native, as presented in this report, exclude persons of Hispanic origin unless noted otherwise.

The categories are defined as follows:

American Indian or Alaska Native: A person having origins in any of the original peoples of North and South America (including Central America) and maintaining tribal affiliation or community attachment.

Asian: A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent, including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.

Black or African American: A person having origins in any of the black racial groups of Africa.

Native Hawaiian or Other Pacific Islander: A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.

White: A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.

Hispanic or Latino: A person of Mexican, Puerto Rican, Cuban, South or Central American, or other Spanish culture or origin, regardless of race.

Within this report, some of the category labels have been shortened in the indicator text, tables, and figures. American Indian or Alaska Native is denoted as American Indian/Alaska Native (except when separate estimates are available for American Indians alone or Alaska Natives alone); Black or African American is shortened to Black; and Hispanic or Latino is shortened to Hispanic. When discussed separately, Native Hawaiian or Other Pacific Islander is shortened to Native Hawaiian/Pacific Islander.

The indicators in this report draw from a number of different sources. Many are federal surveys that collect data using the OMB standards for racial/ethnic classification described above; however, some sources have not fully adopted the standards, and some indicators include data collected prior to the adoption of the OMB standards. This report focuses on the six categories that are the most common among the various data sources used: White, Black, Hispanic, Asian, Native Hawaiian/Pacific Islander, and American Indian/Alaska Native. Asians and Native Hawaiians/Pacific Islanders are combined into one category in indicators for which the data were not collected separately for the two groups.

Some of the surveys from which data are presented in this report give respondents the option of selecting either an "other" race category, a "two or more races" or "multiracial" category, or both. Where possible, indicators present data on the "two or more races" category; however, in some cases this category may not be separately shown because the information was not collected or due to other data issues. The "other" category is not separately shown. Any comparisons made between persons of one racial/ethnic group and "all other racial/ethnic groups" include only the racial/ethnic groups shown in the indicator. In some surveys, respondents are not given the option to select more than one race. In these surveys, respondents of two or more races must select a single race category. Any comparisons between data from surveys that give the option to select more than one race and surveys that do not offer such an option should take into account the fact that there is a potential for bias if members of one racial group are more likely than members of the others to identify themselves as "two or more races."³ For postsecondary data, foreign students are counted separately and are therefore not included in any racial/ethnic category. Please see Appendix B: Guide to Sources at the end of this report for specific information on each of the report's data sources.

Limitations of the Data

The relatively small sizes of the American Indian/Alaska Native and Native Hawaiian/Pacific Islander populations pose many measurement difficulties when conducting statistical analysis. Even in larger surveys, the numbers of American Indians/Alaska Natives and Native Hawaiians/Pacific Islanders included in a sample are often small. Researchers studying data on these two populations often face small sample sizes that reduce the reliability of results. Survey data for American Indians/Alaska Natives often have somewhat higher standard errors than data for other racial/ethnic groups (Cahalan et al. 1998). Due to large standard errors, differences that seem substantial are often not statistically significant and, therefore, not cited in the text.

Data on American Indians/Alaska Natives are often subject to inaccuracies that can result from respondents self-identifying their race/ethnicity. Research on the collection of race/ethnicity data suggests that the categorization of American Indian and Alaska Native is the least stable self-identification (U.S. Department of Labor, Bureau of Labor Statistics [BLS] 1995). The racial/ethnic categories presented to a respondent, and the way in which the question is asked, can influence the response, especially for individuals who consider themselves of mixed race or ethnicity. These data limitations should be kept in mind when reading this report.

As mentioned above, Asians and Native Hawaiians/Pacific Islanders are combined into one category in indicators for which the data were not collected separately for the two groups. The combined category can sometimes mask significant differences between subgroups. For example, prior to 2011, the National Assessment of Educational Progress (NAEP) collected data that did not allow for separate reporting of estimates for Asians and Native Hawaiians/Pacific Islanders. Information from the Digest of Education Statistics, 2011 (table 21), based on the Census Bureau Current Population Reports, indicates that 96 percent of all Asian/Pacific Islander 5- to 24-year-olds are Asian. This combined category for Asians/Pacific Islanders is more representative of Asians than Native Hawaiians/Pacific Islanders. For example, figure A shows the percentages of students scoring at or above the Proficient level on the grade 8 NAEP reading and mathematics assessments for Asian/Pacific Islander students as a combined category, Asian students as a separate category, and Native Hawaiian/Pacific Islander students as a separate category. In 2011, approximately 47 percent of 8th-grade students in the combined Asian/Pacific Islander category scored at or above the Proficient level on the reading assessment. When examining 8th-grade reading proficiency levels separately for Asians (49 percent) and Native Hawaiians/Pacific Islanders (24 percent), the difference between these two groups emerges. A similar pattern was found for the percentages of Asian/Pacific Islander (55 percent), Asian (58 percent), and Native Hawaiian/Pacific Islander (22 percent) 8th-grade students scoring at or above the Proficient level on the mathematics assessment.
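The masking effect described above is a matter of weighted-average arithmetic. The sketch below is illustrative only. It assumes that the 96 percent share reported for 5- to 24-year-olds also applies to the assessed 8th-grade students, which the report does not state, so it is not expected to reproduce the reported combined percentages exactly. The function name `combined_percentage` is hypothetical.

```python
def combined_percentage(pct_a, pct_b, share_a):
    """Weighted average of two subgroup percentages.

    share_a is the fraction of the combined group that belongs to subgroup A.
    """
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must be between 0 and 1")
    return share_a * pct_a + (1.0 - share_a) * pct_b


# Reading, 2011: Asian 49 percent, Native Hawaiian/Pacific Islander 24 percent.
# ASSUMPTION: 0.96 is the Asian share among 5- to 24-year-olds, used here as a
# stand-in for the unknown share among assessed 8th-grade students.
reading_combined = combined_percentage(49.0, 24.0, 0.96)
print(round(reading_combined, 1))
```

Under this assumption the combined value falls about 1 percentage point from the Asian figure and about 24 points from the Native Hawaiian/Pacific Islander figure, which is why the combined category says little about the smaller group.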

The indicators presented in this report are intended to provide a descriptive overview of the education data available from many federal surveys. Readers are cautioned not to draw causal inferences based on the univariate, bivariate, and multivariate results presented in this report. One of the limitations of bivariate statistics is that they describe subpopulation differences without taking into account the influence of other individual, family, school, or environmental factors. Many of the outcome variables examined in this report may be related to other factors outside of students' sex and race/ethnicity. Although multivariate analyses were conducted to explore some of those relationships, there may be other, more complex interactions and relationships that have not been explored. The indicators were selected to provide a range of data that are relevant to a variety of policy issues surrounding gaps in postsecondary access and persistence, rather than to emphasize any particular issue.

Statistical Comparisons

Data for indicators in this report are obtained primarily from two types of surveys: universe surveys and sample surveys. In the case of universe data, information is collected from every member of the population. When data from an entire population are available, estimates of the total population or a subpopulation are made by simply summing the units in the population or subpopulation. As a result, there is no sampling error, and observed differences are reported as true. In the case of sample surveys, a nationally representative sample of respondents is selected and asked to participate in the data collection. When a sample survey is used, statistical uncertainty is introduced, because the data come from only a portion of the entire population.

Sample survey data include weights to make estimates from the data representative of the population of interest. Indicators based on longitudinal survey data (i.e., Beginning Postsecondary Students Longitudinal Study, Early Childhood Longitudinal Study: Kindergarten Class of 1998–99, Education Longitudinal Study of 2002, and High School Longitudinal Study of 2009) include the specific weight variable name in the table and figure notes because the longitudinal datasets provide multiple weighting variables that could be used for analysis purposes.
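As a minimal sketch of how weights enter an estimate, the example below computes a weighted percentage from a handful of respondents. The data values, the variable names, and the function `weighted_percentage` are all hypothetical and do not come from any NCES dataset. The sketch covers only the point estimate, not the variance estimation that complex survey designs require.

```python
def weighted_percentage(outcomes, weights):
    """Estimate the percentage of the population with outcome == 1.

    outcomes: 0/1 indicators, one per respondent.
    weights:  sampling weights, one per respondent (how many population
              members each respondent represents).
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_hits = sum(w * y for y, w in zip(outcomes, weights))
    return 100.0 * weighted_hits / total_weight


# Hypothetical respondents: 1 = enrolled, 0 = not enrolled.
enrolled = [1, 0, 1, 1]
weights = [100.0, 300.0, 50.0, 50.0]

estimate = weighted_percentage(enrolled, weights)
unweighted = 100.0 * sum(enrolled) / len(enrolled)
print(estimate, unweighted)
```

In this made-up sample the unweighted share is 75 percent, but the one respondent who is not enrolled represents most of the population, so the weighted estimate is much lower.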

Statistical uncertainty about whether the sample population represents the population at large must be considered when reporting estimates and making comparisons. Using estimates calculated from data based on a sample of the population requires consideration of several factors before the estimates become meaningful. When using data from a sample, some margin of error will always be present in estimations of characteristics of the total population or subpopulation, because the data are available from only a portion of the total population. Consequently, data from samples can provide only an approximation of the true or actual value. The margin of error of an estimate, or the range of potential true or actual values, depends on several factors such as the amount of variation in the responses, the size and representativeness of the sample, and the size of the subgroup for which the estimate is computed. The magnitude of this margin of error is measured by what statisticians call the "standard error" of an estimate. When data from sample surveys are reported, the standard error is calculated for each estimate. The standard errors for all estimated totals, means, medians, and percentages reported in the Higher Ed:GAPS report tables can be viewed on the NCES website (

All statements about differences in this report are supported by the data, either directly in the case of universe surveys or with statistical significance testing in the case of sample survey data. When estimates are from a sample, caution is warranted when drawing conclusions about one estimate in comparison to another. Although one estimate may appear to be larger than another, a statistical test may find that the apparent difference between them is not reliably measurable due to the uncertainty around the estimates. In this case, the estimates will be described as having no measurable difference, meaning that the difference between them is not statistically significant.

Whether differences in means or percentages are statistically significant can be determined using the standard errors of the estimates. In this publication and others produced by NCES, when differences are statistically significant, the probability that the difference occurred by chance is less than 5 percent.

For all Higher Ed:GAPS report indicators that include estimates based on samples, differences between estimates are stated only when they are statistically significant. To determine whether differences reported are statistically significant, two-tailed t tests at the .05 level are typically used. The t test formula for determining statistical significance is adjusted when the samples being compared are dependent. The t test formula is not adjusted for multiple comparisons. Due to the large sample sizes used for this report, many differences between estimates are statistically significant. Not all statistically significant results are reported in the text. This report focuses on reporting statistically significant differences between Black, Hispanic, American Indian/Alaska Native, and Native Hawaiian/Pacific Islander males and their peers.
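The report does not print the t test formula. The sketch below uses a common form for comparing two estimates from their standard errors, with an optional covariance term as one usual way to adjust for dependent samples. The function names, the example values, and the large-sample critical value of 1.96 for a two-tailed test at the .05 level are assumptions, not details taken from the report.

```python
import math


def t_statistic(est1, se1, est2, se2, cov=0.0):
    """t statistic for the difference between two estimates.

    cov is the covariance between the two estimates; leave it at 0.0 for
    independent samples (ASSUMPTION: dependence is handled this way).
    """
    variance = se1 ** 2 + se2 ** 2 - 2.0 * cov
    if variance <= 0:
        raise ValueError("variance of the difference must be positive")
    return (est1 - est2) / math.sqrt(variance)


def is_significant(t, critical=1.96):
    """Two-tailed test; 1.96 assumes large degrees of freedom."""
    return abs(t) > critical


# Hypothetical estimates (percent) and standard errors.
t = t_statistic(52.0, 1.2, 48.5, 1.6)
print(round(t, 2), is_significant(t))
```

With these hypothetical inputs the first estimate is 3.5 points higher, yet the test statistic falls short of the critical value, so the two estimates would be described as having no measurable difference.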

The appearance of a "!" symbol (meaning "Interpret data with caution") in a table or figure indicates a data cell with a high ratio of standard error to estimate (i.e., the coefficient of variation is greater than or equal to 0.30 but less than 0.50); the reader should use caution when interpreting such data. These estimates are still discussed, however, when statistically significant differences are found despite large standard errors. The appearance of a "‡" symbol (meaning "Reporting standards not met") indicates a data cell that is suppressed either due to a coefficient of variation that is greater than or equal to 0.50 or too few respondents to meet reporting standards.
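The thresholds above can be expressed as a small classification rule. This is a sketch under stated assumptions: the function name `reporting_flag` and its return labels are hypothetical, and the second suppression condition (too few respondents) is not modeled because the report gives no numeric cutoff for it.

```python
def reporting_flag(estimate, standard_error):
    """Classify an estimate by its coefficient of variation (SE / estimate).

    Returns "suppress" when CV >= 0.50, "!" when 0.30 <= CV < 0.50,
    and "" otherwise. The labels are illustrative, not NCES identifiers.
    """
    if estimate == 0:
        raise ValueError("coefficient of variation is undefined for a zero estimate")
    cv = abs(standard_error / estimate)
    if cv >= 0.50:
        return "suppress"
    if cv >= 0.30:
        return "!"
    return ""


# Hypothetical estimates with increasing standard errors.
for est, se in [(10, 2), (10, 4), (10, 5)]:
    print(est, se, repr(reporting_flag(est, se)))
```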

All calculations in the Higher Ed:GAPS report are based on unrounded estimates. Therefore, the reader may find that a calculation cited in text or figures, such as a difference or a percentage change, may not be identical to the calculation obtained using the rounded values shown in the accompanying tables. Although percentages reported in the tables are generally rounded to one decimal place (e.g., 76.5 percent), percentages reported in the text and figures are generally rounded from the original number to whole numbers (with any value of 0.50 or above rounded to the next highest whole number). While the data labels on the figures have been rounded to whole numbers, the graphical presentation of these data is based on the unrounded estimates shown in the corresponding table. Due to rounding, cumulative percentages may sometimes equal 99 or 101 percent, rather than 100 percent. In addition, sometimes a whole number in the text may seem rounded incorrectly based on its value when rounded to one decimal place. For example, the percentage 14.479 rounds to 14.5 at one decimal place, but rounds to 14 when reported as a whole number.
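The 14.479 example can be reproduced with a short sketch. It uses Python's `decimal` module with half-up rounding, because the built-in `round` rounds halves to the nearest even number and would not match the rule stated above. The helper name `round_half_up` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places):
    """Round to the given number of decimal places, with halves rounded up."""
    quantum = Decimal(1).scaleb(-places)  # places=1 gives Decimal("0.1")
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


original = "14.479"
table_value = round_half_up(original, 1)      # as shown in a table
text_value = round_half_up(original, 0)       # as reported in the text
re_rounded = round_half_up(table_value, 0)    # rounding the table value again

print(table_value, text_value, re_rounded)
```

Rounding the original number directly gives 14, while rounding the already-rounded table value gives 15. The text uses the first approach, which is why a whole number can look inconsistent with its one-decimal counterpart.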

Figure A. Percentage of students scoring at or above the Proficient level on the grade 8 National Assessment of Educational Progress (NAEP) reading and mathematics assessments, by different categorizations of race/ethnicity: 2011


³ Such bias was found by a National Center for Health Statistics study that examined race/ethnicity responses to the 2000 Census. This study found, for example, that as the percentage of multiple-race respondents in a county increased, the likelihood of respondents stating Black as their primary race increased among Black/White respondents but decreased among American Indian or Alaska Native/Black respondents. See Parker, J. et al. (2004). Bridging Between Two Standards for Collecting Information on Race and Ethnicity: An Application to Census 2000 and Vital Rates. Public Health Reports, 119 (2): 192–205. Available through