The Condition of Education contains indicators on the state of education in the United States and abroad. This report is available in two forms: on the National Center for Education Statistics (NCES) website (as a full PDF, as individual indicator PDFs, and in HTML) and on the NCES mobile website. All reference tables are hyperlinked within the PDF and HTML versions, as are the sources for each of the graphics. The reference tables can be found in other NCES publications, primarily the Digest of Education Statistics.
Data Sources and Estimates
The data in these indicators were obtained from many different sources—including students and teachers, state education agencies, local elementary and secondary schools, and colleges and universities—using surveys and compilations of administrative records. Users should be cautious when comparing data from different sources. Differences in aspects such as procedures, timing, question phrasing, and interviewer training can affect the comparability of results across data sources.
Most indicators in The Condition of Education summarize data from surveys conducted by NCES or by the Census Bureau with support from NCES. Brief descriptions of the major NCES surveys used in these indicators can be found in the Guide to Sources. More detailed descriptions can be obtained on the NCES website under “Surveys and Programs.”
The Guide to Sources also includes information on non-NCES sources used to develop indicators, such as the Census Bureau’s American Community Survey (ACS) and Current Population Survey (CPS). For further details on the ACS, see http://www.census.gov/acs/www/. For further details on the CPS, see http://www.census.gov/cps/.
Data for Condition of Education indicators are obtained from two types of surveys: universe surveys and sample surveys. In universe surveys, information is collected from every member of the population. For example, in a survey regarding certain expenditures of public elementary and secondary schools, data would be obtained from each school district in the United States. When data from an entire population are available, estimates of the total population or a subpopulation are made by simply summing the units in the population or subpopulation. As a result, there is no sampling error, and observed differences are reported as true.
Because universe surveys are often expensive and time-consuming, many surveys collect data from a sample of the population of interest (sample survey). For example, the National Assessment of Educational Progress (NAEP) assesses a representative sample of students rather than the entire population of students. When a sample survey is used, statistical uncertainty is introduced because the data come from only a portion of the entire population. This statistical uncertainty must be considered when reporting estimates and making comparisons. For more information, please see the discussion of standard errors below.
Various types of statistics derived from universe and sample surveys are reported in The Condition of Education. Many indicators report the size of a population or a subpopulation, and often the size of a subpopulation is expressed as a percentage of the total population. In addition, the average (or mean) value of some characteristic of the population or subpopulation may be reported. The average is obtained by summing the values for all members of the population and dividing the sum by the size of the population. An example is the annual average salaries of full-time instructional faculty at degree-granting postsecondary institutions. Another measure that is sometimes used is the median. The median is the midpoint value of a characteristic at or above which 50 percent of the population is estimated to fall, and at or below which 50 percent of the population is estimated to fall. An example is the median annual earnings of young adults who are full-time, full-year wage and salary workers.
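The definitions of the mean and the median above can be illustrated with a minimal sketch. The salary values are hypothetical and chosen only to show how a single large value pulls the mean above the median; they are not drawn from any NCES data source.

```python
from statistics import mean, median

# Hypothetical annual salaries (in dollars) for a small population.
salaries = [52_000, 61_000, 58_500, 75_000, 49_000, 120_000, 64_000]

# Mean: the sum of all values divided by the size of the population.
average_salary = mean(salaries)

# Median: the midpoint value, with half of the population at or below it
# and half at or above it.
median_salary = median(salaries)

# Size of a subpopulation expressed as a percentage of the total population.
above_60k = [s for s in salaries if s > 60_000]
percent_above_60k = 100 * len(above_60k) / len(salaries)

print(f"mean   = {average_salary:,.0f}")       # mean   = 68,500
print(f"median = {median_salary:,.0f}")        # median = 61,000
print(f"percent above 60,000 = {percent_above_60k:.1f}")
```

In this constructed example the one salary of 120,000 raises the mean to 68,500 while the median stays at 61,000, which is why the median is often preferred for skewed characteristics such as earnings.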
Estimates calculated from a sample of the population must be interpreted with several factors in mind. When using data from a sample, some margin of error will always be present in estimates of characteristics of the total population or subpopulation because the data are available from only a portion of the total population. Consequently, data from samples can provide only an approximation of the true or actual value. The margin of error of an estimate, or the range of potential true or actual values, depends on several factors such as the amount of variation in the responses, the size and representativeness of the sample, and the size of the subgroup for which the estimate is computed. The magnitude of this margin of error is measured by what statisticians call the “standard error” of an estimate. Larger standard errors typically mean that the estimate is less precise, while smaller standard errors typically indicate that the estimate is more precise.
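The effect of sample size on the standard error can be illustrated with a minimal sketch. It assumes simple random sampling, where the standard error of a mean is the sample standard deviation divided by the square root of the sample size. Surveys with complex sample designs require other estimation methods, so this is not how the standard errors in the reference tables are computed. The scores and the function name are hypothetical.

```python
import math
from statistics import stdev

def srs_standard_error(sample):
    """Standard error of the sample mean under simple random sampling
    (an illustrative assumption, not a design-based method)."""
    return stdev(sample) / math.sqrt(len(sample))

# Hypothetical scores. The larger sample simply repeats the same values
# eight times so that only the sample size differs between the two.
small_sample = [250, 262, 241, 270, 255, 248, 266, 259]
large_sample = small_sample * 8

se_small = srs_standard_error(small_sample)
se_large = srs_standard_error(large_sample)

print(f"n = {len(small_sample):2d}: standard error = {se_small:.2f}")
print(f"n = {len(large_sample):2d}: standard error = {se_large:.2f}")
```

With the same spread of responses, the sample of 64 yields a standard error one third the size of the standard error from the sample of 8, which shows why estimates for small subgroups tend to be less precise.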
When data from sample surveys are reported, the standard error is calculated for each estimate. The standard errors for all estimated totals, means, medians, or percentages are reported in the reference tables.
In order to caution the reader when interpreting findings in the indicators, estimates from sample surveys are flagged with a “!” when the standard error is between 30 and 50 percent of the estimate, and suppressed with a “‡” when the standard error is 50 percent of the estimate or greater.
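The flagging rule above can be sketched as a small function. The function name is hypothetical, and the treatment of the boundaries (exactly 30 percent is flagged, exactly 50 percent is suppressed) is an assumption based on the wording "between 30 and 50 percent" and "50 percent of the estimate or greater." The additional "too few cases" condition for suppression is not modeled.

```python
def reliability_flag(estimate, standard_error):
    """Return "" (no flag), "!" (interpret with caution), or
    "‡" (reporting standards not met), based on the standard error
    expressed as a fraction of the estimate."""
    if estimate == 0:
        # The ratio is undefined for a zero estimate; this sketch does not
        # guess how such cases are handled.
        raise ValueError("ratio is undefined for an estimate of zero")
    ratio = abs(standard_error / estimate)
    if ratio >= 0.50:
        return "‡"
    if ratio >= 0.30:   # assumption: the lower boundary is inclusive
        return "!"
    return ""

# Hypothetical estimates (in percent) with their standard errors.
for est, se in [(10.0, 1.0), (10.0, 4.0), (10.0, 6.0)]:
    print(est, se, repr(reliability_flag(est, se)))
```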
Data Analysis and Interpretation
When estimates are from a sample, caution is warranted when drawing conclusions about whether one estimate differs from another; about whether a time series of estimates is increasing, decreasing, or staying the same; or about whether two variables are associated. Although one estimate may appear to be larger than another, a statistical test may find that the apparent difference between them is not measurable due to the uncertainty around the estimates. In this case, the estimates will be described as having no measurable difference, meaning that the difference between them is not statistically significant.
Whether differences in means or percentages are statistically significant can be determined using the standard errors of the estimates. In the indicators in The Condition of Education and other reports produced by NCES, when differences are statistically significant, the probability that a difference of that size would occur by chance alone is less than 5 percent, according to NCES standards.
For all indicators that report estimates based on samples, differences between estimates (including increases and decreases) are stated only when they are statistically significant. To determine whether differences reported are statistically significant, two-tailed t tests at the .05 level are typically used. The t test formula for determining statistical significance is adjusted when the samples being compared are dependent. The t test formula is not adjusted for multiple comparisons, with the exception of statistical tests conducted using the NAEP Data Explorer. When the variables to be tested are postulated to form a trend over time, the relationship may be tested using linear regression or ANOVA trend analyses instead of a series of t tests. Indicators that use other methods of statistical comparison include a separate technical notes section. For more information on data analysis, please see the NCES Statistical Standards, Standard 5-1, available at http://nces.ed.gov/statprog/2012/pdf/Chapter5.pdf.
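A comparison of two estimates using their standard errors can be sketched as follows. This is a minimal illustration, not the NCES implementation: the function names are hypothetical, the critical value of 1.96 is the large-sample two-tailed value at the .05 level and is an assumption here, and the `correlation` argument stands in for the adjustment for dependent samples, using the standard result that the variance of a difference is reduced by twice the covariance of the two estimates.

```python
import math

# Assumption: large-sample two-tailed critical value at the .05 level.
CRITICAL_VALUE = 1.96

def t_statistic(est1, se1, est2, se2, correlation=0.0):
    """t statistic for the difference between two estimates.

    With correlation=0.0 the estimates are treated as independent.
    A nonzero correlation adjusts the variance for dependent samples.
    """
    variance = se1**2 + se2**2 - 2 * correlation * se1 * se2
    if variance <= 0:
        raise ValueError("variance of the difference must be positive")
    return (est1 - est2) / math.sqrt(variance)

def is_measurably_different(est1, se1, est2, se2, correlation=0.0):
    """True when the two-tailed test rejects 'no difference' at the .05 level."""
    return abs(t_statistic(est1, se1, est2, se2, correlation)) > CRITICAL_VALUE

# Hypothetical percentages: the same 4.5-point gap with different precision.
print(is_measurably_different(76.5, 1.2, 72.0, 1.5))  # True  (t is about 2.34)
print(is_measurably_different(76.5, 2.0, 72.0, 2.5))  # False (t is about 1.41)
```

The two calls compare the same pair of estimates; only the standard errors differ. With the larger standard errors the gap would be described as no measurable difference.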
Multivariate analyses, such as ordinary least squares (OLS) regression models, provide information on whether the relationship between an independent variable and an outcome measure (such as group differences in the outcome measure) persists, after taking into account other variables, such as student, family, and school characteristics. For The Condition of Education indicators that include a regression analysis, multiple categorical or continuous independent variables are entered simultaneously. A significant regression coefficient indicates an association between the dependent (outcome) variable and the independent variable, after controlling for other independent variables included in the regression model.
Data presented in the indicators typically do not investigate more complex hypotheses or support causal inferences. We encourage readers who are interested in more complex questions and in-depth analysis to explore other NCES resources, including publications, online data tools, and public- and restricted-use datasets at http://nces.ed.gov.
A number of considerations influence the ultimate selection of the data years to feature in the indicators. To make analyses as timely as possible, the latest year of available data is shown. The choice of comparison years is often also based on the need to show the earliest available survey year, as in the case of the NAEP and the international assessment surveys. In the case of surveys with long time frames, such as surveys measuring enrollment, a decade’s beginning year (e.g., 1980 or 1990) often starts the trend line. In the figures and tables of the indicators, intervening years are selected in increments in order to show the general trend. The narrative for the indicators typically compares the most current year’s data with those from the initial year and then with those from a more recent period. Where applicable, the narrative may also note years in which the data begin to diverge from previous trends.
Rounding and Other Considerations
All calculations within the indicators in this report are based on unrounded estimates. Therefore, the reader may find that a calculation, such as a difference or a percentage change, cited in the text or figure may not be identical to the calculation obtained by using the rounded values shown in the accompanying tables. Although values reported in the reference tables are generally rounded to one decimal place (e.g., 76.5 percent), values reported in each indicator are generally rounded to whole numbers (with any value of 0.50 or above rounded to the next highest whole number). Due to rounding, cumulative percentages may sometimes equal 99 or 101 percent rather than 100 percent.
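The rounding effects described above can be reproduced with a minimal sketch. The estimates are hypothetical, and the helper name is an assumption. The sketch uses the `decimal` module with `ROUND_HALF_UP` because Python's built-in `round` rounds halves to the nearest even number rather than up.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places="1"):
    """Round to the given precision, with halves rounded up (0.50 -> 1)."""
    return Decimal(str(value)).quantize(Decimal(places), rounding=ROUND_HALF_UP)

# A difference computed from unrounded estimates versus rounded ones.
a, b = Decimal("76.46"), Decimal("71.54")
diff_from_unrounded = round_half_up(a - b)                 # 4.92 -> 5
diff_from_rounded = round_half_up(a) - round_half_up(b)    # 76 - 72 = 4
print(diff_from_unrounded, diff_from_rounded)              # 5 4

# Percentages that sum to 100.0 unrounded can sum to 99 or 101 once rounded.
low = [Decimal("33.4"), Decimal("33.3"), Decimal("33.3")]
high = [Decimal("16.5"), Decimal("33.5"), Decimal("50.0")]
print(sum(round_half_up(p) for p in low))    # 99
print(sum(round_half_up(p) for p in high))   # 101
```

In the first comparison, the difference cited in the text (5) and the difference a reader would compute from rounded table values (4) disagree even though both are correct for their inputs.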
Race and Ethnicity
The Office of Management and Budget (OMB) is responsible for the standards that govern the categories used to collect and present federal data on race and ethnicity. The OMB revised the guidelines on racial/ethnic categories used by the federal government in October 1997, with a January 2003 deadline for implementation. The revised standards require a minimum of these five categories for data on race: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. The standards also require the collection of data on ethnicity using, at a minimum, two categories: Hispanic or Latino and Not Hispanic or Latino. It is important to note that Hispanic origin is an ethnicity rather than a race, and therefore persons of Hispanic origin may be of any race. Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before their arrival in the United States. The race categories White, Black, Asian, Native Hawaiian or Other Pacific Islander, and American Indian or Alaska Native, as presented in these indicators, exclude persons of Hispanic origin unless noted otherwise.
The categories are defined as follows:
Within these indicators, some of the category labels have been shortened in the text, tables, and figures for ease of reference. American Indian or Alaska Native is denoted as American Indian/Alaska Native (except when separate estimates are available for American Indians alone or Alaska Natives alone); Black or African American is shortened to Black; Hispanic or Latino is shortened to Hispanic; and Native Hawaiian or Other Pacific Islander is shortened to Pacific Islander.
The indicators in this report draw from a number of different data sources. Many are federal surveys that collect data using the OMB standards for racial/ethnic classification described above; however, some sources have not fully adopted the standards, and some indicators include data collected prior to the adoption of the OMB standards. This report focuses on the six categories that are the most common among the various data sources used: White, Black, Hispanic, Asian, Pacific Islander, and American Indian/Alaska Native. Asians and Pacific Islanders are combined into one category in indicators for which the data were not collected separately for the two groups.
Some of the surveys from which data are presented in these indicators give respondents the option of selecting either an "other" race category, a "Two or more races" or "multiracial" category, or both. Where possible, indicators present data on the "Two or more races" category; however, in some cases this category may not be separately shown because the information was not collected or due to other data issues. In general, the "other" category is not separately shown. Any comparisons made between persons of one racial/ethnic group and "all other racial/ethnic groups" include only the racial/ethnic groups shown in the indicator. In some surveys, respondents are not given the option to select more than one race. In these surveys, respondents of Two or more races must select a single race category. Any comparisons between data from surveys that give the option to select more than one race and surveys that do not offer such an option should take into account the fact that there is a potential for bias if members of one racial group are more likely than members of the others to identify themselves as "Two or more races."¹ For postsecondary data, foreign students are counted separately and are therefore not included in any racial/ethnic category.
The American Community Survey (ACS), conducted by the U.S. Census Bureau, collects information regarding specific racial/ethnic ancestry. Selected indicators include Hispanic ancestry subgroups (such as Mexican, Puerto Rican, Cuban, Dominican, Salvadoran, Other Central American, and South American) and Asian ancestry subgroups (such as Asian Indian, Chinese, Filipino, Japanese, Korean, and Vietnamese). In addition, selected indicators include "Two or more races" subgroups (such as White and Black, White and Asian, and White and American Indian/Alaska Native).
Limitations of the Data
The relatively small sizes of the American Indian/Alaska Native and Pacific Islander populations pose many measurement difficulties when conducting statistical analyses. Even in larger surveys, the numbers of American Indians/Alaska Natives and Pacific Islanders included in a sample are often small. Researchers studying data on these two populations often face small sample sizes that reduce the reliability of results. Survey data for American Indians/Alaska Natives often have somewhat higher standard errors than data for other racial/ethnic groups. Due to large standard errors, differences that seem substantial are often not statistically significant and, therefore, not cited in the text.
Data on American Indians/Alaska Natives are often subject to inaccuracies that can result from respondents self-identifying their race/ethnicity. According to research on the collection of race/ethnicity data conducted by the Bureau of Labor Statistics in 1995, the categorization of American Indian and Alaska Native is the least stable self-identification. The racial/ethnic categories presented to a respondent, and the way in which the question is asked, can influence the response, especially for individuals who consider themselves as being of mixed race or ethnicity. These data limitations should be kept in mind when reading this report.
As mentioned above, Asians and Pacific Islanders are combined into one category in indicators for which the data were not collected separately for the two groups. The combined category can sometimes mask significant differences between subgroups. For example, prior to 2011, the National Assessment of Educational Progress (NAEP) collected data that did not allow for separate reporting of estimates for Asians and Pacific Islanders. Information from Digest of Education Statistics, 2015 (table 101.20), based on the Census Bureau Current Population Reports, indicates that 96 percent of all Asian/Pacific Islander 5- to 24-year-olds are Asian. As a result, the combined category for Asians/Pacific Islanders is more representative of Asians than of Pacific Islanders.
In accordance with the NCES Statistical Standards, many tables in this volume use a series of symbols to alert the reader to special statistical notes. These symbols, and their meanings, are as follows:
— Not available.
† Not applicable.
# Rounds to zero.
! Interpret data with caution. The coefficient of variation (CV) for this estimate is between 30 and 50 percent.
‡ Reporting standards not met. Either there are too few cases for a reliable estimate or the coefficient of variation (CV) for this estimate is 50 percent or greater.
* p < .05 Significance level.
¹ Such bias was found by a National Center for Health Statistics study that examined race/ethnicity responses to the 2000 Census. This study found, for example, that as the percentage of multiple-race respondents in a county increased, the likelihood of respondents stating Black as their primary race increased among Black/White respondents but decreased among American Indian or Alaska Native/Black respondents. See Parker, J. et al. (2004). Bridging Between Two Standards for Collecting Information on Race and Ethnicity: An Application to Census 2000 and Vital Rates. Public Health Reports, 119(2): 192–205. Available through http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1497618.