The data in this edition of the Digest of Education Statistics were obtained from many different sources—including students and teachers, state education agencies, local elementary and secondary schools, and colleges and universities—using surveys and compilations of administrative records. Users should be cautious when comparing data from different sources. Differences in aspects such as procedures, timing, question phrasing, and interviewer training can affect the comparability of results across data sources.
Most of the tables present data from surveys conducted by the National Center for Education Statistics (NCES) or conducted by other agencies and organizations with support from NCES. Some tables also include other data published by federal and state agencies, private research organizations, or professional organizations. Totals reported in the Digest are for the 50 states and the District of Columbia unless otherwise noted. Brief descriptions of the surveys and other data sources used in this volume can be found in Appendix A: Guide to Sources. For each NCES and non-NCES data source, the Guide to Sources also provides information on where to obtain further details about that source.
Data are obtained primarily from two types of surveys: universe surveys and sample surveys. In universe surveys, information is collected from every member of the population. For example, in a survey regarding certain expenditures of public elementary and secondary schools, data would be obtained from each school district in the United States. When data from an entire population are available, estimates of the total population or a subpopulation are made by simply summing the units in the population or subpopulation. As a result, there is no sampling error, and observed differences are reported as true.
Since universe surveys are often expensive and time consuming, many surveys collect data from a sample of the population of interest (sample surveys). For example, the National Assessment of Educational Progress (NAEP) assesses a representative sample of students rather than the entire population of students. When a sample survey is used, statistical uncertainty is introduced, because the data come from only a portion of the entire population. This statistical uncertainty must be considered when reporting estimates and making comparisons. For information about how NCES accounts for statistical uncertainty when reporting sample survey results, see “Data Analysis and Interpretation,” later in this Reader’s Guide.
Various types of statistics derived from universe and sample surveys are reported. Many tables report the size of a population or a subpopulation, and often the size of a subpopulation is expressed as a percentage of the total population.
In addition, the average (or mean) value of some characteristic of the population or subpopulation may be reported. The average is obtained by summing the values for all members of the population and dividing the sum by the size of the population. An example is the average annual salary of full-time instructional faculty at degree-granting postsecondary institutions. Another measure that is sometimes used is the median. The median is the midpoint value of a characteristic at or above which 50 percent of the population is estimated to fall, and at or below which 50 percent of the population is estimated to fall. An example is the median annual earnings of young adults who are full-time year-round workers. Some tables also present an average per capita, or per person, which represents an average computed for every person in a specified group or population. It is derived by dividing the total for an item (such as income or expenditures) by the number of persons in the specified population. An example is the per capita expenditure on education in each state.
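The three measures described above can be sketched in a few lines of Python. This is only an illustration: the salary figures, the spending total, and the helper name `per_capita` are made up for the example and are not Digest data.

```python
from statistics import mean, median

# Hypothetical annual salaries for five faculty members (not Digest data).
salaries = [52_000, 58_000, 61_000, 75_000, 104_000]

# Average (mean): sum of the values divided by the number of values.
average_salary = mean(salaries)

# Median: the midpoint value; half the values fall at or above it
# and half fall at or below it.
median_salary = median(salaries)


def per_capita(total, population):
    """Divide a total (e.g., expenditures) by the number of persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population


# Hypothetical: $1,200,000 spent across a population of 400 persons.
spending_per_person = per_capita(1_200_000, 400)
```

In this made-up example the mean is pulled above the median by the single high salary, which is one reason a table may report the median instead of the average.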
Many tables report financial data in dollar amounts. Unless otherwise noted, all financial data are in current dollars, meaning not adjusted for changes in the purchasing power of the dollar over time due to inflation. For example, 1996–97 teacher salaries in current dollars are the amounts that the teachers earned in 1996–97, without any adjustments to account for inflation. Constant dollar adjustments attempt to remove the effects of price changes (inflation) from statistical series reported in dollars. For example, if teacher salaries over a 20-year period are adjusted to constant 2018–19 dollars, the salaries for all years are adjusted to the dollar values that presumably would exist if prices in each year were the same as in 2018–19 (in other words, as if the dollar had constant purchasing power over the entire period). Any changes in the constant dollar amounts would reflect only changes in real values. Constant dollar amounts are computed using price indexes. Price indexes for inflation adjustments can be found in web-only table 106.70. Each table that presents constant dollars includes a note indicating which index was used for the inflation adjustments; in most cases, the Consumer Price Index was used.
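As a rough sketch of how a constant dollar adjustment works, the function below scales a current-dollar amount by the ratio of the price index in the base year to the price index in the year the amount was earned. The function name and the index values are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from table 106.70.

```python
def to_constant_dollars(current_amount, index_in_year, index_in_base_year):
    """Restate a current-dollar amount in base-year (constant) dollars.

    index_in_year      -- price index for the year the amount was reported
    index_in_base_year -- price index for the year chosen as the base
    """
    if index_in_year <= 0:
        raise ValueError("price index must be positive")
    return current_amount * index_in_base_year / index_in_year


# Hypothetical index values: 160.0 in the earlier year, 256.0 in the base year.
# A salary of $40,000 in the earlier year is restated in base-year dollars.
salary_in_base_year_dollars = to_constant_dollars(40_000, 160.0, 256.0)
```

With these illustrative inputs, prices rose by a factor of 1.6 between the two years, so the earlier salary is scaled up by the same factor.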
When presenting data for a time series, some tables include both actual and projected data. Actual data are data that have already been collected. Projected data can be used when data for a recent or future year are not yet available. Projections are estimates that are based on recent trends in relevant statistics and patterns associated with correlated variables. Unless otherwise noted, all data in this volume are actual.
Estimates calculated from sample data require consideration of several factors before they can be interpreted. When using data from a sample, some margin of error will always be present in estimates of characteristics of the total population or subpopulation because the data are available from only a portion of the total population. Consequently, data from samples can provide only an approximation of the true or actual value. The margin of error of an estimate, or the range of potential true or actual values, depends on several factors such as the amount of variation in the responses, the size and representativeness of the sample, and the size of the subgroup for which the estimate is computed. The magnitude of this margin of error is measured by what statisticians call the standard error of an estimate.
When data from sample surveys are reported, the standard error is calculated for each estimate. In the tables, the standard error for each estimate generally appears in parentheses next to the estimate to which it applies. In order to caution the reader when interpreting findings, estimates from sample surveys are flagged with a “!” when the standard error is between 30 and 50 percent of the estimate and suppressed with a “‡” when the standard error is 50 percent of the estimate or greater. The term coefficient of variation (CV) refers to the ratio of the standard error to the estimate; for example, if an estimate has a CV of 30 percent, this means that the standard error is equal to 30 percent of the value of the estimate.
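The flagging rule above can be expressed as a small function. This is a sketch under two assumptions that the text does not spell out: a CV of exactly 30 percent is treated as flagged, and an estimate of zero (for which the CV is undefined) raises an error. The function name is hypothetical.

```python
def reliability_flag(estimate, standard_error):
    """Return "", "!" or "‡" based on the coefficient of variation (CV).

    CV is the ratio of the standard error to the estimate.
    Assumption: the lower boundary (CV of exactly 30 percent) is flagged.
    """
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    cv = standard_error / abs(estimate)
    if cv >= 0.50:
        return "‡"  # suppressed: standard error is 50 percent of estimate or more
    if cv >= 0.30:
        return "!"  # interpret with caution: CV between 30 and 50 percent
    return ""       # no flag


# Hypothetical estimates, all equal to 10.0, with different standard errors.
flags = [reliability_flag(10.0, se) for se in (2.0, 3.5, 6.0)]
```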
In addition to standard errors, which apply only to sample surveys, all surveys are subject to nonsampling errors. Nonsampling errors may arise when individual respondents or interviewers interpret questions differently; when respondents must estimate values; when coders, keyers, and other processors handle answers differently; when people who should be included in the universe are not; or when people fail to respond, either totally or partially. Total nonresponse means that people do not respond to the survey at all, while partial nonresponse (or item nonresponse) means that people fail to respond to specific survey items. To compensate for nonresponse, adjustments are often made. For universe surveys, an adjustment made for either type of nonresponse, total or partial, is often referred to as an imputation, which is often a substitution of the “average” questionnaire response for the nonresponse. These imputations are usually made separately within various groups of sample members that have similar survey characteristics. For sample surveys, total nonresponse is handled through nonresponse adjustments to the sample weights, and imputation for item nonresponse is usually made by substituting for a missing item the response to that item of a respondent having characteristics that are similar to those of the nonrespondent. For additional general information about imputations, see the NCES Statistical Standards (NCES 2014-097). Standard 4-1 provides information about imputation for item nonresponse. Appendix A: Guide to Sources includes some information about specific surveys’ response rates, nonresponse adjustments, and other efforts to reduce nonsampling error. Although the magnitude of nonsampling error is frequently unknown, idiosyncrasies that have been identified are noted in the appropriate tables.
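To illustrate the idea of substituting an “average” response within groups of similar members, the sketch below fills each missing item with the mean of the observed values in the same group. It is a simplified illustration only, not the procedure of any particular survey; the function name, the field names, and the records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def impute_group_means(records, group_key, item_key):
    """Return copies of records with missing items filled by the group mean.

    A missing item is represented by None. Groups with no observed values
    are left unimputed. Each returned record carries an "imputed" marker.
    """
    observed = defaultdict(list)
    for record in records:
        if record[item_key] is not None:
            observed[record[group_key]].append(record[item_key])
    group_means = {group: mean(values) for group, values in observed.items()}

    completed = []
    for record in records:
        filled = dict(record)  # copy so the original record is untouched
        group = filled[group_key]
        if filled[item_key] is None and group in group_means:
            filled[item_key] = group_means[group]
            filled["imputed"] = True
        else:
            filled["imputed"] = False
        completed.append(filled)
    return completed


# Hypothetical records: enrollment is missing for two schools.
schools = [
    {"locale": "city", "enrollment": 400},
    {"locale": "city", "enrollment": 600},
    {"locale": "city", "enrollment": None},
    {"locale": "rural", "enrollment": 120},
    {"locale": "rural", "enrollment": None},
]
completed_schools = impute_group_means(schools, "locale", "enrollment")
```

Keeping an explicit marker on imputed values, as done here, makes it possible to report or exclude them later.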
When estimates are from a sample, caution is warranted when drawing conclusions about one estimate in comparison to another or about whether a time series of estimates is increasing, decreasing, or staying the same. Although one estimate may appear to be larger than another, a statistical test may find that the apparent difference between them is not reliably measurable due to the uncertainty around the estimates. In this case, the estimates will be described as having “no measurable difference,” meaning that the difference between them is not statistically significant.
Whether differences in means or percentages are statistically significant can be determined using the standard errors of the estimates. In reports produced by NCES, a difference is described as statistically significant when, under NCES standards, the probability that it occurred by chance is less than 5 percent.
Data presented in the text do not investigate more complex hypotheses, account for interrelationships among variables, or support causal inferences. We encourage readers who are interested in more complex questions and in-depth analysis to explore other NCES resources, including publications, online data tools, and public- and restricted-use datasets at https://nces.ed.gov.
In text that reports estimates based on samples, differences between estimates (including increases and decreases) are stated only when they are statistically significant. To determine whether differences reported are statistically significant, two-tailed t tests at the .05 level are typically used. The t test formula for determining statistical significance is adjusted when the samples being compared are dependent. The t test formula is not adjusted for multiple comparisons, with the exception of statistical tests conducted using the NAEP Data Explorer (https://nces.ed.gov/nationsreportcard/data/). When the variables to be tested are postulated to form a trend, the relationship may be tested using linear regression, logistic regression, or ANOVA trend analysis instead of a series of t tests. These alternate methods of analysis test for specific relationships (e.g., linear, quadratic, or cubic) among variables. For more information on data analysis, please see the NCES Statistical Standards, Standard 5-1, available at https://nces.ed.gov/statprog/2012/pdf/Chapter5.pdf.
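A minimal sketch of such a comparison follows. It assumes two independent estimates, computes the test statistic as the difference divided by the square root of the sum of the squared standard errors, and compares it with 1.96, the large-sample critical value for a two-tailed test at the .05 level. The adjustment for dependent samples mentioned above is not shown, and the function names and input values are hypothetical.

```python
import math


def t_statistic(estimate_1, se_1, estimate_2, se_2):
    """Test statistic for the difference between two independent estimates."""
    return (estimate_1 - estimate_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)


def is_significant(estimate_1, se_1, estimate_2, se_2, critical_value=1.96):
    """Two-tailed test at the .05 level.

    Assumption: a large-sample critical value of 1.96 is appropriate.
    """
    return abs(t_statistic(estimate_1, se_1, estimate_2, se_2)) > critical_value


# Hypothetical percentages of 50.0 and 42.0 compared under two sets of
# standard errors: the same 8-point gap is significant only when the
# standard errors are small enough.
large_se_result = is_significant(50.0, 3.0, 42.0, 4.0)
small_se_result = is_significant(50.0, 1.5, 42.0, 2.0)
```

The example shows why an apparent difference may still be described as having “no measurable difference”: the size of the gap matters only in relation to the standard errors of the two estimates.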
A number of considerations influence the ultimate selection of the data years to include in the tables and to feature in the text. To make analyses as timely as possible, the latest year of available data is shown. The choice of comparison years is often also based on the need to show the earliest available survey year, as in the case of NAEP and the international assessment surveys. The text typically compares the most current year’s data with those from the initial year and then with those from a more recent year. In the case of surveys with long time frames, such as surveys measuring enrollment, changes over the course of a decade may be noted in the text. Where applicable, the text may also note years in which the data begin to diverge from previous trends. In figures and tables, intervening years are selected in increments in order to show the general trend.
All calculations are based on unrounded estimates. Therefore, a calculation cited in the text or a figure, such as a difference or a percentage change, may not be identical to the result obtained by using the rounded values shown in the accompanying tables. Although values reported in the tables are generally rounded to one decimal place (e.g., 76.5 percent), values reported in the text are generally rounded to whole numbers (with any value of 0.50 or above rounded to the next higher whole number). Due to rounding, cumulative percentages may sometimes equal 99 or 101 percent rather than 100 percent.
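The effect of rounding can be reproduced with a short sketch. It uses Python's `decimal` module with `ROUND_HALF_UP` because the built-in `round()` rounds ties to the nearest even number, which does not match the rule described above. The helper name and the percentages are hypothetical, and the sketch assumes nonnegative values.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places=0):
    """Round so that a trailing 5 always rounds up (for nonnegative values)."""
    quantum = Decimal(1).scaleb(-places)  # 1, 0.1, 0.01, ...
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# Hypothetical unrounded estimates (percentages).
unrounded_a = Decimal("76.54")
unrounded_b = Decimal("70.46")

# Difference computed from unrounded estimates, then rounded for reporting.
reported_difference = round_half_up(unrounded_a - unrounded_b, 1)

# Difference a reader would compute from the rounded table values.
table_difference = round_half_up(unrounded_a, 1) - round_half_up(unrounded_b, 1)

# Text values are rounded to whole numbers, with .50 rounding up.
text_value = round_half_up("76.5")
```

Here the reported difference and the difference computed from the rounded table values disagree by one-tenth of a percentage point, which is the kind of discrepancy the paragraph above describes.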
The Office of Management and Budget (OMB) is responsible for the standards that govern the categories used to collect and present federal data on race and ethnicity. The OMB revised the guidelines on racial/ethnic categories used by the federal government in October 1997, with a January 2003 deadline for implementation. The revised standards require a minimum of these five categories for data on race: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. The standards also require the collection of data on the ethnicity categories Hispanic or Latino and Not Hispanic or Latino. It is important to note that Hispanic origin is an ethnicity rather than a race, and therefore persons of Hispanic origin may be of any race. Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before their arrival in the United States. The race categories White, Black, Asian, Native Hawaiian or Other Pacific Islander, and American Indian or Alaska Native exclude persons of Hispanic origin unless otherwise noted.
For a description of each racial/ethnic category, please see the “Racial/ethnic group” entry in Appendix B: Definitions. Some of the category labels are shortened for more concise presentation in text, tables, and figures. American Indian or Alaska Native is denoted as American Indian/Alaska Native (except when separate estimates are available for American Indians alone or Alaska Natives alone); Black or African American is shortened to Black; and Hispanic or Latino is shortened to Hispanic. When discussed separately from Asian estimates, Native Hawaiian or Other Pacific Islander is shortened to Pacific Islander.
Many of the data sources used for this volume are federal surveys that collect data using the OMB standards for racial/ethnic classification described above; however, some sources have not fully adopted the standards, and some tables include historical data collected prior to the adoption of the OMB standards. Asians and Pacific Islanders are combined into a single category for years in which the data were not collected separately for the two groups. The combined category can sometimes mask significant differences between the two subgroups. For example, prior to 2011, NAEP collected data that did not allow for separate reporting of estimates for Asians and Pacific Islanders. The population counts presented in table 101.20, based on the U.S. Census Bureau’s Current Population Reports, indicate that 96 percent of all Asian/Pacific Islander 5- to 17-year-olds were Asian in 2010. Thus, the combined category for Asians/Pacific Islanders is more representative of Asians than of Pacific Islanders.
Some surveys give respondents the option of selecting more than one race category, an “other” race category, or a “Two or more races” or “more than one race” category. Where possible, tables present data on the “Two or more races” category; however, in some cases this category may not be separately shown because the information was not collected or due to other data issues. Some tables include the “other” category. Any comparisons made between persons of one racial/ethnic group and persons of “all other racial/ethnic groups” include only the racial/ethnic groups shown in the reference table. In some surveys, respondents are not given the option to select more than one race category and also are not given an option such as “other” or “more than one race.” In these surveys, respondents of Two or more races must select a single race category. Any comparisons between data from surveys that give the option to select more than one race and surveys that do not offer such an option should take into account the fact that there is a potential for bias if members of one racial group are more likely than members of the others to identify themselves as “Two or more races.”1 For postsecondary data, foreign students are counted separately and are therefore not included in any racial/ethnic category. In addition to the major racial/ethnic categories, several tables include Hispanic ancestry subgroups (such as Mexican, Puerto Rican, Cuban, Dominican, Salvadoran, Other Central American, and South American) and Asian ancestry subgroups (such as Asian Indian, Chinese, Filipino, Japanese, Korean, and Vietnamese). In addition, selected tables include “Two or more races” subgroups (such as White and Black, White and Asian, and White and American Indian/Alaska Native).
Due to large standard errors, some differences that seem substantial are not statistically significant and, therefore, are not cited in the text. This situation often applies to estimates involving American Indians/Alaska Natives and Pacific Islanders. The relatively small sizes of these populations pose many measurement difficulties when conducting statistical analysis. Even in larger surveys, the numbers of American Indians/Alaska Natives and Pacific Islanders included in a sample are often small. Researchers studying data on these two populations often face small sample sizes that increase the size of standard errors and reduce the reliability of results. Readers should keep these limitations in mind when comparing estimates presented in the tables.
As mentioned, caution should be exercised when comparing data from different sources. Differences in sampling, data collection procedures, coverage of target population, timing, phrasing of questions, scope of nonresponse, interviewer training, and data processing and coding mean that results from different sources may not be strictly comparable. For example, the racial/ethnic categories presented to a respondent, and the way in which the question is asked, can influence the response, especially for individuals who consider themselves of more than one race or ethnicity. In addition, data on American Indians/Alaska Natives are often subject to inaccuracies that can result from respondents self-identifying their race/ethnicity. Research on the collection of race/ethnicity data suggests that the categorization of American Indian and Alaska Native is the least stable self-identification (for example, the same individual may identify as American Indian when responding to one survey but may not do so on a subsequent survey).2
1 For discussion of such bias in responses to the 2000 Census, see Parker, J., et al. (2004). Bridging Between Two Standards for Collecting Information on Race and Ethnicity: An Application to Census 2000 and Vital Rates. Public Health Reports, 119(2): 192–205. Available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1497618/.
2 See U.S. Department of Labor, Bureau of Labor Statistics (1995). A Test of Methods for Collecting Racial and Ethnic Information (USDL 95-428). Washington DC: Author. Available at https://www.bls.gov/news.release/history/ethnic_102795.txt.