Overview of the Assessment
Reporting the Assessment—Scale Scores and Performance Levels
Results Are Estimates
NAEP Reporting Groups
Exclusion Rates and Assessment Results
Cautions in Interpretations
One of the primary objectives of NAEP is to track trends in student performance over time. The NAEP long-term trend (LTT) assessments in reading and mathematics were administered throughout the nation in the 2019–2020 school year to students aged 9, 13, and 17. Because the age 17 assessment is administered from March through May, its 2020 administration was postponed due to the COVID-19 pandemic. NCES decided to re-administer the LTT reading and mathematics assessments to age-9 students in the winter of 2022 to examine the impact of the COVID-19 pandemic on student learning. The LTT program has charted educational progress since 1971 in reading and 1973 in mathematics. The LTT assessments differ from main NAEP; for example, the LTT instruments do not evolve based on changes in curricula or in educational practices. It is not possible to compare results from the main NAEP assessments in reading and mathematics with those of the LTT assessments because the instruments and methodologies of the two assessment programs are different.
Beginning with the 2004 administration of the long-term trend assessments in reading and mathematics, several changes were made to the assessment design. Any time changes are made in a long-term trend assessment, studies are required to ensure that the results can continue to be reported on the same trend line—that is, that they are validly comparable to earlier results. Analyses were needed to ensure that the 2004 results under the new design were comparable to the results from 1971 through 1999, under the earlier design. Therefore, two assessments were conducted in 2004. The revised assessment used the new design, and the original assessment replicated the former design. Comparisons of the results could then detect any shifts in results due to changes in test design. The original assessment linked the old assessments to the new one.
The content of the NAEP long-term trend assessments is determined by a set of objectives incorporating expert perspectives about the measurement of reading and mathematics. Read how the long-term trend assessment was developed and how the assessment was administered. Read more about what the long-term trend reading and long-term trend mathematics assessments measure.
The results are reported based on representative samples of students for the nation. The 2022 LTT assessment in reading and mathematics was administered to approximately 7,400 age-9 students from 410 schools. Ninety-two percent of the schools sampled in 2020 were also sampled in 2022. Eligibility for the age 9 sample was based on the calendar year. Students in the age 9 sample were 9 years old on January 1, 2022, with birth months January through December 2012.
The results of student performance on the long-term trend assessment are reported in two ways: as average scale scores and percentages of students performing at or above each performance level. Student performance in each subject area is summarized as an average score on a 0 to 500 scale. For each year in which the assessments were administered, achievement in a particular subject area is described for a group of students by their average scale score. Trends in student achievement are determined by examining the average scale scores attained by students in the current assessment year and comparing them to the average scale scores in other assessment years. While both the reading scale and the mathematics scale range from 0 to 500, the scale was derived independently for each subject. Therefore, average scale scores between subjects cannot be compared.
Student performance is also described in terms of the percentages of students attaining specific levels of performance. These performance levels correspond to five points on the reading and mathematics scales: 150, 200, 250, 300, and 350. For each subject area, the performance levels from lowest to highest are associated with increasingly advanced skills and knowledge. Examining the percentages of students in each year who attained each performance level provides additional insights into student achievement. Read more about the long-term trend performance levels.
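To make this way of reporting concrete, the sketch below tallies the share of scores at or above each of the five cutpoints. The scores are invented for illustration only and are not NAEP data.

```python
# Performance-level cutpoints on the 0-500 LTT scale (from the text above).
PERFORMANCE_LEVELS = [150, 200, 250, 300, 350]

def percent_at_or_above(scores, level):
    """Percentage of scores at or above a performance-level cutpoint."""
    return 100.0 * sum(s >= level for s in scores) / len(scores)

# Hypothetical scale scores for illustration (not NAEP data).
scores = [135, 180, 210, 240, 255, 270, 290, 305, 320, 360]
for level in PERFORMANCE_LEVELS:
    print(f"At or above {level}: {percent_at_or_above(scores, level):.0f}%")
```

Because the levels are cumulative cutpoints, the percentage at or above 150 is always at least as large as the percentage at or above 200, and so on up the scale.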
The average scores and percentages presented in the LTT reports are estimates because they are based on representative samples of students rather than on the entire population of students. As such, NAEP results are subject to a measure of uncertainty, reflected in the standard error of the estimates. The standard errors for the estimated scale scores and percentages in the figures and tables presented in the report are available in the NAEP Data Explorer.
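As a rough illustration of how a standard error quantifies that uncertainty, a 95 percent confidence interval can be formed as the estimate plus or minus 1.96 standard errors. The average score and standard error below are hypothetical.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """95% confidence interval: estimate +/- z * standard error."""
    margin = z * standard_error
    return (estimate - margin, estimate + margin)

# Hypothetical average scale score of 220 with a standard error of 0.8.
low, high = confidence_interval(220.0, 0.8)
print(f"95% CI: ({low:.1f}, {high:.1f})")  # (218.4, 221.6)
```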
Results are provided for groups of students defined by shared characteristics: race/ethnicity, gender, grade attended, highest level of parental education (for age 13 and age 17 students only), type of school, type of school location, eligibility for free/reduced-price school lunch, region of the country, status as students with disabilities, and status as students identified as English learners. Based on participation rate criteria, results are reported for groups of students only when adequate school representation and sufficient numbers of students are present. The minimum requirement is at least 62 students in a particular student group from at least five primary sampling units (PSUs). However, the data for all students, regardless of whether their group was reported separately, were included in computing overall results. Explanations of the reporting groups are presented below.
Results are presented for students in different racial/ethnic groups according to the following mutually exclusive categories: White, Black, Hispanic, and Other. Results for Asian/Pacific Islander and American Indian/Alaska Native students are not reported separately because there were too few students in the groups for statistical reliability in certain assessment years, but the full results for these two groups along with the other racial/ethnic groups are available in the NAEP Data Explorer. The data for all students, regardless of whether their racial/ethnic group was reported separately, were included in computing the overall national results.
Results by students’ race/ethnicity are presented in the report based on information collected from two different sources:
Observed Race/Ethnicity. Students were assigned to a racial/ethnic category based on the assessment administrator's observation. A category for Hispanic students did not exist in 1971, but was included in subsequent years. The results for the 2004 original assessment format and all previous assessment years are based on observed race/ethnicity.
School-Reported Race/Ethnicity. Data about students’ race/ethnicity from school records were collected in 2004, but were not collected for any of the previous NAEP long-term trend assessments. The results presented in this report for the 2004 revised assessment format and for 2008, 2012, and 2020 are based on school-reported race/ethnicity.
Results are reported separately for males and females. Gender was reported by the school.
The long-term trend assessments are administered to samples of students defined by age rather than by grade. Nine-year-olds are typically in fourth grade, 13-year-olds are typically in eighth grade, and 17-year-olds are typically in eleventh grade. Some students in each age group, however, are in a grade that is below or above the grade that is typical for their age. For example, some 17-year-olds are in the tenth or twelfth grade rather than the eleventh grade. Different factors may contribute to why students are in a lower or higher grade than is typical for their age. Such factors could include having started school a year earlier or later than usual, having been held back a grade, or having skipped a grade.
Age 13 and age 17 students were asked to indicate the extent of schooling for each of their parents, choosing among the following options: did not finish high school, graduated from high school, had some education after high school, or graduated from college. (Results for parental education are not reported at age 9 because research has shown that students' reports of their parents' education level are less reliable at this age.) The response indicating the highest level of education for either parent was selected for reporting. Although students in previous long-term trend assessments were asked about their parents' level of education, the wording of the question in the revised format of the reading assessments administered in 2004 and later differed from previous years. Consequently, results for the parental education variable in reading are reported only for the 2004, 2008, 2012, and 2020 assessments. This is not the case for the long-term trend mathematics assessment, however; results for this variable in mathematics go back to 1978.
The national results are based on a representative sample of students in both public schools and nonpublic schools. Nonpublic schools include private schools, Bureau of Indian Affairs schools, and Department of Defense schools. Private schools include Catholic, Conservative Christian, Lutheran, and other private schools. To ensure unbiased samples, NAEP statistical standards require that participation rates for original school samples be 70 percent or higher in order to report national results separately for public and private school students. At both age 9 and age 13 in 2020, and at age 9 in 2022, the school participation rates met the standards for reporting results separately for public schools but not for private schools. Catholic school participation rates also did not meet the standards for separate reporting in 2022.
NAEP results are reported for four mutually exclusive categories of school location: city, suburb, town, and rural. The categories are based on standard definitions established by the Federal Office of Management and Budget using population and geographic information from the U.S. Census Bureau. Schools are assigned to these categories in the NCES Common Core of Data based on their physical address. In 2007, the classification system was revised; therefore results are not available for type of location prior to 2008 in NAEP long-term trend assessments.
The new locale codes are based on an address's proximity to an urbanized area (a densely settled core with densely settled surrounding areas). This is a change from the original system based on metropolitan statistical areas. To distinguish the two systems, the new system is referred to as "urban-centric locale codes." The urban-centric locale code system classifies territory into four major types: city, suburban, town, and rural. Each type has three subcategories. For city and suburb, these are gradations of size—large, midsize, and small. Towns and rural areas are further distinguished by their distance from an urbanized area. They can be characterized as fringe, distant, or remote.
As part of the Department of Agriculture's National School Lunch Program (NSLP), schools can receive cash subsidies and donated commodities in return for offering free or reduced-price lunches to eligible children. Based on available school records, students were classified as either currently eligible for free/reduced-price school lunch or not eligible. Eligibility for free and reduced-price lunches is determined by students' family income in relation to the federally established poverty level. Students whose family income is at or below 130 percent of the poverty level qualify to receive free lunch, and students whose family income is between 130 percent and 185 percent of the poverty level qualify to receive reduced-price lunch. For the period July 1, 2021 through June 30, 2022, for a family of four, 130 percent of the poverty level was $34,450 and 185 percent was $49,025 in most states. The classification applies only to the school year when the assessment was administered (i.e., the 2021–2022 school year) and is not based on eligibility in previous years. If school records were not available, the student was classified as "Information not available." If the school did not participate in the program, all students in that school were classified as "Information not available."

As a result of the passage of the Healthy, Hunger-Free Kids Act of 2010, schools can use a new universal meal service option, the "Community Eligibility Provision" (CEP). Through CEP, eligible schools can provide meal service to all students at no charge, regardless of economic status and without the need to collect eligibility data through household applications. CEP became available nationwide in the 2014–2015 school year; as a result, the percentage of students in many states categorized as eligible for NSLP may have increased in comparison to 2013 due to this provision. Therefore, readers should interpret NSLP trend results with caution.
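The income thresholds above can be illustrated with a small sketch. The cutoffs are the figures quoted in the text for a family of four in most states for July 1, 2021 through June 30, 2022; the classification function itself is a simplified illustration, not the program's actual determination process.

```python
# Cutoffs quoted in the text for a family of four in most states (2021-22):
# 130% of the poverty level = $34,450 (free lunch eligibility)
# 185% of the poverty level = $49,025 (reduced-price eligibility)
FREE_LUNCH_CUTOFF = 34_450
REDUCED_PRICE_CUTOFF = 49_025

def nslp_category(family_income):
    """Simplified eligibility classification for a family of four."""
    if family_income <= FREE_LUNCH_CUTOFF:
        return "free lunch"
    if family_income <= REDUCED_PRICE_CUTOFF:
        return "reduced-price lunch"
    return "not eligible"

print(nslp_category(30_000))  # free lunch
print(nslp_category(40_000))  # reduced-price lunch
print(nslp_category(60_000))  # not eligible
```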
Results are not available for eligibility for the National School Lunch Program (NSLP) prior to 2004 in NAEP long-term trend assessments. See the proportion of students in each category at age 9 in reading and mathematics in the NAEP Data Explorer.
Results are not available for region of the country prior to 2004 in NAEP long-term trend assessments. Prior to 2003, NAEP results were reported for four NAEP-defined regions of the nation: Northeast, Southeast, Central, and West. As of 2003, to align NAEP with other federal data collections, NAEP analysis and reports have used the U.S. Census Bureau's definition of "region." The four regions defined by the U.S. Census Bureau are Northeast, South, Midwest, and West. The Central region used by NAEP before 2003 contained the same states as the Midwest region defined by the U.S. Census. The former Southeast region consisted of the states in the Census-defined South minus Delaware, the District of Columbia, Maryland, Oklahoma, Texas, and the section of Virginia in the District of Columbia metropolitan area. The former West region consisted of Oklahoma, Texas, and the states in the Census-defined West. The former Northeast region consisted of the states in the Census-defined Northeast plus Delaware, the District of Columbia, Maryland, and the section of Virginia in the District of Columbia metropolitan area. The table below shows how states are subdivided into these Census regions. All 50 states and the District of Columbia are listed. Other jurisdictions, including the Department of Defense Educational Activity schools, are not assigned to any region.
Northeast: Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont
Midwest: Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, Wisconsin
South: Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia
West: Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, Wyoming
SOURCE: U.S. Department of Commerce Economics and Statistics Administration.
Results are reported for students who were identified by school records as having a disability. A student with a disability may need specially designed instruction to meet his or her learning goals. A student with a disability will usually have an Individualized Education Program (IEP), which guides his or her special education instruction. Students with disabilities are often referred to as special education students and may be classified by their school as learning disabled (LD) or emotionally disturbed (ED). The goal of NAEP is that students who are capable of participating meaningfully in the assessment are assessed, but some students with disabilities selected by NAEP may not be able to participate, even with the accommodations provided. Beginning in 2009, NAEP disaggregated students with disabilities from students who were identified under section 504 of the Rehabilitation Act of 1973. The results for SD are based on students who were assessed and cannot be generalized to the total population of students with disabilities.
Results are reported for students who were identified by school records as being English learners (EL). (Note that English learners were previously referred to as limited English proficient [LEP].) The results for EL are based on students who were assessed and cannot be generalized to the total population of such students.
Assessing representative samples of students, including students with disabilities (SD) and English learners (EL), helps to ensure that NAEP results accurately reflect the educational performance of all students in the target population and are a meaningful measure of U.S. students' academic achievement over time.
To ensure that all selected students from the population can be assessed, many of the same accommodations that SD and EL students use on other tests are provided for those students participating in NAEP. Prior to 2004, no testing accommodations were allowed for students identified as SD and/or EL selected to participate in the long-term trend assessments. One of the changes introduced as part of the 2004 assessments was the use of accommodations, such as extra testing time or individual rather than group administration for students who needed such accommodations to participate in the assessments. The results for the 2004, 2008, 2012, and 2020 long-term trend assessments are based on administration procedures that also allowed accommodations.
Use the "Data Quick View" to see tables that summarize the percentages of students identified, excluded, and assessed in long-term trend.
For each student selected to participate in NAEP who was identified as either SD or EL, a member of the school staff most knowledgeable about the student completed an SD/EL questionnaire. Students with disabilities were excluded from the assessment if an IEP (individualized education program) team or equivalent group determined that the student could not participate in assessments such as NAEP; if the student's cognitive functioning was so severely impaired that he or she could not participate; or if the student's IEP required that the student be tested with an accommodation or adaptation not permitted or available in NAEP, and the student could not demonstrate his or her knowledge of the assessment subject area without that accommodation or adaptation.
A student who was identified as EL and who was a native speaker of a language other than English was excluded if the student received instruction in the assessment subject area (e.g., reading or mathematics) primarily in English for less than three school years, including the current year, or if the student could not demonstrate his or her knowledge of reading or mathematics in English without an accommodation or adaptation.
The differences between scale scores and between percentages discussed in the results take into account the standard errors associated with the estimates. Comparisons are based on statistical tests that consider both the magnitude of the difference between the group average scores or percentages and the standard errors of those statistics. Throughout the results, differences between scores or between percentages are discussed only when they are significant from a statistical perspective.
All differences reported are significant at the 0.05 level with appropriate adjustments for multiple comparisons. The term "significant" is not intended to imply a judgment about the absolute magnitude or the educational relevance of the differences. It is intended to identify statistically dependable population differences to help inform dialogue among policymakers, educators, and the public.
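A single unadjusted comparison of this kind can be sketched as follows: the difference between two group estimates is judged against its combined standard error. The actual NAEP procedures also adjust for multiple comparisons and for any dependence between samples, which this sketch omits; all numbers are invented for illustration.

```python
import math

def significant_difference(mean1, se1, mean2, se2, critical_z=1.96):
    """Two-sided test of the difference between two independent estimates
    at the 0.05 level (no multiple-comparison adjustment)."""
    se_diff = math.sqrt(se1**2 + se2**2)  # combined standard error
    z = (mean1 - mean2) / se_diff
    return abs(z) > critical_z

# Hypothetical: a 5-point score gap with standard errors of 1.1 and 0.9.
print(significant_difference(220.0, 1.1, 215.0, 0.9))  # True
# A 2-point gap with larger standard errors is not detectably different.
print(significant_difference(220.0, 2.5, 218.0, 2.5))  # False
```

This is why an apparent score difference between two years or two groups may not be reported as a change: the difference must be large relative to the uncertainty in both estimates.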
Users of this website are cautioned against interpreting NAEP results as implying causal relationships. Inferences related to student group performance or to the effectiveness of particular classroom practices, for example, should take into consideration the many socioeconomic and educational factors that may also impact performance.
The NAEP long-term trend scales make it possible to examine relationships between students' performance and various background factors measured by NAEP. However, a relationship that exists between achievement and another variable does not reveal its underlying cause, which may be influenced by a number of other variables. Similarly, the assessments do not reflect the influence of unmeasured variables. The results are most useful when they are considered in combination with other knowledge about the student population and the educational system, such as trends in instruction, changes in the school-age population, and societal demands and expectations.
Comparisons of the 2022, 2020, 2012, and 2008 results to the 2004 original or previous trend results should be interpreted with caution, bearing in mind the differences in assessment accommodations and changes to assessment procedures.