National Center for Education Statistics
In late January through early March of 2003, the National Assessment of Educational Progress (NAEP) grade 4 and 8 reading and mathematics assessments were administered to representative samples of students in approximately 100 public schools in each state. The results of these assessments were announced in November 2003. Each state also carried out its own reading and mathematics assessments in the 2002-2003 school year, most including grades 4 and 8. This report addresses the question of whether the results published by NAEP are comparable to the results published by individual state testing programs.
Comparisons addressing the following four questions are based purely on the results of testing and do not compare the content of NAEP and state assessments.
Both NAEP and State Education Agencies have set achievement, or performance, standards for reading and have identified test score criteria for determining the percentages of students who meet the standards. Most states have multiple performance standards, and these can be categorized into a primary standard, which, since the passage of No Child Left Behind, is generally the standard used for reporting adequate yearly progress (AYP), and standards that are above or below the primary standard. Most states refer to their primary standard as "proficient" or "meets the standard."
By matching percentages of students reported to be meeting state standards in schools participating in NAEP with the distribution of performance of students in those schools on NAEP, cutpoints on the NAEP scale can be identified that are equivalent to the scores required to meet a state's standards.
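The cutpoint identification described above can be sketched as an equipercentile mapping: if p percent of students in the NAEP schools are reported as meeting the state standard, the equivalent NAEP cutpoint is the scale score that p percent of the weighted NAEP score distribution reaches or exceeds. The sketch below is a simplified illustration with hypothetical inputs; the report's actual procedure works with NAEP plausible values and sampling weights.

```python
import numpy as np

def naep_equivalent_cutpoint(naep_scores, weights, pct_meeting_state):
    """Equipercentile mapping: return the NAEP scale score with
    pct_meeting_state percent of the weighted score distribution
    at or above it (illustrative simplification)."""
    order = np.argsort(naep_scores)
    s = np.asarray(naep_scores, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    # Weighted percentage of students scoring strictly below each score
    pct_below = 100.0 * (np.cumsum(w) - w) / w.sum()
    # If p percent meet the standard, 100 - p percent fall below the cutpoint
    return float(np.interp(100.0 - pct_meeting_state, pct_below, s))
```

With uniform weights over scores 1 to 100, a state reporting 30 percent proficient maps to a NAEP cutpoint of 71, the score that exactly 30 of the 100 students reach or exceed; a tougher standard (fewer students meeting it) maps to a higher cutpoint.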
From the analyses presented in chapter 2, we find:
An essential criterion for the comparison of NAEP and state assessment results in a state is that the two assessments agree on which schools are high achieving and which are not. The critical statistic for testing this criterion is the correlation between schools' percentages achieving their primary standard, as measured by NAEP and by the state assessment. Generally, a correlation of at least .7 is important for confidence in linkages between them.1 In 2003, correlations between NAEP and state assessment measures of reading achievement were greater than .7 in 29 out of 51 jurisdictions in grade 4 and in 29 out of 48 in grade 8. Several factors other than similarity of the assessments depress this correlation.
One of these factors is a disparity between the standards: the correlation between the percentage of students meeting a high standard on one test and a low standard on the other test is bound to be lower than the correlation between percentages of students meeting standards of equal difficulty on the two tests. To be fair and unbiased, comparisons of percentages meeting standards on two tests must be based on equivalent standards for both tests. To remove the bias of different standards, NAEP was rescored in terms of percentages meeting the state's standard. Nevertheless, as discussed in chapter 3, other factors also depressed the correlations:
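The screening statistic described above is an ordinary Pearson correlation computed across schools, treating each school's pair of percentages meeting the primary standard (one from NAEP, one from the state test) as a single observation. A minimal unweighted sketch with hypothetical data follows; the report's analyses weight schools to represent the state.

```python
import math

def school_level_correlation(naep_pcts, state_pcts):
    """Pearson correlation across schools between NAEP-based and
    state-based percentages meeting the primary standard
    (unweighted illustration)."""
    n = len(naep_pcts)
    mx = sum(naep_pcts) / n
    my = sum(state_pcts) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(naep_pcts, state_pcts))
    sx = math.sqrt(sum((x - mx) ** 2 for x in naep_pcts))
    sy = math.sqrt(sum((y - my) ** 2 for y in state_pcts))
    return cov / (sx * sy)
```

A value above .7 on this statistic is the rule of thumb for confidence that the two assessments rank schools similarly.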
Comparisons are made between NAEP and state assessment reading achievement trends over three periods: from 1998 to 2003, from 1998 to 2002, and from 2002 to 2003. Achievement trends are measured by both NAEP and state assessments as gains in school-level percentages meeting the state's primary standard.2
From the analyses presented in chapter 4, we find:
Comparisons are made between NAEP and state assessment measurement of (1) reading achievement gaps in grades 4 and 8 in 2003 and (2) changes in these reading achievement gaps between 2002 and 2003. Comparisons are based on school-level percentages of Black, Hispanic, White, and economically disadvantaged and nondisadvantaged students achieving the state's primary reading achievement standard in the NAEP schools in each state.
From the analyses presented in chapter 5, we find:
This report makes use of test score data for 50 states and the District of Columbia from two sources: (1) NAEP plausible value files for the states participating in the 1998, 2002 and 2003 reading assessments, augmented by imputations of plausible values for the achievement of excluded students;3 and (2) state assessment files of school-level statistics compiled in the National Longitudinal School-Level State Assessment Score Database (NLSLSASD).4
All comparisons in the report are based on NAEP and state assessment results in schools that participated in NAEP, weighted to represent the states. Across states in 2003, the median percentage of NAEP schools for which state assessment records were matched was greater than 99 percent. However, results in this report represent about 95 percent of the regular public school population, because for confidentiality reasons state assessment scores are not available for the smallest schools in most states.
In most states, comparisons with NAEP grade 4 and 8 results are based on state assessment scores for the same grades, but in a few states for which tests were not given in grades 4 and 8, assessment scores from adjacent grades are used.
Because NAEP and state assessment scores were not available from all states prior to 2003, trends could not be compared in all states. Furthermore, in ten of the states with available scores, either assessments or performance standards were changed between 1998 and 2003, precluding trend analysis in those states for some years. As a result, comparisons of trends from 2002 to 2003 are possible in 31 states for grade 4 and 29 states for grade 8; comparisons of reading achievement trends from 1998 to 2002 are possible in only 11 states for grade 4 and 10 states for grade 8; and comparisons from 1998 to 2003 are possible in only 8 states for grade 4 and 6 states for grade 8.
Because subpopulation achievement scores were not systematically acquired for the NLSLSASD prior to 2002 and are only available for a subset of states in 2002, achievement gap comparisons are limited to gaps in 2003 and changes in gaps between 2002 and 2003. In addition, subpopulation data are especially subject to suppression due to small sample sizes, so achievement gap comparisons are not possible for groups consisting of fewer than 10 percent of the student population in a state.
Black-White gap comparisons for 2003 are possible in 26 states for grade 4 and 20 states for grade 8; Hispanic-White gap comparisons in 14 states for grade 4 and 13 states for grade 8; and poverty gap comparisons in 31 states for grade 4 and 28 states for grade 8. Gap reduction comparisons, which require scores for both 2002 and 2003, are possible for Black-White trends in 18 states for grade 4 and 15 states for grade 8, and for poverty trends in 13 states for grade 4 and 12 states for grade 8. However, Hispanic-White trends can be compared in only 6 states for grade 4 and 5 states for grade 8.
Although this report brings together a large amount of information about NAEP and state assessments, there are significant limitations on the conclusions that can be reached from the results presented.
First, this report does not address questions about the content, format, or conduct of state assessments, as compared to NAEP. The only information presented in this report concerns the results of the testing—the achievement scores reported by NAEP and state reading assessments.
Second, this report does not represent all public school students in each state. It does not represent students in home schooling, private schools, or many special education settings. State assessment scores based on alternative tests are not included in the report, and no adjustments for non-standard test administrations (accommodations) are applied to scores. Student exclusion and nonparticipation are statistically controlled for NAEP data, but not for state assessment data.
Third, this report is based on school-level percentages of students, overall and in demographic subgroups, who meet standards. As such, it has nothing to say about measurement of individual student variation in achievement within these groups or differences in achievement that fall within the same discrete achievement level.
Finally, this report is not an evaluation of state assessments. State assessments and NAEP are designed for different, although overlapping purposes. In particular, state assessments are designed to provide important information about individual students to their parents and teachers, while NAEP is designed for summary assessment at the state and national level. Findings of different standards, different trends, and different gaps are presented without suggestion that they be considered as deficiencies either in state assessments or in NAEP.
There are many technical reasons for different assessment results from different assessments of the same skill domain. The analyses in this report have been designed to eliminate some of these reasons, by (1) comparing NAEP and state results in terms of the same performance standards, (2) basing the comparisons on scores in the same schools, and (3) removing the effects of NAEP exclusions on trends. However, other differences remain untested, due to limitations on available data.
The findings in this report must necessarily raise more questions than they answer. For each state in which the correlation between NAEP and state assessment results is not high, a variety of alternative explanations must be investigated before reaching conclusions about the cause of the relatively low correlation. The report evaluates some explanations but leaves others to be explained when more data become available.
Similarly, the explanations of differences in trends in some states may involve differences in populations tested, differences in testing accommodations, or other technical differences, even though the assessments may be testing the same domain of skills. Only further study will yield explanations of differences in measurement of achievement gaps. This report lays a foundation for beginning to study the effects of differences between NAEP and state assessments of reading achievement.
McLaughlin, D.H., Bandeira de Mello, V., Blankenship, C., Chaney, K., Esra, P., Hikawa, H., Rojas, D., William, P., and Wolman, M. (2008). Comparison Between NAEP and State Reading Assessment Results: 2003 (NCES 2008-474). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC.
3Estimations of NAEP scale score distributions are based on an estimated distribution of possible scale scores (or plausible values), rather than point estimates of a single scale score.
4Most states have made school-level achievement statistics available on state websites since the late 1990s; these data have been compiled into a single database, the NLSLSASD, for use by educational researchers. These data can be downloaded from http://www.schooldata.org.