
Comparison Between NAEP and State Reading Assessment Results: 2003

April 2008

National Center for Education Statistics



Executive Summary


From late January through early March 2003, the National Assessment of Educational Progress (NAEP) grade 4 and grade 8 reading and mathematics assessments were administered to representative samples of students in approximately 100 public schools in each state. The results of these assessments were announced in November 2003. Each state also carried out its own reading and mathematics assessments in the 2002-2003 school year, most of them testing grades 4 and 8. This report addresses the question of whether the results published by NAEP are comparable to the results published by individual state testing programs.

Objectives

The comparisons addressing the following four questions are based purely on test results; they do not compare the content of NAEP and state assessments.

  • How do states' achievement standards compare with each other and with NAEP?
  • Are NAEP and state assessment results correlated across schools?
  • Do NAEP and state assessments agree on achievement trends over time?
  • Do NAEP and state assessments agree on achievement gaps between subgroups?

How do states' achievement standards compare with each other and with NAEP?

Both NAEP and state education agencies have set achievement, or performance, standards for reading and have identified test score criteria for determining the percentages of students who meet those standards. Most states have multiple performance standards. These can be categorized into a primary standard, which, since the passage of No Child Left Behind, is generally the standard used for reporting adequate yearly progress (AYP), and standards above or below the primary standard. Most states refer to their primary standard as "proficient" or "meets the standard."

By matching percentages of students reported to be meeting state standards in schools participating in NAEP with the distribution of performance of students in those schools on NAEP, cutpoints on the NAEP scale can be identified that are equivalent to the scores required to meet a state's standards.
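In outline, this is an equipercentile mapping: the NAEP-equivalent cutpoint is the NAEP score above which the same weighted share of students falls as the share reported to meet the state standard. The sketch below illustrates the idea on hypothetical data (the names, distribution, and weights are illustrative assumptions; the report's actual procedure also works with plausible values and school-level matching):

```python
import numpy as np

def naep_equivalent_cutpoint(naep_scores, weights, pct_meeting_standard):
    """Return the NAEP score at which the weighted percentage of
    students scoring at or above it matches the percentage reported
    to meet the state standard (equipercentile mapping)."""
    order = np.argsort(naep_scores)
    scores = np.asarray(naep_scores)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum_frac = np.cumsum(w) / w.sum()            # weighted fraction at or below
    target = 1.0 - pct_meeting_standard / 100.0  # fraction below the cutpoint
    idx = np.searchsorted(cum_frac, target)
    return scores[min(idx, len(scores) - 1)]

# Hypothetical example: 62 percent of a state's students meet the standard.
rng = np.random.default_rng(0)
scores = rng.normal(218, 36, 2000)     # NAEP-like grade 4 reading scale
weights = rng.uniform(0.5, 1.5, 2000)  # sampling weights
print(round(naep_equivalent_cutpoint(scores, weights, 62.0), 1))
```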

From the analyses presented in chapter 2, we find:

  • The median of the states' primary reading standards, as reflected in their NAEP equivalents, is slightly below the NAEP basic level at grade 4 and slightly above the NAEP basic level at grade 8.
  • The primary standards vary greatly in difficulty across states, as reflected in their NAEP equivalents. In fact, among states, there is more variation in placement of primary reading standards than in average NAEP performance.
  • As a corollary, states with high primary standards tend to see few students meet their standards, while states with low primary standards tend to see most students meet their standards.
  • There is no evidence that setting a higher state standard is correlated with higher performance on NAEP. Students in states with high primary standards score just about the same on NAEP as students in states with low primary standards.

Are NAEP and state assessment results correlated across schools?

An essential criterion for comparing NAEP and state assessment results in a state is that the two assessments agree on which schools are high achieving and which are not. The critical statistic for testing this criterion is the correlation between schools' percentages meeting the primary standard as measured by NAEP and as measured by the state assessment. Generally, a correlation of at least .7 is important for confidence in linkages between them.1 In 2003, correlations between NAEP and state assessment measures of reading achievement were greater than .7 in 29 out of 51 jurisdictions in grade 4 and in 29 out of 48 in grade 8. Several factors other than similarity of the assessments depress this correlation.

One of these factors is a disparity between the standards: the correlation between the percentage of students meeting a high standard on one test and the percentage meeting a low standard on the other test is bound to be lower than the correlation between percentages meeting standards of equal difficulty on the two tests. To be fair and unbiased, comparisons of percentages meeting standards on two tests must be based on equivalent standards for both tests. To remove the bias of different standards, NAEP was rescored in terms of the percentages meeting each state's standard. Nevertheless, as discussed in chapter 3, other factors also depressed the correlations:

  • Correlations are biased downward by schools with small enrollments, by use of scores for an adjacent grade rather than the same grade, and by standards set near the extremes of a state's achievement distribution.
  • Estimates of what the correlations would have been if they were all based on scores on non-extreme standards in the same grade in schools with 30 or more students per grade were greater than .7 in 44 out of 48 states for grade 4 and in 33 out of 45 states for grade 8.
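Concretely, the statistic behind these comparisons is a weighted Pearson correlation across schools. A minimal sketch, assuming aligned school-level arrays (hypothetical names; not the report's estimation code, which also accounts for sampling error):

```python
import numpy as np

def weighted_school_correlation(pct_naep, pct_state, school_weights):
    """Weighted Pearson correlation between NAEP and state assessment
    school-level percentages meeting the state's primary standard."""
    w = np.asarray(school_weights, dtype=float)
    w = w / w.sum()
    x = np.asarray(pct_naep, dtype=float)
    y = np.asarray(pct_state, dtype=float)
    mx, my = (w * x).sum(), (w * y).sum()        # weighted means
    cov = (w * (x - mx) * (y - my)).sum()        # weighted covariance
    return cov / np.sqrt((w * (x - mx) ** 2).sum() * (w * (y - my) ** 2).sum())
```

By the rule of thumb in footnote 1, a value of .7 or above means roughly half of the school-level variance on one assessment is predictable from the other.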

Do NAEP and state assessments agree on achievement trends over time?

Comparisons are made between NAEP and state assessment reading achievement trends over three periods: from 1998 to 2003, from 1998 to 2002, and from 2002 to 2003. Achievement trends are measured by both NAEP and state assessments as gains in school-level percentages meeting the state's primary standard.2
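Measured this way, a trend discrepancy is simply the state-measured gain minus the NAEP-measured gain over the same schools. A minimal sketch, assuming aligned school-level arrays for the two years (hypothetical names; per footnote 2, the NAEP percentages are taken as already rescored against the state's standard, and significance testing is a separate step):

```python
import numpy as np

def weighted_gain(pct_early, pct_late, weights):
    """Weighted mean gain in school-level percentages meeting the
    state's primary standard between an early and a late year."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w * (np.asarray(pct_late) - np.asarray(pct_early))).sum()

def trend_discrepancy(state_early, state_late, naep_early, naep_late, weights):
    """State-measured gain minus NAEP-measured gain, in percentage points."""
    return (weighted_gain(state_early, state_late, weights)
            - weighted_gain(naep_early, naep_late, weights))
```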

From the analyses presented in chapter 4, we find:

  • For reading achievement trends from 1998 to 2003, there are significant differences between NAEP and state assessments in 5 of 8 states in grade 4 and 5 of 6 states in grade 8.
  • For trends from 1998 to 2002, there are significant differences in 5 of 11 states in grade 4 and 7 of 10 states in grade 8.
  • For trends from 2002 to 2003, there are significant differences in 9 of 31 states in grade 4 and 10 of 29 states in grade 8.
  • In aggregate, in both grades 4 and 8, reading achievement gains from 1998 to 2002 and from 1998 to 2003 measured by state assessments are significantly larger than those measured by NAEP.
  • Across states, there is no consistent pattern of agreement between NAEP and state assessment reports of gains in school-level percentages meeting the state's primary standard.

Do NAEP and state assessments agree on achievement gaps between subgroups?

Comparisons are made between NAEP and state assessment measurement of (1) reading achievement gaps in grades 4 and 8 in 2003 and (2) changes in these reading achievement gaps between 2002 and 2003. Comparisons are based on school-level percentages of Black, Hispanic, White, and economically disadvantaged and nondisadvantaged students achieving the state's primary reading achievement standard in the NAEP schools in each state.
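Under this design, a gap is the difference between two subgroups' weighted mean percentages meeting the standard, computed separately from NAEP and from the state assessment and then compared; a gap-reduction comparison differences those gaps across years. A minimal sketch (hypothetical names; the report's gap profiles are richer, e.g. the lower-half comparisons noted below):

```python
import numpy as np

def subgroup_pct(pct_meeting, weights):
    """Weighted mean school-level percentage meeting the standard
    for one subgroup (e.g., Black, White, or disadvantaged students)."""
    w = np.asarray(weights, dtype=float)
    return (w * np.asarray(pct_meeting, dtype=float)).sum() / w.sum()

def achievement_gap(pct_a, w_a, pct_b, w_b):
    """Gap in percentage points between subgroup A and subgroup B
    (e.g., White minus Black), computed separately for NAEP and the
    state assessment before the two gaps are compared."""
    return subgroup_pct(pct_a, w_a) - subgroup_pct(pct_b, w_b)
```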

From the analyses presented in chapter 5, we find:

  • In most states, gap profiles based on NAEP and state assessments are not significantly different from each other.
  • There was a small but consistent tendency for NAEP to find larger achievement gaps than state assessments did in comparisons of the lower half of the Black student population with the lower half of the White student population. This may be related to the method of measurement, rather than to actual achievement differences.
  • There was very little evidence of discrepancies between NAEP and state assessment measurement of gap reductions between 2002 and 2003.


Data Sources

This report makes use of test score data for 50 states and the District of Columbia from two sources: (1) NAEP plausible value files for the states participating in the 1998, 2002, and 2003 reading assessments, augmented by imputations of plausible values for the achievement of excluded students;3 and (2) state assessment files of school-level statistics compiled in the National Longitudinal School-Level State Assessment Score Database (NLSLSASD).4
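Because NAEP supplies plausible values rather than point scores (see footnote 3), a statistic such as the percentage meeting a cutpoint is computed once per plausible value and then averaged. A minimal sketch, assuming a student-by-plausible-value array (hypothetical names; NAEP files typically carry several plausible values per student):

```python
import numpy as np

def pct_at_or_above(plausible_values, weights, cutpoint):
    """Weighted percentage of students at or above a cutpoint,
    averaged over plausible values (one estimate per column)."""
    pvs = np.asarray(plausible_values, dtype=float)  # (n_students, n_pv)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    per_pv = ((pvs >= cutpoint).T * w).sum(axis=1) * 100  # one estimate per PV
    return float(per_pv.mean())
```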

All comparisons in the report are based on NAEP and state assessment results in schools that participated in NAEP, weighted to represent the states. Across states in 2003, the median percentage of NAEP schools for which state assessment records were matched was greater than 99 percent. However, results in this report represent about 95 percent of the regular public school population, because, for confidentiality reasons, state assessment scores are not available for the smallest schools in most states.

In most states, comparisons with NAEP grade 4 and 8 results are based on state assessment scores for the same grades, but in a few states for which tests were not given in grades 4 and 8, assessment scores from adjacent grades are used.

Because NAEP and state assessment scores were not available from all states prior to 2003, trends could not be compared in all states. Furthermore, in 10 of the states with available scores, either assessments or performance standards were changed between 1998 and 2003, precluding trend analysis in those states for some years. As a result, comparisons of trends from 2002 to 2003 are possible in 31 states for grade 4 and 29 states for grade 8; comparisons of trends from 1998 to 2002 are possible in only 11 states for grade 4 and 10 states for grade 8; and comparisons of trends from 1998 to 2003 are possible in only 8 states for grade 4 and 6 states for grade 8.

Because subpopulation achievement scores were not systematically acquired for the NLSLSASD prior to 2002 and are available for only a subset of states in 2002, achievement gap comparisons are limited to gaps in 2003 and changes in gaps between 2002 and 2003. In addition, subpopulation data are especially subject to suppression due to small sample sizes, so achievement gap comparisons are not possible for groups consisting of less than 10 percent of the student population in a state.

Black-White gap comparisons for 2003 are possible in 26 states for grade 4 and 20 states for grade 8; Hispanic-White gap comparisons in 14 states for grade 4 and 13 states for grade 8; and poverty gap comparisons in 31 states for grade 4 and 28 states for grade 8. Gap reduction comparisons, which require scores for both 2002 and 2003, are possible for Black-White trends in 18 states for grade 4 and 15 states for grade 8, and poverty trends in 13 states for grade 4 and 12 states for grade 8. However, Hispanic-White trends can only be compared in 6 states for grade 4 and 5 states for grade 8.


Caveats

Although this report brings together a large amount of information about NAEP and state assessments, there are significant limitations on the conclusions that can be reached from the results presented.

First, this report does not address questions about the content, format, or conduct of state assessments, as compared to NAEP. The only information presented in this report concerns the results of the testing—the achievement scores reported by NAEP and state reading assessments.

Second, this report does not represent all public school students in each state. It does not represent students in home schooling, private schools, or many special education settings. State assessment scores based on alternative tests are not included in the report, and no adjustments for non-standard test administrations (accommodations) are applied to scores. Student exclusion and nonparticipation are statistically controlled for NAEP data, but not for state assessment data.

Third, this report is based on school-level percentages of students, overall and in demographic subgroups, who meet standards. As such, it has nothing to say about measurement of individual student variation in achievement within these groups or differences in achievement that fall within the same discrete achievement level.

Finally, this report is not an evaluation of state assessments. State assessments and NAEP are designed for different, although overlapping, purposes. In particular, state assessments are designed to provide important information about individual students to their parents and teachers, while NAEP is designed for summary assessment at the state and national level. Findings of different standards, different trends, and different gaps are presented without suggestion that they be considered as deficiencies either in state assessments or in NAEP.


Conclusion

There are many technical reasons why different assessments of the same skill domain can yield different results. The analyses in this report have been designed to eliminate some of these, by (1) comparing NAEP and state results in terms of the same performance standards, (2) basing the comparisons on scores in the same schools, and (3) removing the effects of NAEP exclusions on trends. However, other differences remain untested, due to limitations of the available data.

The findings in this report necessarily raise more questions than they answer. For each state in which the correlation between NAEP and state assessment results is not high, a variety of alternative explanations must be investigated before conclusions can be reached about the cause of the relatively low correlation. The report evaluates some explanations but leaves others to be explored when more data become available.

Similarly, the explanations of differences in trends in some states may involve differences in populations tested, differences in testing accommodations, or other technical differences, even though the assessments may be testing the same domain of skills. Only further study will yield explanations of differences in measurement of achievement gaps. This report lays a foundation for beginning to study the effects of differences between NAEP and state assessments of reading achievement.




Suggested Citation
McLaughlin, D.H., Bandeira de Mello, V., Blankenship, C., Chaney, K., Esra, P., Hikawa, H., Rojas, D., William, P., and Wolman, M. (2008). Comparison Between NAEP and State Reading Assessment Results: 2003 (NCES 2008-474). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC.

See more information on comparing NAEP and state proficiency standards on the NAEP website.


1A correlation of at least .7 implies that roughly half (at least .7² = .49) of the variance of one variable can be predicted from the other variable.

2To provide an unbiased trend comparison, NAEP was rescored in terms of the percentages meeting the state's primary standard in the earliest trend year.

3Estimations of NAEP scale score distributions are based on an estimated distribution of possible scale scores (or plausible values), rather than point estimates of a single scale score.

4Most states have made school-level achievement statistics available on state websites since the late 1990s; these data have been compiled into a single database, the NLSLSASD, for use by educational researchers. These data can be downloaded from http://www.schooldata.org.

