## Validation Studies of the Linkage between NAEP and TIMSS Eighth Grade Mathematics Assessments

Don McLaughlin
John Dossey
Fran Stancavage

Educational Statistical Services Institute
May 1997

### NAEP-TIMSS Mathematics Linkage Validation Summary

The NAEP and TIMSS 8th grade assessment instruments both covered number sense, measurement, geometry, statistics, and algebra. They are generally similar enough to warrant linkage for global comparisons at grade 8, but not necessarily for detailed comparisons of areas of student achievement or processes in classrooms. A few important differences were noted between the instruments, and these should be reported whenever the linkage is used as the basis for presenting comparisons.

Content Analysis Results

• The TIMSS mathematics assessment was embedded in a combined math and science assessment, and this may have had unknown effects on performance on the mathematics items.
• The NAEP mathematics assessment included blocks of items on which calculators were available and others on which rulers and cardboard shapes were to be used.
• There were somewhat more items on geometry in NAEP (19% vs. 13%).
• More TIMSS items involved computation (59% vs. 40%), and more involved decimals or fractions (34% vs. 13%).
• More of the TIMSS items were multiple choice (79% vs. 57%).
• More NAEP items than TIMSS items were difficult, based on percentages of correct responses given by U.S. students.

Correlational Results

In most cases in which an item-type was more prevalent on one assessment than on the other, the correlation between performance on the more prevalent item-type and other items on the same assessment was sufficiently high not to raise concerns about the linkage. The only exception to this involved the comparison of easy and difficult items. Although the differential prevalence of difficult items would reduce the correlation underlying the linkage by only about 1 percent in grade 8, any statements based on the linkage should mention that NAEP contained a larger percentage of difficult items.
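The kind of check described above can be illustrated with a minimal sketch. The summary does not state how the correlations were computed, so everything here is an assumption for illustration: the function names `pearson` and `subscore_correlation` are hypothetical, the response matrix is toy data rather than NAEP or TIMSS data, and a plain Pearson correlation of raw subscores stands in for whatever procedure the study actually used.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def subscore_correlation(responses, flagged_items):
    """Correlate each student's score on one item type with the
    score on all remaining items of the same assessment.

    responses: list of per-student lists of 0/1 item scores
    flagged_items: set of item indices belonging to the item type
    """
    type_scores = [sum(row[i] for i in flagged_items) for row in responses]
    rest_scores = [
        sum(v for i, v in enumerate(row) if i not in flagged_items)
        for row in responses
    ]
    return pearson(type_scores, rest_scores)

# Hypothetical toy data: 5 students x 6 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
difficult_items = {4, 5}  # assumed indices of the "difficult" items

r = subscore_correlation(responses, difficult_items)
print(f"correlation between difficult-item score and the rest: {r:.2f}")
```

A high value of `r` would suggest that the item type measures much the same thing as the rest of the assessment, which is the condition under which a difference in prevalence raises little concern about the linkage.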

## Validation Studies of the Linkage between NAEP and TIMSS Eighth Grade Science Assessments

Don McLaughlin
Senta Raizen
Fran Stancavage

Educational Statistical Services Institute
April 1997

### NAEP-TIMSS Science Linkage Validation Summary

The NAEP and TIMSS 8th grade assessment instruments both covered physical, earth, and life science. They are generally similar enough to warrant linkage for global comparisons of middle school science achievement, but not necessarily for detailed comparisons of areas of student achievement or processes in classrooms. A few important differences were noted between the instruments, and these should be reported whenever the linkage is used as the basis for presenting comparisons.

Content Analysis Results

• The TIMSS science assessment was part of a combined math and science assessment, which may have had unknown effects on performance on the science items.
• The NAEP science assessment included a block of hands-on laboratory-like items as one of the three blocks of items administered to each student.
• There were somewhat more items on physical science in TIMSS (45% vs. 31%).
• Twelve percent of the NAEP items involved graph-reading, compared to fewer than 1 percent of the TIMSS items.
• Seventy-three percent of TIMSS items were multiple choice, compared to 40 percent of NAEP items.
• More NAEP items than TIMSS items were difficult, based on percentages of correct responses given by U.S. 8th grade students. On multiple-choice items, 28 percent of the NAEP items, versus 9 percent of TIMSS items, were sufficiently difficult that fewer than 40 percent of students got them right, and on free-response items, the difference was 72 percent versus 32 percent. Moreover, 36 of 189 NAEP items were answered correctly by fewer than 20 percent of students, compared to only 4 of 140 TIMSS items.
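To put the last pair of counts on the same footing as the percentages quoted earlier in that item, the arithmetic can be sketched as follows. The counts come from the text above. The variable and function names are illustrative only.

```python
# Counts quoted in the summary: items that fewer than 20 percent of
# U.S. 8th grade students answered correctly, out of all items.
naep_hard, naep_total = 36, 189
timss_hard, timss_total = 4, 140

def share(count, total):
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total

naep_share = share(naep_hard, naep_total)
timss_share = share(timss_hard, timss_total)

print(f"NAEP:  {naep_share:.1f}% of items below 20 percent correct")
print(f"TIMSS: {timss_share:.1f}% of items below 20 percent correct")
```

This works out to roughly 19 percent of NAEP science items versus roughly 3 percent of TIMSS science items.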

Correlational Results

In most cases in which an item-type was more prevalent on one assessment than on the other, the correlation between performance on the more prevalent item-type and other items on the same assessment was sufficiently high not to raise concerns about the linkage. Exceptions to this were the NAEP hands-on items and graph-reading items, and the greater prevalence of multiple-choice items and easy items on TIMSS. Although none of these special types of items would reduce the correlation underlying the linkage by more than 6 percent, any statements based on the linkage should mention these differences.