One state, Minnesota, participated in a state-level NAEP in 1996 and a state-level TIMSS in 1995 at grade 8. Consequently, these data provide a validation of the linking functions since they are independent of the data used to construct the links. Specifically, the linking functions developed from the U.S. National TIMSS and NAEP results can be used to convert the State NAEP results to projected results for that state on TIMSS. These projected results can then be compared with the actual TIMSS results.
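The projection step described above can be sketched in a few lines. This is a minimal illustration, assuming the linking function is linear and matches the national means and standard deviations of the two assessments, as is typical of statistical moderation. The function name and every number below are hypothetical placeholders, not the report's linking coefficients or the Minnesota results.

```python
def make_linear_link(naep_mean, naep_sd, timss_mean, timss_sd):
    """Return a function that maps a NAEP score onto the TIMSS scale.

    Assumption of this sketch: the link is linear and is chosen so that
    the national NAEP mean and SD map onto the national TIMSS mean and SD.
    """
    slope = timss_sd / naep_sd
    intercept = timss_mean - slope * naep_mean

    def project(naep_score):
        return intercept + slope * naep_score

    return project


# Hypothetical national summary statistics (NOT the report's values).
link = make_linear_link(naep_mean=270.0, naep_sd=36.0,
                        timss_mean=500.0, timss_sd=90.0)

# Hypothetical state NAEP mean, projected onto the TIMSS scale.
state_naep_mean = 284.0
predicted_timss_mean = link(state_naep_mean)
print(round(predicted_timss_mean, 1))
```

The projected value is what would then be set beside the state's actual TIMSS mean.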
Table 15 shows the results of applying the public school linking functions of Table 10 to the grade 8 public school data from Minnesota. The first two columns of Table 15 give the actual mean proficiency for the state from the TIMSS assessment, together with its standard error and a 95 percent confidence interval. The last two columns of the table give the predicted TIMSS mean, its standard error, and its 95 percent confidence interval, computed using the linking functions in Table 10 and the variance components in Table 11.
Table 16 compares the actual TIMSS results with the results predicted from NAEP in terms of the percentages above the TIMSS marker levels. Since the 95 percent confidence intervals for the predicted percentages from Equation 19 are nonsymmetric, all results are expressed in terms of confidence intervals.
Table 16. Ninety-five percent confidence intervals for the percentages above the TIMSS marker levels based on actual TIMSS data and on predictions from NAEP (data are from public schools only) for grade 8
The agreement between the actual TIMSS results and the results predicted from NAEP adds credibility to the linkage. Not only do the confidence intervals for the predicted TIMSS mean proficiencies contain the actual TIMSS means, and vice versa, but the intervals themselves substantially overlap. The actual and predicted TIMSS results are based on different students, in largely different schools, assessed in different years. That they still show this degree of overlap supports the usefulness of the predicted grade 8 TIMSS results.5
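The two checks named above (each interval contains the other mean, and the intervals overlap substantially) can be written down as a short sketch. It assumes symmetric intervals of the form mean plus or minus 1.96 standard errors, which suits the means in Table 15 but not the nonsymmetric percentage intervals of Table 16. The helper names and the means and standard errors are hypothetical, chosen only to illustrate the comparison.

```python
def confidence_interval(mean, se, z=1.96):
    """Symmetric normal-theory interval: mean +/- z * se."""
    return (mean - z * se, mean + z * se)


def contains(interval, value):
    lower, upper = interval
    return lower <= value <= upper


def overlap_fraction(a, b):
    """Length of the intersection divided by the length of the shorter interval."""
    intersection = min(a[1], b[1]) - max(a[0], b[0])
    shorter = min(a[1] - a[0], b[1] - b[0])
    return max(0.0, intersection) / shorter


# Hypothetical values (NOT taken from Table 15).
actual_mean, actual_se = 520.0, 7.0
predicted_mean, predicted_se = 526.0, 6.0

actual_ci = confidence_interval(actual_mean, actual_se)
predicted_ci = confidence_interval(predicted_mean, predicted_se)

print(contains(predicted_ci, actual_mean))   # predicted interval covers actual mean?
print(contains(actual_ci, predicted_mean))   # and vice versa?
print(round(overlap_fraction(actual_ci, predicted_ci), 2))
```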
An interesting feature of Table 15 deserves comment. Because the standard error of a predicted mean contains many components, the reader might be surprised that the standard errors for the actual TIMSS means are slightly larger than those of the means predicted from NAEP. This holds even though the standard error of the predicted mean includes many additional components beyond the naive value SE(x). In fact, using the components from Table 11, the error due to linking roughly doubles the naive standard error in this particular analysis.
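One way to read the "roughly doubles" statement is sketched here. It assumes that the naive sampling variance and the linking variance are independent and simply add, which is a simplification made for this sketch; the actual decomposition is the one given by the variance components in Table 11. The function names and the unit value are hypothetical.

```python
import math


def total_se(naive_se, linking_se):
    """Combine two independent error sources by adding their variances."""
    return math.sqrt(naive_se ** 2 + linking_se ** 2)


def implied_linking_se(naive_se, inflation):
    """Linking SE needed for the total SE to equal inflation * naive_se."""
    return naive_se * math.sqrt(inflation ** 2 - 1.0)


naive = 1.0  # naive SE in arbitrary units (hypothetical)
linking = implied_linking_se(naive, inflation=2.0)

print(round(linking, 3))                    # linking SE relative to naive SE
print(round(total_se(naive, linking), 3))   # total SE relative to naive SE
```

Under this additive assumption, a doubled standard error means the linking components contribute about three times the naive variance.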
The standard error of the predicted TIMSS mean is about the same size as that of the actual mean largely because of the difference in the sample sizes for the NAEP and TIMSS assessments in Minnesota. Around 2,400 grade 8 public school students in Minnesota were assessed with either NAEP mathematics or NAEP science, whereas only around 900 public school students in that state were assessed with TIMSS mathematics and science. All other things being equal, the standard error of a mean based on 900 students will be roughly 1.7 times as large as the standard error of a mean based on 2,400 students. Thus the increased variance of the predicted TIMSS mean is offset by the larger sample size for the NAEP data.6
6 The TIMSS sample design, which selected intact classrooms, also leads to somewhat larger standard errors than an equivalently sized NAEP sample, which randomly selects students within a school.
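The sample-size argument can be checked with a line of arithmetic. The sketch below assumes simple random sampling with the same score spread in both samples, so that the standard error of a mean scales as one over the square root of the sample size. It ignores the clustering effect mentioned in footnote 6, and the function name is hypothetical.

```python
import math


def se_ratio(n_small, n_large):
    """SE of a mean from n_small students divided by the SE from n_large
    students, assuming simple random samples with equal score spread."""
    return math.sqrt(n_large / n_small)


# Rounded sample sizes quoted in the text.
ratio = se_ratio(900, 2400)
print(round(ratio, 2))  # 1.63
```

With the rounded counts the ratio comes out near 1.6, close to the figure quoted in the text; the sample sizes themselves are only approximate.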
The validation data indicate how much information one can obtain from a linking study like this one. Loosely speaking, predicted TIMSS results from this study based on NAEP samples of 2,400 students are about as reliable as actual TIMSS results based on 900 students. Put another way, this statistical moderation study had no direct information about how the same student might perform on both assessments, and the information it provides from NAEP about student performance on TIMSS is, student for student, nearly three times less reliable than the information that would be obtained from a direct administration of TIMSS (2,400/900 is about 2.7).