Commissioner, National Center for Education Statistics
2011 NAEP-TIMSS Linking Study
October 24, 2013
Today I am pleased to release the findings from the NAEP-TIMSS Linking Study. The National Center for Education Statistics (NCES) initiated this special study in an effort to link scores of two assessments—the mathematics and science components of the National Assessment of Educational Progress (NAEP) and the Trends in International Mathematics and Science Study (TIMSS)—that were administered in 2011 to assess the proficiency of eighth-grade students.
We conducted this study because it is important to know how students educated in various U.S. states are performing against international standards. NCES coordinated this effort across the two assessment programs. The National Assessment Governing Board, which sets policy for NAEP, worked with NCES to modify the NAEP assessment schedule so that NAEP mathematics and science could be administered in the same year as TIMSS.
With this study, U.S. states and jurisdictions (the 50 U.S. states, the District of Columbia, and the Department of Defense schools) can compare the performance of their students assessed by NAEP with that of students educated in many other countries. In 2011, TIMSS assessed eighth-grade students in 47 “education systems” (not counting individual U.S. states). These systems included 38 countries, such as Australia, Finland, and Japan, and nine “subnational entities,” such as Alberta in Canada, Dubai in the United Arab Emirates, and England in Great Britain.
To conduct this linking study, NCES requested that some U.S. states participate in TIMSS directly. Nine states in total participated in the 2011 TIMSS: Alabama, California, Colorado, Connecticut, Florida, Indiana, Massachusetts, Minnesota, and North Carolina. These states were selected to participate based on their state enrollment, willingness to participate, previous experience as benchmarking participants in TIMSS, and geographic diversity. NCES also considered whether they as a group represented a substantial range of performance relative to the national average on NAEP.
Although both assessments measure student achievement in mathematics and science, the NAEP and TIMSS programs differ in several important respects. For instance, NAEP assesses students in winter, while TIMSS assesses students at different times of the year in different parts of the world. The testing time for NAEP is 50 minutes per subject, while TIMSS testing time lasts 90 minutes because it assesses the same students in both subjects.
The testing populations of the two programs also vary. For instance, NAEP includes students who are tested with accommodations, while TIMSS does not. In addition, results reported for the 52 states and jurisdictions by NAEP are based on students in public schools only, whereas most education systems in TIMSS assess students in public and private schools. The content areas and their coverage are somewhat different between NAEP and TIMSS as well. For example, NAEP classifies mathematics content into five areas, whereas TIMSS uses four content areas. You can read about comparisons of the assessment frameworks and test questions on the NCES website (http://nces.ed.gov). Pages 4 through 8 of the report, U.S. States in a Global Context: Results From the 2011 NAEP-TIMSS Linking Study, contrast many features of the NAEP and TIMSS programs and assessments.
NAEP and TIMSS both report results by average scale scores. For NAEP, the mathematics scale is 0-500 and the science scale is 0-300. The TIMSS scales for these subjects are both 0-1,000. In addition to scale scores, NAEP uses three achievement levels—Basic, Proficient, and Advanced—and TIMSS uses four benchmarks—Low, Intermediate, High, and Advanced—to report assessment results. The NAEP achievement levels are set by the National Assessment Governing Board, and the TIMSS benchmarks are determined by a panel of international content experts. These achievement levels and benchmarks provide a way to interpret average scores and understand how students’ proficiency in mathematics and science varies. You can find descriptions of the kinds of skills and knowledge required for answering items successfully at each TIMSS benchmark level in the report (on pages 6, 16, and 22) and on the TIMSS section of the NCES website (http://nces.ed.gov/timss).
Three linking methods—calibration, statistical projection, and statistical moderation—were applied in linking the NAEP and TIMSS scales. In the calibration approach, NAEP items were calibrated directly onto the TIMSS scale using multiple student samples including those assessed with booklets that included both NAEP and TIMSS items. In the statistical projection approach, projection functions were developed based on student samples assessed with special booklets that included both NAEP and TIMSS items. In the statistical moderation approach, the NAEP score distributions were adjusted to match certain characteristics of the TIMSS score distributions.
NCES selected the statistical moderation linking approach, the simplest of the three, to generate predicted TIMSS scores for the 43 U.S. states and jurisdictions that participated only in NAEP. In predicting the TIMSS scores, additional adjustments were made to make the assessed populations of NAEP and TIMSS more comparable. The accuracy of the prediction was evaluated by comparing two sets of scores—predicted TIMSS scores with actual TIMSS scores—for the nine states that participated in both NAEP and TIMSS in 2011. These comparisons indicate that predicted TIMSS scores for these nine states were not statistically different from their actual scores, except in two cases in one subject.
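To illustrate the general idea behind statistical moderation, the sketch below rescales a NAEP score linearly so that the NAEP mean and standard deviation line up with those of TIMSS. It assumes that the "certain characteristics" being matched are the mean and standard deviation, which is the usual textbook form of the method; the population adjustments described above are not included. The function names and all numeric values are hypothetical and are not figures from the study.

```python
def moderation_coefficients(naep_mean, naep_sd, timss_mean, timss_sd):
    """Return (slope, intercept) of a linear link from the NAEP scale to the
    TIMSS scale, chosen so that the linked NAEP distribution has the same
    mean and standard deviation as the TIMSS distribution."""
    slope = timss_sd / naep_sd
    intercept = timss_mean - slope * naep_mean
    return slope, intercept


def predict_timss(naep_score, slope, intercept):
    """Project a NAEP score (for example, a state average) onto the TIMSS scale."""
    return intercept + slope * naep_score


# Hypothetical national summary statistics, for illustration only.
NAEP_MEAN, NAEP_SD = 280.0, 36.0
TIMSS_MEAN, TIMSS_SD = 505.0, 75.0

slope, intercept = moderation_coefficients(NAEP_MEAN, NAEP_SD, TIMSS_MEAN, TIMSS_SD)

# A hypothetical state whose NAEP average is 290.
state_prediction = predict_timss(290.0, slope, intercept)
print(round(state_prediction, 1))
```

Under this kind of link, a state whose NAEP average sits a given number of NAEP standard deviations above the national mean is predicted to sit the same number of TIMSS standard deviations above the national TIMSS mean.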
Turning to the findings, in mathematics, compared to the TIMSS scale average that was set at 500, thirty-six states scored higher, six states scored lower, and scores for 10 states were not statistically different. The average TIMSS scores for these 52 states/jurisdictions ranged from 466 in Alabama to 561 in Massachusetts. Alabama’s score of 466 was higher than scores for 19 international education systems, while Massachusetts’ score of 561 was higher than scores for 42 international education systems. The U.S. average score in TIMSS mathematics was 507 (for the public schools).
Let’s look at the findings by percentages of students reaching the two highest TIMSS benchmarks of Advanced and High. The U.S. state with the highest percentage of students reaching the Advanced benchmark in mathematics was Massachusetts, with 19 percent. In Vermont, the second-highest-scoring state, 16 percent of eighth-graders reached the Advanced benchmark. The Republic of Korea, Singapore, and Chinese Taipei scored the highest of all participating education systems and also had the highest percentages of students reaching the Advanced benchmark. Forty-seven percent of eighth-graders in the Republic of Korea reached the Advanced benchmark, 48 percent of eighth-graders in Singapore reached the Advanced level, and 49 percent of eighth-graders in Chinese Taipei reached Advanced.
Of the states, Massachusetts also had the largest percentage of students reaching the High benchmark in mathematics (57 percent). Of all participating education systems, Singapore and the Republic of Korea had the highest percentages of students reaching the High benchmark, 78 percent and 77 percent, respectively. Pages 14 and 15 of the report show average mathematics scores and the percentages of students reaching each TIMSS benchmark for all states and education systems.
In science, compared to the TIMSS scale average set at 500, forty-seven states scored higher, three states scored lower, and scores for two states were not statistically different. The average TIMSS scores in science for the U.S. states and jurisdictions ranged from 453 in the District of Columbia to 567 in Massachusetts. The District of Columbia’s average TIMSS score of 453 was higher than scores for 14 education systems, while the scores for Massachusetts and Vermont were higher than scores for 43 education systems. The U.S. average score in TIMSS science was 522 (for the public schools).
In Massachusetts, the highest-scoring state in science, 24 percent of eighth-graders reached the Advanced benchmark, similar to the percentage in Chinese Taipei. Singapore, the top-scoring education system in science, had the highest percentage of students reaching the Advanced benchmark, 40 percent. Of the U.S. states, Massachusetts and Vermont had the highest percentages of eighth-graders reaching the High benchmark, with 61 percent in Massachusetts and 60 percent in Vermont. These percentages are similar to the percentage in Chinese Taipei. Sixty-nine percent of eighth-graders in Singapore reached the High benchmark in science, the highest percentage among the education systems participating in TIMSS. Pages 20 and 21 of the report show average science scores and the percentages of students reaching each TIMSS benchmark for all states and education systems.
The study suggests that most eighth-graders in the United States are competitive in mathematics and science when their performance is compared with that of their peers around the globe. Yet even our leading states are behind the highest-performing countries in terms of the percentages of students performing at the highest levels.
Beyond the report, which provides TIMSS scores for the 52 U.S. states and jurisdictions, the NCES website offers further information: an explanation of the methodology used to conduct the study, the results of comprehensive comparisons of the content tested by TIMSS and NAEP, and answers to frequently asked questions.
As always, NCES thanks the students and schools who participated in NAEP and TIMSS, without whom this study would not have been possible.