CENTER FOR EDUCATION REFORM AND EMPOWER AMERICA
ACHIEVEMENT IN THE UNITED STATES: PROGRESS SINCE A NATION AT RISK?
April 3, 1998
By:
Pascal D. Forgione, Jr., Ph.D.
U.S. Commissioner of Education Statistics
National Center for Education Statistics
Office of Educational Research and Improvement
U.S. Department of Education
Introduction

To ask if today's students are as smart as students used to be, if they know more or can do more, invokes the most traditional and simplest form of benchmarking: it compares performance today against the standard of performance in the past. That is the main question I will address today: whether students are performing better, using data from the National Assessment of Educational Progress (NAEP), which looks at national and state performance over time. What we shall see is that the news is mixed.

But there are other ways to ask the general question "how are we doing?" Policymakers often ask if American students are doing as well as they should or as well as they can. International comparisons present an alternative kind of benchmark for gauging overall performance and are probably the most important indicator to business leaders. Comparisons of academic performance among our major economic partners are leading indicators for employers who must compete in a global economy. International comparisons are the second group of data I want to present here today, and for that I will draw primarily from the Third International Mathematics and Science Study (TIMSS), the International Reading Literacy Study (IRLS), and the International Adult Literacy Survey (IALS). These data also paint an uneven picture of our relative educational standing.

Finally, I will present data on how students have responded to the call for better performance and higher standards. We shall see that students have changed their behavior since A Nation at Risk: they are more likely to graduate from high school, have higher educational aspirations, and take more academic courses.

I. Performance Over Time

Science. The overall pattern of performance in science for 9-, 13-, and 17-year-olds is one of early declines followed by a period of improvement (Figure A). For 9-year-olds, the overall trend shows improvement; in 1996, the average score for these students was higher than in 1970.
The overall trend for 13-year-olds was also positive, but there was no significant difference between the average science scores in 1970 and those in 1996. The average science score of 17-year-olds in 1996 was lower than the average score in 1969. Science scores have been increasing for all ages tested since 1982 and the publication of A Nation at Risk. Average scores at all three ages were higher in 1996 than in 1982 (for 17-year-olds, scores increased by 13 points; at age 13, scores increased 6 points; and at age 9, scores increased 9 points).

Mathematics. The overall pattern of mathematics achievement for 9-, 13-, and 17-year-olds shows overall improvement, with early declines or relative stability followed by increased performance (Figure B). Further, the scores of 9- and 13-year-olds were significantly higher in 1996 than in 1973. As with science, mathematics scores have also shown an upward trend at all ages since 1982 and the publication of A Nation at Risk. On average, the scores of 17-year-olds increased 8 points; 13-year-olds increased 5 points; and 9-year-olds increased 12 points.

Reading. The overall trend pattern in reading achievement is one of minimal changes across the assessment years (Figure C). The performance of 9-year-olds improved from 1971 to 1980, but has declined slightly since that time. However, in 1996, the average reading score for these students was higher than it was in 1971. Thirteen-year-olds showed moderate gains in reading achievement; in 1996, their average reading score was higher than that in 1971. There was an overall pattern of increase in reading scores for 17-year-olds, but the 1996 average score was not significantly different from that in 1971. Reading scores remained fairly stable between 1984 and 1996, the time period immediately following the release of A Nation at Risk. No significant changes at any age occurred during this time period.
Subgroup Performance on NAEP

Analyses of NAEP assessment data by race show how achievement gaps have been changing over time. In mathematics and reading, score gaps between white and black students aged 13 and 17 narrowed during the 1970s and the 1980s. Although there was some evidence of widening gaps during the late 1980s and 1990s, the score gaps in 1996 were smaller than those in the first assessment year for 13- and 17-year-olds in mathematics and for 17-year-olds in reading. Among 9-year-olds, score gaps in mathematics and reading have generally decreased across the assessment years, resulting in smaller gaps in 1996 compared to those in the first assessment year.

Since A Nation at Risk, performance in science has been increasing for white, black, and Hispanic students at ages 9, 13, and 17. At age 17, for example, average scores of white students increased 14 points from 1982 to 1996; for black students the increase was 25 points; and Hispanic students improved by 20 points. As a result of these increases, the gap between white and black students narrowed significantly (although it is still 47 points); the gap between white and Hispanic students also narrowed, though the change was not statistically significant (the gap in 1996 was 38 points).

Average mathematics scores of white, black, and Hispanic students have also increased since 1982. For 17-year-olds, for example, white students improved 9 points; black students improved 14 points; and Hispanic students increased 15 points. The gap between white and black students narrowed between 1982 and 1990, but has widened again through the 1990s, to 27 points in 1996. The gap between white and Hispanic students narrowed somewhat since 1982, though the change was not statistically significant, and the gap remained at 21 points in 1996.

Changes in reading were minimal for white, black, and Hispanic students at all ages during the years 1982 to 1996.
As a result, the gaps between white and black students remained about the same (in 1996 the gap at age 17 was 29 points). The gap between white and Hispanic students also changed little (in 1996 the gap at age 17 was 30 points).

In looking at subgroup performance in NAEP, it is particularly interesting to examine how gains made by subgroups over time can be masked by simple averages. Whenever the demographic balance among subgroups shifts, it can result in what is sometimes termed "Simpson's paradox," which is illustrated by the NAEP long-term reading gains of 9-year-old whites, blacks, and Hispanics compared to the overall average gains (Figure D). Between 1971 and 1996, 9-year-old students' average performance in reading rose by 4 points on a 500-point scale. Yet average score increases for each of the subgroups (blacks, Hispanics, and whites) exceeded the overall average increase. Why? Blacks and Hispanics, the lowest-scoring subgroups, represented a greater share of the total population in 1996 than in 1971, which had the paradoxical effect of lowering overall gains even as each group's performance improved.

Framework-based Assessments in Mathematics, Reading, and Science

In addition to, and separate from, the long-term trend assessments, NAEP also provides cross-sectional data based on grade-level student samples. These reports, called "The Nation's Report Card," involve more recently developed testing instruments. Instead of repeatedly using the same sets of questions and tasks necessary to generate trend data, the Nation's Report Card is framework-based; that is, its assessments reflect the best current thinking about what all children should know and be able to do. Each of these framework-based assessments is based on different sets of questions or tasks; therefore, the results from each cannot be directly compared.

Mathematics.
The NAEP 1996 mathematics assessment continues the commitment to evaluate and report the educational progress of students at grades 4, 8, and 12. Like previous NAEP mathematics assessments in 1990 and 1992, the 1996 assessment uses a framework influenced by the Curriculum and Evaluation Standards for School Mathematics of the National Council of Teachers of Mathematics (NCTM). The 1996 framework was updated to more adequately reflect recent curricular emphases and objectives. The framework characterizes the mathematics domain in terms of five content strands: number sense, properties, and operations; measurement; geometry and spatial sense; data analysis, statistics, and probability; and algebra and functions. Across the five content strands, the assessment examines mathematical abilities (conceptual understanding, procedural knowledge, and problem solving) and mathematical power (reasoning, connections, and communication). The positive news is that national data from the 1996 mathematics assessment showed progress in students' mathematics performance on a broad front, as compared with both the 1990 and 1992 assessments.
Since 1990, the NAEP reading assessments have increasingly emphasized the importance of having students construct a response to what they have read. This has been accomplished through the use of fewer but longer text selections and an increasing number of items that require students to answer with original responses as short as one or two sentences or as long as a few paragraphs. National data from the NAEP 1994 Reading Report Card showed no significant changes in average performance among the national population of fourth- or eighth-graders from 1992 to 1994. However, between these years there was a decline in the average reading performance of twelfth-graders in all three assessed purposes for reading.
The science framework for the 1996 NAEP science assessment was developed through a national consensus process involving educators, policymakers, science teachers, representatives of the business community, assessment and curriculum experts, and members of the general public. Two principles guide the science framework. First, the framework recognizes that scientific knowledge relies on the ability to organize disparate facts and to draw inferences from patterns and relationships. Second, the NAEP framework assumes that scientific performance depends on the ability to use scientific tools, procedures, and reasoning processes.

The core of the science framework is organized into three major fields: earth, physical, and life sciences. The assessment measures a student's ability to know and do science within these fields by testing the knowledge of important facts and concepts; the ability to explain, integrate, apply, analyze, evaluate, and communicate scientific information; and the ability to perform investigations, and evaluate and apply the results of investigations.
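
The Simpson's paradox effect described earlier in this section for 9-year-old reading gains can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the two subgroups, their average scores, and their population shares are hypothetical numbers chosen for the example, not NAEP results.

```python
# Hypothetical illustration of Simpson's paradox; none of these numbers are NAEP data.

def weighted_mean(scores, shares):
    """Overall average: each subgroup's score weighted by its population share."""
    return sum(scores[group] * shares[group] for group in scores)

# Assumed subgroup averages and population shares in an earlier and a later year.
scores_early = {"higher_scoring": 215.0, "lower_scoring": 170.0}
shares_early = {"higher_scoring": 0.85, "lower_scoring": 0.15}
scores_late = {"higher_scoring": 221.0, "lower_scoring": 185.0}
shares_late = {"higher_scoring": 0.70, "lower_scoring": 0.30}

# Gain within each subgroup, and gain in the population-weighted overall average.
subgroup_gains = {g: scores_late[g] - scores_early[g] for g in scores_early}
overall_gain = (weighted_mean(scores_late, shares_late)
                - weighted_mean(scores_early, shares_early))

print("subgroup gains:", subgroup_gains)
print("overall gain:", round(overall_gain, 2))
```

Under these assumed values, the two subgroups gain 6 and 15 points, yet the overall average rises by only about 2 points, because the lower-scoring subgroup's share of the population doubles between the two years.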
II. International Comparisons

NCES uses a combination of international and U.S. databases to look at the performance of our students. The combination of both types of data is required to see ourselves in stereographic or parallel perspective. U.S.-only data is blind in one eye, and international data is blind in the other. Both types of data are necessary for a clear and accurate view of our students' performance.

TIMSS is noteworthy not only because of its scope and magnitude, but also because of innovations in its design. In this international study, NCES, along with the National Science Foundation (NSF), combined multiple methodologies to create an information base that goes beyond simple student test score comparisons to examine the fundamental elements of schooling. Innovative research techniques include analyses of textbooks and curricula, videotapes, and ethnographic case studies. The result is a more complete portrait of how U.S. mathematics and science education differs from that of other nations, especially in extended comparisons with Germany and Japan.

The information in these reports can serve as a starting point for our efforts to define a "world-class" education. If the United States is to improve the mathematics and science education of its students, we must carefully examine not just how other countries rank, but also how their policies and practices help students achieve. TIMSS shows us where U.S. education stands, not just in terms of test scores, but also in terms of what is included in textbooks, taught in the schools, and learned by students. Examining these data provides a valuable opportunity to shed new light on education in the United States through the prism of other countries. At the same time, we should avoid the temptation to zero in on any one finding or leap to a conclusion without carefully considering the broader context.

Our students' international standing declines as students progress through school, according to TIMSS. Overall, U.S.
fourth-graders scored above the international average in both science and mathematics. Our eighth-graders scored above the international average in science but below it in mathematics. In twelfth grade, the scores of both our overall student population tested on general mathematics and science knowledge, and of our more advanced students tested in mathematics and physics, were well below the international average.

Fourth-Grade Findings. In both mathematics and science, U.S. fourth-graders performed above the international average. In mathematics, of the 26 participating TIMSS countries, U.S. fourth-graders outperformed students in 12 countries and were outperformed by students in seven countries. In science, U.S. students outperformed students in 19 countries, and were outperformed by students in only one country, Korea. In the six mathematics content areas, U.S. fourth-graders exceeded the international average in five. In the science content areas, U.S. fourth-graders exceeded the international average in all four areas assessed.

Eighth-Grade Findings. Data on eighth-grade performance from TIMSS suggest a general improvement in U.S. eighth-grade science scores as compared to a prior 1991 international assessment that placed U.S. students below average, though the tests and the set of participating nations have changed. The TIMSS data, however, show that U.S. eighth-grade students' mathematics performance remains slightly below the international average. U.S. eighth-grade students scored lower, on average, in mathematics than students in Canada, France, and Japan, and scored about the same as students in England and Germany. In science, eighth-grade students from the United States scored higher, on average, than students in France, about the same as students in Canada, England, and Germany, and lower than students in Japan. Figure G summarizes U.S. performance by content area on the fourth- and eighth-grade assessments.

Twelfth-Grade Findings.
The twelfth-grade TIMSS included 21 countries that conducted assessments of their students' general knowledge in mathematics and science during their last year in secondary school. Japan and other Asian countries that traditionally perform well in mathematics and science did not participate in the twelfth-grade TIMSS. Even with those Asian countries excluded, the United States performed relatively poorly. In the mathematics general knowledge assessment, U.S. twelfth-grade students were outperformed by students in 14 countries, and outperformed students in two countries. U.S. students performed the same as students in four other countries. In science, U.S. twelfth-grade students were outperformed by students in 11 countries, and outperformed students in two countries. U.S. students performed the same as students in seven other countries (Figure H).

Average test scores can mask important differences in the distribution of scores. For example, as a result of our country's diverse population, U.S. test score averages could be unduly lowered by a relatively large group of low-scoring students. In the twelfth-grade TIMSS assessments, however, the distribution of scores among U.S. students was no wider than that in most other participating countries; the U.S. scores also start and end lower than those in higher-scoring countries.

We also like to think that at least America's "best and brightest" students are among the smartest in the world; again, TIMSS findings suggest otherwise. Sixteen countries assessed advanced mathematics and physics among a select group of advanced students. In advanced mathematics, 11 countries outperformed the U.S., and no countries performed more poorly. In physics, 14 countries outperformed the U.S.; again, no countries performed more poorly (Figure I).

Several other factors suggested by observers also do not account for the relatively poor performance of U.S. students in grade 12. For example, it is not the case that a greater proportion of U.S.
students complete secondary school than in most of the other countries participating in this phase of TIMSS. Thus, the vast majority of U.S. young people are not being compared only to an elite in other countries. Furthermore, in TIMSS, the general pattern was that countries with higher proportions of young people enrolled in and completing secondary school outperformed countries with lower proportions.

The decentralized nature of decision-making about curriculum did not explain the poor performance of U.S. students. Some countries with decentralized decision-making outperformed us and some did not. The same was true of countries with centralized decision-making.

Finally, while U.S. students on average were about half a year younger than the average for all 21 countries, the age differential is not a major factor contributing to our poor performance. Not only is the age differential relatively small (and it is even smaller in the advanced assessments), but countries in which the average age of the students was similar to or younger than that in the U.S. also outperformed us. Among the other achievement findings drawn from the TIMSS:
While TIMSS has given us information on our international standing, it is most valuable in telling us what factors are related to high achievement in schools. The overarching message is that there is no easy solution or single nostrum that will magically increase our nation's performance. Indeed, TIMSS shows us that many of the cure-alls recommended in the past are not associated with high performance in all nations. For example, more seat time in math and science, more homework, and less television have often been recommended as methods for increasing student performance. These strategies may indeed be effective in the case of individual students or schools, yet TIMSS has shown us another perspective.

Comparisons of eighth-grade students, teachers, and classrooms in the U.S., Japan, and Germany have been particularly revealing. For example, U.S. eighth-graders already spend more seat time in math and science classes than students in Japan and Germany. Japan outperforms us at this grade level, while Germany does not, so more seat time is not necessarily a magic tonic. With respect to homework, U.S. eighth-grade teachers already assign more homework, spend more class time discussing it, and are more likely to count it toward grades than teachers in Japan. Japanese eighth-graders also watch just as much TV as students in the U.S.

The most recent TIMSS also found that the relatively poor performance of U.S. twelfth-grade students is not related to hours spent on homework, the use of calculators or computers, time spent watching television or working at a paid job, or to attitudes toward mathematics and science. These and other TIMSS findings show us that there is no single easy answer to achieving high performance in mathematics and science. But the TIMSS and other NCES data sources do suggest some problems in U.S. mathematics and science education that may help explain our relatively low achievement at the higher grade levels.
These data suggest that three issues are worth our attention: curriculum, coursetaking, and teacher preparation.

First, both the mathematics and science curricula in American high schools have been criticized for their lack of coherence, depth, and continuity, that is, for covering too many topics at the expense of in-depth understanding. As a result, our secondary school curricula leave American students with a more limited opportunity to learn than their counterparts have in other countries. For example, while most other countries introduce algebra and geometry in the middle grades, in the U.S. only 25 percent of students take algebra before high school. The TIMSS also demonstrated the relative "slowness" of our curricula. The study found that the topics on the twelfth-grade general knowledge mathematics assessment were covered by the ninth grade in the U.S., but by the seventh grade in most other countries. The topics on the general science assessment were covered by the eleventh grade in the U.S., but by the ninth grade in most other countries.

Students' exposure to challenging mathematics and science content is further limited by their coursetaking behavior. Despite some recent increases in academic coursetaking, fully 90 percent of all U.S. high school students stop taking mathematics before getting to calculus. Even among college-bound seniors, 52 percent have not taken physics, 48 percent have not taken trigonometry, and 77 percent have not taken calculus; almost one-third (31 percent) have not taken four years of mathematics. Among 1994 high school graduates, only 9 percent had taken calculus and 24 percent had taken physics.

Finally, courses and curricula do not teach themselves. At the most basic level, the education system relies on knowledgeable, well-trained teachers to convey the information students need to learn. What teachers do not know, they cannot teach.
And our data suggest that considerable percentages of our mathematics and science teachers have not been adequately exposed to the information they teach. Figure J shows that in 1993-94, 28 percent of public high school (grades 9-12) mathematics teachers and 18 percent of public high school science teachers were teaching out-of-field (that is, without a major or minor in their subject). Within science subfields, 31 percent of life science (biological/life sciences) teachers and 55 percent of physical science (chemistry, physics, earth science, and physical science) teachers lacked a major or minor in their subfield. In addition, 24 percent of mathematics teachers and 17 percent of science teachers lacked state certification in their teaching field.

In short, TIMSS does dispel myths, but more importantly, it shows us our own education system in clearer perspective. In our quest for factors related to better student performance, TIMSS encourages us to focus on rigorous content, focused curriculum, good teaching, and good training for teachers. TIMSS has shown us that the typical U.S. eighth-grade mathematics class usually discusses material taught in the seventh grade around the world. Compared to those in Japan, our mathematics teachers tend to focus on teaching specific math skills, rather than higher-level mathematical problem solving. For example, U.S. eighth-grade math teachers are more likely to merely state rather than explain mathematical concepts. Further, our curriculum includes more topics, and our teachers are more frequently interrupted while they are teaching, by loudspeakers and other outside agents, than are teachers in Japan and Germany. Our teachers also lack the one- or two-year apprenticeship in teaching that teachers in these two other countries complete before they enter the profession. Clearly TIMSS shows us that while it may not be easy, important change is needed to help our nation continue to improve its performance.
International Comparisons of Reading

In 1991, the IEA Reading Literacy Study assessed the reading literacy of fourth-graders (in 27 countries) and ninth-graders (in 31 countries). The underlying framework for this assessment paralleled the NAEP framework in that it too defined reading in terms of three text types: narrative, expository, and document. In contrast to the NAEP Reading Report Card, this study painted a more positive picture of the reading literacy of American students.
International Perspective on Labor Force Proficiency

Literacy has been viewed as one of the fundamental tools necessary for successful economic performance in industrialized societies. As society becomes more complex and low-skill jobs continue to disappear, concern about adults' ability to use written information to function in society continues to increase. Within countries, literacy levels are affected both by the quality and quantity of the population's formal education and by participation in informal learning activities.

The most recent international adult literacy data (1996) demonstrate that the U.S. appears most similar to New Zealand and the United Kingdom in the overall distribution of literacy skills (Figure K). These three countries had close to 20 percent of their adult population at both the high and low ends of the literacy scale (Level 1 and Levels 4 and 5). In contrast, the performance of our European counterparts was concentrated in the middle literacy levels, with at least two-thirds of the adult population in the Netherlands, Switzerland (both French- and German-speaking), and Germany at Literacy Levels 2 or 3. While Sweden tended to have the greatest concentration at the higher end of the scale, Poland's adults were concentrated at the lower end.

In the United States, as you might expect, workers with higher adult literacy scores are unemployed less and earn more than workers with lower literacy scores. Unemployment rates are especially high for workers in the two lowest levels of literacy (levels 1 and 2) on each of the three literacy scales. For these workers, the unemployment rate ranges from 12 percent for workers with level 2 quantitative literacy to nearly 20 percent for those with level 1. Unemployment rates for individuals in the two highest literacy levels (levels 4 and 5) are less than 6 percent. Workers with high literacy scores earn more than other workers do, on average (Figure L).
On the prose scale, for example, full-time workers in level 3 earn a mean weekly wage 50 percent higher than that of their counterparts in level 1. Those in level 5 earn a weekly wage 71 percent higher than the wage of those in level 3. Thus, academic skills do make a difference in both earnings and employability.

The dropout rates of 16- to 24-year-old Hispanics remained at levels substantially higher than the dropout rates experienced by their white and black peers (Figure M). And, in contrast to the decline among black and white 16- to 24-year-olds, the dropout rate for Hispanics has not changed significantly since 1982. In 1996, 29 percent of Hispanics were not enrolled in school and had not completed high school; however, this percentage includes young immigrants who came to the United States without high school credentials and never enrolled in a U.S. school. The dropout rate for Hispanic immigrants aged 16 to 24 was 44 percent, compared to the dropout rate for first-generation Hispanics born in the United States, which was 17 percent (Figure N).

Educational Aspirations and College Attendance

One of the most dramatic changes taking place since A Nation at Risk is that the hopes of high school seniors for the future increasingly include more education. In 1992, 69 percent of seniors said that they hoped to graduate from college, compared with 39 percent of 1982 seniors. Moreover, 33 percent said they hoped to earn a postgraduate degree, as compared with 18 percent in 1982. The proportion of minority students aspiring to postgraduate degrees was about the same as, or higher than, that for whites. Not surprisingly, these higher student aspirations have been accompanied by substantial increases in actual college attendance. The proportion of high school graduates going directly on to college rose from 51 percent in 1982 to 65 percent in 1996.
Coursetaking Patterns in High School

One of the important elements in the recommendations in A Nation at Risk was to increase the academic course load of high school students. Since the release of that report, most states have raised course requirements for high school graduation and most states have mandated student-testing standards. As a result, both college-bound and non-college-bound students now take more academic courses than their counterparts did a decade before. In 1982, the average high school graduate completed 2.6 Carnegie units in mathematics and 2.2 units in science. By 1994, the average number of Carnegie units completed had risen to 3.4 in mathematics and 3.0 units in science. Foreign language units rose from 1.0 to 1.8, and coursework in English and social studies also increased. The increase in the average units completed means that more students are now taking advanced mathematics courses, such as calculus, which was completed by 9 percent of the 1994 graduates compared to 5 percent of the 1982 graduates. Similarly, the proportion of graduates completing a physics course rose from 14 percent in 1982 to 24 percent in 1994 (Figure O).

A Nation at Risk recommended that high school students complete a "New Basics" curriculum that included a minimum number of courses in the core academic areas of English (4), Mathematics (3), Science (3), and Social Studies (3). Since the release of these "New Basics" recommendations, high school graduates have taken more courses overall, particularly academic courses. The proportion of students completing the "New Basics" core curriculum in English, mathematics, science, and social studies has increased, and greater percentages are taking Advanced Placement (AP) courses. In 1982, 14 percent of high school graduates earned the credits recommended in A Nation at Risk; by 1994, 50 percent had done so.
The percentage of graduates who have completed the more extensive recommendations for college-bound students, which include the "New Basics" plus 2 years of foreign language instruction and a half-year of computer science, rose from 2 percent in 1982 to 25 percent in 1994.

Even though we cannot establish a cause-and-effect relationship, it is interesting to compare the average mathematics and science performance of 17-year-olds, as measured by our National Assessment of Educational Progress, with the increase in coursetaking. The mathematics performance of 17-year-olds rose by 7 points between 1982 and 1994, which roughly equates to 2/3 of the typical grade-to-grade progress. This increase compares closely to the rise of 0.8 average units of mathematics completed by high school graduates. The science performance of 17-year-olds rose by 11 points between 1982 and 1994, compared to an average increase of 0.9 science units completed by high school graduates.
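
The comparison above is simple arithmetic, and a minimal sketch of it follows, using only the figures quoted in this section. The variable names are illustrative, and the ratios are purely descriptive: as noted, they do not establish cause and effect.

```python
# Score changes (NAEP points, age 17) and coursework changes (Carnegie units),
# 1982 to 1994, as quoted in the text above.
changes = {
    "mathematics": {"points": 7, "units": 0.8},
    "science": {"points": 11, "units": 0.9},
}

# Descriptive ratio only: score points gained per additional unit completed.
points_per_unit = {
    subject: change["points"] / change["units"]
    for subject, change in changes.items()
}

# If 7 points is about 2/3 of typical grade-to-grade progress,
# one full grade step is implied to be 7 / (2/3) points.
implied_grade_step = 7 / (2 / 3)

for subject, ratio in points_per_unit.items():
    print(f"{subject}: {ratio:.1f} points per added unit")
print(f"implied grade-to-grade progress: {implied_grade_step:.1f} points")
```

On these figures, mathematics scores rose by a little under 9 points per added unit and science scores by about 12, while the 2/3-of-a-grade equivalence implies a typical grade-to-grade step of about 10.5 points in mathematics.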