Mark Schneider
Commissioner, National Center for Education Statistics

NCES Statement on PISA 2006
December 4, 2007

Today the National Center for Education Statistics is releasing results on the performance of students in the United States on an international study, the Program for International Student Assessment (PISA). PISA is a system of international assessments that measures 15-year-old students’ performance in reading literacy, mathematics literacy, and science literacy every 3 years. PISA, first implemented in 2000, is sponsored by the Organization for Economic Cooperation and Development (OECD), an intergovernmental organization of 30 member countries.

PISA uses the terminology of "literacy" in each subject area to denote its broad focus on the application of knowledge and skills. For example, PISA seeks to assess whether 15-year-olds are scientifically literate, or to what extent they can apply scientific knowledge and skills to a range of situations they may encounter in their lives. The target age of 15 allows jurisdictions to compare outcomes of learning as students near the end of compulsory schooling. PISA's goal is to answer the question "What knowledge and skills do students have at age 15?" taking into account schooling and other factors that may influence their performance. In this way, PISA's achievement scores represent a "yield" of learning at age 15 rather than a direct measure of attained curriculum knowledge at a particular grade level, because 15-year-olds in the United States and elsewhere come from several grade levels.

In the 2006 assessment, 57 education systems participated in PISA, including all 30 OECD countries and 27 non-OECD jurisdictions. These education systems, called jurisdictions in the report, were nearly all countries but also included subnational entities such as Hong Kong-China and Macao-China. The United States has participated in all three administrations of PISA (2000, 2003, and 2006).

Each PISA data collection cycle assesses one of the three subject areas in depth, although all three are assessed in every cycle so that participating jurisdictions have an ongoing source of achievement data in each subject area. In this third cycle, PISA 2006, science literacy was the subject area assessed in depth. In 2009, PISA will focus on reading literacy, which was also the subject assessed in depth in 2000.

The results presented here focus on the performance of U.S. students in the major subject area of science literacy as assessed in PISA 2006. Achievement in the minor subject area of mathematics literacy in 2006 and differences in achievement by selected student characteristics are also presented.

How PISA Was Conducted
PISA 2006 consisted of a 2-hour paper-and-pencil assessment of 15-year-old students and a 30-minute student background questionnaire. Like other large-scale assessments, PISA was not designed to provide individual student scores but rather national and group estimates of performance. PISA 2006 was administered between September and November 2006. The U.S. sample included both public and private schools, randomly selected and weighted to be representative of the nation. In total, 166 schools and 5,611 students participated in PISA 2006 in the United States. More information about how the assessment was developed and conducted is included in the technical notes of the U.S. report on PISA. An additional source of information will be the PISA 2006 technical report to be published by the OECD and scheduled for release in 2007.
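
As a simplified illustration of how national estimates are produced from a weighted sample, the sketch below treats a national average as a weighted mean in which each sampled student counts in proportion to the number of students he or she represents. The student scores and weights shown are hypothetical, and the operational PISA estimation involves additional steps described in the technical notes and the OECD technical report.

    # Simplified sketch: a national average as a weighted mean of sampled
    # students' scores. All values below are hypothetical illustrations;
    # operational PISA estimation involves additional steps described in
    # the technical documentation.

    def weighted_mean(scores, weights):
        """Each score counts in proportion to the number of students
        in the population that the sampled student represents."""
        return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

    # Three hypothetical sampled students and their sampling weights.
    scores = [512.0, 463.0, 498.0]
    weights = [650.0, 820.0, 540.0]

    print(round(weighted_mean(scores, weights), 1))  # 488.2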

U.S. Performance in Science Literacy

The PISA 2006 assessment measures student performance on a combined science literacy scale and on three science literacy subscales: identifying scientific issues, explaining phenomena scientifically, and using scientific evidence. PISA scores are reported on a scale that ranges from 0 to 1000 with a mean of 500 and a standard deviation of 100.
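
To make the reporting metric concrete, the sketch below expresses a scale score as a distance from the OECD mean in standard deviation units. This illustrates only how the scale is interpreted, not the scaling procedures used to produce the scores, and the example score is hypothetical.

    # Illustrative only: interpreting a PISA scale score relative to the
    # OECD mean of 500 and standard deviation of 100.

    OECD_MEAN = 500.0
    OECD_SD = 100.0

    def sd_units_from_oecd_mean(scale_score):
        """Distance of a scale score from the OECD mean, in SD units."""
        return (scale_score - OECD_MEAN) / OECD_SD

    # A hypothetical score of 560 lies about 0.6 standard deviations
    # above the OECD mean.
    print(round(sd_units_from_oecd_mean(560.0), 2))  # 0.6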

Performance on the Combined Science Literacy Scale
On the combined science literacy scale, U.S. 15-year-old students’ average score (489) was lower than the OECD average (500). The U.S. average score was lower than the average score in 16 of the other 29 OECD jurisdictions and 6 of the 27 non-OECD jurisdictions. The U.S. average score was higher than the average scores of students in 22 jurisdictions (5 OECD and 17 non-OECD).

Finland was the top-scoring country in 2006. Among the G-8 countries, four (Canada, Germany, Japan and the United Kingdom) scored higher than the United States.

When comparing the science performance of the highest-achieving students, there was no measurable difference between the score of U.S. students at the 90th percentile and the average 90th percentile score across OECD countries. In 12 jurisdictions (9 OECD and 3 non-OECD), students at the 90th percentile scored higher than their counterparts in the United States.

In 2003, the U.S. average score in science literacy was 491, lower than the OECD average as well as the average scores in 18 countries (including 15 OECD countries). Note that because of changes in the instruments for science between 2003 and 2006, the U.S. science scores between these two years are not comparable.

Proficiency Levels
In addition to scale scores, PISA 2006 uses six proficiency levels (levels 1 through 6, with level 6 the highest) to describe student performance in science literacy. An additional level (below level 1) includes students whose skills are not developed sufficiently to be described by PISA.

On the combined science literacy scale, the percentages of U.S. students performing at and below level 1 were greater than the corresponding OECD averages. The percentages of U.S. students performing at the highest levels, 5 and 6, were not measurably different from the OECD averages.

Performance of U.S. Students on the Science Literacy Subscales: Identifying Scientific Issues, Explaining Phenomena Scientifically, and Using Scientific Evidence
Although these are not the terms commonly used in schools, each of the sub-areas assessed in PISA is related to the typical courses taught in U.S. schools. The concepts are the familiar ones of physics, chemistry, the biological sciences, and Earth and space sciences. The processes center on the ability to acquire, interpret, and act on evidence; they include, for example, describing scientific phenomena and interpreting scientific evidence. The sample items in the report show examples of what each subscale measured. PISA uses these sub-areas to evaluate the application of knowledge and skills to problems in real-life contexts.

On two of the three subscales (explaining phenomena scientifically and using scientific evidence), U.S. 15-year-old students had lower scores than the OECD average in 2006. Among the other 56 jurisdictions, students in 19 OECD and 6 non-OECD jurisdictions scored higher, on average, than U.S. students on the explaining phenomena scientifically subscale. U.S. students also scored lower, on average, than students in 14 OECD and 6 non-OECD jurisdictions on the using scientific evidence subscale.

There was no measurable difference between the U.S. average score and the OECD average on the identifying scientific issues subscale. However, U.S. 15-year-old students scored lower, on average, than 18 jurisdictions (13 OECD and 5 non-OECD).

Differences in Performance by Selected Student Characteristics

Performance by Sex
While the OECD average was higher for males than females, U.S. male and female students’ scores did not differ measurably on the combined science literacy scale. In the United States, female students scored higher than male students on the identifying scientific issues subscale, while male students scored higher than female students on the explaining phenomena scientifically subscale. There was no measurable difference between the performance of U.S. male and female students on the using scientific evidence subscale.

Performance by Race/Ethnicity
Because racial and ethnic groups vary across countries, it is not possible to compare performance by race/ethnicity on international assessments. In the United States, students were asked whether they were of Hispanic origin and their race. Students who identified themselves as being of Hispanic origin were classified as Hispanic, regardless of race. Black students and Hispanic students scored lower, on average, on the combined science literacy scale than White students, Asian students, and students of more than one race. Hispanic students scored higher than Black students. White students scored higher than Asian students and were the only U.S. racial group to score higher than the OECD average.

U.S. Performance in Mathematics Literacy

The U.S. average score in mathematics literacy was lower than the OECD average in 2006. Thirty-one jurisdictions (23 OECD and 8 non-OECD) had a higher average score than the United States in mathematics literacy in 2006. In contrast, 20 jurisdictions (4 OECD and 16 non-OECD) scored lower than the United States.

The top-scoring jurisdictions in mathematics literacy were Chinese Taipei, Finland, Hong Kong-China, and the Republic of Korea. Among the G-8 countries, five (Canada, France, Germany, Japan and the United Kingdom) scored higher than the United States in mathematics literacy.

In 2003, the U.S. average score in mathematics literacy was 483, lower than the OECD average of 500. There was no measurable change from 2003 to 2006 in either the U.S. mathematics literacy score or the U.S. position relative to the OECD average.

Reading Literacy

The OECD determined that 2006 U.S. reading literacy results could not be reported because of an error in printing the U.S. test booklets. The printing error resulted in incorrect instructions to students in the reading sections, and OECD’s analysis found that the results were too severely compromised to be reported. OECD determined that the printing error did not have the same effect in the science and mathematics sections.

Comparisons Between PISA and NAEP

In September, the National Assessment of Educational Progress (NAEP) released results of its 2007 assessment of eighth-graders' achievement in mathematics. To better grasp what the two assessments, PISA and NAEP, can contribute to our understanding of U.S. students' performance in mathematics in the transitional years between middle and high school, it is important to examine their similarities and differences.

The results of recent studies comparing the two assessments in terms of the populations they study, levels of measurement precision, and what and how they measure mathematics or mathematics literacy are available at the NCES website (http://nces.ed.gov/Surveys/PISA/pdf/comppaper12082004.pdf, PDF, 211 KB). Some key points include:

  • The students being studied by PISA and NAEP represent different groups. PISA is intended to assess the mathematics literacy performance of 15-year-old students. Students were assessed by PISA in the fall of 2006, so most were in tenth grade, though they ranged from seventh to twelfth grade. NAEP is intended to assess the mathematics achievement of eighth-graders, and so includes students of various ages. The PISA assessment is designed to measure the performance of students generally two grades above those assessed in NAEP at the eighth grade.
  • The goals of the assessments have subtle but important distinctions with regard to the U.S. curricula. NAEP is tailored specifically to practices and standards operating in the United States. PISA's content is determined internationally in collaboration with other countries and reflects consensus views of key content. In addition, PISA's specific focus on the "yield" of the education system and the application of competencies in real-world contexts distinguishes it from NAEP, which is more closely aimed at measuring school-based curricular attainment.
  • PISA and NAEP are designed to measure student performance at different levels of precision. Both are designed to provide valid and reliable measures of U.S. students' performance in the aggregate and for major subpopulations, and each study draws a sample sufficient for this purpose. However, the sample size in each assessment is influenced by their somewhat distinct goals. PISA is designed to detect the relatively large differences in student performance across countries; it is therefore relatively less sensitive to smaller variations in student performance within the United States, as well as to smaller variations in performance over time. In contrast, NAEP is designed to detect smaller differences, since it seeks to be sensitive to small changes in student performance over time for the nation as a whole, for individual states, and for student subgroups. Moreover, the NAEP sample size has increased dramatically in order to produce state-representative data, because of a growing interest in state comparisons. (A simplified illustration of the relationship between sample size and precision follows this list.)
  • PISA and NAEP differ in the ways in which their assessment frameworks are organized and in terms of content coverage, item format, and other key features. PISA has a greater focus on data analysis, statistics, and probability and a lesser focus on algebra than does NAEP. These differences reflect in part PISA's assessment of an older group of students, as well as the national interest in algebra at the eighth grade found in NAEP. PISA has more items that place relatively higher demands on thinking and reasoning and on developing and communicating an argument; NAEP, on the other hand, has more items that require reproducing practiced material and performing routine operations. Finally, PISA places a greater emphasis than NAEP on open-ended or constructed-response item formats rather than multiple-choice formats.
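
As a simplified illustration of the precision point above, the standard error of an estimated mean shrinks roughly with the square root of the effective sample size, so a larger sample can detect smaller differences. The numbers below are assumptions chosen for illustration only; actual PISA and NAEP standard errors also reflect their complex sample designs.

    import math

    # Simplified sketch: how (effective) sample size relates to the
    # standard error of a mean. The values are illustrative assumptions;
    # actual PISA and NAEP standard errors reflect complex sample designs.

    def standard_error(sd, n_effective):
        return sd / math.sqrt(n_effective)

    SD = 100.0  # assumed standard deviation of the reporting scale

    for n in (1000, 5000, 100000):
        print(n, round(standard_error(SD, n), 2))
    # Larger samples yield smaller standard errors and thus allow
    # smaller differences to be detected.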

Conclusion

This PISA report is intended to be used by educators, policymakers, and interested members of the public. It is important to have the kind of performance data that PISA provides as an external perspective on the performance of our nation’s students.

The project director for this report was Holly Xie of NCES. She was assisted in her efforts by valuable staff from the American Institutes for Research (Stéphane Baldi, Ying Jin, and Melanie Skemer) and from RTI International (Patricia J. Green and Deborah Herget). Recognition should also be given to Val Plisko, Associate Commissioner, who was responsible for overall direction of the project; Daniel McGrath, director of the international activities program; and Marilyn Seastrom, NCES' Chief Statistician. Supporting their work were members of the Education Statistics Services Institute (Anindita Sen, David Miller, and Aparna Sundaram), of Child Trends (Lydia Malley and Siri Warkentein) and of MacroSys (Steve Hocker).

NCES also wishes to thank the schools and the students who participated in this study. Their participation has allowed us to provide the nation with this important international perspective on student performance.

For More Information

This statement covers some of the major findings from PISA 2006 from the U.S. perspective, but of course, it is not the whole story. Other findings are available in the OECD’s report on PISA 2006, and additional results will be published by OECD in a series of future thematic reports. The PISA 2006 data will also be publicly available after December 4 for independent analyses.

For more information on PISA, please visit the PISA website at http://nces.ed.gov/surveys/pisa.

For more information on NAEP, please visit the NAEP website at http://nces.ed.gov/nationsreportcard.

The U.S. PISA 2006 results are available at http://nces.ed.gov/surveys/pisa.

The 2007 NAEP eighth grade mathematics results are available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2007494

A comparison of PISA and NAEP studies is available at http://nces.ed.gov/Surveys/PISA/pdf/comppaper12082004.pdf (PDF, 211 KB).
