Program for International Student Assessment (PISA): 2015 Results

Scaling of Student Test Data

Each test form contained a different subset of items. Because each student completed only a subset of all possible items, classical test scores, such as the percentage correct, are not accurate measures of student performance: they depend on which items a student happened to receive. Instead, scaling techniques were used to establish a common scale for all students. For PISA 2015, item response theory (IRT) was used to estimate average scores for science, reading, and mathematics literacy for each education system, as well as for three science process and three science content subscales. For education systems that participated in the financial literacy assessment or the collaborative problem solving assessment, those assessments were scaled separately and assigned separate scores.

IRT identifies patterns of response and uses statistical models to predict the probability of answering an item correctly as a function of the students' proficiency in answering other questions. With this method, the performance of a sample of students in a subject area or subarea can be summarized on a simple scale or series of scales, even when students are administered different items.
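
As a rough illustration of this idea, the sketch below uses a two-parameter logistic (2PL) item response function, a common IRT model. The model choice, the function names (prob_correct, expected_percent_correct), and all item parameters are assumptions made for illustration; they are not taken from the PISA 2015 scaling procedure. The sketch shows that a student with a fixed proficiency is expected to get a different percentage correct on an easier form than on a harder one, while the proficiency value itself stays the same.

```python
import math


def prob_correct(theta, difficulty, discrimination=1.0):
    """2PL item response function (illustrative assumption).

    theta: student proficiency on the latent scale
    difficulty: item difficulty on the same scale
    discrimination: how sharply the item separates proficiency levels
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def expected_percent_correct(theta, difficulties):
    """Expected percentage correct on a form made of the given items."""
    probs = [prob_correct(theta, b) for b in difficulties]
    return 100.0 * sum(probs) / len(probs)


# Hypothetical forms: same student, different item subsets.
easy_form = [-1.0, -0.5, 0.0]
hard_form = [0.5, 1.0, 1.5]
theta = 0.0

easy = expected_percent_correct(theta, easy_form)
hard = expected_percent_correct(theta, hard_form)
print(f"easy form: {easy:.1f}%  hard form: {hard:.1f}%")
```

The percentage correct changes with the form, but theta does not, which is why a proficiency scale can be shared across students who saw different items.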

Scores for students were estimated as plausible values because each student completed only a subset of items. Ten plausible values were estimated for each student for each scale. These values represented the distribution of potential scores for all students in the population with similar characteristics and identical patterns of item response. Statistics describing performance on the PISA science, reading, and mathematics scales are based on plausible values. In PISA, the science, reading, and mathematics literacy scales range from 0 to 1,000.
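
One common way to base a statistic on plausible values is to compute it once per plausible value and then average the results. The sketch below assumes that approach; the function name pv_estimate and the data are hypothetical, and sampling weights, which a real analysis would also need, are omitted to keep the example short.

```python
import statistics


def pv_estimate(pv_columns, statistic=statistics.fmean):
    """Combine a statistic across plausible values (simplified sketch).

    pv_columns: one list of student scores per plausible value
                (ten lists for PISA 2015; fewer are used below).
    Returns the averaged estimate and the variance between the
    per-plausible-value estimates.
    """
    per_pv = [statistic(column) for column in pv_columns]
    point = statistics.fmean(per_pv)
    between_variance = statistics.variance(per_pv) if len(per_pv) > 1 else 0.0
    return point, between_variance


# Hypothetical data: two students, two plausible values each.
pv_columns = [
    [500.0, 520.0],  # first plausible value for each student
    [510.0, 530.0],  # second plausible value for each student
]
point, between_variance = pv_estimate(pv_columns)
```

The statistic is never computed on a per-student average of the plausible values; averaging happens only after the statistic has been computed separately for each plausible value.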