Technical Notes: A.7 Cutpoint scores and achievement levels
The IEA has developed international benchmarks for achievement on TIMSS and PIRLS. Each benchmark is tied to a threshold, or "cutpoint," score and describes what students who have reached that score know and can do in the subject assessed. For example, 4th-grade students who have reached the TIMSS Intermediate benchmark in mathematics (scored 475 or better)
demonstrate an understanding of whole numbers. They can extend simple numeric and geometric patterns. They are familiar with a range of two-dimensional shapes. They can read and interpret different representations of the same data. (Gonzales et al. 2008, p. 13)
The IEA describes student achievement in this manner at four points on its assessment scales: Advanced International Benchmark (cutpoint score of 625), High International Benchmark (550), Intermediate International Benchmark (475), and Low International Benchmark (400). With these four equally spaced benchmarks serving as touchstones for reference, it is possible to interpret what the scores on the PIRLS and TIMSS achievement scales mean more concretely (i.e., understand what knowledge and skills may be demonstrated with a scale score of 513 versus 426).
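As an illustration of how the four cutpoint scores partition the scale, the following minimal sketch maps a scale score to the highest benchmark it reaches. The function name and the "Below Low" label are hypothetical and are used here only for illustration; the cutpoint scores are those listed above.

```python
from bisect import bisect_right

# Cutpoint scores quoted in the text, in ascending order.
CUTPOINTS = [400, 475, 550, 625]
# "Below Low" is a label invented for this sketch, not an IEA benchmark name.
LABELS = ["Below Low", "Low", "Intermediate", "High", "Advanced"]


def benchmark_reached(scale_score: float) -> str:
    """Return the highest international benchmark a scale score reaches."""
    # bisect_right counts the cutpoints that are <= scale_score, so a score
    # exactly equal to a cutpoint counts as having reached that benchmark.
    return LABELS[bisect_right(CUTPOINTS, scale_score)]
```

With the two scores mentioned above, a score of 513 falls at the Intermediate benchmark and a score of 426 at the Low benchmark.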
To describe student performance at the selected points or benchmarks along the TIMSS and PIRLS achievement scales, the IEA uses scale anchoring. Scale anchoring involves selecting a cutpoint score that will "anchor" a benchmark and then identifying items that students scoring within plus or minus 5 scale score points of these anchor points are likely to answer correctly. (The range of plus and minus 5 points around a benchmark's anchor point is intended to provide a sample that is adequate to analyze the items defining student performance at each benchmark, yet one that is small enough so that performance at each benchmark anchor point is clearly distinguishable from the next.) Subsequently, these items are grouped by content area within benchmarks and reviewed by subject matter experts. These experts focus on the content of each item and describe the kind of knowledge demonstrated by students answering the item correctly. The experts then provide a summary description of performance at each anchor point leading to a content-referenced interpretation of the achievement results. (Detailed information on the creation of the benchmarks is provided in Mullis, Martin, and Foy 2008a and 2008b and Martin et al. 2007.)
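The first steps of scale anchoring can be sketched roughly as follows. This is a simplified illustration, not the IEA's procedure: the data layout, the function name, and the 0.65 threshold for "likely to answer correctly" are all assumptions made for this sketch, and the published anchoring criteria are more detailed (see the reports cited above). Only the window of plus or minus 5 scale score points comes from the text.

```python
def anchor_items(students, anchor_point, window=5.0, min_share_correct=0.65):
    """List items that students near an anchor point usually answer correctly.

    students: list of (scale_score, {item_id: answered_correctly}) pairs.
              This layout is hypothetical.
    window: half-width of the score range around the anchor point (from the text).
    min_share_correct: illustrative threshold, assumed for this sketch.
    """
    # Keep only the responses of students within +/- window of the anchor point.
    near = [resp for score, resp in students if abs(score - anchor_point) <= window]

    # Tally (number correct, number attempted) for each item.
    totals = {}
    for resp in near:
        for item, correct in resp.items():
            right, seen = totals.get(item, (0, 0))
            totals[item] = (right + int(correct), seen + 1)

    # An item is retained if enough of the nearby students answered it correctly.
    return sorted(
        item for item, (right, seen) in totals.items()
        if right / seen >= min_share_correct
    )
```

The items returned for each anchor point would then be grouped by content area and passed to subject matter experts for review, as described above.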
Levels of proficiency
The OECD has identified levels of proficiency for each of the subject areas of PISA to describe concretely what particular ranges of scores mean. Unlike benchmarks, which are anchored by scale scores, levels of proficiency are anchored by items, which reflect particular proficiencies. Specifically, the knowledge and skills that students are asked to demonstrate in the assessment are classified into one of five or six levels, and the items associated with the knowledge and skills at each level become the basis both for classifying students into one of these levels of proficiency and for determining the cutpoint scores for each level.
In PISA, all students within a level are expected to answer at least half of the items from that level correctly. Students at the bottom of a level are able to provide the correct answers to about 52 percent of all items from that level, have a 62 percent chance of success on the easiest items from that level, and have a 42 percent chance of success on the hardest items from that level. Students in the middle of a level have a 62 percent chance of correctly answering items of average difficulty for that level (an overall response probability of 62 percent). Students at the top of a level are able to provide the correct answers to about 70 percent of all items from that level, have a 78 percent chance of success on the easiest items from that level, and have a 62 percent chance of success on the hardest items from that level. Students just below the top of a level would score less than 50 percent on an assessment at the next higher level.
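The percentages above can be related to one another with a small worked example. The sketch below assumes a simple one-parameter logistic (Rasch) response model, a level that spans 0.8 logits, and items spread evenly across the level; none of these assumptions is stated in the text above, and the names are hypothetical. Only the response probability of 62 percent is taken from the text.

```python
import math

RP = 0.62          # response probability quoted in the text
LEVEL_WIDTH = 0.8  # assumed width of one level in logits (not from the text)


def p_correct(ability: float, difficulty: float) -> float:
    """Rasch probability that a student answers an item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Distance (in logits) above an item's difficulty at which a student has
# an RP chance of answering it correctly: ln(RP / (1 - RP)).
RP_OFFSET = math.log(RP / (1.0 - RP))

# Place the level's easiest item at 0 logits and its hardest at LEVEL_WIDTH.
EASIEST, HARDEST = 0.0, LEVEL_WIDTH
BOTTOM = EASIEST + RP_OFFSET  # student at the bottom of the level
TOP = HARDEST + RP_OFFSET     # student at the top of the level


def mean_p(ability: float, n_items: int = 81) -> float:
    """Average success chance over items spread evenly across the level."""
    step = (HARDEST - EASIEST) / (n_items - 1)
    return sum(p_correct(ability, EASIEST + i * step) for i in range(n_items)) / n_items
```

Under these assumptions, `p_correct(BOTTOM, HARDEST)` is about 0.42, `mean_p(BOTTOM)` about 0.52, `p_correct(TOP, EASIEST)` about 0.78, and `mean_p(TOP)` about 0.71, which is close to the figures quoted above.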
Students at a particular level demonstrate not only the knowledge and skills associated with that level but also the proficiencies classified at lower levels. Thus, all students proficient at level 3 are also proficient at levels 1 and 2. Patterns of responses for students below level 1 suggest that these students are unable to answer at least half of the items from level 1 correctly.
Given that items are the basis for classifying students into the levels of proficiency, the cutpoint scores for particular levels vary from assessment to assessment. For more details about the PISA levels of proficiency, see the PISA 2006 Technical Report (OECD 2008).