Mathematics Coursetaking and Achievement at the End of High School:
NCES 2008-319
January 2008

2.1 Mathematics Achievement Assessments

Mathematics assessments were administered to students in their schools during the base-year (BY) and first follow-up (F1) survey administrations. There were multiple forms of the test. In the BY, form assignment was based on a routing test; in the F1, it was based on the BY ability estimate. These tests, designed and scored using Item Response Theory (IRT), serve as "bookends" to learning that took place during the 2002–03 and 2003–04 academic years, that is, from approximately the end of sophomore year to approximately the end of senior year for on-time students.8 The BY assessment can be thought of as a pretest, or baseline, for the academic experiences that take place during the second half of high school, while the F1 assessment can be thought of as a posttest. IRT uses patterns of correct, incorrect, and omitted answers to obtain achievement estimates that are comparable across different test forms within a domain.9 In estimating a student's achievement, IRT also accounts for each test question's difficulty, discriminating ability, and a guessing factor. This analysis uses two measures of mathematics achievement based on students' performance on these tests: IRT-estimated number-right scores and proficiency probability scores.
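The three item characteristics named above (difficulty, discriminating ability, and a guessing factor) are the parameters of what is commonly called the three-parameter logistic (3PL) IRT model. This section does not give a formula, so the expression below is a standard textbook form offered only as an illustration; the symbols and the scaling constant D are assumptions, not the report's notation.

```latex
P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp\left[-D\,a_i\,(\theta - b_i)\right]}
```

Here \theta is the student's achievement, a_i is the discriminating ability of item i, b_i is its difficulty, c_i is its guessing factor (the probability that a very low-achieving student still answers correctly), and D is a scaling constant that is conventionally set to 1 or 1.7.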

The IRT-estimated number-right score is an overall measure of mathematical knowledge and skill. As used in this analysis, it is an IRT-based estimate of the number of items an examinee would have answered correctly if he or she had taken all of the items in the item pool on the multiform assessment administered to 10th-graders in ELS:2002's predecessor study, the National Education Longitudinal Study of 1988 (NELS:88). Because the scales were linked using common-item calibration techniques, results from NELS:88 and ELS:2002 are comparable.10 There were 81 items in the vertically scaled 10th- to 12th-grade ELS:2002 item pool. For the analytic sample used in this study, students answered an average of 47 questions correctly on the 10th-grade assessment and 51 questions correctly on the 12th-grade assessment.
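One way to picture an IRT-estimated number-right score is as the sum, over every item in the pool, of the probability that a student at a given achievement level answers that item correctly. The sketch below assumes a three-parameter logistic form for that probability; the function names, the scaling constant `D`, and the item parameters are all hypothetical and are not taken from the ELS:2002 or NELS:88 calibrations. Only the pool size of 81 comes from the text.

```python
import math


def p_correct(theta, a, b, c, D=1.7):
    """Probability of a correct answer under an assumed 3PL model.

    theta: student achievement; a: discrimination; b: difficulty;
    c: guessing factor; D: scaling constant (an assumption).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def estimated_number_right(theta, item_pool):
    """Expected number of items answered correctly across the whole pool."""
    return sum(p_correct(theta, a, b, c) for (a, b, c) in item_pool)


# Hypothetical pool of 81 items: equal discrimination and guessing,
# with difficulties spread evenly from -2 to +2.
item_pool = [(1.0, -2.0 + 4.0 * i / 80, 0.2) for i in range(81)]

score_low = estimated_number_right(-1.0, item_pool)   # lower-achieving student
score_high = estimated_number_right(1.0, item_pool)   # higher-achieving student
print(round(score_low, 1), round(score_high, 1))
```

Because the score is a sum of probabilities, it is generally not a whole number, and it can be computed for a student who saw only some of the items, which is what makes scores from different test forms comparable.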

A proficiency probability score is a criterion-referenced score measuring how well an examinee performs relative to some set criterion representing mastery of knowledge and skills assessed. There are five distinct scores corresponding to five hierarchical levels (level 1 through level 5). Mastery of a higher level typically implies proficiency at lower levels. In contrast to the IRT-estimated number-right scores, which indicate overall achievement, the proficiency probability scores indicate what knowledge and skills the student does or does not possess. The five ordinal levels of mathematics proficiency include:

  1. simple arithmetical operations on whole numbers, such as simple arithmetic expressions involving multiplication or division of integers;
  2. simple operations with decimals, fractions, powers, and roots, such as comparing expressions, given information about exponents;
  3. simple problem solving, requiring the understanding of low-level mathematical concepts, such as simplifying an algebraic expression or comparing the length of line segments illustrated in a diagram;
  4. understanding of intermediate-level mathematical concepts and/or multistep solutions to word problems such as drawing an inference based on an algebraic expression or inequality; and
  5. complex multistep word problems and/or advanced mathematics material such as a two-step problem requiring evaluation of functions.

The proficiency probability score at each level ranges from 0 to 1 and indicates the likelihood that a student has mastered the skills and knowledge described above (0 = no mastery, 1 = complete mastery). The mean of a proficiency probability score aggregated over a subgroup of students is analogous to an estimate of the percentage of students in the subgroup who have displayed mastery of the particular skill.11 For example, in this study, the analytic sample has a mean score of .73 for level 2 in the 10th grade. This can be interpreted as "73 percent of 10th-graders have mastered the skills and concepts of level 2." The proficiency probabilities were computed using IRT-estimated item parameters originally calibrated in NELS:88. Appendix A provides more detailed information about the assessment framework, the distribution of the item pool across its elements, and the scaling techniques for the different scores. For the purposes of presentation and discussion, throughout this report, level 1 is considered basic skills, levels 2 and 3 are considered intermediate skills, and levels 4 and 5 are considered advanced skills.
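The aggregation described above can be sketched in a few lines. The four students and their scores below are invented for illustration, chosen so that the level 2 mean matches the .73 quoted in the text; the function name and data layout are assumptions, and the sketch omits any survey weighting that an actual analysis might apply.

```python
# Skill grouping used in this report for the five proficiency levels.
SKILL_GROUP = {1: "basic", 2: "intermediate", 3: "intermediate",
               4: "advanced", 5: "advanced"}


def mean_proficiency(students, level):
    """Mean proficiency probability at one level (1-5) across students."""
    values = [scores[level - 1] for scores in students]
    return sum(values) / len(values)


# Hypothetical students: each tuple holds the probabilities for levels 1..5.
students = [
    (0.99, 0.95, 0.80, 0.40, 0.05),
    (0.98, 0.90, 0.55, 0.10, 0.01),
    (0.90, 0.50, 0.10, 0.02, 0.00),
    (0.85, 0.57, 0.15, 0.02, 0.00),
]

level2_mean = mean_proficiency(students, 2)
print(f"Level 2 ({SKILL_GROUP[2]}): about {level2_mean:.0%} have mastered it")
```

Note that no individual student is classified as a master or non-master here; the percentage interpretation applies only to the subgroup mean.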


8 Less than 1 percent of students included in the analysis were not in the 12th grade at the time of the F1 survey administration (n = 60), likely due to grade retention. As this time span captures the academic experiences in the junior and senior years for almost the entire sample (99 percent), the phrases "junior and senior year of high school," "latter half of high school," and "2002–03 and 2003–04 academic years" will be used interchangeably in this report.
9 For an account of IRT, see Embretson and Reise (2000) or Hambleton, Swaminathan, and Rogers (1991).
10 Development of the 1992 NELS:88 mathematics scale is documented in Rock and Pollack (1995b). The linkage of the NELS:88 scale to ELS:2002 through IRT methods is documented in Ingels et al. (2005, p. 39).
11 Although probabilities of proficiency have been placed on a 0–1 scale, when aggregated they can be interpreted as a proportion. On the interpretation of a probability as a proportion, see Fleiss, Levin, and Paik (2003, p. 1).