The U.S. TIMSS fourth- and eighth-grade national sample design
In the United States and most other participating education systems, the target populations of students corresponded to the fourth and eighth grades. In sampling these populations, TIMSS used a two-stage stratified cluster sampling design (as explained in International Requirements for Sampling, Data Collection and Response Rates).1 The U.S. sampling frame was explicitly stratified by three categorical stratification variables:
The U.S. sampling frame was implicitly stratified (that is, sorted for sampling) by two categorical stratification variables:
For the first stage of drawing the samples, a systematic probability-proportional-to-size (PPS) technique was used to select schools for the original sample from a sampling frame based on the 2015 National Assessment of Educational Progress (NAEP) school sampling frame.4 Data for public schools in the sampling frame came from the Common Core of Data (CCD), and data for private schools came from the Private School Universe Survey (PSS). Note that the overlap with the NAEP school samples was not minimized when the TIMSS fourth- and eighth-grade samples were drawn, because TIMSS scheduling constraints required the TIMSS samples to be selected before the NAEP samples. Instead, the overlap between the samples was minimized when the 2015 NAEP samples were selected. In addition to the original schools selected, two schools adjacent to each original school in the sampling frame were designated as substitute schools. The first school following the original sample school was the first substitute, and the first school preceding it was the second substitute.
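The first-stage selection described above can be illustrated with a minimal Python sketch. It is not the production TIMSS procedure: the frame, the measure of size, the function names `systematic_pps` and `assign_substitutes`, and the seed are all hypothetical, and the sketch assumes that no school's size exceeds the sampling interval (very large schools would need separate handling). It also does not resolve the case in which one school is adjacent to two sampled schools.

```python
import random

def systematic_pps(frame, n_schools, seed=0):
    """Return the frame positions of n_schools drawn by systematic PPS.

    frame: list of (school_id, measure_of_size), assumed to be already
    sorted by the implicit stratification variables.
    """
    sizes = [size for _, size in frame]
    interval = sum(sizes) / n_schools          # sampling interval
    rng = random.Random(seed)
    start = rng.random() * interval            # random start in [0, interval)
    targets = [start + k * interval for k in range(n_schools)]

    selected = []
    cumulative = 0.0                           # total size of schools before idx
    idx = 0
    for t in targets:
        # Walk down the frame until the target falls inside school idx.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= t:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

def assign_substitutes(frame, selected):
    """Map each sampled school to (first substitute, second substitute).

    First substitute: the school that follows it in the frame.
    Second substitute: the school that precedes it in the frame.
    A neighbor that is itself sampled, or that falls off the end of
    the frame, is reported as None (a simplifying assumption).
    """
    taken = set(selected)
    result = {}
    for i in selected:
        nxt, prev = i + 1, i - 1
        first = frame[nxt][0] if nxt < len(frame) and nxt not in taken else None
        second = frame[prev][0] if prev >= 0 and prev not in taken else None
        result[frame[i][0]] = (first, second)
    return result

# Hypothetical frame of 12 schools with made-up enrollments.
frame = [("S01", 60), ("S02", 80), ("S03", 45), ("S04", 120),
         ("S05", 90), ("S06", 30), ("S07", 75), ("S08", 110),
         ("S09", 50), ("S10", 95), ("S11", 65), ("S12", 80)]
selected = systematic_pps(frame, n_schools=3, seed=2015)
substitutes = assign_substitutes(frame, selected)
for school, (first, second) in substitutes.items():
    print(school, "first substitute:", first, "second substitute:", second)
```

Because the frame is sorted before selection, the two neighbors of a sampled school tend to resemble it on the implicit stratification variables, which is what makes adjacent schools reasonable substitutes.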
For the second stage of drawing the sample, intact mathematics classes within each participating school were selected. Schools provided lists of fourth- and eighth-grade classrooms. At both grade levels, classrooms with fewer than 15 students were collapsed into pseudo-classrooms prior to classroom sampling, so that each classroom in the school's classroom sampling frame had at least 20 students.5 An equal probability sample of two classrooms6 was identified from the classroom frame for the school. In schools where there was only one classroom, this classroom was selected with certainty. All students in sampled classrooms and pseudo-classrooms were selected for the assessment.
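As an illustration of the second stage, the following Python sketch builds a classroom sampling frame and draws two units with equal probability. The text does not state the exact collapsing rule, so the greedy merge used here is an assumption, as are the function names `build_classroom_frame` and `sample_classrooms` and the example class sizes. Classes with 15 to 19 students are left as they are, because the text does not say how they were treated.

```python
import random

SMALL_CLASS_CUTOFF = 15   # from the text: classes below this size are collapsed
MIN_UNIT_SIZE = 20        # from the text: target minimum size of a frame unit

def build_classroom_frame(class_sizes):
    """Return sampling units; each unit is a list of class names.

    Assumed rule (not from the source): small classes are merged with
    one another, smallest first, until a unit reaches MIN_UNIT_SIZE.
    Any leftover small classes join the smallest existing unit.
    """
    def unit_size(unit):
        return sum(class_sizes[c] for c in unit)

    units = [[c] for c in class_sizes if class_sizes[c] >= SMALL_CLASS_CUTOFF]
    small = sorted((c for c in class_sizes if class_sizes[c] < SMALL_CLASS_CUTOFF),
                   key=class_sizes.get)
    current = []
    for name in small:
        current.append(name)
        if unit_size(current) >= MIN_UNIT_SIZE:
            units.append(current)
            current = []
    if current:                      # leftover small classes
        if units:
            min(units, key=unit_size).extend(current)
        else:
            units.append(current)    # school with only tiny classes
    return units

def sample_classrooms(units, rng, n=2):
    """Equal-probability sample of n units; take all if n or fewer exist."""
    if len(units) <= n:
        return list(units)
    return rng.sample(units, n)

# Hypothetical school with six mathematics classes.
sizes = {"A": 24, "B": 22, "C": 9, "D": 8, "E": 12, "F": 26}
units = build_classroom_frame(sizes)
sampled = sample_classrooms(units, random.Random(7))
# Every student in a sampled unit is selected for the assessment.
students_selected = sum(sizes[c] for unit in sampled for c in unit)
print(units)
print(sampled, students_selected)
```

After sampling, a pseudo-classroom would be split back into its original classes so that student, teacher, and classroom data stay linked to the real class.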
In this way, the overall sample design for the United States results in an approximately self-weighting sample of students, with each fourth- or eighth-grade student having a roughly equal probability of selection. Note that in large schools, a smaller proportion of the classes (and therefore of the students) is selected, but this lower rate of selecting students in large schools is offset by a larger probability of selection of large schools, as schools are selected with probability proportional to size.
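The offsetting effect can be checked with a small arithmetic sketch in Python. All figures are hypothetical (150 sampled schools, a frame total of 100,000 students, classes of 25), and the function name `student_selection_probability` is introduced only for illustration. The sketch assumes equal class sizes within a school and ignores stratification and nonresponse adjustments.

```python
def student_selection_probability(school_enrollment, total_enrollment,
                                  n_schools, class_size, classes_sampled=2):
    """Overall selection probability for a student under the two-stage design."""
    # Stage 1: the school is selected with probability proportional to size.
    p_school = n_schools * school_enrollment / total_enrollment
    # Stage 2: an equal-probability sample of classes within the school.
    n_classes = school_enrollment / class_size
    p_class = min(1.0, classes_sampled / n_classes)
    return p_school * p_class

# Hypothetical small school (4 classes) and large school (16 classes).
p_small = student_selection_probability(100, 100_000, 150, 25)
p_large = student_selection_probability(400, 100_000, 150, 25)
print(p_small, p_large)
```

In this simplified setting the school enrollment cancels out of the product, so both schools give the same student-level probability even though the large school is four times as likely to be sampled.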
The U.S. TIMSS Florida state sample design
In 2015, a TIMSS public school state sample was drawn for Florida at both fourth and eighth grade. The 2015 TIMSS state samples replicated the national sample design whenever possible. The school frames used to draw the Florida state samples were identical to the national frames of public schools in Florida. The objective for the Florida state samples was that they would not include the schools that were previously selected as part of the TIMSS national sample.
The fourth- and eighth-grade school samples selected for Florida followed the normal TIMSS procedure of selecting two classes per school. The target number of schools needed in each grade was 50 plus an additional five schools to account for ineligible schools (schools with no students in the target grade).
The U.S. TIMSS Advanced national sample design
In the United States and most other participating education systems, the target population corresponded to advanced students in the final year of secondary school who had taken or were taking advanced mathematics or physics.
Defining this target population in the United States is more challenging than in most countries because the United States does not have a common curriculum for the entire country and students do not follow a uniform course-taking sequence. Many different courses may meet the criteria for having studied advanced topics as described in the TIMSS Advanced framework. Courses with similar titles may vary in curriculum content and rigor across schools within the nation. Moreover, students may have taken these courses in high school, at local colleges or universities, or online, and they may have taken the courses prior to their senior year of high school. Complicating things further, the school lists that form the U.S. sampling frames are based on NCES data from the Common Core of Data (CCD) (which lists all public schools) and the Private School Universe Survey (PSS) (which lists private schools), and neither of these datasets identifies the courses offered in schools.
For the purpose of this study, student eligibility for the advanced mathematics assessment was defined as having taken or currently taking a calculus course, and for the physics assessment as having taken or currently taking an advanced physics course similar to AP physics. To confirm eligibility of a school to participate in the study, the advanced courses taken by students at sampled schools were listed by each school and confirmed as applicable courses for the study by consulting with school staff and by checking the school's course catalog.
To sample the advanced student populations for TIMSS Advanced 2015, TIMSS used a two-stage stratified cluster sampling design (as explained in International Requirements for Sampling, Data Collection and Response Rates). The U.S. TIMSS Advanced 2015 sampling frame was explicitly stratified by three categorical stratification variables:
The U.S. sampling frame was implicitly stratified (that is, sorted for sampling) by two categorical stratification variables:
As is done for TIMSS at fourth and eighth grade, for the first stage of drawing the TIMSS Advanced sample, a systematic probability-proportional-to-size (PPS) technique was used to select schools for the original sample from a sampling frame based on the 2015 NAEP school sampling frame. To supplement the twelfth-grade frame data, NCES worked with the College Board, which provided data identifying schools that offered AP courses in 2013. With these data, the frame could include information on schools with students who took AP exams. The overlap with the 2015 NAEP school sample was minimized when the NAEP grade 12 sample was selected, which happened prior to the selection of the TIMSS Advanced sample. Besides the original schools selected, two schools adjacent to each original school in the sampling frame were designated as substitute schools. The first school following the original sample school was the first substitute and the first school preceding it was the second substitute.
The second stage consisted of selecting students within each participating school. Schools provided lists of all eligible advanced mathematics and physics students. The student sampling was designed to meet the international guidelines described earlier. Since actual counts of advanced calculus and physics students were not available, estimates of eligible student counts were computed with the available information. In non-AP schools, the percentages of graduates who earned credit in calculus and/or physics from the 2009 High School Transcript Study (HSTS) were applied to the grade 12 enrollment from the frame. In AP schools, the frequencies of students taking AP exams in calculus, physics, and both calculus and physics were inflated based on the HSTS percentages. This was done within each school by inflating the AP counts in calculus, physics, and both calculus and physics by the comparable ratio of total percentage of advanced to AP students. Within schools, students were sampled using the following algorithm:
Comparing TIMSS Advanced samples from 2015 and 1995
The target population for TIMSS Advanced 2015 differed from that of TIMSS Advanced 1995, in which students had to be designated by the school as a “specialist” in advanced mathematics, physics, or both. In 1995, an advanced mathematics “specialist” was a student who took a pre-calculus, calculus, or AP calculus course. A physics “specialist” was a student who took a first-year physics course (which did not include physical science courses, but may have included general physics courses). Students who met both criteria were considered “combined specialists” and were the only students eligible to take a combined advanced mathematics-physics assessment. Students who met neither set of criteria were labeled as “generalists” (or “non-specialists”) and could only take the TIMSS twelfth-grade mathematics and science literacy assessment, which was part of TIMSS in 1995.
Comparisons between the 1995 and 2015 TIMSS Advanced samples need to be conducted with comparable students based on their coursework; however, the manner in which the coursework was identified and substantiated differs between the two administrations. Thus, it is important to note that the subset of U.S. students in the 1995 TIMSS Advanced population that is comparable with the 2015 sample was identified based on the following criteria: (a) the school identifying the student as a “specialist” and (b) student self-reported course-taking information. As part of the 1995 TIMSS Advanced student background questionnaire, students were asked to report the highest level course taken in mathematics and physics. The questionnaire helped differentiate students who took calculus and advanced physics courses.
1 The primary purpose of stratification is to improve the precision of the survey estimates. If explicit stratification of the population is used, the units of interest (schools, for example) are sorted into mutually exclusive subgroups—strata. Units in the same stratum are as homogeneous as possible, and units in different strata are as heterogeneous as possible, with respect to the characteristics of interest to the survey. Separate samples are then selected from each stratum. In the case of implicit stratification, the units of interest are simply sorted with respect to one or more variables known to have a high correlation with the variable of interest. In this way, implicit stratification guarantees that the sample of units selected will be spread across the categories of the stratification variables.
2 The Census definitions of region were used. The Northeast region consists of Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont. The Midwest region consists of Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, Wisconsin, and South Dakota. The South region consists of Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, and West Virginia. The West region consists of Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, and Wyoming.
3 The NCES definitions of locale were used. The four urban-centric locale types are: (1) Central city, which consists of a large, midsize, or small territory inside an urbanized area and inside a principal city. (2) Suburb, which consists of a large, midsize, or small territory outside a principal city and inside an urbanized area. (3) Town, which consists of a fringe, distant, or remote territory inside an urban cluster. (4) Rural, which consists of a fringe census-defined rural territory.
4 In order to maximize response rates from both districts and schools, it was necessary to begin recruitment prior to the end of the 2013-14 school year. The 2015 NAEP sampling frame used 2012-13 school data.
5 Since classrooms are sampled with equal probability within schools, small classrooms would have the same probability of selection as large classrooms. Selecting classrooms under these conditions would likely reduce the student sample size and create some instability in the sampling weights. To avoid these problems, pseudo-classrooms are created for the purposes of classroom sampling, in which small classrooms are joined to reach a larger student count. These pseudo-classrooms are treated as single classes in the class sampling process. Following class sampling, the pseudo-classroom combinations are dissolved and the small classes involved retain their own identity. In this way, data on students, teachers, and classroom practices are linked in small classes in the same way as with larger classes.
6 The classrooms selected could be pseudo-classrooms.
7 AP status indicates whether or not the school has students who took any relevant College Board AP test in 2013.