Frequently Asked Questions


International Studies
  • U.S. Participation
    • In what international education studies does the United States participate, and what do they measure?
      • The United States currently participates in the following international studies:

        PIRLS — Progress in International Reading Literacy Study
        PIRLS is an international comparative study of the reading literacy of young students. PIRLS collects data on the reading achievement, experiences, and attitudes of fourth-grade students in the United States and students in the equivalent of fourth grade in other participating countries, as well as information on students' classroom and school contexts. PIRLS is organized by the International Association for the Evaluation of Educational Achievement (IEA). PIRLS was first administered in 2001 and is administered every 5 years.

        PISA — Program for International Student Assessment
        PISA is an international comparative study of the reading literacy, mathematics literacy, and science literacy of 15-year-old students. It assesses how well students apply their knowledge and skills to problems set in real-life contexts. In addition to an assessment of student literacy, PISA collects information on students' experiences and attitudes, as well as school contexts. PISA is organized by the Organization for Economic Cooperation and Development (OECD), an intergovernmental organization of 34 member countries. Non-OECD-member countries participate as well. PISA was first administered in 2000 and is administered every 3 years.

        TIMSS — Trends in International Mathematics and Science Study
        TIMSS is an international comparative study of student performance in mathematics and science. TIMSS collects data on the achievement, experiences, and attitudes of students in the United States and students in other participating countries, as well as information on classroom and school contexts. TIMSS is organized by the International Association for the Evaluation of Educational Achievement (IEA). TIMSS data have been collected from students at grades 4 and 8 every 4 years, generally, since 1995. In addition, TIMSS data were collected twice from students at grade 12, in 1995 and 2008; these collections are referred to as TIMSS Advanced.

        PIAAC — Program for the International Assessment of Adult Competencies
        PIAAC is an international comparative study of adult literacy, including reading literacy, numeracy, problem-solving in a technology-rich environment, and component reading literacy skills, as well as the skills adults report using in their jobs. In addition to the assessment of adult literacy, PIAAC collects data on adults' educational and work experiences. PIAAC is organized by the Organization for Economic Cooperation and Development (OECD). Non-OECD-member countries participate as well. PIAAC was first administered in 2012 and is expected to be administered every 10 years.

        TALIS — Teaching and Learning International Survey
        TALIS is an international comparative study of teachers, teaching, and learning environments, with a particular focus on education workforce issues. TALIS is coordinated by the Organization for Economic Cooperation and Development (OECD). Non-OECD-member countries participate as well. TALIS was first administered in the United States in 2013 and is expected to be administered every 5 years.
    • Why does the United States participate in international education studies?
      • The United States participates in international studies primarily for two reasons:
        • To learn about the performance of U.S. students and adults in comparison to their peers in other countries.
        • To learn about the educational and work experiences of students and adults in other countries.
        Student assessments are a common feature of school systems that are concerned about accountability and ensuring students' progress throughout their educational careers. National or state assessments enable us to know how well students are doing in a variety of subjects and at different ages and grade levels compared to other students nationally or within their own state. International assessments, on the other hand, offer a unique opportunity to benchmark our students' performance against the performance of students in other countries. Similarly, international assessments of adult literacy enable us to compare U.S. adults with their international peers on literacy skills that support productive adult lives in the workplace and society.

        International assessments of students also enable countries to learn from each other about the variety of approaches to schooling and to identify promising practices and policies to consider in their schools. International assessments of adults enable research on the relationships between adults' work and educational experiences and their skill levels, both within countries and cross-nationally.

  • Development and Administration
    • How are test and survey questions developed for the international studies?
      • There are three main components in the development of test and survey questions:
        1. Test and survey questions for each study are first developed through a collaborative, international process.
          For each study, an international subject area expert group is convened by the organization conducting the study. This expert group drafts a framework (the outline of the topics and skills that should be assessed or surveyed in a particular domain), which reflects a multinational consensus on the assessment and survey of a subject area. Based on the framework, national representatives and subject matter specialists develop the test and survey questions. National representatives from each country then review every item to ensure that each adheres to the internationally agreed-upon framework. While not every item may be equally familiar to all study participants, if any item is considered inappropriate for a participating country or an identified subgroup within a country, that item is eliminated.
        2. Test and survey items are field-tested before they are used or administered in the full-scale study.
          Before the administration of the study, a field test is conducted in the participating countries. An expert panel convenes after the field test to review the results and look at the items to see if any results were biased due to national, social, or cultural differences. If such items exist, they are not included in the full study. Only after this thorough process, in which every participating country is involved, are the actual items administered to study participants.
        3. There is an extensive translation verification process.
          All participating countries are responsible for translating the assessment or survey into their own language or languages, unless the original items are in the language of the country. All countries identify translators to translate the source versions into their own language. External translation companies independently review each country's translations. Instruments are verified twice, once before the field test and again before the main data collection. Statistical analyses of the item data are then conducted to check for evidence of differences in performance across countries that could indicate a translation problem. If a translation problem with an item is discovered in the field test, it is removed for the full study. Since for TIMSS, PIRLS, PISA, PIAAC, and TALIS the items are provided to countries in English, the United States does not need to translate the assessments but does adapt the international English versions to U.S.-English when necessary and appropriate.
    • Who participates in the international studies?
      • A representative national sample of the target population in each participating country responds to each study. In the case of PIRLS, PISA, and TIMSS, the sample is drawn to be representative of students at the designated age or grade level. In the case of PIAAC, the sample is drawn to be representative of persons 16 to 65 years old living in households. In the case of TALIS, the sample is drawn to be representative of teachers.

        The international organization that conducts each study verifies that all participating countries select a nationally representative sample. To ensure comparability, target grades, ages, or populations are clearly defined. For example, TIMSS countries participating in the study at the eighth-grade level sample students in the grade that corresponds to the end of 8 years of formal schooling, provided that the mean age of the students at the time of testing is at least 13.5 years.

        Not all selected respondents choose to participate in the studies, and certain respondents, such as some with cognitive or physical disabilities, may not be able to participate. Thus, the sponsoring international organizations check each country's participation rates and exclusion rates to ensure that they meet established target rates in order for the country's results to be reported.
    • How can we be sure that countries administer the test or survey in the same way?
      • The short answer is that procedures for the administration of the international studies are standardized and independently verified.

        The international organizations that conduct international studies require compliance with standardized procedures. Manuals are provided to each country that specify the standardized procedures that all countries must follow on all aspects of sampling, preparation, administration, and scoring. To further ensure standardization, independent international quality control monitors visit a sample of schools (or households in the case of PIAAC) in each country. In addition, the countries themselves organize their own quality control monitors to visit an additional number of schools (or households in the case of PIAAC). Results for countries that fail to meet the international requirements are footnoted with explanations of the specific failures (e.g., "only met guidelines for sample participation rates after substitute schools were included"), are shown separately in the international reports (e.g., listed in a separate section at the bottom of a table), or are omitted from the international reports and datasets (as happened to the Netherlands' PISA results in 2000, the United Kingdom's PISA results in 2003, and Morocco's TIMSS 2007 results at grade 8).
    • Are respondents required to participate in these studies?
      • To our knowledge, no countries require all schools and students to participate in PIRLS, PISA, or TIMSS. However, some countries give more prominence to these studies than do others. In the United States, participation by respondents to international studies is voluntary.
  • Issues of Validity and Reliability
    • How different are assessment test questions from what students are expected to learn in the classroom?
      • The answer varies from study to study. Some studies, like TIMSS, are curriculum-based and are designed to assess what students have been taught in school using multiple-choice and open-ended (or short answer) test questions. Other studies, like PISA and PIAAC, are "literacy" assessments, designed to measure performance in certain skill areas at a broader level than the school curriculum.
    • How do international studies deal with the fact that education systems around the world are so different?
      • The fact that education systems are different across countries is one of the main reasons we are interested in making cross-country comparisons. However, these differences make it essential to carefully define the target populations to be compared, so that comparisons are as fair and valid as possible. For studies focusing on students, depending in large part on when students first start school, students at a given age may have more or less schooling in different countries, and students in a given grade may be of different ages in different countries. In every case, detailed information on the comparability of the sampled populations is published for review and consideration.

        For PIRLS, the target population represents students in the grade that corresponds to 4 years of formal schooling, counting from the first year of schooling as defined by the International Standard Classification of Education (ISCED), Level 1. This corresponds to fourth grade in most countries, including the United States. This population represents an important stage in reading development.

        In TIMSS, the two target populations are defined as follows: (1) all students enrolled in the grade that corresponds to 4 years of formal schooling—fourth grade in most countries—provided that the mean age at the time of testing is at least 9.5 years, and (2) all students enrolled in the grade that corresponds to 8 years of formal schooling—eighth grade in most countries—provided that the mean age at the time of testing is at least 13.5 years. For example, at grade 4 in 2007, only England, Scotland, and New Zealand included students who had 5 years of formal schooling at the time of testing. At grade 8, England, Malta, Scotland, and Bosnia and Herzegovina included students who had 9 years of formal schooling at the time of testing. In addition, at grade 8, the Russian Federation and Slovenia included some students who had less than 8 years of formal schooling. However, in all of these cases, the assessed students were of comparable average age to those in other participating countries.

        Another approach, used in PISA, is to designate a target population as students of a particular age (15 years in PISA), regardless of grade. Both approaches are suited to addressing the particular research questions posed by the assessments. The focus of TIMSS and PIRLS is on content as commonly expected to be taught in classrooms, while PISA emphasizes the skills and knowledge that students have acquired throughout their education both in and out of school.
    • Do international studies take into account that student and adult populations vary in participating countries—for example, the United States has higher percentages of immigrant students and adults than some other countries?
      • Each country has different population characteristics, but the point of international studies is to measure as accurately as possible the levels of achievement or proficiency of each participating country's target population. Differences in the levels of achievement or proficiency among students or adults in different countries may be associated with wide variations in respondent characteristics, but they may also be due in part to differences in curriculum, teacher preparation, and other educational or societal factors.
    • What if countries select only their best students to participate? Won't they look better than the rest?
      • Countries cannot independently select the students who will take the test. Students are sampled, but the sampling of schools and students is carefully planned and monitored by the sponsoring international organizations.

        Sampling within countries proceeds as follows:
        A sample of schools in each country is selected randomly from lists of all schools in the country that have students in the particular grade or of the particular age to be assessed. Samples for each country are verified by an international sampling referee. Once the sample of schools is selected, each country must contact these original schools to solicit participation in the assessment. Countries are not allowed to switch schools from the list; doing so can result in the exclusion of their data from the reports.

        Every study establishes response rate targets for selected schools (and students) that countries must meet in order to have their data reported. If the response rate target is not met, countries may be able to assess students from substitute schools, following international guidelines. For example, PIRLS and TIMSS guidelines specify that substitute schools be identified at the time the original sample is selected by designating the two schools neighboring each sampled school on the sampling frame as its substitutes. If the original school declines to participate, the first of the two substitute schools is contacted. If it declines, the second substitute school is contacted. If it also declines, no other substitute school may be used. If one of the two substitute schools accepts, there are still several constraints on its participation in order to prevent bias. If participation levels, even using substitute schools, still fall short of international or national guidelines, a special non-response bias analysis is conducted to determine whether the schools that did not participate differ systematically from the schools that did. If the analysis does not show evidence of bias, the data for a country may still be included in the reporting of results for the international assessment, but the problem with participation rates is noted.
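
        The substitute-school rule above is essentially a small algorithm, and a sketch may make it concrete. The Python fragment below is illustrative only—the school names and the agrees callback are invented—but the two-neighbor assignment and the stop-after-two-refusals rule follow the PIRLS/TIMSS guidelines as described here.

```python
from typing import Callable, Optional

def assign_substitutes(frame: list, sampled_index: int) -> list:
    """Designate the two schools neighboring the sampled school on the
    ordered sampling frame as its substitutes (the PIRLS/TIMSS rule
    described above). Schools at either end of the frame get one."""
    substitutes = []
    if sampled_index > 0:
        substitutes.append(frame[sampled_index - 1])
    if sampled_index + 1 < len(frame):
        substitutes.append(frame[sampled_index + 1])
    return substitutes

def recruit(original: str, substitutes: list,
            agrees: Callable[[str], bool]) -> Optional[str]:
    """Contact the original school, then each substitute in order. If the
    original and both substitutes decline, the slot stays empty: no
    further replacement is allowed."""
    for school in [original] + substitutes:
        if agrees(school):
            return school
    return None

# Hypothetical usage: only "School B" on the frame agrees to participate.
frame = ["School A", "School B", "School C", "School D", "School E"]
chosen = recruit("School C", assign_substitutes(frame, sampled_index=2),
                 agrees=lambda s: s == "School B")
print(chosen)  # -> School B, the first of School C's two substitutes
```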

        Once a sample of schools agrees to participate, the schools are asked to provide a list of all students of the target age or a list of a particular kind of class (for example, all grade 4 classrooms) within the school. From those lists, a group or whole class of students is then randomly selected for the assessment. No substitutions for the randomly selected students are allowed. However, some individual students may be excluded. Each study establishes a set of guidelines for excluding individual students from assessment. Typically, if a student has a verifiable cognitive or physical disability, he or she can be excluded from assessment. However, total student exclusions (at the school level and within schools) may not exceed established levels and are reported in international publications. For example, the sampling standards used in PISA permit countries to exclude up to a total of 5 percent of the relevant population for approved reasons. In the United States, the overall exclusion rate in PISA 2006 was 4.28 percent.

        Exclusions can take place at the school level (e.g., excluding very small schools or those in remote regions) and the student level. Students can be excluded if they are functionally disabled, intellectually disabled, or have insufficient language proficiency. This determination is made on the basis of information from the school, although the contractors implementing the study also look out for ineligible students who may make it through the screening process. Students cannot be excluded solely because of low proficiency or normal discipline problems.
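
        To make the exclusion cap concrete, here is a minimal bookkeeping sketch. The 5 percent limit is PISA's published standard cited above; the function and the counts are invented for illustration.

```python
# PISA's sampling standards (cited above) cap total exclusions at
# 5 percent of the relevant population; the counts below are invented.
PISA_EXCLUSION_CAP = 0.05

def overall_exclusion_rate(school_level_excluded: int,
                           within_school_excluded: int,
                           eligible_population: int) -> float:
    """Combined school-level and within-school exclusions as a share of
    the eligible target population."""
    return (school_level_excluded + within_school_excluded) / eligible_population

rate = overall_exclusion_rate(school_level_excluded=400,
                              within_school_excluded=600,
                              eligible_population=25_000)
print(f"exclusion rate: {rate:.2%}; within cap: {rate <= PISA_EXCLUSION_CAP}")
```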
  • Reported Results
    • Are scores of individual students or adults reported or available for analysis?
      • No. The assessment methods used in international assessments only produce valid scores for groups, not individuals.
    • Can you use the international data to report scores for states?
      • No. The U.S. data are typically representative of the nation as a whole but not of individual states. Drawing a sample that is representative of all 50 individual states would require a much larger sample than the United States currently draws for international assessments, requiring considerable amounts of additional time and money.

        A state may elect to participate in an international assessment as an individual jurisdiction, in which case a sample is drawn that is representative of that state. To date, several states have participated in TIMSS, PIRLS, and PISA that way.
    • Can you compare scores from one study to another?
      • Scores can be compared from one round of an assessment to another round of the same assessment (e.g., TIMSS 1999 to TIMSS 2007), but they typically cannot be directly compared from one study to another (e.g., TIMSS to PISA or NAEP) without special studies to link the different assessments.
    • Can you compare scores between grades—for example, between grade 4 and grade 8 scores on TIMSS?
      • No. The assessments for each grade are scaled separately, so the scores cannot be directly compared in a meaningful way. Only scores from different rounds of the same assessment (e.g., 2003 TIMSS grade 4 and 2007 TIMSS grade 4) can be compared.
    • Why does the United States report different findings for the same subjects from different international assessments?
      • At times, different assessments report different findings for the same subject. One obvious factor to consider when examining findings across assessments is that the grade or age levels of the students assessed may differ. Another factor is that studies also differ in the specific subject matter or skills emphasized (e.g., reading, mathematics, science). An additional difference between assessments that can affect findings in terms of the U.S. position relative to other countries is the group of countries involved in a study. The United States may appear to perform better or worse depending on the number and competitiveness of the other participating countries.
    • Why don't TIMSS, PISA, and PIRLS report differences between U.S. students and other countries' students based on race/ethnicity?
      • There are certain demographic characteristics that are not meaningful across countries. Race/ethnicity is one of these. In the United States, race and ethnicity are highly correlated with education and socio-economic status, which makes them meaningful categories for analysis. While that is also true in other countries, the racial and ethnic categories used to classify people vary from country to country.
  • About PIAAC (the Program for the International Assessment of Adult Competencies)
    • What is assessed in PIAAC?
      • PIAAC is designed to assess adults over a broad range of abilities: from simple reading to complex computer-based problem-solving skills. All countries that participated in PIAAC in 2012 assessed the domains of literacy and numeracy in both a paper-and-pencil mode and a computer-administered mode. In addition, some countries assessed problem solving (administered on a computer) as well as components of reading (administered only in a paper-and-pencil mode). The United States assessed all four domains.
    • How valid is PIAAC? Are assessment questions that are appropriate for the population in one country necessarily appropriate for the population in another country?
      • The assessment was designed to be valid cross-culturally and cross-nationally.

        PIAAC assessment questions are developed in a collaborative, international process. PIAAC assessment questions were based on frameworks developed by internationally known experts in each subject or domain. Assessment experts and developers from ministries/departments of education and labor and OECD staff participated in the conceptualization, creation, and extensive year-long reviews of assessment questions. In addition, the PIAAC Consortium's support staff, assisted by expert panels, researchers, and working groups, developed PIAAC's Background Questionnaire. The PIAAC Consortium also guided the development of common standards and procedures for collecting and reporting data, as well as the international "virtual machine" software that administers the assessment uniformly across countries. All PIAAC countries follow the common standards and procedures and use the virtual machine software when conducting the survey and assessment. As a result, PIAAC can provide a reliable and comparable measure of literacy skills in the adult population of participating countries.

        Before the administration of the assessment, a field test was conducted in the participating countries. The PIAAC Consortium analyzed the field-test data and implemented changes to eliminate problematic test items or revise procedures prior to the administration of the assessment.
    • How can you be sure that countries administer the test in the same way?
      • The design and implementation of PIAAC was guided by technical standards and guidelines developed by literacy experts to ensure that the survey yielded high-quality and internationally comparable data. For example, for their survey operations, participating countries were required to develop a quality assurance and quality control program that included information about the design and implementation of the PIAAC data collection. In addition, all countries were required to adhere to recognized standards of ethical research practices with regard to respect for respondent privacy and confidentiality, the importance of ethics and scientific rigor in research involving human subjects, and the avoidance of practices or methods that might harm or seriously mislead survey participants. Compliance with the technical standards was mandatory and monitored throughout the development and implementation phases of the data collection through direct contact, submission of evidence that required activities had been completed, and ongoing collection of data from countries concerning key aspects of implementation.

        In addition, participating countries provided standardized training to the interviewers who administered the assessment in order to familiarize them with survey procedures that would allow them to administer the assessment consistently across respondents and reduce the potential for erroneous data. After the data collection process, the quality of each participating country's data was reviewed prior to publication. The review was based on the analysis of the psychometric characteristics of the data and evidence of compliance with the technical standards.
    • What does problem solving test or measure?
      • The "problem solving in technology-rich environments" domain assesses the cognitive processes of problem solving: goal setting, planning, selecting, evaluating, organizing, and communicating results. In a digital environment, these skills involve understanding electronic texts, images, graphics, and numerical data, as well as locating, evaluating, and critically judging the validity, accuracy, and appropriateness of the accessed information.
    • What are "technology-rich environments"?
      • The environment in which PIAAC problem solving is assessed is meant to reflect the fact that digital technology has changed the ways in which individuals live their day-to-day lives, communicate with others, work, conduct their affairs, and access information. Information and communication technology tools such as computer applications, the Internet, and mobile technologies are all part of the environments in which individuals operate. In PIAAC, items for problem solving in technology-rich environments are presented on laptop computers in simulated software applications using commands and functions commonly found in e-mail, web browsers, and spreadsheets.
    • Why does PIAAC assess literacy only in English?
      • PIAAC assesses adults in the official language or languages of each participating country. Based on a 1988 congressional mandate and the 1991 National Literacy Act, the U.S. Department of Education is required to evaluate the status and progress of adults' literacy in English. However, in order to obtain background information from a wide range of respondents in the United States, the PIAAC Background Questionnaire was administered in both English and Spanish.
    • How does PIAAC select a representative sample of adults?
      • Countries that participate in PIAAC must draw a sample of individuals ages 16-65 that represents the entire population of adults living in households in the country. Some countries draw their samples from national registries of all persons in the country; others draw their samples from census data. In the United States, a nationally representative household sample was drawn from the most current Census Bureau population estimates.

        The U.S. sample design employed by PIAAC in the first round of U.S. data collection is generally referred to as a four-stage stratified area probability sample. This method involves the selection of (1) primary sampling units (PSUs) consisting of counties or groups of contiguous counties, (2) secondary sampling units (referred to as segments) consisting of area blocks, (3) dwelling units (DUs), and (4) eligible persons (the ultimate sampling unit) within DUs. Random selection methods are used, with calculable probabilities of selection at each stage of sampling. This sample design ensured the production of reliable statistics for a minimum of 5,000 completed cases for the first round of data collection. For more information about the sample design used in the second round of U.S. data collection, see question 21.
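
        As a rough illustration of the four-stage structure (county groups, then segments, then dwelling units, then persons), the sketch below draws one unit at random at each stage. It is a toy under invented data, not NCES's sampling program: the real design is stratified, selects many units per stage, and uses probabilities proportional to size.

```python
import random

# Hypothetical nested frame: PSUs (county groups) -> segments (area blocks)
# -> dwelling units (DUs) -> eligible persons ages 16-65.
frame = {
    "county group 1": {
        "segment 1A": {"DU 1": ["person 1", "person 2"], "DU 2": ["person 3"]},
        "segment 1B": {"DU 3": ["person 4", "person 5"]},
    },
    "county group 2": {
        "segment 2A": {"DU 4": ["person 6", "person 7"]},
    },
}

rng = random.Random(0)
psu = rng.choice(list(frame))                       # stage 1: primary sampling unit
segment = rng.choice(list(frame[psu]))              # stage 2: area segment
dwelling = rng.choice(list(frame[psu][segment]))    # stage 3: dwelling unit
person = rng.choice(frame[psu][segment][dwelling])  # stage 4: eligible person
print(psu, "->", segment, "->", dwelling, "->", person)
```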
    • Were immigrants, illegal immigrants, or non-English speakers assessed in PIAAC? Did they bring down our scores?
      • All adults, regardless of immigration status, were part of the PIAAC Main Study's target population for the assessment. In order to get a representative sample of the adult population currently residing in the United States, respondents were not asked about citizenship status before taking the assessment and were guaranteed anonymity for all their answers to the Background Questionnaire. Although the assessment was administered only in English, the Background Questionnaire was offered in both Spanish and English. These procedures allowed the estimates to be applicable to all adults in the United States, regardless of citizenship or legal status, and they mitigated the effects of low English language proficiency.

        As in most participating countries, non-native-born adults in the United States had, on average, lower scores than native-born adults. The percentage of non-native-born adults in the United States was 15 percent. The average percentage of non-native-born adults across all participating countries was 12 percent, ranging from less than 1 percent in Japan to 28 percent in Australia.
    • What if some countries select only their highest performing adults to participate in PIAAC? Won't they look better than the other participating countries?
      • Sampling is carefully planned and monitored. The rules of participation require that countries design a sampling plan that meets the standards in the PIAAC Technical Standards and Guidelines and submit it to the PIAAC Consortium for approval. In addition, countries were required to complete quality control forms to verify that their sample was selected in an unbiased and randomized way. Quality checks were performed by the PIAAC Consortium to ensure that the submitted sampling plans were followed accurately.
    • Are adults required to participate in PIAAC?
      • No, PIAAC is a voluntary assessment.
    • How do international assessments deal with the fact that adult populations in participating countries are so different? For example, the United States has higher percentages of immigrants than some other countries.
      • The PIAAC results are nationally representative and therefore reflect countries as they are: highly diverse or not. PIAAC collects extensive information about respondents' background and therefore supports analyses that take into account differences in the level of diversity across countries. The international PIAAC report produced by the OECD presents some analyses that examine issues of diversity.
    • How does PIAAC differ from international student assessments?
      • As an international assessment of adult competencies, PIAAC differs from student assessments in several ways. PIAAC assesses a wide range of ages (16-65), whereas student assessments target a specific age (e.g., 15-year-olds in the case of PISA) or grade (e.g., grade 4 in PIRLS). PIAAC is a household assessment (i.e., an assessment administered in individuals' homes), whereas the international student assessments (PIRLS, PISA, and TIMSS) are conducted in schools. The skills that are measured in each assessment also differ based on the goals of the assessment. Both TIMSS and PIRLS are curriculum based and are designed to assess what students have been taught in school in specific subjects (such as science, mathematics, or reading) using multiple-choice and open-ended test questions. In contrast, PIAAC and PISA are "literacy" assessments, designed to measure performance in certain skill areas at a broader level than school curricula. So while TIMSS and PIRLS aim to assess the particular academic knowledge that students are expected to be taught at particular grades, PISA and PIAAC encompass a broader set of skills that students and adults have acquired throughout life.
    • How does PIAAC differ from earlier adult literacy assessments, such as NALS, IALS, NAAL, and ALL?
      • PIAAC has improved and expanded on the cognitive frameworks of previous large-scale adult literacy assessments (including NALS, NAAL, IALS, and ALL) and has added an assessment of problem solving via computer, which was not a component of these earlier surveys. In addition, PIAAC is capitalizing on prior experiences with large-scale assessments in its approach to survey design and sampling, measurement, data collection procedures, data processing, and weighting and estimation. The most significant difference between PIAAC and previous large-scale assessments is that PIAAC is administered on laptop computers and is designed to be a computer-adaptive assessment, so respondents receive groups of items targeted to their performance levels (respondents not able to or not wishing to take the assessment on computer are provided with an equivalent paper-and-pencil version of the literacy and numeracy items). Because of these differences, PIAAC introduced a new set of scales to measure adult literacy, numeracy, and problem solving. Some scales from these previous adult assessments have been mapped to the PIAAC scales so that performance can be measured over time.
    • How do PIAAC and PISA compare?
      • PIAAC and PISA both emphasize knowledge and skills in the context of everyday situations, asking students and adults to perform tasks that involve real-world materials as much as possible. PISA is designed to show the knowledge and skills that 15-year-old students have accumulated within and outside of school. It is intended to provide insight into what students who are about to complete compulsory education know and are able to do.

        PIAAC focuses on adults who are already eligible to be in the workforce and aims to measure the set of literacy, numeracy, and technology-based problem-solving skills an individual needs in order to function successfully in society. Therefore, PIAAC does not directly measure the academic skills or knowledge that adults may have learned in school. Instead, the PIAAC assessment focuses on tasks that adults may encounter in their lives at home, at work, or in their community.
    • Why doesn't PIAAC report differences between minorities in the United States and minorities in other countries?
      • Each country can collect data for subgroups of the population that have national importance. In some countries, these subgroups are identified by language usage; in other countries, they are distinguished by tribal affiliation. In the United States, different racial and ethnic subgroups are of national importance. However, categories of race and ethnicity are social and cultural categories that differ greatly across countries. As a result, they cannot be compared accurately across countries.
    • Can the PIAAC data be used to report scores for states?
      • In total, in the United States, 8,670 adults participated in PIAAC in 2012 and 2014, which is not enough respondents to produce accurate estimates at the state or county level. Thus, in the United States, PIAAC results can only be reported at the national level. NCES is in the process of reviewing plans for producing state-level (synthetic) estimates.
    • How do international assessments deal with the fact that education systems are so different across countries?
      • PIAAC collects extensive information on educational attainment and years of schooling. For the purpose of cross-country comparisons of educational attainment, the education level classifications of each country are standardized using the International Standard Classification of Education (ISCED). For example, the ISCED level for short-cycle tertiary education (ISCED level 5) is equivalent to an associate's degree in the United States; therefore, comparisons of adults with an associate's degree or its equivalent can be made across countries using this classification. Please note that the education variables in PIAAC 2012 were classified using the ISCED97. Additional education variables that were classified using the ISCED11 are available in the PIAAC 2012/2014 dataset.
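
        In code, such an ISCED crosswalk is simply a lookup table. In the sketch below, only the ISCED level 5/associate's degree equivalence comes from the text above; the other entries follow the general ISCED 2011 scheme and are included purely for illustration.

```python
# Hypothetical crosswalk from ISCED 2011 levels to U.S. attainment labels.
# Level 5 (short-cycle tertiary) <-> associate's degree is stated above;
# the remaining rows are illustrative, not an official NCES mapping.
ISCED11_TO_US = {
    3: "High school diploma or equivalent",   # assumed for illustration
    5: "Associate's degree",                  # short-cycle tertiary (from text)
    6: "Bachelor's degree or equivalent",     # assumed for illustration
    7: "Master's degree or equivalent",       # assumed for illustration
}

def us_equivalent(isced_level: int) -> str:
    """Return the U.S. label used for cross-country comparison."""
    return ISCED11_TO_US.get(isced_level, "no U.S. equivalent coded")

print(us_equivalent(5))  # -> Associate's degree
```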
    • What is the PIAAC National Supplement?
      • The National Supplement, conducted in 2013–14, was the second round of data collection for PIAAC in the United States; it followed the Main Study, the first round of data collection, which was conducted in 2011–12 and surveyed adults ages 16-65. The National Supplement increased the number of unemployed adults (ages 16-65) and young adults (ages 16-34) in the sample and added older adults (ages 66-74) as well as incarcerated adults (ages 16-74).
    • Why was a second round of PIAAC data collected in the United States?
      • The second round of data collection for PIAAC in the United States was conducted for two reasons. First, augmenting the first round of PIAAC data by increasing the sample size permits more in-depth analyses of the cognitive and workplace skills of the U.S. population (in particular, of unemployed and young adults). Second, the additional information on older adults (ages 66-74) and incarcerated adults makes it possible to compare PIAAC data with rescaled proficiency data from the 2003 National Assessment of Adult Literacy (NAAL). This, in turn, makes it possible to analyze change in adult skills over the decade between the two studies.
    • What are the differences between the first round of data collection for PIAAC in 2012 (the Main Study) and the second round in 2014 (the National Supplement)?
      • In both rounds of PIAAC in the United States, the same instruments and procedures, including the Background Questionnaire and Direct Assessment, were used for the household survey. For the prison study, the Background Questionnaire was modified to collect information related to the needs and experiences of incarcerated adults.

        The two data collections also sampled different populations. The first round of data collection surveyed a nationally representative sample of adults ages 16-65, while the second round did not survey a nationally representative sample of adults, but rather only the key subgroups of interest. The second round of PIAAC also surveyed two subgroups of the population that were not part of the first round of data collection: older adults (ages 66-74) and incarcerated adults (ages 16-74). Note that in the new data release, the two household samples were combined to provide a nationally representative sample of 16-74-year-old adults across the period of data collection (2011–2014).
    • What is the scope of the household sample in the 2014 National Supplement? Is it different from the scope of the household sample in the 2012 Main Study?
      • The second round of data collection for PIAAC (in 2014) sampled 3,660 U.S. adults who were unemployed (ages 16-65), young (ages 16-34), or older (ages 66-74). The household sample selection in the second round differed from the first round (in 2012) in that only persons in the target groups were selected. The sampling approach in the second round consisted of an area sample that used the same primary sampling units (PSUs) as in the first round; in addition, it included a list sample of dwelling units from high-unemployment Census tracts in order to obtain the oversample of unemployed adults. When the data from both rounds are combined, they produce a nationally representative sample with larger subgroup sample sizes that can produce estimates of higher precision for the subgroups of interest.
    • What is the scope of the prison sample in the second round of data collection?
      • The Prison Study sample consists of 1,300 adults, ages 16-74, incarcerated in federal and state prisons in the United States. Data collection began in February 2014 and was completed in June 2014. A two-stage sample design was used to select the inmates. In the first stage, 100 prisons were selected (of which 98 participated), and in the second stage, approximately 15 inmates, on average, were selected from the sampled facilities. An oversample of female prisons was selected to ensure an adequate sample of female inmates. The Prison Study sample was selected independently of the PIAAC household sample and is weighted separately from the household sample. Prison weights are calibrated to national prison population totals (for inmates ages 16-74) provided by the Bureau of Justice Statistics.
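
        A toy sketch of the two-stage selection with an oversample of female facilities follows. The stage sizes (100 prisons, roughly 15 inmates per facility) come from the description above; the frame, the oversampling allocation, and the roster are invented.

```python
import random

rng = random.Random(2014)

# Hypothetical facility frame; the "female" flag drives the oversample.
facilities = [{"id": i, "female": i % 10 == 0} for i in range(1, 1001)]
female = [f for f in facilities if f["female"]]
other = [f for f in facilities if not f["female"]]

# Stage 1: 100 prisons, with female facilities deliberately oversampled
# (the 20/80 split is an invented allocation, not the actual design).
stage1 = rng.sample(female, 20) + rng.sample(other, 80)

# Stage 2: roughly 15 inmates selected per sampled facility.
def sample_inmates(roster: list, n: int = 15) -> list:
    return rng.sample(roster, min(n, len(roster)))

roster = [f"inmate {j}" for j in range(1, 41)]  # invented facility roster
print(len(stage1), sample_inmates(roster)[:3])
```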
    • Were the same instruments and procedures used in the first and second rounds of data collection? Were the same instruments and procedures used for the household samples and the prison sample?
      • The same procedures and instruments, including the Background Questionnaire and Direct Assessment, used in the first round of data collection were employed in the second-round household and prison data collections.

        However, the Background Questionnaire for the prison sample was tailored to collect information related to the needs and experiences of incarcerated adults. Adaptations to the questionnaire for the prison population included (a) deleting questions that would be irrelevant to respondents in prison; and (b) adding questions that addressed respondents' specific activities in prison (e.g., participation in academic programs and English as a Second Language (ESL) classes; experiences with prison work assignments; involvement in nonacademic programs, such as life skills and employment readiness classes; and educational attainment and employment prior to incarceration).

        The same Direct Assessment used in the household sample was used in the prison sample.
    • What incentives were given to participants?
      • A monetary incentive of $5 was paid to household representatives who completed the screener—which contained questions that would determine the eligibility of household members to be included in the sample—in the second round of the PIAAC data collection. In the first round, no monetary incentive was paid to household representatives for completing the screener.

        The screener incentive used in the second round of data collection was intended to help reduce nonresponse to a screener that was slightly longer than that used in the first round. Specifically, the second-round screener included various questions about unemployment status that were not included in the first-round screener. As in the first round of data collection, following the completion of the assessment, an additional monetary incentive of $50 was paid to each respondent. The incentive was also paid to those adults who attempted to complete the assessment, but were legitimately not able to complete it because of language barriers or physical or mental disabilities. Respondents who refused to continue with the assessment were not compensated.
    • How and why are the current U.S. results (from the combined 2012/2014 dataset) different from the results from the PIAAC Main Study in 2012? Why did the U.S. household scores and ranking change? Did it change because the skills of U.S. adults improved or declined between 2012 and 2014?
      • The United States conducted two rounds of data collection for PIAAC, but not two independent studies. The data from the first and second rounds are meant to be combined and analyzed together; the two rounds cannot be compared with each other.

        Because of the timing of the first and second rounds of the PIAAC data collection in the United States, the information available for the study's sampling frames differed between 2012 and 2014. Specifically, the 2012 data were based on the 2000 U.S. Census, while the 2014 data were based on the 2010 U.S. Census. Therefore, in addition to the larger combined sample (8,670 for the household study), the improved accuracy of the estimates is due in part to the revised population estimates based on the 2010 Census data, which were unavailable when PIAAC 2012 went into the field.

        For the 2012 data collection, weights for all respondents were calibrated to the U.S. Census Bureau's 2010 American Community Survey population totals for those ages 16-65. (The 2010 American Community Survey population totals were derived from 2000 U.S. Census projections because the full 2010 U.S. Census population results were not yet available.) Once the 2010 U.S. Census population results were finalized, the U.S. Census Bureau refreshed its entire time series of annual estimates going back to the previous census, using the most current data and methodology. One result of this refresh is a shift in the proportion of the population with more education.

        A comparison of the population totals used to calibrate the 2012 Main Study data with those used to calibrate the composite 2012/2014 dataset reveals that the percentage of the U.S. population ages 16-65 with college experience (some college or a college degree) increased by 3 to 4 percentage points and the percentage of the population ages 16-65 with less than a high school diploma decreased by 4 percentage points. This change has no effect on PIAAC's measurement of skills in the United States, but it does mean that the proportion of the population with higher skills has been found to be larger than previously estimated for the 2012 Main Study. Therefore, adults' skills did not change in this time period; rather, due to the larger sample and the updated Census data, the estimates of skills reported with the combined 2012/2014 sample are more accurate.
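
        Mechanically, "calibrating weights to population totals" means scaling respondents' weights so that weighted sample counts reproduce external control totals. The post-stratification sketch below, with invented categories and numbers, shows the core idea; the actual PIAAC calibration is more elaborate, but the consequence is the same: refreshed control totals shift population estimates even when measured skills are unchanged.

```python
# Invented respondents with base weights, keyed by an education category.
sample = [
    {"educ": "less than high school", "weight": 1000.0},
    {"educ": "less than high school", "weight": 1200.0},
    {"educ": "college experience",    "weight": 900.0},
    {"educ": "college experience",    "weight": 1100.0},
]

# Invented control totals, standing in for Census/ACS population estimates.
control_totals = {
    "less than high school": 30_000_000,
    "college experience":    120_000_000,
}

# Sum the base weights within each category...
weighted_counts = {}
for r in sample:
    weighted_counts[r["educ"]] = weighted_counts.get(r["educ"], 0.0) + r["weight"]

# ...then scale each weight so the weighted counts match the control totals.
for r in sample:
    r["weight"] *= control_totals[r["educ"]] / weighted_counts[r["educ"]]

for category in control_totals:
    total = sum(r["weight"] for r in sample if r["educ"] == category)
    print(f"{category}: {total:,.0f}")  # reproduces the control total exactly
```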
    • How and why are the international averages reported in the 2014 First Look report different from the averages reported in the 2012 PIAAC First Look report?
      • The PIAAC international averages in the 2012 PIAAC First Look report were calculated by the OECD using restricted data from all participating countries. However, restricted data from Australia and Canada are not available to the United States because of national restrictions on the use of their data. Thus, with the exception of figures 1 and 2, the PIAAC international averages in the 2014 PIAAC First Look report were calculated (a) without Australia's data, (b) with Canada's publicly available data, and (c) with the 2012/2014 U.S. data. Differences between the international averages calculated for the 2012 PIAAC First Look report and those calculated for the 2014 PIAAC First Look report are very small, but they cause some estimates to round differently.
    • Can the combined 2012/2014 U.S. household sample be compared to the samples in other countries? Which subsamples can be compared? Which cannot?
      • The combined 2012/2014 U.S. household sample of all adults ages 16-65 can be compared to samples from the other countries that participated in PIAAC. Two of the additional subsamples that were a focus of the National Supplement can also be compared to international samples: the sample of younger adults ages 16-34 and unemployed adults ages 16-65.

        Two of the other household samples are unique to the U.S. supplemental study and cannot be compared to samples from other countries: the sample of older adults ages 66-74 and the total sample of adults ages 16-74.
    • Do all of the currently available estimates include data from the 2014 National Supplement?
      • The estimates in the 2014 PIAAC First Look report include data from the National Supplement. The NCES PIAAC website has also been updated with results based on the 2012/2014 data, where possible. In addition, the NCES PIAAC Results Portal has been updated to show results that include the 2012/2014 data. The NCES International Data Explorer (IDE) has also been updated to allow users to conduct analyses on the U.S. PIAAC 2012/2014 data. Additionally, the U.S. PIAAC 2012/2014 public- and restricted-use data files will soon be available.

        The international U.S. public-use file available on the OECD website and the OECD IDE will be updated to include the U.S. PIAAC 2012/2014 data later in 2016.
    • When will the results from the U.S. supplemental study of prisons become available?
      • Results from the U.S. supplemental study of prisons will be available later in 2016.
  • About PIRLS (the Progress in International Reading Literacy Study)
    • What is PIRLS?
      • The Progress in International Reading Literacy Study (PIRLS) is an international assessment and research project designed to measure trends in reading achievement at the fourth-grade level, as well as school and teacher practices related to instruction. Since 2001, PIRLS has been administered every 5 years. PIRLS 2016, the fourth study in the series, involves students from 54 education systems, including the United States. For the first time, PIRLS is also administering an innovative assessment of online reading called ePIRLS.
    • What is ePIRLS?
      • ePIRLS is an innovative assessment of online reading that makes it possible for countries to understand how successful they are in preparing fourth-grade students to read, comprehend, and interpret online information. ePIRLS is being administered for the first time in 2016, in 16 education systems. More information on ePIRLS can be found in the IEA ePIRLS brochure.
    • What questions can PIRLS answer?
      • PIRLS is a carefully constructed reading assessment, consisting of a test of reading literacy and questionnaires to collect information about 4th-grade students' literacy performance. PIRLS will help educators and policymakers by answering questions such as:
        • How well do 4th-grade students read?
        • How do students in one country compare with students in another country in reading literacy?
        • Do 4th-grade students value and enjoy reading?
        • Internationally, how do the reading habits and attitudes of students vary?
    • What aspects of reading literacy are assessed in PIRLS?
      • PIRLS focuses on three aspects of reading literacy:
        • purposes of reading;
        • processes of comprehension; and
        • reading behaviors and attitudes.
        The first two form the basis of the written test of reading comprehension. The student background questionnaire addresses the third aspect.

        In PIRLS, purposes of reading refers to the two types of reading that account for most of the reading done by young students, both in and out of school: (1) reading for literary experience, and (2) reading to acquire and use information. In the assessment, narrative fiction is used to assess students' ability to read for literary experience, while a variety of informational texts are used to assess students' ability to acquire and use information while reading. The PIRLS assessment devotes about equal proportions to each of these two purposes.

        Processes of comprehension refer to ways in which readers construct meaning from the text. Readers focus on and retrieve explicitly stated information; make straightforward inferences; interpret and integrate ideas and information; and evaluate and critique content, language, and textual elements.
        For more information on the purposes for reading and processes of comprehension, see the PIRLS 2016 Assessment Framework.
    • What are the components of PIRLS?
      • Assessment
        The assessment instruments include 4th-grade-level stories and informational texts collected from several different countries. Students are asked to engage in a full repertoire of reading skills and strategies, including retrieving and focusing on specific ideas, making simple and more complex inferences, and examining and evaluating text features. The passages are followed by open-ended and multiple-choice format questions about the text.

        The 2016 assessment consists of 15 booklets and 1 reader (presented in a magazine-type format with the questions in a separate booklet). The assessment is given in two 40-minute parts with a 5- to 10-minute break in between. Each booklet contains two parts—one block of literary experience items and one block of informational items—and each block occurs twice across the set of booklets. Because the entire assessment consists of 12 blocks of passages and items, using different booklets allows PIRLS to report results from more assessment items than can fit in one booklet, without making the assessment longer. To provide good coverage of each skill domain, the full set of test items developed would require about 8 hours of testing time. However, testing time is limited to 80 minutes per student by clustering items in blocks and randomly rotating the blocks of items throughout the student test booklets. As a consequence, no student receives all items (there were a total of 175 items on the 2016 assessment), but each item is answered by a representative sample of students.
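
        The rotation idea can be sketched generically: pair blocks into booklets so that each block appears the same number of times and no booklet carries more than a small slice of the item pool. The Python fragment below is a schematic cyclic rotation, not the actual PIRLS booklet map (which uses 15 booklets plus a reader).

```python
# 12 item blocks, as in PIRLS 2016; the pairing scheme is illustrative.
blocks = [f"block {i}" for i in range(1, 13)]

def cyclic_booklets(blocks):
    """Pair adjacent blocks cyclically: (1,2), (2,3), ..., (12,1). Every
    block lands in exactly two booklets, and every booklet carries only
    two of the twelve blocks."""
    n = len(blocks)
    return [(blocks[i], blocks[(i + 1) % n]) for i in range(n)]

booklets = cyclic_booklets(blocks)
# Each block occurs twice across the booklets, so the pool covers far more
# items than any single student answers in the 80-minute testing window.
assert all(sum(b in bk for bk in booklets) == 2 for b in blocks)
print(booklets[0], booklets[-1])
```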

        A total of 12 reading passages—two used in PIRLS 2001, 2006, and 2011; two used in 2006 and 2011; two used in PIRLS 2011 only; and six new passages—are included in the 2016 assessment booklets used in all participating education systems. The use of common passages from the 2001 through the 2016 assessments allows for the analysis of change in reading literacy over the 15-year period between administrations for countries that participated in these cycles. The passages, as well as all other study materials, were translated into the primary language or languages of instruction in each education system.

        Similar to PIRLS, the ePIRLS assessment consists of five 40-minute tasks, but each student is asked to complete only two of them. Each task is a school-based online reading exercise involving two to three different websites totaling five to ten web pages; students complete a series of comprehension questions based on the task.

        Questionnaires
        Background questionnaires are administered to collect information about students' home and school experiences in learning to read. A student questionnaire addresses students' attitudes towards reading and their reading habits. The student questionnaire is administered after the assessment portion, taking about 30 minutes to complete. In all, PIRLS takes 1½ to 2 hours of each student's time, including the assessment and background questionnaire.

        In addition, questionnaires are given to students' teachers and school principals to gather information about students' school experiences in developing reading literacy. The teacher and school questionnaires are administered either online from a secure website or via a hardcopy form. Teacher questionnaires take about 40 minutes to complete and ask teachers questions about their education and experience, available resources, and instructional practices. School questionnaires take about 40 minutes to complete and ask about school practices and resources.

        In countries other than the United States, a parent questionnaire is also administered.
    • What is the data collection schedule?
      • In both hemispheres, PIRLS is conducted near the end of the school year. Thus, for PIRLS 2016, countries in the Southern Hemisphere conduct the study between October and December 2015, while countries in the Northern Hemisphere conduct the study between March and June 2016.
    • What is the sample design?
      • Each participating country agrees to select a sample that is representative of the target population as a whole. In 2001, the target population was the upper of the two adjacent grades with the most 9-year-olds. For PIRLS 2006, 2011, and 2016, the definition of the target population was refined to represent students in the grade that corresponds to 4 years of schooling, counting from the first year of International Standard Classification of Education (ISCED) Level 1—4th grade in most countries, including the United States. This population represents an important stage in the development of reading: at this point, children generally have learned to read and are using reading to learn. IEA's Trends in International Mathematics and Science Study (TIMSS) also assesses this target population of students.

        In each administration of PIRLS, schools are randomly selected first (with a probability proportional to the estimated number of students enrolled in the target grade) and then one or two classrooms are randomly selected within each school. In 2001, a nationally representative sample of 3,763 U.S. 4th-grade students was selected from a sample of 174 schools. In 2006, a nationally representative sample of 5,190 U.S. 4th-grade students was selected from a sample of 183 schools. In 2011, a nationally representative sample of 12,726 U.S. 4th-grade students was selected from a sample of 370 schools.
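
        This two-stage design can be sketched in a few lines of Python. The sketch below is a simplified illustration with a made-up school frame: schools are drawn with probability proportional to their estimated enrollment, then one or two classrooms are drawn within each sampled school. (Operational sampling is done without replacement and with the stratification described in the TIMSS section of this FAQ.)

        import random

        random.seed(1)

        # Hypothetical frame: (school_id, estimated grade-4 enrollment).
        frame = [(f"school_{i}", random.randint(20, 120)) for i in range(2000)]

        # Stage 1: draw schools with probability proportional to enrollment
        # (random.choices samples with replacement; shown for simplicity).
        schools = random.choices([s for s, _ in frame],
                                 weights=[n for _, n in frame], k=150)

        # Stage 2: randomly select one or two classrooms per sampled school;
        # every student in a selected classroom is assessed.
        sample = {}
        for school in set(schools):
            classrooms = [f"{school}_class_{c}"
                          for c in range(random.randint(2, 5))]
            sample[school] = random.sample(classrooms, k=min(2, len(classrooms)))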

        The sample was larger in 2011 than in previous administrations because the TIMSS and PIRLS data collections coincided that year. The decision was made to draw a larger sample of schools and to request that both studies be administered in the same schools (where feasible), albeit to separate classroom samples of students. Thus, TIMSS (grade 4) and PIRLS in the United States were administered in the same schools but to separately sampled classrooms of students.

        Between February and May of 2016, 150 schools nationwide are taking part in the PIRLS Main Study. Within each school, 4th-grade classrooms are randomly selected to represent the nation's 4th-graders. All students from selected classrooms are invited to participate in the PIRLS main study.
    • Which countries are participating?
      • The table below lists the total number of education systems that have participated in each of the four administrations of PIRLS at grade 4. This number includes both countries and subnational entities, such as Canadian provinces, U.S. states, England, and Hong Kong. For more information on participating education systems, including a complete list of education systems participating in ePIRLS, visit the PIRLS Country Page.

        Year    Education systems participating in PIRLS at grade 4
        2001    36
        2006    45
        2011    53
        2016    54
    • How was PIRLS developed and administered?
      • PIRLS is a cooperative effort involving representatives from every education system participating in the study. Prior to each administration of PIRLS, the framework is reviewed and updated to reflect changes in the curriculum and instruction of participating education systems, while maintaining the ability to measure change over time. Extensive input is received from experts in reading education, assessment, and curriculum, as well as representatives from national education centers around the world.

        In order for educators, policymakers, and other stakeholders to better understand the results from PIRLS, many assessment items are released for public use after each administration. To replace these items, countries submit items for review by subject-matter specialists, and additional items are written by a committee in consultation with item-writing specialists in various countries to ensure that the content, as explicated in the frameworks, is covered adequately. Items are reviewed by a committee and field-tested in most of the participating education systems. Results from the field test are used to evaluate item difficulty, how well items discriminate between high- and low-performing students, and evidence of bias toward or against individual countries or in favor of boys or girls. In 2016, 95 new items were selected for inclusion in the international assessment and added to 80 existing items.

        PIRLS is administered as a pencil-and-paper assessment and includes both multiple-choice and constructed-response items. The item pool contains a selection of literary passages drawn from children's storybooks and informational texts. Literary passages include realistic stories and traditional tales, while informational texts include chronological and nonchronological articles, biographical articles, and informational leaflets.
    • How does PIRLS compare to the NAEP fourth-grade reading assessment?
      • Three studies have compared PIRLS and NAEP in terms of their measurement frameworks and the reading passages and questions included in the assessments. The most recent study (see Appendix C in the PIRLS 2011 report) compared NAEP with PIRLS 2011. The first study—A Comparison of the NAEP and PIRLS Fourth-Grade Reading Assessments (PDF, 852 KB)—compared NAEP with PIRLS 2001, and the second study—Comparing PIRLS and PISA with NAEP in Reading, Mathematics, and Science (PDF, 211 KB)—compared NAEP with PIRLS 2006. The studies found the following similarities and differences:

        Similarities
        • PIRLS and NAEP call for students to develop interpretations, make connections across text, and evaluate aspects of what they have read.
        • PIRLS and NAEP use literary passages drawn from children's storybooks and informational texts as the basis for the reading assessment.
        • PIRLS and NAEP use multiple-choice and constructed-response questions with similar distributions of these types of questions.
        Differences
        • PIRLS reading passages are, on average, shorter than fourth-grade NAEP reading passages.
        • Results of readability analyses suggest that the PIRLS reading passages are easier than the NAEP passages (by about one grade level on average).
        • PIRLS calls for more text-based interpretation than NAEP. NAEP places more emphasis on having students connect what they have read to other readings or knowledge and on critically evaluating what they have read.
    • Where can I get a copy of the PIRLS U.S. Report?
    • When will PIRLS be administered again?
      • PIRLS is being administered in 2016. For more information on the schedule leading up to the release of PIRLS results in December 2017, visit the PIRLS Schedule & Plans Page.
    • Which schools are selected for participation?
      • Schools of varying demographics and locations are randomly selected so that the overall U.S. sample is representative of the overall U.S. school population. The random selection process is important for ensuring that a country's sample accurately reflects its schools and, therefore, can be compared fairly with samples of schools from other countries.
    • Are all fourth-grade students in a school asked to participate?
      • In schools with only one or two fourth-grade classrooms, all students are asked to participate. In schools with more than two fourth-grade classrooms, only students in two randomly selected classrooms are asked to participate. Some classrooms selected to participate in PIRLS are also asked to take part in ePIRLS. In classrooms that are also asked to participate in ePIRLS, PIRLS is administered on one day and ePIRLS on a second day. In addition, some students with special needs or limited English proficiency may be excused from the assessment.
  • About PISA (the Program for International Student Assessment)
    • What subject areas are assessed in PISA?
      • PISA measures student performance in mathematics, reading, and science literacy. Conducted every 3 years, each PISA data cycle assesses one of the three core subject areas in depth (considered the major domain), although all three core subjects are assessed in each cycle (the other two subjects are considered minor subject areas for that assessment year). Assessing all three subjects every 3 years allows countries to have a consistent source of achievement data in each of the three subjects while rotating one area as the primary focus over the years. More information on the PISA assessment frameworks can be found at: www.oecd.org/pisa/pisaproducts.

        Science is the major subject area in 2015, as it was in 2006, since each subject is a major subject area once every three cycles. In 2015, all subjects were assessed primarily through a computer-based assessment. In addition to the core assessments of science, reading, mathematics, and collaborative problem solving, the United States participated in the optional financial literacy assessment in 2015.

        PISA administration cycle

        Assessment year   Subjects assessed
        2000              READING, Mathematics, Science
        2003              Reading, MATHEMATICS, Science, Problem solving
        2006              Reading, Mathematics, SCIENCE
        2009              READING, Mathematics, Science
        2012              Reading, MATHEMATICS, Science, Problem solving, Financial literacy
        2015              Reading, Mathematics, SCIENCE, Collaborative problem solving (CPS), Financial literacy

        NOTE: Reading, mathematics, and science literacy are all assessed in each assessment cycle of the Program for International Student Assessment (PISA). The subject in all capital letters is the major subject area for that cycle. A collaborative problem solving (CPS) assessment was administered in 2015. Financial literacy is an optional assessment for countries. Beginning in 2015, PISA is administered entirely on computer.
    • What are the components of PISA?
      • Assessments
        PISA 2015 consists of computer-based assessments of students' mathematics, science, and reading literacy, and collaborative problem solving skills. In each participating school, sampled students sit for a two-hour computer-based assessment. Countries can also opt to participate in an assessment of financial literacy.

        Questionnaires
        In 2015, students completed a student questionnaire providing information about their background, attitudes towards science, and learning strategies, and the principal of each participating school completed a school questionnaire providing information on the school's demographics and learning environment. New in 2015, PISA included a teacher questionnaire, completed by up to 10 science and 15 non-science teachers per school; separate questionnaires were administered to science and non-science teachers. The PISA questionnaires used in the United States in prior cycles are available at: http://nces.ed.gov/surveys/pisa/questionnaire.asp.
    • How many U.S. schools and students participate in PISA?
      • Assessment   Number of       Number of       School response rate (percent)   Overall student
        year         participating   participating   Original    With substitute      response rate
                     students        schools         schools     schools              (percent)
        2000         3,700           145             56          70                   85
        2003         5,456           262             65          68                   83
        2006         5,611           166             69          79                   91
        2009         5,233           165             68          78                   87
        2012         6,111           161             67          77                   89
    • How does PISA select a representative sample of students?
      • Step 1

        To provide valid estimates of student achievement and characteristics, PISA selects a sample of students that represents the full population of 15-year-old students in each participating country or education system. This population is defined internationally as 15-year-olds (15 years and 3 months to 16 years and 2 months at the beginning of the testing period) attending both public and private schools in grades 7-12. Each country or education system submits a sampling frame to the consortium of organizations responsible for the implementation of PISA 2015 internationally. Westat, a survey research firm in Rockville, Maryland, contracted by the OECD, then validates each country's or education system's frame.
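
        As a rough illustration of the age-eligibility rule stated above, the following Python sketch flags whether a student falls within the eligible age band at the start of testing. The whole-month arithmetic is a simplification for illustration; the exact operational rule is defined in the PISA technical standards.

        from datetime import date

        def is_age_eligible(birth_date: date, testing_start: date) -> bool:
            """Approximate PISA age rule: 15 years 3 months to 16 years
            2 months old at the beginning of the testing period."""
            months = ((testing_start.year - birth_date.year) * 12
                      + (testing_start.month - birth_date.month))
            return 15 * 12 + 3 <= months <= 16 * 12 + 2

        # Example: testing begins October 1, 2015.
        print(is_age_eligible(date(2000, 3, 15), date(2015, 10, 1)))  # True
        print(is_age_eligible(date(2001, 1, 10), date(2015, 10, 1)))  # False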

        Step 2

        Once a sampling frame is validated, Westat draws a scientific random sample of a minimum of 150 schools from each frame, with two replacement schools for each original school; if a frame contains fewer than 150 schools, all schools are sampled. A minimum of 50 schools are sampled for adjudicating entities (e.g., U.S. states that opted to participate separately in 2015). The list of selected schools, both original and replacement, is delivered to each education system's PISA national center. Countries and education systems do not draw their own samples.

        Step 3

        Each country/education system is responsible for recruiting the sampled schools. They begin with the original sample and use the replacement schools only if an original school refuses to participate. In accordance with PISA guidelines, the two schools neighboring each sampled school in the frame are designated as its replacements. Replacement schools are required to be in the same implicit stratum (i.e., have similar demographic characteristics) as the sampled school. A minimum participation rate of 65 percent of schools from the original sample is required for a country or education system's data to be included in the international database.
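
        Because the frame is sorted so that neighboring schools are demographically similar, designating each sampled school's two neighbors as its replacements keeps substitutes close to the original. Below is a minimal sketch of that assignment, assuming a hypothetical sorted frame (edge cases such as adjacent sampled schools are ignored for brevity).

        # Hypothetical frame, sorted so that neighboring schools are similar
        # (i.e., fall in the same implicit stratum).
        frame = [f"school_{i:04d}" for i in range(2000)]

        def assign_replacements(frame, sampled_positions):
            """Designate each sampled school's two neighbors in the sorted
            frame as its first and second replacements."""
            plan = {}
            for pos in sampled_positions:
                first = frame[pos + 1] if pos + 1 < len(frame) else None
                second = frame[pos - 1] if pos > 0 else None
                plan[frame[pos]] = (first, second)
            return plan

        plan = assign_replacements(frame, sampled_positions=[10, 250, 1999])
        print(plan["school_0010"])  # ('school_0011', 'school_0009')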

        Step 4

        After schools are sampled and agree to participate, students are sampled. Each country/education system submits student listing forms containing all age-eligible students for each of their schools using Key Quest, the internationally provided software.

        Step 5

        Westat carefully reviews the student lists, performing data validity checks that compare each list against what is known about the school (e.g., expected enrollment, gender distribution) and against PISA eligibility requirements (e.g., grade and birthday ranges). The selected student samples are then sent back to each national center. Unlike school sampling, students are not sampled with replacement.

        Step 6

        Schools inform students of their selection to participate and of the assessment day. Student participation must be at least 80 percent for a country's/education system's data to be reported by the OECD.
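
        Taken together, Steps 3 and 6 impose two participation thresholds: 65 percent of originally sampled schools and 80 percent of sampled students. A small sketch of the bookkeeping with made-up counts (unweighted for illustration; operational rates are weighted):

        def meets_pisa_thresholds(orig_schools, schools_participating,
                                  students_sampled, students_assessed):
            """Check the two PISA participation requirements described
            above (simplified, unweighted rates)."""
            school_rate = schools_participating / orig_schools
            student_rate = students_assessed / students_sampled
            return school_rate >= 0.65 and student_rate >= 0.80

        # Hypothetical counts: 105 of 150 original schools participate (70%);
        # 5,100 of 6,000 sampled students are assessed (85%).
        print(meets_pisa_thresholds(150, 105, 6000, 5100))  # True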
    • Which countries participate in PISA?
      • Countries and education systems within countries participate in PISA.

        • PISA 2015: 75 countries and education systems have signed up to participate.
        • PISA 2012: 65 countries and education systems participated.
        • PISA 2009: 75 countries and education systems participated (10 of these administered PISA 2009+ in 2010).
        • PISA 2006: 57 countries and education systems participated.
        • PISA 2003: 41 countries and education systems participated.
        • PISA 2000: 43 countries and education systems participated (11 of these administered PISA 2000 in 2001/2002).
        The list of countries and education systems that participated in each PISA cycle is available at: http://nces.ed.gov/surveys/pisa/countries.asp.
    • When is PISA data collected in the United States?
      • PISA is administered in the fall in the United States, typically in October and November of the assessment year. The 2015 data were collected in October and November 2015.
    • Where can I get a copy of the U.S. PISA reports?
    • When is PISA next scheduled to be administered?
      • The most recent administration of PISA was in 2015. In the United States, data collection occurred in October-November 2015. Results will be reported in December 2016. The next round of PISA will be administered in 2018.
    • How is the OECD Test for Schools related to PISA?
      • In 2012, the OECD piloted a new test, based on the PISA assessment frameworks and statistically linked to the PISA scales, for individual schools. The purpose of this test, called the OECD Test for Schools in the United States, is for individual schools to benchmark their performance internationally. While based on PISA, the OECD Test for Schools is a different assessment and has a different purpose than PISA. More information about this is available from the OECD at: http://www.oecd.org/pisa/aboutpisa/pisa-basedtestforschools.htm.
    • How does PISA differ from other international assessments?
      • PISA differs from these studies in several ways:

        Content
        PISA is designed to measure "literacy" broadly, while other studies, such as TIMSS and NAEP, have a stronger link to curriculum frameworks and seek to measure students' mastery of specific knowledge, skills, and concepts. The content of PISA is drawn from broad content areas, such as space and shape for mathematics, in contrast to more specific curriculum-based content such as geometry or algebra.

        Tasks
        In addition to the differences in purpose and age coverage between PISA and other international comparative studies, PISA differs from other assessments in what students are asked to do. PISA focuses on assessing students' knowledge and skills in reading, mathematics, and science literacy in the context of everyday situations. That is, PISA emphasizes the application of knowledge to everyday situations by asking students to perform tasks that involve interpretation of real-world materials as much as possible. Analyses based on expert panels' reviews of mathematics and science items from PISA, TIMSS, and NAEP indicate that PISA items require multi-step reasoning more often than either TIMSS or NAEP. These analyses also show that PISA mathematics and science literacy items often involve the interpretation of charts and graphs or other "real-world" material. These tasks reflect the underlying assumption of PISA: as 15-year-olds begin to make the transition to adult life, they need not only to comprehend what they read and to retain particular mathematical formulas or scientific concepts, but also to know how to apply their knowledge and skills in the many different situations they will encounter in their lives.

        Moreover, NAEP and PISA have different underlying approaches to mathematics that play out in the operationalization of items. NAEP focuses more closely on school-based curricular attainment, whereas PISA focuses on literacy, or the use of mathematics in real-world situations. The implication of this difference is that while the NAEP assessment is not devoid of real-world contexts, it does not specifically require them; thus it includes computation items as well as the kinds of problem-solving items U.S. students are likely to encounter in school. PISA includes no computation items (indeed, no items at all) that are not placed within a real-world context and, in that way, may be more unconventional to some students. PISA items also may have a heavier reading load, use a greater diversity of visual representations, and require students to make assumptions or sift through information that is irrelevant to the problem (i.e., to 'mathematize'), whereas NAEP items typically do not. These differences may help explain divergent trend results.

        A study comparing the PISA and NAEP (grades 8 and 12) reading assessments found that both view reading as a constructive process and measure similar cognitive skills. There are differences between them, though, reflecting in part the assessments' different purposes. First, NAEP uses longer reading passages than PISA and, because of that length, asks more questions about each passage. With regard to cognitive skills, NAEP places more emphasis on critiquing and evaluating text, while PISA places more emphasis on locating information. NAEP also measures students' understanding of vocabulary in context, whereas PISA includes no questions of this nature. Finally, NAEP relies more heavily on multiple-choice items than PISA, and the nature of the open-ended items differs: PISA's open-ended items call for less elaboration and textual support than NAEP's.

        To learn more about the differences in the respective approaches to the assessment of mathematics, science, and reading among PISA, TIMSS, and NAEP, see the cross-study comparison papers at http://nces.ed.gov/surveys/international/cross-study-comparisons.asp.

        Age-based sample
        The goal of PISA is to represent outcomes of learning rather than outcomes of schooling. By placing the emphasis on age, PISA intends to measure what 15-year-olds have learned inside and outside of school throughout their lives, not just in a particular grade. Focusing on age 15 provides an opportunity to measure broad learning outcomes while all students across the many participating nations are still required to be in school. Finally, because years of education vary among countries and education systems, choosing an age-based sample makes comparisons across countries and education systems somewhat easier.
    • How does the performance of U.S. students in mathematics and science on PISA compare with U.S. student performance on TIMSS?
      • Before talking about how the TIMSS results compare with the PISA results, it is important to recognize the ways in which TIMSS and PISA differ.

        While TIMSS and PISA both assess mathematics and science, they differ with respect to which students are assessed, what is measured, and the participating countries and educational jurisdictions.
        • TIMSS assesses younger students (4th- and 8th-graders) on their knowledge of specific mathematics and science topics and cognitive skills that are closely linked to the curricula of the participating countries. PISA assesses older students (15-year-old students) in mathematics literacy and science literacy, or how well they can apply their knowledge and skills to problems set in real-world contexts.
        • While there is some overlap in content, each assessment may have unique topics or different emphases and the nature of the items may differ as well, given their different focuses.
        • Not all countries have participated in TIMSS and PISA, or in all administrations of either assessment. Both TIMSS and PISA include developed and developing countries; however, TIMSS has a larger proportion of developing countries participating than PISA because PISA is principally a study of the member countries of the OECD—an intergovernmental organization of developed countries. All 34 OECD countries participate in PISA, but not all of these 34 countries participate in TIMSS.
        On TIMSS, U.S. students at grades 4 and 8 performed above the TIMSS scale average in both mathematics and science, unlike on PISA, where in 2012 U.S. 15-year-olds performed below the OECD average in mathematics and not measurably different from it in science. Five East Asian countries and education systems (Singapore, Korea, Hong Kong-China, Chinese Taipei, and Japan) outperformed the United States in mathematics and science in both TIMSS and PISA.
        • Mathematics. The 2011 TIMSS results showed that U.S. students' average mathematics scores for both 4th-graders and 8th-graders were above the TIMSS scale average, which is set at 500 for every administration of TIMSS at both grades. At the 8th grade, students in 6 countries and 5 states or provinces had higher mathematics scores than U.S. students on average: Singapore, Korea, Hong Kong-China, Chinese Taipei, Japan, the Russian Federation, Indiana, Massachusetts, Minnesota, North Carolina, and Quebec.
        • Science. The 2011 TIMSS results showed that U.S. students' average science scores for both 4th-graders and 8th-graders were above the TIMSS scale average, which is set at 500 for every administration of TIMSS at both grades. However, students in 8 countries and 4 states or provinces outperformed U.S. students at the 8th grade level: Chinese Taipei, Finland, Japan, Korea, Singapore, Russian Federation, Slovenia, Hong Kong-China, Colorado, Massachusetts, Minnesota, and Alberta, Canada.
    • Are PISA scores of individual students reported or available for analysis?
      • Student- and school-level data are available for download and analysis. However, the assessment methods used in international assessments produce valid scores only for groups, not individuals. To protect respondent privacy, individual students, principals, and teachers cannot be identified from the data. Data from PISA 2012 for all countries, including the United States, can be obtained from the OECD website at www.pisa.oecd.org. Data collected in the United States for PISA can be downloaded from: http://nces.ed.gov/pubsearch/getpubcats.asp?sid=098. Those interested in exploring the PISA data can use the PISA International Data Explorer (IDE) (http://nces.ed.gov/surveys/international/ide/), an online data tool that helps users create their own tables and figures.
    • Can you report PISA results for states?
      • Yes and no. The U.S. national PISA results are representative of the nation as a whole but not of individual states. Drawing a sample that is representative of all 50 individual states and the District of Columbia would require a much larger sample than the United States currently draws for international assessments, and considerable additional time and money. A state may elect to participate in PISA as an individual education system—as Connecticut, Florida, and Massachusetts did in 2012, and Massachusetts, North Carolina, and Puerto Rico did in 2015—and in that case a sample is drawn that is representative of that state.
  • About TIMSS (the Trends in International Mathematics and Science Study)
    • Who is in charge of TIMSS?
      • The National Center for Education Statistics (NCES), part of the U.S. Department of Education, is responsible for administering TIMSS in the United States and for representing the United States in international collaboration on TIMSS.

        The International Association for the Evaluation of Educational Achievement (IEA) coordinates TIMSS internationally. The IEA is an independent international cooperative of national research institutions and government agencies with nearly 70 member countries worldwide. The IEA has a permanent secretariat based in Amsterdam, and a data processing and research center in Hamburg, known as the IEA Data Processing Center (DPC).

        The IEA contracts with the TIMSS & PIRLS International Study Center at Boston College to lead the design and implementation of TIMSS. The TIMSS & PIRLS International Study Center works with country representatives, called National Research Coordinators, to design and implement TIMSS, assure quality control and international comparability, and report results. The U.S. National Research Coordinator is Stephen Provasnik of NCES. Data collection for TIMSS 2015 within the United States is done under contract with Westat, Inc.
    • Can my school sign up to participate in TIMSS?
      • Schools cannot sign up to participate in TIMSS as part of the national U.S. sample. It is important for fair comparisons across countries that each country only include in its national sample those schools and students scientifically sampled by the international contractor to fairly represent the country.
    • How does TIMSS select a representative sample of students?
      • To provide valid estimates of student achievement and characteristics, TIMSS selects a random sample of students that represents the full population of students in the target grades. This population is defined internationally as the following:
        Fourth-grade: all students enrolled in the grade that represents four years of formal schooling, counting from the first year of International Standard Classification of Education (ISCED) Level 1, provided that the mean age at the time of testing is at least 9.5 years.

        Eighth-grade: all students enrolled in the grade that represents eight years of formal schooling, counting from the first year of ISCED Level 1, provided that the mean age at the time of testing is at least 13.5 years.

        Twelfth-grade: All students in the final year of secondary schooling who are taking or have taken advanced mathematics or physics courses.
        TIMSS guidelines call for a minimum of 150 schools to be sampled per grade, with a minimum of 4,000 students assessed per grade. The school response rate target is 85 percent for all countries. A minimum participation rate of 50 percent of schools from the original sample of schools is required for a country's data to be included in the international database. The response rate target for classrooms is 95 percent, and the target student response rate is set at 85 percent, from both original and substitute schools.

        Countries are allowed to use substitute schools (selected during the sampling process) to increase the response rate once the 50 percent minimum participation rate among originally sampled schools has been reached. In accordance with TIMSS guidelines, the two schools neighboring each sampled school in the frame are designated as its substitutes, to be used if the original sampled school refuses to participate. Substitute schools are required to be in the same implicit stratum (i.e., have similar demographic characteristics) as the sampled school.

        U.S. sampling frame
        The TIMSS U.S. sample is drawn from the Common Core of Data (CCD) listing of public schools supplemented with the Private School Universe Survey (PSS) listing of private schools. The combination of these national listings has proven to be close to 100 percent complete.

        U.S. sampling design
        The U.S. TIMSS sample uses a stratified two-stage cluster sampling design. The U.S. sampling frame, or list of schools from which the sample is selected, is both explicitly and implicitly stratified (that is, sorted for sampling).

        The U.S. sampling frame is explicitly stratified by three categorical stratification variables: (1) the percentage of students eligible for free or reduced-price lunch, (2) school control (public or private), and (3) region of the country (Northeast, Central, West, Southeast). Explicit stratification completely controls the sample size for a specific variable or variables, so that the proportion of schools in each of the variable's subgroups exactly matches that of the population.

        The U.S. sampling frame is implicitly stratified by two categorical stratification variables: community type (city, suburb, town, or rural) and minority status (i.e., above or below 15 percent of the student population). Implicit stratification also controls the sample size for a specific variable or variables, but not completely, because it does not rely on independent random draws within each stratum as explicit stratification does. Instead, implicit stratification entails sorting the list of all schools by the implicit stratification variable(s) and taking a systematic sample. The sample's proportion of schools in each of the variable's subgroups will then be close to that of the population: systematic sampling considerably reduces the variability of the subgroup sample sizes, but does not reduce it to zero as explicit stratification does.

        Once the sampling frame has been stratified, the first stage of the sampling design uses a systematic "probability proportional to size" (PPS) technique to select schools for the original sample that are representative of the United States as a whole. The second stage of the sampling design consists of selecting intact mathematics classes within each participating school. All students in sampled classrooms are selected for assessment. In this way, the overall sample design for the United States is intended to approximate a self-weighting sample of students as much as possible, with each fourth- or eighth-grade student having an equal probability of selection.
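
        A minimal sketch of systematic PPS selection within one sorted (implicitly stratified) stratum, using a made-up frame; the operational design applies this within each explicit stratum, and a school large enough to span more than one selection interval can be drawn with certainty.

        import random

        random.seed(7)

        # Hypothetical stratum: schools sorted by an implicit stratifier
        # (community type), with enrollment as the measure of size.
        stratum = sorted(
            [{"id": f"school_{i}",
              "size": random.randint(20, 150),
              "community": random.choice(["city", "suburb", "town", "rural"])}
             for i in range(500)],
            key=lambda s: s["community"])

        def systematic_pps(schools, n):
            """Systematic PPS: lay the schools along a line scaled by size,
            then take n equally spaced draws from a random start."""
            total = sum(s["size"] for s in schools)
            interval = total / n
            start = random.uniform(0, interval)
            picks, cum, i = [], 0.0, 0
            for k in range(n):
                target = start + k * interval
                while cum + schools[i]["size"] < target:
                    cum += schools[i]["size"]
                    i += 1
                picks.append(schools[i]["id"])
            return picks

        print(systematic_pps(stratum, n=10))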
    • How many U.S. schools and students participated in previous TIMSS cycles?
      • At grade 4
        Assessment year   Participating schools   Participating students   Overall weighted response rate (percent)
        1995              182                     7,296                    80
        2003              248                     9,829                    78
        2007              257                     7,896                    84
        2011              369                     12,569                   80

        At grade 8
        Assessment year   Participating schools   Participating students   Overall weighted response rate (percent)
        1995              183                     7,087                    78
        1999              221                     9,072                    85
        2003              232                     8,912                    73
        2007              239                     7,377                    77
        2011              501                     10,477                   81

        At grade 12
        Assessment year        Participating schools   Participating students   Overall weighted response rate (percent)
        1995
          Advanced mathematics 199                     2,349                    67
          Physics              203                     2,678                    68

        NOTE: The overall weighted response rate is the product of the school participation rate, after replacement, and the student participation rate, after replacement. There was no grade 4 assessment in 1999.
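
        The NOTE's definition is a simple product of the two component rates. For example, with hypothetical rates:

        # Overall weighted response rate = school participation rate (after
        # replacement) x student participation rate (after replacement).
        school_rate = 0.87   # hypothetical: 87% of schools, after substitutes
        student_rate = 0.92  # hypothetical: 92% of sampled students assessed
        print(f"{school_rate * student_rate:.0%}")  # 80%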
    • Have there been changes in the countries participating in TIMSS?
      • Yes. Please follow this link to a table of all TIMSS participating countries and non-national education systems for each of the TIMSS years of administration.
    • If the makeup of the countries changes across the years, how can one compare countries to the TIMSS scale average?
      • Achievement results from TIMSS are reported on a scale from 0 to 1,000, with a TIMSS scale average of 500 and standard deviation of 100. The scale is based on the 1995 results, and the results of all subsequent TIMSS administrations have been placed on this same scale. This allows countries to compare their performance over time as well as to compare with a set standard, the TIMSS scale average.
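
        Conceptually, placing each administration on the 1995-based metric amounts to a linear transformation of the underlying proficiency estimates. The sketch below shows only that final linear step with made-up numbers; the actual TIMSS scaling uses item response theory and a concurrent-calibration linking procedure.

        def to_timss_scale(theta, mean_1995=0.0, sd_1995=1.0):
            """Map a proficiency estimate onto the TIMSS reporting metric,
            which fixes the 1995 distribution at mean 500, SD 100
            (an illustrative simplification of the linking procedure)."""
            return 500 + 100 * (theta - mean_1995) / sd_1995

        print(to_timss_scale(0.35))   # 535.0: a third of a 1995 SD above average
        print(to_timss_scale(-1.0))   # 400.0: one 1995 SD below average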
    • What areas of mathematics and science are assessed in TIMSS?
      • At grade 4, TIMSS focuses on three domains of mathematics:
        • numbers (manipulating whole numbers and place values; performing addition, subtraction, multiplication, and division; and using fractions and decimals),
        • geometric shapes and measures, and
        • data display.
        At grade 8, TIMSS focuses on four domains of mathematics:
        • numbers,
        • algebra,
        • geometry, and
        • data and chance.
        At grade 12, TIMSS focuses on three domains of advanced mathematics:
        • algebra,
        • calculus, and
        • geometry.
        At grade 4, TIMSS focuses on three domains of science:
        • life science,
        • physical science, and
        • Earth science.
        At grade 8, TIMSS focuses on four domains of science:
        • biology,
        • chemistry,
        • physics, and
        • Earth science.
        At grade 12, TIMSS focuses on three domains of advanced physics:
        • mechanics and thermodynamics,
        • electricity and magnetism, and
        • wave phenomena and atomic/nuclear physics.
    • How do the results of TIMSS compare with the results in PISA?
      • The TIMSS 2011 results at 8th grade, the grade closest to the age of the PISA students, showed U.S. average scores higher than the TIMSS scale average in both mathematics and science. In PISA 2009, the average scores of U.S. 15-year-old students were below (in mathematics) or not measurably different (in science) from the OECD average—the average score of students in the 34 Organization for Economic Cooperation and Development countries. How do we reconcile the apparent differences?

        The results from TIMSS and PISA are difficult to compare because the assessments are so different in at least three key ways that could influence results. First, TIMSS assesses 8th- and 4th-graders, while PISA is an assessment of 15-year-old students, regardless of grade level. (In the United States, PISA data collection occurs in the autumn, when most 15-year-olds are in 10th grade.) So, the grade levels of students in PISA and TIMSS differ. Second, the knowledge and skills measured in the two assessments differ. TIMSS is intended to measure how well students have learned the mathematics and science curricula in participating countries, whereas PISA is focused on application of knowledge to “real-world” situations. Third, the participating countries in the two assessments differ. Both assessments cover much of the world, but they do not overlap neatly. Only 25 of the 42 participating education systems in TIMSS 2011 at the 8th-grade level participated in the PISA 2009 assessment of 15-year-olds. Both assessments include key economic competitors and partners, but the overall makeups of the countries participating in the two assessments differ markedly. Thus, the “averages” used by the two assessments are in no way comparable, and the “rankings” often reported in media coverage of these two assessments are based on completely different sets of countries.

        To learn more about how the TIMSS assessment differs from PISA as well as NAEP, see the following paper: Comparing TIMSS with NAEP and PISA in Mathematics and Science (2007) PDF (281 KB)
    • How does the mathematics and science achievement of U.S. students on TIMSS compare with achievement on NAEP?
      • Both TIMSS and NAEP provide a measure of fourth- and eighth-grade mathematics and science learning. It is natural to compare them, but the distinctions described below need to be kept in mind in understanding the converging or diverging results.

        Mathematics
        The most recent results from NAEP and TIMSS include information on trends over time in fourth- and eighth-grade mathematics achievement for a similar time interval: in NAEP between 1996 and 2011 and in TIMSS between 1995 and 2011.
        Both assessments showed statistically significant increases in the mathematics performance of fourth- and eighth-grade students between these years.
        Science
        The most recent results from TIMSS provide trend information for fourth- and eighth-grade science achievement between 1995 and 2011. In contrast, NAEP provides trends only for fourth grade between 1996 and 2005 and for eighth grade between 1996 and 2005 as well as between 2009 and 2011. (Due to a major revision of the NAEP Science Framework in 2009, no trend comparison can be made between 1996 and 2011.) Compared with mathematics, the available trends shown by NAEP and TIMSS in science are less consistent with one another.
        In fourth grade, NAEP shows that there was an increase in students' science performance overall between 1996 and 2005, whereas TIMSS did not detect any change in performance from 1995 to 2007. In eighth grade, NAEP detected no measurable change between 1996 and 2005 but showed an increase in students' science performance overall between 2009 and 2011. In contrast, TIMSS detected an increase between 1995 and 2003 and between 1995 and 2011, but detected no measurable change between 2003 and 2011 or between 2007 and 2011.
        For cross-study comparison reports, please see http://nces.ed.gov/surveys/international/cross-study-comparisons.asp.

        To learn more about how TIMSS compares to NAEP, see the following paper for a framework comparison: Comparison of TIMSS 2011 Items and the NAEP 2011 Framework (2011) PDF (756 KB)
    • How does TIMSS differ from other international assessments and NAEP?
    • Can you directly compare TIMSS scores at grade 4 to scores at grade 8?
      • The scaling of TIMSS data is conducted separately for each grade and each content domain. While the scales were created to each have a mean of 500 and a standard deviation of 100, the subject matter and the level of difficulty of items necessarily differ between the assessments at both grades. Therefore, direct comparisons between scores across grades should not be made.
    • On TIMSS, why do U.S. boys outperform girls in mathematics at grade 4 but not at grade 8, and U.S. boys outperform girls in science at grade 8 but not at grade 4? Why aren't differences between the sexes more consistent?
      • The seeming inconsistencies between the achievement scores of U.S. boys and girls in mathematics and science are not easily explainable. Research into differences in achievement by sex has been unable to offer any definitive explanation for these differences. For example, Xie and Shauman (2003), in examining sex differences primarily at the high school level, find that "differences in math and science achievement cannot be explained by the individual and familial influences that we examine." Indeed, that sex differences vary in the participating TIMSS countries—some in favor of males and others in favor of females—would appear to support the idea that the factors related to sex differences in mathematics and science achievement are complicated.

        Xie, Y., & Shauman, K. (2003). Women in Science: Career Processes and Outcomes. Cambridge, MA: Harvard University Press.
    • When are TIMSS data collected?
      • TIMSS operates on a 4-year cycle, with 1995 being the first year it was administered. Countries in the Northern Hemisphere conduct the assessment between April and June of the assessment year, while countries in the Southern Hemisphere conduct the assessment in October and November of the assessment year. In both hemispheres the assessment is conducted near the end of the school year.
    • Where can I get a copy of the TIMSS U.S. Report?
    • When is TIMSS scheduled to be administered next?
      • TIMSS is scheduled to be administered next in 2015, with results to be reported at the end of 2016.
    • Can my state or school district or school sign up to obtain its own TIMSS results, independent of the U.S. results?
      • Yes, states, school districts, and schools can sign up to obtain their own TIMSS results at their own cost. Sample size restrictions apply. Please contact Stephen Provasnik, the U.S. TIMSS National Research Coordinator, for more information.
  • About TALIS (Teaching and Learning International Survey)
    • What's new in 2013?
      • TALIS 2013 will offer new questionnaire items not included in TALIS 2008, though a number of items used in 2008 will be retained to allow reporting of changes over time. In addition, TALIS 2013 will include teachers of students with special needs, who were not included in 2008.
    • What is the target population for TALIS 2013?
      • The core target population in TALIS is ISCED* Level 2 teachers and school principals. ISCED Level 2 corresponds to grades 7, 8, and 9 in the United States.

        At selected schools, any staff member with instructional responsibilities for grade 7, 8, and/or 9 students—whether with a whole class or a single student—is eligible to be selected. This means that classroom teachers as well as "pull-out" and "push-in" instructors will be included in the study.

        Once a school is selected, we ask that the principal (or head administrator) and a random selection of up to 22 teachers complete online questionnaires. If a school has fewer than 22 teachers who teach students in grades 7, 8, or 9, all of its teachers are selected. Schools and teachers are randomly selected in order to ensure that participating schools and teachers truly represent the variety of schooling available in the country; the selection rule is sketched at the end of this answer.

        *ISCED stands for the International Standard Classification of Education. Details on the ISCED classification system can be found at http://www.unesco.org/education/information/nfsunesco/doc/isced_1997.htm.
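
        A minimal sketch of the within-school teacher selection rule described above, using a hypothetical teacher roster (the operational sampling software applies additional eligibility checks):

        import random

        def select_teachers(roster, cap=22):
            """Randomly select up to `cap` eligible teachers; if the school
            has `cap` or fewer, all are selected."""
            return list(roster) if len(roster) <= cap else random.sample(roster, cap)

        eligible = [f"teacher_{i}" for i in range(35)]  # teach grades 7, 8, or 9
        print(len(select_teachers(eligible)))  # 22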
    • What sorts of topics does TALIS 2013 explore?
      • TALIS 2013 explores the following topics: teachers' and principals' backgrounds and characteristics; teacher and principal professional experience and development; school staffing, funding, and student body characteristics; school leadership and management; teacher appraisal and feedback; teacher induction and mentoring; teachers' subject matter (class) assignments; teachers' instructional approaches and assessment practices; teachers' and principals' job satisfaction; and school climate.
    • How are the TALIS instruments developed?
      • Survey items for the TALIS questionnaires are developed in a collaborative, international process.
        Led by the OECD and its contractors, national representatives from each participating country meet several times throughout the year to develop and refine the survey items. The themes of TALIS (e.g., teacher evaluation and feedback) are based on the policy and research priorities of all participating countries. Based on these priorities, new items are developed and items from previous rounds are reviewed. All items are included in a field trial conducted in every participating country. The field trial is used to identify items that do not function as designed, that may be inappropriate in some national contexts, or that do not translate easily. The final survey instruments are then reviewed and approved by the national representatives.

        There is an extensive translation verification process.
        Each participating country is responsible for translating the survey instruments into its own language or languages, unless the original survey items are already in the language of the country. Each country identifies translators to translate the source versions, and external translation companies independently review each country's translations. Instruments are verified twice, once before the field test and again before the main data collection. Statistical analyses of the item data are then conducted to check for evidence of differences in response patterns across countries that could indicate a translation problem. If a translation problem with an item is discovered in the field test, the item is removed from the final survey instruments.
    • How is cross-country comparability monitored?
      • Procedures for administration are standardized and independently verified.
        TALIS is designed, developed, and implemented by international organizations that have extensive experience in large-scale international data collection projects. These coordinating organizations produce a number of manuals that are provided to each country's representatives for the administration of the questionnaires. The manuals specify standardized procedures that all countries must follow for all aspects of sampling, preparation, administration, and scoring. Each country organizes its own quality control monitors to observe the survey administration process, and the OECD organizes an independent group of quality control monitors as well. Instances in which the quality of the data collected cannot be independently verified can lead to the data being omitted from international reports.
    • Is participation mandatory?
      • Participation in TALIS is entirely voluntary. However, because potential respondents are randomly chosen to represent others like themselves, the participation of each chosen respondent is very important to obtaining accurate results. Respondents may also skip individual items if they wish.
[Show All] International Exchange Programs and Foreign Study