
Program for the International Assessment of Adult Competencies (PIAAC)



5. DATA QUALITY AND COMPARABILITY

This section provides information for rounds 1 (main study), 2 (national supplement), and 3 (2017 study sample).

Two broad categories of error occur in estimates generated from surveys: sampling and nonsampling errors.

Sampling Error

Sampling error is the uncertainty that exists because population estimates are based on a sample rather than a census. Clustering effects can cause additional uncertainty in estimates that cannot be handled by conventional formulas for variance estimation.
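
As a rough illustration of why clustering matters, the Kish design-effect approximation translates an average cluster size and an intracluster correlation into an effective sample size. The sketch below is a toy example; the cluster size and correlation values are made up and are not PIAAC design parameters.

```python
# Toy illustration (not PIAAC data) of how clustering inflates sampling error:
# deff = 1 + (m - 1) * rho grows with cluster size m and intracluster
# correlation rho, shrinking the effective sample size n / deff.
def design_effect(cluster_size: float, intracluster_corr: float) -> float:
    return 1 + (cluster_size - 1) * intracluster_corr

n = 8670                          # nominal number of respondents (main study)
deff = design_effect(10, 0.05)    # assumed: 10 respondents per segment, rho = 0.05
print(deff, round(n / deff))      # 1.45 and an effective sample of about 5979
```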

Another procedure that affects variances, and that is not captured by standard estimation approaches, is estimation through Item Response Theory (IRT) models. Because different respondents take different sets of items that can differ in difficulty, it is inappropriate to base the competency estimates simply on the number of correct answers obtained. The IRT model uses the item responses for each individual, treating the latent literacy score as random, and generates several predicted (plausible) values, which introduce additional variation. Given these complexities, the Consortium specified standards in the Technical Standards and Guidelines (TS&Gs) regarding the creation of special weights to facilitate computation of sampling error estimates for PIAAC. For these reasons, PIAAC provides estimates of standard errors using a stratified jackknife replication approach.
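
The sketch below illustrates, under simplified assumptions, how replicate weights and plausible values are typically combined into a single standard error: the sampling variance is averaged over plausible values and the variation among the plausible-value estimates is added as imputation variance. The function names, the jackknife factor of 1.0, and the input layout are illustrative assumptions, not PIAAC specifications.

```python
import numpy as np

def weighted_mean(values, weights):
    """Weighted mean of a proficiency variable (e.g., one plausible value)."""
    return np.sum(weights * values) / np.sum(weights)

def jackknife_pv_standard_error(pv_matrix, full_weight, replicate_weights, jk_factor=1.0):
    """
    Combine jackknife replication and plausible-value variation for a weighted mean.

    pv_matrix         : (n_respondents, M) array of plausible values
    full_weight       : (n_respondents,) final sample weights
    replicate_weights : (n_respondents, R) jackknife replicate weights
    jk_factor         : multiplier on squared replicate deviations; depends on the
                        replication scheme (1.0 is assumed here)
    Returns the point estimate and its estimated standard error.
    """
    n, M = pv_matrix.shape
    R = replicate_weights.shape[1]

    # Point estimate for each plausible value using the full-sample weight
    pv_estimates = np.array([weighted_mean(pv_matrix[:, m], full_weight) for m in range(M)])
    point_estimate = pv_estimates.mean()

    # Sampling variance: replicate variance averaged over plausible values
    sampling_vars = np.empty(M)
    for m in range(M):
        rep_estimates = np.array(
            [weighted_mean(pv_matrix[:, m], replicate_weights[:, r]) for r in range(R)]
        )
        sampling_vars[m] = jk_factor * np.sum((rep_estimates - pv_estimates[m]) ** 2)
    sampling_variance = sampling_vars.mean()

    # Imputation variance: variation of the estimate across plausible values
    imputation_variance = np.sum((pv_estimates - point_estimate) ** 2) / (M - 1)

    total_variance = sampling_variance + (1 + 1 / M) * imputation_variance
    return point_estimate, np.sqrt(total_variance)
```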

Nonsampling Error

Nonsampling error comprises all sources of error other than sampling error. There are three components of nonsampling error: (1) frame error, (2) measurement error, and (3) nonresponse error, with nonresponse bias being a key indicator of the latter.

Unit nonresponse for the Main Study and National Supplement to the Main Study. The PIAAC samples were subject to unit nonresponse at the screener, background questionnaire, and assessment (including reading components) stages, as well as item nonresponse to background questionnaire items. Both the screener and the background questionnaire had a unit response rate below 85 percent and thus required an analysis of the potential for nonresponse bias according to the National Center for Education Statistics (NCES) statistical standards.

For the U.S., the final screener response rate was 84.7 percent weighted (main study and national supplement combined). Based on the screener data, 10,668 respondents age 16 to 65 were selected to complete the background questionnaire and the assessment; 8,670 actually completed the background questionnaire. The final response rate for the background questionnaire, which included respondents who completed it and respondents who were unable to complete it because of a language problem or mental disability, was 80.9 percent weighted.

Of the 8,670 adults age 16 to 65 who completed the background questionnaire, 8,367 completed the adult literacy assessment. The final response rate for the overall assessment, which included respondents who answered at least one question on each scale and the respondents who were unable to do so because of a language problem, mental disability, or technical problem, was 98.8 percent weighted.

The final U.S. household reporting sample, including the imputed cases, consisted of 8,670 respondents.

Unit nonresponse for the 2017 Study Sample. For the U.S., the final screener response rate was 74.9 percent weighted. Based on the screener data, 4,769 respondents age 16 to 65 were selected to complete the background questionnaire and the assessment; 3,660 actually completed the background questionnaire. The final response rate for the background questionnaire, which included respondents who completed it and respondents who were unable to complete it because of a language problem or mental disability, was 76.3 percent weighted.

Of the 3,660 adults age 16 to 65 who completed the background questionnaire, 3,406 completed the adult literacy assessment. The final response rate for the overall assessment, which included respondents who answered at least one question on each scale and the respondents who were unable to do so because of a language problem, mental disability, or technical problem, was 98.0 percent weighted.

The final U.S. household reporting sample, including the imputed cases, consisted of 3,660 respondents.

Unit nonresponse for the Prison Study. Of the 1,546 sampled inmates, 1,315 completed the background questionnaire. The final response rate for the background questionnaire, which included respondents who completed it and respondents who were unable to complete it because of a literacy-related barrier, was 85.8 percent weighted.

Of the 1,315 inmates who completed the background questionnaire, 1,274 completed the assessment. The final response rate for the overall assessment was 97.7 percent weighted.

The overall weighted response rate for the prison study was 82.2 percent (treating substitute prisons as nonresponse). The final prison reporting sample consisted of 1,319 respondents, including 1,315 respondents who completed the background questionnaire plus the 4 respondents who were unable to complete the background questionnaire for literacy-related reasons.

Nonresponse error. Nonresponse bias is a key indicator of nonresponse error and can be substantial when two conditions hold: (1) the response rate is relatively low, and (2) the difference between the characteristics of respondents and nonrespondents is relatively large. The nonresponse bias analyses of the PIAAC household samples in the United States revealed differences in the characteristics of respondents who participated in the background questionnaire compared with those who refused. In a bivariate unit-level analysis at the background questionnaire stage, estimated percentages for respondents were compared with those for the total eligible sample to identify any potential bias owing to nonresponse. Multivariate analyses were conducted to further explore the potential for nonresponse bias by identifying the domains with the most differential response rates.
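
The two conditions can be seen in the standard deterministic approximation for the bias of a respondent-based mean, sketched below with illustrative numbers that are not PIAAC results.

```python
# Minimal numerical sketch of the deterministic nonresponse-bias approximation
# bias(respondent mean) ≈ (1 − response_rate) × (mean_respondents − mean_nonrespondents).
# All input values below are illustrative, not PIAAC estimates.

def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Approximate bias of the respondent mean relative to the full-sample mean."""
    return (1 - response_rate) * (mean_respondents - mean_nonrespondents)

# Example: an 80 percent response rate combined with a 10-point gap between
# respondents and nonrespondents yields a bias of about 2 points.
print(round(nonresponse_bias(0.80, 270.0, 260.0), 1))  # 2.0
```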

For the main study, these analyses revealed that the subgroup with the lowest response rates for the background questionnaire had a combination of the following characteristics:

  • Hispanics age 26 and older;
  • with no children under age 16 in the household;
  • not living in the Northeastern United States;
  • living in segments with unemployment exceeding 4.8 percent; and
  • living in areas (census tracts) with less than 5.1 percent of the population being linguistically isolated.

The presence of children under age 16 in the household was the dominant variable in distinguishing response rate groups. In general, younger persons were found to be more likely to participate, as were those with children age 16 and younger, and women.

For the national supplement household area sample, analyses identified that the lowest response rate was for a combination of the following characteristics:

  • with no children under age 16 in the household;
  • not unemployed (age 16 to 34) or older (age 66 to 74);
  • living in census tracts in which the employment rate exceeds 64.53 percent;
  • living in the Northeastern United States;
  • living in census tracts in which more than 2.42 percent of the population is foreign born;
  • persons age 25 to 34 or older than 55; and
  • living in census tracts in which the unemployment rate is 4.48 percent or less.

The presence of children under age 16 in the household was the dominant variable in distinguishing response rate groups.

For the national supplement household list sample, analyses identified that the lowest response rate was for a combination of the following characteristics:

  • living in a Metropolitan Statistical Area;
  • female;
  • living in the Western and Northeastern United States;
  • living in census tracts in which less than 28.57 percent of the population has a high school education; and
  • with no children under age 16 in the household.

The indicator of whether a sampled person resided in a Metropolitan Statistical Area was the dominant variable in distinguishing response rate groups.

No nonresponse bias analysis was needed for the prison study because the weighted response rates for all data collection stages and all background questionnaire items were above the 85 percent response rate requirement.

The variables found to be significant in the bivariate analysis, those used to define areas with low response rates, were used in weighting adjustments. The analysis showed that weighting adjustments were highly effective in reducing the bias. The overall conclusion from the PIAAC study on nonresponse bias is that some minimal potential for nonresponse bias exists in the PIAAC estimates; however, the analysis shows that the bias is negligible.
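
The sketch below shows, in simplified form, the kind of weighting-class adjustment this describes: within cells defined by the significant variables, respondent weights are inflated so that respondents carry the weighted total of the cell's eligible sample. The column and variable names are illustrative, not PIAAC variable names, and each cell is assumed to contain at least one respondent.

```python
import pandas as pd

def weighting_class_adjustment(sample: pd.DataFrame, cell_vars: list) -> pd.Series:
    """
    sample must contain:
      base_weight : design weight for every eligible sampled person
      responded   : True for background-questionnaire respondents
    Within each cell defined by cell_vars, respondent weights are multiplied by
    (weighted eligible total) / (weighted respondent total); nonrespondents keep
    their base weight here and are dropped from estimation afterwards.
    """
    respondent_weight = sample["base_weight"].where(sample["responded"], 0.0)
    cell_keys = [sample[v] for v in cell_vars]

    eligible_total = sample["base_weight"].groupby(cell_keys).transform("sum")
    respondent_total = respondent_weight.groupby(cell_keys).transform("sum")

    factor = (eligible_total / respondent_total).where(sample["responded"], 1.0)
    return sample["base_weight"] * factor

# Illustrative cells: census region crossed with presence of children under 16.
# sample["adjusted_weight"] = weighting_class_adjustment(sample, ["region", "has_child_under_16"])
```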

Item nonresponse. Since all items had a response rate greater than 85 percent, the potential for bias due to item nonresponse was considered negligible.

Data Comparability

Trend comparisons over time can be conducted for the total adult population in the areas of literacy and numeracy. In literacy, comparisons are made between PIAAC (2012/2014, 2017) and both ALL (2003-2008) and IALS (1994-1998). In numeracy, trend comparisons are made between PIAAC (2012/2014, 2017) and ALL (2003-2008). In both the literacy and numeracy domains, approximately 60 percent of the items are common between PIAAC and previous international surveys to ensure the comparability of these domains.

Table PIAAC-1. U.S. weighted response rates (percent): PIAAC 2012, 2014, and 2017

Component                              | Main study sample (round 1) | Household area sample (round 2) | Household list sample (round 2) | Prison sample (round 2) | 2017 study sample (round 3)
Screener                               | 87                          | 81                              | 85                              | †                       | 75
Background questionnaire               | 82                          | 78                              | 93                              | 86                      | 76
Assessment (without reading component) | 99                          | 99                              | 99                              | 98                      | 98
Overall                                | 70                          | 63                              | 78                              | 82                      | 56

† Not applicable.
SOURCE: PIAAC publication NCES 2016-036REV and NCES 2020-224; available at https://nces.ed.gov/pubsearch/getpubcats.asp?sid=113.