The literacy tasks in ALL and the adults asked to participate in the survey were both samples drawn from their respective universes, and as such were subject to a measurable degree of uncertainty. ALL implemented procedures to minimize both sampling and nonsampling errors. The ALL sampling design and weighting procedures ensured that participants’ responses could be generalized to the population of interest. Quality control activities were employed during interviewer training, data collection, and processing of the survey data.
Because ALL employed probability sampling, the results were subject to sampling error. Although small, this error was larger in ALL than in most studies because surveying adults in their homes is expensive, and most countries simply could not afford large sample sizes.
Each country provided a set of replicate weights for use in a jackknife variance estimation procedure.
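The replicate-weight approach can be illustrated with a minimal sketch. The scores, weights, and the JK2-style scale factor below are illustrative assumptions, not ALL data; actual designs may multiply the sum of squared deviations by a design-specific constant.

```python
# Sketch of jackknife variance estimation with replicate weights.
# All numbers here are toy values, not ALL survey data.

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def jackknife_variance(values, full_weights, replicate_weights):
    """Variance of a weighted mean from a set of replicate weights.

    Assumes a JK2-style estimator (scale factor 1); other designs
    multiply the sum of squared deviations by a design constant.
    """
    theta_full = weighted_mean(values, full_weights)
    deviations = [
        (weighted_mean(values, rw) - theta_full) ** 2
        for rw in replicate_weights
    ]
    return sum(deviations)

# Toy example: 4 respondents, 2 replicates (real surveys use 30 or more).
scores = [250.0, 280.0, 310.0, 220.0]
full_w = [1.0, 1.0, 1.0, 1.0]
reps = [
    [0.0, 2.0, 1.0, 1.0],   # replicate 1: unit 1 dropped, its pair doubled
    [1.0, 1.0, 0.0, 2.0],   # replicate 2: unit 3 dropped, its pair doubled
]
variance = jackknife_variance(scores, full_w, reps)
standard_error = variance ** 0.5
```

Each replicate re-estimates the statistic with one variance unit removed and its pair reweighted; the spread of the replicate estimates around the full-sample estimate measures the sampling variance.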
The key sources of nonsampling error in ALL were differential coverage across countries and nonresponse bias, which occurred when different groups of sampled individuals failed to participate in the survey. Other potential sources of nonsampling error included deviations from prescribed data collection procedures and errors of logic that resulted from mapping idiosyncratic national data into a rigid international format. Scoring error, associated with scoring open-ended tasks reliably within and between countries, also occurred. Finally, because ALL data were collected and processed independently by the various countries, the study was subject to uneven levels of commonplace errors in data capture, data processing, and coding.
Coverage error. The design specifications for ALL stated that in each country the study should cover the civilian, noninstitutionalized population ages 16 to 65. It is the usual practice to exclude the institutionalized population from national surveys because of the difficulties in conducting interviews in institutional settings. Similarly, it is not uncommon to exclude certain other parts of a country’s population that pose difficult survey problems (e.g., persons living in sparsely populated areas). The intended coverage of the surveys generally conformed well to the design specifications: each of the ALL countries attained a high level of population coverage. However, it should be noted that actual coverage is generally lower than the intended coverage because of deficiencies in sampling frames and sampling frame construction (e.g., failures to list some households and some adults within listed households).
Nonresponse error. For ALL, several procedures were developed to reduce biases due to nonresponse, based on how much of the survey the respondent completed.
Unit nonresponse. The definition of a respondent for ALL was a person who partially or fully completed the background questionnaire. Unweighted response rates varied considerably from country to country, ranging from a high of 82 percent (Bermuda) to a low of 40 percent (Switzerland). The United States had an unweighted response rate of 66 percent (see table ALL-1).
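The unweighted response rate follows directly from the figures in table ALL-1: respondents divided by in-scope cases (the initial sample minus out-of-scope cases). A quick check with the U.S. numbers:

```python
# Unweighted response rate from the U.S. figures in table ALL-1.
initial_sample = 7045
out_of_scope = 1846
respondents = 3420

eligible = initial_sample - out_of_scope   # 5,199 in-scope cases
response_rate = respondents / eligible     # about 0.658
print(f"{response_rate:.0%}")              # prints "66%"
```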
Several precautions were taken against nonresponse bias. Interviewers were specifically instructed to return several times to nonrespondent households in order to obtain as many responses as possible. In addition, all countries were asked to ensure that the address information provided to interviewers was as complete as possible in order to reduce potential household identification problems.
Item nonresponse. Not-reached responses were classified into two groups: nonparticipation immediately or shortly after the background information was collected, and premature withdrawal from the assessment after a few cognitive items were attempted. The first type of not-reached response varied a great deal across countries, depending on the frames from which the samples were selected. The second type resulted from quitting the assessment early, leaving incomplete cognitive data. Not-reached items were treated as providing no information about the respondent’s proficiency, so they were not included in the calculation of likelihood functions for individual respondents. Therefore, not-reached responses had no direct impact on the proficiency estimation for subpopulations; their impact on the proficiency distributions was mediated through the subpopulation weights.
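The treatment of not-reached items can be sketched with a toy item response model. The two-parameter logistic (2PL) form and the item parameters below are illustrative assumptions; the actual ALL scaling models are not reproduced here. The point is only that a not-reached item contributes no factor to the respondent’s likelihood:

```python
import math

# Illustrative 2PL item response model (an assumption for this sketch;
# ALL's actual scaling models and item parameters are not shown here).
# Not-reached items are coded None and skipped, so they contribute
# nothing to the likelihood.

def p_correct(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, item_params):
    """Log-likelihood of a response pattern; None means not reached."""
    ll = 0.0
    for resp, (a, b) in zip(responses, item_params):
        if resp is None:          # not reached: excluded entirely
            continue
        p = p_correct(theta, a, b)
        ll += math.log(p) if resp == 1 else math.log(1.0 - p)
    return ll

# Hypothetical item parameters (discrimination a, difficulty b).
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5), (1.1, 1.0)]
attempted = [1, 0, 1, None]       # last item never reached
# The likelihood is identical to that of a three-item pattern:
assert log_likelihood(0.0, attempted, items) == log_likelihood(
    0.0, [1, 0, 1], items[:3])
```

Because the not-reached item is dropped rather than scored as incorrect, it neither raises nor lowers the individual’s estimated proficiency; its only influence on population results comes through the weighting described above.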
Measurement error. Assessment tasks were selected to ensure that each literacy domain (prose, document, numeracy, and problem solving) was well covered across population subgroups in terms of difficulty, stimulus type, and content. The ALL item pool was developed collectively by participating countries. Items were subjected to a detailed expert analysis at ETS and vetted by participating countries to ensure that they were culturally appropriate and broadly representative of the population being tested. For each country, experts fluent in both English and the language of the test reviewed the items and identified any that had been improperly adapted. Countries were asked to correct problems detected during this review. To ensure that all of the final survey items had a high probability of functioning well, and to familiarize participants with the unusual operational requirements of data collection, each country was required to conduct a pilot survey.
Although the pilot surveys were small and typically were not based strictly on probability samples, the information they generated enabled ETS to reject items, to suggest modifications to a few items, and to choose good items for the final assessment. ETS’s analysis of the pilot survey data and recommendations for final test design were presented to and approved by participating countries.
Table ALL-1. Sample size and response rate for the United States for the Adult Literacy and Life Skills Survey: 2003

| Country | Population ages 16 to 65 (millions) | Initial sample size | Out-of-scope cases¹ | Number of respondents² | Unweighted response rate (percent) |
|---|---|---|---|---|---|
| United States | 184 | 7,045 | 1,846 | 3,420 | 66 |

¹ Out-of-scope cases are those where the residents were not eligible for the survey, the dwelling could not be located, the dwelling was under construction, the dwelling was vacant or seasonal, or the cases were duplicates.

² A respondent’s data are considered complete for the purposes of the scaling of a country’s psychometric assessment data provided that at least the Background Questionnaire variables for age, gender, and education have been completed.

SOURCE: Desjardins, R., Murray, S., Clermont, Y., and Werquin, P. (2005). Learning a Living: First Results of the Adult Literacy and Life Skills Survey. Ottawa, Canada: Statistics Canada.