Statistical Standards Program
APPENDIX B: EVALUATING THE IMPACT OF IMPUTATIONS FOR ITEM NONRESPONSE
The alternative to ignoring missing item responses is to adopt a strategy to "fill in," or impute, the missing responses. A number of different methods have been proposed and used in survey research. Before discussing the specific methods and the relative advantages and disadvantages of each one, it is worthwhile to consider the pros and cons of explicit imputations in general.

Most authors in this area caution that imputations carry both potentially positive and negative outcomes. For example, Kalton and Kasprzyk, 1982, identified three positive aspects of explicit imputations. First, they are intended to reduce biases from item nonresponse in sample survey data. Second, by filling in the holes, they allow analyses to proceed as though the data set were complete, thus making analysis easier to conduct and results easier to report. Third, they yield consistent results across analyses, because all analysts should be working with the same set of "complete" cases.

They also identified potential drawbacks. They cautioned that imputation methods do not necessarily lead to a reduction in bias relative to the incomplete data set, and they warned against the danger of analysts treating the "complete" cases as actual responses, thus overstating the precision of the survey estimates. Brick and Kalton, 1996, concur with these statements and add that imputation methods may also distort the association between variables. They note that although methods can be selected to maintain the associations of the variable subject to imputation with certain variables, associations with other variables may be attenuated.

Imputations can be categorized along two dimensions. The first is whether they are deterministic or stochastic. In the case of deterministic imputations, the residual term is set to zero.
This yields the best prediction of the missing value; however, it attenuates the variance of the imputed estimate relative to that of the unobserved estimate, and it distorts the distribution of the values of the item in question. Thus deterministic imputations give more precise estimates of means (e.g., an average score) but produce biased estimates of distributions (e.g., the percent of students scoring above a certain point). In stochastic imputations, the residual or error term is randomly assigned. This addition of random noise improves estimates of the shape parameters by yielding more realistic distributions. Brick and Kalton, 1996, concluded that given "the importance of shape parameters in many analyses, stochastic imputations are generally preferred."

The second dimension has to do with whether or not auxiliary variables are used in the imputation method. Within the set of imputation methods that use auxiliary variables, the variables may be either categorical, categorizing sample members into imputation classes, or continuous, as in the case of regression imputation methods.

As mentioned earlier, a number of different types of imputation methods
have been developed and used in survey research. A partial listing
includes historical imputation, deductive
imputations, mean imputations, random imputation, overall mean imputations
within classes, random imputation within classes, hot-deck imputation,
cold-deck imputation, flexible matching imputation, ratio imputation,
predicted regression imputation, random or stochastic regression imputation,
EM algorithm imputation, distance function matching, composite methods,
Bayesian Bootstrap imputation, and multiple imputation methods. There
are a number of sources that review the methods and properties of these
varied imputation techniques (Little and Rubin, 1987; Kalton, 1983;
Kalton and Kasprzyk, 1982, 1986; Lessler and Kalsbeek, 1992; Hu, Salvucci,
and Cohen, 2000).

Table 2, taken from a forthcoming NCES report by Salvucci et al. (2002), shows the imputation methods used in recent NCES data collections. In the case of the universe data collections (CCD, PSS, IPEDS), the imputation methods most used include ratio imputation, mean imputation, and cold-deck imputation. In a few cases deductive or logical imputations are employed, and hot-deck imputation methods are also used in a few cases. Historical imputations should be added to this list, inasmuch as they are used in the Digest of Education Statistics and perhaps in the Condition of Education. The sample survey data collections primarily use sequential hot-deck imputation along with deductive imputations. There has also been limited use of within-class random imputation, regression imputation, multiple imputation, and a few of the methods listed above under universe data collections.

Deductive or logical imputations

Table 2. Imputation methods employed in
This method works best when the relationship over time is stronger than the relationship between variables at one point in time.

Cold-deck imputation

Mean value imputation

This method can only provide unbiased estimates for means and totals if the missing values meet the strong assumption of missing completely at random. Because this procedure creates a spike at the mean value, it does not preserve the distribution or the multivariate relationships in the data. Furthermore, because the sample size is effectively reduced by nonresponse, standard variance formulas will underestimate the true variance. Overall mean value imputation is not recommended. Kovar and Whitridge in Cox et al., 1995, caution that if all else fails, within-class mean value imputations can be used with carefully chosen classes for means and totals, but that the method does not work for other statistics. Salvucci et al., 2002, point out that if the missing values depend on any variables not included in the auxiliary variables used to form the imputation class, the means and totals will be biased, the distribution will be distorted, and the variances will be substantially underestimated. Little and Rubin, 1987, make the point that the distortion of the distribution is particularly problematic when the tails of the distribution or the standard errors of the estimates are the focus of study.

Ratio Imputations

It is important to note here that the ratio imputations used by at least some NCES data collections do not follow this description exactly. Instead, what is done, for example, with state-level fiscal data in CCD is to partition out the responding cases, remove the value of the variable in question from the total for each responding state, compute the ratio of the value for each responding state to its reduced total, compute the average of these ratios across all responding states, and then multiply the total for each state with missing data by the average ratio.
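
The CCD-style ratio steps described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not NCES production code: the function name, the record layout (a `total` and a `value` per state), and the figures are all hypothetical, and the sketch assumes that the total reported for a state with missing data already excludes the missing item.

```python
from statistics import mean


def ratio_impute_reduced_total(records):
    """Impute missing 'value' entries using the average ratio of value to reduced total.

    records: list of dicts with keys 'total' and 'value' (None when missing).
    Hypothetical layout; assumes a nonrespondent's 'total' excludes the missing item.
    """
    ratios = []
    for r in records:
        if r["value"] is not None:
            # Remove the item from the responding state's total.
            reduced_total = r["total"] - r["value"]
            if reduced_total > 0:
                # Ratio of the item to the reduced total for this respondent.
                ratios.append(r["value"] / reduced_total)
    if not ratios:
        raise ValueError("no usable responding cases to form a ratio")

    # Average the ratios across all responding states.
    avg_ratio = mean(ratios)

    # Multiply each nonrespondent's total by the average ratio.
    return [
        r["value"] if r["value"] is not None else r["total"] * avg_ratio
        for r in records
    ]


states = [
    {"total": 100, "value": 20},    # ratio 20 / 80 = 0.25
    {"total": 300, "value": 100},   # ratio 100 / 200 = 0.5
    {"total": 160, "value": None},  # imputed as 160 * 0.375
]
print(ratio_impute_reduced_total(states))  # [20, 100, 60.0]
```

Averaging the per-state ratios gives every responding state equal weight regardless of its size; whether that matches the weighting used in any particular data collection is not stated in the description above.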

Regression Imputation

In this case, as in other forms of imputation, the component of variance that is attributable to survey nonresponse is not accounted for in standard variance estimation software, resulting in an underestimation of the true variance.

Hot-Deck Imputation

In the case of sequential hot-deck imputation, each class starts with a single value for the item subject to imputation, and each record is compared to that value in turn. If the record has a value for the item, that value replaces the starter value. If the record is missing the item, the starter value, or the value that has replaced it, is "filled in" on the case with the missing value. One problem occurs with this approach when several records with missing values occur together on the file. This results in the current donor value being assigned to multiple records, thus leading to a lack of precision in the survey estimates (Kalton and Kasprzyk, 1986).

A variation on this approach is known as random imputation within classes; the difference here is that the donor respondent is chosen at random within the imputation class for assignment to the nonrespondent. Lessler and Kalsbeek, 1992, pointed out that if this is done with replacement, the problem of multiple use of a donor persists; however, they also noted that this can be avoided by sampling without replacement. While this procedure is more cumbersome, it has the advantage of providing a basis to correctly formulate the mean square error of estimators using a hot-deck imputation.

Another way to avoid the problems associated with sequential hot-deck imputation is hierarchical hot-deck imputation. This method sorts respondents and nonrespondents into a large number of imputation classes based on a detailed categorization of a large set of auxiliary variables.
Nonrespondents are then matched with respondents in the smallest class first; if no match is found, that class is collapsed with the next one, and so on until a donor is found, hence the label hierarchical.

As problems have been identified, alternative schemas have been devised to solve those problems. Regardless of the specifics, all hot-deck procedures take imputed values from a respondent in the same data file, thus yielding imputations that are valid, although not necessarily internally consistent for the respondent values. In order to evaluate the hot-deck imputation used for any specific data collection, detailed information is required.
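
The sequential hot-deck procedure described earlier in this section can be sketched in Python as follows. This is a minimal illustration rather than the procedure of any specific NCES data collection: the function name, the tuple layout, the class labels, and the starter values are hypothetical.

```python
def sequential_hot_deck(records, starters):
    """Sequential hot-deck imputation within classes, in file order.

    records:  list of (imputation_class, value) tuples; value is None when missing.
    starters: dict mapping each imputation class to its starter value.
    Returns a list of (imputation_class, value, was_imputed) tuples.
    """
    current_donor = dict(starters)  # current donor value for each class
    result = []
    for cls, value in records:
        if value is not None:
            # A reported value replaces the stored donor value for its class.
            current_donor[cls] = value
            result.append((cls, value, False))
        else:
            # A missing value is filled in with the class's current donor value.
            result.append((cls, current_donor[cls], True))
    return result


records = [("A", 10), ("A", None), ("A", None), ("B", None), ("B", 7), ("A", 12), ("B", None)]
starters = {"A": 5, "B": 3}
for row in sequential_hot_deck(records, starters):
    print(row)
# ('A', 10, False)
# ('A', 10, True)
# ('A', 10, True)
# ('B', 3, True)
# ('B', 7, False)
# ('A', 12, False)
# ('B', 7, True)
```

In this example the donor value 10 is assigned to two consecutive records with missing values, which is the multiple-use problem noted by Kalton and Kasprzyk, 1986. The first class B record receives the starter value because no class B respondent precedes it on the file.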