
Using Population-Structure Model Parameters to Create Plausible Values for Later Computation



To calculate the many statistics estimated for each NAEP sample and to provide data for secondary analysis, NAEP creates plausible values from the population and subgroup distributions estimated with the measurement and population-structure models. Plausible values can be used in standard statistical formulas for many statistics of interest, and they yield correct standard-error estimates for those statistics, provided the population-structure model includes every group for which statistics are calculated.

The combination of Item Response Theory (IRT) models and population-structure models provides an estimated distribution of underlying performance for the population and subgroups of interest. This distribution is

\[
p(\boldsymbol{\theta} \mid x, y) = p(\boldsymbol{\theta} \mid x, y, \alpha, \Gamma, \Sigma)
\]

where x is the matrix of item responses, y is the matrix containing group membership information, α is the matrix of IRT parameters, and the matrices Γ (gamma) and Σ (sigma) are parameters from the population-structure models. The goal of NAEP is to summarize different characteristics of this distribution.

Any statistic of interest, t, can be calculated directly from this estimated distribution of underlying performance. However, so that secondary analyses of NAEP data can be conducted with most statistical packages, five plausible values are assigned to each student record. Neither the plausible values attached to student record r nor their average can be treated as that student's scale score.
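As a minimal sketch of how a secondary analyst might use the five plausible values, the Python example below computes a statistic once per plausible value and then combines the results with Rubin's (1987) multiple-imputation rules. The function names, the records, and the simple-random-sample variance are all assumptions made for illustration. An operational analysis would need the survey's sampling weights and its own sampling-variance procedure.

```python
import statistics


def mean_and_sampling_variance(values):
    """Mean of one plausible-value column and the variance of that mean.

    Assumption: an unweighted simple random sample, used only to keep the
    sketch short. It is not the survey's actual variance procedure.
    """
    n = len(values)
    return statistics.fmean(values), statistics.variance(values) / n


def combine_plausible_values(estimates, sampling_variances):
    """Combine per-plausible-value results with Rubin's (1987) rules.

    estimates          -- the statistic computed once per plausible value
    sampling_variances -- the sampling variance of each of those estimates
    Returns the combined point estimate and its total variance.
    """
    m = len(estimates)
    point = statistics.fmean(estimates)
    within = statistics.fmean(sampling_variances)   # average sampling variance
    between = statistics.variance(estimates)        # variance across plausible values
    total = within + (1 + 1 / m) * between
    return point, total


# Hypothetical records: each row holds the five plausible values of one student.
records = [
    [248.1, 255.3, 251.0, 246.7, 253.9],
    [301.4, 296.8, 299.2, 305.0, 298.1],
    [275.6, 280.2, 272.9, 277.4, 279.8],
    [262.3, 258.7, 266.1, 260.5, 263.0],
]

# Analyze each plausible value separately: one column per plausible value.
columns = list(zip(*records))
per_pv = [mean_and_sampling_variance(col) for col in columns]
estimates = [est for est, _ in per_pv]
variances = [var for _, var in per_pv]

point, total_variance = combine_plausible_values(estimates, variances)
print(f"mean = {point:.2f}, standard error = {total_variance ** 0.5:.2f}")
```

The statistic is computed separately for each plausible value and only then averaged. Averaging the five values within each student record first would discard the between-value variance term that reflects the uncertainty about each student's underlying performance.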

Plausible values can be thought of as a mechanism for accounting for the fact that the true scale scores describing each student's underlying performance are unknown. Because the IRT models are latent variable models, the values of the vector $\boldsymbol{\theta}_r$ are not observed, even for the students in a NAEP sample. To overcome this problem, we follow Rubin (1987) by treating the $\boldsymbol{\theta}_r$ as "missing data" and approximating the value of a statistic t computed from the $\boldsymbol{\theta}_r$ of all students by the expected value of t given x and y, the data that actually were observed.
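The approximation described in this paragraph can be written out as a worked equation. The notation here is assumed for illustration: t* denotes the expected value of the statistic, M the number of plausible values per student, and the superscript (m) the m-th set of plausible values, understood as random draws from the conditional distribution of θ given x and y.

```latex
\[
t^{*}(x, y) = E\left[\, t(\boldsymbol{\theta}, y) \mid x, y \,\right]
            = \int t(\boldsymbol{\theta}, y)\, p(\boldsymbol{\theta} \mid x, y)\, d\boldsymbol{\theta}
\]
\[
t^{*}(x, y) \approx \frac{1}{M} \sum_{m=1}^{M} t\big(\boldsymbol{\theta}^{(m)}, y\big),
\qquad M = 5
\]
```

Under this reading, the average over the five sets of plausible values is a Monte Carlo approximation of the integral, which is why the statistic is evaluated on each set before averaging.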

Last updated 09 July 2008 (KL)
