Several checks are made to detect serious violations of the assumptions underlying the models employed by NAEP, in particular multidimensionality of the construct being measured and certain conditional dependencies. Differential item functioning (DIF) analyses are used to examine issues of dimensionality, and what are called chi-square statistics in the IRT literature are used to flag responses with serious departures from the IRT model. The latter might better be called item fit statistics, since they do not really have chi-square distributions. These checks include comparisons of empirical and theoretical item response functions to identify items for which the IRT model may fit the data poorly. When warranted, remedial steps, such as collapsing categories of polytomous items or combining two or more items into a single item, are taken to mitigate the effects of such violations on inferences. These procedures are applied to all items regardless of block format (e.g., passage-based items, discrete items) or response type (e.g., multiple-choice items, constructed-response items).
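The comparison of empirical and theoretical item response functions can be illustrated with a minimal sketch. It is not NAEP's operational procedure, which the text above does not specify. The sketch assumes a two-parameter logistic (2PL) model for a dichotomous item, treats each examinee's proficiency as known (in practice it is latent and must be estimated), and groups examinees into equal-sized proficiency bins. The names `irf_2pl` and `item_fit` and all parameter values are hypothetical.

```python
import math
import random


def irf_2pl(theta, a, b):
    """Theoretical probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_fit(thetas, responses, a, b, n_bins=5):
    """Compare empirical and model-implied proportions correct by proficiency bin.

    Returns (rows, stat): rows holds (n, observed, expected) for each bin, and
    stat is a Pearson-type sum of squared standardized residuals. As the text
    notes, such a statistic is a fit index and need not follow a chi-square
    distribution.
    """
    # Sort examinee indices by proficiency and cut them into equal-sized bins.
    order = sorted(range(len(thetas)), key=lambda i: thetas[i])
    size = math.ceil(len(order) / n_bins)
    rows, stat = [], 0.0
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        n = len(idx)
        observed = sum(responses[i] for i in idx) / n              # empirical IRF
        expected = sum(irf_2pl(thetas[i], a, b) for i in idx) / n  # theoretical IRF
        rows.append((n, observed, expected))
        stat += n * (observed - expected) ** 2 / (expected * (1.0 - expected))
    return rows, stat


# Simulated data: 2000 examinees answering one item with known parameters.
rng = random.Random(0)
thetas = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_a, true_b = 1.2, 0.3
responses = [1 if rng.random() < irf_2pl(t, true_a, true_b) else 0 for t in thetas]

rows_good, stat_good = item_fit(thetas, responses, true_a, true_b)
rows_bad, stat_bad = item_fit(thetas, responses, 0.3, -1.5)  # deliberately wrong parameters

for n, observed, expected in rows_good:
    print(f"n={n:4d}  observed={observed:.3f}  expected={expected:.3f}")
print(f"fit statistic, generating parameters: {stat_good:.1f}")
print(f"fit statistic, misspecified parameters: {stat_bad:.1f}")
```

In this sketch, an item whose observed proportions track the model curve yields a small statistic, while a misspecified curve yields a much larger one. A large value would mark the item for closer inspection, not for automatic removal.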