Deleting Items from Scales

This figure plots an item that was deleted (dropped) from the 1998 Reading to Gain Information scale at grade 12. The item is a three-category constructed-response item scored as unsatisfactory, partial credit, or full credit. The empirical proportions of students who received partial or full credit, shown by the triangles that form an X on the plot, lie far from the solid curves representing the theoretical item response functions (IRFs) for these categories, which indicates that the model predicted them poorly. In addition, few students (fewer than 40 percent) in the score scale range of interest gave partial or full credit responses to this item. The unsatisfactory category, represented by the line that begins at (-4.0, 0.6) and slopes downward, provides virtually no discrimination: the empirical IRF for that category is essentially flat, indicating that students in the score scale range of interest are likely to give an unsatisfactory response to this item whether they are lower or higher on the scale.

In deciding whether to delete items from the final scales, a balance is sought between being too stringent, and hence deleting too many items and possibly damaging the content representativeness of the pool of scaled items, and being too lenient, and hence including items whose model fit is poor enough to endanger the model-based inferences made from NAEP results. Items that clearly do not fit the models are not included in the final scales; however, a certain degree of misfit is tolerated for a number of items that are included. For the majority of NAEP items, the model fit appears to be very good; occasionally, however, the item response theory (IRT) models do not fit well.

In the plot, the horizontal axis represents the theta (θ) scale and the vertical axis represents the probability of having a response fall in each category. The solid curves are the theoretical IRFs based on the item parameter estimates and the equation for the generalized partial credit model. The centers of the triangles represent the empirical proportions of students with responses in each category for the 1998 grade 12 reading assessment data.
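To make the comparison between the solid curves and the triangles concrete, the following is a minimal sketch of how theoretical category probabilities and empirical category proportions could be computed. It is illustrative only and is not the procedure used to produce the plot. It assumes a common parameterization of the generalized partial credit model with slope a, location b, category offsets d, and a scaling constant D = 1.7. The function names, the binning of students on point estimates of theta, and all parameter values are hypothetical.

```python
import math


def gpcm_probs(theta, a, b, d, D=1.7):
    """Theoretical category probabilities under a generalized partial credit model.

    theta : position on the theta scale
    a, b  : item slope and location (hypothetical values in this sketch)
    d     : offsets for categories 1..m-1; the category 0 term cancels, so it is 0
    D     : scaling constant (1.7 is an assumption of this sketch)
    """
    # Cumulative sums of D*a*(theta - b + d_v), starting from 0 for category 0.
    z = [0.0]
    for dv in d:
        z.append(z[-1] + D * a * (theta - b + dv))
    # Subtract the maximum before exponentiating for numerical stability.
    zmax = max(z)
    expz = [math.exp(v - zmax) for v in z]
    total = sum(expz)
    return [e / total for e in expz]


def empirical_proportions(thetas, responses, n_categories, edges):
    """Proportion of responses in each category within each theta interval.

    Students are grouped by simple binning on theta values, which is a
    simplifying assumption of this sketch. Intervals are [edges[i], edges[i+1]).
    Returns one list of proportions per interval, or None for an empty interval.
    """
    counts = [[0] * n_categories for _ in range(len(edges) - 1)]
    for t, r in zip(thetas, responses):
        for i in range(len(edges) - 1):
            if edges[i] <= t < edges[i + 1]:
                counts[i][r] += 1
                break
    props = []
    for row in counts:
        n = sum(row)
        props.append([c / n for c in row] if n else None)
    return props
```

Under these assumptions, calling `gpcm_probs` over a grid of theta values traces the solid curves, and `empirical_proportions` gives the heights at which the triangles would be drawn. Large gaps between the two for a category, as described for this item, suggest poor model fit.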

Polytomous item (R016603) exhibiting unacceptably poor model fit
