It is common practice in NAEP to classify each item into one of three categories (Petersen 1988):
"A" (items exhibiting no DIF),
"B" (items exhibiting a weak indication of DIF), or
"C" (items exhibiting a strong indication of DIF).
Items in category "A" have Mantel-Haenszel common odds ratios on the delta scale that either do not differ significantly from 0 at the alpha = 0.05 level or are less than 1.0 in absolute value. Category "C" items are those whose Mantel-Haenszel delta values are significantly greater than 1 in absolute value and larger than 1.5 in absolute magnitude. All other items are categorized as "B" items.
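The classification rule above can be sketched in a few lines of Python. This is a minimal illustration, not the NAEP operational code. The function names, the normal-approximation tests, and the critical values (1.96 for the two-sided test against 0, 1.645 for the one-sided test against 1) are assumptions; the section gives only the alpha = 0.05 level and the 1.0 and 1.5 thresholds. The helper `mh_delta` uses the conventional delta-scale transformation, -2.35 times the natural log of the common odds ratio, which this section does not spell out.

```python
import math

# Assumed critical values for normal-approximation tests at alpha = 0.05.
Z_TWO_SIDED_05 = 1.96   # test of delta against 0 (two-sided)
Z_ONE_SIDED_05 = 1.645  # test of |delta| against 1 (one-sided)


def mh_delta(alpha_mh: float) -> float:
    """Convert a Mantel-Haenszel common odds ratio to the delta scale.

    Uses the conventional transformation -2.35 * ln(alpha_MH); this
    formula is an assumption here, not taken from the section itself.
    """
    return -2.35 * math.log(alpha_mh)


def classify_dif(delta: float, se: float) -> str:
    """Return "A", "B", or "C" for a delta-scale value and its standard error.

    `se` must be positive. Both significance tests are hypothetical
    z-tests chosen for illustration.
    """
    abs_delta = abs(delta)

    # "A": not significantly different from 0, or smaller than 1.0 in absolute value.
    not_sig_from_zero = abs_delta / se < Z_TWO_SIDED_05
    if not_sig_from_zero or abs_delta < 1.0:
        return "A"

    # "C": significantly greater than 1 and larger than 1.5 in absolute value.
    sig_greater_than_one = (abs_delta - 1.0) / se > Z_ONE_SIDED_05
    if sig_greater_than_one and abs_delta > 1.5:
        return "C"

    # Everything else is "B".
    return "B"
```

For example, under these assumptions `classify_dif(1.2, 0.3)` falls into category "B": the value is significantly different from 0 and exceeds 1.0, but it is neither significantly greater than 1 nor larger than 1.5.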
A plus sign (+) indicates that an item is differentially easier for the focal group; a minus sign (-) indicates that it is differentially more difficult for the focal group. Polytomous items are classified analogously to dichotomous items, with the categories labeled "AA", "BB", and "CC". Items identified as "C" or "CC" items are reviewed by a committee of trained test developers and subject-matter specialists to determine whether the differential functioning of a particular item is due to bias.
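The labeling convention can be illustrated with a short Python sketch. The function name `dif_label` and its parameters are hypothetical. It assumes that a positive value of the signed DIF statistic means the item is differentially easier for the focal group, and that a value of exactly zero carries no sign; the section does not address the zero case.

```python
def dif_label(category: str, statistic: float, polytomous: bool = False) -> str:
    """Build a label such as "B+", "C-", or "CC-".

    `category` is the single-letter class ("A", "B", or "C").
    `statistic` is the signed DIF statistic; positive is assumed to mean
    differentially easier for the focal group.
    """
    if category not in {"A", "B", "C"}:
        raise ValueError(f"unknown DIF category: {category!r}")

    # Polytomous items use the doubled letters "AA", "BB", "CC".
    base = category * 2 if polytomous else category

    if statistic > 0:
        sign = "+"   # differentially easier for the focal group
    elif statistic < 0:
        sign = "-"   # differentially more difficult for the focal group
    else:
        sign = ""    # assumption: no sign when the statistic is exactly zero

    return base + sign
```

Under these assumptions, a dichotomous "C" item with a statistic of -1.9 is labeled "C-", and a polytomous "B" item with a positive statistic is labeled "BB+".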
Following standard practice, all items identified as having DIF are reviewed by a committee of trained test developers and subject-matter specialists. Such committees are charged with judging whether the differential difficulty of an item is unfairly related to group membership. The committees that review NAEP items include NAEP item developers and, sometimes, outside members with expertise in the subject area. The committees carefully examine each identified item to determine whether either its language or its content would tend to make the item more difficult for an identified group of examinees.