Because students with disabilities (SD) and English language learners (ELL) tend to score below average on assessments, excluding students from these groups may increase a jurisdiction's scores. Conversely, including more of these students might depress score gains. In 2005, exclusion rates varied among jurisdictions. In addition, exclusion rates both increased and decreased between previous and current assessment administrations, which complicates the interpretation of comparisons over time within jurisdictions. Thus, the potential impact of exclusion rates on assessment results is a validity concern. The essential problem is that the populations represented by the samples may differ, which could affect both cross-state comparisons within a given year and within-state trends.
At least three factors cause variability in exclusion rates across states. The first is that the percentage of students who are identified as SD or as ELL varies across jurisdictions and over time. Reasons for this variation include the following: lack of standardized criteria for defining students as SD or ELL; changes and differences in policies and practices regarding implementation of the Individuals with Disabilities Education Act (IDEA); and population shifts in the percentages of students classified as ELL and, to a lesser extent, as SD.
The second factor is that some SD and/or ELL students are excluded because they require accommodations that would be inconsistent with NAEP's frameworks and would change the constructs that NAEP is intended to measure. Examples include using a calculator for computation tasks on the mathematics assessment or having a passage translated from English into another language on the reading assessment.
The third factor is that some SD and/or ELL students are excluded because they are so severely disabled or lacking in English language skills that no accommodation would be sufficient to enable them to participate meaningfully.
With regard to cross-state comparisons, the correlations between rates of exclusion and average 2005 achievement scores were low or close to zero in reading, mathematics, and science at both grades (-.14 and .01 for grade 4 and grade 8 mathematics, respectively; -.08 and -.10 for grade 4 and grade 8 reading, respectively; and -.22 and -.10 for grade 4 and grade 8 science, respectively). In other words, higher exclusion rates were not associated with higher average scores in 2005. However, with regard to state trends, low to moderate correlations were found between changes in the rate of exclusion and average score gains from 2003 to 2005 in reading and mathematics (.48 and .19 for grade 4 and grade 8 mathematics, respectively, and .63 and .40 for grade 4 and grade 8 reading, respectively). For state trends in science, correlations between rate of exclusion and average score gains from 2000 to 2005 were low (-.09 and .06 for grade 4 and grade 8 science, respectively). While there was a low to moderate tendency for an increase in exclusion rates to be associated with an increase in average scale scores in reading and mathematics, exclusion increases do not explain the entirety of score gains.
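As a rough illustration of the kind of statistic reported above, the following sketch computes a correlation between per-jurisdiction changes in exclusion rate and score gains. The source does not say which correlation coefficient was used, so the Pearson coefficient is an assumption here. The function name `pearson_r` and all input values are hypothetical and are not NAEP data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, one entry per jurisdiction (NOT NAEP data):
# change in exclusion rate (percentage points) and change in average score.
exclusion_change = [-1.0, 0.5, 2.0, 0.0, 1.5]
score_gain = [0.5, 1.0, 3.0, 2.0, 1.5]

r = pearson_r(exclusion_change, score_gain)
print(f"r = {r:.2f}")
```

A positive value would indicate that jurisdictions with larger increases in exclusion tended to show larger gains, which is the pattern described for reading and mathematics. A correlation alone does not show how much of any gain is attributable to exclusion.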
Because the representativeness of samples is ultimately a validity issue, NCES has commissioned studies of the impact of assessment accommodations on overall scores. NCES has also investigated scenarios for estimating what the average scores might have been if excluded students had been assessed. NCES will continue to evaluate the potential impact of changes in exclusion rates on score gains.
Several statistical scenarios have been proposed, based on different hypotheses about how excluded students might have performed. Combined with the actual performance of students who were assessed, these scenarios produce results for the full population (that is, one that includes estimates for excluded students) in each jurisdiction and each assessment year. Although these scenarios are somewhat speculative, they do provide some indication as to which statements about trend gains or losses might change if exclusion rates were zero in both assessment years and if the hypotheses about the performance of excluded students are correct.
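The arithmetic behind a full-population result can be sketched as a weighted average of the observed mean for assessed students and a hypothesized mean for excluded students. This is a minimal illustration and not the estimation procedure used for NAEP. It assumes the exclusion rate is expressed as a share of the full population, and the function name and all numbers are hypothetical.

```python
def full_population_mean(included_mean, excluded_mean, exclusion_rate):
    """Weighted average of included and (hypothesized) excluded students.

    exclusion_rate is assumed to be the excluded share of the full
    population, expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= exclusion_rate <= 1.0:
        raise ValueError("exclusion_rate must be between 0 and 1")
    return (1.0 - exclusion_rate) * included_mean + exclusion_rate * excluded_mean


# Hypothetical jurisdiction (NOT NAEP data). The exclusion rate rises from
# 4 percent to 8 percent, and excluded students are hypothesized to average 190.
year1_full = full_population_mean(220.0, 190.0, 0.04)
year2_full = full_population_mean(223.0, 190.0, 0.08)

reported_gain = 223.0 - 220.0           # gain among included students only
scenario_gain = year2_full - year1_full  # gain for the estimated full population

print(f"reported gain: {reported_gain:.2f}")
print(f"scenario gain: {scenario_gain:.2f}")
```

In this hypothetical case the gain for the estimated full population is smaller than the gain among included students, because a larger share of lower-scoring students was left out in the second year. The result depends entirely on the hypothesized performance of the excluded students.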
Although the results of one of these scenarios are presented here, the methods used to construct the scenario are still under development. The scenario illustrates the potential impact of reasonable hypotheses about the performance of excluded students on score gains in the states and other jurisdictions that participated in the NAEP reading and mathematics assessments in 2003 and 2005 and the NAEP science assessment in 2000 and 2005. The results of this special analysis should not be interpreted as official results.
The scenario was developed by Donald McLaughlin, formerly of American Institutes for Research, and predicts what the performance of excluded SD and/or ELL students might have been if these students had been tested. The basic assumption underlying this approach is that these students would have performed as well as included SD and/or ELL students with similar disabilities, level of English proficiency, and background characteristics.
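The basic assumption can be illustrated with a deliberately simplified stand-in: each excluded student is assigned the average score of included SD and/or ELL students who share the same profile. The source does not describe the actual estimation procedure, so this cell-mean approach, the profile labels, the fallback rule, and all scores below are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def impute_excluded_scores(included, excluded_profiles):
    """Assign each excluded student the mean score of included students
    with the same profile.

    included: list of (profile, score) pairs for assessed SD/ELL students.
    excluded_profiles: list of profiles for excluded students.
    When no included student shares a profile, this sketch falls back to
    the overall mean of included SD/ELL students (an assumption).
    """
    scores_by_profile = defaultdict(list)
    for profile, score in included:
        scores_by_profile[profile].append(score)
    overall_mean = mean(score for _, score in included)
    return [
        mean(scores_by_profile[p]) if p in scores_by_profile else overall_mean
        for p in excluded_profiles
    ]


# Hypothetical profiles and scores (NOT NAEP data).
included_students = [
    (("SD", "profile A"), 200.0),
    (("SD", "profile A"), 210.0),
    (("ELL", "profile B"), 190.0),
]
excluded_students = [("SD", "profile A"), ("ELL", "profile B"), ("SD", "profile C")]

imputed = impute_excluded_scores(included_students, excluded_students)
print(imputed)
```

A real analysis would need a much richer description of each student than a single label, and it would need to account for the uncertainty in the imputed values.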
Analyses were performed for each participating state and jurisdiction for mathematics, reading, and science at grades 4 and 8. The results are presented in the accompanying tables.
The first column of each table presents the reported score gain (or loss) for each jurisdiction based on the sample of included students. The second column shows the score gain (or loss) under the McLaughlin scenario. The third column reports the difference between the official gain and the gain under this scenario. Statistically significant score changes in columns one and two are marked with an asterisk. A footnote marks jurisdictions that show a trend that is statistically significant in the official results but not significant under the McLaughlin scenario, or vice versa.
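The column layout described above can be sketched as a small record type. This is only an illustration: the class name, the field names, the sign convention for the difference (reported gain minus scenario gain), and the example row are all hypothetical. Significance is taken as an input and is not computed here.

```python
from dataclasses import dataclass


@dataclass
class TrendRow:
    jurisdiction: str
    reported_gain: float        # column 1: gain based on included students
    reported_significant: bool
    scenario_gain: float        # column 2: gain under the scenario
    scenario_significant: bool

    @property
    def difference(self):
        # Column 3. The sign convention is an assumption of this sketch.
        return self.reported_gain - self.scenario_gain

    @property
    def significance_differs(self):
        # True when a trend is significant in one set of results only.
        return self.reported_significant != self.scenario_significant


def format_row(row):
    """Render one row, marking significant changes with an asterisk."""
    col1 = f"{row.reported_gain:.1f}" + ("*" if row.reported_significant else "")
    col2 = f"{row.scenario_gain:.1f}" + ("*" if row.scenario_significant else "")
    note = " (see footnote)" if row.significance_differs else ""
    return f"{row.jurisdiction:<16}{col1:>8}{col2:>8}{row.difference:>8.1f}{note}"


# Hypothetical row (NOT NAEP data).
example = TrendRow("Jurisdiction A", 3.0, True, 1.5, False)
print(format_row(example))
```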
McLaughlin, D.H. (2000). Protecting State NAEP Trends from Changes in SD/LEP Inclusion Rates (Report to the National Institute of Statistical Sciences). Palo Alto, CA: American Institutes for Research.
McLaughlin, D.H. (2001). Exclusions and Accommodations Affect State NAEP Gain Statistics: Mathematics, 1996 to 2000 (Appendix to chapter 4 in the NAEP Validity Studies Report on Research Priorities). Palo Alto, CA: American Institutes for Research.
McLaughlin, D.H. (2003). Full-Population Estimates of Reading Gains between 1998 and 2002 (Report to NCES supporting inclusion of full population estimates in the report of the 2002 NAEP reading assessment). Palo Alto, CA: American Institutes for Research.
McLaughlin, D.H. (2005). Properties of NAEP Full Population Estimates (Report to NCES). Palo Alto, CA: American Institutes for Research.
Wise, L.L. (1999). Impact of exclusion rates on NAEP 1994 to 1998 grade 4 reading gains in Kentucky (NCES Commissioner Remarks 1999). Alexandria, VA: HumRRO. Retrieved August 24, 2005, from http://nces.ed.gov/whatsnew/commissioner/remarks99/9_27_99pt2.asp.
Wise, L.L., Le, H., Hoffman, R.G., & Becker, D.E. (2004). Testing NAEP full population estimates for sensitivity to violation of assumptions. Technical Report TR-04-50. Alexandria, VA: HumRRO.
Wise, L.L., Le, H., Hoffman, R.G., & Becker, D.E. (2006). Testing NAEP full population estimates for sensitivity to violation of assumptions: Phase II. Draft Technical Report DTR-06-08. Alexandria, VA: HumRRO.