Comparisons of different groups of students with respect to mean scale scores or percentages of a certain attribute are of primary interest to users of National Assessment of Educational Progress (NAEP) results. The user is cautioned to rely on the results of statistical tests, rather than on the apparent magnitude of the difference between sample means or percentages, when determining whether the sample differences are likely to represent actual differences among the groups in the population.
By convention, the text in reports of NAEP results indicates that means or percentages from two groups are different (e.g., one group performed higher or lower than another group) only when the difference in the point estimates for the groups being compared is statistically significant at an approximate level of .05.
Since 1998, t tests have been used for most NAEP comparisons. These tests are more appropriate than z tests based on normal distribution approximations when the statistics being compared come from distributions with heavier tails than the normal distribution. One difficulty in using t tests for large-scale surveys is determining the appropriate degrees of freedom for the t distribution of interest.
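As an illustration of why the choice between a z test and a t test matters, the following minimal sketch compares two group means at the .05 level under both reference distributions. It is not NAEP's procedure: the function names, the sample values, and in particular the degrees of freedom passed in are hypothetical, the two groups are assumed to be independent, and the standard errors are taken as given rather than estimated from the survey design.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for a t statistic, using Simpson's rule on [0, |t|]."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def z_two_sided_p(z_stat):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z_stat)))


def compare_groups(mean_a, se_a, mean_b, se_b, df, alpha=0.05):
    """Test the difference between two independent group estimates.

    df is supplied by the caller; how to choose it is the hard part
    for large-scale surveys and is NOT addressed by this sketch.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # assumes independent groups
    stat = (mean_a - mean_b) / se_diff
    p_t = t_two_sided_p(stat, df)
    p_z = z_two_sided_p(stat)
    return {
        "statistic": stat,
        "p_t": p_t,
        "p_z": p_z,
        "significant_t": p_t < alpha,
        "significant_z": p_z < alpha,
    }


# Hypothetical values: two group means with their standard errors and 10 df.
result = compare_groups(254.2, 1.4, 250.0, 1.4, df=10)
print(result)
```

With few degrees of freedom, the heavier tails of the t distribution yield a larger p-value than the normal approximation for the same statistic, so a difference that a z test would flag as significant may not pass the t test. As the degrees of freedom grow, the two results converge.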