Comparisons of different groups of students with respect to mean scale scores, achievement-level percentages, percentiles, and student group percentages are of primary interest to users of NAEP results. Before stating that a difference observed in the sample(s) represents an actual difference in the population(s), users need to refer to the results of statistical tests rather than to the observed differences alone. This section provides additional information on the statistical tests used in NAEP. A t test for independent groups is used to compare population means when the samples of students representing those populations do not overlap. A t test for partially overlapping groups is used for part-whole comparisons (e.g., comparing a state to the nation).
By convention, the results in reports and online tools indicate that means, percentiles, or percentages from two groups are different (e.g., one group performed higher or lower than another group) only when the difference in the point estimates is statistically significant at an alpha level of .05.
Since 1998, t tests have been used for results disseminated through official NAEP reports. These tests are appropriate when the statistics being compared are based on a sampling distribution that is approximately normal. However, the degrees of freedom of that distribution need to be determined to define the exact shape of the Student's t-distribution. The degrees of freedom often refer to the total number of independently variable elements in the sample. In NAEP, determining the number of independently variable elements is not straightforward; therefore, the number is estimated.
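As an illustration only, the following minimal sketch shows how a t test for independent groups could be applied to two published estimates and their standard errors. It is not NAEP's production procedure. It assumes that the two groups do not overlap, so the standard error of the difference is the square root of the sum of the squared standard errors, and it takes the degrees of freedom as an input rather than estimating them. The function names (`t_pdf`, `two_sided_p`, `independent_t_test`) and the example values are hypothetical. Part-whole comparisons need an adjustment for the overlap between the groups and are not covered here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density on [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)


def independent_t_test(est_a, se_a, est_b, se_b, df, alpha=0.05):
    """Compare two estimates from non-overlapping groups.

    Assumption: the groups are independent, so the variances of the
    two estimates add. df is supplied by the caller (hypothetical input).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    t = (est_a - est_b) / se_diff
    p = two_sided_p(t, df)
    return t, p, p < alpha


# Hypothetical values: two group means with their standard errors.
t_stat, p_value, significant = independent_t_test(250.0, 1.0, 246.0, 1.2, df=60)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, significant at .05: {significant}")
```

Under the convention described above, the two groups would be reported as different only when `significant` is true. A smaller value for the degrees of freedom gives the t-distribution heavier tails, so the same t statistic yields a larger p-value.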