Dependent T-Tests: NDE Statistical Specification
In NAEP, t-tests are treated as independent unless the comparison is part-whole, where one jurisdiction is part of another. Under the enhancement, additional dependencies are taken into consideration: part-part comparisons, where two jurisdictions share some but not all of their sample/population, and dependencies between subgroups. The following table summarizes the situations.
| Situation | Comparison | Dependence | Standard Error and Degrees of Freedom Equation |
|---|---|---|---|
| I | Across years | Independent | Pooled |
| II | Between mutually exclusive jurisdictions | Independent | Pooled |
| III | Between student groups | Dependent | Dependent via differences |
| IV | Between non-mutually exclusive jurisdictions where one is fully subsumed in the other | Dependent | Dependent via either (a) differences or (b) part-whole equation |
| V | Between non-mutually exclusive jurisdictions that share some, but not all, of their sample/population | Dependent | Dependent via either (a) differences or (b) part-part equation |
Under the current infrastructure, the t-test module handles either STATS or SUMS.
STATS is a set of statistics that can be used directly to compute a t-value and a p-value.
SUMS is a set of intermediate statistics, based on replicate weights or plausible values, that can be combined to compute a t-value and a p-value.
STATS are typically used for situations I, II, IV(b), and V(b), while SUMS are typically used for situations III, IV(a), and V(a). Requests typically involve a combination of situations, but the current architecture supports only one of STATS or SUMS within a single table. This might not be the case indefinitely. Formulae IV(a) and IV(b) are approximately equivalent, except that (b) relies on some mild, reasonable assumptions. The two formulae might therefore not yield exactly the same results, although the results should be close. The same is true for V(a) and V(b).
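The constraint that a single table uses either STATS or SUMS can be illustrated with a small Python sketch. The mapping and the function name `table_input_type` are hypothetical and only restate the typical assignments listed above; they are not part of the t-test module.

```python
# Hypothetical lookup: which input type each situation typically uses.
TYPICAL_INPUT = {
    "I": "STATS",
    "II": "STATS",
    "IV(b)": "STATS",
    "V(b)": "STATS",
    "III": "SUMS",
    "IV(a)": "SUMS",
    "V(a)": "SUMS",
}


def table_input_type(situations):
    """Return the single input type for a table, or raise if the table mixes types."""
    kinds = {TYPICAL_INPUT[s] for s in situations}
    if len(kinds) != 1:
        raise ValueError("a table must use exactly one of STATS or SUMS")
    return kinds.pop()
```

Under this sketch, a table holding situations I, II, and IV(b) resolves to STATS, whereas a table holding I and III is rejected.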
STATS
Pooled
For independent comparisons, a simple pooled standard error of
$$SE_{g-h} = \sqrt{SE_g^2 + SE_h^2}$$
can be computed for years g and h or mutually exclusive jurisdictions g and h.
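As a minimal sketch, assuming the pooled standard error is the usual square root of the sum of the two squared standard errors for independent estimates, the computation could look as follows. The function name `pooled_se` is hypothetical.

```python
import math


def pooled_se(se_g: float, se_h: float) -> float:
    """Standard error of the difference between two independent estimates."""
    return math.sqrt(se_g ** 2 + se_h ** 2)
```

For example, standard errors of 3.0 and 4.0 for the two years give a pooled standard error of 5.0.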
Part-whole
Suppose that the larger jurisdiction, the whole, is S and the smaller, the part, is X. Also, p is the weighted proportion of S that is made up by X. As a Venn diagram, X lies entirely inside S.
Then,
$$SE_{S-X} = \sqrt{SE_S^2 + SE_X^2 - 2p\,SE_X^2}$$
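A hedged Python sketch of a part-whole standard error follows. It assumes the covariance between the whole S and the part X equals p times the variance of X, which holds when X is independent of the remainder of S; this gives the variance of the difference as SE_S^2 + SE_X^2 - 2p SE_X^2. The function name and the input checks are illustrative assumptions.

```python
import math


def part_whole_se(se_whole: float, se_part: float, p: float) -> float:
    """Standard error of (whole - part), where p is the weighted share of the part in the whole."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    variance = se_whole ** 2 + se_part ** 2 - 2.0 * p * se_part ** 2
    if variance < 0.0:
        # Can occur with inconsistent inputs; the assumed formula is then not usable.
        raise ValueError("inputs imply a negative variance")
    return math.sqrt(variance)
```

With p = 0 the expression reduces to the pooled standard error for independent groups.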
Part-part
Suppose that one jurisdiction is S and the other is X. Also, Q is the overlapping part, p1 is the weighted proportion of S that is made up by Q, and p2 is the weighted proportion of X that is made up by Q. As a Venn diagram, S and X are two overlapping regions whose intersection is Q.
Then,
$$SE_{S-X} = \sqrt{SE_S^2 + SE_X^2 - 2\,p_1 p_2\,SE_Q^2}$$
This requires that the standard error for Q is part of the STATS.
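The part-part case can be sketched in Python along the same lines. The sketch assumes that the non-overlapping parts of S and X are independent of each other and of Q, so that the covariance between the two jurisdiction estimates is p1 times p2 times the variance of Q. The function name is hypothetical.

```python
import math


def part_part_se(se_s: float, se_x: float, se_q: float, p1: float, p2: float) -> float:
    """Standard error of (S - X) when S and X overlap in Q.

    p1 is the weighted share of Q in S; p2 is the weighted share of Q in X.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p1 and p2 must be proportions between 0 and 1")
    variance = se_s ** 2 + se_x ** 2 - 2.0 * p1 * p2 * se_q ** 2
    if variance < 0.0:
        raise ValueError("inputs imply a negative variance")
    return math.sqrt(variance)
```

When there is no overlap (p1 = p2 = 0), the result equals the pooled standard error, which is a convenient sanity check.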
SUMS
Dependent via differences
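For dependent comparisons, the standard error of the difference needs to be computed. Suppose that $A_g$ and $A_h$ are the (weighted) mean, achievement level proportion, or percentile estimates for groups g and h, where these indexes can point to either jurisdictions (including the nation and TUDAs) or student groups.
Furthermore, there are M plausible values (usually 5), indexed m = 1, 2, ..., M; hence,
$$A_g = \frac{1}{M}\sum_{m=1}^{M} A_{gm}.$$
Also, there are R replicate weights, indexed r = 1, 2, ..., R; the estimate based on replicate weight r and the first plausible value is denoted $A_{g1}^{(r)}$. Then, the standard error takes on the following form:
$$SE_{g-h} = \sqrt{\sum_{r=1}^{R}\left[\left(A_{g1}^{(r)} - A_{h1}^{(r)}\right) - \left(A_{g1} - A_{h1}\right)\right]^2 + \left(1 + \frac{1}{M}\right)\frac{1}{M-1}\sum_{m=1}^{M}\left[\left(A_{gm} - A_{hm}\right) - \left(A_g - A_h\right)\right]^2} \qquad (1)$$
For (weighted) student group distribution percentages, no plausible values are involved, so the second term on the right-hand side is equal to 0. Hence,
$$SE_{g-h} = \sqrt{\sum_{r=1}^{R}\left[\left(A_{g}^{(r)} - A_{h}^{(r)}\right) - \left(A_{g} - A_{h}\right)\right]^2} \qquad (2)$$
For proportions associated with achievement levels as the independent variable, equation (1) applies, where
$A_g$ is the average subgroup percentage across the plausible values for group g, as will be reported as the statistic in the table,
$A_{gm}$ is the percentage for plausible value m, and
$A_{g1}^{(r)}$ is the percentage for group g, based on the first plausible value and on jackknife replicate weight r.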
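The following Python sketch shows one way the dependent standard error could be assembled from SUMS-style inputs. It assumes the jackknife term is the sum over replicates of the squared deviation of the replicate difference (first plausible value) from the full-sample difference (first plausible value), and that the second term is the between-plausible-value variance of the difference scaled by (1 + 1/M). All names are hypothetical.

```python
import math


def dependent_se(full_g, full_h, rep_g, rep_h, use_plausible_values=True):
    """Standard error of the difference between dependent groups g and h.

    full_g, full_h: full-sample estimates, one per plausible value (length M).
    rep_g, rep_h:   replicate estimates for the first plausible value (length R).
    use_plausible_values: False gives the jackknife-only form (equation 2).
    """
    pv_diffs = [a - b for a, b in zip(full_g, full_h)]
    first_diff = pv_diffs[0]

    # Jackknife (sampling) component, computed on the difference itself.
    jackknife = sum(((a - b) - first_diff) ** 2 for a, b in zip(rep_g, rep_h))

    m_count = len(pv_diffs)
    if not use_plausible_values or m_count < 2:
        return math.sqrt(jackknife)

    # Imputation component: variance of the difference across plausible values.
    mean_diff = sum(pv_diffs) / m_count
    between = sum((d - mean_diff) ** 2 for d in pv_diffs) / (m_count - 1)
    return math.sqrt(jackknife + (1.0 + 1.0 / m_count) * between)
```

Because the deviations are taken on the difference within each replicate, any positive covariance between the two groups is reflected automatically, which is why no separate covariance term appears.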
Degrees of freedom – dependent via differences
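For differences of dependent groups, the degrees of freedom follow from the usual computation, except that no pooling is conducted. Hence,
$$df = c \cdot \frac{\left[\sum_{r=1}^{R}\left(d_r - d\right)^2\right]^2}{\sum_{r=1}^{R}\left(d_r - d\right)^4}$$
where $d_r$ is the difference between groups g and h computed with replicate weight r, d is the corresponding full-sample difference, and c is the usual Johnson-Rust correction, defined as:
$$c = 3.16 - \frac{2.77}{\sqrt{m}}$$
and m is the number of jackknife replicates, which we currently set at 62 for all comparisons in this particular correction factor.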
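A minimal Python sketch of the degrees-of-freedom computation follows. It assumes the Satterthwaite-type ratio of the squared sum of squared replicate deviations to the sum of fourth-power deviations, and a Johnson-Rust factor of the form 3.16 - 2.77/sqrt(m), as described in the NAEP technical documentation. Function and argument names are hypothetical.

```python
import math


def dependent_df(rep_diffs, full_diff, m=62):
    """Degrees of freedom for a dependent difference, without pooling.

    rep_diffs: difference between the two groups for each replicate weight.
    full_diff: full-sample difference.
    m:         number of jackknife replicates used in the correction factor.
    """
    deviations = [d - full_diff for d in rep_diffs]
    sum_sq = sum(x ** 2 for x in deviations)
    sum_fourth = sum(x ** 4 for x in deviations)
    if sum_fourth == 0.0:
        raise ValueError("all replicate differences equal the full-sample difference")
    correction = 3.16 - 2.77 / math.sqrt(m)
    return correction * sum_sq ** 2 / sum_fourth
```

The ratio is largest when all replicates contribute equally and shrinks when a few replicates dominate the variance.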
Application to statistics
Below is a summary table to indicate what components factor into the standard error computation and, thus, whether equation (1) or (2) should be used.
| Statistic | Plausible Values | Jackknife | Equation |
|---|---|---|---|
| Mean | Yes | Yes | (1) |
| Achievement Level | Yes | Yes | (1) |
| Percentile | Yes | Yes | (1) |
| Subgroup percentage | No | Yes | (2) |
| Achievement level as independent variable, yielding a subgroup percentage | Yes | Yes | (1) |
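The table can be restated as a small lookup, sketched here in Python. The keys and the function name are hypothetical labels chosen for illustration; only the Yes/No pattern comes from the table.

```python
# Hypothetical labels; values record whether plausible values enter the standard error.
USES_PLAUSIBLE_VALUES = {
    "mean": True,
    "achievement_level": True,
    "percentile": True,
    "subgroup_percentage": False,
    "achievement_level_as_independent_variable": True,
}


def equation_for(statistic: str) -> int:
    """Return 1 when both components apply, 2 when only the jackknife component applies."""
    return 1 if USES_PLAUSIBLE_VALUES[statistic] else 2
```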
For more information, see the NAEP Technical Documentation at http://nces.ed.gov/nationsreportcard/tdw/analysis/infer.asp.