Dependent T-Tests: NDE Statistical Specification

In NAEP, t-tests are treated as independent unless there is a part-whole comparison, where one jurisdiction is part of another. Under the enhancement, additional dependencies are taken into account: part-part comparisons, where two jurisdictions share some but not all of their sample/population, and dependencies between subgroups. The following table summarizes the situations.

| Situation | Comparison | Dependence | Standard Error and Degrees of Freedom Equation |
| --- | --- | --- | --- |
| I | Across years | Independent | Pooled |
| II | Between mutually exclusive jurisdictions | Independent | Pooled |
| III | Between student groups | Dependent | Dependent via Differences |
| IV | Between non-mutually exclusive jurisdictions where one is fully subsumed in the second | Dependent | Dependent via either (a) Differences or (b) Part-whole equation |
| V | Between non-mutually exclusive jurisdictions that both share some, but not all, of their sample/population | Dependent | Dependent via (a) Differences or (b) Part-part equation |

Under the current infrastructure, the t-test module handles either STATS or SUMS.

STATS is a set of statistics that can be used directly to compute a t-value and a p-value.

SUMS is a set of intermediate statistics, based on replicate weights or plausible values, that can be combined to compute a t-value and a p-value.

STATS are typically used for situations I, II, IV(b), and V(b), while SUMS are typically used for situations III, IV(a), and V(a). Requests typically involve a combination of situations. The current architecture supports only STATS or only SUMS within a single table, although this may change. Formulae IV(a) and IV(b) are approximately equivalent; (b) relies on some mild, reasonable assumptions. The two formulae may therefore not yield identical results, but the results should be close. The same holds for V(a) and V(b).
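The typical pairing of situations and input types described above can be written down as a small lookup. This is only an illustrative sketch: the names `TYPICAL_INPUT` and `input_types_needed` are hypothetical and are not part of the t-test module.

```python
# Hypothetical lookup: which input type is typically used for each situation.
TYPICAL_INPUT = {
    "I": "STATS",
    "II": "STATS",
    "III": "SUMS",
    "IV(a)": "SUMS",
    "IV(b)": "STATS",
    "V(a)": "SUMS",
    "V(b)": "STATS",
}


def input_types_needed(situations):
    """Return the set of input types typically used for the given situations."""
    return {TYPICAL_INPUT[s] for s in situations}


# A request mixing situations I and III would typically call for both types,
# which the current architecture does not support within a single table.
print(input_types_needed(["I", "III"]))
```

A result containing more than one input type flags a table that cannot be served as-is under the single-type restriction.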

STATS

Pooled

For independent comparisons, a simple pooled standard error,

$$SE_{|g-h|} = \sqrt{SE_g^2 + SE_h^2},$$

can be computed for years g and h or for mutually exclusive jurisdictions g and h.
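As a minimal sketch of the pooled formula, assuming the two standard errors are already available as plain numbers (the function name `pooled_se` is hypothetical):

```python
import math


def pooled_se(se_g: float, se_h: float) -> float:
    """Standard error of the difference between two independent estimates."""
    return math.sqrt(se_g ** 2 + se_h ** 2)


# Illustrative values only, chosen so the arithmetic is easy to follow.
print(pooled_se(3.0, 4.0))  # 5.0
```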

Part-whole

Suppose that the larger jurisdiction, the whole, is S and the smaller, the part, is X. Also, p is the weighted proportion of S accounted for by X. As a Venn diagram, this could be represented as follows:

[Figure: Venn diagram showing the relationship of S and X.]

Then,

$$SE = \sqrt{SE^2(\bar{S}) + (1 - 2p)\, SE^2(\bar{X})}$$
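The part-whole formula can be sketched as follows. The function name `part_whole_se` and the input checks are assumptions added for illustration; they are not prescribed by this specification.

```python
import math


def part_whole_se(se_whole: float, se_part: float, p: float) -> float:
    """SE of the difference between a whole S and a part X of that whole.

    p is the weighted proportion of the whole accounted for by the part.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    variance = se_whole ** 2 + (1.0 - 2.0 * p) * se_part ** 2
    if variance < 0.0:
        # Can occur for p > 0.5 with inconsistent inputs; flagged, not hidden.
        raise ValueError("inputs imply a negative variance")
    return math.sqrt(variance)


# Illustrative values only.
print(part_whole_se(1.0, 2.0, 0.25))
```

With p = 0 the expression reduces to the pooled formula, and with p = 1 and equal standard errors it returns 0, as expected when the part is the whole.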

Part-part

Suppose that one jurisdiction is S and the other is X. Also, Q is the overlapping part, p1 is the weighted proportion of S accounted for by Q, and p2 is the weighted proportion of X accounted for by Q. As a Venn diagram, this could be represented as follows:

[Figure: Venn diagram showing S and X overlapping in Q.]

Then,

$$SE = \sqrt{SE^2(\bar{X}) + SE^2(\bar{S}) - 2\, p_1\, p_2\, SE^2(\bar{Q})}.$$

This requires that the standard error for Q be part of the STATS.
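A corresponding sketch for the part-part formula, again with a hypothetical function name (`part_part_se`) and illustrative input checks:

```python
import math


def part_part_se(se_x: float, se_s: float, se_q: float,
                 p1: float, p2: float) -> float:
    """SE of the difference between two jurisdictions X and S that overlap in Q.

    p1 is the weighted proportion of S accounted for by Q;
    p2 is the weighted proportion of X accounted for by Q.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p1 and p2 must be proportions between 0 and 1")
    variance = se_x ** 2 + se_s ** 2 - 2.0 * p1 * p2 * se_q ** 2
    if variance < 0.0:
        raise ValueError("inputs imply a negative variance")
    return math.sqrt(variance)


# Illustrative values only: 9 + 16 - 2 * 0.25 * 4 = 23.
print(part_part_se(3.0, 4.0, 2.0, 0.5, 0.5))
```

When there is no overlap (p1 = 0 or p2 = 0), the result equals the pooled standard error.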

SUMS

Dependent via differences

For dependent comparisons, the standard error of the difference must be computed. Suppose that $\hat{\mu}_g$ and $\hat{\mu}_h$ are the (weighted) mean, achievement level proportion, or percentile estimates for groups g and h, where these indices can refer to either jurisdictions (including the nation and TUDAs) or student groups.

Furthermore, there are M plausible values (usually 5), indexed by m = 1, 2, ..., M; hence,

$$\hat{\mu}_g = \frac{1}{M} \sum_{m=1}^{M} \hat{\mu}_{g,m}.$$

Also, there are R replicate weights, indexed by r = 1, 2, ..., R; the mean based on plausible value m and replicate weight r is denoted $\hat{\mu}_{g,m,r}$. Then, the standard error takes the following form:

$$SE^2_{|g-h|} = \sum_{r=1}^{R} \left[ \left( \hat{\mu}_{g,1,r} - \hat{\mu}_{h,1,r} \right) - \left( \hat{\mu}_{g,1} - \hat{\mu}_{h,1} \right) \right]^2 + \frac{1 + M^{-1}}{M - 1} \sum_{m=1}^{M} \left[ \left( \hat{\mu}_{g,m} - \hat{\mu}_{h,m} \right) - \left( \hat{\mu}_g - \hat{\mu}_h \right) \right]^2 \qquad (1)$$

For (weighted) student group distribution percentages, the second term on the right-hand side is equal to 0. Hence,

$$SE^2_{|g-h|} = \sum_{r=1}^{R} \left[ \left( \hat{\mu}_{g,1,r} - \hat{\mu}_{h,1,r} \right) - \left( \hat{\mu}_{g,1} - \hat{\mu}_{h,1} \right) \right]^2 \qquad (2)$$

For proportions associated with achievement levels as the independent variable, equation (1) applies, where

$\hat{\mu}_g$ is the average subgroup percentage across the plausible values for group g, as reported as the statistic in the table,

$\hat{\mu}_{g,m}$ is the percentage for plausible value m, and

$\hat{\mu}_{g,1,r}$ is the percentage for group g, based on the first plausible value and jackknife replicate weight r.
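Equations (1) and (2) can be sketched in a single function. The input layout is an assumption made for illustration: per-plausible-value estimates and first-plausible-value replicate estimates are passed as plain lists, and the function name `se_squared_difference` is hypothetical.

```python
def se_squared_difference(pv_g, pv_h, rep_g, rep_h, use_plausible_values=True):
    """Squared SE of the difference g - h for dependent groups.

    pv_g, pv_h   : full-sample estimates per plausible value (length M).
    rep_g, rep_h : replicate estimates based on the first plausible value
                   (length R).
    use_plausible_values=True gives equation (1); False gives equation (2).
    """
    if len(pv_g) != len(pv_h) or len(rep_g) != len(rep_h):
        raise ValueError("inputs for g and h must have matching lengths")

    # Jackknife term: replicate differences around the first-PV difference.
    diff_first = pv_g[0] - pv_h[0]
    jackknife = sum(((rg - rh) - diff_first) ** 2
                    for rg, rh in zip(rep_g, rep_h))
    if not use_plausible_values:
        return jackknife

    # Plausible value term: per-PV differences around the mean difference.
    m_count = len(pv_g)
    if m_count < 2:
        raise ValueError("equation (1) needs at least two plausible values")
    diff_mean = sum(pv_g) / m_count - sum(pv_h) / m_count
    between = sum(((a - b) - diff_mean) ** 2 for a, b in zip(pv_g, pv_h))
    return jackknife + (1.0 + 1.0 / m_count) / (m_count - 1) * between


# Tiny illustrative input (M = 2, R = 2); real data would have more of each.
pv_g, pv_h = [10.0, 12.0], [8.0, 8.0]
rep_g, rep_h = [11.0, 9.0], [8.0, 8.0]
print(se_squared_difference(pv_g, pv_h, rep_g, rep_h))         # 5.0
print(se_squared_difference(pv_g, pv_h, rep_g, rep_h, False))  # 2.0
```

In this toy input the jackknife term is 2.0 and the plausible value term is 1.5 × 2.0 = 3.0, so equation (1) gives 5.0 and equation (2) gives 2.0.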

 

Degrees of freedom – dependent via differences

For differences between dependent groups, the degrees of freedom follow from the usual computation, except that no pooling is conducted. Hence,

$$df_{|g-h|} = c \, \frac{\left( \sum_{r=1}^{R} \left[ \left( \hat{\mu}_{g,1,r} - \hat{\mu}_{h,1,r} \right) - \left( \hat{\mu}_{g,1} - \hat{\mu}_{h,1} \right) \right]^2 \right)^2}{\sum_{r=1}^{R} \left[ \left( \hat{\mu}_{g,1,r} - \hat{\mu}_{h,1,r} \right) - \left( \hat{\mu}_{g,1} - \hat{\mu}_{h,1} \right) \right]^4},$$

where c is the usual Johnson-Rust correction, defined as

$$c = 3.16 - \frac{2.77}{\sqrt{m}},$$

and m here is the number of jackknife replicates (not the plausible value index), which is currently set at 62 for all comparisons in this correction factor.
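The degrees of freedom computation can be sketched as below. The function names and the list-based inputs are hypothetical; the default of 62 replicates in the correction factor follows the text above.

```python
import math


def johnson_rust_correction(n_replicates: int = 62) -> float:
    """Correction factor c = 3.16 - 2.77 / sqrt(m), m = number of replicates."""
    return 3.16 - 2.77 / math.sqrt(n_replicates)


def df_difference(rep_g, rep_h, est_g_first, est_h_first, n_replicates=62):
    """Degrees of freedom for the difference g - h of dependent groups.

    rep_g, rep_h             : replicate estimates, first plausible value.
    est_g_first, est_h_first : full-sample estimates, first plausible value.
    """
    if len(rep_g) != len(rep_h):
        raise ValueError("replicate lists must have the same length")
    diff_first = est_g_first - est_h_first
    deviations = [(rg - rh) - diff_first for rg, rh in zip(rep_g, rep_h)]
    numerator = sum(d ** 2 for d in deviations) ** 2
    denominator = sum(d ** 4 for d in deviations)
    if denominator == 0.0:
        raise ValueError("all replicate deviations are zero")
    return johnson_rust_correction(n_replicates) * numerator / denominator


# Toy input with two replicates; the uncorrected ratio is (1 + 1)^2 / (1 + 1) = 2.
print(df_difference([11.0, 9.0], [8.0, 8.0], 10.0, 8.0))
```

If all replicate deviations have the same magnitude, the uncorrected ratio equals the number of replicates supplied, which gives a quick sanity check on an implementation.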

Application to statistics

The table below summarizes which components factor into the standard error computation and, thus, whether equation (1) or (2) should be used.

| Statistic | Plausible Values | Jackknife | Equation |
| --- | --- | --- | --- |
| Mean | Yes | Yes | (1) |
| Achievement Level | Yes | Yes | (1) |
| Percentile | Yes | Yes | (1) |
| Subgroup percentage | No | Yes | (2) |
| Achievement level as independent variable, yielding a subgroup percentage | Yes | Yes | (1) |

 

For more information, see the NAEP Technical Documentation at http://nces.ed.gov/nationsreportcard/tdw/analysis/infer.asp.