﻿ Confidence Intervals: NDE Statistical Specification

Statistical testing is based on confidence intervals. In the calculation of confidence intervals, a separate procedure is required for each of four types of weighted statistics: means, student group distribution proportions, achievement level proportions, and percentiles.

Means
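For weighted means, denoted as $\bar{x}$, the confidence interval takes the form:

$$\bar{x} \pm t_{df} \cdot SE(\bar{x})$$

where $t_{df}$ is the 97.5th quantile of the t-distribution with degrees of freedom $df$, estimated following the usual formula of

$$df = \frac{\left(\sum_{r} (\bar{x}_{r1} - \bar{x}_{1})^2\right)^2}{\sum_{r} (\bar{x}_{r1} - \bar{x}_{1})^4}$$

where $\bar{x}_{r1}$ denotes the mean based on replicate weight r and the index 1 denotes that the first plausible value is used for this computation. The Johnson-Rust adjustment needs to be applied to the df result.

Furthermore,

$$SE(\bar{x}) = \sqrt{\sum_{r} (\bar{x}_{r1} - \bar{x}_{1})^2 + \left(1 + \frac{1}{M}\right) \frac{1}{M-1} \sum_{m=1}^{M} (\bar{x}_{m} - \bar{x})^2}$$

where m denotes the plausible value, M is the number of plausible values, and $\bar{x}$ is the average over plausible values.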
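
The computation for means can be sketched in Python as follows. This is a minimal illustration, not the NDE implementation: it assumes the sampling variance is the sum of squared deviations of the replicate means (first plausible value) from the full-sample mean, and that the measurement variance is the between-plausible-value variance inflated by (1 + 1/M). The function and argument names are hypothetical. The Johnson-Rust adjustment is deliberately left out, and the t quantile is passed in by the caller (for example from `scipy.stats.t.ppf(0.975, df)`).

```python
import math

def mean_se_and_df(rep_means_pv1, full_mean_pv1, pv_means):
    """Return (mean, standard error, unadjusted df) for a weighted mean.

    rep_means_pv1: means under each replicate weight, first plausible value
    full_mean_pv1: full-sample mean for the first plausible value
    pv_means:      full-sample mean for each plausible value
    """
    diffs = [r - full_mean_pv1 for r in rep_means_pv1]
    sampling_var = sum(d ** 2 for d in diffs)
    sum_fourth = sum(d ** 4 for d in diffs)
    # Unadjusted degrees of freedom; undefined if all replicates agree.
    df = sampling_var ** 2 / sum_fourth if sum_fourth > 0 else float("nan")

    m = len(pv_means)
    mean = sum(pv_means) / m
    between = sum((x - mean) ** 2 for x in pv_means) / (m - 1)
    measurement_var = (1 + 1 / m) * between  # assumed imputation component

    se = math.sqrt(sampling_var + measurement_var)
    return mean, se, df

def mean_ci(mean, se, t_crit):
    """Symmetric interval mean +/- t_crit * se; t_crit is supplied by the caller."""
    return mean - t_crit * se, mean + t_crit * se
```

A caller would first adjust `df` (Johnson-Rust), look up the 97.5th quantile of the t-distribution for the adjusted value, and then pass that quantile to `mean_ci`.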

Student group proportions

The method of choice for proportions is the method derived by Wilson. Wilson's approach takes the following (asymmetric) form:

$$\frac{\hat{p} + \dfrac{t_{df}^2}{2 n_{eff}} \pm t_{df} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_{eff}} + \dfrac{t_{df}^2}{4 n_{eff}^2}}}{1 + \dfrac{t_{df}^2}{n_{eff}}}$$

There are three variables in this equation:

$\hat{p}$ is the estimated proportion,
$n_{eff}$ is the effective sample size, which is computed as
$$n_{eff} = \frac{n}{deff} = \frac{\hat{p}(1-\hat{p})}{\sum_{r} (\hat{p}_{r} - \hat{p})^2}$$
where $deff = \sum_{r} (\hat{p}_{r} - \hat{p})^2 \big/ \left(\hat{p}(1-\hat{p})/n\right)$ is the design effect, $\hat{p}_{r}$ is the proportion based on replicate weight r, and
n is the weighted sample size of the sample (NOT the population estimate). When the proportion observed in the sample is 0, the limits have to be evaluated. Since the denominator is a squared term, it reaches 0 more quickly than the numerator and, thus, the effective sample size becomes infinite. Hence, an additional restriction is imposed, namely $n_{eff} \le n$, which means that in very small samples the design effect is 1. The logic is that in very small samples students are approximately randomly distributed. Empirically, it can be verified that in relatively small samples the design effect is close to 1 unless a specific clustering exists.

$t_{df}$ is the 97.5th quantile of the t-distribution with df degrees of freedom.

NOTE that the degrees of freedom are undefined when the proportion is zero: inspecting the limits, the denominator goes to zero faster than the numerator. Instead, a t-distribution with one degree of freedom may be chosen, i.e., $df = 1$.
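
The Wilson interval with a capped effective sample size can be sketched in Python as follows. This is an illustration under stated assumptions rather than the NDE code: the effective sample size is taken to be p(1 - p) divided by the jackknife variance (sum of squared deviations of the replicate proportions), it is capped at n so that the design effect is never below 1, and the t quantile is supplied by the caller. The function names are hypothetical.

```python
import math

def effective_n(p_hat, rep_props, n):
    """Effective sample size, capped at the weighted sample size n."""
    jk_var = sum((p_r - p_hat) ** 2 for p_r in rep_props)
    if jk_var == 0.0:
        # Zero variance (e.g. p_hat == 0): the ratio is unbounded, so use the cap.
        return float(n)
    return min(p_hat * (1 - p_hat) / jk_var, float(n))

def wilson_interval(p_hat, n_eff, t_crit):
    """Asymmetric Wilson interval using a t quantile instead of a z quantile."""
    t2 = t_crit ** 2
    center = p_hat + t2 / (2 * n_eff)
    half = t_crit * math.sqrt(p_hat * (1 - p_hat) / n_eff + t2 / (4 * n_eff ** 2))
    denom = 1 + t2 / n_eff
    return (center - half) / denom, (center + half) / denom
```

For an observed proportion of 0 the lower limit is 0 and the upper limit is strictly positive, which is the asymmetry that motivates using Wilson's form for small proportions. In that case the caller would pass the quantile of the t-distribution with one degree of freedom.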

Achievement level proportions

For achievement level proportions the same procedure as above is followed, except that the standard error also has to take into account the variance due to measurement. This component can be easily added to the design effect to decrease the effective sample size and increase the variation accordingly. Specifically, this component is

$$\left(1 + \frac{1}{M}\right) \frac{1}{M-1} \sum_{m=1}^{M} (\hat{p}_{m} - \hat{p})^2$$

where $\hat{p}_{m}$ is the proportion estimate based on the mth plausible value, M is the number of plausible values, and the average of the $\hat{p}_{m}$ is the estimated proportion $\hat{p}$.

This component is expected to be very small since the proportion is a summary statistic, and summary statistics are generally quite stable across plausible values unless a particularly small group is queried. The effective n is:

$$n_{eff} = \frac{\hat{p}(1-\hat{p})}{\sum_{r} (\hat{p}_{r1} - \hat{p}_{1})^2 + \left(1 + \dfrac{1}{M}\right) \dfrac{1}{M-1} \sum_{m=1}^{M} (\hat{p}_{m} - \hat{p})^2}$$

where r is the index for the replicate weight and 1 denotes that the first plausible value is used. The formula for the interval is the same as that for student group proportions, including the adjustments for minimum design effects and degrees of freedom.
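
A short Python sketch of the effective sample size with the measurement component added is given below. It is an illustration only and assumes that the total variance is the jackknife variance for the first plausible value plus the between-plausible-value variance inflated by (1 + 1/M), and that the result is capped at n as for student group proportions. The function and argument names are hypothetical.

```python
def effective_n_with_measurement(rep_props_pv1, p_pv1, pv_props, n):
    """Return (p_hat, n_eff) for an achievement level proportion.

    rep_props_pv1: proportions under each replicate weight, first plausible value
    p_pv1:         full-sample proportion for the first plausible value
    pv_props:      full-sample proportion for each plausible value
    n:             weighted sample size, used as the cap on n_eff
    """
    m = len(pv_props)
    p_hat = sum(pv_props) / m  # estimate is the average over plausible values

    sampling_var = sum((p_r - p_pv1) ** 2 for p_r in rep_props_pv1)
    between = sum((p_m - p_hat) ** 2 for p_m in pv_props) / (m - 1)
    measurement_var = (1 + 1 / m) * between  # assumed measurement component

    total_var = sampling_var + measurement_var
    if total_var == 0.0:
        return p_hat, float(n)
    return p_hat, min(p_hat * (1 - p_hat) / total_var, float(n))
```

Because the measurement component only ever adds to the denominator, the effective sample size it yields is never larger than the one based on sampling variance alone, which in turn widens the resulting Wilson interval.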

Percentiles

Confidence intervals for percentiles can be computed using many of the same techniques as above. This approach is somewhat different from the current approach.

1. Both standard error components can be found by locating the student who is exactly at the pth percentile (or by using the usual extrapolation if no such student exists) and finding this student's proportion (when students are ranked) across replicate weights and plausible values.

2. Then, a first lower and upper bound can be found that account for measurement.

3. Subsequently, the Wilson formula can be applied in the same way as for achievement level proportions.

4. After finding the final lower and upper bound for the proportion, the average plausible value can be used to translate these bounds into bounds on the percentile scale.

Again, the same adjustments for small proportions are used, although these will usually not be an issue because the exact percentile is known (i.e., manipulated). Note that the weight of the student at the pth percentile may be zero for a particular replicate weight, in which case the student's proportion is equal to that of the student below him or her with a non-zero weight.
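
The final translation step (step 4) can be sketched in Python as follows. This is a simplified illustration, not the NDE procedure: it assumes each student has a single score equal to the average of his or her plausible values, maps a proportion to the smallest score whose cumulative weight share reaches it, and omits the extrapolation between students mentioned in step 1. Students with zero weight are skipped, so a zero-weight student never defines a bound. The function names are hypothetical.

```python
def weighted_quantile(scores, weights, p):
    """Smallest score whose cumulative share of the total weight reaches p."""
    pairs = sorted((s, w) for s, w in zip(scores, weights) if w > 0)
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for score, weight in pairs:
        cumulative += weight
        if cumulative / total >= p:
            return score
    return pairs[-1][0]

def percentile_bounds(scores, weights, p_lower, p_upper):
    """Translate proportion bounds (e.g. from the Wilson formula) into score bounds."""
    return (weighted_quantile(scores, weights, p_lower),
            weighted_quantile(scores, weights, p_upper))
```

Here `p_lower` and `p_upper` stand for the final lower and upper bounds on the proportion, and the returned pair gives the corresponding bounds for the percentile.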