Variances for NAEP assessment estimates are computed using the jackknife replicate variance procedure. This technique is applicable for common statistics, such as means and ratios, as well as for more complex statistics such as Item Response Theory (IRT) scores.
In general, the jackknife replicate variance procedure involves pairing clusters of first-stage sampling units to form H variance strata (h = 1, 2, 3, ..., H) with two units per stratum. The first replicate is formed by deleting one unit at random from the first variance stratum, inflating the weight of the remaining unit to weight up to the variance stratum total, and using all other units from the other (H-1) strata. This procedure is carried out for each variance stratum, resulting in H replicates, each of which provides an estimate of the population total.
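The jackknife estimate of the variance for any given statistic is given by the following formula:
\[ v(\hat{t}) = \sum_{h=1}^{H} \left( \hat{t}_{h} - \hat{t} \right)^{2} \]
where \hat{t} is the full-sample estimate of the given statistic and \hat{t}_{h} is the corresponding estimate for replicate h.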
Each replicate undergoes the same weighting procedure as the full sample so that the jackknife variance estimator reflects the contributions to or reductions in variance resulting from the various weighting adjustments.
The NAEP jackknife variance estimator is based on 62 variance strata resulting in a set of 62 replicate weights assigned to each school and student.
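The paired-unit procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not NAEP's production weighting code: the function names (`weighted_mean`, `paired_replicate_weights`, `jackknife_variance`), the six-unit toy data, and the choice of which unit to delete in each variance stratum are all hypothetical, and only three variance strata are used instead of 62.

```python
def weighted_mean(values, weights):
    """Weighted mean of the values under the given weights."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


def paired_replicate_weights(weights, strata, dropped):
    """Build one replicate weight vector per variance stratum.

    weights : full-sample weights, one per unit
    strata  : variance stratum label of each unit (two units per stratum)
    dropped : maps each stratum label to the index of the unit to delete
    """
    replicates = {}
    for h, drop_index in dropped.items():
        rep = list(weights)
        members = [i for i, s in enumerate(strata) if s == h]
        stratum_total = sum(weights[i] for i in members)
        kept = [i for i in members if i != drop_index]
        kept_total = sum(weights[i] for i in kept)
        # Delete one unit and inflate the remaining unit(s) so that the
        # replicate weights still sum to the variance stratum total.
        rep[drop_index] = 0.0
        for i in kept:
            rep[i] = weights[i] * stratum_total / kept_total
        replicates[h] = rep
    return replicates


def jackknife_variance(full_estimate, replicate_estimates):
    """Sum of squared deviations of replicate estimates from the full estimate."""
    return sum((t_h - full_estimate) ** 2 for t_h in replicate_estimates)


# Hypothetical toy data: three variance strata with two units each.
values = [10.0, 14.0, 20.0, 22.0, 30.0, 36.0]
weights = [1.0] * 6
strata = [1, 1, 2, 2, 3, 3]
dropped = {1: 0, 2: 2, 3: 4}  # arbitrary choice of the deleted unit

full = weighted_mean(values, weights)
reps = paired_replicate_weights(weights, strata, dropped)
rep_estimates = [weighted_mean(values, w) for w in reps.values()]
v_jack = jackknife_variance(full, rep_estimates)
print(full, v_jack)
```

In this toy setting the strata have equal weight and two units each, so the result coincides with the usual stratified variance estimate for the mean, which is the property the development below establishes in general.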
The basic idea of the jackknife variance estimator is to create the replicate weights so that the jackknife procedure yields an unbiased variance estimator for simple totals and means that is also reasonably efficient (i.e., has a low variance as a variance estimator). The jackknife variance estimator will then produce a consistent (but not fully unbiased) estimate of variance for sufficiently smooth nonlinear functions of total and mean estimates, such as ratios and regression coefficients (Shao and Tu, 1995). The development below shows why the NAEP jackknife variance estimator returns an unbiased variance estimator for totals and means, which is the cornerstone of the asymptotic results for nonlinear estimators. See, for example, Rust (1985), which also discusses why this variance estimator is generally efficient (i.e., more reliable than alternative approaches requiring similar computational resources).
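The development will be done for an estimate of a mean based on a simplified sample design which closely approximates the sample design for first-stage units used in the NAEP studies. The sample design is a stratified random sample with H strata with population weights W_{h}, stratum sample sizes n_{h}, and stratum sample means \bar{y}_{h}. The population estimator and standard unbiased variance estimator are:
\[ \hat{\bar{y}} = \sum_{h=1}^{H} W_{h} \bar{y}_{h}, \qquad v(\hat{\bar{y}}) = \sum_{h=1}^{H} W_{h}^{2} \frac{s_{h}^{2}}{n_{h}} \]
with
\[ s_{h}^{2} = \frac{1}{n_{h} - 1} \sum_{i=1}^{n_{h}} \left( y_{hi} - \bar{y}_{h} \right)^{2} \]
where y_{hi} is the value for the i-th sampled unit in stratum h.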
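The jackknife replicate variance estimator assigns one replicate h=1,…,H to each stratum, so that the number of replicates equals H. In NAEP, the replicates correspond generally to doublets and triplets (with the latter only being used if there are an odd number of sample units within a particular hard boundary generating replicate strata). For doublets, the process of generating replicates can be viewed as taking a simple random sample of size n_{h}/2 within the replicate stratum, and assigning a doubled weight to the sampled elements, and zero weight to the unsampled elements. In this simplified context of stratified random sampling, this assignment reduces to replacing \bar{y}_{r} with \bar{y}_{r}(r), the latter being the sample mean of the sampled n_{r}/2 units. The replicate estimator corresponding to stratum r is
\[ \hat{\bar{y}}(r) = \sum_{h \neq r} W_{h} \bar{y}_{h} + W_{r} \bar{y}_{r}(r) \]
The r-th term in the sum of squares for the jackknife variance estimator v_{J}(\hat{\bar{y}}) = \sum_{r=1}^{H} ( \hat{\bar{y}}(r) - \hat{\bar{y}} )^{2} is thus:
\[ \left( \hat{\bar{y}}(r) - \hat{\bar{y}} \right)^{2} = W_{r}^{2} \left( \bar{y}_{r}(r) - \bar{y}_{r} \right)^{2} \]
In stratified random sampling, when a sample of size n_{r}/2 is drawn without replacement from a population of size n_{r}, the sampling variance of the half-sample mean \bar{y}_{r}(r) about the full stratum sample mean \bar{y}_{r} is
\[ E\left[ \left( \bar{y}_{r}(r) - \bar{y}_{r} \right)^{2} \right] = \frac{1}{n_{r}/2} \left( 1 - \frac{n_{r}/2}{n_{r}} \right) s_{r}^{2} = \frac{s_{r}^{2}}{n_{r}} \]
See for example Cochran (1977), Theorem 5.3, using n_{r} as the “population size”, n_{r}/2 as the “sample size”, and s_{r}^{2} as the “population variance” in the given formula. Thus
\[ E\left[ W_{r}^{2} \left( \bar{y}_{r}(r) - \bar{y}_{r} \right)^{2} \right] = W_{r}^{2} \frac{s_{r}^{2}}{n_{r}} \]
Taking the expectation over all of these stratified samples of size n_{r}/2, it is found that
\[ E\left[ v_{J}(\hat{\bar{y}}) \right] = \sum_{r=1}^{H} W_{r}^{2} \frac{s_{r}^{2}}{n_{r}} = v(\hat{\bar{y}}) \]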
In this sense, the jackknife variance estimator “gives back” the sample variance estimator for means and totals as desired under the theory. In practice, random selection is not done in each replicate stratum, but units are instead assigned systematically (the first, third, etc.). Replicate strata are also grouped to make sure that the number of replicates is not too large (the replicate total is usually 62 for NAEP surveys). The randomization from the original sample distribution guarantees that the sum of squares contributed by each replicate will be close to the target expected value (rather than much larger or much smaller).
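The doublet result can be checked numerically for a single replicate stratum. The sketch below enumerates every possible half-sample and averages the squared deviation of the half-sample mean from the full stratum mean; the result should equal s_{r}^{2}/n_{r}. The helper name `mean_squared_deviation` and the six data values are hypothetical, and equal base weights within the stratum are assumed.

```python
from itertools import combinations
from statistics import mean, variance


def mean_squared_deviation(values, subsample_size):
    """Average of (subsample mean - full mean)^2 over all subsamples
    of the given size drawn without replacement."""
    full_mean = mean(values)
    squared = [
        (mean(subset) - full_mean) ** 2
        for subset in combinations(values, subsample_size)
    ]
    return mean(squared)


# Hypothetical values for the n_r = 6 sampled units in one replicate stratum.
y_r = [3.0, 7.0, 8.0, 12.0, 15.0, 21.0]
n_r = len(y_r)

# Expectation over all half-samples of size n_r / 2.
half_sample_expectation = mean_squared_deviation(y_r, n_r // 2)

# Target: sample variance (n_r - 1 denominator) divided by n_r.
target = variance(y_r) / n_r

print(half_sample_expectation, target)
```

Multiplying both quantities by W_{r}^{2} and summing over replicate strata gives the equality between the expected jackknife estimator and the standard stratified variance estimator.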
For triplets, the NAEP weighting contractor assigns two sets of replicate weights for replicate stratum r: r_{1} and r_{2} (which are then usually grouped with other doublets and/or triplets). Note that r_{1} is always equal to r, with r_{2} being another replicate (“far away” from the first replicate). The replicate stratum r_{1} is partitioned into three equal-sized replicate units, with the following replicate weight assignments for the two replicates:
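\[ w_{i}(r_{1}) = \begin{cases} 1.5\, w_{i}, & i \in U_{1} \\ 1.5\, w_{i}, & i \in U_{2} \\ 0, & i \in U_{3} \end{cases} \qquad w_{i}(r_{2}) = \begin{cases} 1.5\, w_{i}, & i \in U_{1} \\ 0, & i \in U_{2} \\ 1.5\, w_{i}, & i \in U_{3} \end{cases} \]
where w_{i} is the full sample base weight, and U_{1}, U_{2}, and U_{3} denote the three replicate units.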
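In the case of stratified random sampling, this formula reduces to replacing \bar{y}_{r} with \bar{y}_{r}(r_{1}) for replicate r_{1}, where \bar{y}_{r}(r_{1}) is the sample mean from a “2/3” sample of 2n_{r}/3 units from the n_{r} sample units in the replicate stratum, and replacing \bar{y}_{r} with \bar{y}_{r}(r_{2}) for replicate r_{2}, where \bar{y}_{r}(r_{2}) is the sample mean from another overlapping “2/3” sample of 2n_{r}/3 units from the n_{r} sample units in the replicate stratum.
The r_{1}-th and r_{2}-th replicates can be written:
\[ \hat{\bar{y}}(r_{1}) = \sum_{h \neq r} W_{h} \bar{y}_{h} + W_{r} \bar{y}_{r}(r_{1}), \qquad \hat{\bar{y}}(r_{2}) = \sum_{h \neq r} W_{h} \bar{y}_{h} + W_{r} \bar{y}_{r}(r_{2}) \]
From these formulas, expressions for the r_{1}-th and r_{2}-th components of the jackknife variance estimator are obtained (ignoring other sums of squares from other grouped components attached to those replicates):
\[ \left( \hat{\bar{y}}(r_{1}) - \hat{\bar{y}} \right)^{2} = W_{r}^{2} \left( \bar{y}_{r}(r_{1}) - \bar{y}_{r} \right)^{2}, \qquad \left( \hat{\bar{y}}(r_{2}) - \hat{\bar{y}} \right)^{2} = W_{r}^{2} \left( \bar{y}_{r}(r_{2}) - \bar{y}_{r} \right)^{2} \]
These sums of squares have expectations as follows, using the general formula for sampling variances:
\[ E\left[ W_{r}^{2} \left( \bar{y}_{r}(r_{1}) - \bar{y}_{r} \right)^{2} \right] = E\left[ W_{r}^{2} \left( \bar{y}_{r}(r_{2}) - \bar{y}_{r} \right)^{2} \right] = W_{r}^{2} \frac{1}{2n_{r}/3} \left( 1 - \frac{2n_{r}/3}{n_{r}} \right) s_{r}^{2} = W_{r}^{2} \frac{s_{r}^{2}}{2 n_{r}} \]
Thus,
\[ E\left[ W_{r}^{2} \left( \bar{y}_{r}(r_{1}) - \bar{y}_{r} \right)^{2} + W_{r}^{2} \left( \bar{y}_{r}(r_{2}) - \bar{y}_{r} \right)^{2} \right] = W_{r}^{2} \frac{s_{r}^{2}}{n_{r}} \]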
as desired again.
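The triplet case can be checked the same way. Under the assumption that the partition into three equal-sized replicate units is random and that base weights are equal within the stratum, each “2/3” sample is a random subset of 2n_{r}/3 units, so the sketch below averages over all such subsets. The helper name `mean_squared_deviation` and the data values are hypothetical.

```python
from itertools import combinations
from statistics import mean, variance


def mean_squared_deviation(values, subsample_size):
    """Average of (subsample mean - full mean)^2 over all subsamples
    of the given size drawn without replacement."""
    full_mean = mean(values)
    squared = [
        (mean(subset) - full_mean) ** 2
        for subset in combinations(values, subsample_size)
    ]
    return mean(squared)


# Hypothetical values for the n_r = 6 sampled units in one triplet stratum.
y_r = [2.0, 5.0, 6.0, 9.0, 13.0, 19.0]
n_r = len(y_r)

# Expected squared deviation contributed by ONE of the two replicates,
# each of which keeps a "2/3" sample of 2 * n_r / 3 units.
per_replicate = mean_squared_deviation(y_r, 2 * n_r // 3)

# The two replicates r_1 and r_2 together should recover s_r^2 / n_r.
both_replicates = 2 * per_replicate
target = variance(y_r) / n_r

print(per_replicate, both_replicates, target)
```

Each replicate contributes half of the target in expectation, which is why two replicates are assigned to a triplet stratum.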