NAEP Technical Documentation

Overlap Control of Twelfth-Grade School Sample With the Education Longitudinal Study

The Education Longitudinal Study (ELS) was another study conducted in U.S. schools with a twelfth grade. To minimize the burden on schools, the NAEP 2002 sample was designed to minimize its overlap with the ELS sample. This was done using a technique called 'Keyfitzing', described below and named after the author of the original paper (Keyfitz (1951)).

The twelfth-grade public school sample has unconditional probabilities of selection $\pi_s = \min(1, E_s)$, where $E_s$ is the expected number of 'hits' as computed in School Selection for Twelfth-Grade Public Schools. The procedure (a modification of that proposed by Keyfitz (1951)) is designed to ensure that these desired values are in fact the unconditional probabilities, while making the conditional probability higher if the school is not in the ELS sample and lower if the school is in the ELS sample.

Let $N$ denote the set of schools sampled into the NAEP sample and $E$ the set of schools sampled in the ELS sample. Let $\pi_s^E$ be the probability that school $s$ is in the ELS sample (determined by the ELS sample design).

By the law of total probability, the selection probability of a school in the NAEP twelfth-grade public school sample can be written as

$$\pi_s = P(s \in N \mid s \in E)\,\pi_s^E + P(s \in N \mid s \notin E)\,(1 - \pi_s^E)$$

The overlap between NAEP and ELS can be made small by setting $P(s \in N \mid s \in E) = 0$ where possible. Schools with $\pi_s + \pi_s^E < 1$ that are in the ELS sample are given no chance of selection in NAEP; that is, they are assigned a conditional selection probability of 0:

$$P(s \in N \mid s \in E) = 0$$

Schools with $\pi_s + \pi_s^E < 1$ that are not in the ELS sample receive a conditional selection probability of

$$P(s \in N \mid s \notin E) = \frac{\pi_s}{1 - \pi_s^E}$$

Note that under the condition $\pi_s + \pi_s^E < 1$, this is a well-defined probability (between 0 and 1). To a school with $\pi_s + \pi_s^E \ge 1$ that was in the ELS sample, the algorithm assigns

$$P(s \in N \mid s \in E) = \frac{\pi_s - 1 + \pi_s^E}{\pi_s^E}$$

Note again that this is a well-defined probability (between 0 and 1) under the condition $\pi_s + \pi_s^E \ge 1$. A school with $\pi_s + \pi_s^E \ge 1$ that was not in the ELS sample gets

$$P(s \in N \mid s \notin E) = 1$$
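The four cases above can be collected into a single function. The following is a minimal sketch, not the NAEP production code: the function name `conditional_naep_probability`, its parameter names (`pi_s`, `pi_e`, `in_els`), and the sample values are all assumptions made for illustration. Exact fractions are used so that the boundary case where the two probabilities sum to exactly 1 is not affected by floating-point rounding.

```python
from fractions import Fraction


def conditional_naep_probability(pi_s, pi_e, in_els):
    """Conditional probability of NAEP selection given a school's ELS status.

    pi_s   -- desired unconditional NAEP selection probability
    pi_e   -- probability that the school is in the ELS sample
    in_els -- True if the school was actually selected for ELS
    (All three names are hypothetical, chosen for this sketch.)
    """
    if not (0 <= pi_s <= 1 and 0 <= pi_e <= 1):
        raise ValueError("probabilities must lie in [0, 1]")
    if in_els and pi_e == 0:
        # A school with zero ELS probability cannot be in the ELS sample.
        raise ValueError("in_els is True but pi_e is 0")

    if pi_s + pi_e < 1:
        # Overlap can be avoided entirely: ELS schools get no chance of
        # selection, and non-ELS schools are scaled up to compensate.
        return 0 if in_els else pi_s / (1 - pi_e)

    # Otherwise some overlap is unavoidable: non-ELS schools are taken
    # with certainty, and ELS schools receive the remaining probability.
    return (pi_s - 1 + pi_e) / pi_e if in_els else 1


# Illustrative values only (not taken from the NAEP or ELS designs).
examples = [
    (Fraction(1, 5), Fraction(1, 2)),  # pi_s + pi_e < 1
    (Fraction(4, 5), Fraction(1, 2)),  # pi_s + pi_e >= 1
]
for pi_s, pi_e in examples:
    p_in = conditional_naep_probability(pi_s, pi_e, in_els=True)
    p_out = conditional_naep_probability(pi_s, pi_e, in_els=False)
    print(f"pi_s={pi_s}, pi_e={pi_e}: in ELS -> {p_in}, not in ELS -> {p_out}")
```

The sketch accepts ordinary floats as well, with the caveat that sums very close to 1 may then fall on either side of the case boundary.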

These formulas define the conditional probability of entering the NAEP sample in all cases. It can easily be shown that the conditional probabilities assigned in this way are consistent with the desired unconditional probabilities. In the first case, the unconditional selection probability in NAEP of a school with $\pi_s + \pi_s^E < 1$ is

$$P(s \in N) = P(s \in N \mid s \in E)\,\pi_s^E + P(s \in N \mid s \notin E)\,(1 - \pi_s^E) = 0 \cdot \pi_s^E + \frac{\pi_s}{1 - \pi_s^E}\,(1 - \pi_s^E) = \pi_s$$

as desired. In the second case, for a school with $\pi_s + \pi_s^E \ge 1$, we have

$$P(s \in N) = \frac{\pi_s - 1 + \pi_s^E}{\pi_s^E}\,\pi_s^E + 1 \cdot (1 - \pi_s^E) = \pi_s$$

also as desired.
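The two consistency checks can also be run numerically. The sketch below is illustrative only: the helper names `cond_prob`, `unconditional_probability`, and `overlap_probability` and the grid of test values are assumptions, not part of the NAEP procedure. It recombines the conditional probabilities with the total-probability formula and confirms that the result equals the desired unconditional probability at every grid point. It also checks a consequence of the same formulas: the probability that a school lands in both samples is max(0, pi_s + pi_e - 1), which never exceeds the product pi_s * pi_e that would apply if the two samples were drawn independently.

```python
from fractions import Fraction


def cond_prob(pi_s, pi_e, in_els):
    """Conditional NAEP selection probability (the four cases in the text)."""
    if pi_s + pi_e < 1:
        return 0 if in_els else pi_s / (1 - pi_e)
    return (pi_s - 1 + pi_e) / pi_e if in_els else 1


def overlap_probability(pi_s, pi_e):
    """P(school in both samples) = P(N | E) * pi_e."""
    if pi_e == 0:
        return 0  # the school can never be in ELS, so there is no overlap
    return cond_prob(pi_s, pi_e, True) * pi_e


def unconditional_probability(pi_s, pi_e):
    """Recombine the two conditional probabilities (law of total probability)."""
    return overlap_probability(pi_s, pi_e) + cond_prob(pi_s, pi_e, False) * (1 - pi_e)


# Exact fractions on a coarse grid; the values are purely illustrative.
grid = [Fraction(k, 10) for k in range(11)]
for pi_s in grid:
    for pi_e in grid:
        # The desired unconditional probability is preserved.
        assert unconditional_probability(pi_s, pi_e) == pi_s
        # Overlap is the smallest value compatible with both marginals ...
        assert overlap_probability(pi_s, pi_e) == max(0, pi_s + pi_e - 1)
        # ... and never larger than under independent selection.
        assert overlap_probability(pi_s, pi_e) <= pi_s * pi_e

print("all grid checks passed")
```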

A final step in the process is to handle schools that were certainty selections with one or more expected hits ($E_s \ge 1$). These have a conditional probability of 1 and are also assigned a conditional expected-hits value of $E_s$, so that they can receive multiple hits in the conditional sample as well (in other words, the expected-hits values of unconditional certainty schools are unaffected by the Keyfitzing process).
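The treatment of certainty schools can be sketched as a thin wrapper around the four-case rule. This is an illustration under stated assumptions, not the documented implementation: the function name `conditional_expected_hits` and its parameters are hypothetical, and the non-certainty branch assumes that a school with $E_s < 1$ has conditional expected hits equal to its conditional selection probability, which the text above does not state explicitly.

```python
from fractions import Fraction


def conditional_expected_hits(e_s, pi_e, in_els):
    """Conditional expected number of NAEP hits for one school.

    e_s    -- unconditional expected number of hits (E_s in the text)
    pi_e   -- probability that the school is in the ELS sample
    in_els -- True if the school was actually selected for ELS
    (All names are hypothetical, chosen for this sketch.)
    """
    if e_s >= 1:
        # Certainty school: conditional probability is 1 and the expected
        # hits are left unchanged, regardless of ELS status.
        return e_s

    # ASSUMPTION: for non-certainty schools, pi_s = E_s and the conditional
    # expected hits equal the conditional selection probability.
    pi_s = e_s
    if pi_s + pi_e < 1:
        return 0 if in_els else pi_s / (1 - pi_e)
    return (pi_s - 1 + pi_e) / pi_e if in_els else 1


# Illustrative values only.
print(conditional_expected_hits(Fraction(5, 2), Fraction(1, 2), in_els=True))
print(conditional_expected_hits(Fraction(3, 10), Fraction(1, 2), in_els=False))
```

Note that the certainty branch agrees with the general formulas: when $\pi_s = 1$, the sum $\pi_s + \pi_s^E$ is always at least 1, and both conditional probabilities evaluate to 1.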


Last updated 08 July 2008 (PE)
