Conditional probabilities were assigned to control overlap between samples while maintaining the desired probabilities of selection for each sample individually. This was done using a technique called Keyfitzing. The original reference for Keyfitzing is Keyfitz (1951). Rust and Johnson (1992) discuss the method in its NAEP application. The desired probabilities of selection for the schools in the TUDA district samples were the probabilities derived from the measures of size. These are called $\pi_D$ below (for probability in TUDA district sample).¹ Three cases define the sampling status, which governed the conditional probability of selection into the TUDA sample.
School sampled in α sample. In this case conditional probability was increased.
School sampled in β sample and not α sample. In this case conditional probability was decreased.
School not sampled in either α or β sample.
The goal was to maximize the overlap between TUDA district NAEP and α-sampled schools while minimizing the overlap between TUDA district NAEP and β-sampled schools. The discussion below describes the logic in setting the conditional probabilities to achieve these goals while still achieving the desired unconditional probabilities $\pi_D$.
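The desired probability of the school being selected for district NAEP can be written as

$$\pi_D = P(\alpha)\,X + P(\beta, \text{not } \alpha)\,Y + P(\text{neither})\,Z,$$

where $P(\alpha)$, $P(\beta, \text{not } \alpha)$, and $P(\text{neither})$ are the probabilities of the three sampling statuses (which sum to 1), and $X$, $Y$, and $Z$ are the conditional probabilities of selection for district NAEP given each status.
To recap, it was necessary to select a school for TUDA district NAEP with probability $\pi_D$ while maximizing X and minimizing Y. As all the quantities are probabilities, they are restricted to be between 0 and 1. The task of maximizing X and minimizing Y separates, by the algebra, into three cases based on the interrelationships of $\pi_D$, $P(\alpha)$, and $P(\text{neither})$.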
For the first case, $\pi_D \le P(\alpha)$, Y can be set to 0 (its absolute minimum), and X can then be maximized by making Z as small as possible (0). Setting Y and Z to 0 gives

$$X = \frac{\pi_D}{P(\alpha)}.$$

Note that X is less than or equal to 1 when $\pi_D \le P(\alpha)$, explaining why this logic only works in this case.
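If $P(\alpha) < \pi_D \le P(\alpha) + P(\text{neither})$, Y can be set to 0 (its best value) and X can be set to its largest value, 1 (its best value), with the following equation for Z:

$$Z = \frac{\pi_D - P(\alpha)}{P(\text{neither})}.$$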
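As a quick numerical check of this second case, take the purely illustrative values $P(\alpha) = 0.3$, $P(\beta, \text{not } \alpha) = 0.5$, $P(\text{neither}) = 0.2$, and desired TUDA selection probability $\pi_D = 0.4$. These numbers are assumptions chosen for the example and do not come from the NAEP samples.

$$
\begin{aligned}
P(\alpha) = 0.3 &< \pi_D = 0.4 \le P(\alpha) + P(\text{neither}) = 0.5, \\
X &= 1, \qquad Y = 0, \\
Z &= \frac{0.4 - 0.3}{0.2} = 0.5, \\
P(\alpha)\,X + P(\beta, \text{not } \alpha)\,Y + P(\text{neither})\,Z &= 0.3(1) + 0.5(0) + 0.2(0.5) = 0.4 = \pi_D .
\end{aligned}
$$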
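If $\pi_D > P(\alpha)$ and $\pi_D > P(\alpha) + P(\text{neither})$, then Y can no longer be set to zero while also maintaining the constraints. X can be set equal to 1 (its best value), with the following result:

$$\pi_D = P(\alpha) + P(\beta, \text{not } \alpha)\,Y + P(\text{neither})\,Z.$$

Y can be minimized by making Z as large as possible. Setting Z to its maximum value of 1 gives

$$Y = \frac{\pi_D - P(\alpha) - P(\text{neither})}{P(\beta, \text{not } \alpha)}.$$

Since $P(\beta, \text{not } \alpha) = 1 - P(\alpha) - P(\text{neither})$, this simplifies to

$$Y = 1 - \frac{1 - \pi_D}{P(\beta, \text{not } \alpha)}.$$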
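Note that in this case $1 - \pi_D < P(\beta, \text{not } \alpha)$, which guarantees that the above expression for Y is between 0 and 1. The following table summarizes the results up to this point.
|Case||X (α sample school)||Y (β and not α sample school)||Z (neither α nor β sample school)|
|$\pi_D \le P(\alpha)$||$\pi_D / P(\alpha)$||0||0|
|$P(\alpha) < \pi_D \le P(\alpha) + P(\text{neither})$||1||0||$(\pi_D - P(\alpha)) / P(\text{neither})$|
|$\pi_D > P(\alpha) + P(\text{neither})$||1||$1 - (1 - \pi_D) / P(\beta, \text{not } \alpha)$||1|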
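The three cases derived above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production NAEP procedure: the function name `conditional_probs`, the `ConditionalProbs` tuple, and the argument names are hypothetical, and the three status probabilities are taken as given inputs that sum to 1.

```python
from typing import NamedTuple


class ConditionalProbs(NamedTuple):
    """Conditional TUDA selection probabilities given sampling status."""
    x: float  # given the school is in the alpha sample
    y: float  # given the school is in the beta sample but not alpha
    z: float  # given the school is in neither sample


def conditional_probs(pi_d: float, p_alpha: float,
                      p_beta_only: float, p_neither: float) -> ConditionalProbs:
    """Maximize X and minimize Y subject to
    pi_d = p_alpha * X + p_beta_only * Y + p_neither * Z.

    Hypothetical helper: names and input checks are assumptions
    made for this sketch.
    """
    if abs(p_alpha + p_beta_only + p_neither - 1.0) > 1e-9:
        raise ValueError("status probabilities must sum to 1")
    if not 0.0 <= pi_d <= 1.0:
        raise ValueError("pi_d must be a probability")

    if pi_d <= p_alpha:
        # Case 1: Y = Z = 0 and X absorbs the whole probability.
        x = pi_d / p_alpha if p_alpha > 0 else 0.0
        return ConditionalProbs(x, 0.0, 0.0)

    if pi_d <= p_alpha + p_neither:
        # Case 2: X = 1, Y = 0; p_neither > 0 because pi_d > p_alpha.
        return ConditionalProbs(1.0, 0.0, (pi_d - p_alpha) / p_neither)

    # Case 3: X = 1 and Z = 1; Y takes the remainder.
    if p_beta_only <= 0:
        raise ValueError("pi_d cannot be reached with these status probabilities")
    return ConditionalProbs(1.0, 1.0 - (1.0 - pi_d) / p_beta_only, 1.0)
```

For example, with status probabilities 0.3, 0.5, and 0.2 and a desired probability of 0.7, the third case applies: X = 1, Z = 1, and Y is approximately 0.4, which recovers 0.3 + 0.5(0.4) + 0.2 = 0.7.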
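The expression for $P(\beta, \text{not } \alpha)$ can be further simplified by observing that

$$P(\beta, \text{not } \alpha) = \pi_\beta - P(\alpha \text{ and } \beta),$$

where $\pi_\alpha = P(\alpha)$ and $\pi_\beta$ denote the school's probabilities of selection for the α and β samples.
Except for new or newly eligible schools, the β sample was selected to minimize overlap with the α sample and

$$P(\alpha \text{ and } \beta) = \max(0,\ \pi_\alpha + \pi_\beta - 1).$$

Therefore it follows that

$$P(\beta, \text{not } \alpha) = \min(\pi_\beta,\ 1 - \pi_\alpha).$$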
The complete solution is given by the following table.
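For new or newly eligible schools the α and β samples were selected independently and

$$P(\alpha \text{ and } \beta) = \pi_\alpha\,\pi_\beta,$$

where $\pi_\alpha$ and $\pi_\beta$ are the school's probabilities of selection for the α and β samples, so that $P(\beta, \text{not } \alpha) = \pi_\beta\,(1 - \pi_\alpha)$.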
For these schools the complete solution is given by the following table.
The table below summarizes the formulas for assigning conditional probabilities.
|Alpha sample school||Beta and not alpha sample school||Neither alpha nor beta sample school|
|CCD school and and and||1||1|
|CCD school and and||1||1|
|CCD school and and and||1||0|
|New school and and||1||1|
|New school and and||1||0|
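The two overlap rules described above (minimum overlap for existing schools, independent selection for new or newly eligible schools) determine the three sampling-status probabilities. The sketch below is illustrative only: the function name `status_probs` and its arguments are hypothetical, and it assumes that "minimized overlap" means the joint α and β probability attains its smallest feasible value, max(0, π_α + π_β − 1).

```python
from typing import Tuple


def status_probs(pi_alpha: float, pi_beta: float,
                 new_school: bool) -> Tuple[float, float, float]:
    """Return (P(alpha), P(beta and not alpha), P(neither)).

    Hypothetical helper for illustration; pi_alpha and pi_beta are the
    school's selection probabilities for the alpha and beta samples.
    """
    if new_school:
        # New or newly eligible schools: samples selected independently.
        p_both = pi_alpha * pi_beta
    else:
        # Assumption: overlap is fully minimized, so the joint probability
        # is the smallest value compatible with the two marginals.
        p_both = max(0.0, pi_alpha + pi_beta - 1.0)

    p_beta_only = pi_beta - p_both
    # Clamp at 0 to absorb floating-point rounding.
    p_neither = max(0.0, 1.0 - pi_alpha - p_beta_only)
    return pi_alpha, p_beta_only, p_neither
```

With independent selection and both probabilities equal to 0.5, each of the joint, β-only, and neither probabilities is 0.25. Under minimized overlap with the same inputs, the joint probability is 0, so the β-only probability is 0.5 and the neither probability is 0.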
¹ A desired number of expected hits is actually computed, which equals a probability of selection when expected hits is less than 1. $\pi_D$ is the minimum of 1 and this expected hits total. If expected hits was greater than 1 and the conditional probability was less than 1, then the school was selected using the conditional probability and had a maximum of one hit. If expected hits was greater than 1 and the conditional probability was equal to 1, then the school was selected with certainty, and was sampled for hits using the expected hits total.