
## Assignment of Keyfitzed Conditional Probabilities for the Trial Urban District Assessment School Sample

Conditional probabilities were assigned to control overlap between samples while maintaining the desired probabilities of selection for each sample individually. This was done using a technique called Keyfitzing, introduced by Keyfitz (1951); Rust and Johnson (1992) discuss the method in its NAEP application. The desired probabilities of selection for the schools in the TUDA district samples were the probabilities derived from the measures of size. These are called $P(D)$ below (for probability in TUDA district sample).¹ Three cases define the sampling status that governed the conditional probability of selection into the TUDA sample.

• School sampled in the alpha sample ($\alpha$). In this case, the conditional probability was increased.

• School sampled in the beta sample and not in the alpha sample ($\beta \wedge \bar{\alpha}$).² In this case, the conditional probability was decreased.

• School not sampled in either the alpha or the beta sample ($\bar{\alpha} \wedge \bar{\beta}$).

The goal was to maximize the overlap between TUDA district NAEP and α-sampled schools while minimizing the overlap between TUDA district NAEP and β-sampled schools. The discussion below describes the logic used to set the conditional probabilities to achieve these goals while still achieving the desired unconditional probabilities $P(D)$.

Let

• $P(\alpha)$ = the probability of the school being selected for the alpha sample,
• $P(\beta \wedge \bar{\alpha})$ = the probability of the school being selected for the beta sample and not the alpha sample, and
• $P(\bar{\alpha} \wedge \bar{\beta})$ = the probability of the school being selected for neither the alpha nor the beta samples.

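The desired probability of the school being selected for district NAEP can be written as

$$P(D) = X \, P(\alpha) + Y \, P(\beta \wedge \bar{\alpha}) + Z \, P(\bar{\alpha} \wedge \bar{\beta})$$

where

• $X = P(D \mid \alpha)$,
• $Y = P(D \mid \beta \wedge \bar{\alpha})$, and
• $Z = P(D \mid \bar{\alpha} \wedge \bar{\beta})$.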

To recap, it was necessary to select a school for TUDA district NAEP with probability $P(D)$ while maximizing X and minimizing Y. As all the quantities are probabilities, they are restricted to lie between 0 and 1. By the algebra, the task of maximizing X and minimizing Y separates into three cases based on the interrelationships of $P(D)$, $P(\alpha)$, and $P(\bar{\alpha} \wedge \bar{\beta})$.

For the first case, $P(D) \le P(\alpha)$, Y can be set to 0 (its absolute minimum), and X can then be maximized by making Z as small as possible (0). Setting Y and Z to 0 gives

$$X = \frac{P(D)}{P(\alpha)}.$$

Note that X is less than or equal to 1 exactly when $P(D) \le P(\alpha)$, which explains why this logic works only in this case.

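If $P(\alpha) < P(D) \le P(\alpha) + P(\bar{\alpha} \wedge \bar{\beta})$, Y can be set to 0 (its best value) and X can be set to 1 (its best value), leaving the following equation for Z:

$$P(D) = P(\alpha) + Z \, P(\bar{\alpha} \wedge \bar{\beta})$$

giving

$$Z = \frac{P(D) - P(\alpha)}{P(\bar{\alpha} \wedge \bar{\beta})}.$$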

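If $P(D) > P(\alpha)$ and $P(D) > P(\alpha) + P(\bar{\alpha} \wedge \bar{\beta})$, then Y can no longer be set to zero while maintaining the constraints. X can still be set equal to 1 (its best value), with the following result:

$$P(D) = P(\alpha) + Y \, P(\beta \wedge \bar{\alpha}) + Z \, P(\bar{\alpha} \wedge \bar{\beta}).$$

Y can be minimized by making Z as large as possible. Setting Z to its maximum value of 1 gives

$$Y = \frac{P(D) - P(\alpha) - P(\bar{\alpha} \wedge \bar{\beta})}{P(\beta \wedge \bar{\alpha})}.$$

Since $P(\bar{\alpha} \wedge \bar{\beta}) = 1 - P(\alpha) - P(\beta \wedge \bar{\alpha})$, this simplifies to

$$Y = 1 - \frac{1 - P(D)}{P(\beta \wedge \bar{\alpha})}.$$

Note that in this case $1 - P(D) < P(\beta \wedge \bar{\alpha})$, which guarantees that the above expression for Y is between 0 and 1. The following table summarizes the results up to this point.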

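| Condition | X | Y | Z |
|---|---|---|---|
| $P(D) \le P(\alpha)$ | $\dfrac{P(D)}{P(\alpha)}$ | 0 | 0 |
| $P(D) > P(\alpha)$ and $P(D) > P(\alpha) + P(\bar{\alpha} \wedge \bar{\beta})$ | 1 | $1 - \dfrac{1 - P(D)}{P(\beta \wedge \bar{\alpha})}$ | 1 |
| $P(D) > P(\alpha)$ and $P(D) \le P(\alpha) + P(\bar{\alpha} \wedge \bar{\beta})$ | 1 | 0 | $\dfrac{P(D) - P(\alpha)}{P(\bar{\alpha} \wedge \bar{\beta})}$ |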
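The three-case logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `conditional_probabilities` and its argument names are hypothetical, and the inputs are assumed to be valid probabilities for the three mutually exclusive sampling statuses (alpha; beta and not alpha; neither), which must sum to 1.

```python
from typing import Tuple


def conditional_probabilities(p_d: float, p_alpha: float,
                              p_beta_only: float,
                              p_neither: float) -> Tuple[float, float, float]:
    """Return the conditional probabilities (X, Y, Z) for one school.

    p_d         -- desired unconditional TUDA selection probability
    p_alpha     -- probability the school is in the alpha sample
    p_beta_only -- probability it is in the beta sample and not the alpha sample
    p_neither   -- probability it is in neither sample
    (All names are hypothetical; they are not taken from the source.)
    """
    if abs(p_alpha + p_beta_only + p_neither - 1.0) > 1e-9:
        raise ValueError("the three status probabilities must sum to 1")

    if p_d <= p_alpha:
        # Case 1: Y = Z = 0, and X alone delivers the desired probability.
        x = p_d / p_alpha if p_alpha > 0.0 else 0.0
        return x, 0.0, 0.0

    if p_d <= p_alpha + p_neither:
        # Case 2: X = 1, Y = 0, and Z makes up the remainder.
        return 1.0, 0.0, (p_d - p_alpha) / p_neither

    # Case 3: X = 1 and Z = 1 are not enough, so Y must be positive.
    return 1.0, 1.0 - (1.0 - p_d) / p_beta_only, 1.0
```

For example, with status probabilities 0.3, 0.2, and 0.5 and a desired probability of 0.9, the third case applies and Y comes out at about 0.5; multiplying each returned value by its status probability and summing recovers the desired 0.9, which is the defining constraint of the method.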

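The expression for $P(\beta \wedge \bar{\alpha})$ can be further simplified by observing that

$$P(\beta \wedge \bar{\alpha}) = P(\beta) - P(\alpha \wedge \beta),$$

where $P(\beta)$ is the probability of the school being selected for the beta sample.

Except for new or newly eligible schools, the β sample was selected to minimize overlap with the α sample and

$$P(\alpha \wedge \beta) = \max\{0,\; P(\alpha) + P(\beta) - 1\}.$$

Therefore it follows that

$$P(\beta \wedge \bar{\alpha}) = \min\{P(\beta),\; 1 - P(\alpha)\} \quad\text{and}\quad P(\bar{\alpha} \wedge \bar{\beta}) = \max\{0,\; 1 - P(\alpha) - P(\beta)\}.$$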

The complete solution is given by the following table.

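| Condition | X | Y | Z |
|---|---|---|---|
| $P(D) \le P(\alpha)$ | $\dfrac{P(D)}{P(\alpha)}$ | 0 | 0 |
| $P(D) > P(\alpha)$ and $P(\alpha) + P(\beta) \le 1$ and $P(D) > 1 - P(\beta)$ | 1 | $1 - \dfrac{1 - P(D)}{P(\beta)}$ | 1 |
| $P(D) > P(\alpha)$ and $P(\alpha) + P(\beta) > 1$ | 1 | $\dfrac{P(D) - P(\alpha)}{1 - P(\alpha)}$ | 1 |
| $P(D) > P(\alpha)$ and $P(\alpha) + P(\beta) \le 1$ and $P(D) \le 1 - P(\beta)$ | 1 | 0 | $\dfrac{P(D) - P(\alpha)}{1 - P(\alpha) - P(\beta)}$ |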

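For new or newly eligible schools the α and β samples were selected independently and

$$P(\alpha \wedge \beta) = P(\alpha)\,P(\beta), \quad\text{so that}\quad P(\beta \wedge \bar{\alpha}) = P(\beta)\,\bigl(1 - P(\alpha)\bigr) \quad\text{and}\quad P(\bar{\alpha} \wedge \bar{\beta}) = \bigl(1 - P(\alpha)\bigr)\bigl(1 - P(\beta)\bigr).$$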

For these schools the complete solution is given by the following table.

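| Condition | X | Y | Z |
|---|---|---|---|
| $P(D) \le P(\alpha)$ | $\dfrac{P(D)}{P(\alpha)}$ | 0 | 0 |
| $P(D) > P(\alpha)$ and $P(D) > 1 - P(\beta)\,(1 - P(\alpha))$ | 1 | $1 - \dfrac{1 - P(D)}{P(\beta)\,(1 - P(\alpha))}$ | 1 |
| $P(D) > P(\alpha)$ and $P(D) \le 1 - P(\beta)\,(1 - P(\alpha))$ | 1 | 0 | $\dfrac{P(D) - P(\alpha)}{(1 - P(\alpha))(1 - P(\beta))}$ |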

The table below summarizes the formulas for assigning conditional probabilities.

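| Condition | Alpha sample school | Beta and not alpha sample school | Neither alpha nor beta sample school |
|---|---|---|---|
| $P(D) \le P(\alpha)$ | $\dfrac{P(D)}{P(\alpha)}$ | 0 | 0 |
| CCD school and $P(D) > P(\alpha)$ and $P(\alpha) + P(\beta) \le 1$ and $P(D) > 1 - P(\beta)$ | 1 | $1 - \dfrac{1 - P(D)}{P(\beta)}$ | 1 |
| CCD school and $P(D) > P(\alpha)$ and $P(\alpha) + P(\beta) > 1$ | 1 | $\dfrac{P(D) - P(\alpha)}{1 - P(\alpha)}$ | 1 |
| CCD school and $P(D) > P(\alpha)$ and $P(\alpha) + P(\beta) \le 1$ and $P(D) \le 1 - P(\beta)$ | 1 | 0 | $\dfrac{P(D) - P(\alpha)}{1 - P(\alpha) - P(\beta)}$ |
| New school and $P(D) > P(\alpha)$ and $P(D) > 1 - P(\beta)\,(1 - P(\alpha))$ | 1 | $1 - \dfrac{1 - P(D)}{P(\beta)\,(1 - P(\alpha))}$ | 1 |
| New school and $P(D) > P(\alpha)$ and $P(D) \le 1 - P(\beta)\,(1 - P(\alpha))$ | 1 | 0 | $\dfrac{P(D) - P(\alpha)}{(1 - P(\alpha))(1 - P(\beta))}$ |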
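The difference between CCD schools and new schools lies only in how the alpha and beta samples overlap, so the whole assignment can be sketched as one routine that first derives the three sampling-status probabilities and then applies the same case logic. The sketch below is illustrative, not the production procedure: the function names are hypothetical, and it assumes that "minimum overlap" means a joint alpha-and-beta probability of max(0, P(alpha) + P(beta) - 1) and that independent selection means a joint probability of P(alpha) times P(beta).

```python
from typing import Tuple


def status_probabilities(p_alpha: float, p_beta: float,
                         new_school: bool) -> Tuple[float, float, float]:
    """Return (P(alpha), P(beta and not alpha), P(neither)).

    Assumption: CCD schools have minimum alpha/beta overlap, while new or
    newly eligible schools have independently selected alpha and beta samples.
    """
    for p in (p_alpha, p_beta):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if new_school:
        p_both = p_alpha * p_beta                  # independent selection
    else:
        p_both = max(0.0, p_alpha + p_beta - 1.0)  # minimum overlap
    p_beta_only = p_beta - p_both
    # Clamp at zero to absorb floating-point round-off.
    p_neither = max(0.0, 1.0 - p_alpha - p_beta_only)
    return p_alpha, p_beta_only, p_neither


def tuda_conditionals(p_d: float, p_alpha: float, p_beta: float,
                      new_school: bool) -> Tuple[float, float, float]:
    """Return (X, Y, Z): conditional TUDA probabilities by sampling status."""
    if not 0.0 <= p_d <= 1.0:
        raise ValueError("p_d must lie in [0, 1]")
    _, p_beta_only, p_neither = status_probabilities(p_alpha, p_beta, new_school)

    if p_d <= p_alpha:
        # Only alpha schools are selected, with probability p_d / p_alpha.
        x = p_d / p_alpha if p_alpha > 0.0 else 0.0
        return x, 0.0, 0.0
    if p_neither > 0.0 and p_d <= p_alpha + p_neither:
        # Alpha schools are certain; schools in neither sample fill the gap.
        return 1.0, 0.0, (p_d - p_alpha) / p_neither
    # Alpha and neither-sample schools are certain; beta-only schools fill the gap.
    return 1.0, 1.0 - (1.0 - p_d) / p_beta_only, 1.0
```

Calling `tuda_conditionals(0.85, 0.7, 0.6, new_school=False)` exercises the case where the alpha and beta probabilities sum to more than 1, and `tuda_conditionals(0.8, 0.5, 0.5, new_school=True)` exercises the independent-selection case. In both, weighting the returned values by the status probabilities reproduces the desired unconditional probability.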

1 A desired number of expected hits is actually computed, which is equal to the probability of selection when the expected hits total is less than 1. $P(D)$ is the minimum of 1 and this expected hits total. If expected hits was greater than 1 and the conditional probability was less than 1, then the school was selected using the conditional probability and had a maximum of one hit. If expected hits was greater than 1 and the conditional probability was equal to 1, then the school was selected with certainty and was sampled for hits using the expected hits total.

2 The symbol $\wedge$ represents logical 'and' (condition 1 and condition 2).

Last updated 11 March 2009 (RF)