Assignment of Keyfitzed Conditional Probabilities for the Trial Urban District Assessment School Sample

Conditional probabilities were assigned to control overlap between samples while maintaining the desired probabilities of selection for each sample individually. This was done using a technique called Keyfitzing. The original reference for Keyfitzing is Keyfitz (1951); Rust and Johnson (1992) discuss the method in its NAEP application. The desired probabilities of selection for the schools in the TUDA district samples were the $\pi_{js}$ probabilities derived from the measures of size. These are called $P_d$ below (for probability in TUDA district sample).¹ A school's sampling status in the α and β samples determined how its conditional probability of selection into the TUDA sample was set. There are three cases:

  • School sampled in the α sample. In this case, the conditional probability was increased.

  • School sampled in the β sample and not in the α sample ($\beta \wedge \bar{\alpha}$).² In this case, the conditional probability was decreased.

  • School not sampled in either the α or the β sample ($\bar{\alpha} \wedge \bar{\beta}$).

The goal was to maximize the overlap between TUDA district NAEP and α-sampled schools while minimizing the overlap between TUDA district NAEP and $\beta \wedge \bar{\alpha}$-sampled schools. The discussion below describes the logic for setting the conditional probabilities to achieve these goals while still achieving the desired unconditional probabilities $P_d$.

Let

  • $P_\alpha$ = the probability of the school being selected for the α sample,
  • $P_{\beta \wedge \bar{\alpha}}$ = the probability of the school being selected for the β sample and not the α sample, and
  • $P_{\bar{\alpha} \wedge \bar{\beta}}$ = the probability of the school being selected for neither the α nor the β sample.

The desired probability of the school being selected for district NAEP, $P_d$, can be written as

$$P_d = (P_\alpha)\,X + (P_{\beta \wedge \bar{\alpha}})\,Y + (P_{\bar{\alpha} \wedge \bar{\beta}})\,Z$$

where

  • $X = P_{d \mid \alpha}$,
  • $Y = P_{d \mid \beta \wedge \bar{\alpha}}$, and
  • $Z = P_{d \mid \bar{\alpha} \wedge \bar{\beta}}$.

To recap, it was necessary to select a school for TUDA district NAEP with probability $P_d$ while maximizing $X$ and minimizing $Y$. As all the quantities are probabilities, they are restricted to be between 0 and 1. By the algebra, the task of maximizing $X$ and minimizing $Y$ separates into three cases based on the interrelationships of $P_d$, $P_\alpha$, and $P_{\beta \wedge \bar{\alpha}}$.

For the first case, $P_d \le P_\alpha$, $Y$ can be set to 0 (its absolute minimum), and $X$ can then be maximized by making $Z$ as small as possible (0). Setting $Y$ and $Z$ to 0 gives

$$X = \frac{P_d}{P_\alpha}$$

Note that $X \le 1$ when $P_d \le P_\alpha$, which explains why this logic only works in this case.

If $P_d > P_\alpha$ and $P_d < 1 - P_{\beta \wedge \bar{\alpha}}$, $Y$ can be set to 0 and $X$ can be set to 1 (the best value of each), with the following equation for $Z$:

$$P_d = P_\alpha + (P_{\bar{\alpha} \wedge \bar{\beta}})\,Z$$

giving

$$Z = \frac{P_d - P_\alpha}{1 - P_\alpha - P_{\beta \wedge \bar{\alpha}}}$$

If $P_d > P_\alpha$ and $P_d \ge 1 - P_{\beta \wedge \bar{\alpha}}$, then $Y$ can no longer be set to zero while also maintaining the constraints. $X$ can still be set equal to 1 (its best value), with the following result:

$$P_d = P_\alpha + (P_{\beta \wedge \bar{\alpha}})\,Y + (P_{\bar{\alpha} \wedge \bar{\beta}})\,Z$$

$Y$ can be minimized by making $Z$ as large as possible. Setting $Z$ to its maximum value of 1 gives

$$Y = \frac{P_d - P_\alpha - P_{\bar{\alpha} \wedge \bar{\beta}}}{P_{\beta \wedge \bar{\alpha}}}$$

Since $P_{\bar{\alpha} \wedge \bar{\beta}} = 1 - P_\alpha - P_{\beta \wedge \bar{\alpha}}$, this simplifies to

$$Y = \frac{P_d + P_{\beta \wedge \bar{\alpha}} - 1}{P_{\beta \wedge \bar{\alpha}}}$$

Note that in this case $P_d \ge 1 - P_{\beta \wedge \bar{\alpha}}$, which guarantees that the above expression for $Y$ is between 0 and 1. The following table summarizes the results up to this point.

| Condition | $X$ | $Y$ | $Z$ |
| --- | --- | --- | --- |
| $P_d \le P_\alpha$ | $\frac{P_d}{P_\alpha}$ | 0 | 0 |
| $P_d > P_\alpha$ and $P_d \ge 1 - P_{\beta \wedge \bar{\alpha}}$ | 1 | $\frac{P_d + P_{\beta \wedge \bar{\alpha}} - 1}{P_{\beta \wedge \bar{\alpha}}}$ | 1 |
| $P_d > P_\alpha$ and $P_d < 1 - P_{\beta \wedge \bar{\alpha}}$ | 1 | 0 | $\frac{P_d - P_\alpha}{1 - P_\alpha - P_{\beta \wedge \bar{\alpha}}}$ |

The expression for $P_{\beta \wedge \bar{\alpha}}$ can be further simplified by observing that

$$P_{\beta \wedge \bar{\alpha}} = (P_{\beta \mid \bar{\alpha}})(1 - P_\alpha)$$

Except for new or newly eligible schools, the β sample was selected to minimize overlap with the α sample and

$$P_{\beta \mid \bar{\alpha}} = \begin{cases} \dfrac{P_\beta}{1 - P_\alpha} & \text{if } P_\beta \le 1 - P_\alpha \\ 1 & \text{if } P_\beta > 1 - P_\alpha \end{cases}$$

Therefore it follows that

$$P_{\beta \wedge \bar{\alpha}} = \begin{cases} P_\beta & \text{if } P_\beta \le 1 - P_\alpha \\ 1 - P_\alpha & \text{if } P_\beta > 1 - P_\alpha \end{cases}$$
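As a rough check of this step, the piecewise result is the same as taking the smaller of $P_\beta$ and $1 - P_\alpha$. The sketch below computes it both ways; the function name and the guard for $P_\alpha = 1$ are assumptions made for illustration.

```python
def p_beta_not_alpha_min_overlap(p_alpha, p_beta):
    """P(beta and not alpha) when beta was drawn to minimize overlap with alpha."""
    if p_alpha >= 1.0:
        # Guard (an assumption): an alpha-certainty school can never be "not alpha".
        return 0.0
    # Conditional probability of beta given not alpha, as in the piecewise formula.
    if p_beta <= 1.0 - p_alpha:
        p_beta_given_not_alpha = p_beta / (1.0 - p_alpha)
    else:
        p_beta_given_not_alpha = 1.0
    return p_beta_given_not_alpha * (1.0 - p_alpha)


if __name__ == "__main__":
    for p_alpha, p_beta in [(0.2, 0.3), (0.8, 0.5), (0.5, 0.5)]:
        piecewise = p_beta_not_alpha_min_overlap(p_alpha, p_beta)
        shortcut = min(p_beta, 1.0 - p_alpha)
        print(p_alpha, p_beta, piecewise, shortcut)
```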

 

The complete solution is given by the following table.

| Condition | $X$ | $Y$ | $Z$ |
| --- | --- | --- | --- |
| $P_d \le P_\alpha$ | $\frac{P_d}{P_\alpha}$ | 0 | 0 |
| $P_d > P_\alpha$ and $P_d \ge 1 - P_\beta$ and $P_\beta \le 1 - P_\alpha$ | 1 | $\frac{P_d + P_\beta - 1}{P_\beta}$ | 1 |
| $P_d > P_\alpha$ and $P_\beta > 1 - P_\alpha$ | 1 | $\frac{P_d - P_\alpha}{1 - P_\alpha}$ | 1 |
| $P_d > P_\alpha$ and $P_d < 1 - P_\beta$ and $P_\beta \le 1 - P_\alpha$ | 1 | 0 | $\frac{P_d - P_\alpha}{1 - P_\alpha - P_\beta}$ |

For new or newly eligible schools the α and β samples were selected independently and

$$P_{\beta \wedge \bar{\alpha}} = (P_\beta)(1 - P_\alpha)$$

For these schools the complete solution is given by the following table.

| Condition | $X$ | $Y$ | $Z$ |
| --- | --- | --- | --- |
| $P_d \le P_\alpha$ | $\frac{P_d}{P_\alpha}$ | 0 | 0 |
| $P_d > P_\alpha$ and $P_d \ge 1 - P_\beta + P_\alpha P_\beta$ | 1 | $\frac{P_d + P_\beta - P_\alpha P_\beta - 1}{P_\beta - P_\alpha P_\beta}$ | 1 |
| $P_d > P_\alpha$ and $P_d < 1 - P_\beta + P_\alpha P_\beta$ | 1 | 0 | $\frac{P_d - P_\alpha}{1 - P_\alpha - P_\beta + P_\alpha P_\beta}$ |
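One way to see that these conditional probabilities preserve the desired unconditional probability for new schools is a small Monte Carlo simulation. The sketch below is illustrative only: the function name, trial count, and seed are assumptions, and it assumes $0 < P_\alpha < 1$ and $0 < P_\beta < 1$ so that no denominator is zero.

```python
import random


def simulate_new_school(p_d, p_alpha, p_beta, n_trials=100_000, seed=12345):
    """Estimate the unconditional TUDA selection rate for a new school.

    Alpha and beta membership are drawn independently, then the school is
    selected for TUDA with the conditional probability X, Y, or Z.
    Assumes 0 < p_alpha < 1 and 0 < p_beta < 1 (illustrative restriction).
    """
    rng = random.Random(seed)
    p_bna = p_beta * (1.0 - p_alpha)  # P(beta and not alpha) under independence
    if p_d <= p_alpha:
        x, y, z = p_d / p_alpha, 0.0, 0.0
    elif p_d >= 1.0 - p_bna:  # same as 1 - P_beta + P_alpha * P_beta
        x, y, z = 1.0, (p_d + p_bna - 1.0) / p_bna, 1.0
    else:
        x, y, z = 1.0, 0.0, (p_d - p_alpha) / (1.0 - p_alpha - p_bna)

    hits = 0
    for _ in range(n_trials):
        in_alpha = rng.random() < p_alpha
        in_beta = rng.random() < p_beta
        if in_alpha:
            conditional = x
        elif in_beta:
            conditional = y
        else:
            conditional = z
        if rng.random() < conditional:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for p_d in (0.1, 0.5, 0.9):
        print(p_d, simulate_new_school(p_d, 0.2, 0.3))
```

The estimated rates should fall close to the requested $P_d$, up to simulation noise.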

The table below summarizes the formulas for assigning conditional probabilities.

| Condition | α sample school | β and not α sample school | Neither α nor β sample school |
| --- | --- | --- | --- |
| $P_d \le P_\alpha$ | $\frac{P_d}{P_\alpha}$ | 0 | 0 |
| CCD school and $P_d > P_\alpha$ and $P_d \ge 1 - P_\beta$ and $P_\beta \le 1 - P_\alpha$ | 1 | $\frac{P_d + P_\beta - 1}{P_\beta}$ | 1 |
| CCD school and $P_d > P_\alpha$ and $P_\beta > 1 - P_\alpha$ | 1 | $\frac{P_d - P_\alpha}{1 - P_\alpha}$ | 1 |
| CCD school and $P_d > P_\alpha$ and $P_d < 1 - P_\beta$ and $P_\beta \le 1 - P_\alpha$ | 1 | 0 | $\frac{P_d - P_\alpha}{1 - P_\alpha - P_\beta}$ |
| New school and $P_d > P_\alpha$ and $P_d \ge 1 - P_\beta + P_\alpha P_\beta$ | 1 | $\frac{P_d + P_\beta - P_\alpha P_\beta - 1}{P_\beta - P_\alpha P_\beta}$ | 1 |
| New school and $P_d > P_\alpha$ and $P_d < 1 - P_\beta + P_\alpha P_\beta$ | 1 | 0 | $\frac{P_d - P_\alpha}{1 - P_\alpha - P_\beta + P_\alpha P_\beta}$ |
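The summary table can be read as a lookup: given a school's probabilities, whether it is a CCD or a new school, and its α/β sample status, return the conditional probability. A minimal sketch under those assumptions follows; the function name, the `status` labels, and the `new_school` flag are hypothetical names chosen for this example.

```python
def keyfitz_conditional_probability(p_d, p_alpha, p_beta, status, new_school=False):
    """Look up the conditional TUDA selection probability for one school.

    status     -- "alpha", "beta_not_alpha", or "neither" (hypothetical labels)
    new_school -- True for new or newly eligible schools (independent samples),
                  False for CCD schools (beta drawn to minimize overlap with alpha)
    """
    if new_school:
        p_bna = p_beta * (1.0 - p_alpha)
    else:
        p_bna = min(p_beta, 1.0 - p_alpha)

    if p_d <= p_alpha:
        # Zero-denominator guard is an assumption, not part of the source.
        x, y, z = (p_d / p_alpha if p_alpha > 0 else 0.0), 0.0, 0.0
    elif p_d >= 1.0 - p_bna:
        y = (p_d + p_bna - 1.0) / p_bna if p_bna > 0 else 0.0
        x, z = 1.0, 1.0
    else:
        x, y, z = 1.0, 0.0, (p_d - p_alpha) / (1.0 - p_alpha - p_bna)

    return {"alpha": x, "beta_not_alpha": y, "neither": z}[status]


if __name__ == "__main__":
    # CCD school sampled in beta but not alpha.
    print(keyfitz_conditional_probability(0.9, 0.2, 0.3, "beta_not_alpha"))
    # New school with the same probabilities.
    print(keyfitz_conditional_probability(0.9, 0.2, 0.3, "beta_not_alpha", new_school=True))
```

For a CCD school with $P_\beta > 1 - P_\alpha$, the `min` reduces $P_{\beta \wedge \bar{\alpha}}$ to $1 - P_\alpha$, and the second branch then yields $(P_d - P_\alpha)/(1 - P_\alpha)$, matching the third row of the table.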

 

¹ A desired number of expected hits is actually computed, which equals the probability of selection when the expected hits are less than 1. $P_d$ is the minimum of 1 and this expected-hits total. If the expected hits exceeded 1 and the conditional probability was less than 1, then the school was selected using the conditional probability and had a maximum of one hit. If the expected hits exceeded 1 and the conditional probability was equal to 1, then the school was selected with certainty and was sampled for hits using the expected-hits total.

² The symbol ∧ represents logical 'and' (condition 1 and condition 2).


Last updated 11 March 2009 (RF)

National Center for Education Statistics - http://nces.ed.gov
U.S. Department of Education