This article is excerpted from the Technical Report of the same name. The universe data are from the NCES Private School Universe Survey (PSS).
The Private School Universe Survey (PSS) is conducted by the Bureau of the Census, under the sponsorship of the National Center for Education Statistics. It is a mail survey, designed to provide data relating to all private schools in the 50 states and the District of Columbia. The survey is a census of private schools. It is conducted biennially and attempts to achieve a complete count of private schools and accompanying counts of their students, teachers, and graduates.
During each administration of the survey, the PSS private school register is updated prior to survey mailout. Two sources are used to update the register: (1) the list frame, a synthesis of association, state, and commercial listings of private schools; and (2) an area sample, an independent listing of private schools included in a sample of geographical areas. Despite ongoing efforts to update the PSS register, the private schools' list frame remains incomplete. The most recent estimate of the undercoverage rate for private schools was about 8 percent (Jackson and Frazier 1995); that is, about 8 percent of the private schools were not included on the register after the update from the list frame. The list enumeration is therefore supplemented by an area sample designed to identify and represent unlisted private schools in the PSS estimates.
A nationally representative sample of primary sampling units (PSUs), each consisting of a single county or a group of counties, is chosen for the area sample. The area frame therefore consists of the list of PSUs that make up the nation. The sample facilitates the identification of private schools not included in the list frame. Within each selected PSU, a list of private schools is compiled from such sources as telephone books, yellow pages, local government offices, chambers of commerce, and religious institutions. This list is merged with the list frame, and therefore represents an expansion of the survey frame to the extent that unlisted schools were detected.
The PSS sample design can readily support the computation of direct survey estimates of the number of private schools and their numbers of students, teachers, and graduates at the national and regional levels. These direct survey estimates are obtained in the conventional manner in survey analysis, where sampled schools are weighted up to represent unsampled and nonresponding schools. While direct estimation produces estimates of adequate precision for the four geographical regions, the national-level design of the area sample can result in less reliable estimates for individual states. In order to address this problem, the use of indirect estimation methods is recommended.
This report describes the development and evaluation of the statistical models used to produce indirect state estimates from the PSS for the 1991-92 and 1993-94 school years. The statistical models are based on the data obtained from the area sample PSUs. Within these PSUs, data are available for both the private schools listed in the list frame and those identified through the area frame. From these data, models can be developed to predict the probability that a school of a given type is included in the list frame. Then, for nonsampled PSUs, the listed schools of the designated school type can be weighted up by the inverse of this probability in order to represent the corresponding unlisted schools in those PSUs.
A problem that arises with the use of indirect estimates for relatively small geographical areas is that when the estimates from such areas are added together, the sum will not be consistent with the direct estimate for the combined area. Consequently, the sum of the indirect estimates for the states in a region generally will not equal the direct estimate for the region. This problem is handled by a constrained estimation procedure that adjusts the indirect state estimates so that the resultant estimates for the states in a region sum to the direct regional estimate.
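The constrained adjustment described above can be illustrated with a small sketch. The excerpt does not state the exact form of the constraint, so the proportional rescaling used here is an assumption (the simplest adjustment that makes state estimates sum to a regional total), and the state labels and figures are hypothetical.

```python
def constrain_to_region(indirect_by_state, direct_region_total):
    """Scale indirect state estimates so they sum to the direct regional total.

    Assumption: a single proportional factor per region. The report's
    actual constrained procedure may differ.
    """
    indirect_sum = sum(indirect_by_state.values())
    factor = direct_region_total / indirect_sum
    return {state: value * factor for state, value in indirect_by_state.items()}


# Hypothetical indirect estimates for three states in one region.
indirect = {"A": 480.0, "B": 320.0, "C": 200.0}
final = constrain_to_region(indirect, direct_region_total=1050.0)
```

Under this form of adjustment the relative sizes of the state estimates within a region are unchanged; only their common scale moves to match the direct regional estimate.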
This section describes the PSS sample design and direct estimation procedures currently used to produce national and regional survey estimates. For direct estimation, each unlisted school added to the list frame's total through the area sample is weighted by the reciprocal of its PSU's selection probability. All list frame schools are included in the PSS, and therefore receive a sampling weight of 1.0. Consequently, the overall weight adjustment for those schools reflects only a noninterview adjustment. An estimated 8 percent of the targeted private schools did not respond for the 1993-94 survey period (Broughman 1996). The corresponding rate for 1991-92 was 2 percent.
Within each sampled PSU, the weighted estimate of the number of unlisted schools from the area sample is added to the list frame count. This sum is aggregated over PSUs within the individual states to obtain state totals, and over states to obtain the four regional totals for the number of private schools. Estimates are obtained similarly for the number of students, teachers, and graduates. This approach is readily extended to produce estimates for subgroups, such as regions or type of school, by confining the summations to schools in a specified subgroup.
While this procedure can be used to provide unbiased estimates for states, the estimates produced in this manner are subject to considerable sampling error. The reason for this lack of precision is that the sample of PSUs for the area frame was not stratified geographically by state but only by region. As a result, the number of PSUs sampled in a state is random. The share of a region's sampled PSUs that fall in a particular state can differ considerably from that state's share of the region's total population. If the number of PSUs sampled in the state is larger than expected, the state estimates will be too large, and if smaller than expected, they will be too small. As a result, we have developed a model-based procedure for state estimation in an effort to improve upon estimates derived from direct estimation.
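The direct estimator described in this section can be sketched as follows. This is a minimal illustration, not the production procedure: the `Psu` record, its field names, and all figures are hypothetical, and the noninterview (nonresponse) adjustment is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Psu:
    state: str
    region: str
    listed_schools: int          # list frame count; each school has sampling weight 1.0
    sampled: bool = False        # True if the PSU was drawn for the area sample
    selection_prob: float = 1.0  # PSU selection probability (used only if sampled)
    unlisted_found: int = 0      # unlisted schools found by the area search


def direct_estimate(psus, key):
    """Total schools by `key` ('state' or 'region'), ignoring nonresponse."""
    totals = {}
    for psu in psus:
        estimate = float(psu.listed_schools)
        if psu.sampled:
            # Each unlisted school is weighted by the reciprocal of its
            # PSU's selection probability.
            estimate += psu.unlisted_found / psu.selection_prob
        group = getattr(psu, key)
        totals[group] = totals.get(group, 0.0) + estimate
    return totals


# Hypothetical region with two states and four PSUs.
psus = [
    Psu("A", "Midwest", listed_schools=40, sampled=True, selection_prob=0.25, unlisted_found=2),
    Psu("A", "Midwest", listed_schools=30),
    Psu("B", "Midwest", listed_schools=50),
    Psu("B", "Midwest", listed_schools=20, sampled=True, selection_prob=0.5, unlisted_found=1),
]
state_totals = direct_estimate(psus, "state")
region_totals = direct_estimate(psus, "region")
```

In this toy data, state A's estimate depends heavily on the one PSU that happened to be sampled with a large weight, which is the source of the state-level instability discussed above.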
An indirect or synthetic estimator is generally defined as a nontraditional estimator that "borrows strength" from a domain or time period other than the one of interest in deriving the desired predictions or estimates. With indirect estimation, as with direct estimation, the PSS sample is treated as being composed of schools from both the list and area frames. However, the indirect procedure uses the area sample to identify schools not included in the list frame, and to establish a basis for data adjustment in nonsampled PSUs to account for the missing schools. The unweighted counts from these unlisted (missed) schools are added to the list frame counts, providing a complete count in sampled PSUs. For nonsampled PSUs, noncoverage adjustment factors derived from the area sample are applied to the list frame sample to compensate for the unlisted schools.
The application of the suggested indirect approach requires the specification of a model for noncoverage. The simplest of such models assumes that the unlisted schools are missing completely at random (MCAR). Under this model, the probability that a school is missed or unlisted is the same for every school. This probability may be estimated from the PSS to yield an undercoverage adjustment that is multiplied by each school's nonresponse adjustment factor to give its final weight.
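One way to write the MCAR adjustment is given below. The notation is assumed for illustration and does not appear in the excerpt: L and U denote the numbers of listed and unlisted schools found in the sampled PSUs, a_i is the nonresponse adjustment factor for listed school i in a nonsampled PSU, and w_i is its final weight. Whether the counts entering the estimate are weighted is not stated in the excerpt.

```latex
\[
\hat{p} = \frac{L}{L + U},
\qquad
w_i = a_i \cdot \frac{1}{\hat{p}}
\]
```

As a rough illustration, if the estimated coverage probability were 0.92 (matching the undercoverage rate of about 8 percent cited earlier), the undercoverage adjustment would be about 1/0.92, or roughly 1.09, for every school.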
Logistic regression
The MCAR assumption is a stringent one that is unlikely to hold in practice. Coverage can be very different for different domains of the PSS population. Consequently, it seemed desirable to consider the application of undercoverage adjustments for several subgroups of the private school population (where the MCAR assumption may be more plausible) before computing state estimates. Moreover, Jackson and Frazier (1995) provide evidence of a significant relationship between school size, as measured by student enrollment, and the probability of the school's inclusion in the original list frame. This led to the fitting of logistic regression models to the 1991-92 and 1993-94 PSS data in the nine domains or subgroups defined by school type.* The model relates the "undercoverage proportion" (or the probability that a given school is not listed) to the regressor variable (school size). It can be estimated for area sample schools. The undercoverage adjustments were determined and applied to the listed schools and students in the nonsampled PSUs.
Estimates of the regression coefficients of the model were obtained from the SAS iteratively reweighted least squares logistic procedure. The model was assessed using Hosmer-Lemeshow goodness-of-fit statistics to evaluate the error term of the model. For six of the nine school types there was a reasonably good fit. However, for the other three school types (the conservative Christian and unaffiliated subgroups of the "other religious" category, and the special emphasis subgroup of the nonsectarian category), the p-values suggested a lack of fit of the model.
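The sketch below illustrates the kind of model described here for a single school type. It is a plain-Python stand-in, not the SAS procedure used in the report: the fit uses Newton-Raphson, which coincides with iteratively reweighted least squares for the logit link. The data, the enrollment scale (hundreds of students), and the function names are hypothetical.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)      # IRLS weight for this observation
            g0 += yi - p           # score vector components
            g1 += (yi - p) * xi
            h00 += w               # information matrix components
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def coverage_adjustment(b0, b1, enrollment):
    """Inverse of the fitted probability that a school of this size is listed."""
    p_unlisted = 1.0 / (1.0 + math.exp(-(b0 + b1 * enrollment)))
    return 1.0 / (1.0 - p_unlisted)


# Hypothetical area sample schools of one type: enrollment in hundreds,
# and an indicator equal to 1 if the school was NOT on the list frame.
enrollment = [0.2, 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
unlisted = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]

b0, b1 = fit_logistic(enrollment, unlisted)
small_school_factor = coverage_adjustment(b0, b1, 0.2)
large_school_factor = coverage_adjustment(b0, b1, 5.0)
```

With data like these, where the unlisted schools are mostly small, the fitted slope is negative, so listed small schools in nonsampled PSUs receive a larger adjustment than listed large schools.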
Adjustments to regional totals
In an effort to achieve greater precision and consistency, the regional totals based on the indirect estimation method were adjusted to those based on direct estimation.
Indirect State-Level Estimation for the Private School Survey
Table A presents the original list frame counts (Listed), the direct estimates, the indirect estimates from the logistic regression model (Logistic), and the final indirect estimates adjusted to unbiased regional counts (Final) of the number of private schools by state. In addition, for comparison, corresponding indirect estimates were produced by adjusting list frame schools in nonsampled PSUs by an undercoverage adjustment. This was done for the nine school types (Ratio 1) and for quartiles of the school enrollment variable (Ratio 2) within school type. The assumption associated with the use of the latter adjustment is that within a given range of the school enrollment variable, the coverage probability is fairly stable.
The four indirect estimates are reasonably close for the individual states, especially the first (Logistic) and the fourth (Ratio 2). Comparison of the third and fourth indirect estimates (Ratio 1 and Ratio 2) permits an assessment of the effect of introducing school enrollment as an additional stratifying variable for the adjustment process. The second indirect estimate (Final) shows the impact of the adjustments to unbiased regional counts and provides the published state numbers for 1993-94. While the indirect estimates seem quite similar, a comparison between these estimates and the direct estimates shows disparity reflecting the underrepresentation (or overrepresentation) of sampled PSUs in the area sample. For example, there are states such as Indiana and Wisconsin for which there were no sampled PSUs in the area sample, while other states, such as Missouri and Ohio, may have been "overrepresented."
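The two ratio adjustments (Ratio 1 and Ratio 2) can be sketched as cell-level coverage ratios computed from the area sample. This is a hedged illustration only: the record layout, the use of unweighted counts, and the example schools are assumptions, not details taken from the report.

```python
from collections import defaultdict


def ratio_adjustments(area_schools, cell_of):
    """Return (listed + unlisted) / listed for each adjustment cell.

    Cells with no listed schools are skipped, since no listed school
    would be available to carry the adjustment.
    """
    listed = defaultdict(int)
    total = defaultdict(int)
    for school in area_schools:
        cell = cell_of(school)
        total[cell] += 1
        if school["listed"]:
            listed[cell] += 1
    return {cell: total[cell] / listed[cell] for cell in total if listed[cell] > 0}


# Hypothetical area sample schools with school type and enrollment quartile.
area_schools = [
    {"type": "parochial", "quartile": 1, "listed": True},
    {"type": "parochial", "quartile": 1, "listed": False},
    {"type": "parochial", "quartile": 4, "listed": True},
    {"type": "parochial", "quartile": 4, "listed": True},
    {"type": "regular", "quartile": 1, "listed": True},
    {"type": "regular", "quartile": 1, "listed": True},
    {"type": "regular", "quartile": 1, "listed": False},
    {"type": "regular", "quartile": 4, "listed": True},
]

# Ratio 1: one adjustment per school type.
ratio_1 = ratio_adjustments(area_schools, lambda s: s["type"])
# Ratio 2: one adjustment per enrollment quartile within school type.
ratio_2 = ratio_adjustments(area_schools, lambda s: (s["type"], s["quartile"]))
```

In this toy example Ratio 1 gives both school types the same factor, while Ratio 2 concentrates the adjustment in the lowest enrollment quartile, which is the kind of difference the comparison of the two columns is meant to reveal.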
An indirect estimation approach is recommended as an alternative to the current procedure for the production of state estimates of the number of private schools in the nation and the associated numbers of students, teachers, and graduates. This procedure borrows strength from the area frame estimates of coverage in deriving "acceptable" and more equitable state estimates. Unless the list frame is complete for a given state, the current estimation procedure necessarily results in biased and highly variable state estimates. However, indirect estimation methods attempt to produce a distribution among the states of the unlisted schools (and therefore of all schools) that is "close" to the actual distribution of the target population.
While the indirect estimates based on simple ratio adjustments for undercoverage compared favorably with those based on the logistic regression model, there is a clear potential for improvement in the model. For example, a geographic variable could possibly be added as a regressor variable. Moreover, school-level or program emphasis could be considered as an alternative undercoverage adjustment variable. The appropriateness of the state estimation methodology under consideration should be evaluated over several survey collection cycles. Moreover, it is suggested that an effort be exerted to identify and ensure the collection of additional data that could define other explanatory variables that might be effective in the modeling of coverage.
NOTE: Details may not add to totals due to rounding. SOURCE: U.S. Department of Education, National Center for Education Statistics, Private School Survey, 1993-94. (Originally published as table 6.1 on p. 14 of the complete report from which this article is excerpted.)
Footnotes
*The nine domains or subgroups consist of three types of Catholic schools (parochial, diocesan, and private order); three types of "other religious" schools (conservative Christian, affiliated, and unaffiliated); and three types of nonsectarian schools (regular, special emphasis, and special education).
References
Broughman, S. (1996). Private School Universe Survey, 1993-94 (NCES 96-143). U.S. Department of Education. Washington, DC: U.S. Government Printing Office.
Jackson, B.J., and Frazier, R.L. (1995). Improving the Coverage of Private Elementary-Secondary Schools. Proceedings of the Section on Survey Research Methods, American Statistical Association, 143-148.