Sources of Data
National Household Education Survey (NHES)
Kathryn A. Chandler
Schools and Staffing Survey (SASS)
This report focuses on teachers' responses. The overall weighted response rates were 84 percent for public school teachers and 73 percent for private school teachers. In the Public School Teacher Questionnaire, 91 percent of the items had a response rate of 90 percent or more, and in the Private School Teacher Questionnaire, 89 percent of the items had this level of response. Values were imputed for questionnaire items that should have been answered but were not. For additional information about SASS, refer to R. Abramson, C. Cole, S. Fondelier, B. Jackson, R. Parmer, and S. Kaufman, 1996, 1993-94 Schools and Staffing Survey: Sample Design and Estimation (NCES 96-089), or contact:
National School-Based Youth Risk Behavior Survey (YRBS)
The YRBS used a three-stage cluster sampling design to produce a nationally representative sample of 9th- through 12th-grade students in the United States. The target population consisted of all public and private school students in grades 9 through 12 in the 50 states and the District of Columbia. The first stage of sampling consisted of selecting primary sampling units (PSUs) from strata formed on the basis of urbanization and the relative percentage of black and Hispanic students in the PSU. These PSUs are either large counties or groups of smaller, adjacent counties. At the second stage, schools were selected with probability proportional to school enrollment size. Schools with substantial numbers of black and Hispanic students were sampled at relatively higher rates than all other schools. The final stage of sampling consisted of randomly selecting one or two intact classes of a required subject, such as English or social studies, in each of grades 9 through 12 within each chosen school. All students in selected classes were eligible to participate. Approximately 16,300, 10,900, and 16,300 students were selected to participate in the 1993, 1995, and 1997 surveys, respectively.
The overall response rate was 70 percent for the 1993 survey, 60 percent for the 1995 survey, and 69 percent for the 1997 survey. The weights were developed to adjust for nonresponse and the oversampling of black and Hispanic students in the sample. The final weights were normalized so that only weighted proportions of students (not weighted counts of students) in each grade matched national population projections. For additional information about the YRBS, contact:
Fast Response Survey System: Principal/School Disciplinarian Survey on School Violence
The sample of public schools was selected from the 1993-94 NCES Common Core of Data (CCD) Public School Universe File. The sample was stratified by instructional level, locale, and school size. Within the primary strata, schools were also sorted by geographic region and by percent minority enrollment. The sample sizes were then allocated to the primary strata in rough proportion to the aggregate square root of the size of enrollment of schools in the stratum. A total of 1,415 schools were selected. Among them, 11 schools were found to be no longer in existence, and 1,234 schools completed the survey. In April 1997, questionnaires were mailed to school principals, who were asked to complete the survey or to have it completed by the person most knowledgeable about discipline issues at the school. The raw response rate was 88 percent (1,234 schools divided by the 1,404 eligible schools in the sample). The weighted overall response rate was 89 percent, and item nonresponse rates ranged from 0 percent to 0.9 percent. The weights were developed to adjust for the variable probabilities of selection and differential nonresponse and can be used to produce national estimates for regular public schools in the 1996-97 school year. For more information about the FRSS: Principal/School Disciplinarian Survey on School Violence, contact:
National Crime Victimization Survey (NCVS)
The NCVS sample consists of about 55,000 households, selected using a stratified, multistage cluster design. In the first stage, primary sampling units (PSUs), consisting of counties or groups of counties, are selected. In the second stage, smaller areas, called Enumeration Districts (EDs), are selected from each sampled PSU. Finally, from the selected EDs, clusters of four households, called segments, are selected for interview. At each stage, selection is done proportionate to population size in order to create a self-weighting sample. The final sample is augmented to account for housing units constructed after the decennial Census. Within each sampled household, Census Bureau personnel interview all household members ages 12 and older to determine whether they have been victimized by the measured crimes during the 6 months preceding the interview. About 90,000 persons ages 12 and older are interviewed every 6 months. Households remain in the sample for 3 years and are interviewed 7 times at 6-month intervals. The initial interview at each sample unit is used only to bound future interviews, establishing a time frame that avoids duplication of crimes uncovered in the subsequent interviews. After their seventh interview, households are replaced by new sample households. The NCVS has consistently obtained a response rate of about 95 percent at the household level. During the study period, the completion rates for persons within households were about 91 percent. Thus, final response rates were about 86 percent. Weights were developed to permit estimates for the total U.S. population 12 years and older. For more information about the NCVS, contact:
Michael R. Rand
School Crime Supplement (SCS)
In both 1989 and 1995, the SCS was conducted for a 6-month period from January through June in all households selected for the NCVS (see discussion above for information about the sampling design). Within these households, the eligible respondents for the SCS were those household members who were between the ages of 12 and 19, had attended school at any time during the 6 months preceding the interview, and were enrolled in a school that would help them advance toward eventually receiving a high school diploma. These persons were asked the supplemental questions in the SCS only after completing their entire NCVS interview. A total of 10,449 students participated in the 1989 SCS, and 9,954 in the 1995 SCS. In the 1989 and 1995 SCS, the household completion rates were 97 percent and 95 percent, respectively, and the student completion rates were 86 percent and 78 percent, respectively. Thus, the overall SCS response rate (calculated by multiplying the household completion rate by the student completion rate) was 83 percent in 1989 and 74 percent in 1995. Response rates for most survey items were high, generally exceeding 95 percent of all eligible respondents. The weights were developed to compensate for differential probabilities of selection and nonresponse. The weighted data permit inferences about the 12- to 19-year-old student population who were enrolled in schools in 1989 and 1995. For more information about SCS, contact:
Kathryn A. Chandler
Monitoring the Future (MTF)
The sample selection involves three stages. The first stage selects geographic areas or primary sampling units (PSUs). These PSUs are developed by the Sampling Section of the Survey Research Center for use in the Center's nationwide interview studies. In the second stage, schools within PSUs are selected with a probability proportionate to the size of their senior class. In the third stage, up to about 400 seniors within each selected school are sampled. Each year, about 130 schools participate in the survey, and from these schools, about 16,000 high school seniors complete questionnaires. These students are divided into six subsamples consisting of an average of 2,700 respondents, and each subsample is administered a different form of the questionnaire. Since the inception of the study, the participation rate among schools has been between 60 and 80 percent, and the student response rate has been between 77 and 86 percent. For more information about Monitoring the Future, contact:
Survey Research Center
Data Source for School-Associated Violent Deaths
A total of 105 school-associated violent deaths were identified by the following sequential procedures: 1) tracking fatalities through a newspaper clipping service and informal voluntary reports from state and local education officers; 2) searching two computerized newspaper and broadcast media databases; 3) interviewing local press, law enforcement officers, or school officials who were familiar with each case; and 4) once cases were identified, obtaining further information about the deaths from official sources.
Accuracy of Estimates
Sampling errors occur because observations are made on samples rather than on entire populations. Surveys of population universes are not subject to sampling errors. Estimates based on a sample will differ somewhat from those that would have been obtained by a complete census of the relevant population using the same survey instruments, instructions, and procedures. The standard error of a statistic is a measure of the variation due to sampling; it indicates the precision of the statistic obtained in a particular sample. In addition, the standard errors for two sample statistics can be used to estimate the precision of the difference between the two statistics and to help determine whether the difference based on the sample is large enough to represent a real difference in the population.
Most of the data used in this report were obtained from complex sampling designs rather than a simple random design. In these sampling designs, data were collected through stratification, clustering, unequal selection probabilities, or multistage sampling. These features of the sampling usually result in estimated statistics that are more variable (that is, have larger standard errors) than they would have been if they had been based on data from a simple random sample of the same size. Therefore, calculation of standard errors requires procedures that are markedly different from the ones used when the data are from a simple random sample. The Taylor series approximation technique or the balanced repeated replication (BRR) method was used to estimate most of the statistics and their standard errors in this report. Table B3 lists the various methods used to compute standard errors for different data sets.
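As a minimal illustrative sketch of the half-sample idea behind BRR, the toy function below treats each stratum as a pair of PSU-level values, recomputes the statistic on each half-sample, and takes the variance of the replicate estimates around the full-sample estimate. The data are hypothetical, and production BRR would use a balanced subset of half-samples (chosen via a Hadamard matrix) and survey weights rather than enumerating every half-sample; this is not the estimator actually used for this report.

```python
import itertools
import math

def brr_se(strata, stat):
    """Standard error of stat() via half-sample replication.

    Each stratum contributes a pair of PSU-level values. A replicate
    keeps one PSU from each stratum; the variance of the replicate
    estimates around the full-sample estimate is the squared standard
    error. True BRR uses a balanced subset of half-samples; for a tiny
    example we simply enumerate all of them.
    """
    full = stat([psu for pair in strata for psu in pair])
    reps = [stat([pair[i] for pair, i in zip(strata, picks)])
            for picks in itertools.product((0, 1), repeat=len(strata))]
    return math.sqrt(sum((r - full) ** 2 for r in reps) / len(reps))

def mean(xs):
    return sum(xs) / len(xs)

# Two strata, each with two PSU-level means (hypothetical data)
strata = [(3.0, 5.0), (4.0, 6.0)]
se = brr_se(strata, mean)  # about 0.707
```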
Standard error calculation for data from the National Crime Victimization Survey, the School Crime Supplement, and Monitoring the Future relied on a different procedure. For statistics based on the NCVS and the SCS data, standard errors were derived from a formula developed by the Census Bureau, which consists of three generalized variance function (gvf) constant parameters that represent the curve fitted to the individual standard errors calculated using the Jackknife Repeated Replication technique. The formulas used to compute the adjusted standard errors associated with percentages or population counts can be found in table B3.
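As a hedged sketch of how a gvf-based standard error for a percentage is typically computed: one common Census Bureau parameterization takes Var(p) = b * p * (100 - p) / base, where base is the weighted count underlying the percentage. The parameter value and weighted base below are made up for illustration; the constants actually used for the NCVS and SCS are those in table B3.

```python
import math

def gvf_se_percent(p, base, b):
    """Standard error of a percentage p from a generalized variance
    function of the form Var(p) = b * p * (100 - p) / base. The gvf
    parameter b and the weighted base are survey-specific; the values
    used in this report come from table B3."""
    return math.sqrt(b * p * (100.0 - p) / base)

# Hypothetical inputs: an 8.0 percent estimate on a weighted base of
# 23 million students, with a made-up gvf parameter b = 2465
se = gvf_se_percent(8.0, 23_000_000, 2465)  # about 0.28
```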
For the statistics based on the Monitoring the Future data, their standard errors were derived from the published tables of confidence intervals in appendix A (pp. 313-322) of Monitoring the Future: Questionnaire Responses from the Nation's High School Seniors, 1995, by Lloyd D. Johnston, Jerald G. Bachman, and Patrick M. O'Malley, Survey Research Center, Institute for Social Research, the University of Michigan, 1997. Generally, the table entries, when added to and subtracted from the observed percentage, establish the 95 percent confidence interval. The appendix presents specific guidelines for using the tables of confidence intervals and conducting statistical tests for the difference between two percentages.
Differences between two estimates were tested using the Student's t statistic:

t = (E1 - E2) / sqrt(se1^2 + se2^2)

where E1 and E2 are the estimates to be compared and se1 and se2 are their corresponding standard errors. Note that this formula is valid only for independent estimates. When the estimates are not independent (for example, when comparing a total percentage with that for a subgroup included in the total), a covariance term (i.e., 2*se1*se2) must be added to the denominator of the formula.
Once the t value was computed, it was compared with published tables of critical values at certain significance levels, called alpha levels. For this report, an alpha level of 0.05 was used, which corresponds to a critical t value of 1.96. If the computed t value was larger than 1.96, the difference between the two estimates was statistically significant at the 95 percent confidence level.
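The comparison of two independent estimates described above can be sketched as follows; the percentages and standard errors are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def t_statistic(e1, se1, e2, se2):
    """t statistic for the difference between two independent estimates."""
    return (e1 - e2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Hypothetical estimates: 14.2 percent (se 0.8) vs. 11.5 percent (se 0.9)
t = t_statistic(14.2, 0.8, 11.5, 0.9)  # about 2.24
significant = abs(t) > 1.96            # significant at the .05 alpha level
```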
When multiple comparisons between more than two groups were made (for example, between racial/ethnic groups), a Bonferroni adjustment to the significance level was used to ensure that the significance level for the tests as a group was at the .05 level. Generally, when multiple statistical comparisons are made, it becomes increasingly likely that an indication of a population difference is erroneous. Even when there is no difference in the population, at an alpha of .05 there is still a 5 percent chance of concluding that an observed t value representing one comparison in the sample is large enough to be statistically significant. As the number of comparisons increases, the risk of making such an erroneous inference also increases. The Bonferroni procedure corrects the significance (or alpha) level for the total number of comparisons made within a particular classification variable. For each classification variable, there are K*(K-1)/2 possible comparisons (or nonredundant pairwise combinations), where K is the number of categories. The Bonferroni procedure divides the alpha level for a single t test by the number of possible pairwise comparisons in order to produce a new alpha level that is corrected for the fact that multiple contrasts are being made. As a result, the critical t value for a given alpha level (e.g., .05) increases, making it more difficult to claim that an observed difference is statistically significant.
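The Bonferroni arithmetic above is simple enough to sketch directly; the five-category example is hypothetical.

```python
def bonferroni_alpha(k, alpha=0.05):
    """Per-comparison alpha for all K*(K-1)/2 pairwise comparisons
    among k categories of a classification variable."""
    comparisons = k * (k - 1) // 2
    return alpha / comparisons, comparisons

# Five categories yield 10 pairwise comparisons, so each t test is run
# at the .005 level rather than .05, which raises the critical t value
# from 1.96 to about 2.81.
adjusted_alpha, n_comparisons = bonferroni_alpha(5)
```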
Finally, a linear trend test was used when a statement described a linear trend, rather than the difference between two discrete categories. This test allows one to examine whether, for example, the percentage of students using drugs increased (or decreased) over time, or whether the percentage of students who reported being physically attacked in school increased (or decreased) with their age. Based on a regression with, for example, student's age as the independent variable and whether a student was physically attacked as the dependent variable, the test involves computing the regression coefficient (b) and its corresponding standard error (se). The ratio of these two (b/se) is the test statistic t. If t is greater than 1.96, the critical value for one comparison at the .05 alpha level, the hypothesis that there is no linear relationship between student's age and being physically attacked is rejected, and the trend is considered statistically significant.
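The trend test can be sketched with ordinary least squares on hypothetical data; note that the report's own tests would use design-based standard errors from the complex samples described earlier, which this simple OLS sketch does not reproduce.

```python
import math

def trend_t(x, y):
    """Slope b of an OLS regression of y on x, its standard error,
    and the linear trend test statistic t = b / se."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx                          # intercept
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)      # standard error of the slope
    return b, se, b / se

# Hypothetical percentages of students attacked, by age 12 through 16
ages = [12, 13, 14, 15, 16]
pct = [4.1, 4.8, 5.9, 6.5, 7.4]
b, se, t = trend_t(ages, pct)
# t well above 1.96 would indicate a statistically significant trend
```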
 For detailed information about how the data were collected and analyzed, see S.P. Kachur et al., "School-Associated Violent Deaths in the United States, 1992 to 1994," Journal of the American Medical Association 275 (22) (1996): 1729-1733.