Indicators of School Crime and Safety: 2005

Appendix A: Technical Notes

General Information

The indicators in this report are based on information drawn from a variety of independent data sources, including national surveys of students, teachers, and principals, and data collections from federal departments and agencies, including the Bureau of Justice Statistics, the National Center for Education Statistics, the Federal Bureau of Investigation, and the Centers for Disease Control and Prevention. Each data source has an independent sample design, data collection method, and questionnaire design or is the result of a universe data collection. Universe data collections include a census of all known entities in a specific universe (e.g., all deaths occurring on school property). Readers should be cautious when comparing data from different sources. Differences in sampling procedures, populations, time periods, and question phrasing can all affect the comparability of results. For example, some questions from different surveys may appear the same but be asked of different populations of students (e.g., students ages 12-18 or students in grades 9-12); in different years; about experiences that occurred within different periods of time (e.g., in the past 30 days or during the past 12 months); or at different locations (e.g., in school or anywhere).

The following is a description of data sources, accuracy of estimates, and statistical procedures used in this report.

Sources of Data

This section briefly describes each of the data sets used in this report: the School-Associated Violent Deaths Surveillance Study, the Supplementary Homicide Reports, the Web-based Injury Statistics Query and Reporting System Fatal, the National Crime Victimization Survey, the School Crime Supplement to the National Crime Victimization Survey, the Youth Risk Behavior Survey, the Schools and Staffing Survey, and the School Survey on Crime and Safety. Directions for obtaining more information are provided at the end of each description. Figure A.1 (PDF 90 KB) presents some key information for each of the data sets used in the report, including the survey year(s), target population, response rate, and sample size. The wording of the interview questions used to construct the indicators is presented in Figure A.2 (PDF 90 KB).

School-Associated Violent Deaths Surveillance Study (SAVD)

The School-Associated Violent Deaths Surveillance Study (SAVD) is an epidemiological study developed by the Centers for Disease Control and Prevention in conjunction with the U.S. Department of Education and the U.S. Department of Justice. SAVD seeks to describe the epidemiology of school-associated violent deaths, identify common features of these deaths, estimate the rate of school-associated violent death in the United States, and identify potential risk factors for these deaths. The surveillance system includes descriptive data on all school-associated violent deaths in the United States, including all homicides, suicides, and unintentional firearm-related deaths where the fatal injury occurred on the campus of a functioning elementary or secondary school, while the victim was on the way to or from regular sessions at such a school, or while attending or on the way to or from an official school-sponsored event. Victims of such events include nonstudents as well as students and staff members. SAVD includes descriptive information about the school, event, victim(s), and offender(s). The SAVD Surveillance System has collected data from July 1, 1992, through present.

SAVD uses a four-step process to identify and collect data on school-associated violent deaths. Cases are initially identified through a search of the Lexis/Nexis newspaper and media database. Then police officials are contacted to confirm the details of the case to determine if the event meets the case definition. Once a case is confirmed, a police official and a school official are interviewed regarding details about the school, event, victim(s), and offender(s). If police officials are unwilling or unable to complete the interview, a copy of the full police report is obtained. The information obtained on schools includes school demographics, attendance/absentee rates, suspension/expulsions and mobility, school history of weapon carrying, security measures, violence prevention activities, school response to the event, and school policies about weapon carrying. Event information includes the location of injury, the context of injury (while classes held, during break, etc.), motives for injury, method of injury, and school and community events happening around the time period. Information obtained on victim(s) and offender(s) includes demographics, circumstances of the event (date/time, alcohol or drug use, number of persons involved), types and origins of weapons, criminal history, psychological risk factors, school-related problems, extracurricular activities, and family history, including structure and stressors.

One hundred five school-associated violent deaths were identified from July 1, 1992, through June 30, 1994 (see Kachur et al. 1996). A more recent report from this data collection identified 253 school-associated violent deaths between July 1, 1994, and June 30, 1999 (see Anderson et al. 2001). Other publications from this study have described how the number of events changes during the school year (Centers for Disease Control 2001), the source of the firearms used in these events (Reza et al. 2003), and suicides that were associated with schools (Kauffman et al. 2004). The interviews conducted on cases between July 1, 1994, and June 30, 1999, achieved a response rate of 97 percent for police officials and 78 percent for school officials. Data for subsequent study years are preliminary and subject to change. For additional information about SAVD, contact:

Mark Anderson
Division of Violence Prevention
National Center for Injury Prevention and Control
Centers for Disease Control and Prevention, Mailstop K60
4770 Buford Highway NE
Atlanta, GA 30341
Telephone: (770) 488-4646

Supplementary Homicide Reports (SHR)

The Supplementary Homicide Reports (SHR), which are a part of the Uniform Crime Reporting (UCR) program, provide incident-level information on criminal homicides including situation (number of victims to number of offenders); the age, sex, and race of victims and offenders; types of weapons used; circumstances of the incident; and the relationship of the victim to the offender. The data are provided monthly to the Federal Bureau of Investigation (FBI) by local law enforcement agencies participating in the FBI's UCR program. The data include murders and non-negligent manslaughters in the United States from January 1976-December 2003; that is, negligent manslaughters and justifiable homicides have been eliminated from the data. Based on law enforcement agency reports, the FBI estimates that 561,412 murders were committed from 1976 to 2003. Agencies provided detailed information on 561,412 victims and 561,412 offenders.

About 91 percent of homicides are included in the SHR. However, adjustments can be made to the weights to correct for missing reports. Estimates from the SHR used in this report were generated by the Bureau of Justice Statistics (BJS) using a weight developed by BJS that reconciles the counts of SHR homicide victims with those in the UCR for the 1992 through 2003 data years. The weight is the same for all cases for a given year. The weight represents the ratio of the number of homicides reported in the UCR to the number reported in the SHR. For additional information about SHR, contact:
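The weighting rule described above can be sketched in a few lines of Python. This is only an illustration: the counts are made-up placeholders rather than actual UCR or SHR figures, and the function name `shr_weights` is hypothetical.

```python
def shr_weights(ucr_counts, shr_counts):
    """Return one weight per year: UCR homicide count / SHR victim count."""
    return {year: ucr_counts[year] / shr_counts[year] for year in shr_counts}

# Hypothetical counts for illustration only (not real data).
ucr = {1992: 1000, 1993: 1200}
shr = {1992: 910, 1993: 1080}

weights = shr_weights(ucr, shr)

# Every SHR record in a given year receives that year's weight, so the
# weighted SHR victim count reproduces the UCR count for the year.
weighted_totals = {year: shr[year] * weights[year] for year in shr}
```

Because the weight is constant within a year, it scales counts up to the UCR total but does not change the distribution of victim or incident characteristics within that year.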

Communications Unit
Criminal Justice Information Services Division
Federal Bureau of Investigation
Module D3
1000 Custer Hollow Road
Clarksburg, WV 26306
Telephone: (304) 625-4995

Web-based Injury Statistics Query and Reporting System Fatal (WISQARS™ Fatal)

WISQARS Fatal provides mortality data related to injury. The mortality data reported in WISQARS Fatal come from death certificate data reported to the National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention. Data include causes of death reported by attending physicians, medical examiners, and coroners. They also include demographic information about decedents reported by funeral directors, who obtain that information from family members and other informants. NCHS collects, compiles, verifies, and prepares these data for release to the public. The data provide information about what types of injuries are leading causes of deaths, how common they are, and who they affect. These data are intended for a broad audience (the public, the media, public health practitioners and researchers, and public health officials) to increase their knowledge of injury.

WISQARS Fatal mortality reports provide tables of the total numbers of injury-related deaths and the death rates per 100,000 U.S. population. The reports list deaths according to cause (mechanism) and intent (manner) of injury by state, race, Hispanic origin, sex, and age groupings. For more information on WISQARS Fatal, contact:

National Center for Injury Prevention and Control
Mailstop K59
4770 Buford Highway NE
Atlanta, GA 30341-3724
Telephone: (770) 488-1506

National Crime Victimization Survey (NCVS)

The National Crime Victimization Survey (NCVS), administered for the U.S. Bureau of Justice Statistics by the U.S. Bureau of the Census, is the nation's primary source of information on crime and the victims of crime. Initiated in 1972 and redesigned in 1992, the NCVS collects detailed information annually on the frequency and nature of the crimes of rape, sexual assault, robbery, aggravated and simple assault, theft, household burglary, and motor vehicle theft experienced by Americans and their households each year. The survey measures both crimes reported to police and crimes not reported to police.

Readers should note that in 2003, in accordance with changes to the Office of Management and Budget's standards for the classification of federal data on race and ethnicity, the NCVS item on race/ethnicity was modified. A question on Hispanic origin is followed by a question on race. The new race question allows the respondent to choose more than one race and delineates Asian as a separate category from Native Hawaiian or Other Pacific Islander. Analysis conducted by the Demographic Surveys Division at the U.S. Census Bureau shows that the new race question had very little impact on the aggregate racial distribution of the NCVS respondents with one exception. There was a 1.6 percentage point decrease in the percent of respondents who reported themselves as White. Due to changes in race/ethnicity categories, comparisons of race/ethnicity across years should be made with caution.

The NCVS sample consists of about 63,124 households selected using a stratified, multistage cluster design. In the first stage, the primary sampling units (PSUs), consisting of counties or groups of counties, were selected. In the second stage, smaller areas, called Enumeration Districts (EDs), were selected from each sampled PSU. Finally, from selected EDs, clusters of four households, called segments, were selected for interview. At each stage, the selection was done proportionate to population size in order to create a self-weighting sample. The final sample was augmented to account for housing units constructed after the decennial Census. Within each sampled household, U.S. Bureau of the Census personnel interviewed all household members ages 12 and older to determine whether they had been victimized by the measured crimes during the 6 months preceding the interview.
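Selection proportionate to population size can be illustrated with a systematic probability-proportional-to-size (PPS) draw. The sketch below is one common way to implement PPS and is not taken from the Census Bureau's procedures, which this report does not specify; the function name and the unit sizes are hypothetical.

```python
import random
from itertools import accumulate

def pps_systematic(sizes, n, rng):
    """Select n unit indices with probability proportional to size."""
    total = sum(sizes)
    step = total / n                      # sampling interval
    start = rng.uniform(0, step)          # random start within first interval
    cum = list(accumulate(sizes))         # cumulative size totals
    chosen, i = [], 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative range contains the point.
        while i < len(cum) - 1 and cum[i] <= point:
            i += 1
        chosen.append(i)
    return chosen

# Hypothetical population sizes for four sampling units.
sizes = [10, 20, 30, 40]
sample = pps_systematic(sizes, 2, random.Random(0))
```

A unit larger than the sampling interval can be hit more than once in this scheme; production designs usually handle such units separately as certainty selections.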

The first NCVS interview with a housing unit is conducted in person. Subsequent interviews are conducted by telephone, if possible. About 87,422 persons ages 12 and older are interviewed each 6 months. Households remain in the sample for 3 years and are interviewed seven times at 6-month intervals. The initial interview at each sample unit is used only to bound future interviews to establish a time frame to avoid duplication of crimes uncovered in these subsequent interviews. After their seventh interview, households are replaced by new sample households. The NCVS has consistently obtained a response rate of about 92 percent at the household level. The completion rates for persons within households were about 87 percent. Thus, final response rates were about 79 percent in 2003. Weights were developed to permit estimates for the total U.S. population 12 years and older. For more information about the NCVS, contact:

Katrina Baum
Victimization Statistics Branch
Bureau of Justice Statistics
U.S. Department of Justice
810 7th Street NW
Washington, DC 20531
Telephone: (202) 307-5889

School Crime Supplement (SCS)

Created as a supplement to the NCVS and codesigned by the National Center for Education Statistics and Bureau of Justice Statistics, the School Crime Supplement (SCS) survey was conducted in 1989, 1995, 1999, 2001, and 2003 to collect additional information about school-related victimizations on a national level. This report includes data from the 1995, 1999, 2001, and 2003 collections. The 1989 data are not included in this report as a result of methodological changes to the NCVS and SCS. The survey was designed to assist policymakers as well as academic researchers and practitioners at the federal, state, and local levels so that they can make informed decisions concerning crime in schools. The SCS asks students a number of key questions about their experiences with and perceptions of crime and violence that occurred inside their school, on school grounds, on a school bus, or on the way to or from school. Additional questions not included in the NCVS were also added to the SCS, such as those concerning preventive measures used by the school, students' participation in after-school activities, students' perceptions of school rules, the presence of weapons and street gangs in school, the presence of hate-related words and graffiti in school, student reports of bullying and reports of rejection at school, and the availability of drugs and alcohol in school, as well as attitudinal questions relating to fear of victimization and avoidance behavior at school.

In all SCS survey years, the SCS was conducted for a 6-month period from January through June in all households selected for the NCVS (see discussion above for information about the NCVS sampling design and changes to the race/ethnicity item made in 2003). It should be noted that the initial NCVS interview is included in the SCS data collection. Within these households, the eligible respondents for the SCS were those household members who had attended school at any time during the 6 months preceding the interview and were enrolled in grades 6-12 in a school that would help them advance toward eventually receiving a high school diploma. The age range of students covered in this report is 12-18 years of age. Eligible respondents were asked the supplemental questions in the SCS only after completing their entire NCVS interview.

In 2001, the SCS survey instrument was modified from previous collections in three ways. First, in 1995 and 1999, “at school” was defined for respondents as in the school building, on the school grounds, or on a school bus. In 2001, the definition for "at school" was changed to mean in the school building, on school property, on a school bus, or going to and from school. This change was made to the 2001 questionnaire in order to be consistent with the definition of “at school” as it is constructed in the NCVS and was also used as the definition in 2003. Cognitive interviews conducted by the U.S. Bureau of the Census on the 1999 School Crime Supplement suggested that modifications to the definition of “at school” would not have a substantial impact on the estimates.

The prevalence of victimization for 1995, 1999, 2001, and 2003 was calculated by using NCVS incident variables appended to the 1995, 1999, 2001, and 2003 SCS data files. The NCVS type of crime variable was used to classify victimizations of students in the SCS as serious violent, violent, or theft. The NCVS variables asking where the incident happened and what the victim was doing when it happened were used to ascertain whether the incident happened at school. For prevalence of victimization, the NCVS definition of “at school” includes in the school building, on school property, or on the way to or from school.

Second, the SCS questions pertaining to fear and avoidance changed between 1999 and 2001. In 1995 and 1999, students were asked if they avoided places or were fearful because they thought someone would "attack or harm" them. In 2001 and 2003, students were asked if they avoided places or were fearful because they thought someone would "attack or threaten to attack them." These changes should be considered when making comparisons between the 1995 and 1999 data and the 2001 and 2003 data.

Third, the SCS question pertaining to gangs changed in the 2001 SCS. The introduction and definition of gangs as well as the placement of the item in the questionnaire changed in the 2001 SCS. Because of these changes, the reader should be cautioned not to compare results based on the 2001 and 2003 SCS presented in this report with those estimates of gangs presented in previous reports.

Total victimization is a combination of violent victimization and theft. If the student reported an incident of either violent or theft victimization or both, he or she is counted as having experienced “total” victimization. Serious violent crimes include rape, sexual assault, robbery, and aggravated assault. Violent crimes include serious violent crimes and simple assault.
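The nesting of these categories can be expressed as a small classification sketch. The crime labels below are plain strings chosen for illustration; they are not the actual NCVS type-of-crime codes, and the function name is hypothetical.

```python
SERIOUS_VIOLENT = {"rape", "sexual assault", "robbery", "aggravated assault"}
VIOLENT = SERIOUS_VIOLENT | {"simple assault"}   # serious violent + simple assault
THEFT = {"theft"}

def victimization_flags(crime_types):
    """Flag which victimization categories a student's incidents fall into."""
    crimes = set(crime_types)
    violent = bool(crimes & VIOLENT)
    theft = bool(crimes & THEFT)
    return {
        "serious_violent": bool(crimes & SERIOUS_VIOLENT),
        "violent": violent,
        "theft": theft,
        # "Total" counts the student once if either type (or both) occurred.
        "total": violent or theft,
    }

example = victimization_flags(["simple assault", "theft"])
```

A student with both a violent incident and a theft is counted once under "total", so total prevalence is not the sum of the violent and theft prevalences.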

A total of 9,728 students participated in the 1995 SCS, 8,398 in 1999, 8,374 in 2001, and 7,152 in 2003. In the 2003 SCS, the household completion rate was 92 percent. In the 1995, 1999, and 2001 SCS, the household completion rates were 95 percent, 94 percent, and 93 percent, respectively; and the student completion rates were 78 percent, 78 percent, and 77 percent, respectively. For the 2003 SCS, the student completion rate was 70 percent.

Thus, the overall unweighted SCS response rate (calculated by multiplying the household completion rate by the student completion rate) was 74 percent in 1995, 73 percent in 1999, 72 percent in 2001, and 64 percent in 2003. Response rates for most survey items were high, typically over 95 percent of all eligible respondents. The weights were developed to compensate for differential probabilities of selection and nonresponse. The weighted data permit inferences about the eligible student population who were enrolled in schools in 1995, 1999, 2001, and 2003. For SCS data, a full nonresponse bias analysis has not been conducted. For more information about SCS, contact:
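The overall rates quoted above follow directly from the household and student completion rates. A minimal check in Python, using the rounded percentages given in the text (the variable names are illustrative):

```python
# Completion rates in whole percentages, as reported in the text.
household = {1995: 95, 1999: 94, 2001: 93, 2003: 92}
student = {1995: 78, 1999: 78, 2001: 77, 2003: 70}

# Overall rate = household rate x student rate, expressed in percent.
overall = {year: round(household[year] * student[year] / 100)
           for year in household}

print(overall)  # {1995: 74, 1999: 73, 2001: 72, 2003: 64}
```

Because the inputs are already rounded, a product computed this way can differ by a point from a rate computed on unrounded figures; for these four years the two agree.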

Kathryn A. Chandler
National Center for Education Statistics
1990 K Street NW
Washington, DC 20006
Telephone: (202) 502-7486

Youth Risk Behavior Survey (YRBS)

The National School-Based Youth Risk Behavior Survey (YRBS) is one component of the Youth Risk Behavior Surveillance System (YRBSS), an epidemiological surveillance system developed by the Centers for Disease Control and Prevention (CDC) to monitor the prevalence of youth behaviors that most influence health.1 The YRBS focuses on priority health-risk behaviors established during youth that result in the most significant mortality, morbidity, disability, and social problems during both youth and adulthood. This report uses 1993, 1995, 1997, 1999, 2001, and 2003 YRBS data.

The YRBS uses a three-stage cluster sampling design to produce a nationally representative sample of students in grades 9-12 in the United States. The target population consisted of all public and private school students in grades 9-12 in the 50 states and the District of Columbia. The first stage consisted of selecting primary sampling units (PSUs) from strata formed on the basis of urbanization and the relative percentage of Black and Hispanic students in the PSU. These PSUs are either large counties or groups of smaller, adjacent counties. At the second stage, schools were selected with probability proportional to school enrollment size.

Schools with substantial numbers of Black and Hispanic students were sampled at relatively higher rates than all other schools. The final stage of sampling consisted of randomly selecting within each chosen school at each grade 9-12 one or two intact classes of a required subject, such as English or social studies. All students in selected classes were eligible to participate. Approximately 16,300, 10,900, 16,300, 15,300, 13,600, and 15,200 students participated in the 1993, 1995, 1997, 1999, 2001, and 2003 surveys, respectively.

The overall response rate was 70 percent for the 1993 survey, 60 percent for the 1995 survey, 69 percent for the 1997 survey, 66 percent for the 1999 survey, 63 percent for the 2001 survey, and 67 percent for the 2003 survey. NCES standards call for response rates of 85 percent or better for cross-sectional surveys and for bias analyses when that rate is not achieved. For YRBS data, a full nonresponse bias analysis has not been done because the data necessary to do the analysis are not available. The weights were developed to adjust for nonresponse and the oversampling of Black and Hispanic students in the sample. The final weights were constructed so that only weighted proportions of students (not weighted counts of students) in each grade matched national population projections. Where YRBS data are presented, accurate national population projections are provided from the Digest of Education Statistics.

State-level data were downloaded from the Youth Online: Comprehensive Results web page. Each state and local school-based YRBS employs a two-stage cluster sample design to produce representative samples of students in grades 9-12 in its jurisdiction. All except a few state and local samples include only public schools, and each local sample includes only schools in the funded school district (e.g., San Diego Unified School District) rather than in the entire city (e.g., greater San Diego area).

In the first sampling stage in all except a few states and districts, schools are selected with probability proportional to school enrollment size. In the second sampling stage, intact classes of a required subject or intact classes during a required period (e.g., second period) are selected randomly. All students in sampled classes are eligible to participate. Certain states and districts modify these procedures to meet their individual needs. For example, in a given state or district, all schools, rather than a sample of schools, might be selected to participate. State and local surveys that have a scientifically selected sample, appropriate documentation, and an overall response rate of at least 60 percent are weighted. The overall response rate reflects the school response rate multiplied by the student response rate. These three criteria are used to ensure that the data from those surveys can be considered representative of students in grades 9-12 in that jurisdiction. A weight is applied to each record to adjust for student nonresponse and the distribution of students by grade, sex, and race/ethnicity in each jurisdiction. Therefore, weighted estimates are representative of all students in grades 9-12 attending schools in each jurisdiction. Surveys that do not have an overall response rate of at least 60 percent or do not have appropriate documentation are not weighted and are not included in this report.
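The three weighting criteria can be written as a simple eligibility check. This sketch assumes the response-rate threshold means "60 percent or higher"; the function and parameter names are hypothetical, and the example rates are invented for illustration.

```python
def overall_response_rate(school_rate, student_rate):
    """Overall rate is the school rate multiplied by the student rate."""
    return school_rate * student_rate

def is_weighted(scientific_sample, documented, school_rate, student_rate,
                threshold=0.60):
    """Return True when a state or local survey meets all three criteria."""
    rate = overall_response_rate(school_rate, student_rate)
    return scientific_sample and documented and rate >= threshold

# Invented examples: rates are proportions, not percentages.
site_a = is_weighted(True, True, school_rate=0.90, student_rate=0.70)   # 0.63
site_b = is_weighted(True, True, school_rate=0.67, student_rate=0.60)   # about 0.40
site_c = is_weighted(True, False, school_rate=1.00, student_rate=0.94)  # undocumented
```

As `site_b` suggests, a site can clear 60 percent on each component rate and still fall well short on the overall rate, because the two rates multiply.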

In 2003, a total of 32 states and 20 districts had weighted data. In sites with weighted data, the student sample sizes for the state and local YRBS ranged from 968 to 9,320. School response rates ranged from 67 to 100 percent, student response rates ranged from 60 to 94 percent, and overall response rates ranged from 60 to 90 percent.

Readers should note that reports of these data published by the CDC do not include percentages where the denominator includes fewer than 100 unweighted cases. However, NCES publications do not include percentages where the denominator includes fewer than 30 unweighted cases. Therefore, estimates presented here may not appear in CDC publications of YRBS estimates and are considered unstable by CDC standards.

In 1999, in accordance with changes to the Office of Management and Budget's standards for the classification of federal data on race and ethnicity, the YRBS item on race/ethnicity was modified.

The version of the race and ethnicity question used in 1993, 1995, and 1997 was:

How do you describe yourself?
  1. White - not Hispanic
  2. Black - not Hispanic
  3. Hispanic or Latino
  4. Asian or Pacific Islander
  5. American Indian or Alaskan Native
  6. Other

The version used in 1999, 2001, and 2003 was:

How do you describe yourself? (Select one or more responses.)
  1. American Indian or Alaska Native
  2. Asian
  3. Black or African American
  4. Hispanic or Latino
  5. Native Hawaiian or Other Pacific Islander
  6. White

This new version of the question used in 1999, 2001, and 2003 results in the possibility of respondents marking more than one category. While more accurately reflecting respondents' racial and ethnic identity, the new item cannot be directly compared to responses to the old item. A recent study by Brener, Kann, and McManus (2003) found that allowing students to select more than one response to the race/ethnicity question on the YRBS had only a minimal effect on reported race/ethnicity among high school students.

For additional information about the YRBS, contact:

Laura Kann
Division of Adolescent and School Health
National Center for Chronic Disease Prevention and Health Promotion
Centers for Disease Control and Prevention
Mailstop K-33
4770 Buford Highway NE
Atlanta, GA 30341-3717
Telephone: (770) 488-6181

Schools and Staffing Survey (SASS)

This report draws upon data on teacher victimization from the Schools and Staffing Survey (SASS), which provides national- and state-level data on public schools and national- and affiliation-level data on private schools. The 1993-94 and 1999-2000 SASS were collected by the U.S. Bureau of the Census and sponsored by the National Center for Education Statistics (NCES). SASS consists of four sets of linked surveys, including surveys of schools, the principals of each selected school, a subsample of teachers within each school, and public school districts.

The sampling frames for the 1993-94 and 1999-2000 SASS were created using the 1991-92 and 1997-98 NCES Common Core of Data (CCD) Public School Universe File, respectively. Data were collected by multistage sampling, which began with the selection of schools. This report uses 1993-94 and 1999-2000 SASS data. Approximately 9,900 public schools and 3,300 private schools were selected to participate in the 1993-94 SASS and 9,900 public schools and 3,600 private schools were selected to participate in the 1999-2000 SASS. Within each school, teachers selected were further stratified into one of five teacher types in the following hierarchy: (1) Asian or Pacific Islander; (2) American Indian, Aleut, or Eskimo; (3) teachers who teach classes designed for students with limited English proficiency; (4) teachers in their first, second, or third year of teaching; and (5) teachers not classified in any of the other groups. Within each teacher stratum, teachers were selected systematically with equal probability. In 1993-94, approximately 53,000 public school teachers and 10,400 private school teachers were sampled. In 1999-2000, 56,400 public school teachers and 10,800 private school teachers were sampled.
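Because the five teacher types form a hierarchy, each teacher falls into the first stratum whose condition he or she meets. A minimal sketch of that assignment, using hypothetical dictionary field names rather than actual SASS variable names:

```python
def teacher_stratum(teacher):
    """Assign a teacher to the first matching stratum in the hierarchy (1-5)."""
    if teacher.get("race") == "Asian or Pacific Islander":
        return 1
    if teacher.get("race") == "American Indian, Aleut, or Eskimo":
        return 2
    if teacher.get("teaches_lep_classes", False):
        return 3
    year = teacher.get("year_of_teaching")
    if year is not None and year <= 3:      # first, second, or third year
        return 4
    return 5                                # not classified in any other group

# A new teacher who is also Asian or Pacific Islander lands in stratum 1,
# because earlier strata take precedence over later ones.
new_api_teacher = teacher_stratum(
    {"race": "Asian or Pacific Islander", "year_of_teaching": 1}
)
```

The ordering matters only for teachers who meet more than one condition; everyone else is unaffected by the hierarchy.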

This report focuses on responses from teachers. The overall weighted response rates were 83 percent and 77 percent for public school teachers in 1993-94 and 1999-2000, respectively. For private school teachers, the overall weighted response rates were 73 percent and 67 percent in 1993-94 and 1999-2000, respectively. Values were imputed for questionnaire items that should have been answered but were not. For additional information about SASS, contact:

Kerry Gruber
National Center for Education Statistics
1990 K Street NW
Washington, DC 20006
Telephone: (202) 502-7349

School Survey on Crime and Safety (SSOCS)

The School Survey on Crime and Safety (SSOCS) was conducted by NCES in Spring/Summer of the 1999-2000 school year. SSOCS focuses on incidents of specific crimes/offenses and a variety of specific discipline issues in public schools. It also covers characteristics of school policies, school violence prevention programs and policies, and school characteristics that have been associated with school crime. The survey was conducted with a nationally representative sample of regular public elementary, middle, and high schools in the 50 states and the District of Columbia. Special education, alternative and vocational schools, schools in the territories, and schools that taught only prekindergarten, kindergarten, or adult education were not included in the sample.

The sampling frame for the SSOCS:2000 was constructed from the public school universe file created for the 2000 Schools and Staffing Survey from the 1997-98 NCES Common Core of Data (CCD) Public School Universe File. The sample was stratified by instructional level, type of locale, and enrollment size. Within the primary strata, schools were also sorted by geographic region and by percentage of minority enrollment. The sample sizes were then allocated to the primary strata in rough proportion to the aggregate square root of the size of enrollment of schools in the stratum. A total of 3,300 schools were selected for the study. Among those, 2,270 schools completed the survey. In March 2000, questionnaires were mailed to school principals, who were asked to complete the survey or to have it completed by the person most knowledgeable about discipline issues at the school. The weighted overall response rate was 70 percent, and item nonresponse rates ranged from 0 to 2.7 percent on the public-use data file. For SSOCS data, a full nonresponse bias analysis was conducted and no bias on the basis of nonresponse was detected. The weights were developed to adjust for the variable probabilities of selection and differential nonresponse and can be used to produce national estimates for regular public schools in the 1999-2000 school year. For more information about the School Survey on Crime and Safety, contact:
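Allocation in proportion to the aggregate square root of enrollment can be sketched as follows. The stratum names and enrollment figures are invented, the function name is hypothetical, and the result is left unrounded because the text describes the allocation only as "rough" proportion.

```python
import math

def allocate_sample(strata_enrollments, total_sample):
    """Split total_sample across strata by the sum of sqrt(enrollment)."""
    measure = {name: sum(math.sqrt(e) for e in enrollments)
               for name, enrollments in strata_enrollments.items()}
    total_measure = sum(measure.values())
    return {name: total_sample * m / total_measure
            for name, m in measure.items()}

# Invented strata: two small schools versus one large school.
strata = {"small_schools": [100, 400], "large_schools": [900]}
allocation = allocate_sample(strata, total_sample=100)
```

In this toy case the two strata have the same total enrollment measure (10 + 20 versus 30) and so receive equal shares, even though the large school enrolls more students than the two small schools combined. Using the square root dampens the influence of very large schools compared with allocating by raw enrollment.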

Kathryn A. Chandler
National Center for Education Statistics
1990 K Street NW
Washington, DC 20006
Telephone: (202) 502-7486

Accuracy of Estimates

The accuracy of any statistic is determined by the joint effects of nonsampling and sampling errors. Both types of error affect the estimates presented in this report. Several sources can contribute to nonsampling errors. For example, members of the population of interest are inadvertently excluded from the sampling frame; sampled members refuse to answer some of the survey questions (item nonresponse) or all of the survey questions (questionnaire nonresponse); mistakes are made during data editing, coding, or entry; the responses that respondents provide differ from the “true” responses; or measurement instruments such as tests or questionnaires fail to measure the characteristics they are intended to measure. Although nonsampling errors due to questionnaire and item nonresponse can be reduced somewhat by the adjustment of sample weights and imputation procedures, correcting nonsampling errors or gauging the effects of these errors is usually difficult.

Sampling errors occur because observations are made on samples rather than on entire populations. Surveys of population universes are not subject to sampling errors. Estimates based on a sample will differ somewhat from those that would have been obtained by a complete census of the relevant population using the same survey instruments, instructions, and procedures. The standard error of a statistic is a measure of the variation due to sampling; it indicates the precision of the statistic obtained in a particular sample. In addition, the standard errors for two sample statistics can be used to estimate the precision of the difference between the two statistics and to help determine whether the difference based on the sample is large enough so that it represents the population difference.

Most of the data used in this report were obtained from complex sampling designs rather than simple random samples. Because of the features of complex sampling, calculating standard errors requires procedures that are markedly different from those used when the data are from a simple random sample. The Taylor series approximation technique or the balanced repeated replication (BRR) method was used to estimate most of the statistics and their standard errors in this report. Figure A.3 (PDF 90 KB) lists the various methods used to compute standard errors for different data sets.
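
As an illustration of the replication idea, the following is a minimal sketch of the basic BRR standard error, in which the variance is the average squared deviation of the replicate estimates from the full-sample estimate. The report does not state which BRR variant each survey used (for example, a Fay-adjusted version), so the basic form is an assumption, and the function name and all numbers are hypothetical.

```python
import math

def brr_standard_error(full_estimate, replicate_estimates):
    """Basic BRR: variance is the mean squared deviation of the
    replicate estimates from the full-sample estimate."""
    r = len(replicate_estimates)
    variance = sum((rep - full_estimate) ** 2 for rep in replicate_estimates) / r
    return math.sqrt(variance)

# Hypothetical full-sample estimate and replicate estimates (percentages).
full = 10.0
replicates = [9.0, 11.0, 10.5, 9.5]
se = brr_standard_error(full, replicates)
print(se)
```

In practice each replicate estimate is recomputed from the survey data using a set of replicate weights supplied with the data file.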

Standard error calculation for data from the National Crime Victimization Survey and the School Crime Supplement was based on the Taylor series approximation method using PSU and strata variables available from each data set.

For statistics based on all years of NCVS data, standard errors were derived from a formula developed by the U.S. Bureau of the Census. The formula uses three generalized variance function (GVF) constant parameters that represent the curve fitted to the individual standard errors calculated using the Jackknife Repeated Replication technique. The formulas used to compute the adjusted standard errors associated with percentages or population counts can be found in Figure A.3 (PDF 90 KB).

Statistical Procedures

The comparisons in the text have been tested for statistical significance to ensure that the differences are larger than might be expected due to sampling variation. Unless otherwise noted, all statements cited in the report are statistically significant at the .05 level. Several test procedures were used, depending upon the type of data being analyzed and the nature of the statement being tested. The primary test procedure used in this report was the Student's t statistic, which tests the difference between two sample estimates, for example, between males and females. The formula used to compute the t statistic is as follows:

$$t = \frac{E_1 - E_2}{\sqrt{se_1^2 + se_2^2}} \qquad (1)$$

where E1 and E2 are the estimates to be compared and se1 and se2 are their corresponding standard errors. Note that this formula is valid only for independent estimates. When the estimates are not independent (for example, when comparing a total percentage with that for a subgroup included in the total), a covariance term (i.e., 2*se1*se2) must be added to the denominator of the formula:

$$t = \frac{E_1 - E_2}{\sqrt{se_1^2 + se_2^2 + 2\,se_1\,se_2}} \qquad (2)$$

Once the t value was computed, it was compared with published tables of critical values at a chosen significance level, called the alpha level. For this report, an alpha level of .05 was used, which corresponds to a critical t value of 1.96 for a two-tailed test with large samples. If the absolute value of t was larger than 1.96, the difference between the two estimates was statistically significant at the .05 level (that is, at the 95 percent confidence level).
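
The two forms of the t statistic described above can be sketched as follows. The function names and the example estimates are hypothetical. The dependent-estimate version follows the text's description of adding the covariance term 2*se1*se2 under the square root, and the comparison uses the absolute value of t, which assumes a two-sided test.

```python
import math

CRITICAL_VALUE = 1.96  # critical t value at alpha = .05, as stated in the text

def t_independent(e1, se1, e2, se2):
    """Formula (1): t statistic for two independent estimates."""
    return (e1 - e2) / math.sqrt(se1 ** 2 + se2 ** 2)

def t_dependent(e1, se1, e2, se2):
    """Formula (2) as described in the text: the term 2*se1*se2 is added
    under the square root when the estimates are not independent."""
    return (e1 - e2) / math.sqrt(se1 ** 2 + se2 ** 2 + 2 * se1 * se2)

def is_significant(t, critical=CRITICAL_VALUE):
    """Two-sided comparison against the critical value (an assumption)."""
    return abs(t) > critical

# Hypothetical percentages and standard errors, e.g., males versus females.
t1 = t_independent(12.0, 0.8, 9.0, 0.6)
t2 = t_dependent(12.0, 0.8, 9.0, 0.6)
print(t1, is_significant(t1))
print(t2, is_significant(t2))
```

Because the added term enlarges the denominator, the dependent form always yields a t value closer to zero than the independent form for the same inputs, which makes it the more conservative test.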

A linear trend test was used when differences among percentages were examined relative to ordered categories of a variable, rather than the differences between two discrete categories. This test allows one to examine whether, for example, the percentage of students using drugs increased (or decreased) over time or whether the percentage of students who reported being physically attacked in school increased (or decreased) with their age. Based on a regression with, for example, student's age as the independent variable and whether a student was physically attacked as the dependent variable, the test involves computing the regression coefficient (b) and its corresponding standard error (se). The ratio of these two (b/se) is the test statistic t. If the absolute value of t is greater than 1.96, the critical value for one comparison at the .05 alpha level, the hypothesis that there is no linear relationship between student's age and being physically attacked is rejected.
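
A minimal sketch of the b/se computation follows, using ordinary least squares on unweighted data. This is a simplification: the report's estimates came from complex samples, so the actual standard errors would reflect the survey weights and design, which this sketch ignores. The function name and the data are hypothetical.

```python
import math

def slope_and_se(x, y):
    """Ordinary least squares slope b and its standard error se
    for a simple regression of y on x (unweighted)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom.
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return b, se

# Hypothetical data: student age and whether the student was attacked (1 = yes).
ages = [12, 13, 14, 15, 16, 17, 18, 12, 13, 14]
attacked = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
b, se = slope_and_se(ages, attacked)
t = b / se
print(b, se, t, abs(t) > 1.96)
```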

When using data sets in which multiple years of data are available, a Bonferroni adjustment to the significance level was used when one year's estimate was compared to another. The Bonferroni adjustment to the significance level was used to ensure that the significance level for the tests as a series was at the .05 level. Generally, when multiple statistical comparisons are made, it becomes increasingly likely that an indication of a population difference is erroneous. Even when there is no difference in the population, at an alpha of .05, there is still a 5 percent chance of concluding that an observed t value representing one comparison in the sample is large enough to be statistically significant. As the number of years and thus the number of comparisons increase, so does the risk of making such an erroneous inference. The Bonferroni procedure corrects the significance (or alpha) level for the total number of comparisons made within a particular classification variable. For each classification variable, there are (K*(K-1)/2) possible comparisons (or nonredundant pairwise combinations), where K is the number of categories. The Bonferroni procedure divides the alpha level for a single t test by the number of possible pairwise comparisons in order to produce a new alpha level that is corrected for the fact that multiple contrasts are being made. As a result, the t value for a certain alpha level (e.g., .05) increases, which makes it more difficult to claim that the difference observed is statistically significant.
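
The adjustment can be illustrated with a short sketch that computes the number of pairwise comparisons, the adjusted alpha level, and the corresponding two-sided critical value. The critical value is taken from the standard normal distribution, a large-sample assumption consistent with the 1.96 value used in this report; the function name is hypothetical.

```python
from statistics import NormalDist

def bonferroni(k, alpha=0.05):
    """Return (number of pairwise comparisons, adjusted alpha,
    two-sided critical value) for k categories, with k >= 2."""
    comparisons = k * (k - 1) // 2
    adjusted_alpha = alpha / comparisons
    # Large-sample (normal) critical value for a two-sided test.
    critical = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    return comparisons, adjusted_alpha, critical

# Example: five survey years compared pairwise.
comparisons, adjusted_alpha, critical = bonferroni(5)
print(comparisons, adjusted_alpha, round(critical, 2))
```

With five categories there are 10 possible comparisons, so each test is carried out at an alpha level of .005, and the critical value rises from 1.96 to roughly 2.81.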

While many descriptive comparisons in this report were tested using the t statistic or the F statistic, some comparisons among categories of an ordered variable with three or more levels involved a test for a linear trend across all categories, rather than a series of tests between pairs of categories. In this report, when differences among percentages were examined relative to a variable with ordered categories, Analysis of Variance (ANOVA) was used to test for a linear relationship between the two variables. To do this, ANOVA models included orthogonal linear contrasts corresponding to successive levels of the independent variable. The squares of the Taylorized standard errors (that is, standard errors that were calculated by the Taylor series method), the variance between the means, and the unweighted sample sizes were used to partition the total sum of squares into within- and between-group sums of squares. These were used to create mean squares for the within- and between-group variance components and their corresponding F statistics, which were then compared with published values of F for a significance level of .05. Significant values of both the overall F and the F associated with the linear contrast term were required as evidence of a linear relationship between the two variables.

1 For more information on the YRBSS methodology, see Brener et al. (2004).
