PEDAR: Research Methodology  First-Generation Students in Postsecondary Education: A Look at Their College Transcripts
 Data Analysis System: Differences Between Means

The descriptive comparisons in this report were tested using Student’s t statistic. Differences between estimates were tested against the probability of a Type I error,3 or significance level. Significance levels were determined by calculating Student’s t values for the differences between each pair of means or proportions and comparing them with published tables of significance levels for two-tailed hypothesis testing (p < .05).

Student’s t values may be computed to test the difference between estimates with the following formula:

$$ t = \frac{E_1 - E_2}{\sqrt{se_1^2 + se_2^2}} \qquad (1) $$

where E1 and E2 are the estimates to be compared and se1 and se2 are their corresponding standard errors. This formula is valid only for independent estimates. When estimates are not independent, a covariance term must be added to the formula:

$$ t = \frac{E_1 - E_2}{\sqrt{se_1^2 + se_2^2 - 2r\,se_1 se_2}} \qquad (2) $$

where r is the correlation between the two estimates.4 This formula is used when comparing two percentages from a distribution that adds to 100. If the comparison is between the mean of a subgroup and the mean of the total group, the following formula is used:

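$$ t = \frac{E_{\text{sub}} - E_{\text{tot}}}{\sqrt{se_{\text{sub}}^2 + se_{\text{tot}}^2 - 2p\,se_{\text{sub}}^2}} \qquad (3) $$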

where p is the proportion of the total group contained in the subgroup.5 The estimates, standard errors, and correlations can all be obtained from the DAS.
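The three cases described above can be sketched in a few lines of Python. This is a minimal illustration, not the DAS implementation: the function names are hypothetical, the formulas are the usual forms for independent estimates, correlated estimates, and a subgroup compared with its total group (an assumption here), the input values are made up, and 1.96 is the large-sample two-tailed critical value at the .05 level.

```python
import math

# Large-sample two-tailed critical value for p < .05 (assumption: enough
# degrees of freedom that the normal approximation is adequate).
CRITICAL_T = 1.96

def t_independent(e1, e2, se1, se2):
    """t for two independent estimates e1, e2 with standard errors se1, se2."""
    return (e1 - e2) / math.sqrt(se1 ** 2 + se2 ** 2)

def t_correlated(e1, e2, se1, se2, r):
    """t for two estimates that are not independent; r is their correlation."""
    return (e1 - e2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)

def t_subgroup_vs_total(e_sub, e_tot, se_sub, se_tot, p):
    """t for a subgroup mean versus the total-group mean.

    p is the proportion of the total group contained in the subgroup.
    """
    return (e_sub - e_tot) / math.sqrt(
        se_sub ** 2 + se_tot ** 2 - 2 * p * se_sub ** 2
    )

def is_significant(t, critical=CRITICAL_T):
    """Two-tailed test: compare the absolute t value with the critical value."""
    return abs(t) > critical

# Made-up estimates, for illustration only.
t1 = t_independent(45.0, 40.0, 1.5, 2.0)
t2 = t_correlated(45.0, 40.0, 1.5, 2.0, r=-0.5)
t3 = t_subgroup_vs_total(50.0, 45.0, 2.0, 1.0, p=0.25)

for label, t in (("independent", t1), ("correlated", t2), ("subgroup", t3)):
    print(f"{label}: t = {t:.3f}, significant = {is_significant(t)}")
```

With these inputs the same 5-point difference is significant when the estimates are treated as independent but not when a negative correlation is taken into account, which is why the choice of formula matters.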

There are hazards in reporting statistical tests for each comparison. First, comparisons based on large t statistics may appear to merit special attention. This can be misleading since the magnitude of the t statistic is related not only to the observed differences in means or percentages but also to the number of respondents in the specific categories used for comparison. Hence, a small difference compared across a large number of respondents would produce a large t statistic.
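A small numerical sketch of this first hazard: the difference and the spread are held fixed while only the number of respondents changes. It assumes simple random sampling, so that each standard error is sd / sqrt(n); the survey itself uses a complex sample design, so this is illustrative only, and the function name and values are hypothetical.

```python
import math

def t_for_sample_size(diff, sd, n):
    """t for a fixed difference between two groups of n respondents each.

    Assumes simple random sampling: se = sd / sqrt(n) in each group.
    """
    se = sd / math.sqrt(n)
    return diff / math.sqrt(se ** 2 + se ** 2)

# Same 1-point difference and same standard deviation of 10 in every row;
# only the group size changes.
for n in (100, 1000, 10000):
    print(f"n = {n:>5}: t = {t_for_sample_size(1.0, 10.0, n):.2f}")
```

Under these assumptions the identical 1-point difference falls short of the 1.96 threshold with 100 respondents per group and exceeds it with 1,000, so the size of t says little by itself about the size of the difference.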

A second hazard in reporting statistical tests is the possibility of reporting a “false positive,” or Type I error. In the case of a t statistic, a false positive occurs when a particular sample shows a statistically significant difference although there is no difference in the underlying population. Statistical tests are designed to control the probability of this type of error, denoted by alpha. The alpha level of .05 selected for findings in this report indicates that a difference of a certain magnitude or larger would be produced no more than one time out of 20 when there was no actual difference in the quantities in the underlying population. When a test yields a t value that is significant at the .05 level, we reject the null hypothesis that there is no difference between the two quantities. Failing to reject the null hypothesis (i.e., finding no difference), however, does not necessarily imply that the values are the same or equivalent.
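The meaning of the .05 alpha level can be checked with a short simulation. The sketch below repeatedly draws two samples from the same population, so any significant difference is a false positive by construction. The sample size, number of trials, seed, and the large-sample critical value of 1.96 are all assumptions chosen for illustration, and simple random sampling is assumed.

```python
import math
import random

def mean_and_se(values):
    """Return the sample mean and its standard error (simple random sample)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

def false_positive_rate(trials=2000, n=50, critical=1.96, seed=0):
    """Share of trials in which two samples from one population differ 'significantly'."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # Both samples come from the same normal population: no true difference.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean_a, se_a = mean_and_se(a)
        mean_b, se_b = mean_and_se(b)
        t = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
        if abs(t) > critical:
            false_positives += 1
    return false_positives / trials

rate = false_positive_rate()
print(f"share of false positives: {rate:.3f}")
```

The printed share should land near .05, roughly one test in 20, although it will vary somewhat with the seed and the number of trials.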

A third hazard in reporting statistical tests for each comparison occurs when making multiple comparisons among categories of an independent variable. For example, when making paired comparisons among different racial/ethnic groups, the probability of a Type I error for these comparisons taken as a group is larger than the probability for a single comparison. When more than one difference between groups of related characteristics, or “families,” is tested for statistical significance, one must apply a standard that assures a level of significance for all of those comparisons taken together. In this analysis, adjustments for multiple comparisons were not made because a subsequent multivariate analysis was conducted that included all independent variables for which significant differences were found (see description below). A difference that was significant by chance alone would not be found significant in the multivariate analysis.
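How quickly the group-level error probability grows can be sketched as follows. The calculation assumes the comparisons are independent, which paired comparisons among the same groups are not exactly, so it is an approximation. The Bonferroni threshold is shown only as one common example of a family-level standard; the report does not name a particular adjustment and did not apply one. All function names are hypothetical.

```python
from math import comb

def n_pairwise(groups):
    """Number of distinct paired comparisons among a set of groups."""
    return comb(groups, 2)

def familywise_error(alpha, k):
    """Probability of at least one Type I error in k independent tests."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(alpha, k):
    """Per-comparison threshold that keeps the family-level rate at or below alpha."""
    return alpha / k

# Example: all paired comparisons among five categories, each tested at .05.
k = n_pairwise(5)
fwer = familywise_error(0.05, k)
print(f"{k} comparisons, family-level Type I error probability = {fwer:.3f}")
print(f"Bonferroni per-comparison threshold = {bonferroni_alpha(0.05, k):.4f}")
```

For five categories there are 10 paired comparisons, and under the independence assumption the chance of at least one false positive among them is about .40 rather than .05.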
