PEDAR: Research Methodology  Waiting to Attend College: Students Who Delay Their Postsecondary Enrollment
Multivariate Commonality Analysis
 Introduction

Members of the public and other researchers can make use of NCES results in several ways. The most popular is to read the written reports. Others include obtaining and analyzing public-use and restricted-use data files, which allow researchers to carry out and publish their own secondary analyses of NCES data.

It is very important when reading NCES reports to remember that they are descriptive in nature. That is, they are limited to describing some aspect of the condition of education. These results are usefully viewed as suggesting various ideas to be further examined in light of other data, including state and local data, and in the context of the large research literature elaborating on the many factors predicting and contributing to educational achievement or to other outcome variables of interest.

However, some readers are tempted to make unwarranted causal inferences from simple cross tabulations. It is never the case that a simple cross tabulation of any variable with a measure of educational achievement is conclusive proof that differences in that variable are a cause of differential educational achievement or that differences in that variable explain any other outcome variable. The old adage that “correlation is not causation” is a wise precaution to keep in mind when considering the results of NCES reports. Experienced researchers are aware of the design limitations of many NCES data collections. They routinely formulate multiple hypotheses that take these limitations into account and readers of this volume are encouraged to do likewise. As part of the Institute of Education Sciences, NCES has a responsibility to try to discourage misleading inferences from the data presented and to educate the public on the genuine difficulty of making valid causal inferences in a field as complex as education. Our reports are carefully worded to achieve this end.

This focus on description, eschewing causal analysis, extends to multivariate analyses as well as bivariate ones. Some NCES reports go beyond presenting simple cross tabulations and present results from multiple regression equations that include many different independent (“predictor”) variables. This can be useful to readers, especially those without the time or training to access the data on their own. Because many of the independent variables included in descriptive reports are related to each other and to the outcome they are predicting, a multivariate approach can help users understand their interrelation. Many of the independent variables included in this study are related, and to some extent the patterns of differences displayed in the descriptive tables reflect a common variation. For instance, when examining degree attainment or persistence by delayed enrollment status, some of the observed relationship may be due to differences in other factors related to delaying enrollment (e.g., delayed entrants enroll in public 2-year institutions and attend part time at higher rates than immediate entrants). While it is possible to create three-variable tables, when the number of independent variables increases to four or more, the number of cases in individual cells of such a table often becomes too small to detect statistically significant differences. To make economical use of the many available independent variables in the same data display, other statistical methods must be used that can take multiple predictor variables into account simultaneously.
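The shrinking cell sizes described above can be illustrated with simple arithmetic. The sketch below is only an illustration: the sample size and the number of categories per variable are hypothetical values, not figures from this study, and the helper name `average_cell_size` is invented for the example. It assumes cases are spread evenly across cells, which overstates the size of the smallest cells in real data.

```python
from math import prod

def average_cell_size(n_cases, categories_per_variable):
    """Average number of cases per cell if cases were spread evenly
    across every combination of categories."""
    return n_cases / prod(categories_per_variable)

# Hypothetical sample size and category counts, for illustration only.
n_cases = 10_000
variables = [2, 3, 4, 4, 5]

# Add one independent variable at a time and watch the cells shrink.
for k in range(1, len(variables) + 1):
    cells = prod(variables[:k])
    size = average_cell_size(n_cases, variables[:k])
    print(f"{k} variables: {cells} cells, about {size:.0f} cases per cell")
```

Because real cases are never spread evenly, many cells in an actual table of this kind would hold far fewer cases than the average shown here.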

Multiple linear regression is often used for this purpose: to adjust for the common variation among a list of independent variables.[16] This approach is referred to as “commonality analysis,”[17] because it identifies lingering relationships after adjustment for “common” variation. This method is used simply to confirm statistically significant associations observed in the bivariate descriptive analysis while taking into account the interrelationships among the predictor variables.

Thus, this multiple regression approach is descriptive. Significant coefficients reported in the regression tables indicate that when the variable is deleted from (or added to) the set of independent variables, it results in a non-zero change in R-squared, which is the basis of the commonality analysis. In other words, a significant coefficient means that the independent variable has a relationship with the outcome variable that is unique, or distinct from its relationship with other independent variables in the model.
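The change in R-squared described above can be sketched numerically. The example below is a minimal illustration, not the procedure used to produce this report's tables: the correlation values are made up, and the function names `r_squared` and `unique_contributions` are hypothetical. It relies on the standard result that, for standardized variables, R-squared equals the predictor-outcome correlations multiplied by the standardized regression coefficients.

```python
import numpy as np

def r_squared(R_xx, r_xy):
    """R-squared from the predictor correlation matrix R_xx and the
    vector of predictor-outcome correlations r_xy."""
    beta = np.linalg.solve(R_xx, r_xy)  # standardized coefficients
    return float(r_xy @ beta)

def unique_contributions(R_xx, r_xy):
    """Drop in R-squared when each predictor is deleted from the full model."""
    full = r_squared(R_xx, r_xy)
    k = len(r_xy)
    drops = []
    for j in range(k):
        keep = [i for i in range(k) if i != j]
        reduced = r_squared(R_xx[np.ix_(keep, keep)], r_xy[keep])
        drops.append(full - reduced)
    return full, np.array(drops)

# Made-up correlations among three predictors and one outcome.
R_xx = np.array([[1.0, 0.5, 0.3],
                 [0.5, 1.0, 0.2],
                 [0.3, 0.2, 1.0]])
r_xy = np.array([0.4, 0.3, 0.2])

full_r2, unique = unique_contributions(R_xx, r_xy)
print("Full-model R-squared:", round(full_r2, 4))
print("Unique contribution of each predictor:", np.round(unique, 4))
```

When the predictors are correlated with one another, each unique contribution is typically smaller than the squared bivariate correlation with the outcome, which is the adjustment for “common” variation that the text describes.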

Multivariate description of this sort is distinct from either a modeling approach in which an analyst attempts to identify the smallest relevant set of causal or explanatory independent variables associated with the dependent variable or variables or an approach using one of the many varieties of structural equation modeling. In contrast, a multivariate descriptive or commonality approach provides a richer understanding of the data without needing to make any kind of causal assumptions, which is why descriptive multivariate commonality analysis is often employed in NCES statistical reports.

When should commonality analysis be employed? It should be used in statistical analysis reports when independent variables are correlated with both the outcome variable and each other. This allows the analyst to determine how much of the effect of one independent variable is due to the influence of other independent variables, since a multiple regression procedure adjusts for these effects. As discussed in the “Data Analysis System” section, all analyses included in PEDAR reports must be based on the DAS, which is available to the public online (http://nces.ed.gov/das). Using the DAS exclusively in this way gives readers direct access to the findings and methods used in the report so that they may replicate or expand on the estimates presented. However, the DAS does not allow users access to the raw data, which limits the range of multivariate procedures that can be used. Specifically, the DAS produces correlation matrices, which can be used as input in standard statistical packages to produce least squares regression models. This means that logit or probit procedures, which are more appropriate for dichotomous dependent variables, cannot be used.[18] However, empirical studies have shown that when the mean value of a dichotomous dependent variable falls between 0.25 and 0.75 (as it does in this analysis), regression and log-linear models are likely to produce similar results.[19]
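To show how a correlation matrix alone can yield a least squares regression, the sketch below recovers the coefficients from a correlation matrix together with the variable means and standard deviations. It is a hedged illustration only: the layout of actual DAS output is not described here, the function name `ols_from_correlations` is hypothetical, and the data are simulated rather than drawn from any NCES file. The simulated outcome is dichotomous with a mean near 0.5, so the fitted model is a linear probability model.

```python
import numpy as np

def ols_from_correlations(R, means, sds):
    """Least squares intercept and slopes from summary statistics.

    R     : correlation matrix with the outcome as the LAST variable
    means : means of all variables, in the same order
    sds   : standard deviations of all variables, in the same order
    """
    R, means, sds = (np.asarray(a, dtype=float) for a in (R, means, sds))
    R_xx, r_xy = R[:-1, :-1], R[:-1, -1]
    beta_std = np.linalg.solve(R_xx, r_xy)   # standardized coefficients
    slopes = beta_std * sds[-1] / sds[:-1]   # back to original units
    intercept = means[-1] - slopes @ means[:-1]
    return intercept, slopes

# Simulated data for illustration only: two correlated predictors
# and a dichotomous (0/1) outcome.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
X[:, 1] += 0.5 * X[:, 0]
latent = 0.5 + 0.10 * X[:, 0] - 0.15 * X[:, 1] + rng.normal(scale=0.3, size=n)
y = (latent > 0.5).astype(float)

data = np.column_stack([X, y])
R = np.corrcoef(data, rowvar=False)
intercept, slopes = ols_from_correlations(R, data.mean(axis=0), data.std(axis=0, ddof=1))
print("Intercept:", intercept)
print("Slopes:", slopes)
```

The coefficients obtained this way match those from fitting the same linear model to the raw records, which is why summary output of this kind is sufficient for least squares regression but not for logit or probit models, whose estimation requires the individual observations.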

The independent variables analyzed in this study and subsequently included in the multivariate model were chosen based largely on earlier empirical studies (cited in the text), which showed significant associations with the key analytic variable, delayed enrollment. Before conducting the study, a detailed analysis plan was reviewed by a Technical Review Panel (TRP) of experts in the field of higher education research and additional independent variables requested by the TRP were considered for inclusion. The analysis plan listed all the independent variables to be included in the study. The TRP also reviewed the preliminary results as well as the first draft of this report. The analysis plan and subsequent report were modified based on TRP comments and criticism.


National Center for Education Statistics - http://nces.ed.gov
U.S. Department of Education