Statistical Standards Program
ANALYSIS OF DATA / PRODUCTION OF ESTIMATES OR PROJECTIONS
SUBJECT: STATISTICAL ANALYSIS, INFERENCE AND COMPARISON
NCES STANDARD: 5-1
PURPOSE: To ensure that statistical analyses, comparisons, and inferences included in NCES products are based on appropriate statistical procedures.
KEY TERMS: effect size, estimation, hypothesis testing, Minimum Substantively Significant Effect (MSSE), power, rejection region, simple comparison, statistical inference, tail, Type I error, and Type II error.
GUIDELINE 5-1-3A: If the survey purpose or prior research indicates that only differences between estimates in a specific direction are of interest, or if an established trend is to be updated with a new year of data, one-sided tests (for example, one-sided t tests or z tests) may be used to increase power. In this case, the region of rejection of the null hypothesis, H0, is contained in only one tail of the sampling distribution of the test statistic.
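As an illustration of how the choice of tail affects the result, the following minimal sketch computes one-sided and two-sided p-values for a z test of the difference between two estimates. It assumes the two estimates are independent and that a large-sample normal approximation is reasonable; the function names and the example estimates and standard errors are hypothetical and do not come from any NCES survey.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def z_test_p_values(est_a, se_a, est_b, se_b):
    """Return (z, one-sided p, two-sided p) for the difference est_a - est_b.

    Assumes independent estimates, so the standard error of the
    difference is sqrt(se_a^2 + se_b^2). The one-sided alternative
    is that estimate A is larger than estimate B.
    """
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_one_sided = 1.0 - normal_cdf(z)               # upper tail only
    p_two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))  # both tails
    return z, p_one_sided, p_two_sided


# Hypothetical estimates (percent) and standard errors.
z, p_one, p_two = z_test_p_values(52.0, 1.0, 49.0, 1.2)
print(f"z = {z:.3f}, one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
```

With these illustrative numbers the one-sided test rejects H0 at the 0.05 level while the two-sided test does not, which is why the direction of interest should be established before the data are examined.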
GUIDELINE 5-1-4A: When conducting multiple comparisons, appropriate procedures should be considered to control the level of Type I error for simultaneous inferences. Multiple comparison procedures include, for example, the Bonferroni, False Discovery Rate (FDR), Scheffé, and Tukey procedures (see, for example, Hochberg, Y. and Tamhane, A.C. 1987 and Benjamini, Y. and Hochberg, Y. 1995).
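Two of the procedures named above can be sketched in a few lines. The example below applies a Bonferroni adjustment and the Benjamini-Hochberg step-up FDR procedure to a set of p-values. It is a minimal sketch, not an NCES implementation: the function names and the p-values are hypothetical, and the FDR procedure as written assumes independent (or positively dependent) tests.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            reject[idx] = True
    return reject


# Hypothetical p-values from five pairwise comparisons.
p_values = [0.001, 0.012, 0.020, 0.041, 0.300]
print("Bonferroni:", bonferroni_reject(p_values))
print("FDR (BH):  ", benjamini_hochberg_reject(p_values))
```

For these illustrative p-values, Bonferroni rejects only the smallest, whereas the FDR procedure rejects the three smallest. The two procedures control different error rates (familywise error versus the expected proportion of false rejections), so the choice depends on the inferential goal.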
GUIDELINE 5-1-4B: Alternative presentation of the results, such as confidence intervals or coefficients of variation, should also be considered as appropriate.
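For illustration, both alternative presentations can be computed directly from an estimate and its standard error. The sketch below assumes a normal approximation with a critical value of 1.96 for a 95 percent interval; in practice a t critical value based on the survey's design degrees of freedom may be more appropriate. The function names and input values are hypothetical.

```python
def confidence_interval(estimate, std_error, z_crit=1.96):
    """Return (lower, upper) bounds of a normal-approximation interval."""
    half_width = z_crit * std_error
    return estimate - half_width, estimate + half_width


def coefficient_of_variation(estimate, std_error):
    """Standard error expressed as a fraction of the estimate."""
    return std_error / abs(estimate)


# Hypothetical estimate (percent) and its standard error.
lower, upper = confidence_interval(42.0, 1.5)
cv = coefficient_of_variation(42.0, 1.5)
print(f"95% CI: ({lower:.2f}, {upper:.2f}); CV = {cv:.1%}")
```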
GUIDELINE 5-1-4C: When testing for structure in the data over time, a trend test or other suitable procedure should be performed (e.g., regression, ANOVA, or non-parametric statistics). In conducting analyses over time, possible changes in population composition should be considered.
GUIDELINE 5-1-4D: When it is appropriate, the use of multiple regression and multivariate analysis techniques should be considered to examine relationships between a dependent variable (e.g., test score) and a set of independent variables (e.g., race, sex, and family background). Such techniques can provide an integrated approach to testing many simultaneous relationships.
GUIDELINE 5-1-4E: In general, standardized regression coefficients should be used. When the units of measurement are meaningful (e.g., number of years of schooling), unstandardized regression coefficients or mean differences should be provided.
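The relationship between the two kinds of coefficients can be illustrated with a single predictor: the standardized coefficient equals the unstandardized slope multiplied by the ratio of the predictor's standard deviation to the outcome's standard deviation. The sketch below is unweighted and ignores the survey design, which is a simplifying assumption; the function names and data are hypothetical.

```python
import statistics


def simple_regression_slope(x, y):
    """Unstandardized least-squares slope of y on x (one predictor)."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def standardized_slope(x, y):
    """Standardized coefficient: slope rescaled by sd(x) / sd(y)."""
    b = simple_regression_slope(x, y)
    return b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data: years of schooling and a test score.
years_of_schooling = [10, 12, 12, 14, 16, 16, 18]
test_score = [48, 55, 52, 60, 66, 63, 71]

b = simple_regression_slope(years_of_schooling, test_score)
beta = standardized_slope(years_of_schooling, test_score)
print(f"unstandardized: {b:.3f} score points per year of schooling")
print(f"standardized:   {beta:.3f} standard deviations per standard deviation")
```

The unstandardized slope reads directly in meaningful units (score points per year of schooling) and changes if schooling is re-expressed in months, whereas the standardized coefficient is unchanged by such rescaling.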
GUIDELINE 5-1-4F: When the results of an analysis are statistically significant, it is useful to consider the substantive interpretation of the size of the effect. For this purpose, the observed difference can be converted into an effect size to allow the interpretation of the size of the difference.
For a t-test of the mean difference, for example, the estimated effect size is the observed difference between the two observed means relative to a measure of variability, such as the standard deviation.
In correlation analysis, r is the effect size. Consult Cohen (1988) for measures of effect size using additional statistical procedures.
Cohen's (1988) convention for interpreting effect sizes may be used. Under this convention, for t tests or z tests, an effect size of 0.2 is small, 0.5 is medium, and 0.8 is large. For correlations, an r of 0.1 is small, 0.3 is medium, and 0.5 is large.
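As a minimal sketch of the effect size calculation for a mean difference, the example below divides the difference between two group means by the pooled standard deviation and then labels the result using the conventional values quoted above. Several choices here are assumptions for illustration: the use of the pooled standard deviation as the measure of variability, the treatment of 0.2, 0.5, and 0.8 as lower bounds of categories, the function names, and the data. The calculation is unweighted and ignores the survey design.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd


def cohen_label(d):
    """Label |d| using 0.2 / 0.5 / 0.8 as lower bounds (an assumption)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Hypothetical scores for two groups.
group_a = [52, 55, 58, 61, 64]
group_b = [50, 53, 56, 59, 62]
d = cohens_d(group_a, group_b)
print(f"d = {d:.3f} ({cohen_label(d)})")
```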
GUIDELINE 5-1-4G: Another approach to considering the substantive importance of a significant difference is to compare the size of the difference to the minimum substantively significant effect (MSSE) size that is determined a priori.
GUIDELINE 5-1-4H: When reporting on the significance of important findings, confirmatory and corroborative statistical methods and significance tests should be used. For example, if the original significant finding is based on a simple comparison t test, t tests adjusted for multiple comparisons could also be used if appropriate. Another example would be to confirm important findings obtained with one analytic approach with a second analysis conducted using an alternative approach.
Agresti, A. (2002). Categorical Data Analysis, 2nd Edition. New York, NY: Wiley Interscience.
Benjamini, Y. and Hochberg, Y. (1995). "Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing." Journal of the Royal Statistical Society, Series B, 57(1), pp. 289-300.
Binder, D.A., Gratton, M., Hidiroglou, M.A., Kumar, S. and Rao, J.N.K. (1984). "Analysis of Categorical Data from Surveys with Complex Designs: Some Canadian Experiences." Survey Methodology, Vol. 10, 141-156.
Cohen, B.H. (2001). Explaining Psychological Statistics. New York: Wiley.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press.
Cohen, J. and Cohen, P. (1983). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: L. Erlbaum Associates.
Draper, N. R. and Smith, H. (1998). Applied Regression Analysis, 3rd Edition. NY: Wiley Interscience.
Hays, W. L. (1994). Statistics. Fifth Edition. Fort Worth, TX: Harcourt College Publishers.
Hochberg, Y. and Tamhane, A.C. (1987). Multiple Comparison Procedures. New York: John Wiley & Sons.
Hoenig, J.M. and Heisey, D.M. (2001). "The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis." The American Statistician 55(1) pp. 19-24.
Holt, D., Smith, T.M.F., and Winter, P.D. (1980). "Regression Analysis from Complex Surveys." Journal of the Royal Statistical Society, Series A, Vol. 143, 474-481.
Jones, L.V., Lewis, C., and Tukey, J.W. (2001). Hypothesis tests, multiplicity of. In N.J. Smelser & P.B. Baltes, Eds., International Encyclopedia of the Social and Behavioral Sciences. London: Elsevier Science, Ltd., pp. 7127-7133.
Kish, L. and Frankel, M.R. (1974). "Inferences from Complex Samples." Journal of the Royal Statistical Society, Series B, Vol. 36, 1-37.
Kleinbaum, D.G., Kupper, L.L., Muller, K.E., and Nizam, A. (1998). Applied Regression Analysis and Other Multivariate Methods. Pacific Grove: Duxbury Press.
Lehtonen, R. and Pahkinen, E.J. (1995). Practical Methods for Design and Analysis of Complex Surveys. New York, NY: Wiley Interscience.
Moore, D.S. (2000). The Basic Practice of Statistics. 2nd Edition. New York, NY: W.H. Freeman.
NCES Statistical Analysis Manual 2002 (forthcoming). Washington, DC: NCES.
Neter, J., Kutner, M., Nachtsheim, C., and Wasserman, W. (1996). Applied Linear Statistical Models, 4th Edition. New York, NY: McGraw-Hill/Irwin.
Skinner, C.J., Holt, D., and Smith, T.M.F. eds. (1989). Analysis of Complex Surveys. New York, NY: John Wiley & Sons.