Search Results: (1-15 of 23 records)

 Pub Number  Title  Date
IES 2020001REV Cost Analysis: A Starter Kit
This starter kit is designed for grant applicants who are new to cost analysis. The kit will help applicants plan a cost analysis, setting the foundation for more complex economic analyses.
NCEE 20174027 Multi-armed RCTs: A design-based framework
Design-based methods have recently been developed as a way to analyze data for impact evaluations of interventions, programs, and policies. The estimators are derived using the building blocks of experimental designs with minimal assumptions, and have important advantages over traditional model-based impact methods. This report extends the design-based theory for the single treatment-control group design to designs with multiple research groups. It discusses how design-based estimators found in the literature need to be modified for multi-armed designs when comparing pairs of research groups to each other. It also discusses multiple comparison adjustments when conducting hypothesis tests across pairwise contrasts to identify the most effective interventions. Finally, it discusses the complex assumptions required to identify and estimate the complier average causal effect (CACE) parameter for multi-armed designs.
NCEE 20174025 What is Design-Based Causal Inference for RCTs and Why Should I Use It?
Design-based methods have recently been developed as a way to analyze data from impact evaluations of interventions, programs, and policies. The approach uses the building blocks of experimental designs to develop impact estimators with minimal assumptions. The methods apply to randomized controlled trials and quasi-experimental designs with treatment and comparison groups. Although the fundamental concepts that underlie design-based methods are straightforward, the literature on these methods is technical, with detailed mathematical proofs required to formalize the theory. This brief aims to broaden knowledge of design-based methods by describing their key concepts and how they compare to traditional model-based methods, such as hierarchical linear modeling (HLM). Using simple mathematical notation, the brief is geared toward researchers with a good knowledge of evaluation designs and HLM.
NCEE 20154013 A Guide to Using State Longitudinal Data for Applied Research
State longitudinal data systems (SLDSs) promise a rich source of data for education research. SLDSs contain statewide student data that can be linked over time and to additional data sources for education management, reporting, improvement, and research, and ultimately for informing education policy and practice.

Authored by Karen Levesque, Robert Fitzgerald, and Joy Pfeiffer of RTI International, this guide is intended for researchers who are familiar with research methods but who are new to using SLDS data, are considering conducting SLDS research in a new state environment, or are expanding into new topic areas that can be explored using SLDS data. The guide also may be useful for state staff as background for interacting with researchers and may help state staff and researchers communicate across their two cultures. It highlights the opportunities and constraints that researchers may encounter in using state longitudinal data systems and offers approaches to addressing some common problems.
NCEE 20154011 Statistical Theory for the RCT-YES Software: Design-Based Causal Inference for RCTs
This Second Edition report updates the First Edition published in June 2015 that presents the statistical theory underlying the RCT-YES software that estimates and reports impacts for RCTs for a wide range of designs used in social policy research. The preface to the new report summarizes the updates from the previous version. The report discusses a unified, non-parametric design-based approach for impact estimation using the building blocks of the Neyman-Rubin-Holland causal inference model that underlies experimental designs. This approach differs from the more model-based impact estimation methods that are typically used in education research. The report discusses impact and variance estimation, asymptotic distributions of the estimators, hypothesis testing, the inclusion of baseline covariates to improve precision, the use of weights, subgroup analyses, baseline equivalency analyses, and estimation of the complier average causal effect parameter.
NCEE 20144017 Understanding Variation in Treatment Effects in Education Impact Evaluations: An Overview of Quantitative Methods
This report summarizes the complex research literature on quantitative methods for assessing how impacts of educational interventions on instructional practices and student learning differ across students, educators, and schools. It also provides technical guidance about the use and interpretation of these methods. The research topics addressed include: subgroup (moderator) analyses based on study participants’ characteristics measured before the intervention is implemented; subgroup analyses based on study participants’ experiences, mediators, and outcomes measured after program implementation; and impact estimation when treatment effects vary. The focus is on randomized controlled trials, but the methods are also applicable to quasi-experimental designs.
NCSER 20133000 Translating the Statistical Representation of the Effects of Education Interventions Into More Readily Interpretable Forms
This new Institute of Education Sciences (IES) report assists with the translation of effect size statistics into more readily interpretable forms for practitioners, policymakers, and researchers. This paper is directed to researchers who conduct and report education intervention studies. Its purpose is to stimulate and guide researchers to go a step beyond reporting the statistics that represent group differences. With what is often very minimal additional effort, those statistical representations can be translated into forms that allow their magnitude and practical significance to be more readily understood by those who are interested in the intervention that was evaluated.
NCEE 20124019 Using an Experimental Evaluation of Charter Schools to Test Whether Nonexperimental Comparison Group Methods Can Replicate Experimental Impact Estimates
This NCEE Technical Methods Paper compares the estimated impacts of the offer of charter school enrollment using an experimental design and a non-experimental comparison group design. The study examined four different approaches to creating non-experimental comparison groups: ordinary least squares regression modeling, exact matching, propensity score matching, and fixed effects modeling. The data for the study are from students in the districts and grades that were represented in an experimental design evaluation of charter schools conducted by the U.S. Department of Education in 2010 (For more information, see:

The study found that none of the comparison group designs reliably replicated the impact estimates from the experimental design study. However, the use of pre-intervention baseline data that are strongly predictive of the key outcome measures considerably reduced, but did not eliminate, the estimated bias in the non-experimental impact estimates. Estimated impacts based on matched comparison groups were more similar to the experimental estimates than were the estimates based on the regression adjustments alone; the differences are moderate in size, although not statistically significant.
NCEE 20124025 Replicating Experimental Impact Estimates Using a Regression Discontinuity Approach
This NCEE Technical Methods Paper compares the estimated impacts of an educational intervention using experimental and regression discontinuity (RD) study designs. The analysis used data from two large-scale randomized controlled trials—the Education Technology Evaluation and the Teach for America Study—to provide evidence on the performance of RD estimators in two specific contexts. More generally, the report presents and implements a method for examining the performance of RD estimators that could be used in other contexts. The study found that the RD and experimental designs produced impact estimates that were meaningful in size, though not significantly different from one another. The study also found that manipulation of the assignment variable in RD designs can substantially influence RD impact estimates, particularly if manipulation is related to the outcome and occurs close to the assignment variable's cutoff value.
NCES 2011304 Documentation for the 2008–09 Teacher Follow-up Survey
This report covers all phases of the Teacher Follow-up Survey (TFS), from survey planning through data file availability. The TFS determines how many teachers remained at the same school, moved to another school, or left the profession in the year following the Schools and Staffing Survey (SASS) administration.
NCEE 20104003 Precision Gains from Publically Available School Proficiency Measures Compared to Study-Collected Test Scores in Education Cluster-Randomized Trials
In randomized controlled trials (RCTs) where the outcome is a student-level, study-collected test score, a particularly valuable piece of information is a study-collected baseline score from the same or similar test (a pre-test). Pre-test scores can be used to increase the precision of impact estimates, conduct subgroup analysis, and reduce bias from missing data at follow up. Although administering baseline tests provides analytic benefits, there may be less expensive ways to achieve some of the same benefits, such as using publicly available school-level proficiency data. This paper compares the precision gains from adjusting impact estimates for student-level pre-test scores (which can be costly to collect) with the gains associated with using publicly available school-level proficiency data (available at low cost), using data from five large-scale RCTs conducted for the Institute of Education Sciences. The study finds that, on average, adjusting for school-level proficiency does not increase statistical precision as much as adjusting for student-level baseline test scores. Across the cases we examined, the number of schools included in studies would have to nearly double in order to compensate for the loss in precision of using school-level proficiency data instead of student-level baseline test data.
NCEE 20104004 Error Rates for Measuring Teacher and School Performance Using Value-Added Models
This study estimates error rates in identification of upper elementary school teachers as low or high performing based on student test score gain data. The study develops error rate formulas for commonly-used performance measurement schemes that are based on OLS and Empirical Bayes estimators and value-added models, where educator performance is compared to the district average using hypothesis testing. Simulation results suggest that performance estimates are likely to be noisy using the amount of data that are typically used in practice—1 to 3 years. Type I and II error rates are likely to be about 25 percent based on three years of data and 35 percent based on one year of data. Corresponding error rates for overall false positive and negative errors for all teachers who are subject to misclassification are 10 and 20 percent, respectively. Lower error rates can be achieved by increasing the number of student achievement gain measures that are available for any teacher. School-level results also have less error.
NCEE 2009006 Survey of Outcomes Measurement in Research on Character Education Programs
Character education programs are school-based programs that have as one of their objectives promoting the character development of students. This report systematically examines the outcomes that were measured in evaluations of a delimited set of character education programs and the research tools used for measuring the targeted outcomes. The multi-faceted nature of character development and many possible ways of conceptualizing it, the large and growing number of school-based programs to promote character development, and the relative newness of efforts to evaluate character education programs using rigorous research methods all combine to make the selection or development of measures relevant to the evaluation of these programs especially challenging. This report is a step toward creating a resource that can inform measure selection for conducting rigorous, cost effective studies of character education programs. The report, however, does not provide comprehensive information on all measures or types of measures, guidance on specific measures, or recommendations on specific measures.
NCEE 2009013 Technical Methods Report: Using State Tests in Education Experiments: A Discussion of the Issues
Securing data on students' academic achievement is typically one of the most important and costly aspects of conducting education experiments. As state assessment programs have become practically universal and more uniform in terms of grades and subjects tested, the relative appeal of using state tests as a source of study outcome measures has grown. However, the variation in state assessments--in both content and proficiency standards--complicates decisions about whether a particular state test is suitable for research purposes and poses difficulties when planning to combine results across multiple states or grades. This discussion paper aims to help researchers evaluate and make decisions about whether and how to use state test data in education experiments. It outlines the issues that researchers should consider, including how to evaluate the validity and reliability of state tests relative to study purposes; factors influencing the feasibility of collecting state test data; how to analyze state test scores; and whether to combine results based on different tests. It also highlights best practices to help inform ongoing and future experimental studies. Many of the issues discussed are also relevant for non-experimental studies.
NCEE 20094065 Do Typical RCTs of Education Interventions Have Sufficient Statistical Power for Linking Impacts on Teacher Practice and Student Achievement Outcomes?
For RCTs of education interventions, it is often of interest to estimate associations between student and mediating teacher practice outcomes, to examine the extent to which the study's conceptual model is supported by the data, and to identify specific mediators that are most associated with student learning. This paper develops statistical power formulas for such exploratory analyses under clustered school-based RCTs using ordinary least squares (OLS) and instrumental variable (IV) estimators, and uses these formulas to conduct a simulated power analysis. The power analysis finds that for currently available mediators, the OLS approach will yield precise estimates of associations between teacher practice measures and student test score gains only if the sample contains about 150 to 200 study schools. The IV approach, which can adjust for potential omitted variable and simultaneity biases, has very little statistical power for mediator analyses. For typical RCT evaluations, these results may have design implications for the scope of the data collection effort for obtaining costly teacher practice mediators.