Commissioner's Remarks - Comments delivered to the American Enterprise Institute for Public Policy Research

Mark Schneider
Commissioner, National Center for Education Statistics

Comments delivered to the American Enterprise Institute for Public Policy Research
February 6, 2006

NCES is charged with documenting the condition of American education, which now means from pre-K through postsecondary education. And, as the Commissioner of Education Statistics, it is my job to make sure that the NCES fulfills that charge efficiently and responsibly. To do that we face a set of challenges-some are broad challenges, cutting across the entire range of activities that the Center undertakes; others are more particular to specific data collection efforts.

The first broad challenge is for the NCES to link better the study of the different layers of education and breakdown the institutional barriers within NCES that lead the people collecting data on early childhood education to forget that after a student gets out of 3rd grade, she goes on to 4th, 5th grade and then on to high schools and, hopefully, to postsecondary education. In short, we need to make sure that our data collections articulate with one another better than they do. As noted below, the biggest gap is between our K-12 data collections and our postsecondary data, but other gaps are also evident.

We also need to work to break down the vertical barriers between divisions within the NCES. For example, the National Assessment of Education Progress (NAEP) is arguably the most visible of NCES's data collections and accounts for almost half its budget. More importantly, NAEP may be the most sophisticated assessment in the world, but we run assessments in other parts of the Center as well. Yet we don't cross-link these assessments sufficiently and advances in one division that could help assessments in another don't get used to the degree that they should.

With these overarching ideas in mind, I will concentrate on challenges in four specific areas of data collection:

The National Assessment of Educational Progress (NAEP);
High schools;
The postsecondary sector; and
Teacher data collections.

I will conclude with some general challenges facing NCES.

The National Assessment of Educational Progress

Here are some of the challenges facing NCES concerning the implementation of the National Assessment of Educational Progress.

Declining participation

It's a saturated testing market and many schools and students feel overburdened with tests, many of which carry higher stakes than NAEP.

We are experiencing declining school participation, which differs by sector (private vs. public) and by grade level. For example, last year we could not report some NAEP performance data for each type of private school in the nation.
The nonparticipation problem is more severe for subjects not covered by the No Child Left Behind Act (e.g., writing, science) and some states have opted out of the science assessment.
We have a particular kind of "participation" problem with 12th grade students. The issue is whether or not older students take the exam seriously enough when there are no stakes directly associated with the exam and they don't get immediate feedback on how well they did on the assessment.

We are not the only agency facing declining participation in our surveys-it is a nationwide phenomenon affecting all government agencies and university-based researchers as well. Even if misery loves company, this declining participation threatens the quality of many of our studies.

Nonparticipation and data reporting

By practice, at least 70% of schools originally selected for an assessment must participate in the assessment for NCES to report data. Given declining participation rates this is creating some serious reporting problems, as with the private school data noted above. The 70% reporting rule was established a while back. Given statistical advances, we need to revisit our reporting criteria with modern statistical methods in mind: could nonresponse bias analysis allow us to relax the blanket 70% rule?

Use of effect size

NAEP has a very large sample so that small differences are statistically significant, but are they meaningful? We need to better identify changes that are substantively significant as well as significant statistically.

Computerization of test instruments

New NAEP frameworks calling for computer-based testing. This includes the 2009 Science assessment, which requires simulation exercises, and the 2011 writing assessment, which may require writing on a computer. But the shift in technology leads to some questions we must face:

Many states already use computer-based testing. Will differential student familiarity create bias? Obviously, to the extent that such familiarity affects scores, we have a problem.
Does computer-based testing alter the construct being tested-and hence hinder our ability to measure trends over time?
How do we implement this transition? How, for example, do we make sure that students are taking exams on computers of roughly the same capabilities?

Measuring What Happens in High Schools

There is not doubt that the NCES, the Institute of Education Sciences, and the U.S. Department of Education are now focused on how well American students are doing in high school. This is a natural progression from the focus on K-8 that marked the early years of the No Child Left Behind era. The question is how best to measure what is going on in the American high school. I will talk about two data sources that NCES uses to document the condition of America's high schools: student-based longitudinal studies and state administrative data.

Before talking about the data, I will identify some of the main issues that will motivate our data collection in the realm of high school education.

Drop out and graduation rates-this has been a particularly contentious area of measurement, although we hope that the recently released Adjusted Freshman Graduation Rate will provide some closure.
Transitions into and through ninth grade-many believe that what happens to students in this grade shapes their academic future, including the likelihood that they will even complete high school.
Transition from high school to the world of work, the military or post secondary education, which includes and incredible diversity of institutions, including community colleges, proprietary colleges, and four year schools of wide ranging mission and quality.
There is also a host of outcomes that we are interested in. While NCLB has focused on accountability focused on test scores and graduation rates, we are concerned with other life outcomes, such as earnings, civic participation, marriage and child rearing practices, and the avoidance of criminal behavior.

As we go about gathering data needed to address these, and other, issues, NCES has two approaches to data collection.

Student-based longitudinal databases

The NCES has created and maintained several longitudinal databases going back decades. These are critically important NCES products and because they involve so many students and are run for such a long period of time and because they, in turn, cost so much, they are unique research data bases for the research and policy community.

Our current studies are built on samples of students not of schools, which limits our ability to drill down tightly on school effects, which is what people who study high school reform are most interested in.

One of the most important of our current student-based longitudinal studies is the Early Childhood Longitudinal Study (also known as ECLS-K), which has been in the field studying a cohort of students that began kindergarten in 1998-1999. We have already re-interviewed students in first, third, and fifth grades. The sample frame of this study, like most all our others, is based on individual students, and will provide a wealth of information on the elementary school experiences of students, their socioeconomic background, parental participation, and the like. Children and their families, teachers, and schools provide information on the cognitive, social, emotional, and physical development of students in our sample. There is also information on the children's home environment, home educational practices, school environment, classroom environment, classroom curriculum, and teacher qualifications.

From the study's beginning, NCES kept open the possibility of extending the ECLS-K beyond fifth grade and indeed we are conducting an eighth grade follow-up study in 2007.

The question we face flows from the current concern for high schools: should this student-based ECLS-K sample be continued or did we need to create a whole new study?

Most high school reform issues are focused on what goes on in the school and researchers seek to measure school effects as precisely as possible. At minimum, this requires a sufficient number of students to be found in a sufficient number of schools and, ultimately, many believe a high school study requires a school-based sample rather than a student-based one such as ECLS-K.

One of the first questions we had to ask concerned the likelihood that the student-based design of ECLS-K would produce a sample with enough students in a set of schools to attempt to measure school effects.

We know that when students finish 8th grade there are two different forces at work: in some districts, multiple middle schools feed into a single high school, so the number of schools in the sample could go down and ECLS-K might have enough students in enough schools to address some of the research and policy issues on the high school reform agenda. But we also know that parents often move when "natural" break points arise in the course of a student's education. Further, given the rapidly expanding world of choice, the number of schools that students can choose from is large and growing, so there will be forces dispersing students into a larger number of schools. Historically we have found that these centrifugal forces are stronger than the centralizing ones, and the number of schools represented in our longitudinal studies has increased from 8th to 9th grade.

The alternative we are now pursuing is to scale back ECLS-K and extend the study by collecting administrative data to see how students do in later years and relate these measures of success to the K-8 data we will have. To study high schools more intensively, we are planning to create a new study based on a sample of say 250 high schools and a sample of students and classes within those schools.

In this new design, we thinking that we will get administrative data from the middle schools that students attended, picking up historical data on students-but obviously this will be a lot less information than we would have from the multiple waves of data we have collected on the students from K through 8th grade using the ECLS-K data base.

In this new study, we are discussing interviewing students at the beginning of the ninth grade and either at the end of 9th or early in 10th grade, to find out what happened in that critical year. We would reinterview them in 11th or 12th grade and then 2 years out of high school-and beyond?

The resulting data set would constitute a unique large-scale school-based longitudinal data set documenting the high school experiences of American students.

In addition to the assessment of student learning, as in other NCES longitudinal studies, we plan to collect contextual data on student, parent, teacher, and school characteristics; intended and enacted curricula; instructional practices; school, district, and state policies; and student course taking patterns would also be provided, along with future data on respondents' employment and academic outcomes.

When made available to researchers nationwide, this database will allow for value-added and in-depth analyses of student learning, curricular differentiation, and teacher effects in high schools.

We are still struggling with the questions and trade-offs involved in designing and launching this new study. Among them:

Is the clock reset sufficiently from junior high school to high school that we can be happy with administrative records and current interviews with teachers, parents, and students to not miss the depth of the ECLS-K data?
While this new study would make people interested in high school reform happier, others argue that the critical problems in American education really emerge in middle school and that a high school-based study has already missed the boat.
Given the concern for measuring value added, is there any way that we can actually measure it within any study design that doesn't cost a fortune? That is, the sample frame will be high schools, but students take radically different courses of study within a high school and are exposed to many different teachers. So how can value added models be applied to this kind of data set-or is it simply that we need to use administrative data sets to model value added?

This leads me to note NCES efforts to support the collection of administrative data that can also be used to answer fundamental research questions.

State Administrative Data Sets

The NCES has recently launched a $50 million program of cooperative agreements to help states develop their administrative databases. We announced 14 state winners of the competition and another round of these grants is included in the president's budget for the next fiscal year. K-12 administrative data sets have served as valuable research tools when people like Rick Hanushek, David Figlio, or Sunny Ladd have gained access to them.

There are many issues, particularly concerning privacy that affect the use of these databases, but hopefully, as more states get familiar with them and as the quality of these databases improve, more researchers will get better access.

I am also concerned about encouraging the use of these data sets to help us bridge the two different worlds of high schools and postsecondary education. While the existing program was targeted on K-12, many of the states we have funded have K-16 or even P-20 data sets. I hope to find ways of encouraging states to open up these comprehensive data sets to researchers and to perhaps lead the way in showing how such seamless data sets can be used to document the entire range of educational experiences that Americans are exposed to. These data sets might help break down the artificial barriers that now exist between the K-12 and postsecondary worlds of research and data.

Postsecondary Data

Right now NCES's postsecondary data collection efforts operate relatively independently of the concerns of elementary and secondary data collections. This is not surprising since that's the way state education departments are typically organized. And it's also not surprising since the interests of the federal government in postsecondary education are so much different than its interests in K-12 education.

While we need to think about ways of aligning these data collections, right now I want to just talk about the challenges facing NCES's Integrated Postsecondary Education Data System (known as IPEDS).

IPEDS is a census of all 6,800 Title IV institutions in the nation and is probably the higher education data collection effort for which the NCES is best known. I want to talk about IPEDS through two very limited lenses-that of price (which affects access) and accountability. And I will narrow the focus even further, focusing on only one of the many indicators of accountability that could be used.

Institution Based Data Collection

Through IPEDS, NCES is the principal source of annual data at the level of individual postsecondary institutions with respect to characteristics of students, staff, finance, student aid, graduation rates, and a number of other variables. Despite its size, it's a limited data set and can't answer many of the questions we need answered. I present two examples.

Affordability

We have a fundamental measurement problem-there is a divergence between the "sticker price" that colleges post and the real, discounted "out of pocket price" that students actually pay to attend the institution.

Using another NCES data set, the National Postsecondary Student Aid Study, I estimate that, on average across the nation, students pay only 60% of the listed price of colleges and universities. But there is wide variation from this nationwide 40% discount rate.

For the most selective schools in the nation, the discount rate is 27% but for the least selective, it's almost 50%
More importantly, for the bottom half of the income distribution, the discount rate is around 50% but for the top half of the income distribution it's 30%

We can see that there is substantial variation between the sticker price and the discounted price that students actually pay, but this information is not systematically available and it is not tied to individual schools or student financial capacities.

We need to gather data to compute actual out of pocket price for students so they can more reasonably shop for schools. Given that the discount rate is greater for the lowest income students, we need to let them know that the high sticker price colleges announce does not automatically bar them from attending college. Unfortunately, IPEDS data cannot be used to provide students with this type of cost information.

Accountability

Let's leave aside the question of what schools should be accountable for-the list ranges from learning, employment, satisfaction, and so on-to concentrate on something that everyone agrees should be part of the evaluation of colleges and universities and that's graduation rates. Despite the fundamental importance of calculating and presenting these rates, our current data simply don't allow us to estimate graduation rates for the vast majority of students.

This traces back to the very design of IPEDS, where the units of analysis are institutions of higher education and they report their data on an aggregate basis.

Here's the problem: IPEDS data are limited to full-time, first-time degree- or certificate-seeking students in a particular year (cohort), by race/ethnicity and gender.

No data are available on time to degree for individual students.
No data are available by family income.
Students who transfer and graduate from a subsequent institution are not counted in the statistics;
Students who enroll on a part-time basis are not counted in the statistics;
Students who start - "stop out" - restart are not counted in the statistics.

Research has shown that almost three quarters of postsecondary students are "nontraditional," with characteristics such as part-time attendance and delayed enrollment. In addition, 40 percent of students now enroll in more than one institution at some point during their progress through postsecondary education, including transfer to other institutions as well as co-enrollment.

Thus IPEDS collects and reports information on individual institutions for aggregates of first-time, full-time students-who are now a minority of students in higher education. How do you measure quality or design accountability systems for institutions that serve an appreciable number of non-traditional students (and that is all but the elite private universities) with data that ignore these students?

The answer: You can't.

Can IPEDS be fixed?

One possibility to improving IPEDS is what we refer to colloquially as "Huge IPEDS." Institutions would still submit data to us in aggregates, but the aggregates would be much smaller slices. For example, every Title IV institution could be required to calculate and submit net price or graduation rates for different categories of students in different programs.

The "huge" in Huge IPEDS refers to the burden this would impose on institutions. But Huge IPEDS still couldn't handle many of the issues raised by nontraditional students. For example, an individual institution has no way of knowing whether a student who enrolled but didn't complete a degree on time dropped out or transferred to another school. We need unit records for all individuals to efficiently estimate these types of measures at the individual, institution, and system levels.

In March of 2005, NCES published a feasibility study of another approach, a student unit record system within IPEDS. The essence of a unit record system is that institutions would provide student-level data, rather than aggregate data. The student-level data would be tagged with a unique identifier for each student. This would allow us to calculate everything now in IPEDS, plus calculate other indicators on graduation and transfer rates, time to degree, net prices, and persistence by student characteristics. Institutions could use these data to address their own questions and policy makers could design sophisticated accountability systems using it.

However, there has been resistance to unit records and whether NCES will be allowed to move beyond samples to get the fine grained measures needed to answer the kinds of pressing questions we need answered is still an open question.

Data on Teachers

I am running out of time, so I will deal with data needs concerning teachers more briefly than the importance of these data demands. Teachers are the most expensive input into the education system and increasingly research has identified the independent effects of teachers on student learning. Despite their central role in the education system, NCES's data collections documenting the need for and practices pertaining to the teacher workforce are weak.

Some questions that we need to address:

What is the future demand for teachers likely to be? And more specifically, what is the future demand for science and math teachers likely to be?

Many studies project high turnover in the teacher workforce in the near future. The push for highly qualified teachers in every classroom and the growing concern for the supply of and quality of math and science teaching in the nation all combine to increase the need for solid projections of the future demand for teachers. How can NCES help root such projections in strong data?

How much do teachers cost?

We know surprisingly little about the compensation that teachers receive. State level data have been reported but these are average salaries by state and do not account for variation in the age or the qualifications of the teachers in a state and they do not allow the comparison of teacher compensation across different regions within a state. Moreover, many believe that teacher salaries represent a declining share of the true compensation that teachers receive-that more and more of their compensation is in the form of benefit and retirement packages. We simply don't have data to address many of these issues. Working with Bureau of Labor Statistics we are trying to design data collections to get a better handle on teacher compensation and benefits.

While we believe we can get some traction on intra-state variation, this leaves an even more challenging task: Can we map teacher costs onto schools? This is a very difficult task.

Overarching Issues

Clearly, there are many issues involved in each of the four specific domains that I have highlighted-and there are other specific data collections (all with their own particular set of challenges) that I could have highlighted. I will conclude by returning to more broad-based challenges that cut across the entire center.

"Re-engineering" the NCES
Do we fill in holes in our data collections, such as the gap in teacher compensation or do we try to go about re-engineering the entire NCES? In some ways, the organization of the NCES reflects very traditional ideas and data collections that have accreted over time. The standards and rigor of the Center are without parallel and the Center does a wonderful job in meeting its mandate, but it is still approaching data collections in ways that haven't changed, except incrementally, in decades. While fixing the problems noted above can more than fill every day of my life as Commissioner, I often wonder about how best to position the NCES to fulfill data collection needs of the 21st century. Keeping the future in mind while dealing with a large organization generating continuous demands every day is a challenge that all managers face-including this one.

The need to keep high quality staff
I worry about the erosion of analytic capacity in the staff. More and more of our work is done by contractors and NCES staff increasingly play the role of contract officers rather than data analysts. Analytic skills can and do get rusty without constant use. I also worry about an aging workforce. Hiring is critical-but tight fiscal constraints limit the number of young people with strong analytic and managerial skills that can be hired. We are moving forward, but recruiting the next generation of NCES leaders is hard work.

Resources
Sen. John Glenn, while atop a rocket, was asked how he felt. He responded, "Pretty good considering every part of this machine was bought from the lowest bidder." Enough said on this point!

Measurement
A cynic once observed that "You can't measure a live snake." Our ability to measure many aspects of the American education system has improved substantially in the past decades, but, as with measuring a live snake, by the time we measure something in America's complex and dynamic school system, it's likely to have squirmed around and moved. There is also a related problem-many of the live snakes we are trying to measure have very good friends, so that efforts to change definitions or data collections run into political problems-even if it's time to move on. We don't want to spend scarce time, money and other resources measuring something that is irrelevant to policy makers and the research community, but how do we keep up with the ever-changing world of America's schools?

As you can see, the challenges abound, but I look forward to an exciting tenure in one of the best jobs in the world of American education policy.

Top