Dr. Gary W. Phillips
Acting Commissioner of the National Center for Education Statistics

April 11, 2000

The data from the 1999 long-term trend writing assessment will not be released until further analyses have been conducted to ensure the accuracy of the results. I would like to distinguish between the results of this assessment and those of the 1998 national and state writing assessment. NCES is not concerned about the accuracy of the 1998 report, because that assessment was based on different test specifications. The mathematics, reading, and science components of the long-term trend report are unaffected and will be released this summer, as planned.

Before describing the details of the situation, let me make two key points. First, it is important to distinguish between the long-term trend assessment and the 1998 national and state writing assessment. In the early 1970s, NCES began conducting trend assessments in reading, mathematics, science, and writing. Over time, these assessments have remained essentially the same, both in content and manner of administration. As a result, NCES has maintained a record of trends in student achievement that stretches over several decades. In the early 1990s, NCES began a new series of national and state assessments in the same four subjects, as well as assessments of geography, U.S. history, civics, and the arts. The national and state writing assessment is based on a different, and more robust, set of assessment specifications that made use of the most current methodologies. These NAEP assessments, often known as "The Nation's Report Cards," are now considered to be the main NAEP reporting series. We are in no way concerned about the accuracy of the 1998 national and state assessment reports.

The second point that I want to make is that we are planning to reanalyze data from the long-term writing assessment. We hope to issue these data in a separate report. I have included more details in the paragraphs that follow.

The National Assessment Governing Board, which sets policy for the National Assessment of Educational Progress (NAEP), agreed in March with NCES that current analyses of the long-term trend writing assessment indicated some potential problems that should be further investigated.

Long-term trend assessments must remain exactly the same, both in content and manner of administration, so that NCES can maintain a record of trends in student achievement that stretches over several decades. Writing is unique among the four long-term trend subjects in that this assessment is entirely performance-based. Students do not respond to any short-answer or multiple-choice questions in the assessment, but instead write one or two open-ended essays.

One of the challenges of performance-based trend assessment is to provide enough items to meet statistical requirements, but few enough that students and administrators are not overly burdened by testing requirements. In a survey such as NAEP, the total number of prompts must be limited if sample sizes are to remain manageable. For the long-term trend writing assessment, which is based on exercises first administered in 1984, other constraints further limited the number of exercises in the pool. Specifically, the assessment had to fit in the same booklets as the long-term trend reading assessment and had to be relatively inexpensive to score and administer. As a result, an instrument based on a limited number of older exercises was used. Again, it is important to note that the main national and state writing assessments were based on a far greater number of prompts, more modern analysis and scoring methodologies, and far larger samples, so the problems with the long-term trend assessment do not apply to them.

Test contractors performed routine quality control checks on previously analyzed data and noted a problem with the method used to compare results from one testing year to the next. Ultimately, the two problems, the limited number of items and the analytical techniques, led to a lack of confidence in the data. This was not a problem with the NAEP 1998 Writing Report Card, which reported national and state results and was based on 20 topics per grade rather than six.

As a result of the potential problems with the long-term trend writing assessment, the NCES Assessment division has removed long-term trend writing data from the website. Further, the division has notified report recipients that the data are "under review" and that reports will be re-released online when the data have been reanalyzed.

In addition:

  1. The 1999 long-term trend report with mathematics, reading, and science data will be issued on schedule.
  2. New measurement models that allow incorporation of rater effects and more robust variance estimation into the analysis of data will be developed.

The results of this effort will be shared with other practitioners and will, I believe, further our understanding of the technical requirements of performance-based testing and methods of analyzing year-over-year results.
