The NAEP long-term trend writing assessment provides an important picture of students' progress over time because it compares performance on the same writing tasks, administered in identical fashion to comparable samples of students and yielding comparable scores. There have been six national assessments of writing conducted during the school years ending in 1984, 1988, 1990, 1992, 1994, and 1996. The 1996 assessment included the same set of 12 writing tasks that had been administered in the five previous assessments. Each of these trend assessments was administered to nationally representative samples of students in grades 4, 8, and 11.
Over the past three decades, many teacher educators and classroom teachers have been emphasizing the writing process. The writing process approach focuses on the iterative nature of writing, in which writers plan, write, and revise their ideas in several drafts before a final version is produced. It is during the revision or editing stages of this process that writers focus on correcting grammatical and mechanical errors. Grammatical and mechanical correctness is not viewed as an end in and of itself, but eliminating these errors is an important part of improving the final draft. This report focuses on what changes, if any, have occurred in student writing between 1984 and 1996, the period examined by the NAEP long-term trend writing assessment.
Results of the 1996 long-term trend writing assessment are reported in two publications. This report describes two aspects of writing for which change has been measured since 1984: writing fluency, as determined by holistic scoring; and mastery of the conventions of written English (spelling, punctuation, grammar) as determined by mechanics scoring. This report is supplementary to NAEP 1996 Trends in Academic Progress, the main report for the NAEP long-term trend assessment. That document reports trends in writing scores since 1984 as determined by primary trait scoring. This report presents the results of the holistic scoring of a subgroup of four of the 12 writing tasks, and the mechanics scoring of two of these four tasks.
The report is organized as follows: Chapter 1 compares student performance on writing tasks in 1984 and 1996 as measured by holistic scoring. Chapter 2 compares students' mastery of the conventions of writing (grammar, punctuation, and spelling) in 1984 and 1996. The brief Summary offers conclusions, and is followed by three appendices. Appendix A contains information about sample sizes and scoring procedures, and Appendix B contains the guides for holistic and mechanics scoring. Appendix C provides the standard errors for the data in the tables contained in the body of the report.
The NAEP long-term trend writing assessments discussed here and in Trends should not be confused with the main NAEP writing assessments. The long-term trend writing assessment was begun in 1984, and has presented students with the same writing tasks in the five ensuing assessments. These writing tasks are completely different from the prompts in the main NAEP assessment. The use of different writing prompts, as well as other procedural differences, precludes direct comparison of the results of the long-term trend assessment discussed here with those of the main assessments.
In order to assess students' abilities to write in a variety of formats and genres, the NAEP long-term trend writing assessment asks them to respond to several different tasks in each of three types of writing: informative, persuasive, and narrative.
The NAEP long-term trend instrument consists of 12 distinct writing tasks; however, each student who participated in the assessment responded to only a few (usually two) of the 12 tasks. These tasks are assessed using three types of measures: primary trait scoring, holistic scoring, and mechanics scoring.
Primary trait scoring is based on established criteria that reflect the success of the student in accomplishing the specific writing task; for primary trait scoring, a unique scoring guide was used for each of the tasks. Student responses to all 12 writing tasks received primary trait scoring as reported in the principal 1996 long-term trend report, NAEP 1996 Trends in Academic Progress.
However, other aspects of writing are also important to assess -- for instance, general writing quality or fluency: the student's capacity to organize and develop a written piece, to use correct syntax, and to observe the conventions of standard written English. Taken together, these aspects of written communication are what holistic evaluation of writing addresses.
The long-term trend writing assessment employed three distinct types of scoring: primary trait, holistic, and mechanics. First, all 12 of the long-term trend writing tasks were scored using primary trait scoring criteria. The results of that scoring are reported in NAEP 1996 Trends in Academic Progress in Chapters 7 and 8 (pages 151-197).
Next, a subgroup of four of these tasks was scored holistically -- two tasks at each grade level. Two of the writing tasks were administered at grade 4 only, while the two other tasks were both administered at grades 8 and 11. One of the four is an informative task, one is a narrative task, and two are persuasive tasks. A brief description of each writing task and the grades at which the task was administered are provided in Figure I.1 below. Holistic scoring of these tasks yielded information about students' level of writing fluency, as seen in Tables 1.1 - 1.3. Different scoring guides were used for holistic scoring of narrative, informative, and persuasive tasks, as described in Appendix B.
Lastly, to gain information about students' mastery of the conventions of written English, a subgroup of two of the holistic tasks was scored for mechanics -- one at each grade level (see the figure above and Tables 2.1 - 2.3). The mechanics scoring involved assessing students' use of standard English sentence structure, rules of agreement, word choice, spelling, and punctuation. It also captured information about the overall length of the students' responses and the number and complexity of the sentences that they used. For mechanics scoring, the same criteria were used to evaluate all tasks. See Appendix B of this report for the mechanics and holistic scoring guides.
Holistic scoring is the most commonly used method for evaluating students' writing performance in the United States today. Holistic scoring for NAEP focuses on the writer's fluency in responding to a task relative to the performance of other students at that grade level. Fluency reflects a writer's facility with language both in terms of the development and organization of ideas and in the use of syntax, diction, and grammar. Holistic scoring methods were specifically designed to assess writing fluency. The underlying assumption of holistic scoring is that the whole piece of writing is greater than the sum of its parts. In holistic scoring, readers do not make separate judgments about specific aspects of a written response, but rather consider the overall effect, rating each paper on the basis of its general fluency.
In the NAEP long-term trend assessment, responses to four tasks are scored holistically, two tasks at each of the three grades (the same two tasks are administered at both eighth and eleventh grades). The characteristics of general fluency are assessed on a six-point scale, and described in the holistic scoring guides for narrative, informative, and persuasive writing tasks in Appendix B. In order to make comparisons of students' writing fluency across all six years of the assessment, all papers from the previous years were scored holistically, along with all of the 1996 papers. For each year, approximately 1200 papers from each grade are scored.
As is typical of holistic scoring, raters are trained on a particular task immediately before scoring the papers written in response to that task (as described in Appendix A). For each task, the papers from all years are randomly mixed, and each paper is assigned one of six scores. To detect changes in fluency from one assessment to another, the percentages of papers from each year within a given score category are compared. The comparisons reported here are for the first or base year and the current year, as in previous reports.
Thus, while primary trait scoring is based on specific, constant criteria and so permits year-to-year and grade-to-grade comparisons, holistic scoring allows within-grade comparisons of relative fluency across all years according to contemporaneous criteria.
Another set of analyses, applied to papers written for two of the tasks (see Figure I.1 above), focused on the mechanics of students' writing. While error counts do not fully reflect a writer's fluency and competency, many educators, policy makers, and parents are interested in the kinds of surface errors students make as they write. Students' mastery of the sentence-level and word-level conventions of English, as well as their use of correct spelling and punctuation, were examined. (See Appendix A for procedures used in scoring, and Appendix B for the mechanics scoring guide.) In order to examine changes in students' success in using the conventions of written English, one task at each grade was selected for a detailed analysis of writing mechanics, including spelling, word choice, punctuation, and syntactic errors.
Because the analysis is conducted using papers written by students who are part of a sample (rather than from the entire population of fourth, eighth, or eleventh graders in the nation) the numbers reported are necessarily estimates. As such, they are subject to a measure of uncertainty. This measure of uncertainty is reflected in the standard error of the estimate, which can be seen in Appendix C, in tables paralleling those in the main body of the report. In comparing student performances on a particular characteristic by either number or percentage, it is essential to take into account the standard error, rather than to rely solely on observed similarities or differences. The comparisons discussed in this report and marked with asterisks in the tables are based on statistical tests that consider both the magnitude of the difference between the averages and the standard errors of those statistics.
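The logic of such a comparison can be sketched in a few lines of code. The following is an illustrative sketch, not NAEP's actual analysis software, and the numbers used are hypothetical, not actual NAEP results: the difference between two independent estimates is judged against the standard error of that difference.

```python
import math

def significant_difference(est_a, se_a, est_b, se_b, z_crit=1.96):
    """Return True if the difference between two independent estimates
    is statistically significant at the 5 percent level (two-tailed).
    The standard error of the difference combines the two standard
    errors in quadrature."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (est_b - est_a) / se_diff
    return abs(z) > z_crit

# Hypothetical values for illustration only (not NAEP data):
# an average score of 3.1 (SE 0.06) in 1984 vs. 2.9 (SE 0.05) in 1996.
print(significant_difference(3.1, 0.06, 2.9, 0.05))  # prints True
```

The point of the sketch is that two apparently different averages are declared different only when the gap is large relative to the combined standard errors, which is why readers are cautioned against relying on observed differences alone.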
The statistical tests determine whether the evidence -- based on data from the two years -- is strong enough to conclude that there is an actual difference. If the evidence is strong (i.e., the difference is statistically significant), statements comparing 1996 with 1984 use terms such as higher, lower, increased, or decreased. The reader is cautioned to rely on the results of the statistical tests, as expressed in the text or as indicated in the tables, rather than on the apparent magnitude of the differences.
The statistical tests employed here used Bonferroni procedures to form confidence intervals for the differences for sets of comparisons. Bonferroni procedures are appropriate for sets or "families" of comparisons, adjusting for the family size to keep the overall certainty or significance level as specified (that is, a 95 percent certainty or 5 percent significance level). Several family sizes were used for the comparisons in this report. Consider, for example, Table 2.1, which compares overall averages in 1984 papers with those in 1996 papers. For this across-year comparison, the family size is 1, and consequently no adjustment is needed. Table 2.1 also presents across-year comparisons for papers in the lower and upper halves of the holistic scale; these two comparisons together form a family of 2, so a Bonferroni adjustment for a family size of 2 is made. Further information on statistical tests and adjustment procedures is provided in the NAEP 1996 Technical Report.
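The effect of the Bonferroni adjustment can be illustrated with a short sketch (again, an illustration of the general technique, not NAEP's actual procedure): dividing the significance level by the family size tightens the critical value each individual comparison must exceed.

```python
from statistics import NormalDist

def bonferroni_z_critical(alpha=0.05, family_size=1):
    """Two-tailed critical z value after a Bonferroni adjustment:
    each comparison is tested at alpha / family_size, which keeps
    the familywise error rate at (or below) alpha."""
    per_comparison_alpha = alpha / family_size
    return NormalDist().inv_cdf(1 - per_comparison_alpha / 2)

# Family of 1 (a single across-year comparison): the usual cutoff.
print(round(bonferroni_z_critical(0.05, 1), 2))  # prints 1.96
# Family of 2 (lower- and upper-half comparisons): a stricter cutoff.
print(round(bonferroni_z_critical(0.05, 2), 2))  # prints 2.24
```

With a family of 2, a difference must be roughly 2.24 standard errors from zero, rather than 1.96, before it is flagged as significant.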
U.S. Department of Education. Office of Educational Research and Improvement. National Center for Education Statistics. NAEP 1996 Trends in Writing: Fluency and Writing Conventions, NCES 1999-456, by N. Ballator, M. Farnum, and B. Kaplan. Washington, DC: 1999.