
Students Selecting Stories: The Effects of Choice in Reading Assessment –

Results from "The NAEP Reader" Special Study of the 1994 National Assessment of Educational Progress

June 1997

Authors: Jay R. Campbell and Patricia L. Donahue

PDF Download the complete report in a PDF file for viewing and printing. 694K


Students select stories to read in classrooms, in libraries, in their homes, and wherever reading materials are available. They choose different types of stories based on their purpose for reading and their personal interests. One situation where students rarely have a choice of what to read, however, is in a reading assessment. Some educators view this as a problem in assessing reading comprehension since students may be more engaged when they have chosen a text than when they are reading assigned texts.

While the effect of choice on student performance in large-scale assessments and the psychometric ramifications of offering choice have been studied, The NAEP Reader study represents a significant departure from past efforts. Whereas earlier studies have focused on the effects of offering students a choice among test questions, [1] The NAEP Reader study was designed to examine the feasibility and measurement impact of offering test takers a choice of reading material on an assessment of reading comprehension.

Conducted as part of the 1994 National Assessment of Educational Progress (NAEP), The NAEP Reader study was designed to compare the performance of students who were allowed to select a story with the performance of those who were assigned a story. Booklets containing a selection of seven stories were produced, one for grade 8 and a different selection for grade 12. One nationally representative sample of students at each grade was allowed to choose a story to read. Distinct representative samples at each grade were assigned stories. As The NAEP Reader study was administered in conjunction with the NAEP reading assessment, students participating in the study worked within the same 50-minute time frame as students taking the main assessment. All participants, in both the choice and non-choice samples, answered the same eleven comprehension questions, which were worded generically so as to apply to any of the stories. Students in the choice sample were given an additional question asking them to briefly explain the reason for their choice of story. The major findings from this special study are provided below.

Major Findings

  • Choice vs. Non-Choice Performance

    Among twelfth graders, no significant difference was observed between the average reading scores of students who were given a choice of story and students who were assigned a story. At grade 8, however, students who selected a story demonstrated slightly lower performance than students who did not have a choice of story. The difference was one scale-score point on a 0-to-100 scale with a standard deviation of 10.

  • Choice vs. Non-Choice Perceptions

    Some differences were observed in students' perceptions of the assessment depending on whether or not they were allowed to choose a story. At both grades 8 and 12, students in the choice group were more likely than students in the non-choice group to rate the assessment as easier than other tests or assignments that they had had in school. Also, twelfth graders who could choose a story had higher estimations of their performance on the assessment than did their counterparts who were assigned a story. On the other hand, no significant differences between choice and non-choice groups were observed in students' reports of their motivation for performing well on the assessment.

  • Patterns of Story Selection

    Despite some slight variations, the patterns of students' story selections were mostly similar across racial/ethnic groups at both grades 8 and 12, and across gender groups at grade 8. Among twelfth graders, however, males and females demonstrated strikingly different story preferences. Males were predominantly drawn to a story about a soldier and females were predominantly drawn to a story about a relationship.

  • Story Selection Criteria

    The most frequently reported basis for story selection in both grades was an affective or general evaluative criterion. Also, twelfth graders were more likely than eighth graders to select a story because it represented a particular genre.

  • Context Effects of Stories on Comprehension Questions

    Although identical questions were used to assess students' comprehension of each of the seven stories at each grade, there was evidence that many questions were more or less difficult to answer in conjunction with certain stories.
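
To put the grade 8 finding in perspective, the reported one-point difference can be restated in standard deviation units. The short sketch below does only that arithmetic, using the two figures given above (a one-point difference and a scale standard deviation of 10). The function name is illustrative and is not part of the study.

```python
def standardized_difference(score_diff, scale_sd):
    """Express a scale-score difference as a fraction of the scale's standard deviation."""
    if scale_sd <= 0:
        raise ValueError("scale_sd must be positive")
    return score_diff / scale_sd

# Figures taken from the grade 8 finding: 1 scale-score point, SD of 10.
effect = standardized_difference(score_diff=1.0, scale_sd=10.0)
print(f"{effect:.1f} standard deviations")
```

Read this way, the grade 8 gap amounts to about one-tenth of a standard deviation on the study's scale.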

Background on the NAEP Reading Assessment

As educational theories and instructional approaches change over time to reflect evolving perceptions of how students learn and develop, concerns naturally arise about the assessment methods used to measure students' achievement. The emergence of an interactive, constructive theory of reading over the last two decades has not only brought about pedagogical reforms but has also called into question the traditional approaches to assessing reading development. In response, changes in how reading comprehension is measured can be observed in classrooms, in state-wide assessment initiatives, and in national large-scale assessment programs.

Reflecting these changing theories and practices, the National Assessment of Educational Progress (NAEP) reading assessment was redesigned in 1992 to include an increased emphasis on constructed-response questions and to involve students in reading authentic texts (materials selected from sources commonly available to students in and out of school). The assessment framework which provided the basis for developing the 1992 reading assessment views reading as a complex, interactive process between the reader, the text, and the context of the reading situation. [2] Furthermore, the processes and strategies used by readers to construct meaning from text are assumed to vary across texts and reading activities. As such, the framework specified that students should be assessed in reading for three different purposes: reading for literary experience, reading to gain information, and reading to perform a task.

The 1994 NAEP assessment of reading was conducted using two-thirds of the content from the 1992 assessment and new content that was developed from the same framework. Results from the 1994 assessment, as well as comparisons with results from the 1992 assessment, are presented in NAEP 1994 Reading Report Card for the Nation and the States. [3]

Both the 1992 and 1994 NAEP reading assessments incorporated innovative tools and procedures for measuring reading comprehension that may be seen as responsive to the concerns of educators and researchers about more traditional testing approaches. For example, the use of authentic reading materials rather than passages that were written or abridged specifically for the assessment was viewed as creating a test situation which more closely replicated real-world reading tasks. Also, using a variety of texts representing different reading purposes rather than relying on a single type of text provided for a more comprehensive assessment. Emphasizing constructed responses to comprehension questions rather than relying primarily on multiple-choice formats provided opportunities for students to express fuller and more diverse interpretations based on their prior experiences and background knowledge.

Rationale for The NAEP Reader Study

Although the 1992 and 1994 NAEP reading assessments incorporated a number of innovations in measuring reading comprehension, many reading educators and researchers have voiced additional concerns about traditional assessment approaches. As the need to replicate tasks across students is paramount if comparisons between students' performance are to be made, standardized assessments of reading comprehension typically include a common set of reading materials and questions that are administered to all students participating in the assessment. Although fundamental principles of educational measurement require such a practice, it has been criticized by some within the field of reading as creating a situation in which test takers may lack the motivation and interest that support engagement and comprehension in more typical reading situations. [4]

The interaction of cognitive and affective processes has come to be viewed as an important aspect of readers' ability to comprehend texts. Some reading theorists have suggested that a reader's affective stance toward a text may play a critical role in the processes of comprehension. [5] Studies have shown that a positive attitude toward the reading task may increase the reader's attention, strategy use, and persistence. [6] Other studies indicate that the link between a reader's attitude and comprehension may be mediated by other variables, including the extent and relevance of prior knowledge, the task demands, and the context of the reading situation. [7]

The influence of affective processes such as interest and motivation on reading comprehension and literacy development has become a central focus in numerous recent research studies and efforts to improve reading instruction. [8] It has been suggested that readers who are interested in the material and motivated to understand are more likely to demonstrate a level of engagement that promotes deeper levels of comprehension. [9] For example, readers who have interest in a text may more willingly engage in thoughtful consideration and be more apt to make personal connections with text ideas.

Often cited in the literature on engagement in reading is the body of research investigating the effects of intrinsic and extrinsic motivations on learning. Intrinsic motivations that are internal to the learner, such as interest, curiosity, and challenge, have been shown to promote and sustain higher levels of learning. Conversely, extrinsic motivations that are imposed externally, such as grades, recognition, and competition, may focus the learner on minimal levels of task completion. [10] Educators who seek to promote a life-long desire for reading in students and to provide students with the tools for succeeding at literacy tasks have come to recognize the importance of intrinsic motivation in classroom activities.

Increasingly, the growing knowledge base in literacy motivation and engagement has influenced school curriculum. For example, research indicating that student selection of tasks and materials can enhance learning attitudes and involvement has led to an emphasis on self-selected reading in many classrooms. [11] Recognizing that strong intrinsic motivation for reading is necessary to the student's development of strategies, such as summarizing and drawing inferences, many classrooms encourage such motivations as curiosity and involvement by allowing students to choose their own topics.

Providing students with a choice and giving students time to read books of their own choosing exemplify some of the effective strategies for literacy development that have become a part of instructional practice. [12] In addition, materials used for reading instruction are no longer limited to passages that were traditionally part of basal programs, passages that were usually written in a manner that controlled for vocabulary, language, and topic. Instead, many teachers use a range of texts and text types in their instruction, giving students exposure to diverse reading materials and providing them opportunities to develop personal interests and preferences in reading. [13] By linking students' intrinsic motivations to curriculum activities, the classroom becomes a site of possibility for students to become engaged in and to further their own literacy development.

As the theory and practice of reading instruction evolve, it is important to consider the implications of these changing ideas for assessment procedures. Undoubtedly, the constraints of large-scale assessment do not allow for accommodating the infinite variety of interests and preferences of each individual participant. Indeed, as the assessment situation typically calls upon an extrinsic motivation of compliance, the degree to which a student's intrinsic motivation can be incited may be at least partially circumscribed. The NAEP Reader study was conceived as an examination of one concern voiced by educators and researchers -- the effects of choice on an assessment of reading comprehension. Set within the context of a large-scale assessment, the primary question addressed by this study is whether or not students perform differently on an assessment of reading comprehension when they are allowed to choose from a selection of texts rather than being given a particular text to read.

Design of The NAEP Reader Study

In order to examine the effect of choice, The NAEP Reader study was conducted with equivalent but distinct samples at each grade, differing only in whether or not they had a choice of which story to read. A nationally representative sample of 2,416 eighth graders and 2,100 twelfth graders was given a choice. These students, having received a collection of seven stories appropriate to their grade, were asked to select a story, to write a brief explanation of why they chose the story, and to answer eleven constructed-response (open-ended) comprehension questions. The nationally representative samples that were assigned one of these same stories to read (i.e., one sample for each of the seven stories at each grade) ranged from 581 to 859 students at grade 8, and from 456 to 629 students at grade 12. The total number of students in the non-choice samples across all seven stories was 4,825 at grade 8 and 3,664 at grade 12. Students in these non-choice samples were asked to answer the same eleven comprehension questions for the assigned story as the choice sample answered in relation to a selected story.

The collection of stories at each grade, entitled The NAEP Reader, comprised a variety of literary genres by both well- and lesser-known authors. The stories were drawn from sources appropriate to either grade 8 or grade 12 and were chosen for both their literary merit and cultural diversity. The length of the seven stories ranged approximately from 1,200 to 2,200 words at grade 8 and from 1,300 to 2,600 words at grade 12. While deemed comparable in difficulty by the committee of reading experts that oversaw the development of this study (see Appendix B), the stories covered distinctly different topics. Printed on the inside cover of each collection, very brief story summaries provided students with a hint about the plot or main character. On the facing page, the table of contents provided the authors' names. Thus, the collection resembled a literary text that students might encounter in school or in their reading experience. Figures 1 and 2 on the following pages present the story summaries which appeared at the beginning of The NAEP Reader for each grade.

Figure 1: Eighth-Grade NAEP Reader Story Summaries

Figure 2: Twelfth-Grade NAEP Reader Story Summaries


In addition to the copy of The NAEP Reader appropriate for their grade, each student involved in the study received a booklet containing eleven comprehension questions. Of these eleven questions, eight were short constructed-response questions requiring a one- or two-sentence response and three were extended constructed-response questions requiring a more developed, reflective response of one or more paragraphs. Short constructed-response questions were scored as acceptable or unacceptable; extended constructed-response questions were scored according to a four-level rubric ranging from unsatisfactory to extensive. The assessment time was 50 minutes both for those students who were assigned a story and for those who were given a choice.

To accommodate students' choices and to allow for comparison of performance across the seven stories for students in both the choice and non-choice samples, the comprehension questions were composed generically so as to be applicable to any of the stories in the grade 8 or grade 12 NAEP Reader. For example, one of the questions asked students to describe the qualities of one of the main characters; another asked students to evaluate the appropriateness of the story's title. As these questions could be answered about any of the stories at each grade, all students participating in the assessment responded to the same set of questions. (The comprehension questions are presented in Appendix A.)

For each grade, responses to The NAEP Reader comprehension questions were analyzed to determine the percentages of students responding in each of the categories specified by the scoring rubrics. The performance of the nationally representative student samples that were each assigned one of the seven stories was used to establish a scale. Item response theory (IRT) methods were used to produce the scale, which ranged from 0 to 100, with a mean of 50 and a standard deviation of 10. The performance of students who were allowed to choose a story was then analyzed using the same scale; thus, it is possible to report and compare students' performance in the choice and the non-choice samples on this scale.
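
The way such a reporting scale is anchored can be sketched in a few lines. The example below assumes the scale is a linear transformation of IRT proficiency estimates, fixed so that the non-choice (calibration) samples have a mean of 50 and a standard deviation of 10, and then applied unchanged to the choice sample. The report does not give the actual transformation or estimation procedure, so this is an illustration only; the function name and all proficiency values are hypothetical.

```python
from statistics import fmean, pstdev

def fit_scale(calibration_thetas, target_mean=50.0, target_sd=10.0):
    """Build a mapping from proficiency estimates to the reporting scale.

    The mapping is fixed using the calibration sample only, so any other
    group scored with it is expressed on the same scale.
    """
    mu = fmean(calibration_thetas)
    sigma = pstdev(calibration_thetas)
    if sigma == 0:
        raise ValueError("calibration sample has no variance")

    def to_scale(theta):
        # Standardize against the calibration sample, then rescale.
        return target_mean + target_sd * (theta - mu) / sigma

    return to_scale

# Hypothetical proficiency estimates (not data from the study).
non_choice_thetas = [-1.2, -0.4, 0.0, 0.3, 1.3]
choice_thetas = [-1.0, -0.5, 0.1, 0.2, 0.9]

to_scale = fit_scale(non_choice_thetas)
non_choice_scores = [to_scale(t) for t in non_choice_thetas]
choice_scores = [to_scale(t) for t in choice_thetas]

print(round(fmean(non_choice_scores), 1), round(pstdev(non_choice_scores), 1))
print(round(fmean(choice_scores), 1))
```

Because the transformation is fitted on the non-choice samples alone, the choice sample's average is free to fall above or below 50, which is what makes the two groups comparable.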

An advantage of using IRT methods is that results for all students, no matter which story they read, are easily placed on the same scale. Three important assumptions were made in using this methodology. One is that each of the subsamples of students that were assigned a story to read is representative of the national student population. A second assumption is that for each story, each of the comprehension questions meant the same thing for students who selected the story and for students who were assigned the story. The third assumption is that the questions as answered in the context of each story all measure the same construct.

This Report

This report comprises three chapters, each focusing on a different aspect of the study. Chapter One presents findings related to the primary question of the study: Was student performance better when choice of stories was offered than when students were randomly assigned a story? Results are presented for the nation and by racial/ethnic and gender subgroups. In addition, students' perceptions of the assessment, including their motivation for performing well, are presented in this chapter. Chapter Two describes patterns of choices displayed by students who were allowed to select a story. Student selection patterns are presented for the nation, and by race/ethnicity and gender. Also in Chapter Two is a description of the selection criteria reported by students in making their story choices. Chapter Three examines how student performance on the generically worded questions varied in relation to different stories and presents sample student responses. The report concludes with a discussion of study results and issues related to study design and interpretations.

The average scale scores and percentages presented in this report are estimates because they are based on samples rather than the entire population. As such, the results are subject to a measure of uncertainty due to sampling error. In addition, measurement error contributes to the uncertainty of average scale scores reported for groups of students. The degree of uncertainty is reflected in the standard errors presented in parentheses along with the estimated average scores or percentages in tables and figures throughout this report.

The differences between scale scores or percentages discussed in the following chapters take into account the standard errors associated with the estimates. The comparisons are based on statistical tests that consider both the magnitude of the difference between the group average scores or percentages and the standard errors of those statistics. Throughout this report, differences are discussed only if they were determined to be statistically significant at the .05 level with appropriate adjustments for multiple comparisons.
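
A comparison of this kind can be sketched as follows. The example assumes the two estimates come from independent samples, uses a two-sided z-test on their difference, and applies a Bonferroni adjustment for multiple comparisons. The report states only that "appropriate adjustments" were made, so the Bonferroni choice, the function name, and every number below are assumptions for illustration, not the study's procedure or results.

```python
from math import sqrt
from statistics import NormalDist

def compare_estimates(est_a, se_a, est_b, se_b, n_comparisons=1, alpha=0.05):
    """Two-sided z-test for the difference between two independent estimates.

    The significance threshold is divided by the number of comparisons
    in the family (Bonferroni adjustment, assumed here).
    """
    diff = est_a - est_b
    # Standard error of a difference between independent estimates.
    se_diff = sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    adjusted_alpha = alpha / n_comparisons
    return {
        "diff": diff,
        "se_diff": se_diff,
        "z": z,
        "p_value": p_value,
        "significant": p_value < adjusted_alpha,
    }

# Hypothetical averages with standard errors in parentheses: 51.2 (0.4) vs. 50.0 (0.3).
single = compare_estimates(51.2, 0.4, 50.0, 0.3)
family = compare_estimates(51.2, 0.4, 50.0, 0.3, n_comparisons=4)

print(single["significant"], family["significant"])
```

With these made-up figures, the same difference counts as significant when tested on its own but not when it is one of four comparisons, which shows why the adjustment matters for which differences are discussed.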

  1. Lukhele, R., Thissen, D. & Wainer, H. (1994). On the relative value of multiple-choice, constructed response, and examinee-selected items on two achievement tests. Journal of Educational Measurement, 31(3), 234-250.

  2. Reading framework for the 1992 and 1994 National Assessment of Educational Progress. (1994). National Assessment Governing Board. Washington, DC: U. S. Government Printing Office.

  3. Campbell, J. R., Donahue, P. L., Reese, C. M., & Phillips, G. W. (1996). NAEP 1994 reading report card for the nation and the states. National Center for Education Statistics. Washington, DC: U. S. Government Printing Office.

  4. Levande, D. (1993). Standardized reading tests: Concerns, limitations, and alternatives. Reading Improvement, 30(2), 125-127.

  5. Mathewson, G. C. (1994). Model of attitude influence upon reading and learning to read. In R. B. Ruddell, M. R. Ruddell, & H. Singer (Eds.), Theoretical models and processes of reading (pp. 1131-1161). International Reading Association: Newark, DE.

    Rosenblatt, L. M. (1994). The transactional theory of reading and writing. In R. B. Ruddell, M. R. Ruddell, & H. Singer (Eds.), Theoretical models and processes of reading (pp. 1057-1092). International Reading Association: Newark, DE.

  6. Alexander, P. A., Kulikowich, J. M., & Jetton, T. L. (1994). The role of subject-matter knowledge and interest in the processing of linear and nonlinear texts. Review of Educational Research, 64, 201-252.

    Baldwin, R. S., Peleg-Bruckner, Z., & McClintock, A. H. (1985). Effects of topic interest and prior knowledge on reading comprehension. Reading Research Quarterly, 20(4), 497-504.

  7. Henk, W. A., & Homes, B. C. (1988). Effects of content-related attitude on the comprehension and retention of expository text. Reading Psychology, 9(3), 203-225.

    Hollingsworth, P. M., & Reutzel, D. R. (1990). Prior knowledge, content-related attitude, reading comprehension: Testing Mathewson's affective model of reading. Journal of Educational Research, 83(4), 194-199.

  8. Cramer, E., & Castle, M. (Eds.). (1994). Fostering the love of reading: The affective domain in reading education. International Reading Association: Newark, DE.

  9. Guthrie, J. T. (1996). Educational contexts for engagement in literacy. The Reading Teacher, 49(6), 432-445.

    Sweet, A. P., & Guthrie, J. T. (1996). How children's motivations relate to literacy development and instruction. The Reading Teacher, 49(8), 660-662.

  10. Deci, E. L., Vallerand, R. J., Pelletier, L. G., & Ryan, R. M. (1991). Motivation and education: The self-determination perspective. Educational Psychologist, 26, 325-346.

  11. Sweet, A. P. (1993, November). Transforming ideas for teaching and learning to read. Office of Educational Research and Improvement. U.S. Department of Education: Washington, DC.

  12. Raphael, T. E., & McMahon, S. I. (1994). Book club: An alternative framework for reading instruction. The Reading Teacher, 48(2), 102-116.

    Turner, J., & Paris, S. G. (1995). How literacy tasks influence children's motivation for literacy. The Reading Teacher, 48(8), 662-673.

  13. Hiebert, E. H. (1994). Becoming literate through authentic tasks: Evidence and adaptations. In R. B. Ruddell, M. R. Ruddell, & H. Singer (Eds.), Theoretical models and processes of reading (pp. 391-413). International Reading Association: Newark, DE.

    Strickland, D. S. (1994/1995). Reinventing our literacy programs: Books, basics, balance. The Reading Teacher, 48(4), 294-302.

NCES 97-491 Ordering information

Last updated 23 March 2001 (RH)