
Learning About Our World and Our Past: Using the Tools and Resources of Geography and U.S. History

January 1998

Authors: Evelyn Hawkins, Fran Stancavage, Julia Mitchell, Madeline Goodman, and Stephen Lazer

PDF Download the complete report in a PDF file for viewing and printing. 5,052K


Introduction

The National Assessment of Educational Progress (NAEP) is mandated by the United States Congress to survey the educational accomplishments of U.S. students. For more than a quarter of a century, NAEP has assessed the educational achievement of fourth-, eighth-, and twelfth-grade students in selected subject areas, and as such, it is the only nationally representative and continuing assessment of what America's students know and can do. NAEP assessments are based on content frameworks and specifications developed through a national consensus process involving teachers, curriculum experts, parents, and members of the general public. The frameworks are designed to reflect a balance among the emphases suggested by current instructional efforts, curriculum reform, contemporary research, and desirable levels of achievement.

In 1994, NAEP conducted assessments in geography and U.S. history. Both are fields in which students are required to have a strong knowledge base of facts and concepts. Mastery of these subjects also involves the ability to use a variety of tools and resources as well as competence in a range of interpretive skills, including recall, analysis, judgment, application, and evaluation. Among the more important resources used in the study of history are primary source documents, such as the U.S. Constitution, the Declaration of Independence, personal correspondence, pictures, photographs, political cartoons, literature, and other artifacts. Historians interpret primary source documents by placing them in comparative, thematic, and chronological contexts to further their knowledge and understanding of historical events. Among the major tools of the field of geography are maps and atlases. Geographers use a variety of maps to guide their work and to record information. Other tools and resources in history and geography include charts, graphs, tables, and timelines.

While achievement in history and geography involves both knowledge and skills, many past assessments in these areas have tended to focus on the former at the expense of the latter. Conversely, some other assessments have downplayed knowledge in order to emphasize the mastery of skills. The frameworks that guided the 1994 NAEP assessments, by contrast, attempted to achieve a more balanced view of the relationship between knowledge and skills. Specifically, both the geography and U.S. history frameworks portrayed their disciplines as fields in which factual knowledge, use of specialized tools and resources, and interpretive skills are all inseparable components of achievement.

Basic results from the 1994 NAEP assessments in geography and U.S. history have been presented in a series of reports released earlier. [1] These reports were intended for policymakers, educators, and the general public. They focused primarily on overall scale-score and achievement-level results for major populations in the United States, and on general factors related to achievement in history and geography. The current study has a more specific target audience (history and social studies educators) and a different purpose: a more in-depth look at the types of tasks that made up the 1994 NAEP assessments, and at how students performed on those tasks. Specifically, this report examines the ways in which students use the tools and resources of history and geography. Rather than looking at aggregated results, we examine performance in different skill areas and on particular assessment exercises. This report therefore makes extensive use of examples of student work and of exercise-level statistics.

As mentioned above, this report will examine the success students had working with a range of resource materials similar to those used by professional geographers and historians. History and geography educators view the ability to interpret a broad range of authentic materials as an essential element of learning in these fields. [2] Many questions on both surveys were not based on the types of stimuli discussed in this report; for example, a number of tasks were designed to measure student knowledge and not the ability to interact with textual, quantitative, or graphic materials. Because of the particular focus of this report, no attempt is made to discuss the full range of knowledge and skills assessed as part of the 1994 NAEP geography and U.S. history assessments.


1994 NAEP U.S. History Framework

The framework for the 1994 NAEP U.S. history assessment [3] represented an ambitious vision both of what students should know and be able to do, and of the ways in which those competencies should be assessed. It presented the study of history as an exciting endeavor and emphasized the importance of knowing and understanding history in all its complexity -- stressing the relationship between people, events, and ideas in understanding the past. Furthermore, the framework called for an assessment that reflected the richness of history and historical sources through the use of a variety of grade-appropriate stimulus materials.

The 1994 framework was organized around two content dimensions and one cognitive dimension. One content dimension focused on four themes which represent the major areas of endeavor that have characterized U.S. history. These four themes were:

  1. Change and Continuity in American Democracy: Ideas, Institutions, Practices, and Controversies

  2. The Gathering and Interaction of Peoples, Cultures, and Ideas

  3. Economic and Technological Changes and Their Relation to Society, Ideas, and the Environment

  4. The Changing Role of America in the World

Because history is concerned with the experiences of people over time, it is critical to establish a basic chronological structure for tracing, reconstructing, and connecting the stories of those experiences. Thus, the second content dimension provided a chronological structure for the many issues included in the four themes. Eight periods were identified to focus attention on several major eras of U.S. history. They overlapped at some points in order to permit coherent coverage of major trends and events. The periods were:

  1. Three Worlds and Their Meeting in the Americas (Beginnings to 1607)

  2. Colonization, Settlement, and Communities (1607 to 1763)

  3. The Revolution and the New Nation (1763 to 1815)

  4. Expansion and Reform (1801 to 1861)

  5. Crisis of the Union: Civil War and Reconstruction (1850 to 1877)

  6. The Development of Modern America (1865 to 1920)

  7. Modern America and the World Wars (1914 to 1945)

  8. Contemporary America (1945 to present)

As Figure 1.1 illustrates, the themes and periods of U.S. history functioned as a matrix. The framework made clear that not all themes were equally important in each period. It also included special recommendations for adapting the assessment for fourth-grade students, who might not have received any formal instruction in U.S. history.

Figure 1.1: 1994 NAEP U.S. History Content Matrix

In addition to themes and periods, the U.S. history framework explicitly considered the ways of thinking and kinds of knowledge that historical study requires. These were divided into the following two general cognitive domains:

  1. Historical Knowledge and Perspective. This domain includes knowing and understanding people, events, concepts, themes, movements, contexts, and historical sources; sequencing events; recognizing multiple perspectives and seeing an era or movement through the eyes of different groups; and developing a general conceptualization of U.S. history.

  2. Historical Analysis and Interpretation. This domain includes explaining issues; identifying historical patterns; establishing cause-and-effect relationships; finding value statements; establishing significance; applying historical knowledge; weighing evidence to draw sound conclusions; making defensible generalizations; and rendering insightful accounts of the past.


1994 NAEP Geography Framework

The 1994 NAEP geography framework [5] reflected the heightened need for geographic knowledge and skills that has arisen as the world has become increasingly interconnected through technological advancement and shared concerns about economic, political, social, and environmental issues. The 1994 geography framework required students to reach far beyond place-name geography. It called for an assessment in which students would demonstrate an ability to work with the tools of geography, which include maps, aerial photographs, atlases, and graphs. The intent was to give students access to information conveyed through these tools, and ask them to use this information to understand and explain complex relationships and systems, such as ecosystems, communications networks, and urban infrastructures. In addition, students were expected to construct geographic representations, such as maps and diagrams, from narrative descriptions.

Like the 1994 U.S. history framework, the 1994 NAEP geography framework was organized by a matrix of two interrelated dimensions: content and the cognitive demands of the discipline. Content was divided into the following three areas corresponding to the major branches of geographic study:

  1. Space and Place. This area includes knowledge of geography as it relates to particular places on Earth, to spatial patterns on Earth's surface, and to physical and human processes that shape such spatial patterns.

  2. Environment and Society. This area includes knowledge of geography as it relates to the interactions between environment and society.

  3. Spatial Dynamics and Connections. This area includes knowledge of geography as it relates to spatial connections among people, places, and regions.

The cognitive dimension of the framework specified the kinds of thinking expected of students as they deal with specific geography content. The dimension was organized into the following categories:

  1. Knowing. Tasks in this area are generally meant to measure students' ability to observe different elements of the landscape and to answer questions by recalling information.

  2. Understanding. In this area, students are asked to attribute meaning to what has been observed and to explain events.

  3. Applying. This area of thinking calls on students to use many tools and skills of geography as they attempt to develop a comprehensive understanding of a problem en route to proposing viable solutions.

Figure 1.2 illustrates the matrix formed by the content and cognitive dimensions of the assessment by presenting sample tasks and questions. The assessment addressed each cognitive process in each content area.

Figure 1.2: 1994 NAEP Geography Assessment Framework Dimensions


1994 NAEP Geography and U.S. History Assessments

Guided by these new and more forward-looking frameworks, the 1994 NAEP assessments in geography and U.S. history shared a number of innovative and important characteristics. These included the following:

  • Both assessments used a wide range of authentic materials as stimuli for assessment questions. These included an atlas, maps, charts, graphs, tables, text-based primary source documents and literary works, and various art forms, including photographs, paintings, cartoons, and posters. Overall, 76 percent of the questions in the geography assessment and 56 percent in the history assessment involved working with such stimuli.

  • Both assessments assessed a range of skills related to these stimuli. In addition to straightforward interpretative exercises, students were frequently asked to synthesize information from multiple stimuli or to use outside knowledge in order to interpret a given stimulus. They might, for example, be asked both to describe the data in a table and to draw on outside knowledge to give factually accurate explanations for the patterns revealed. In this way, stimulus interpretation was not artificially separated from content knowledge.

  • Both assessments included performance tasks. In a number of exercises students were asked to create maps or graphs based on narratives or tables of quantitative data.

  • Because of the use of performance assessment and the requirement in the frameworks that the geography and history surveys measure broad ranges of content and skills, the overall assessments were too long for any one student to complete. For example, the grade 8 United States history assessment would have taken over four hours for an individual student. For these reasons, NAEP used a design in which participating students took only subsets of the aggregate item pool (specifically, in both geography and U.S. history individual students were tested for 50 minutes). [7] However, it is important to remember that a key feature of this design is that the samples of students who were presented each exercise were representative of the school population at a given grade. This design does not allow for the accurate computation of individual assessment scores; it does allow for the estimation of group performance on the assessment as a whole, on sets of questions, and on individual assessment exercises.

The 1994 NAEP assessments included multiple-choice questions, short constructed-response questions, and extended constructed-response questions. Table 1.1 shows the distribution of questions by grade and format for both the 1994 NAEP geography and U.S. history assessments.

Table 1.1: Number of Questions by Grade Level and Format in the 1994 NAEP Geography and U.S. History Assessments

For constructed-response questions, students provided written responses or performed tasks, such as constructing a graph. Each constructed-response question was scored according to a scoring guide that gave varying degrees of credit for correct or partially correct answers. Short constructed-response questions were scored according to three-level scoring guides in which a "Complete" score represented a complete and appropriate answer, a "Partial" score indicated that the response had some, but not all, of the components of an appropriate response, and an "Inappropriate" score represented an answer that had none of the components of an appropriate response.

Extended constructed-response questions were lengthier and more complex exercises that allowed for a finer level of discrimination in scoring the responses. Responses were scored according to four-part scoring guides in which a "Complete" score was assigned to a response that was complete and appropriate; an "Essential" response was less complete but included the most important components of an appropriate response; a "Partial" response included some appropriate components, but fewer or less central ones than those required for an "Essential" score; and an "Inappropriate" response included only inappropriate material.

As with all NAEP assessments, the schools and students participating in the 1994 geography and history assessments were selected through scientifically designed, stratified random sampling procedures. Approximately 19,000 fourth, eighth, and twelfth graders in 1,500 public and nonpublic schools across the country participated in the 1994 geography assessment, and approximately 22,000 fourth, eighth, and twelfth graders in 1,500 public and nonpublic schools participated in the 1994 U.S. history assessment. Detailed reports on the assessment procedures and results of these assessments are presented in two separate publications from the National Center for Education Statistics: NAEP 1994 U.S. History Report Card and NAEP 1994 Geography Report Card.


Orientation of This Report

To examine students' ability to understand and use a variety of tools and resources, assessment questions were categorized according to the type of tool or resource, if any, that served as the stimulus - that is, the textual or graphic material provided to students which was to be considered in the formulation of their responses. Based on the nature of the stimulus, questions were placed into one of six categories:

  • atlases

  • maps

  • primary source documents

  • charts, graphs, and tables

  • photographs

  • art, which included paintings and cartoons.

For this report, questions that did not use a stimulus, used more than one type of stimulus, or used a stimulus that could not easily be classified into one of the six categories established, were excluded from analytical consideration.

Table 1.2 shows that - except for atlas questions (which were restricted to geography) and art questions (which were restricted to U.S. history) - the 1994 NAEP geography and U.S. history assessments both included questions in each of the different stimulus categories. It is also evident from the table, however, that some stimuli are more central to the study of one discipline than to another; for example, there were substantially more map questions in the geography assessment than in the U.S. history assessment, and more primary source documents in the U.S. history assessment than in the geography assessment.

Table 1.2: Number of Questions by Stimulus Type, Assessment, and Grade Level

One set of geography questions at each grade level was based on a Nystrom Classroom Atlas. [8] The questions in the atlas block were all categorized as atlas questions, whether they employed maps, charts, or graphs as stimuli. The primary source document category was restricted to only those questions that utilized text-based primary sources. Although photographs and paintings also are kinds of primary sources, they were categorized separately.

In addition to an introduction and a concluding chapter, this report includes five chapters corresponding to specific categories of stimuli (photography and art are discussed together in one chapter). Each chapter begins with a general description of the tool or resource. Teacher and student responses to NAEP questionnaires about classroom practices related to the relevant tools or resources are also provided in many of the chapters.

For the two categories with the largest numbers of questions (maps and primary source documents), the descriptive introduction to the chapter is followed by a discussion of student performance on the group of stimulus questions as a whole and in comparison to all other questions in the geography or U.S. history assessment. [9] The remainder of each chapter is devoted to a detailed discussion of representative exercises. These exercises exemplify the skills that students were required to demonstrate in order to answer the questions included in the specific tool or resource category. [10] Information on how students performed on these individual questions is provided along with samples of actual student responses to constructed-response questions. Although there were questions from nearly all of the stimulus groups in each of the assessments, all but one of the chapters (charts, graphs, and tables) focus on questions from the assessment in which they predominated.

Although the report is structured around particular tools and resources, the reader should bear in mind that many factors besides the stimulus influence question difficulty. Both the content knowledge and the cognitive skills required to respond correctly are inextricably intertwined in the assessment task, as they are in the practice of the disciplines of geography and history. For example, the difficulty of most of the map questions may result from the demand that students use their geographical knowledge and a number of different map reading skills to correctly respond. Similarly in history, the difficulty of any particular item did not necessarily depend on the nature of the stimulus, but rather on the content and conceptual information students needed to have to understand, interpret, and respond fully to the question.


Interpreting NAEP Results

Student responses were analyzed to determine the percentage of students responding correctly to each multiple-choice question and the percentage of students responding in each of the score categories for constructed-response questions. Weighting procedures were then applied to arrive at overall population percentages and percentages for subgroups of students. The percentages are estimates because they are based on samples rather than on the entire population. As such, the results are subject to a measure of uncertainty that is reflected in the standard errors of the estimates. Standard errors provide a measure of how much survey results would be expected to vary if a different but equally valid sample of students were chosen. These standard errors are presented in parentheses along with the estimated percentage-correct scores in tables throughout this report. [11] In the following chapters, all comparisons among question types or between subgroups of students are based on statistical tests that consider both the magnitude of the differences between the average percentages and their standard errors. Throughout this report, differences are discussed only when they are significant from a statistical perspective. This means that observed differences are unlikely to be due to chance factors associated with sampling variability. All differences are significant at the .05 level with appropriate adjustments made for multiple comparisons. The term "significant," therefore, is not necessarily intended to imply judgment about the absolute magnitude or educational relevance of the differences. The term is intended to identify statistically dependable population differences as an aid in focusing subsequent dialogue among policy makers, educators, and the public.
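The way standard errors enter these comparisons can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not name the specific test or the adjustment for multiple comparisons, so the sketch assumes a two-sided z-test on two independent estimates and a Bonferroni adjustment. The function names and the sample figures are hypothetical, apart from the 50 percent and 0.5 standard error example taken from the report's note on standard errors.

```python
from math import sqrt
from statistics import NormalDist


def two_se_interval(estimate, se):
    """Return the range of plus or minus two standard errors around an estimate."""
    return (estimate - 2 * se, estimate + 2 * se)


def differs_significantly(p1, se1, p2, se2, alpha=0.05, comparisons=1):
    """Two-sided z-test on the difference between two estimated percentages.

    ASSUMPTIONS (not stated in the report): the two groups are independent,
    and multiple comparisons are handled with a Bonferroni adjustment.
    """
    # Standard error of the difference between two independent estimates.
    se_diff = sqrt(se1 ** 2 + se2 ** 2)
    z = (p1 - p2) / se_diff
    # Critical value for a two-sided test, tightened for the number of comparisons.
    critical = NormalDist().inv_cdf(1 - alpha / (2 * comparisons))
    return abs(z) > critical


# The report's own example: 50 percent with a standard error of 0.5.
low, high = two_se_interval(50.0, 0.5)
print(low, high)

# Hypothetical subgroup percentages, not NAEP results.
print(differs_significantly(53.0, 1.0, 50.0, 1.0))                  # single comparison
print(differs_significantly(53.0, 1.0, 50.0, 1.0, comparisons=10))  # ten comparisons
```

In this sketch the same three-point difference passes the test when it is the only comparison but fails once the threshold is adjusted for ten comparisons, which is why a difference that looks large in a table may still go undiscussed.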


Overall Assessment Results

Average percent-correct performance for the assessments as a whole is shown in Table 1.3. Average percent correct, as used here, represents a different summary metric than the scale scores used in NAEP report cards, and is designed to give readers a concrete sense of student performance on the specific exercises making up the assessments. Average percent correct is determined by obtaining the mean item score for each assessment question and averaging these over the full set of exercises. For multiple-choice and dichotomously scored constructed-response questions (that is, constructed-response questions scored on a two-part scale), the statistic represents the percentage of students who answered the question correctly. For polytomously scored exercises (that is, short and extended constructed-response questions scored on either a three-point or four-point scale), the statistic represents the average score expressed as a percentage of the maximum possible score. Because the NAEP design ensures that each item is administered to a representative subset of the full sample, the averages presented in Table 1.3 provide a consistent estimate of the average item score that would be obtained if students were administered the entire assessment.
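The calculation described above can be sketched as follows. This is a simplified illustration, not the NAEP procedure itself: it assumes score levels are coded as integers from 0 up to the maximum for each question (for example, 0 to 2 on a three-level guide), and it omits the sampling weights that NAEP applies. The function names and all response data are hypothetical.

```python
from statistics import mean


def item_percent(scores, max_score):
    """Mean item score expressed as a percentage of the maximum possible score."""
    return 100.0 * mean(scores) / max_score


def average_percent_correct(items):
    """Average the per-question percentages over the full set of exercises."""
    return mean(item_percent(scores, max_score) for scores, max_score in items)


# Hypothetical responses (not NAEP data); each entry is (scores, maximum score).
items = [
    ([1, 0, 1, 1], 1),  # multiple choice: 3 of 4 students correct
    ([2, 1, 0, 1], 2),  # short constructed response, assumed coded 0 to 2
    ([3, 0, 0, 3], 3),  # extended constructed response, assumed coded 0 to 3
]

result = average_percent_correct(items)
print(round(result, 1))
```

Because every question is first converted to a percentage of its own maximum, a four-level extended question carries the same weight in the average as a single multiple-choice question.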

At every grade, the overall geography performance of males was higher than that of females; however, in U.S. history, overall performance for males and females was significantly different at twelfth grade only, where males slightly outperformed females. In both subject areas, the performance of White students was higher than that of Black students and Hispanic students. [12]

Table 1.3: Average Item Score for 1994 NAEP Geography and U.S. History Assessments

In the chapters that follow, statistically significant differences in performance by gender and racial/ethnic subgroups are noted only when these differences vary from those observed for the geography and U.S. history assessments as a whole.


  1. Persky, H. R., Reese, C. M., O'Sullivan, C. Y., Lazer, S., Moore, J., & Shakrani, S. (1996). NAEP 1994 geography report card. National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

    Beatty, A. S., Reese, C. M., Persky, H. R., & Carr, P. (1996). NAEP 1994 U.S. history report card. National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

    Williams, P. L., Lazer, S., Reese, C. M., & Carr, P. (1995). NAEP 1994 U.S. history: A first look. National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

    Williams, P. L., Lazer, S., Reese, C. M., & Shakrani, S. (1995). NAEP 1994 geography: A first look. National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.

  2. Educational Testing Service. (1987). U.S. history objectives: 1988 assessment. Princeton, NJ: Author.

  3. National Assessment Governing Board. (1992). U.S. history framework for the 1994 National Assessment of Educational Progress. Washington, DC: Author.

  4. Ibid.

  5. National Assessment Governing Board. (1992). Geography framework for the 1994 National Assessment of Educational Progress. Washington, DC: Author.

  6. Ibid.

  7. Most students answered between 30 and 35 questions, which represents a subset of the total asked. The total numbers of questions in the assessments were 90, 125, and 123 at grades 4, 8, and 12 in geography, and 94, 148, and 156 at the three grades in U.S. history.

  8. World atlas: A resource for students. (1992 ed). (1990). Chicago, IL.: NYSTROM, Div. of Herff Jones.

  9. This comparison is limited to these two stimulus groups because the number of items included in the other four groups was too small to allow valid comparisons.

  10. Most of the questions in the 1994 NAEP assessments were not released for public review. Therefore, exercises shown in this report were drawn only from the portion of the 1994 NAEP surveys selected for public release.

  11. The standard errors in this report should be interpreted in the following fashion: There is a 95 percent probability that a statistic for a population of interest is within two standard errors of the mean reported. For example, if we report that 50 percent of female students answered a question correctly and that the standard error is 0.5, then there is a 95 percent chance that the appropriate statistic falls between 49 and 51 percent.

  12. There were insufficient sample sizes for the American Indian, Asian, and Pacific Islander racial/ethnic subgroups to produce reliable results. Consequently, racial subgroup information is only provided for White, Black, and Hispanic subgroups.


NCES 98-581 Ordering information

Last updated 23 March 2001 (RH)
