Problem Solving in Technology-Rich Environments:
A Report From the NAEP Technology-Based Assessment Project

August 2007

Authors: Randy Elliot Bennett, Hilary Persky, Andrew R. Weiss, and Frank Jenkins


Executive Summary

The Problem Solving in Technology-Rich Environments (TRE) study is the last of three field investigations in the National Assessment of Educational Progress (NAEP) Technology-Based Assessment Project, which explores the use of new technology in administering NAEP. The TRE study was designed to demonstrate and explore an innovative use of computers for developing, administering, scoring, and analyzing the results of NAEP assessments. The prior two studies, Mathematics Online (MOL) and Writing Online (WOL), compared online and paper testing in terms of issues related to measurement, equity, efficiency, and operations.

In the TRE study, two extended scenarios were created for measuring problem solving with technology. These scenarios were then administered to nationally representative samples of students. The resulting data were used to describe the measurement characteristics of the scenarios and the performance of students.

The context for the problem-solving scenarios was the domain of physical science. The TRE Search scenario required students to locate and synthesize information about scientific helium balloons from a simulated World Wide Web environment. The TRE Simulation scenario required students to experiment to solve problems of increasing complexity about relationships among buoyancy, mass, and volume; students viewed animated displays after manipulating the mass carried by a scientific helium balloon and the amount of helium contained in the balloon. Both scenarios targeted grade 8 students who were assumed to have basic computer skills; basic exposure to scientific inquiry and to concepts of buoyancy, mass, and volume; and the ability to read scientifically oriented material at a sixth-grade level or higher.

In the TRE study, data were collected from a nationally representative sample of grade 8 students in the spring of 2003. Over 2,000 public school students participated, with approximately 1,000 students taking each assessment scenario. (See appendix B for detailed information about the TRE sample selection.) Students were assigned randomly within each school to one of the scenarios—Search or Simulation. Students took the scenarios on school computers via the World Wide Web or on laptop computers taken into the schools. For both scenarios, data were collected about student demographics; students’ access to computers, use of computers, and attitudes toward them; and students’ science coursetaking and activities in school.

Methodology

The TRE study used Evidence-Centered Design (ECD) (Mislevy, Almond, and Lukas 2003) to develop the interpretive framework for translating the multiplicity of actions captured from each student into inferences about what populations of students know and can do. In ECD, the key components of the interpretive framework are student and evidence models. The student model represents a set of hypotheses about the components of proficiency in a domain and their organization. The evidence model shows how relevant student actions are connected to those components of proficiency, including how each relevant action affects belief in student standing on each proficiency component. The structure provided by ECD is particularly important for complex assessments like TRE, for which meaningful inferences must be drawn based on hundreds of actions captured for each student.

For the purposes of TRE, the student model represented the components of student proficiency in the domain of problem solving in technology-rich environments. Two primary components were postulated: scientific inquiry and computer skills. Scientific inquiry was defined as the ability to find information about a given topic, judge what information is relevant, plan and conduct experiments, monitor efforts, organize and interpret results, and communicate a coherent interpretation. Computer skills were defined as the ability to carry out the largely mechanical operations of using a computer to find information, run simulated experiments, get information from dynamic visual displays, construct a table or graph, sort data, and enter text.

Evidence of these skills consisted of student actions called “observables.” Observables were captured by computer and judged for their correctness using scoring criteria called “evaluation rules,” and summary scores were created using a modeling procedure that incorporated Bayesian networks (Mislevy et al. 2000). Bayesian models belong to a class of methods particularly suited to the TRE scenarios because these methods account for multidimensionality and local dependency, neither of which is explicitly handled by the measurement models typically used in NAEP assessments.
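
The way a scored observable can shift belief about a student's standing on a proficiency component can be sketched with a very small Bayesian model. The sketch below is not the network used in the TRE study: it assumes a single proficiency variable with three levels, treats observables as conditionally independent given that level (so it does not model local dependency), and uses hypothetical probabilities and names (`LEVELS`, `update_belief`, `score_student`) chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical proficiency levels for one component (e.g., scientific inquiry).
LEVELS = ("low", "medium", "high")

Belief = Dict[str, float]


def update_belief(prior: Belief, p_correct: Belief, is_correct: bool) -> Belief:
    """Apply Bayes' rule for one scored observable.

    prior      -- current probability of each proficiency level
    p_correct  -- assumed P(observable scored correct | level)
    is_correct -- result of applying the evaluation rule to the student's action
    """
    unnormalized = {}
    for level in LEVELS:
        likelihood = p_correct[level] if is_correct else 1.0 - p_correct[level]
        unnormalized[level] = prior[level] * likelihood
    total = sum(unnormalized.values())
    return {level: value / total for level, value in unnormalized.items()}


def score_student(prior: Belief, evidence: List[Tuple[Belief, bool]]) -> Belief:
    """Update belief over a sequence of (p_correct, is_correct) observables."""
    belief = dict(prior)
    for p_correct, is_correct in evidence:
        belief = update_belief(belief, p_correct, is_correct)
    return belief


# Illustrative values only; none of these numbers come from the report.
uniform_prior = {"low": 1 / 3, "medium": 1 / 3, "high": 1 / 3}
used_relevant_terms = {"low": 0.2, "medium": 0.5, "high": 0.8}

posterior = score_student(uniform_prior, [(used_relevant_terms, True)])
print(posterior)
```

A correct response on an observable that high-proficiency students are more likely to get right moves probability toward the "high" level; an incorrect response moves it the other way. A full evidence model would link each observable to several proficiency components and allow dependencies among observables.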

The TRE Scenario Scales and Results

Because the TRE study used measures that are experimental, data were analyzed to explore how well the TRE scenario scales captured the skills they were intended to summarize. For each scenario, the following measures were obtained: internal consistency; the relations of student scores to students’ prior knowledge; the TRE scale intercorrelations; the correlations of each observable with each subscale; the locations of the observables on the scales; the response probabilities for prototypic students (i.e., hypothetical students with low, medium, and high levels of proficiency); and the relations of relevant student background information to performance. Results were considered to be statistically significant if the probability of obtaining them by chance alone did not exceed the .05 level.
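
As an illustration of what an internal consistency estimate summarizes, the sketch below computes Cronbach's alpha, one widely used index. This summary does not state which index the TRE study used, so the choice of alpha is an assumption, as are the function name `cronbach_alpha` and the tiny data set.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a students-by-items score matrix.

    scores -- one row per student, one column per item (observable).
    Uses alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    # Sample variance of each item (column) across students.
    item_variances: List[float] = [
        variance([row[i] for row in scores]) for i in range(k)
    ]
    # Sample variance of each student's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores for three students on two items (not TRE data).
example_scores = [[0, 0], [1, 1], [2, 2]]
print(cronbach_alpha(example_scores))
```

In this toy example the two items rank students identically, so alpha reaches its maximum of 1. Values such as those reported for the TRE scales indicate items that agree substantially but not perfectly.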

Readers are reminded that the TRE project was intended as an exploratory study of how NAEP can use technology to measure skills that cannot be easily measured by conventional paper-and-pencil means. This report will discuss the ability of a nationally representative student sample to solve problems using technology in the TRE context. However, the results pertain to student performance in only two scenarios employing a limited set of technology tools and a range of science content sufficient only for demonstration purposes. Therefore, results cannot be generalized more broadly to problem-solving in technology-rich environments for the nation’s eighth-graders.

The TRE Search Scenario Scales and Results

TRE Search consisted of 11 items (or observables) and produced a total score and two subscores, scientific inquiry and computer skills.

  • The internal consistency of the three TRE Search scores (total, scientific inquiry, and computer skills) ranged from .65 to .74, as compared to .62 for the typical main NAEP science assessment hands-on task block, which, although measuring skills different from TRE, also includes extended, problem-solving tasks.
  • The Search scores provided overlapping but not redundant information; the (disattenuated) intercorrelation of the subscores was .57. This value contrasts with intercorrelations of .90 to .93 for the main NAEP science assessment scales.
  • In the student sample, the scientific inquiry skill scale score was most strongly related to the following scale observables: the relevance of the World Wide Web pages visited or bookmarked, the quality of the constructed response to a question designed to motivate students to search for and synthesize information from the Web, and the degree of use of relevant search terms (correlations between performance on each observable and the scale score ranged from r = .51 to .71).
  • In the student sample, the computer skills scale score was related primarily to the following scale observables: the use of hyperlinks, the use of the Back button, the number of searches needed to get relevant hits (an efficiency measure), and the use of bookmarking (correlations ranged from r = .60 to .69).
  • Statistically significant differences in performance were found on one or more TRE Search scales for NAEP reporting groups categorized by race/ethnicity, parents’ highest education level, students’ eligibility for free or reduced-price school lunch, and school location. No significant differences were found, however, for reporting groups categorized by gender.
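
The disattenuated intercorrelation mentioned in the list above is an observed correlation corrected for measurement error in both scores. A minimal sketch of the standard correction (dividing the observed correlation by the square root of the product of the two reliabilities) follows. The function name `disattenuate` and the input values are hypothetical; the summary does not report the observed correlation or say which reliability belongs to which subscore.

```python
from math import sqrt


def disattenuate(observed_r: float, reliability_x: float, reliability_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures.

    Standard correction for attenuation:
        r_corrected = r_observed / sqrt(reliability_x * reliability_y)
    """
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must be in the interval (0, 1]")
    return observed_r / sqrt(reliability_x * reliability_y)


# Hypothetical inputs for illustration only (not values from the report).
print(disattenuate(observed_r=0.40, reliability_x=0.64, reliability_y=1.0))
```

Because reliabilities are at most 1, the corrected value is never smaller in magnitude than the observed one. A disattenuated correlation well below 1 therefore suggests that two subscores measure related but distinct skills, rather than appearing distinct only because of measurement error.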

The TRE Simulation Scenario Scales and Results

The TRE Simulation scenario consisted of 28 observables and produced a total score and three subscores: scientific exploration, scientific synthesis, and computer skills.

  • The internal consistency of the four scales ranged from .73 to .89, as compared to .62 for the typical main NAEP science assessment hands-on task block, which, although measuring skills different from TRE, also includes extended, problem-solving tasks.
  • The Simulation scores provided overlapping but not redundant information; the (disattenuated) intercorrelations of the subscores ranged from .73 to .74. These values contrast with intercorrelations of .90 to .93 for the main NAEP science assessment scales.
  • The scientific exploration skill scale score was most related in the student sample to three scale observables: which experiments students chose to run to solve the Simulation problems, whether students constructed tables and graphs that included relevant variables for solving the problems, and the degree to which experiments controlled for one variable in the one problem demanding controlled experimentation.
  • The scientific synthesis scale score was primarily related in the student sample to the degree of correctness and completeness of conclusions drawn for each Simulation problem.
  • Performance on the computer skills scale was related in the student sample mainly to the number of characters in the written responses students gave for each of the three Simulation problems.
  • Statistically significant differences in performance were found on one or more TRE Simulation scales for NAEP reporting groups categorized by race/ethnicity, parents’ highest education level, and students’ eligibility for free or reduced-price school lunch. No significant differences were found, however, for reporting groups categorized by gender or school location.

Download sections of the report (or the complete report) in a PDF file for viewing and printing:

  • PDF 1 of 8 contains:
    Executive Summary
    Foreword
    Acknowledgments
    Contents
    Introduction
    (also includes front matter)
    (284K PDF)

  • PDF 2 of 8 contains:
    Chapter 1: The TRE Construct Domain and Problem-Solving Scenarios
    (part 1: intro text and figures 1 through 12)
    (819K PDF)

  • PDF 3 of 8 contains:
    Chapter 1: The TRE Construct Domain and Problem-Solving Scenarios
    (part 2: figures 13 through 24)
    (846K PDF)

  • PDF 4 of 8 contains:
    Chapter 2: The TRE Interpretive Framework
    Chapter 3: The TRE Student Sample—Attitudes Toward and Experiences With Technology and the Nature of Science Coursework
    Chapter 4: Scoring TRE
    Chapter 5: The TRE Search Scenario Scales and Results
    Chapter 6: The TRE Simulation Scenario Scales and Results
    Chapter 7: Summary of Results
    References
    (445K PDF)

  • PDF 5 of 8 contains:
    Appendix A: Development Committee for the Problem Solving in Technology-Rich Environments (TRE) Study
    Appendix B: Sample Selection
    Appendix C: Technical Specifications for Participating Schools
    Appendix D: Prior Knowledge and Background Questions for Search and Simulation Scenarios
    Appendix E: TRE Simulation Glossary, Help, and Tutorial Screens
    (part 1: figures E-1 through E-11)
    (927K PDF)

  • PDF 6 of 8 contains:
    Appendix E: TRE Simulation Glossary, Help, and Tutorial Screens
    (part 2: figures E-12 through E-21)
    (650K PDF)

  • PDF 7 of 8 contains:
    Appendix E: TRE Simulation Glossary, Help, and Tutorial Screens
    (part 3: figures E-22 through E-34)
    (882K PDF)

  • PDF 8 of 8 contains:
    Appendix F: Bayesian Estimation in the Problem Solving in Technology-Rich Environments Study
    Appendix G: C-rater Rules for Scoring Students’ Search Queries
    Appendix H: TRE Search and Simulation Scale Scores and Percentiles by Student Reporting Groups for Scales on Which Statistically Significant Group Differences Were Observed
    Appendix I: Summary Statistics for Prior Knowledge Measures and Mean Scale Scores for Background-Question Response Options
    Appendix J: Performance on Problem Solving in Technology-Rich Environments (TRE) Observables
    Appendix K: Understanding NAEP Reporting Groups
    (792K PDF)

  • The complete PDF of Problem Solving in Technology-Rich Environments (TRE) (5284K PDF)

NCES 2007-466 Ordering information

Suggested Citation
Bennett, R.E., Persky, H., Weiss, A.R., and Jenkins, F. (2007). Problem Solving in Technology-Rich Environments: A Report From the NAEP Technology-Based Assessment Project (NCES 2007–466). U.S. Department of Education. Washington, DC: National Center for Education Statistics.

For more information, see The Problem Solving in Technology-Rich Environments (TRE) section on the Nation's Report Card website.

Last updated 23 July 2007 (RH)

National Center for Education Statistics - http://nces.ed.gov
U.S. Department of Education