Peggy G. Carr, Ph.D.
Commissioner of the National Center for Education Statistics
A Pragmatic Future for NAEP: Containing Costs and Updating Technologies
08/27/22
By the National Academies of Sciences, Engineering, and Medicine's Panel on Opportunities for the National Assessment of Educational Progress in an Age of AI and Pervasive Computation
The National Center for Education Statistics commissioned the National Academies of Sciences, Engineering, and Medicine to convene a panel of experts to provide advice on how "the use of digital technology and other major innovations could transform NAEP over the next ten years and beyond." The panel was also asked to consider the effects of proposed innovations on the cost of NAEP. The resulting report, A Pragmatic Future for NAEP: Containing Costs and Updating Technologies (March 2022), includes 20 recommendations spanning a wide range of topics, from operational aspects of NAEP such as item development, the assessment delivery platform, and test administration, to more structural facets such as program and cost management.
On balance, NCES welcomes the panel's recommendations. Many endorse efforts NCES has undertaken in recent years to modernize the program, and others point to promising new directions. Unfortunately, some of the recommendations rely on conclusions that are based on inaccurate assumptions; in particular, the panel misrepresents NAEP cost data in some instances. Nevertheless, NCES welcomes the opportunity to consider the panel's report and recommendations in shaping the future of the NAEP program. Below, the panel's recommendations are grouped thematically and followed by initial responses from NCES.
The panel reports that "despite cooperation from NCES, the panel could not obtain a clear picture of the overall budget for NAEP" (p. 11-1) and acknowledges that the budget figures presented in the report are the "panel's best estimate" (p. 3-3). In fact, the panel's estimates of some costs are simply inaccurate. For example, the panel dramatically overestimates management and planning costs, counting under that heading costs that actually belong to analysis and reporting and other core activities.
NCES recognizes that NAEP's contracting and cost structures are complex and difficult to understand. Thus, the panel's recommendation for more clarity about NAEP's cost structures (Rec. 2-1) is well-taken. It is important that key stakeholders, including the National Assessment Governing Board (NAGB) members who oversee policy for the program and the federal policymakers who fund it on behalf of taxpayers, understand the foundations of NAEP's cost structures. Accordingly, NCES will develop resources describing NAEP cost structures for stakeholders involved in oversight of the program and will convene a series of workshops for these stakeholders.
The panel also presents a confused picture of NAEP costs over time. Although the program's core budget, adjusted for inflation, has not increased since 2002, as shown in the report's Figure 2-1 (p. 2-8), the panel suggests that NAEP costs are rising and threatening the viability of the program. In addition, the panel suggests there are inefficiencies caused by the joint oversight of the program by NAGB and NCES, and it advocates an independent audit of program management and decision-making processes (Rec. 10-1) to improve efficiency and reduce costs (p. 11-11). NCES does not agree with this recommendation as stated. Congress included the overlapping responsibilities of NAGB and NCES for the form and content of the assessments and for the release and dissemination of results in NAEP's legislation as important safeguards for the independence of NAEP as a consistently fair and unbiased monitor of the nation's educational progress. However, we recognize that a thorough review of NAEP processes and costs, within the framework required by law, could identify efficiencies for NCES and NAGB to consider. Accordingly, NCES will commission an independent review of NAEP's cost structures, including recommendations for process improvements to reduce costs.
The panel also recommends establishing an identifiable research budget and program of activities for NAEP to support ongoing innovation (Rec. 10-2). We support this position. Currently, the program's research and operational budgets are not separated, which limits NCES' ability to fund essential research activities without compromising operational activities. We also agree with the panel's call for the program to increase the visibility and coherence of its research activities (Rec. 10-2). NCES will produce a white paper describing current and planned efforts to modernize the program and will post it on its website in a new research and development hub that disseminates information on NAEP research and development efforts.
The panel makes a set of recommendations concerning how the program designs its assessments to measure trends in achievement over time (Rec. 3-1 and 3-2), as well as potential new ways to integrate subject areas (Rec. 3-3). These are intriguing recommendations that NCES and NAGB, which is responsible for the content of the assessments, will explore. NCES and Governing Board staff will establish a working group to consider short-, mid-, and long-term innovation strategies for the program, including these three recommendations concerning the content of the assessments.
In its recommendations regarding item development, the panel calls for NCES to "examine the costs and scope of work in the item development contract" (Rec. 4-1) and to "move towards using more structured processes for item development to both decrease costs and improve quality," drawing from the achievement-level descriptions (Rec. 4-2). The panel also recommends an analysis of the value and cost of different item types (Rec. 4-3). Although the report misrepresents some item development costs, NCES agrees that studying item development processes to identify ways to improve efficiency would be beneficial. Similarly, although previous attempts at employing task models (Rec. 4-2) have proven costly, NCES is committed to reevaluating the use of task models with cost efficiency in mind. NAGB is also in the process of creating more detailed reporting achievement-level descriptions in reading and mathematics based on a study conducted earlier this year. NCES expects that these reporting achievement-level descriptions will better lend themselves to informing the item development process. As part of the independent review of NAEP's processes and costs (see the response to Rec. 10-1), NCES will include a review of item development processes, as the panel suggests.
As mentioned above, many of the panel's recommendations support current NCES plans. One example is Rec. 7-1, which endorses NCES' work to implement automated scoring in the mandated assessments and recommends that NCES consider automated scoring in other assessments administered to state-level samples. NCES currently plans to use automated scoring in reading in 2024 and will continue to explore its use in mathematics and then in other subjects in the coming years. NCES agrees with the panel that automated scoring will not only reduce scoring costs but also enhance and accelerate reporting, and possibly increase the consistency and fairness of scoring over time (p. 7-6). NCES will continue to pursue automated scoring.
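To make the mechanics concrete, the sketch below illustrates one common approach to automated scoring of constructed responses: fit a statistical model to responses already scored by human raters, then verify that the model's scores agree with held-out human scores before any operational use. This is a minimal illustration only; the responses, rubric scores, and model choices here are invented stand-ins, not NAEP data or NAEP's actual scoring models.

# Minimal automated-scoring sketch: train a text classifier on
# human-scored responses, then measure its agreement with human
# raters on held-out responses. All data below are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.pipeline import make_pipeline

train_responses = [
    "Water evaporates, condenses into clouds, and falls as rain.",
    "The sun heats water, it rises as vapor, then falls back as rain.",
    "Clouds are made of water that came from lakes and oceans.",
    "Rain happens when clouds get too heavy.",
    "Water goes up into the sky and comes back down.",
    "It rains because of clouds.",
    "The rain just comes from the sky.",
    "I do not know.",
]
train_scores = [2, 2, 1, 1, 1, 1, 0, 0]  # human ratings on a 0-2 rubric

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_responses, train_scores)

# Held-out responses, each scored independently by a human rater.
eval_responses = [
    "Evaporated water condenses in clouds and falls as rain.",
    "Clouds hold water that came from the oceans.",
    "Rain falls from the sky.",
]
eval_human = [2, 1, 0]

# Quadratic-weighted kappa is a standard human-machine agreement check.
machine_scores = model.predict(eval_responses)
print(cohen_kappa_score(eval_human, machine_scores, weights="quadratic"))

Operational systems typically add further safeguards, such as routing low-confidence responses to human raters and monitoring agreement over time; those details are omitted from this sketch.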
Another recommendation that speaks to enhanced and faster reporting is Rec. 8-1, in which the panel calls for more budget to be allocated to innovative analysis and reporting, including faster and easier dissemination of raw data and process data, improvements in the NAEP Data Explorer (NDE), and expanded use of contextual variables. NCES agrees that the program would benefit from devoting more of its budget to innovative analysis and reporting. Since its transition to digital reporting in 2013, NCES has, within current budget limits, leveraged technology to provide interactive and engaging data visualizations and dashboards in addition to the NDE. For example, state and district profile pages allow users to see how their jurisdictions' results compare with others across grades and subjects by clicking on maps and interactive tables. The achievement gaps dashboard allows users to see trends in gaps not only for traditional factors but also for cross-tabulated characteristics. Furthermore, the digital report cards showcase modern data visualizations to summarize results across jurisdictions, student groups, and contextual factors. NCES is always exploring ways to improve the dissemination of data and will prioritize this recommendation as part of the efforts of the joint working group with the Governing Board described in the response to Rec. 3-3.
NAEP's digitally based assessments are currently administered by professionally trained NAEP staff using NAEP-owned tablets. The panel recognizes that NAEP's current model is intended to reduce the burden on schools while maintaining the level of standardization that is deemed essential for NAEP (p. 11-4).
NAEP's strategic plan for advancements in its digital platform will allow the program to adopt a more cost-efficient approach that leverages school-owned devices and local school staff as test administrators. The panel endorses NAEP's vision in this realm (Rec. 5-1) and recommends that information about local devices be included in the data collection to allow NCES to explore the use of statistical techniques to produce estimates that generalize across devices (Rec. 5-2).
NCES has a series of studies in place to investigate local administration with local devices. In addition, NCES will bring the suggested investigation of statistical techniques (Rec. 5-2) to its advisory groups, including the Design and Analysis Committee (DAC) and the NAEP Validity Studies (NVS) Panel.
The panel also recommends a review of potential cost savings from local administration of the mandated assessments (Rec. 5-3). The program could indeed benefit from a review of projected cost savings as we approach the next contract cycle (2024 through 2029), when most of these savings will be realized, and NCES will continue to update these projections. The panel also recommends a local administration model for other assessments; NCES agrees and has included non-mandated assessments in its plans for local administration (Rec. 5-3).
Two other areas where the panel endorses NCES' current plans are the investigation of administering more than one NAEP subject to each student1 (Rec. 6-1) and the use of adaptive testing (Rec. 6-3). NCES fully agrees with the panel's view that the efforts to assess each student in multiple subjects "should be based primarily on its potential to reduce testing burden by reducing the number of sampled students and to understand dependencies in proficiency across subjects" (p. 11-7), and that adaptive testing should be investigated "for its potential to improve the precision of statistical estimates and the test-taking experiences for low-performing students" (p. 11-7). NCES agrees with Rec. 6-1 and 6-3 and will continue its plans to conduct studies in 2026 investigating these two design innovations together. Depending on the results of these studies, the design changes would be implemented in 2028.
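As a simplified illustration of the basic idea behind adaptive designs, the sketch below routes a student to an easier, on-level, or harder second block based on performance on an initial routing block. The block structure, number-correct scoring rule, and cut scores are hypothetical; NAEP's actual designs would be developed and validated through the 2026 studies described above.

# Minimal two-stage adaptive routing sketch.
# Cut scores and the scoring rule are illustrative only.

def score_block(item_responses):
    """Number-correct score on a block of right/wrong items."""
    return sum(item_responses)

def route_second_stage(routing_score, n_items, low_cut=0.4, high_cut=0.7):
    """Choose the second-stage block from routing-block performance."""
    fraction_correct = routing_score / n_items
    if fraction_correct < low_cut:
        return "easier block"   # better targeting for low performers
    if fraction_correct < high_cut:
        return "on-level block"
    return "harder block"

# Example: a student answers 5 of 12 routing items correctly.
first_block = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
print(route_second_stage(score_block(first_block), len(first_block)))
# -> "on-level block"

Matching second-stage difficulty to provisional performance in this way is what gives adaptive designs their potential to improve both the precision of statistical estimates and the test-taking experience for students far from the average difficulty of a fixed form.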
The panel also recommends that NCES commission an analysis of the tradeoff between NAEP's sample sizes and its statistical power (Rec. 6-2). In fact, NCES routinely conducts this type of analysis in advance of each assessment cycle. In response to Rec. 6-2, however, NCES will incorporate these analyses into NAEP's Technical Documentation on the Web (TDW), starting with the 2022 assessments, to make them more accessible.
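For readers unfamiliar with what such an analysis quantifies, a simplified version of the underlying relationship is sketched below; it ignores NAEP's clustered sampling design, which inflates variance by a design effect and thus increases the sample sizes NAEP actually needs. For two independent groups of size n each, the minimum detectable difference (MDD) in mean scale scores at significance level α and power 1−β is approximately:

\[
\mathrm{MDD} \;\approx\; \left(z_{1-\alpha/2} + z_{1-\beta}\right)\sqrt{\frac{2\sigma^2}{n}}
\]

where σ is the scale-score standard deviation and the z terms are standard normal quantiles. Because precision improves only with the square root of n, halving the detectable difference requires roughly quadrupling the per-group sample size, which is why sample-size decisions and power targets must be weighed together.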
Finally, the panel also recommends that efforts to coordinate NAEP with the international assessment programs should not be used as a strategy to reduce costs (Rec. 6-4). This recommendation is in line with our internal analysis. The value in coordination with the international assessments lies in potential linking studies based on shared samples or content, as the report also acknowledges (p. 6-7).
During the 2010s, the transition of large-scale assessments such as NAEP, PISA, and TIMSS, as well as state assessment systems, from paper-and-pencil to digitally based administration necessitated the development of digital platforms to deliver the assessments. The NAEP program developed "eNAEP," a platform intended to deliver assessments and collect data with minimal burden on schools across the range of digital capabilities in the nation's schools at that time. NCES is currently developing a new "Next-Generation" eNAEP that is integrated with other NAEP program processes to improve efficiency and reduce costs in areas ranging from item development to administration in the field.
The panel acknowledges NAEP's work on Next-Generation eNAEP (p. 11-9) but misinterprets some elements of the platform. NCES agrees with the panel's recommendation that Next-Gen eNAEP components should be custom-built only if rigorous analysis shows clearly large net benefits to this approach (Rec. 9-1). However, NCES does not agree with the panel's position that the current eNAEP lacks a contemporary data architecture, and the Next-Gen eNAEP development already follows the approach the panel describes. Wherever possible, mature open-source and commercial components are integrated into the platform. When no available technology meets NAEP's technical or security requirements, or when open-source and commercially licensed components must be integrated, custom-built software solutions are developed following open standards and best practices for reusability and cost savings. Software built by vendors or available in open-source libraries is evaluated regularly, and the Next-Gen eNAEP platform is based on informed build-versus-buy decisions that consider development cost and efficiency as well as the total cost of ownership.
NAEP's transition to digitally based administration reinforced the need to have the right expertise at the table when making decisions about technological innovations. NCES agrees with the panel's recommendations to ensure that there is adequate internal and external staff with software development expertise (Rec. 9-2) and to seek expert guidance from enterprise application developers and educational technologists who understand assessment technology platforms (Rec. 9-3). NCES is confident that the current vendor and internal staff have the requisite knowledge, skills, and experience.2 However, NCES needs more internal staff in this critical area and has requested that priority be given to hiring additional staff with backgrounds in enterprise application development to support these complex technical activities (Rec. 9-2). Moreover, NCES will commission a system review of Next-Gen eNAEP. In addition, NCES will establish an ongoing independent panel with an agenda to advise on, evaluate, and support continued improvement of the eNAEP platforms and supporting systems (Rec. 9-3).
Moving forward, NCES will continue to act on the recommendations in the final report with three major principles in mind: accountability, transparency, and modernization. The actions described above, which NCES will begin taking in the coming weeks, address the panel's recommendations and demonstrate NCES' commitment to these principles.
NCES appreciates the NASEM panel's work and the resulting report. The panel's recommendations and the actions NCES plans to initiate or continue in response to these recommendations will ensure the program's continued leadership into the future.
1 Currently, in NAEP assessments each student receives two 30-minute cognitive blocks of items from a single subject (e.g., reading or mathematics).
2 The platform development contractor has extensive expertise in enterprise, customer-responsive software development, cloud-based architecture, and agile development processes, including proven experience directly developing other state and large-scale test delivery engines and platforms for clients.