Forum Guide to Data Quality

Part One: Data Quality—Chapter 3: Best Practices for Collecting Quality Data

Agencies can implement best practices to ensure that they are collecting quality data. Best practices focus on the policies and processes for collecting data as well as the people responsible for the data. A focus on policies and processes can help agencies ensure that they have established standards and clear expectations for data collection, data quality checks, and data management. A simultaneous focus on people can help agencies foster communication, provide effective training, and ensure that all staff understand the purpose of the data collection and its use.

Establish Data Standards and Guidelines

Standards and guidelines encourage respect for accurate and useful data. They define data elements as well as outline proper data entry procedures and business rules, and they should reflect state and federal requirements as well as local and state information needs.

Forum Guide to Data Governance

https://nces.ed.gov/forum/pub_2020083.asp

This Forum guide highlights the multiple ways that data governance programs can benefit education agencies. It addresses the management, collection, use, and communication of education data; the development of effective and clearly defined data systems and policies to handle the complexity and necessary protection of data; and the continuous monitoring and decision-making needed in a regularly shifting data landscape.

Agencies benefit from establishing guidelines for how data should be submitted and from ensuring that the guidelines are available to all staff who work with the data. Guidelines may include file requirements that provide information on rules for the submission process and procedures for how to handle situations such as missing data. These are useful even if datasets are small and informal. For example, establishing a standard way to represent a missing value in a dataset saves staff the time of seeking out the answer and helps to avoid situations where staff may enter different values.
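As a minimal sketch of the missing-value guideline described above, the following Python example maps several ad hoc "missing" entries to one standard representation. The token list, the field names, and the choice of None as the standard value are illustrative assumptions, not part of any agency's actual guidelines.

```python
# Hypothetical guideline: every missing value is stored as None.
# The tokens below are illustrative examples of what staff might otherwise enter.
MISSING_TOKENS = {"", "n/a", "na", "null", "none", "-", "unknown"}


def normalize_missing(value):
    """Return None for any entry the guideline treats as missing."""
    if value is None:
        return None
    text = str(value).strip()
    if text.lower() in MISSING_TOKENS:
        return None
    return text


def normalize_record(record):
    """Apply the missing-value rule to every field in one record."""
    return {field: normalize_missing(value) for field, value in record.items()}


# Example record with two different ways of writing "missing".
record = {"student_id": "1001", "home_language": "N/A", "birth_date": " "}
cleaned = normalize_record(record)
```

Applying one rule at the point of entry or import means that downstream reports only need to handle a single missing-value representation.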

Standards and guidelines are part of data governance, which refers to the formal and comprehensive set of policies and practices designed to ensure the effective management of data within an organization. Data governance is a foundational part of the data collection and reporting process, and it is concerned with the people, processes, and technology tied to maintaining safe and effective data management, use, analysis, and communication. Many state education agencies (SEAs) and local education agencies (LEAs) may find it useful to use the Common Education Data Standards (CEDS) to support accuracy in data entry and to facilitate data sharing.

Common Education Data Standards (CEDS)

https://ceds.ed.gov/

CEDS is a set of common data terms and definitions that allow data to be shared across platforms and among varied data teams and education levels. It also includes data models and several online tools to support users. CEDS serves as an important data management tool for many education agencies.

Agencies can help staff understand and adhere to standards in multiple ways, including

  • providing tools that help staff follow data collection procedures;
  • publishing a calendar of data submission due dates;
  • conducting audits to ensure the accuracy and completeness of data;
  • working with staff to locate and correct data errors;
  • offering training materials and manuals covering data definitions and collection procedures;
  • facilitating helpdesks to respond to questions and concerns;
  • sharing data collection changes, deadlines, and details through newsletters or other existing communications; and
  • establishing the primary source of the data and avoiding redundant collections or storage where multiple versions of the data can exist, causing inaccuracies.

At the federal level, the U.S. Department of Education (ED) provides multiple data quality resources for data submitters that specify ED standards and guidelines, including

  • standardized reporting and data validation instructions for data submission;
  • a standardized data dictionary;
  • tools built into the submission system to check for errors and to review metadata about datasets; and
  • guidance documents to assist with data submission.

Implement Data Checks

Data checks are used to confirm that data collection adhered to standards and guidelines. They “ensure the completeness, validity, and accuracy of any new data, and ensure these practices are integrated into the broader schedule and system for data cleaning and auditing.”[11] These data checks can be invaluable for ensuring accuracy. For example, in Pasco County Schools (FL), checks identified a student’s home that had been incorrectly mapped to the middle of the Atlantic Ocean after the district added longitude and latitude coordinates to students’ addresses.
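A check of the kind described in the coordinates example could look something like the following Python sketch, which flags student records whose coordinates are missing or fall outside an expected service area. The bounding box values, field names, and function names are hypothetical and chosen only for illustration; a real check would use the agency's own boundary data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A simple rectangular area defined by latitude and longitude limits."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Illustrative bounds only, not any district's actual boundary.
SERVICE_AREA = BoundingBox(min_lat=28.0, max_lat=28.6, min_lon=-82.9, max_lon=-82.0)


def flag_out_of_area(students, box=SERVICE_AREA):
    """Return the IDs of records with missing or out-of-area coordinates."""
    flagged = []
    for student in students:
        lat, lon = student.get("lat"), student.get("lon")
        if lat is None or lon is None or not box.contains(lat, lon):
            flagged.append(student["student_id"])
    return flagged


students = [
    {"student_id": "A1", "lat": 28.25, "lon": -82.45},
    {"student_id": "A2", "lat": 28.25, "lon": -40.0},  # far out in the ocean
    {"student_id": "A3", "lat": None, "lon": None},    # coordinates not captured
]
flagged = flag_out_of_area(students)
```

Flagged records are returned for staff review rather than corrected automatically, since the right fix depends on the source address.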

Explain the Purpose of the Data and Promote Data Use

Data Quality Dashboards

The Rhode Island Department of Elementary and Secondary Education uses data quality dashboards to provide data managers with a clear and private view of their own data to help identify and correct any errors.[12]

Agency staff who understand the purpose of the data they collect as well as how those data will be used to support students are more likely to enter the data carefully and precisely. Agencies can provide staff with information about the data's intended purpose and how the data meet the needs of data users. Data leaders can increase staff members’ recognition of the key components of data quality by maintaining focus on the “why” of the data: In short, the ultimate purpose of the data collected and used is to serve students and support their needs. Information on the purpose of the data and how the data will be used can also help information technology (IT) staff who are responsible for developing the technology that enables data collection, storage, and interoperability.

Tools and reports such as data dashboards and visualizations increase the visibility of agency data and encourage data use. In turn, data visibility and use demonstrate the importance of data and encourage staff to take ownership of data quality as well as to use the data to inform decision-making. The more staff use data, the more likely they will be to consider data quality.

Data Visualization

The Forum Guide to Data Visualization (https://nces.ed.gov/forum/pub_2017016.asp) offers practices to help education agencies communicate data meaning in visual formats that are accessible, accurate, and actionable for a wide range of education stakeholders.

Additional in-depth information is provided in the Forum Data Visualization Online Course (https://nces.ed.gov/forum/dv_course.asp).

Capture Data Context

For more information on how metadata can be used by education agencies to improve data quality and promote a better understanding of education data, including information on how to plan and successfully implement a metadata system in an education setting, see the Forum Guide to Metadata at https://nces.ed.gov/forum/pub_2021110.asp.

Metadata, or data that describe the data, are an essential part of data collections. Metadata provide data users with the appropriate context to understand the data’s purpose and significance, including any temporary or permanent changes to data definitions or collection practices that affect their use. For example, metadata are especially useful when there are changes in data from one year to the next, such as the widespread adoption of virtual education during the coronavirus disease (COVID-19) pandemic.

Provide Technology and Technical Support

Data validation tools help ensure accurate data at the time of entry and allow staff to verify and correct data as frequently as needed. Automated data checks at the field level can help prevent data quality issues, and reporting tools can help with identifying issues. These tools are effective only when staff use them regularly, because it is easier to correct data errors routinely throughout the year than to wait until the end of the school year. Some agencies offer these checks nightly.

Outdated systems can hinder data quality. Older systems may not have the same features and functions that can prevent, identify, and mitigate data entry errors. In Delaware’s student information system (SIS), outdated fields and duplicate fields have sometimes caused data quality issues when data sources do not align with the data system location where staff members enter updated data. Some codes were available on multiple screens, and the location where schools and districts updated them varied.

Collecting and managing data in a single system can help to eliminate the issue with multiple sources of information and confusion regarding which source is accurate. Even if data are collected in multiple systems, recognizing the authoritative source for the data is crucial. Interoperability solutions can help reduce redundant data entry and issues created due to data entry errors. These interoperability solutions will help create a more efficient situation where each data element has a single system in which the data element is managed, and all other systems become consumers of those changes.
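The idea of a single authoritative source per data element can be sketched in Python as a small registry that accepts changes to an element only from the system that manages it. The element names, system names, and function below are hypothetical and do not describe any particular interoperability product.

```python
# Hypothetical registry: each data element is managed in exactly one system.
AUTHORITATIVE_SOURCE = {
    "enrollment_status": "SIS",
    "meal_eligibility": "NutritionSystem",
}


def apply_update(store, element, value, source_system):
    """Accept an update only from the element's authoritative system.

    Returns True when the update is applied and False when it is rejected.
    """
    owner = AUTHORITATIVE_SOURCE.get(element)
    if owner is None:
        raise KeyError(f"No authoritative source registered for {element!r}")
    if source_system != owner:
        return False  # other systems consume this element; they do not change it
    store[element] = value
    return True


store = {}
accepted = apply_update(store, "enrollment_status", "enrolled", "SIS")
rejected = apply_update(store, "enrollment_status", "withdrawn", "NutritionSystem")
```

In this sketch, the second update is rejected because the nutrition system is a consumer of enrollment status, so the stored value still reflects the SIS.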

Agencies can use the functions of their data systems to increase data quality by

  • utilizing selections or dropdowns to enhance clarity and avoiding text fields wherever possible;
  • applying business rules or required formatting to data fields;
  • considering the workflow of the task and allowing for streamlined and organized data entry;
  • providing warnings or notifications when data outside of established limits or standards have been entered; and
  • ensuring interoperability to avoid redundant entry of the same data in multiple systems.
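Several of the system functions listed above (fixed selections, required formatting, and warnings for values outside established limits) can be illustrated with a short Python sketch. The grade codes, the eight-digit ID format, and the 180-day limit are assumptions made for the example, not requirements from any agency.

```python
import re

# Hypothetical field rules; codes, pattern, and limit are illustrative only.
GRADE_LEVELS = {"KG", "01", "02", "03", "04", "05", "06",
                "07", "08", "09", "10", "11", "12"}
STUDENT_ID_PATTERN = re.compile(r"\d{8}")
MAX_DAYS_ABSENT = 180


def validate_entry(entry):
    """Return (errors, warnings) for one data entry record."""
    errors, warnings = [], []

    # Selection rule: the value must come from a fixed list, as in a dropdown.
    if entry.get("grade_level") not in GRADE_LEVELS:
        errors.append("grade_level must be one of the listed codes")

    # Formatting rule: the ID must match the required pattern exactly.
    if not STUDENT_ID_PATTERN.fullmatch(str(entry.get("student_id", ""))):
        errors.append("student_id must be exactly 8 digits")

    # Limit rule: impossible values are errors; unusual values are warnings.
    days = entry.get("days_absent")
    if not isinstance(days, int) or days < 0:
        errors.append("days_absent must be a non-negative whole number")
    elif days > MAX_DAYS_ABSENT:
        warnings.append("days_absent exceeds the expected limit; please verify")

    return errors, warnings


ok_errors, ok_warnings = validate_entry(
    {"student_id": "12345678", "grade_level": "05", "days_absent": 200}
)
bad_errors, bad_warnings = validate_entry(
    {"student_id": "1234", "grade_level": "5", "days_absent": -1}
)
```

Separating errors from warnings lets a system block entries that break a rule while still allowing unusual but possible values to be saved after staff confirm them.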

Beaverton School District’s (OR) Information Technology (IT) Department convenes school staff responsible for entering data in the district’s student information system (SIS) six times a year. Because user groups have different responsibilities for data entry and quality assurance, meetings are role specific (that is, elementary office assistants, middle and high school office assistants, middle school registrars, and high school registrars). These regular meetings provide users with training and support prior to data entry efforts (for example, documenting immunization day exclusions or entering parent opt-out requests from state testing) as well as the opportunity to share best practices in job-alike roles. Agendas, handouts, and recordings of the meetings are posted for future reference.

Foster Communication Among Agencies

Many education data collections involve multiple agencies, such as the school where the data are initially documented; LEAs and SEAs where the data are collected, reviewed, and used; and in some cases, the federal government. Maintaining open lines of communication can improve data quality by providing a way for staff involved in the process to clarify needs and requirements.

Create Workable Calendars and Timelines

Stakeholders rely on quality data that are available when needed. Effective data calendars and submission timelines include consideration of when data can be collected, the time needed to check and adjust the data, and when the data can be certified. For example, quality enrollment data may not be available immediately at the start of the school year.

Communication can also help to ensure that new or updated data collections are implemented in ways that give agencies time to prepare. For example, new or modified data elements, data collections, and reporting frequently require changes to the SISs used by LEAs. Early communications involving the SEA, LEAs, and vendors are essential for making these changes effective for all involved. Communication between the LEA and the SEA is crucial to ensure that LEAs can provide quality data. SEAs provide support to LEAs when implementing a new collection by engaging the SIS vendors about changes and providing enough time for LEAs to alter systems as needed to capture or reformat the data. For example, in Texas, an initial step in the data governance process in developing new data collections is review and approval by a committee that is composed mainly of LEA staff but that also includes SIS vendors, regional education center support staff, and SEA stakeholders.

Train and Support Staff

Skilled staff with the tools and information needed to enter data correctly are essential to data quality. Agencies can support staff who enter or work with data by offering professional development and training sessions that clarify data standards and practices and by encouraging staff to attend them. While many agencies offer training to new staff, it is important that existing staff also have the opportunity for ongoing training to keep their data literacy skills up to date and ensure that they understand the purpose of the agency’s data collections.

Agencies are increasingly offering time for data stewards, information technology (IT) staff, program staff, and others to discuss data collections in meetings where they can offer support to one another and ensure that all aspects of an agency’s data collection procedures are aligned. It is also useful to involve these staff in discussions about changes to procedures and updates in policy. When monitoring errors, agency leaders should also communicate the need for data staff to submit data promptly, so that datasets can be corrected or SISs updated.

The ubiquity of data means that even staff who may not work with data systems on a day-to-day basis nevertheless interact with data in other ways; for example, data often inform many of the programs offered in schools. It benefits agencies to keep these staff informed about data collections so that they understand the importance of the collections and how data are used. Often, agencies achieve this by publishing data collection updates in school newsletters or sharing key information in staff meetings.

Minimize the Burden of Data Collections

When implementing new data collections or revising existing collections, a best practice is to consider the burden of the data collection on the agencies and staff impacted. While many data collections are mandated by law, others are not, and agencies should consider the best methods and timing for obtaining needed data. To minimize the time and administrative effort spent, it is useful to weigh the potential value of the new data against the time and resources needed to collect them. It also can help to clearly communicate why the additional burden is necessary by explaining why the data are needed and their intended value for instructional support, decision-making, and compliance with state or federal requirements.

Agencies can also collaborate to promote data quality and reduce reporting burdens by ensuring that new data elements added to existing or new collections are essential and do not duplicate data collected elsewhere, and by periodically reviewing data collections to remove elements that are no longer needed. For example, in Texas, by state statute, agency collections are reviewed for duplication with existing collections as an explicit step in the data governance process.

Align Data Collections with the Agency’s Data Strategy

The Forum Guide to Strategies for Education Data Collection and Reporting (SEDCAR) (https://nces.ed.gov/pubs2021/NFES2021013.pdf) provides timely and useful best practices for education agencies that are interested in designing and implementing a strategy for data collection and reporting, focusing on these as key elements of the larger data process.

A comprehensive data strategy is a robust, integrated approach to using data to deliver on a mission, serve stakeholders, and steward resources while respecting privacy and confidentiality. A data strategy enables education agencies to leverage data to improve education, increase agency effectiveness, facilitate oversight, and promote transparency. Data strategies encompass data principles and practices such as governance, access, privacy, security, dissemination, and use by internal and external stakeholders.



[11] U.S. Department of Education. Office of Elementary and Secondary Education. Data Quality Component 2: Data Quality. https://oese.ed.gov/resources/oese-technical-assistance-centers/state-support-network/resources/data-quality-component-2-data-quality/.
[12] U.S. Department of Education. Institute of Education Sciences. National Center for Education Statistics. (2021). SLDS Webinar Summary: Improving Data Quality: State Strategies to Enable Data Use. Retrieved September 1, 2023, from https://slds.ed.gov/#communities/pdc/documents/20386.