
A data model is a structured description of how data are represented, organized, and accessed in an information system. A data model defines the data objects contained in an information system, the relationships between these data objects, and how the information is to be used. Data models are often represented as entity relationship diagrams. They support the development of information or data management systems, enable the exchange of data, enhance application maintainability, and can be imported in parts or whole to reduce the time and cost of developing new systems or redesigning existing ones.
Data modeling consists of a family of techniques used to describe the types of information important to an enterprise. It is a critical tool in providing the information that enterprises need to thrive and to comply with government regulations and industry standards. Data modeling is no longer limited to databases nor is it still viewed as just a tool for information technology (IT) staff. Rather, it is an increasingly expected tool for designing, visualizing, and communicating about education data, data systems, and analyses.
The enterprise of education requires continuous and wide-ranging information-based conversations to facilitate everything from teaching and learning processes to making federal policy decisions. Data modeling makes it possible to organize that information so that it can be used effectively and efficiently.
The American National Standards Institute describes three kinds of data models: conceptual, logical, and physical.1
Database engineers and designers use physical and logical models to design and implement their systems. Conceptual models are convenient tools to help end users and other stakeholders discuss, understand, and participate in the design of systems, such as education data systems or education delivery systems. The Education Data Model is a conceptual data model.
The Education Data Model: Version 1 (PK-12) depicts a large portion of the information that should be contained in education software, information systems, and data warehouses. As a high-level conceptual model, the Education Data Model outlines the meaning of concepts and the relationships among concepts without imposing any particular logical or physical implementation approach. (See chapter 4, especially the Concept Map section, for further information about the model's content.)
The information in this conceptual model takes into account the processes associated with teaching, learning, and the business operations of education organizations. The Data Model focuses on granular information at the school and LEA levels, rather than upon aggregate statistics or indicators for accountability; for example, it includes "student" as an entity but not "total number of students." However, the Data Model includes information that is necessary to produce aggregate and other types of statistics. In brief, the Education Data Model is a catalogue of the data used in PK-12 education and a description of the relationships among those data.
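As a rough illustration of how granular records can yield aggregate statistics, the following Python sketch counts students per school from individual student records. The record fields, school names, and sample values are hypothetical and are not drawn from the Data Model itself.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRecord:
    # Hypothetical fields chosen for illustration only.
    state_id: str
    school: str


def enrollment_by_school(records):
    """Derive an aggregate (students per school) from granular records."""
    return Counter(record.school for record in records)


# Granular data: one record per student, no stored totals.
records = [
    StudentRecord("S001", "Lincoln Elementary"),
    StudentRecord("S002", "Lincoln Elementary"),
    StudentRecord("S003", "Washington Middle"),
]
totals = enrollment_by_school(records)
print(totals["Lincoln Elementary"])  # 2
```

The total is computed on demand rather than stored, which mirrors the choice to model "student" as an entity instead of "total number of students."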
Figure 1 depicts the concepts, relationships, and attributes of the Education Data Model. In the Data Model and the figure below, concepts are called entities (constructs that need to be tracked, measured, and described by software systems in order to support education processes). These entities have relationships to one another, reflecting the associations that exist between them in the real world. In addition, entities have attributes (information associated with an entity that can be measured, classified, or described) (see the Core Data Model Concepts section of Chapter 4). These entities, relationships, and attributes provide the basis for answering important education questions and designing complete information systems.
For instance, in the Education Data Model, teachers and students are entities. These entities are related because teachers provide services to students and, conversely, students receive services from teachers. Teachers have attributes such as full-time equivalency, certification, hire date, role, and content knowledge. Students have attributes including state ID, name, free and reduced price lunch eligibility, allergies, and course completion records. Later in this guide, the structure of the data model is discussed in greater detail, showing how the entities are organized into a broad taxonomy of classes and subclasses (see chapter 4 for further information).
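The teacher and student example can be sketched in Python as two entities, a handful of attributes, and one relationship. This is only an illustrative sketch: the class layout, attribute types, the teacher's name field, and the sample values are assumptions, and a conceptual model does not prescribe any such implementation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Student:
    # Attributes taken from the text; the types are assumptions.
    state_id: str
    name: str
    lunch_eligible: bool = False
    allergies: list = field(default_factory=list)


@dataclass
class Teacher:
    name: str  # assumed attribute, added so the example can print something
    full_time_equivalency: float
    hire_date: date
    role: str
    students: list = field(default_factory=list)  # "provides services to"

    def provide_services_to(self, student):
        """Record the teacher-to-student relationship."""
        self.students.append(student)


def teachers_of(student, teachers):
    """Read the same relationship from the student's side."""
    return [t for t in teachers if student in t.students]


ana = Student(state_id="S001", name="Ana")
ms_lee = Teacher("Ms. Lee", 1.0, date(2015, 8, 17), "Classroom teacher")
ms_lee.provide_services_to(ana)
print([t.name for t in teachers_of(ana, [ms_lee])])  # ['Ms. Lee']
```

The relationship is stored once and can be read in both directions, matching the statement that teachers provide services to students and students receive services from teachers.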
The Benefits of Conceptual Education Data Modeling
Comprehensive data models can be invaluable in helping education respond to changing business conditions. When used appropriately, conceptual data models can lead to more accurate data, which in turn support more effective use of data contained in business-intelligence and business-analytic tools. Education stakeholders at the LEA and SEA levels who understand basic data-modeling concepts and who work with effective data models are likely to be especially productive. Conversely, information workers who have to struggle with ineffective models or who are not literate in data models face several unhappy choices. They can rely solely on IT counterparts or vendor-specific solutions to assist them with data analyses, and may not get exactly the information they need; they can skip the analyses altogether; or, potentially the most dangerous option, they can base analyses on inaccurate or incomplete data. Some possible benefits of a well-developed data model are described in the following subsections.3
Process Design: The very process of collaboratively designing a data model has substantial benefits. Designing a local data model can bring together all stakeholders and give them a better understanding of their own data needs and those of their coworkers. The process can also help stakeholders better understand the needs and value of reporting authorities, and drive data decisions and uses with the input of stakeholders' diverse education expertise rather than depend on IT staff to make all determinations.
Communication with Stakeholders: Education shares with other endeavors the tendency to operate in various silos, based on the belief of each silo's owner that it represents unique needs and functioning. Data models, when used strategically, can stimulate communication among the educators who own the silos and with the various external stakeholders who request program information: policy makers, the media, local and state agencies outside of education, and parents. When stakeholders share a common and comprehensive picture of all the data needed in the education enterprise, individuals in the various stakeholder roles can more readily and accurately communicate their needs.
Employee Training: Comprehensive data models can provide significant professional development opportunities for data users. Staff who are vested owners of a developed data model become better informed in all aspects of the education enterprise and the various dependencies on information. In local systems development, each employee adds expertise to the model to ensure completeness and clearly articulate the data needs to be met. An additional benefit is that an understanding of the need for quality data becomes embedded in the culture of the entire organization.
IT Benefits: Accurate data models lead to more productive and usable data systems at all levels of the organization. Input from the eventual consumers of data systems enables IT professionals, including vendors, to design and implement systems that accurately depict the information needed to support daily operations, teaching and learning, and various data reporting demands. In short, it yields information systems that work.
Reporting and Operational "Alignment": Well-developed data models can have dramatic effects outside the daily operations in education. Data models can be used to align data collected and maintained in local systems with data needed for state and federal reporting, such as the accountability reporting for No Child Left Behind. When data models in local systems vary greatly from the data model of the mandating entity, the reporting process becomes inefficient and increases the burden on the reporting entities, leading to the complaint that schools are asked to report what is basically the same information over and over again. This extra burden ultimately affects data quality and can result in misinformation.
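One way to picture this kind of alignment is as a crosswalk between local field names and the names a reporting authority expects. The Python sketch below is a minimal illustration under that assumption. Every field name in it is hypothetical, and none comes from an actual state or federal reporting specification.

```python
# Hypothetical crosswalk from local field names to reporting field names.
LOCAL_TO_REPORTING = {
    "stu_id": "state_student_id",
    "lunch_status": "free_reduced_lunch_eligibility",
}

# Hypothetical set of fields the reporting authority requires.
REQUIRED_REPORTING_FIELDS = {
    "state_student_id",
    "free_reduced_lunch_eligibility",
    "grade_level",
}


def to_reporting_record(local_record):
    """Rename local fields to reporting names, dropping unmapped fields."""
    return {
        LOCAL_TO_REPORTING[key]: value
        for key, value in local_record.items()
        if key in LOCAL_TO_REPORTING
    }


def missing_fields(reporting_record):
    """List required reporting fields that the local data could not supply."""
    return sorted(REQUIRED_REPORTING_FIELDS - set(reporting_record))


local = {"stu_id": "S001", "lunch_status": "eligible", "homeroom": "12B"}
report = to_reporting_record(local)
print(missing_fields(report))  # ['grade_level']
```

When the local and reporting models are closely aligned, the crosswalk is short and the list of missing fields is empty; the more the two models diverge, the more translation and re-collection the reporting entity has to do.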
Risks Associated With Conceptual Modeling
One of the most common stumbling blocks in developing and using data models is viewing them first and foremost as IT projects. Successful data model projects first focus on the business and use case needs of eventual data consumers. A strength of the conceptual Education Data Model is that it is technology-independent and was developed by individuals in various stakeholder roles. Before technology becomes the focus of a data modeling project, the considerations described in the following subsections should be taken into account.4
Product Scope: If the scope of a data model project is not carefully thought out before other work begins, the project may end up with the wrong product, or with no product at all. During the development of the Education Data Model Version 1: PK-12, the task force carefully defined the scope of the two-year project. As a first step, the task force determined that the Data Model would focus only on PK-12 data and, more specifically, on the data interactions and definitions supporting the student, teacher, and course triad. Although it is expected that the Data Model will continue to develop over time, defining the initial scope allowed the task force to complete Version 1.
Incompleteness: As with data systems, a data model is never really complete. Instead, it should be viewed as dynamic information that provides ongoing opportunities for input and for addressing evolving information needs. In this project, the task force made every effort to identify the most basic and common data needs at the school and district levels, and left room to add future directions.
Generalization: In effective data model development, it is important that specific bits of data link via specific relationships to other bits of data. In building conceptual data models, it is easier to make general linkages between categories of entities than to look for the specific relationships that provide the most detailed information for users. Attaining this kind of specificity is difficult, but the standardization that results from using this kind of granular information is beneficial. The Education Data Model is built on specificity rather than on general linkages.
The Education Data Model is a comprehensive, localized, conceptual model that provides a generic blueprint for schools and districts. This blueprint enables schools to evaluate and improve instructional tools, communicate their needs to their umbrella agency or directly to vendors, enhance the movement of student information from one LEA to another, and, in the end, have better tools to inform instruction. Using a standard Education Data Model as a starting point contributes to a comprehensive understanding of the need for data, how data are used, and the questions that can be answered with the data. For instance, the Data Model helps to answer questions such as the following:
Schools and LEAs that have data models often rely on proprietary models developed by vendors and implemented in vendors' software applications. With more than two out of three (69.6 percent)5 districts in the United States enrolling fewer than 2,500 students, many LEAs either cannot afford proprietary data solutions or cannot afford to tailor purchased models to the needs of their education stakeholders. States may lack the financial or technical means to develop comprehensive data models for their LEAs.
Until now, there has not been a comprehensive, nonproprietary, generic education data model for use by schools, LEAs, and states to design or guide the selection of systems for instructional delivery, decision support, operations, reporting, and data warehousing. The Data Model serves as a tool to convene relevant stakeholders around the evaluation of data management processes at the school and LEA levels. The model helps agencies through the process of identifying requirements, and can then be used to communicate these requirements to vendors (or to internal staff in the case of a build solution). The model also provides a template against which agencies can assess the adequacy of proprietary systems to meet their specific needs.
The Data Model is intended to serve multiple purposes for multiple audiences. Some potential users who should find its benefits relevant to their roles and responsibilities include: