Forum Guide to Metadata
NFES 2009-805
July 2009

Chapter 1. What Are Metadata and Why Are They Important? - Metadata as a Component of Data Management

Metadata promise too much value as a business management tool to be dismissed because of the effort required to implement and maintain them.4

Some organizations rely on the experience of their data steward(s) as the primary source of information about their data. A metadata system is a better and more reliable alternative—and the only realistic way to effectively accomplish this vital information management task.

Metadata systems may not have been necessary when data sets were relatively small and simply organized. Under these circumstances, data were usually used by only a handful of people who were intimately familiar with each data element's definition, collection source, uses, limitations, and technical characteristics. Moreover, the metadata that did exist often were stored in a data steward's memory or a program manager's paper files, and could be easily passed along from one person to another as a part of the organization's oral and written history. But the education enterprise has grown in complexity over the past decades, resulting in the seemingly exponential growth of information collected, stored, managed, used, and reported. In the field of education, as with other industries, metadata have become a necessary component of sound data systems. Without a formal and systematic method for conveying these "data about data," how can data, technical, and program staff confirm that information needed to understand the data will be available in a timely manner and appropriate format?

Metadata provide context for a single data item; serve as the backbone for efficient data management; and improve the use, analysis, and management of a body of data.

A well-managed metadata system minimizes disruption to data management and use. It ensures that the descriptions, definitions, parameters, usage instructions, and history of each element are maintained in an accurate and up-to-date manner. Additionally, metadata are essential for bridging programs and databases because they provide the framework for data exchange and communication within and between organizations. Metadata also inform data policymaking (for example, data retention procedures) and technology planning (such as load time demands) throughout an organization.
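As a rough illustration of what such a system keeps for each element, the sketch below models a single metadata record in Python. The class name, field names, and sample absence-reason codes are hypothetical and are not drawn from any particular metadata standard or product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple


@dataclass
class ElementMetadata:
    """Hypothetical metadata record for one data element."""
    name: str
    definition: str
    permitted_values: Dict[str, str]  # code -> description
    usage_notes: str
    history: List[Tuple[date, str]] = field(default_factory=list)

    def record_change(self, when: date, note: str) -> None:
        """Append a dated entry so the element's history stays current."""
        self.history.append((when, note))

    def is_valid(self, value: str) -> bool:
        """Check a stored value against the element's permitted codes."""
        return value in self.permitted_values


# Illustrative values only; a real code set would come from the organization.
absence_reason = ElementMetadata(
    name="absence_reason",
    definition="Reason recorded for a student's absence from school.",
    permitted_values={"E": "Excused", "U": "Unexcused"},
    usage_notes="Code according to local attendance policy.",
)
absence_reason.record_change(date(2009, 7, 1), "Initial definition recorded.")
```

Keeping the definition, permitted values, usage notes, and change history together in one record is one simple way to make each element's context available to anyone who uses the data.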

The benefits of properly implementing a robust metadata system include

  • improving the likelihood of data meeting the users' information needs;
  • improving the efficiency of data access and integration;
  • improving the probability of correct data interpretation and use;
  • identifying what data exist (and where) throughout an organization;
  • identifying redundancy and disparity in data sets;
  • increasing the efficiency of data storage and maintenance;
  • improving the accuracy of data transfer across systems;
  • improving the application of business rules and edit checks;
  • reducing user expertise required to conduct effective queries;
  • advancing data quality;
  • ensuring the proper maintenance of information over time; and
  • improving the quality of data-driven decisionmaking in the organization.

Despite the potential value of metadata, many organizations have not yet chosen to develop a thorough metadata system. Organizational leaders may make this decision passively if they are unaware of the need, or they may actively decide not to address the issue. Organizations that intentionally decide not to develop a metadata system often do so because it would:

  • demand expertise that staff may not possess;
  • involve a great deal of work;
  • take a lot of time;
  • cost a fair amount of money;
  • require a thorough understanding of current data resources;
  • potentially expose existing deficiencies in data quality; and
  • involve long-term commitment that does not match short-term goals.

All of these reasons for not developing metadata systems are valid, up to a point. Developing a system is a substantial undertaking that requires significant time, expertise, commitment, and money. But like other time-, staff-, and resource-intensive initiatives, such as installing new networking systems or introducing new professional development programs, metadata systems should yield benefits that far outweigh the costs of implementation.

The consequences of neglecting metadata are many and severe. In the absence of a sound metadata system, the following types of serious data problems can, and often do, arise:

  • a single data element may be applied inconsistently within an organization (for example, some staff members may code an absence reason as "excused" while others code the same reason as "unexcused");
  • multiple conflicting definitions, code sets, and calculations may be used as though they are interchangeable even when they are not, such as different withdrawal codes or competing dropout rate formulas;
  • a data value may be reported differently on different surveys—for example, different graduation rates may be reported for the same school because of different calculation dates or formulas;
  • trend studies may not account for changes in definitions or policies that would otherwise influence analysis, such as changes in race/ethnicity categorization that might affect trends in student achievement;
  • a data item, or even an entire collection, may be maintained when it no longer provides useful information, placing an unnecessary burden on data collectors;
  • a new database may introduce terminology, definitions, and specifications that are not consistent with existing standards and protocols—for example, database designers may develop codes that will not be recognized by other users or systems in the organization;
  • a data initiative may be at greater risk of failure due to unidentified data quality issues—for example, the implementation of a new data warehouse project may be inefficient because the underlying data quality is poor or insufficiently understood;
  • policymakers may not thoroughly understand the data they are using—for example, they may not appreciate that there is a difference between the number of teachers expressed as a "head count" versus a "full-time equivalent" count; and
  • data may be misinterpreted—for example, a graph of assessment results may seemingly show that student performance is improving when the apparent change is actually related to a new testing instrument.
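
Several of the problems listed above, such as conflicting code sets, can be surfaced mechanically once each system's definitions are written down. The following is a minimal sketch, assuming each system's data dictionary is available as a plain Python mapping; the function name, source names, and withdrawal codes are hypothetical.

```python
def find_conflicts(dictionaries):
    """Return elements whose code sets differ between sources.

    `dictionaries` maps a source name to {element name: {code: meaning}}.
    The result maps each conflicting element to the sources that disagree.
    """
    seen = {}       # element -> (first source seen, its code set)
    conflicts = {}  # element -> list of sources with differing code sets
    for source, elements in dictionaries.items():
        for element, codes in elements.items():
            if element not in seen:
                seen[element] = (source, codes)
            elif codes != seen[element][1]:
                conflicts.setdefault(element, [seen[element][0]]).append(source)
    return conflicts


# Hypothetical code sets from two systems in the same organization.
sources = {
    "attendance_system": {
        "withdrawal_code": {"W1": "Transferred", "W2": "Dropped out"},
        "absence_reason": {"E": "Excused", "U": "Unexcused"},
    },
    "state_report": {
        "withdrawal_code": {"W1": "Dropped out", "W2": "Transferred"},
        "absence_reason": {"E": "Excused", "U": "Unexcused"},
    },
}

conflicts = find_conflicts(sources)
print(conflicts)  # {'withdrawal_code': ['attendance_system', 'state_report']}
```

A check like this only shows that two sources disagree. Deciding which definition is correct remains a job for the data steward or governance group.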

You can't argue with the data

A false sense of security may arise when data are used improperly. Decisions based on misunderstood data can be disastrous. In fact, data without metadata can have consequences far worse than having no data at all.

In the past, some organizations have learned to live with these types of consequences. However, with the ever-increasing reliance on data for strategic and day-to-day decisionmaking, tolerating these problems is rarely acceptable under today's organizational management standards. While metadata cannot eliminate every opportunity for incorrectly collecting, using, or reporting information, a sound metadata system provides a framework for better understanding data and, therefore, reduces the likelihood of misuse. Exhibit 1.5 presents an example of the perils of data misuse and misreporting in an education organization.



4 Shankaranarayanan, G., and Even, A. (2006). The Metadata Enigma. Communications of the ACM, 49(2), 88-94.