Forum Guide to Metadata
NFES 2009-805
July 2009

Chapter 3. Using Metadata - Data Management Metadata

At their most basic level, metadata are intended to convey information about data meaning. Management items include a data element name, definition, and other data dictionary entries necessary for understanding the meaning and context of any single piece of data. For example, the Number of Graduates in many states includes only those students receiving regular, standard, endorsed, or advanced diplomas. In other states, however, the Number of Completers is used to count graduates as well as students who receive a high school equivalency certificate, certificate of completion, or attendance certificate. The relative meaning of these data, therefore, clearly depends on how the terms "graduate" and "completer" are defined, and anyone using the information would benefit from metadata that provide clear and accurate definitions for the terms.
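The graduate/completer example can be made concrete with a small sketch of two data dictionary entries. The class name, field names, and sample records below are illustrative assumptions, not part of any state's actual data dictionary; only the credential categories come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One data dictionary entry (field names are illustrative assumptions)."""
    name: str
    definition: str
    included_credentials: frozenset


# Credential categories taken from the example in the text.
DIPLOMAS = frozenset({
    "regular diploma", "standard diploma",
    "endorsed diploma", "advanced diploma",
})
OTHER_COMPLETIONS = frozenset({
    "high school equivalency certificate",
    "certificate of completion",
    "attendance certificate",
})

number_of_graduates = DataElement(
    name="Number of Graduates",
    definition="Students receiving a regular, standard, endorsed, "
               "or advanced diploma.",
    included_credentials=DIPLOMAS,
)
number_of_completers = DataElement(
    name="Number of Completers",
    definition="Graduates plus students receiving an equivalency, "
               "completion, or attendance certificate.",
    included_credentials=DIPLOMAS | OTHER_COMPLETIONS,
)


def count(element, credentials_awarded):
    """Count awarded credentials that fall within an element's definition."""
    return sum(1 for c in credentials_awarded
               if c in element.included_credentials)


# Hypothetical records, for illustration only.
awarded = [
    "regular diploma",
    "advanced diploma",
    "certificate of completion",
    "high school equivalency certificate",
]
print(count(number_of_graduates, awarded))   # 2
print(count(number_of_completers, awarded))  # 4
```

The same four records yield two different totals, which is why the definition needs to travel with the number.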

Data users are often concerned about data availability, which can be presented as a catalog of which data are available and when. Availability may vary for different users. For example, data might have an earlier release date for internal planning than for external public reporting.

Restrictions and limitations help users identify factors that limit the use, value, or interpretation of a data element. Restrictions might include privacy/sensitivity labels warning users not to share data, or indications about combinations of data that may not be released, such as name and assessment scores. Limitations often address more practical issues, such as a non-comparability warning about two apparently similar items that should not be compared because of meaningful differences in sampling techniques. More advanced users might be interested in data components/operations that describe how a data value was generated from its components and derivations (for example, which data elements and which formula were used to generate a dropout rate). Data purpose/rationale generally indicates the underlying reason for collecting the data, including public laws or administrative policies that require collection.
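As a minimal sketch of components/operations and restriction metadata, the example below records which elements and which formula produce a dropout rate, and checks a requested set of fields against a restricted combination. The formula shown (dropouts divided by enrollment) is a simplified assumption, since dropout rates are defined in several ways, and all element names, field names, and figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class DerivedElement:
    """Components/operations metadata for a calculated data element."""
    name: str
    components: Tuple[str, ...]   # data elements the value is built from
    formula_text: str             # human-readable statement of the formula
    compute: Callable[[Dict[str, float]], float]


# Assumed, simplified formula; real definitions vary by agency.
dropout_rate = DerivedElement(
    name="Dropout Rate",
    components=("number_of_dropouts", "enrollment"),
    formula_text="100 * number_of_dropouts / enrollment",
    compute=lambda v: 100 * v["number_of_dropouts"] / v["enrollment"],
)


def derive(element, values):
    """Compute a derived value, refusing to run if a component is missing."""
    missing = [c for c in element.components if c not in values]
    if missing:
        raise KeyError(f"missing components for {element.name}: {missing}")
    return element.compute(values)


# Restriction metadata: combinations of elements that may not be released
# together (the name/assessment score pairing is the example from the text).
RESTRICTED_COMBINATIONS = [frozenset({"student_name", "assessment_score"})]


def may_release(fields):
    """Return False if the requested fields contain a restricted combination."""
    requested = set(fields)
    return not any(combo <= requested for combo in RESTRICTED_COMBINATIONS)


# Hypothetical figures, for illustration only.
rate = derive(dropout_rate, {"number_of_dropouts": 30, "enrollment": 1200})
print(f"{dropout_rate.name} = {dropout_rate.formula_text} = {rate:.1f}%")
print(may_release(["student_name", "assessment_score"]))  # False
print(may_release(["student_name", "grade_level"]))       # True
```

Storing the formula text next to the computation lets a user see how the value was produced without reading the code.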

One person or office in the organization should be responsible for defining each data element and assigning access rights to it. Many organizations call the person or office with these responsibilities the data owner. A data steward, on the other hand, is the individual or office accountable for maintaining a data element's definition and metadata in a manner consistent with the rules established by the data owner. In other words, a data steward works on behalf of a data owner. While the labels "data owner" and "data steward" may vary across organizations depending on governance structures, management terminology, and organization size, the distinction between decisionmaking responsibilities (data ownership) and management responsibilities (data stewardship) is critical to the effective operation of a data and metadata system. Data owners are responsible for determining domains that define the range of permitted values (e.g., 1-999 inclusive). They are also responsible for the data's time parameters: information about the date when the data were collected or loaded, and the period for which the data are valid.
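One way to picture these responsibilities is a metadata record that names the owner and steward and carries the domain and time parameters the owner has set. This is only a sketch: the class, the element name, the office names, and the dates are hypothetical, and only the 1-999 inclusive domain comes from the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ElementGovernance:
    """Ownership, domain, and time-parameter metadata for one data element."""
    name: str
    owner: str          # decides the definition, domain, and access rights
    steward: str        # maintains the metadata under the owner's rules
    min_value: int      # domain lower bound (inclusive)
    max_value: int      # domain upper bound (inclusive)
    collected_on: date  # when the data were collected or loaded
    valid_from: date    # start of the period for which the data are valid
    valid_to: date      # end of that period (inclusive)

    def in_domain(self, value):
        """True if the value falls within the permitted range."""
        return self.min_value <= value <= self.max_value

    def is_valid_on(self, day):
        """True if the data are valid for the given date."""
        return self.valid_from <= day <= self.valid_to


# Hypothetical element, offices, and dates, for illustration only.
enrollment_count = ElementGovernance(
    name="Enrollment Count",
    owner="Office of Accountability",
    steward="Data Quality Team",
    min_value=1,
    max_value=999,
    collected_on=date(2008, 10, 1),
    valid_from=date(2008, 7, 1),
    valid_to=date(2009, 6, 30),
)

print(enrollment_count.in_domain(999))                  # True
print(enrollment_count.in_domain(1000))                 # False
print(enrollment_count.is_valid_on(date(2009, 1, 15)))  # True
```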

Data treatment describes how data were modified or otherwise changed, in format or presentation, after collection. This includes information about mapping and transformations, as well as rules for significant digits, rounding, cell sizes, business rules, aggregating, and other formulas and derivations. Data history is often presented in the form of an audit trail or other record of how, when, and why data were modified, and by whom.
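The sketch below illustrates two treatment rules (rounding and a minimum cell size) together with an audit trail that records how, when, why, and by whom a value was changed. The threshold of 10, the half-up rounding rule, and every name in the code are assumptions chosen for illustration; an organization's own business rules would determine the real values.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_UP
from typing import List

MIN_CELL_SIZE = 10  # assumed reporting threshold, not taken from the guide


@dataclass(frozen=True)
class AuditEntry:
    """One record in the data history: what changed, when, why, and by whom."""
    element: str
    old_value: object
    new_value: object
    changed_by: str
    changed_at: datetime
    reason: str


audit_trail: List[AuditEntry] = []


def record_change(element, old_value, new_value, changed_by, reason):
    """Append an audit entry and return the new value."""
    audit_trail.append(AuditEntry(
        element, old_value, new_value, changed_by,
        datetime.now(timezone.utc), reason,
    ))
    return new_value


def round_half_up(value, places=1):
    """Round with an explicit half-up rule rather than Python's default."""
    quantum = Decimal(10) ** -places
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


def publish_count(element, count, changed_by):
    """Suppress counts below the minimum cell size, logging the change."""
    if count < MIN_CELL_SIZE:
        return record_change(
            element, count, None, changed_by,
            f"cell size below {MIN_CELL_SIZE}",
        )
    return count


# Hypothetical values, for illustration only.
print(round_half_up(2.25, places=1))  # 2.3 (built-in round() would give 2.2)
print(publish_count("Number of Dropouts", 4, changed_by="data steward"))  # None
print(publish_count("Number of Dropouts", 25, changed_by="data steward"))  # 25
for entry in audit_trail:
    print(entry.element, entry.old_value, "->", entry.new_value, entry.reason)
```

Stating the rounding rule explicitly matters because different tools round halves differently, which is exactly the kind of detail treatment metadata are meant to capture.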

As an extension of data storage, retention metadata indicate how long data should be maintained, and when and how they should be destroyed at the end of their life cycle. For example, some enrollment and fiscal data are maintained indefinitely as a function of historical recordkeeping for a school, district, or state; however, private student information such as health and disciplinary records may need to be destroyed as soon as a student is no longer enrolled in school. Security/confidentiality metadata items are often used to identify sensitive and private data. If, for example, data such as social security numbers are identified as particularly sensitive, appropriate destruction methods might include sophisticated technologies such as degaussing (neutralizing the magnetic field of storage tapes) or binary code overwriting.
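Retention and security metadata of this kind could be represented as rules attached to each data element. The following is a rough sketch under stated assumptions: the rule fields, element names, retention periods, and dates are hypothetical, and actual retention schedules depend on applicable laws and local policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    """Retention and security metadata for one data element (illustrative)."""
    element: str
    sensitive: bool
    retain_days_after_exit: Optional[int]  # None means keep indefinitely
    destruction_method: Optional[str]      # None when data are never destroyed


# Assumed rules that loosely mirror the examples in the text.
RULES = {
    "enrollment_count": RetentionRule("enrollment_count", False, None, None),
    "health_record": RetentionRule("health_record", True, 0, "overwrite"),
    "social_security_number": RetentionRule(
        "social_security_number", True, 0, "degauss or binary overwrite"),
}


def is_due_for_destruction(rule, exit_date, today):
    """True once the retention period after the student's exit has elapsed."""
    if rule.retain_days_after_exit is None or exit_date is None:
        return False  # kept indefinitely, or the student is still enrolled
    return (today - exit_date).days >= rule.retain_days_after_exit


# Hypothetical dates, for illustration only.
today = date(2009, 7, 1)
exited = date(2009, 6, 15)
for rule in RULES.values():
    if is_due_for_destruction(rule, exited, today):
        print(f"{rule.element}: destroy via {rule.destruction_method}")
    else:
        print(f"{rule.element}: retain")
```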