Forum Guide to Metadata
NFES 2009-805
July 2009

Chapter 4. Implementing a Metadata System - Metadata System Architecture

Metadata system architecture often is driven by the results of a build-versus-buy analysis that, in turn, depends on the organization's existing management, governance, and technology considerations. In the broadest sense, metadata system architecture can be divided into three main designs: centralized, federated, and distributed.

Centralized Architecture: As one might expect of a "centralized" system, all metadata exist in a single database that stores nothing but metadata (see exhibit 4.6). The greatest challenge to implementing a centralized architecture is finding a single model that meets the needs of all data systems and users. If a single metadata model has been designed for the entire organization, a centralized metadata system generally is fairly straightforward to implement. Centralized systems are governed, managed, and operated as a single entity. In other words, decisionmaking is also largely centralized, which helps ensure metadata are consistent across subsystems throughout the entire organization. For example, the definition and attributes of "class" would be the same in the finance system as in the student record system. Data stewards and data users generally access a centralized metadata system via a single interface, although the core interface may be modified to accommodate differences in access privileges or other user rights.
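
The consistency property described above can be illustrated with a minimal sketch. It is not drawn from any particular product: the class names, fields, and the sample definition of "class" are all hypothetical, and a real system would sit on a database rather than an in-memory dictionary. The point is only that every subsystem asks the same store and therefore receives the same answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataItem:
    """One organization-wide definition (hypothetical structure)."""
    name: str
    definition: str
    data_type: str


class CentralMetadataRegistry:
    """A single store that holds nothing but metadata (illustrative only)."""

    def __init__(self):
        self._items = {}

    def register(self, item):
        # Refuse a second, conflicting definition for a name that already exists.
        existing = self._items.get(item.name)
        if existing is not None and existing != item:
            raise ValueError(f"'{item.name}' is already defined centrally")
        self._items[item.name] = item

    def lookup(self, name, system):
        # `system` is accepted only to show that the answer does not depend on it.
        return self._items[name]


registry = CentralMetadataRegistry()
# The definition text is an invented example, not an official definition.
registry.register(MetadataItem("class", "A scheduled section of a course", "string"))

finance_view = registry.lookup("class", system="finance")
student_view = registry.lookup("class", system="student_records")
```

In this sketch an attempt to register a different definition of "class" is rejected, which mirrors the centralized decisionmaking the text describes: a change to a definition has to go through the one governing body rather than being made locally.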

Federated Architecture: In federated designs, each stand-alone data system in the organization maintains its own metadata system within the constraints of a centralized technical framework and governance structure. This allows metadata to reflect the specific information needs of each independent data system while still ensuring communication capabilities with other independent systems. Users who access multiple data systems may do so through separate interfaces, and data stewards likely manage each system independently. However, metadata items that affect more than one system can be coordinated through automated translation and update processes, or by manual modification. Because of this, federated design requires central planning and rulemaking within a distributed architecture (see exhibit 4.7) and demands a fairly sophisticated technical infrastructure and strong system governance.
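
One way to picture the "automated translation and update" step is the sketch below. It assumes, purely for illustration, that the central framework keeps a list of shared item names and that each local system supplies a mapping from those shared names to its own local names. All class names, system names, and definitions are hypothetical.

```python
class LocalMetadataSystem:
    """A stand-alone system's own metadata store (illustrative only)."""

    def __init__(self, name, local_names):
        self.name = name
        # Translation table: shared item name -> this system's local name.
        self.local_names = local_names
        self.definitions = {}

    def set_local(self, local_name, definition):
        # Local stewards manage system-specific items independently.
        self.definitions[local_name] = definition


class Federation:
    """Central rules: which items are shared and how updates propagate."""

    def __init__(self, shared_items):
        self.shared_items = set(shared_items)
        self.members = []

    def join(self, system):
        self.members.append(system)

    def update_shared(self, shared_name, definition):
        if shared_name not in self.shared_items:
            raise KeyError(f"'{shared_name}' is not governed centrally")
        updated = []
        for system in self.members:
            # Translate the shared name into each member's local vocabulary.
            local_name = system.local_names.get(shared_name)
            if local_name is not None:
                system.definitions[local_name] = definition
                updated.append(system.name)
        return updated


finance = LocalMetadataSystem("finance", {"class": "course_section"})
students = LocalMetadataSystem("student_records", {"class": "class"})
finance.set_local("fund_code", "Budget fund identifier")  # purely local item

federation = Federation({"class"})
federation.join(finance)
federation.join(students)
updated = federation.update_shared("class", "A scheduled section of a course")
```

Here the shared item is kept consistent in both systems even though each stores it under its own name, while "fund_code" remains a finance-only item that the federation never touches. The translation tables are the part that demands the infrastructure and governance the text mentions, since someone has to create and maintain them.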

Distributed Architecture: In a distributed architecture, each stand-alone data system has a corresponding stand-alone metadata system. The major benefit of a distributed system is that metadata can be modified and updated without the need to coordinate with other systems (see exhibit 4.8). While there are other benefits to distributed architecture (for example, metadata directly reflect related operational data), cohesiveness and integration are generally lacking and stand-alone components tend to evolve without adherence to universal rules and conventions that permit synchronization with the rest of the system. Moreover, vocabularies and definitions often "drift," or start to deviate from those in other systems, usually leading to multiple terms for one item and, conversely, multiple items referenced by the same term. In either case, duplication arises and data quality suffers. Given such drift, these stand-alone components, sometimes called "silos," can become autonomous and independent over time, and eventually unable to exchange data or otherwise work with the rest of the system.
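
The two symptoms of drift named above (several terms for one item, and one term for several items) can be detected mechanically once each system's vocabulary has been mapped to common item identifiers. The sketch below assumes such a mapping already exists, which in a truly distributed environment is itself a manual reconciliation effort. The system names, terms, and identifiers are invented for illustration.

```python
from collections import defaultdict


def find_drift(vocabularies):
    """Report drift across stand-alone vocabularies.

    `vocabularies` maps a system name to {term: item_id}, where item_id is a
    hypothetical common identifier for the thing the term refers to.
    Returns (synonyms, homonyms):
      synonyms: item_id -> sorted terms, where one item has several terms
      homonyms: term -> sorted item_ids, where one term means several items
    """
    terms_by_item = defaultdict(set)
    items_by_term = defaultdict(set)
    for vocabulary in vocabularies.values():
        for term, item_id in vocabulary.items():
            terms_by_item[item_id].add(term)
            items_by_term[term].add(item_id)

    synonyms = {
        item_id: sorted(terms)
        for item_id, terms in terms_by_item.items()
        if len(terms) > 1
    }
    homonyms = {
        term: sorted(item_ids)
        for term, item_ids in items_by_term.items()
        if len(item_ids) > 1
    }
    return synonyms, homonyms


vocabularies = {
    "student_records": {"class": "course_section", "grade": "grade_level"},
    "finance": {"section": "course_section", "class": "budget_category"},
    "assessment": {"grade": "assessment_score"},
}
synonyms, homonyms = find_drift(vocabularies)
```

With this sample data, "course_section" turns up under two terms ("class" and "section"), and both "class" and "grade" refer to different items depending on the system. A report like this does not fix drift, but it shows where silos have diverged and where an exchange of data would be ambiguous.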