National Forum on Education Statistics

Chapter 4: Content and Structure of the Data Model

This chapter describes the contents and organization of the Data Model. It begins with a discussion of the core concepts and terms related to the Data Model. Then, the chapter illustrates the Data Model's content structure. Two open source tools available for viewing and manipulating the Data Model are presented at the close of the chapter.

Core Data Model Concepts

This section explains important terms and concepts related to the Data Model: entity, attribute, common attributes, class, taxonomy, relationship, and concept map.

Entity

An entity is the basic building block of the Education Data Model. Entities are the constructs that need to be tracked, measured, and described by software systems in order to support education processes. Entities can be:

  • Persons such as students, parents, or staff members
  • Capital assets such as schools, school buses, or buildings
  • Events such as test administrations, class meetings, or discipline incidents
  • Tools such as books, networks, computers, lessons, or assessments
  • Concepts such as perceptions or skills

Attribute

An attribute is information about an entity that you can measure, classify, or describe. An attribute is not a calculation or statistic, and it generally does not contain counts.

Attributes in the Data Model are not generic measures or data elements that can connect to a number of entities. In the Data Model, each attribute is unique to the entity that it measures. The following are examples of attributes:

  • A measurement, current state, or a trait of an entity
  • A person's name, phone number, or IM address
  • An education institution's name, location, or size
  • The date of a test administration, the value of an assessment score
  • A lesson topic, grade level, or student's learning style
  • A teacher's perception of self-efficacy with respect to classroom management
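The rule that each attribute belongs to exactly one entity can be sketched in code. The following is a minimal illustration, not part of the Data Model itself: the `Entity` and `Attribute` structures, the `qualified_name` property, and the attribute name `gradeLevel` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """A fact about exactly one entity (hypothetical structure)."""
    entity: str  # name of the single entity this attribute describes
    name: str

    @property
    def qualified_name(self) -> str:
        # Prefixing with the entity name keeps same-named attributes distinct.
        return f"{self.entity}.{self.name}"


@dataclass
class Entity:
    """Something an education system needs to track (hypothetical structure)."""
    name: str
    attributes: dict = field(default_factory=dict)

    def add_attribute(self, attr_name: str) -> Attribute:
        attr = Attribute(entity=self.name, name=attr_name)
        self.attributes[attr_name] = attr
        return attr


# Two entities can each have a "gradeLevel", but these are separate attributes.
student = Entity("student")
lesson = Entity("lesson")
student_grade = student.add_attribute("gradeLevel")
lesson_grade = lesson.add_attribute("gradeLevel")
```

Under this sketch, `student_grade` and `lesson_grade` compare as different attributes even though they share a short name, which mirrors the statement that attributes are not generic data elements shared across entities.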

Common Attributes

Attributes are usually specific to each entity; however, there is a small set of attributes that apply to multiple entities.6 These attributes are referred to as common attributes and are arranged into a taxonomy. This taxonomy of common attributes should not be confused with the Data Model taxonomy of entities.

A limited set of the common attributes used in the Education Data Model is listed below. The list is provided to establish consistency and uniformity in the naming and use of these attributes. These common attributes involve names, location information, and identifiers and are expected to change as the Data Model continues to be updated.

Items lower in the hierarchy are more specific than those above. Items at any level in the hierarchy are directly or indirectly associated with at least one entity.

Person Name

  • Alias
  • First Name
  • Former Legal Name
  • Generation Code/Suffix
  • Last/Surname
  • Last/Surname at Birth
  • Middle Name
  • Nickname

Locus

  • Location
  • Physical Location
    • Physical Address
    • Latitude/Longitude
    • GPS Coordinates
  • Virtual Location
    • IP Address
    • URI
    • Postal Address
  • Connection ID
    • E-mail Address
    • Phone Number
    • Screen Name
    • IM Address
  • Schedule
    • Day/Date/Time
    • Periodicity
    • Length

Person Characteristic

  • Demographic
    • Race and Ethnicity
    • Sex/Gender
  • Role
  • Physical Characteristic
    • Weight
    • Picture/Likeness
  • Status

Document Metadata

  • IEEE LOM
  • Dublin Core
  • SIF
  • RDF

Program Evaluation

  • Resources Consumed
    • Financial Resources
    • Time Resources
    • Human Resources
    • Curricular Resources
    • Instructional Materials
      • New Technology Resources Used
      • Equipment Used
      • Supplies Used
    • Facilities Resources
      • Number of Buildings
      • Square Feet
  • Measurable Goals and Outcomes
    • Academic Goals
      • Academic Achievement
      • Skills Acquisition
      • Skill Certification
    • Non-academic Goals
      • Health Effects
        • Physical
        • Emotional
        • Developmental
      • Social Effects
        • Community Service
      • Participation Effects
        • Retention
        • Enrollment
        • Completion
    • Goal-Based Outcomes
  • Target/Served Population
    • Size
    • Demographics
    • Program Baselines
    • Method of Identification
    • Population Characteristics
  • Methodology
    • Measurement
    • Analysis
  • Characteristics
    • Program Availability
      • Geographic
      • Calendar
      • Length of Program
      • Periodicity of Service
      • Length of Services
    • NCES Program Type
    • Delivery Methodology
      • Constructivism
      • Inquiry Based
      • Directed Instruction
      • Virtual
      • Groupings
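One way to work with a hierarchy like the lists above is to store it as a nested mapping and walk it from the top. The sketch below uses a subset of the Locus branch; the nested-dictionary layout and the `find_path` helper are assumptions for illustration, not a format the Data Model prescribes.

```python
# Subset of the "Locus" common attributes, stored as nested dictionaries.
# An empty dictionary marks an item with no more specific items beneath it.
LOCUS = {
    "Locus": {
        "Physical Location": {
            "Physical Address": {},
            "Latitude/Longitude": {},
            "GPS Coordinates": {},
        },
        "Connection ID": {
            "E-mail Address": {},
            "Phone Number": {},
            "Screen Name": {},
            "IM Address": {},
        },
        "Schedule": {
            "Day/Date/Time": {},
            "Periodicity": {},
            "Length": {},
        },
    }
}


def find_path(tree, target, trail=()):
    """Return the list of items from the top of the hierarchy down to target."""
    for name, children in tree.items():
        path = trail + (name,)
        if name == target:
            return list(path)
        found = find_path(children, target, path)
        if found is not None:
            return found
    return None  # target does not appear in this hierarchy


phone_path = find_path(LOCUS, "Phone Number")
```

Here `phone_path` lists the progressively more specific items leading to "Phone Number", reflecting the statement that items lower in the hierarchy are more specific than those above.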

Class

A class is "A set, collection, group, or configuration containing members regarded as having certain traits in common; a kind or category."7 To distinguish between entities and classes, the Data Model begins the names of classes with a capital letter and the names of entities with a lower-case letter. For example, "Person" is a class, while "staffMember" is an entity.
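The capitalization convention can be checked mechanically. The helper below is a small sketch; the function name `kind_of` is hypothetical and only encodes the naming rule stated above.

```python
def kind_of(name: str) -> str:
    """Classify a Data Model name by the case of its first letter."""
    if not name:
        raise ValueError("name must not be empty")
    # Class names start with a capital letter; entity names start lower-case.
    return "class" if name[0].isupper() else "entity"


person_kind = kind_of("Person")
staff_kind = kind_of("staffMember")
```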

Figure 2 illustrates the relationships between classes, entities, and attributes. Entities, which identify what needs to be tracked by education systems, are organized into classes and sub-classes. Each entity may have attributes, which represent the measures that are used to track the entity. In the Data Model, entities may relate to other entities.

Figure 2. Core Concepts of the Education Data Model

Taxonomy

The entities, classes, and attributes in the Data Model have been organized into a taxonomy. As illustrated in figure 3, entities (marked with E) organize into classes and sub-classes (marked with C). Each entity has its own set of attributes (marked with A). Entities group together based upon common characteristics.

Figure 3. Taxonomy of Classes, Entities, and Attributes

Taxonomies arrange items into categories based on like characteristics. For example, lesson plan and unit plan are both types of academic plans in the Data Model, but lesson plan and unit plan differ based upon the scope and purpose of the plan. The "is a type of" organization scheme ensures, with few exceptions, that each entity has one, and only one, place in the arrangement. For example, in the Data Model

    a portfolio is a type of formative assessment is a type of assessment is a type of instruction artifact,

just as in the taxonomy of living things,

    a Homo sapiens is a type of hominid is a type of primate is a type of mammal is a type of chordate is a type of animal.

There are exceptions and hard-to-classify cases, similar to the duck-billed platypus in the taxonomy of living things, which, as a mammal that lays eggs, defies a clean classification.
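The "is a type of" chain can be represented as a mapping from each item to its single parent, which makes the "one, and only one, place" property explicit. This is a minimal sketch; the dictionary layout and the `lineage` function are assumptions, and only the portfolio example from the text is encoded.

```python
# Each item maps to the one item it "is a type of".
IS_A_TYPE_OF = {
    "portfolio": "formative assessment",
    "formative assessment": "assessment",
    "assessment": "instruction artifact",
}


def lineage(item, parents=IS_A_TYPE_OF):
    """Follow "is a type of" links upward and return the full chain."""
    chain = [item]
    while chain[-1] in parents:
        parent = parents[chain[-1]]
        if parent in chain:
            # A well-formed taxonomy has no cycles; stop rather than loop forever.
            raise ValueError(f"cycle detected at {parent!r}")
        chain.append(parent)
    return chain


portfolio_chain = lineage("portfolio")
```

Because a dictionary key can hold only one value, hard-to-classify items that arguably belong in two places would need a different structure, such as a mapping from each item to a list of parents.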

The Data Model taxonomy is similar in form and function to other well-known taxonomies, such as the Linnaean taxonomy of living things, and to long-established classification systems, such as the Dewey Decimal System.8 As the Data Model user becomes familiar with the structure of the taxonomy, locating a particular entity becomes easy. In addition, the tools provided for using the Data Model have search features that facilitate locating items in the taxonomy.

Relationships

In addition to the taxonomy structure, which is based on entity characteristics, the Data Model contains natural relationships among the entities. For example, in the Data Model taxonomy, the "student" entity is not close to the "class" entity, but the Data Model stores and reflects the relationships between them. Figure 4 shows examples of the types of relationships contained in the Data Model.

Figure 4. Types of Relationships in the Education Data Model

The relationship descriptors include verbs or short verb phrases that connect the subject with the object. For example, "student" (subject) "receives services from" (relation) "teacher" (object). This information within the Data Model allows for intelligent searching and for creation of subparts of a model. As the Data Model grows, so too will the types of relationships captured among its entities. Below is a list of relationships that is expected to change as the Data Model continues to be updated.

  • isDirectProviderOf
    Directly provides goods or services.

  • isEvaluationResultOf
    An evaluation of a student or other learner's performance.

  • hasDelimiter
    One entity is modified or limited by another. For example, a calendar date can delimit a count of students.

  • determines
    Determines in part or in whole the content, structure, or value of the object entity.

  • hasAssociated
    The object entity must have the associated subject entity in order to exist. This is similar to a foreign key relationship. For example, a disruptive event (subject) has an associated victim (object).

  • hasCausalRelationship
    A strong relationship in which one entity causes a change in another entity.

  • isAValueOf
    Used to enumerate the possible values, or states of an entity.

  • receivesServicesFrom
    Subject receives services from the object.

  • isPartiallyDefinedBy
    The subject is defined in part by the object, or the subject is a narrower concept than the object.

  • hasFunctionalComponent
    Reflects the construction of an entity through functional components represented by other entities.

  • participatesIn
    A person-type entity participates in an activity-type entity.

  • isIndirectProviderOf
    Indirectly provides goods or services.

  • isACountOf
    A non-duplicated count.

  • isDerivedFrom
    One entity is a derivation of another. This means that some or all of the important features of an entity are also present in its sub-entities. This is different from functional components.

  • providesServicesTo
    Subject provides services to the object.

  • isOrganizationalComponentOf
    This relation is used to indicate an organizational structure of non-person entities such as schools, districts, etc.

  • isFunctionalComponentOf
    This relation indicates that the subject entity makes up, in part or in whole, the function of the object entity.

  • isAlignedWith
    Used mainly in Teaching and Learning entities. One entity is constructed to align in meaning or function with another.

  • isASynonymOf
    Similar in meaning. This could indicate an exact or inexact similarity.
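The subject, relation, and object pattern described in this section maps naturally onto three-part records. The sketch below is illustrative only: the `Relationship` type and the `objects_of` helper are hypothetical, and the "student participatesIn class" record is an assumed example rather than one quoted from the Data Model.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    subject: str
    relation: str
    obj: str  # the object of the relationship


RELATIONSHIPS = [
    # Examples taken from the text above.
    Relationship("student", "receivesServicesFrom", "teacher"),
    Relationship("disruptive event", "hasAssociated", "victim"),
    # Assumed example, added for illustration.
    Relationship("student", "participatesIn", "class"),
]


def objects_of(subject, relation, triples=RELATIONSHIPS):
    """Return every object linked to subject by the given relation."""
    return [
        t.obj for t in triples
        if t.subject == subject and t.relation == relation
    ]


student_providers = objects_of("student", "receivesServicesFrom")
```

Storing relationships as explicit records is one way to support the kind of searching the text mentions, since any of the three parts can be used as a filter.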

Concept Map

The Concept Map describes the structure of the Education Data Model. It represents a logical and finite set of relationships among classes, sub-classes, and entities, thereby striving to depict the entire domain of education information. It adds multiple simultaneous relationships among the entities to the taxonomy. The relationships among entities are designed to be mutually exclusive but may sometimes overlap in meaning or usage.

To feature the relationships among the entities presented previously in figure 3, figure 5 turns the taxonomy inward.

Figure 5. Taxonomy Featuring Relationships Among Entities

Data Model Content Structure

This section offers insight into the Data Model development process by describing the major concepts around which the Data Model was created. Two core content domains, teaching and learning, provided the foundation for Data Model development. A process of data needs identification was used to identify the information necessary to answer educators' questions. Several major education processes, or process perspectives, further informed the development of the Data Model's contents and organization.

Content Domains

The Education Data Model uses the lowest level of data granularity within PK-12, usually the individual, school, and LEA levels, where data originate and have the greatest impact on the teaching and learning process. Figure 6 illustrates the core domains of the Data Model "spine": teaching and learning. Especially important in this depiction is the triad of student, staff (teacher), and course/class. The interactions among the three members of the triad formed the core processes of interest in the development of the Data Model. Other domains have been and continue to be developed around this core interaction, such as programs, activities, assessment, transportation, facilities management, professional development, and accountability.

Figure 6. Core Content Domains of the Education Data Model

Data Needs Identification

The contents of the Data Model were generated by first asking "What do we (educators) need to know?" and "What questions do we need to answer?" The next step was to identify which people, things, and constructs ("entities" in data modeling language) we need to know about. The final step in generating the elements of the data model was to identify the facts that we need to know about the people, things, and constructs in order to answer our questions. The resulting data elements were organized into a hierarchical taxonomy.

Once the data elements were determined and organized, the relationships among them were identified and recorded. Figure 7 shows the described process.

Figure 7. Data Needs Identification Process

Major Education Processes

The contents of the Data Model were also generated by examining six major education processes. These processes represent different perspectives from which items and relationships in the model can be generated. Together, the following six process perspectives make up the education enterprise. The overlap among these major processes, depicted in figure 8, defines the ways in which they interact to support the overall education enterprise.

Figure 8. Major Education Processes in the Education Data Model

School Formulation and Administration: Processes that address setting up and maintaining the physical, virtual, and community structures that make up a school.

Course of Study: Processes related to constructing and managing the academic environment around which learning takes place.

Alternative/Supplemental Services and Instruction: Processes that address programs, services, and instruction that supplement, or are alternatives to, the standard course of study.

Teaching and Learning: Processes directly related to teaching and learning.

Schools Improvement and Management of Quality: Processes related to the improvement of schools and strategic planning.

Individual Student Tracking: Processes that manage student information related to student status, student characteristics, evaluation of education programs, and accountability for student success.

Suggested Open Source Tools

More advanced users may want to display the information contained in the Data Model in ways tailored to their particular needs or interests. Two open source tools, Protégé and SWOOP, are suggested below. They allow the user to generate the following from the model:

  • Lists of entities and attributes
  • Entity detail reports
  • Conceptual maps
  • Filtered conceptual maps, i.e., only portions of the conceptual map shown for ease of understanding
  • Entity taxonomy
  • Relationship diagrams
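To make the idea of a filtered conceptual map concrete, the sketch below keeps only the relationships that touch one entity of interest. It does not use Protégé or SWOOP; it is a plain Python illustration, the function name `filter_concept_map` is hypothetical, and the second and third relationship records are assumed examples.

```python
def filter_concept_map(triples, focus):
    """Keep only relationships in which focus is the subject or the object."""
    return [t for t in triples if focus in (t[0], t[2])]


def entities_in(triples):
    """Return the sorted, de-duplicated entity names used in the triples."""
    return sorted({name for s, _, o in triples for name in (s, o)})


CONCEPT_MAP = [
    ("student", "receivesServicesFrom", "teacher"),          # from the text
    ("student", "participatesIn", "class"),                  # assumed example
    ("school", "isOrganizationalComponentOf", "district"),   # assumed example
]

# A smaller map centered on "student", plus the list of entities it contains.
student_view = filter_concept_map(CONCEPT_MAP, "student")
student_entities = entities_in(student_view)
```

The same two helpers also cover the simplest form of the first item in the list, a list of entities, when `entities_in` is applied to the unfiltered map.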

Protégé

Protégé is one of the most widely used tools for creating relationships in OWL (Web Ontology Language). It was developed by Stanford Medical Informatics (http://smi.stanford.edu/). The version referenced here is the Protégé 3.4 full version.

SWOOP

SWOOP is an OWL relationship browser and editor with an easy-to-use, browser-like interface. The software was originally produced by the MINDSWAP group at the University of Maryland, College Park.




6 Definitions of the common attributes can be found in the NCES Handbooks Online.
7 The American Heritage® Dictionary of the English Language, (2004) Fourth Edition. Houghton Mifflin Company. http://dictionary.reference.com/browse/class (accessed: July 2008).
8 Two other projects that will have influence on the Education Data Model are the Suggested Upper Merged Ontology (better known as SUMO) and the Universal Data Element Framework (UDEF). SUMO is an attempt to create a high-level model that applies to all domains of interest. SUMO is expressed in a formal logic language. At some point in the future, it would be useful to map the Education Data Model to SUMO. The UDEF is similar to the Education Data Model in that it attempts to model a particular domain (enterprise information systems). The Education Data Model has independently developed many classes that are similar to classes in the UDEF. In the future the Data Model may continue to be informed by the UDEF.