Data File Layouts

The data file layout is a printable file that lists the name, position, format, and description of each data variable in the corresponding response data file. A sample of a partial data file layout is shown below:

Example, data file layout: 2000
Seq.  Field  Col.  Field  Decimal               Key
no.   name   pos.  width  places   Type  Range  value  Short label
1     Year   1     2      0        C                   Assessment year
2     Age    3     2      0        C                   Assessment age
3     Book   5     3      0        C                   Booklet number
4     BKSER  8     6      0        C                   Booklet serial number
NOTE: In the NAEP 2000 codebook file, AN="accommodations not provided", and AP="accommodations provided."
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2000.
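
As a rough illustration, the four sample rows can be held as simple records and checked for contiguity: in this sample, each field begins in the column right after the previous field ends. The `LayoutField` class and the `check_contiguous` helper are hypothetical names introduced for this sketch; they are not part of any NAEP software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutField:
    """One line of a data file layout (hypothetical container for this sketch)."""
    seq: int        # sequence number
    name: str       # field name
    col_pos: int    # starting column, counted from 1 as in the sample
    width: int      # number of columns occupied
    decimals: int   # number of decimal places
    ftype: str      # data type code, e.g. "C"
    label: str      # short label


# The four rows shown in the sample layout.
SAMPLE_LAYOUT = [
    LayoutField(1, "Year", 1, 2, 0, "C", "Assessment year"),
    LayoutField(2, "Age", 3, 2, 0, "C", "Assessment age"),
    LayoutField(3, "Book", 5, 3, 0, "C", "Booklet number"),
    LayoutField(4, "BKSER", 8, 6, 0, "C", "Booklet serial number"),
]


def check_contiguous(fields):
    """Return names of fields that do not start right after the previous field."""
    problems = []
    expected = fields[0].col_pos
    for field in fields:
        if field.col_pos != expected:
            problems.append(field.name)
        expected = field.col_pos + field.width
    return problems


# Last column covered by the (partial) sample: 8 + 6 - 1.
last_column = SAMPLE_LAYOUT[-1].col_pos + SAMPLE_LAYOUT[-1].width - 1
```

For the sample rows, `check_contiguous(SAMPLE_LAYOUT)` returns an empty list, and the four fields together cover columns 1 through 13 of each record.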

Each line of the layout file contains the following information for a single data variable:

  • Sequence number: fields are numbered sequentially to represent the order in which they appear on the data record.

  • Field name: eight-character label for the field, used consistently across all secondary-use data file materials.

  • Column position: starting location of the data variable within each record, measured in bytes or characters.

  • Field width: number of columns used to represent the data value for a data variable.

  • Number of decimal places: if the field contains continuous numeric data, this entry indicates how many places to shift the decimal point before processing data values.

  • Data type: files may include seven mutually exclusive field types:

    1. Type A: character data with alphabetic and/or numeric codes.
    2. Type C: continuous numerical data without fixed ranges.
    3. Type D: discrete data with a fixed number of values. Type D fields may include raw item responses or imputed (derived) categorical variables.
    4. Type DI: discrete data with a special code for "I don't know" responses.
    5. Type O: constructed-response items and performance items in the student data that were professionally scored.
    6. Type OE: constructed-response items in the student data that were professionally scored and scaled using a polytomous item response model.
    7. Type OS: constructed-response items in the student data that were professionally scored and scaled using a dichotomous item response model.

  • Value range: if the field type is discrete numeric, the value range is listed as the minimum and maximum permitted values, separated by a hyphen.

  • Key or correct response value: If the field is a response to a scorable item, the correct option value, or key, is printed. If the field is an assigned score that was scaled as a dichotomous item using cut-point scoring, the range of correct scores is printed.

  • Short description of the field: Each variable is identified by a 50-character descriptor.
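
The column position, field width, and decimal places entries described above are enough to pull a value out of a fixed-width record. The following is a minimal sketch under stated assumptions, not a definitive reader: it assumes columns are counted from 1 (as in the sample layout), that the decimal point is shifted to the left by the number of decimal places, that an all-blank field means a missing value, and that value ranges hold non-negative integers. The names `extract_field` and `parse_range` and the sample record are invented for illustration.

```python
from decimal import Decimal


def extract_field(record, col_pos, width, decimals=0, numeric=True):
    """Slice one field from a fixed-width record.

    col_pos is 1-based; decimals is the number of places to shift the
    decimal point (assumed here to be a shift to the left).
    """
    start = col_pos - 1
    raw = record[start:start + width]
    if not numeric:
        return raw
    text = raw.strip()
    if not text:
        return None  # assumption: blank means missing
    return Decimal(text).scaleb(-decimals)


def parse_range(text):
    """Split a 'min-max' value range into a pair of integers.

    Assumes non-negative bounds, so the first hyphen is the separator.
    """
    low, high = text.split("-", 1)
    return int(low), int(high)


# Made-up 13-column record matching the sample layout:
# Year = cols 1-2, Age = cols 3-4, Book = cols 5-7, BKSER = cols 8-13.
record = "0009123004567"
age = extract_field(record, col_pos=3, width=2)
book = extract_field(record, col_pos=5, width=3)
serial = extract_field(record, col_pos=8, width=6, numeric=False)
```

Using `Decimal` rather than `float` keeps the shifted value exact; for instance, a four-column field holding `1234` with two decimal places would be read as 12.34 under the left-shift assumption.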

Last updated 05 December 2008 (RF)
