Summary of Quality Control Error Analysis for Data Entry
To evaluate the effectiveness of the quality control of the data-entry process, the corresponding portion of the final integrated database is verified in detail against a sample of the original instruments received from the field. Through the years, the NAEP database has been found to be more than accurate enough to support the analyses that have been done. Overall, no statistically significant difference in observed error rates has been detected from year to year.
The purpose of a quality control analysis is to assess the quality of the data resulting from the complete data-entry system, beginning with the actual instruments collected in the field and ending with the final machine-readable database used in the analyses. The process involves the selection of instruments at random from among those returned from the field and the comparison of each entire instrument, character by character, with its representation in the final database. In this way, it is possible to measure the error rates in the data as well as the success of the data-entry system.
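The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure NAEP actually ran: the names `count_char_errors` and `observed_error_rate` are hypothetical, and each instrument and its database record are assumed to be available as fixed-width strings keyed by the same identifier.

```python
import random

def count_char_errors(original, entered):
    """Compare two records character by character.

    Assumes fixed-width records; a shorter record is padded with
    spaces so that every position is compared.
    Returns (number of mismatched characters, characters compared).
    """
    length = max(len(original), len(entered))
    pairs = zip(original.ljust(length), entered.ljust(length))
    errors = sum(1 for a, b in pairs if a != b)
    return errors, length

def observed_error_rate(instruments, database, sample_size, seed=None):
    """Draw instruments at random and verify each one in full.

    `instruments` and `database` are hypothetical dicts mapping an
    instrument ID to its text as collected and as keyed, respectively.
    Returns (errors found, characters checked, observed error rate).
    """
    rng = random.Random(seed)
    sampled_ids = rng.sample(sorted(instruments), sample_size)
    errors = chars = 0
    for instrument_id in sampled_ids:
        e, c = count_char_errors(instruments[instrument_id],
                                 database[instrument_id])
        errors += e
        chars += c
    rate = errors / chars if chars else 0.0
    return errors, chars, rate
```

The function returns the raw counts along with the rate because the counts, not the rate alone, are what a confidence-limit calculation needs.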
The observed error rate cannot be taken at face value. For example, let us say a sample of school questionnaires selected for close inspection contained two errors out of a total of 2,251 characters. To conclude that the entire school questionnaire database has an error rate of 2/2,251, or .0009, would be too optimistic; we may simply have been lucky (or unlucky) with this particular random sample. What is needed is an indication of how bad the true error rate might be, given what we observed. Such an indication is provided by confidence limits. Confidence limits indicate how likely it is that a value falls inside a specified range in a specified context or distribution. In this analysis, the specified range is an error rate between zero and some maximum value beyond which we are confident at a specified level (traditionally 99.8 percent) that the true error rate does not lie. The specified context or distribution turns out to be the cumulative binomial probability distribution. The following example will demonstrate this technique.
Let us say that 1,000 booklets were processed, each with 100 characters of data transcribed, for a total of 100,000 characters. Further, let j represent the number of character errors (five in this example) in a random sample of booklets that were completely checked (50 in this example). In other words, five errors were found in a sample of 5,000 characters. The following expression may be used to establish the probability that the true error rate is .0025 or less, rather than relying on a single-value estimate, the observed rate of one in a thousand (.001):
\[ \sum_{j=0}^{5} \binom{5000}{j} (.0025)^{j} (.9975)^{5000-j} = .0147 \]
This is the sum of the probability of finding five errors, plus the probability of finding four errors, and so on down to the probability of finding zero errors, in a sample of 5,000 with a true error rate of .0025; that is, the probability of finding five or fewer errors by chance when the true error rate is .0025. Notice that we did not use the size of the database in this expression. The assumption here is that our sample of 5,000 was drawn from a database that is infinite. The closer the size of the verified sample is to the size of the actual database, the more confidence we can have in the observed error rate; for example, had there been only 5,000 characters in the total database, our sample would have included all the data, and the observed error rate would have been the true error rate. The result of the above computation allows us to say, conservatively, that .0025 is an upper limit on the true error rate with 98.53 percent (i.e., 1 - .0147) confidence; that is, we can be quite sure that our true error rate is no larger than .0025. For NAEP quality control we use a more stringent confidence level of 99.8 percent, which yields an even more conservative upper bound on the true error rate; with 99.8 percent confidence, we would state that the true error rate in this example is no larger than .0031, rather than .0025.
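The cumulative binomial computation in this example can be reproduced with a short Python sketch. The function names `binom_cdf` and `upper_limit` are hypothetical, and the bisection search for the upper limit is an assumption made for illustration; the text does not say how the limits were actually computed.

```python
from math import comb

def binom_cdf(k, n, p):
    """Probability of k or fewer errors in n characters when the
    true error rate is p (cumulative binomial distribution).
    Intended for small k, as in the example."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def upper_limit(errors, n, confidence, tol=1e-9):
    """Smallest true error rate p at which observing `errors` or fewer
    errors has probability at most 1 - confidence.

    Found by bisection, since the cumulative probability decreases
    as p increases.
    """
    alpha = 1 - confidence
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(errors, n, mid) > alpha:
            lo = mid   # p too small: the observation is still too likely
        else:
            hi = mid   # p large enough: tighten from above
    return hi

# The worked example: five errors in a sample of 5,000 characters.
tail = binom_cdf(5, 5000, 0.0025)      # the text reports .0147
limit = upper_limit(5, 5000, 0.998)    # the text reports .0031
print(f"P(5 or fewer errors | p = .0025) = {tail:.4f}")
print(f"99.8 percent upper limit on the true error rate = {limit:.4f}")
```

As in the text, the database size does not appear anywhere in the calculation; the sample is treated as if it came from an infinite population.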
Calculations of true probabilities based on a combinatorial analysis have been done (e.g., Grant 1964). Even when the sample was as much as 10 percent of a population of 50, the estimate of the probability based on the binomial distribution was not much different from the correct probability. NAEP does not sample at a rate greater than about 2 percent. Thus, the upper limits on the true error rates computed from the binomial distribution are likely to be highly accurate approximations.
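The comparison between the exact combinatorial probability and the binomial approximation can be illustrated with a small Python sketch. The exact probability for sampling without replacement from a finite population is hypergeometric. The population of 50 and the 10 percent sample come from the text, but the number of erroneous items in the population (5) is a hypothetical value chosen only for illustration, as are the function names.

```python
from math import comb

def hypergeom_cdf(k, population, defects, sample):
    """Exact probability of k or fewer errors when `sample` items are
    drawn without replacement from a finite population containing
    `defects` erroneous items."""
    total = comb(population, sample)
    top = min(k, sample)
    ways = sum(comb(defects, j) * comb(population - defects, sample - j)
               for j in range(top + 1))
    return ways / total

def binom_cdf(k, n, p):
    """Binomial approximation: treats the population as infinite."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

# Population of 50, sample of 5 (10 percent), as in the text.
# ASSUMPTION: 5 erroneous items in the population (true rate .10).
population, defects, sample = 50, 5, 5
exact = hypergeom_cdf(0, population, defects, sample)
approx = binom_cdf(0, sample, defects / population)
print(f"exact (hypergeometric) P(0 errors) = {exact:.4f}")
print(f"binomial approximation             = {approx:.4f}")
```

With a sampling rate near 2 percent rather than 10 percent, the two probabilities would be expected to lie closer together, which is the point the paragraph makes.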