Data files developed by the government are frequently very large, consisting in some cases of many millions of logical records. The needs of those who would do secondary analysis of these data have not necessarily been a factor in determining either the content of the data collected or the physical form in which the data are made available. As a result, much of this potentially useful data tends to be incomplete for many research purposes and inadequately documented for others. These problems are not unique to data generated by the government. However, the size and complexity of many government-produced files make solutions more difficult and more expensive. This paper uses the 1970 United States Census of Population and Housing to illustrate both the benefits and the difficulties of using government data bases for secondary analysis.