ENSURING ACCESS OVER TIME TO AUTHENTIC ELECTRONIC RECORDS: STRATEGY, ALTERNATIVES, AND BEST PRACTICES

NATIONAL ASSOCIATION OF GOVERNMENT ARCHIVES AND RECORDS ADMINISTRATORS

JULY 17, 1997

CHARLES DOLLAR

UNIVERSITY OF BRITISH COLUMBIA

This document consists of copies of transparencies used in the presentation and one handout that includes the Table of Contents and Introduction to the study on which the presentation is based. The Introduction provides context to the presentation. Three tables and figures used in the presentation are not included because of technical difficulties.


OVERVIEW

















FOCUS OF THIS REPORT












BIASES AND ASSUMPTIONS

BIASES







ASSUMPTIONS










TERMINOLOGY


















REFORMAT










(REFORMAT CONTINUED)










COPY











CONVERT

CONVERSION IS NECESSARY WHEN THE SOFTWARE ENVIRONMENT IS UPGRADED OR A NEW SOFTWARE APPLICATION IS INSTALLED








(CONVERT CONTINUED)



OR


MIGRATE









(MIGRATE CONTINUED)







(MIGRATE CONTINUED)


WHAT DOES MIGRATION REQUIRE?
















(MIGRATE CONTINUED)

THERE ARE ONLY A HANDFUL OF INSTANCES OF SUCCESSFUL ARCHIVAL MIGRATION OF ELECTRONIC RECORDS.



IN CONTRAST, THERE ARE MANY INSTANCES OF SUCCESSFULLY REFORMATTING, COPYING, AND CONVERTING ELECTRONIC RECORDS.

KEY ARCHIVAL PRESERVATION ISSUES


















(KEY ARCHIVAL PRESERVATION ISSUES CONTINUED)



















(KEY ARCHIVAL PRESERVATION ISSUES CONTINUED)


















(KEY ARCHIVAL PRESERVATION ISSUES CONTINUED)









CONVERT, MIGRATE









LONG-TERM ACCESS STRATEGY

MAINTAINING PROCESSIBLE AUTHENTIC ELECTRONIC RECORDS
MIGRATING PROCESSIBLE AUTHENTIC ELECTRONIC RECORDS

MAINTAINING PROCESSIBLE AUTHENTIC ELECTRONIC RECORDS

















(MAINTAINING PROCESSIBILITY CONTINUED)




















COPY








(COPY CONTINUED)






















CONVERT


 

WHAT MEDIUM TO USE?






(CONVERT CONTINUED)















(CONVERT CONTINUED)









MIGRATE


















(MIGRATE CONTINUED)











LONG-TERM ACCESS POLICY AND PROCEDURE






















(LONG-TERM ACCESS POLICY CONTINUED)












(LONG-TERM ACCESS POLICY CONTINUED)







QUALITY CONTROL PROCEDURES
ASSIGNMENT OF RESPONSIBILITY TO A SPECIFIC UNIT OR INDIVIDUAL






BEST PRACTICES, ALTERNATIVES, AND RECOMMENDATIONS

















(BEST PRACTICES CONTINUED)






















REFORMAT ELECTRONIC RECORDS


































TABLE OF CONTENTS

PREFACE

INTRODUCTION

CHAPTER ONE: CONCEPTUAL FOUNDATIONS

1.1	Background

1.2	Research Projects and Studies

1.21 University of Pittsburgh
1.22 University of British Columbia
1.23 RLG/CPA Task Force
1.24 SESAM
1.25 Nordic Council on Scientific Information
1.26 International Council on Archives
1.27 Summary

1.3	Issues

1.31 Document
1.32 Record
1.33 Authentic Records
1.34 Archiving
1.35 Reformat
1.36 Copy
1.37 Convert
1.38 Migrate
1.39 Technology Obsolescence


CHAPTER TWO: OPTIONS AND ALTERNATIVES FOR ACCESS OVER TIME TO AUTHENTIC ELECTRONIC RECORDS

2.1 Background

2.2 Archival Preservation Domains

2.21 Readable
2.22 Intelligible
2.23 Identifiable
2.24 Encapsulated
2.25 Retrievable
2.26 Understandable
2.27 Reconstructable
2.28 Authentic

2.3 Access Strategy

2.31 Processibility
	2.311 Reformat
	2.312 Copy
	2.313 Convert

2.32 Migration
	2.321 Migration Steps
	2.322 Non-Migration Options
		2.322 Non-Migration Options

2.4 Long-Term Access Strategy: Systems Perspective


CHAPTER THREE: BEST PRACTICES, RECOMMENDATIONS, AND GUIDELINES


3.1 Background

3.2 Policy

3.3 Quality Control

3.4 Transfer

3.5 Storage Environment

3.6 Reformat

3.7 Copy

3.8 Migrate

3.9 Starting An Electronic Records Preservation Program

3.10 Multi-Institutional Cooperation

ANNEX 1 A TECHNOLOGY PRIMER FOR ARCHIVISTS AND RECORDS MANAGERS: RECORD REPRESENTATION, STORAGE, RETRIEVAL, AND PORTABILITY

ANNEX 2 NORDIC COUNCIL MEDIA SELECTION STUDY

ANNEX 3 MEDIA STORAGE COSTS, NATIONAL ARCHIVES OF CANADA

ANNEX 4 ELECTRONIC RECORDS PRESERVATION PROGRAM COSTS, NATIONAL ARCHIVES OF THE UNITED STATES









INTRODUCTION

Future historians are likely to view the last three decades of the twentieth century as a watershed in which the convergence of digital technologies reshaped the information landscape and thereby fundamentally altered how people create, retrieve, use, and view information. This convergence is particularly evident in the telecommunications industries, where audio, traditional print, still pictures, motion pictures, and telephone signals are increasingly stored and retrieved in a common digital base. Traditionally, information objects such as letters, books, audio recordings, maps, photographs, movies, video, and telephony were distinguished by the means of transmission or the carrier of the information, and that distinction supported separate technologies, disciplines, professions, and industries. It is now being eroded. The magnitude of this transformation and its long-term implications for society are barely recognized, much less understood, although many contemporary observers believe that the transformation is similar to what happened with the introduction of writing three millennia ago.

Although few contemporary observers fully understand the magnitude of this transformation, several generalizations can be offered. First, every indication is that reliance on digital information will increase in virtually every segment of society, from the home to government to the workplace. One indication of this increased reliance is in the estimates of the volume of information in digital form. One estimate asserts that the volume of information in digital form is increasing between twenty and fifty percent annually and that by the year 2000 between 600 and 1,000 Petabytes (PB) will have been accumulated. This is the informational equivalent of between thirty-six billion and one hundred billion 500-page books, which is roughly XXX times greater than the number of books published in the twentieth century.

Even assuming that this estimate is off by a factor of one hundred, the volume of information in digital form is enormous and will continue to grow, giving rise to a pervasive environment of digital information and an infrastructure required to support it. It is likely, therefore, that the penetration of digital technologies into the fabric of society and the lives of individuals will parallel that of the telephone. Ironically, that penetration will be complete when we can view and use computers and digital technologies the same way we view and use telephones and telecommunications.

The second generalization concerns the question of how to ensure continuing access to digital material as digital technology changes. Recently, computer scientist Jeff Rothenberg addressed this question in an article, "Ensuring the Longevity of Digital Documents," that appeared in the January 1995 issue of Scientific American. One of Rothenberg's objectives in writing this article was to heighten public awareness of how digital technology obsolescence can make long-term access very difficult, if not impossible. Another objective was to encourage the systematic analysis of the problems of digital technology obsolescence that could lead to the development of tools and concepts that help ensure long-term access to digital information in the midst of a rapidly changing technology environment.

At the time of Rothenberg's article, several major projects were underway to identify critical issues associated with electronic records in general or to address the question of how archives and archivists could mitigate the effects of technological obsolescence and "ensure technological compatibility, flexibility, and migratability?" These projects, which are reviewed in some detail in Chapter One as part of establishing the conceptual foundations for this study, contribute significantly to our understanding of key issues that electronic records pose for archives and other organizations with the responsibility for providing long-term access to them. None of these projects or studies, either individually or collectively, addresses all of the relevant issues and challenges that digital technology obsolescence poses for long-term access to electronic records.

The scope of this study was shaped by six primary considerations. The first consideration is a focus on electronic records that are no longer required for operational use and have been set aside for future use. Although the study addresses the conditions that give rise to the reliability of electronic records, it is assumed that these conditions prevailed at the moment of their creation and were maintained during their operational use.

The second consideration is that the study does not address archival description as a means of preserving the integrity of electronic records by freezing them in time in relation to other electronic records as Luciana Duranti and Heather MacNeil have suggested. Consequently, this study views the preservation of the context of creation and use of individual electronic records as the most effective way of ensuring their integrity.

Third, the study delineates an access strategy that differentiates the migration of electronic records from maintaining their processibility. Implicit in this differentiation are two key factors. The first is that, because migration is so complex, difficult, and costly, the term should be employed within a narrow context of meaning. The second is that too much attention has been devoted to access one hundred or two hundred years from now, when we have no way of knowing what kinds of technology will be available then. Instead, we should do two things. First, we should focus on a much shorter time frame, perhaps on the order of thirty years or so, during which time information technologies are likely to be relatively stable. Second, we should ensure that the way we use digital technologies to support access over time to electronic records minimizes the likelihood of creating intractable problems for future custodians of electronic records.

The fourth consideration involves the issue of whether an archives should be the custodian of electronic records. This report recognizes that the costs of maintaining the processibility of electronic records and then migrating them to new technologies are likely to exceed the human and financial resources available to most archives. Consequently, new organizations that are not "traditional" archives may need to be created in order to provide this service. One possible model, the Northeast Document Conservation Center, is a regional preservation facility that provides preservation services on a cost-recovery basis to archives that do not have the financial resources to support a preservation staff and conservation lab. This study, therefore, employs the concept of a "competent archival entity," which means any competent third party that is under no obligation to the individual or organizational component where the records were created, maintained, and used other than to adhere to best archival practices in order to protect the records from corruption, alteration, or loss. This competent third-party entity can be any designated organizational unit, including the organization's archives, a public archives, or a local or regional service bureau. References in the text to an archival repository should be understood as referring to a competent archival entity.

The fifth consideration is an emphasis on technical issues and problems associated with ensuring long-term access to authentic electronic records, especially non-proprietary information technology standards, and practical guidelines for media selection and storage. Non-proprietary information technology standards are especially important because they help support open systems, applications connectivity, and document portability, which in the long run may significantly enhance the prospects of long-term access to electronic records. It must be emphasized that only those international or non-proprietary technology standards that have a substantial marketplace implementation merit consideration for inclusion in the standards recommended for incorporation into a long-term access strategy for electronic records. This precludes general consideration of Abstract Syntax Notation One (ASN.1), for example, in a long-term access strategy for electronic records.

The sixth consideration involves a focus on products, tools, and techniques that have an established commercial presence. This excludes research and laboratory products with great potential, such as HD-ROM, that have not been established as viable commercial products. In the future, HD-ROM and other such products may in fact become huge commercial successes, but it is imprudent to base a long-term access strategy on unproven products, tools, and techniques.

These issues and problems are covered in three chapters. Chapter 1 lays out the conceptual foundations of the study, which include a review of six research projects and a discussion from an archival science perspective of nine concepts: document, record, authenticity, copy, reformat, convert, archive, migrate, and technology obsolescence. The concept of migrate is especially important because it is viewed as part of an access strategy that addresses a specific facet of digital technology obsolescence rather than as a macro strategy as suggested by the Task Force Report on Archiving Digital Information.

Chapter 2 discusses an access strategy for electronic records over time, with the proviso that there is no "one-size-fits-all" strategy that will accommodate all formats of digital materials or all circumstances under which access can be supported. Therefore, this chapter reviews alternative approaches from which an organization may select methods that are appropriate for its requirements and resources and that are as effective as possible for the particular formats of materials under consideration. The context for this review is a set of issues that takes into account the concepts and issues reviewed in Chapter 1. This chapter concludes with a long-term access logical process model and a summary of data standards that can support access for a variety of formats. The logical process model is particularly useful because it summarizes the functions and activities that can help achieve the objective of providing long-term access to authentic electronic records.

Best practices and recommended guidelines are the focus of Chapter 3. The section on "best practices" draws upon the experience of several national archives in mounting electronic records programs. The chapter offers a number of recommendations and guidelines that organizations should take into account in formulating how they will ensure long-term access to authentic electronic records in their custody. Several of these recommendations are set within the framework of the European Union, but in fact they are generalizable to a wide variety of settings and circumstances.

The study concludes with four Annexes, the first of which is an information technology primer whose target audience is archivists and records managers who may be interested in detailed explanations of certain digital technology issues or may find it useful as a reference source. The primer examines technical problems of electronic records within the context of data representation of records, the structure of records, the storage of records, the retrieval of records, and the portability of records. This primer also reviews a number of technology issues associated with each of the problems listed above, especially the identification of international, national, and industry standards that can help minimize certain impediments to long-term access to electronic records. The primer may be used as a standalone document or as a reference source for terms used in the body of the study. Explanations of terms and concepts that are in italics can be found in the primer. Annex 2 consists of an excerpt from a study conducted by the Nordic Council that summarizes the criteria and evaluation of selected digital storage media. Annexes 3 and 4 include cost data for the preservation of electronic records from the National Archives of Canada and the National Archives of the United States, respectively.