An Appraisal Archivist's Perspective on the Implementation of Electronic Document Management Systems

by Catherine Bailey

Presented at the Society of American Archivists Annual Meeting
Washington, D.C.
31 August 2001

Note: The opinions expressed in this article are those of the author, and do not represent the views of the National Archives of Canada.

Introduction:

In their two presentations, my colleagues Ineke Deserno and Lisa Polisar have outlined some of the issues and practical concerns as their organizations implement an electronic document management system. What I would like to do now is bring in my own views on the role of the appraisal archivist -- the person whose task it is to determine the archival value of the contents of that system, make recommendations as to what records to preserve, facilitate their extraction from the system, and ensure their long-term preservation. I will touch briefly on a number of issues at three stages: design, implementation, and preservation.

It will probably come as no surprise to anyone in this room who has dealt with electronic records since the early 1980s that I think that archivists can be a truly valuable resource when considering the design and implementation of any new computer system. In fact, Ineke notes this as one of the criteria for a successful implementation. I agree completely, and would add that with the experience that is being gained with EDMSs, it is becoming even more important to make sure that archival concerns are addressed at the design stage, or we risk losing electronic records to "de facto" destruction, a situation where they will not survive extraction or migration between operational platforms. Obviously, this is as much a problem for the organization itself as it is for the appraisal archivist.

When an archivist appraises any electronic system, he or she basically wants to know three things:

  1. What is the nature of the information contained in the system? (its content)
  2. How did that information get into the system, how is it used, and how does it reflect the organization's functions and activities? (information flow, work processes, etc.)
  3. If it is determined that the system contains material of archival value, how can it be removed and preserved? (technical considerations, such as terms and conditions for transfer, or any other technical aspects for long term preservation)

Once something is out of its original software and hardware environment, the issue of preservation becomes more focussed on the long-term concerns which are common to the preservation of all electronic records, regardless of the specific system that created them (to name only a few: the medium on which the records are stored and its expected lifespan, the technological obsolescence of equipment and software, and how to give access to the records, for example through the use of emulation).

I: The Archivist at the Design Stage - Setting up the Software

At the design stage for an EDMS, there are two main issues which are of concern to the appraisal archivist: the file classification system and its use, and the metadata (profiles) that accompany electronic documents.

Issue: File classification system

In my opinion, the most crucial element for any EDMS is a solid file classification system that adequately supports the organization's functions and activities. This may sound simple, but it's surprising how often this issue is not addressed early enough in the system development process.

Consider what happens if you don't have an appropriate file classification system when the filing process becomes automated. You will have frustrated users who will undoubtedly find ways to avoid filing documents if it is difficult to do so, leading to a lack of corporate documentation and ultimately a loss of accountability for operational purposes. We've all heard the observation on the impact of the computer on the documentary heritage of the late 20th century -- the creation of a dearth of preservable information from the late 1980s and early 1990s because records reside in electronic format only on individuals' hard drives, and are not printed and sent to the corporate paper records system. This is not a situation that any new EDMS implementation wants to encourage!

The archivist doing the appraisal after the system has become active also needs to see a solid file classification system. While the theory of macro appraisal leads us to hypothesize what the functions and activities worthy of preservation should be, it is the "micro-appraisal" of the records themselves, including those in electronic form, that allows us to confirm the hypothesis and make recommendations on which specific records need to be preserved to document those functions. If the file classification system does not adequately describe and support those functions and activities, it makes the identification of the archival record very difficult. Take as an example an organization that has undergone many re-organizations of functions but has not updated its filing system to reflect this - there could easily be multiple file blocks with similar-sounding names or descriptions covering the same function, or even file blocks that reflect organizational structures that are long gone but are still being used. If you envision a user being confronted with this situation when filing a document, you can probably also imagine the resulting confusion and potential inconsistency of the documentation within the files, and thus the difficulty an archivist would have identifying files of archival value.

I should point out that a file classification system that works adequately in the paper-based world does not necessarily translate directly and adequately to the electronic world, where the EDMS can be more seamlessly integrated into the work processes. Let me give you a simple example from my own work. In our classification system, the files containing records on custodial matters relating to our holdings are arranged by a unique number denoting the record group/fonds, and the year of the action (e.g., 9540-RG13-1999, or 9540-R188-2001). Because of the different activity levels of the various fonds, custodial files are not generated automatically for all fonds every year, but only when there is documentation to be placed upon them. When we relied upon paper documentation, this meant a file was created when a piece of paper was received by the records office, usually at the end of the process. Now, however, much of our business is conducted by electronic mail, and we have the ability to generate documentation throughout various stages. In our e-mail system, the user is asked if they wish to file the message to the corporate system; if the user happens to be attempting to file the very first document for a new year or a new fonds, he/she is met with a "file does not exist" error. So in order to get that information into the corporate system electronically, the user has two options: to send it to the general "in-box" folder, where it will be classified by records office staff when time permits, or to notify the records office staff that a file must be created, have them enter the number into the system, and wait to be notified that it is ready to accept filed documents.
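
To make the scenario concrete, here is a minimal sketch in Python. It is an illustration only, not a description of our actual system: the class name, the method names, and the regular expression for file numbers are all assumptions, modelled loosely on the two example numbers given above.

```python
import re
from dataclasses import dataclass, field

# Assumed pattern for custodial file numbers such as 9540-RG13-1999:
# a file block, a record group/fonds identifier, and the year of the action.
FILE_NUMBER = re.compile(r"^(?P<block>\d{4})-(?P<fonds>[A-Z]+\d+)-(?P<year>\d{4})$")


@dataclass
class CorporateFileSystem:
    """Hypothetical stand-in for the corporate filing system."""
    open_files: set = field(default_factory=set)  # numbers created by the records office
    inbox: list = field(default_factory=list)     # documents awaiting classification
    filed: dict = field(default_factory=dict)     # file number -> list of documents

    def create_file(self, number: str) -> None:
        """Records office staff enter a new file number into the system."""
        if not FILE_NUMBER.match(number):
            raise ValueError(f"malformed file number: {number}")
        self.open_files.add(number)
        self.filed.setdefault(number, [])

    def file_document(self, number: str, document: str) -> str:
        """A user tries to file a document; returns where it ended up."""
        if number in self.open_files:
            self.filed[number].append(document)
            return "filed"
        # The first document for a new year or a new fonds lands here:
        # the file does not exist yet, so the document goes to the
        # general in-box to be classified when time permits.
        self.inbox.append((number, document))
        return "inbox"
```

In this sketch the missing file is handled by routing the document to the in-box rather than by raising an error, which corresponds to the first of the two options described above. The second option would be a call to `create_file` by records staff before the user tries again.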

Organizations need to begin their implementation by analysing work processes and their file classification system. From Ineke's presentation, I noted that during the implementation phase, the UNHCR stressed the integration of a classification plan (on which they will be working with their users to ensure it meets their needs), records schedules, and the long term management of captured records. This is a good indication that the organization has thought carefully about the infrastructure necessary to implement an electronic system - and believe me, not every organization does this prior to rollout.

Issue: Metadata elements (profiles)

The second issue of concern to an appraisal archivist is the profile or metadata elements which accompany electronic documents. At the design stage, specific elements of a document profile need to be considered. Archivists will be able to provide information that is necessary for the authenticity and reliability of the records. This doesn't even have to be particularly complicated -- it could be something as simple as requesting the addition of fields to the profile, for example, for "date created," as opposed to "date profiled/filed," or a records disposition authority or retention period. Results of the work of the Authenticity Task Force of the InterPARES project will no doubt have an impact on future system design considerations for EDMSs; a draft for public comment of their document "Requirements for Assessing the Authenticity of Electronic Records" is available on the InterPARES website (www.interpares.org).
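
As a rough illustration of how modest such a request can be, the following Python sketch shows a document profile carrying the fields mentioned above. The field names and the helper function are assumptions made for the example; they are not drawn from any particular EDMS product or from the InterPARES requirements.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DocumentProfile:
    """Hypothetical document profile (metadata record)."""
    title: str
    file_number: str
    date_created: date                           # when the document was written
    date_filed: date                             # when it was profiled/filed
    disposition_authority: Optional[str] = None  # records disposition authority, if known
    retention_years: Optional[int] = None        # retention period, if known

    def filing_delay_days(self) -> int:
        """Days between creation and filing; the two dates are not the same thing."""
        return (self.date_filed - self.date_created).days


def missing_archival_fields(profile: DocumentProfile) -> List[str]:
    """List the archival fields that were left empty on a profile."""
    missing = []
    if profile.disposition_authority is None:
        missing.append("disposition_authority")
    if profile.retention_years is None:
        missing.append("retention_years")
    return missing
```

Keeping "date created" separate from "date filed" costs one extra field, yet it lets a later reader see how long a document existed before it entered the corporate record.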

II: The Archivist at Implementation Stage

From the perspective of the appraisal archivist assessing the content of the system and how it got there, there are three issues for an EDMS: who files records, what records are filed, and whether the system will be fully or partially implemented within the organization.

Issue: Who is going to do the filing of the documents?

In any implementation of an EDMS, an organization usually has two basic options: centralized filing by records personnel, and user filing. Centralized filing obviously gives the most control over the contents of the system, but it has limitations (just one being that there must be sufficient skilled and trained classifiers that the material is handled promptly). User filing may be quicker, but it is less easy to control what goes in, unless there is a process of quality assurance (specifically, the review of document profiles). This is potentially resource intensive, in that it requires not only classification support, but also training.

Issue: What documents are going to be filed?

Having decided who is going to do the filing, the organization has to consider what is going to be filed, and it is here that there can be a significant impact on the nature of the corporate documentation and hence on the later work of the appraisal archivist. Again, there are two options: mandatory filing of all records, and selective filing controlled by the users.

If an organization chooses to implement mandatory filing that is built right into the "save" feature on all programs, then it must consider that in addition to records that support the organization's mandated functions, there will be other items which do not. These latter items, such as personal e-mails and personal documents, should therefore not be preserved within the corporate system. Organizations need to think clearly about how to handle these items early in the system design and implementation process. In one Canadian government organization, which chose to implement mandatory filing of all documents as part of the save feature, the decision was made to create a "personal folder" outside of the corporate system in which each employee can file such items. The personal employee files are clearly identifiable, and can therefore be easily disposed of without seriously impacting the corporate record. But there may be situations less obvious than personal e-mails which should also be addressed, for example, the generation of records during related professional activities which may be sanctioned by the organization but are not part of an employee's job. As an example, I am the Book Review Editor for Archivaria, the journal of the Association of Canadian Archivists. When I correspond with reviewers using my computer at work, an activity that my institution supports, the correspondence is not part of my position as archivist at the National Archives of Canada, and should therefore not be included in our electronic document management system. In our system, because we have optional filing capabilities, I am able to keep it out.

In an environment of selective, user-controlled filing, how do you ensure that the appropriate documentation is filed at all? Policies and procedures will help, but they need to be enforced through some form of quality assurance to be effective. Ineke has noted that one of the shortcomings of the e-mail filing system at UNHCR is that the procedures were not an automated part of the sending/receiving process, and that as a consequence, only 10% of official e-mail was captured, leaving 90% in individuals' mailboxes. I suspect that this situation, which produces an incomplete corporate record, is not unusual for most organizations. The crafting of suitable archival recommendations in these situations becomes nearly impossible, as the archivist would either have to visit each and every person's office for an individual assessment, or resort to the crafting of broadly-based functional terms and conditions which, unfortunately, can be difficult for non-archivists to interpret accurately.

A related issue arises when one considers the long-term implications that scanning incoming correspondence into the EDMS will have for the nature of the corporate record. Is the organization going to scan all incoming mail? Or will there be criteria developed to allow a person to determine what does not get scanned, such as routine mailouts ("junk mail") or professional correspondence? An example of this is a document management system implemented in our federal Department of Indian Affairs and Northern Development, which has chosen to scan incoming mail not defined as "junk mail" according to their written procedures. Will those criteria be maintained and updated to ensure the system captures only that documentation that needs to be in electronic format for the organization to carry out its functions? The appraisal archivist needs to know the results of this decision, for it affects what goes into the EDMS and may mean that a significant amount of electronic documentation within the system is not of high archival value, possibly obscuring the assessment of those records that are.

Issue: Will there be full or partial implementation of the system?

Regardless of the question of whether there will be mandatory or selective filing of documents, another issue of concern both to the organization and to the appraisal archivist is the level of implementation of the system. This is an issue because a partial implementation will result in overlap or a hybrid situation, where corporate documentation may well be in both electronic format and on the paper system. The organization must have a plan to deal with a hybrid situation as part of the implementation so that users are notified of the potential overlapping of documentation. For example, paper files could be annotated to show that there is an electronic component, the same way the electronic system should note that there is a related paper file. Without such information, the archivist coming along much later to do the appraisal cannot adequately assess where one system stops and the other commences, and is faced with preserving the hybrid system in both forms, possibly resulting in duplication of archival records. One method of reducing the impact of overlap can be a "print to file" policy for a set period, because it will ensure that there is no loss of records of potential archival value prior to the organization going fully electronic (and designating the system as the official record).

Ineke noted that the electronic record is the official record at UNHCR, but that they acknowledge there are two parallel systems during the transition period. I would like to know more about how they have chosen to deal with these parallel systems, and for how long.

III: The Archivist and Preservation

The preservation of electronic records actually starts long before they ever come to the archives -- it begins when the EDMS is first configured, and the question of the extraction of records is considered (or not!). But there are a couple of key questions that need to be addressed first:

  1. When is the extraction of the archival records initiated - i.e., what is to be the trigger for records to come out of the system?
  2. How are those records actually extracted?

In the paper world, files containing individual documents are "closed," a well established concept with understood ramifications. The process is usually determined by an operational policy; records staff know when to stop placing documentation on a physical file folder, and begin to add records to a new physical volume. They may even add a notation to the file folder itself. This closing of the file provides a clear delimiting point in its life cycle, and a point from which an appropriate retention period may be calculated (e.g., "10 years from the date of last correspondence on file," or "5 years after the file is closed"). Closed volumes can then be moved into less expensive storage areas. As a government agency gathers more and more of these closed files, the lack of appropriate storage space often triggers a transfer of records to the archives.
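
The arithmetic behind such a retention rule is simple, as the following Python sketch suggests. The function names are assumptions made for illustration, and the sketch takes the trigger date (the file closure or the last correspondence) as given, since identifying that date is exactly what the closing of a file provides.

```python
from datetime import date
from typing import Optional


def add_years(start: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to the 28th.
        return start.replace(year=start.year + years, day=28)


def disposition_date(trigger_date: Optional[date], retention_years: int) -> Optional[date]:
    """Date on which the retention period expires.

    trigger_date is the delimiting event, for example the date the file
    was closed. Without it, no disposition date can be calculated.
    """
    if trigger_date is None:
        return None
    return add_years(trigger_date, retention_years)
```

A file closed on 30 June 1991 with a ten-year retention period would come due on 30 June 2001. A file that is never closed has no trigger date, and the function can only return `None`.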

However, in an electronic document management system, some of us have started to ask ourselves, is there a concept that equates to "closing a file," when it is technically possible to keep all electronic documents available on a server, regardless of their classification or age, without causing any problems with storage and retrieval? Without such a concept actually being built into the system, what is there to trigger the calculation of a retention period and therefore the transfer of archival records? Can a trigger exist for an EDMS if it is not linked to a companion electronic records management system that also controls the related paper?

I notice that the UNHCR system has this "trigger," and that all items linked to a given file code (including documents and folders) will automatically get the same retention period as the file. I'd like to know more about how this actually works, since I have not had any real hands-on experience with this.

In terms of implementation and its impact on preservation, will the archivist be involved in piloting or testing of the system prior to full roll-out to ensure that the system meets long term archival needs, particularly the ability to extract records from the system? These capabilities must be considered at the design and implementation stage, or archival preservation becomes a moot point -- if you cannot extract, you cannot preserve, and you have "de facto" destruction. Emphasis must be given to the preservation of the linkage between the document and its contextual metadata, or profile -- one without the other is meaningless.
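
One simple way to keep a document and its profile together during extraction is to place both in a single transfer package, along with a checksum of the content. The Python sketch below illustrates the idea under stated assumptions: the package layout and the function names are hypothetical, not a transfer format used by any archives or EDMS vendor.

```python
import base64
import hashlib
import json
from typing import Tuple


def package_for_transfer(content: bytes, profile: dict) -> str:
    """Bundle a document and its profile into one JSON package."""
    package = {
        "profile": profile,                                  # contextual metadata
        "sha256": hashlib.sha256(content).hexdigest(),       # fixity check
        "content_base64": base64.b64encode(content).decode("ascii"),
    }
    return json.dumps(package, sort_keys=True)


def unpack_and_verify(serialized: str) -> Tuple[bytes, dict]:
    """Recover the document and profile, checking the content is unchanged."""
    package = json.loads(serialized)
    content = base64.b64decode(package["content_base64"])
    if hashlib.sha256(content).hexdigest() != package["sha256"]:
        raise ValueError("content does not match the recorded checksum")
    return content, package["profile"]
```

Because the profile travels inside the same package as the document, neither can be extracted without the other, and the checksum offers a basic test that the content survived the move intact.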

Finally, the long term operational use of records has a serious impact on preservation because of the issue of software migration. The extraction capability for records of archival value that might have been considered and incorporated into the first design of the system must be retained, re-issued, and monitored throughout the life of the system and its successors, or the possibility exists that the records will someday prove to be inextricable and therefore not be preserved.

These are just a few of the issues of concern to archivists responsible for implementing and appraising electronic document management systems. There is clearly much more to learn about the impact of the implementation of these systems, and I look forward to an interesting discussion here today.