Web Sites as Recordkeeping and ‘Recordmaking’ Systems[1]
Web sites are important sources of
organizational records, and not properly capturing such records in a trustworthy
recordkeeping system is risky
Rick Barry
At the Core
This article
• discusses public-facing Web sites as sources of records
• explains why many ‘recordmaking’ systems are not recordkeeping systems
• examines Web publication content and records management issues
On March 2, 2004, the Washington Post broke a story concerning lead contamination in the
District of Columbia’s drinking water. Neighboring Arlington County, Virginia,
shares the same source, and the article noted discrepancies in the county’s
public-facing Web site. A follow-up front-page story the next day stated,
“Arlington County officials began recommending yesterday that pregnant women
and young children drink only tap water that has been flushed or filtered,
after preliminary tests of water in eight homes showed that five had elevated
levels of lead … As late as mid-afternoon yesterday, the county's Web site
carried the headline ‘Lead Not a Concern in County Water.’ The Web site did not
mention that, on February 23, the county's Public Works Department quietly
began sampling water in Arlington homes built before 1988, the last year lead
solder was used.” Feeling sure that the contamination problem did not affect
Arlington, officials had decided to leave the “Lead Not a Concern in County
Water” announcement on the county Web site until they received results from the
special testing program.
The story illustrates the importance of
Web sites as possibly the only sources of many organizational records and the
risks of not properly capturing such records in trustworthy recordkeeping
systems. The story was picked up in local TV news coverage and received so much
publicity that the U.S. Congress held hearings on the matter. It became the
subject of daily reporting by Post investigative
reporters for the entire month, exposing accountability issues in agencies at
the federal, regional, and local government levels, in their Web site and
e-mail communications.
This example is offered to draw important lessons,
not to criticize Arlington County, which has been an e-government (e-gov) leader
and, as noted later, is taking remedial steps that will put it ahead of most
other organizations. This is the latest
of several news stories in which, to the embarrassment of the organizations
involved, journalists have reported on the sudden and controversial alteration
or deletion of Web content in apparent attempts to “change history.”
The fact is that Web sites
produce official representations to the public. Plainly stated, Web sites make
records, but they do not keep records in ways that meet trustworthy
recordkeeping standards. Chief executive officers (CEOs), attorneys, chief
information officers (CIOs), auditors, and content, records, and other
information managers: Beware.
Web Sites as Recordmaking Systems
The use of Web-based e-business
applications on Web platforms is almost ubiquitous in the private sector. E-gov
applications (including Web-based) have become increasingly prevalent in the
public sector, with mandates at various government levels to implement citizen
access to e-gov services in the 2003-2005 timeframe. Moreover, citizens are
demanding such access. A 2004 Pew Internet & American Life Project survey
report, “How Americans Get in Touch with Government,” found that 97 million adult
Americans (77 percent of Internet users) participated in e-gov in 2003 by
visiting Web sites or e-mailing government officials for transactions (paying
bills, obtaining licenses), obtaining information, or solving problems. This
reflected a growth of 50 percent from 2002.
“E-Gov Alliance” is a collaborative effort
among Washington state communities (Bellevue, Bothell, Issaquah, Kenmore,
Kirkland, Mercer Island, Sammamish, Snoqualmie, and Woodinville) to
provide a unified approach to automated building processes and services. MyBuildingPermit.com is a model example
of Web-based e-gov at the local government level. This multi-jurisdiction
system allows local citizens to apply for, pay for, and receive electrical,
mechanical, plumbing, and other permits – all Web-based public records – for
each of the participating cities. The customer-friendly system distributes system
costs across participating governments, significantly reducing their individual
total cost of ownership (TCO), a primary systems concern of CIOs. Bellevue is
currently spearheading another project to provide content management services
with trustworthy recordkeeping for interested Alliance members.
Just as organizational enterprise
resource planning (ERP), call center, e-mail, and instant messaging
systems are important producers of electronic records, so are organizational
Web sites, intranets, extranets, and other emerging information and
communications technologies (e.g., Web logs (“blogs”) and agent
and virtual reality technologies) when used for business purposes. Blogs are viewed by some
organizations as more effective than Web sites for conducting public information and crisis
management.
Observant archivists and records managers
have been aware of the mounting Web records issue for a few years through such
sources as the National Archives and Records Administration (NARA), research
funded by the National Historical Publications and Records Commission (NHPRC),
and related research and implementation work in other organizations. An NHPRC
study by Charles R. McClure and J. Timothy Sprehe of federal and state
government organizations found many disparities where dynamic Web-site contents
(records) were more up to date than the “official” records. For example:
In Michigan, the
State Administration Board is putting official minutes of meetings up on a Web
site, knowing that no print version of the minutes exists … the prevailing
opinion is that most information on state Web sites is … unimportant from a
recordkeeping standpoint … In contrast … federal agencies exhibited consensus
that informational materials were appearing on Web sites that qualified as
official records. The materials in question were “original” … not copies of
materials available in some other medium.
Most recordmaking systems are not
sufficiently robust to preserve the principal characteristics of records. Nor
are they necessarily recordkeeping systems.
Web Sites as Recordkeeping Systems
Although recordkeeping laws and standards
do not always explicitly address electronic records, virtually all recognized
definitions of the word “record” embrace or do not exclude electronic records
including Web content. ISO 15489
Information and Documentation – Records Management does address electronic
records. It “applies to the management of records, in all formats or media,
created or received by any public or private organization in the conduct of its
activities.” It further states, “records created in the public domain, such as
the World Wide Web, require a broad range of contextual information.”
As with other digital records, some Web-based records will be of
long-term evidentiary, secondary information, research, corporate or collective
memory value to the organization. Those will require a “continuing” (i.e.,
indefinite) retention period and rigorous architectural and technological
platforms to survive multiple software version updates and new system
migrations.
ISO 15489 defines a records system as an “information system which captures, manages, and
provides access to records through time.” A trustworthy recordkeeping system
captures, protects, preserves, and provides ready access to records, possibly
for many decades or indefinitely, and serves as the primary source of business
documentation. In addition to a record’s actual content, it preserves its
structure, business context, and association with other like records. It
preserves a record’s authenticity (it is what it purports to be), reliability
(accurate representation by a knowledgeable source), integrity (complete and
unaltered) and usability (can be located, retrieved, presented, accessed,
interpreted, and understood over time). Achieving this level of trustworthiness
requires more rigorous functionality than most automated systems possess.
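As a rough illustration only, the capture and integrity requirements described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any certified product or of the systems used in Bellevue or Arlington: the names WebRecord, capture, and is_unaltered are hypothetical, the metadata fields are a bare minimum chosen for the example, and a SHA-256 digest stands in for whatever fixity mechanism a real recordkeeping system would use.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WebRecord:
    """Hypothetical record: captured Web content plus minimal business context."""
    url: str          # where the content was published
    content: bytes    # the page exactly as it was served
    creator: str      # business unit responsible for the content
    captured_at: str  # capture time in ISO 8601 format (UTC)
    sha256: str       # fixity value computed at capture time


def capture(url: str, content: bytes, creator: str) -> WebRecord:
    """Create a record and fix its digest at the moment of capture."""
    return WebRecord(
        url=url,
        content=content,
        creator=creator,
        captured_at=datetime.now(timezone.utc).isoformat(),
        sha256=hashlib.sha256(content).hexdigest(),
    )


def is_unaltered(record: WebRecord) -> bool:
    """Integrity check: does the stored content still match its digest?"""
    return hashlib.sha256(record.content).hexdigest() == record.sha256
```

A record captured this way carries its own evidence of integrity: if the stored content is later changed, is_unaltered returns False. Authenticity, reliability, and long-term usability require far more than this (access controls, audit trails, format migration), which is the point of the paragraph above.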
The main practice in recent years to
address electronic records (beyond printing them out) has been integration of a
DoD 5015.2-certified records management application (RMA) with
an existing enterprise document management system (EDMS). This has not always
turned out to be as effective as advertised. Most RMA/EDMS integrations were
unable to take account of Web records without adding still another software
layer of tricky integration.
By contrast, Bellevue and Arlington County
governments recently opted to procure enterprise content management (ECM)
systems that were also certified as 5015.2-compliant. Bellevue’s City Manager
took the further important steps of officially endorsing ISO 15489 and DoD
5015.2 as regime and software-level enterprise city standards. At present,
there is no certifying authority for 15489 as there is for 5015.2, although
Standards Australia is developing a compliance suite against which
organizations adopting 15489 may be assessed. While it is still early, the
Bellevue and Arlington approach of implementing a recordkeeping ECM has the
potential for significantly reducing TCO, making for better capture, access,
and management of records and other documents in digital, paper, and other
analog forms while being more attractive to IT, archives, records management,
and finance.
The timeline for seeing more than a few
examples of this kind of implementation approach will depend in large part on
how quickly the CEO and IT communities pick up on two principles:
• Legacy records and increasing volumes of current and future electronic records are major elements of the organization’s intellectual capital.
• Web sites are among the key organizational recordmaking systems that are not recordkeeping systems, which places organizations at risk for what Kahn and Blair, in Information Nation, call TCF (total cost of failure), or the cost of compliance failure.
Web Site Records Management Issues
Web Content and Records Management
The term “content management” was
initially limited to the management of Web publishing. This has changed as the
understanding of ECM has matured to include all enterprise content and with
advances in ECM technology that make it possible to do that. This approach is
exemplified in Bellevue and Arlington. Because a high percentage of enterprise
content is records, it is essential that the management of content/records be integrated
at one or more levels – organization, policy, systems, standards, procedures,
and training.
To illustrate using the Arlington County
example, Web site style, content standards, and publishing were being managed
under the county library director while content creation responsibility was
distributed at the department level. The county’s CIO understood the
relationships between enterprise content and records management and thus saw
the importance of integration at the ECM system level. But because there had
not yet been adequate integration at the other levels, there were no
policies/procedures requiring preservation of Web records in a trustworthy
recordkeeping system. Consequently, when the lead-contamination story broke in
the Post and the contentious Web content (“Lead Not a Concern in County Water”)
was removed from the Web site and replaced,
no official copy of the original announcement was retained in any form.
Organization
The NHPRC study recommends that
organizations provide three separate but closely coordinated roles in the
management of their Web sites:
Webmasters – manage information technology aspects of Web sites
Content managers – create and manage informational content of Web site postings
Records officers – ensure that official records management and archival responsibilities are carried out
Recognizing these roles for
Web sites and establishing responsibilities for each are essential steps toward
risk reduction through coordination of content and records management. Depending
on the culture, size, and staffing of the organization, content creation may be
centralized or decentralized. Moreover, content creation responsibilities may
change under certain circumstances. Where normally content creation
responsibilities might be decentralized, in crisis-management circumstances
they may be elevated to a higher, centralized, multi-disciplinary authority and
revert when the crisis is over.
In the Arlington example, the “Lead Not a
Concern” announcement that the Post
cited had been created by the content manager in the Department of Public Works
(DPW). When the Post reported that
Arlington had undertaken special drinking water tests while that announcement
was still on its Web site, that content was immediately removed from the DPW
home page. Responsibility for information releases on this subject shifted to
the public information office under a multi-disciplinary team that included
managers from DPW and the health and legal departments. The team removed the
DPW page, replaced it with content on the county home page advising citizens of
testing results it had received the same day that showed elevated lead levels
in five of eight residences, and posted precautionary measures. The case
illustrates the risky nature of withholding information that is contradictory
to Web-published information, especially in government organizations where such
information is easily leaked and can become the source of embarrassment and
citizen cynicism when revealed.
Whether content creation is centralized or
decentralized, Web publishing and standards, including the look and feel of pages
throughout the site, should be centralized to maintain the organization’s
“branding” so that public users will know that they are still browsing the same
organization’s Web pages. McClure and Sprehe noted numerous cases of multiple
domain names within the same agency, compounding difficulties in coordinating
Web-site content and style and leaving public users uncertain about
relationships (if any) of one site to another.
Where the organizational
culture values its records as prime intellectual assets, it may place Web
publishing, standards, and recordkeeping functions effectively under the CIO.
If the organization values its records only as a means of reducing risk, it may
place the archivist and records manager function under the chief counsel.
However, these should not be seen as mutually exclusive value sets.
The CIO model is widely used
in the federal government and elsewhere with varying degrees of success. In
some cases, this approach has been seen as a way to better integrate records
and compliance management and to “hard-wire” them into the organization’s
information and technology architecture. In other cases, the CIO has used the
integration to cherry-pick positions out of the records group to further build
the IT group.
Web Policy
However Web content is organized and managed, it is essential that policies for Web
publishing be formulated by a group representing key stakeholders that
addresses Web mastering, content management, and records management
requirements. Stakeholders may include those responsible for content management, archives and records
management, libraries, IT, legal, auditing, and public relations.
Where Web content is decentralized, Web
policymakers should also consider the desirability of procedures for elevating
topic-specific content creation to a centralized multi-disciplinary management
team during crisis situations. Like all coordination, this may result in slower
response times during rapidly changing events. What it loses in time, however,
it likely gains in more accurate information that takes into account the
expertise of key stakeholders.
Managing Public Expectations
Regular users of media Web sites have
become accustomed to Web sites being updated on a real-time basis with the most
up-to-date, complete, and accurate information. The best news-media Web sites
invest in the skills and technology necessary to do this, because publishing
current information and research constitute their core products and
competencies. For major newspapers, their print version is not much more than a
snapshot of their Web site at pre-determined publication times. Until such time
as organizations recognize and resource information as a core product/service,
this is an expectation that few business or government Web sites can live up
to.
Thus, when government Web sites are seen
to be slow with their updates, the public may view it as government
stonewalling or covering up. It is therefore critical that Web sites display
easily visible notices to avoid setting public expectations they
cannot meet, and avoid using content update/revision dates that do not reflect
content changes. On the other hand, use of blogs can improve an organization’s ability
to respond more quickly to rapidly changing events during command-and-control
situations.
Web Content Dating, Removal and Destruction
Web content dating, removal and
destruction are among several Web site standards that must be addressed. They are
open to considerably different treatment by different content managers in ways
that can have serious recordkeeping consequences.
Some Web sites do not date their content.
Some carry the current date on the home page only. Others use different
conventions on different pages. Individual content managers may use different
conventions for similar announcements. To illustrate, again using the Arlington
County example, its Department of Environmental Services (DES) FAQ on “Drinking
Water Information” (www.arlingtonva.us/Departments/EnvironmentalServices/uepd/waterops/EnvironmentalServicesWaterops.aspx)
was not dated as this publication went to press. As the FAQ was revised several
times during the lead-contamination incident, it probably should have been
clearly marked with correct “Updated” or “Revised” dates for concerned citizens
visiting it daily. However, another DES page (www.arlingtonva.us/departments/EmergencyManagement/emergency/EmergencyManagementEmergencyIsabelWater.aspx)
regarding Hurricane Isabel (September 2003) showed whatever date and time the page was
opened or refreshed, yet it was labeled “Updated” even though it was an unchanged,
year-old announcement. Other pages follow the same practice but label the dates
“Revised.” Perhaps this simply reflects a lack of standards, or perhaps it is meant to give the
appearance of daily updates; either way, the practice both
misrepresents reality and creates higher public expectations than can be met.
It is also inconsistent that a year-old emergency hurricane announcement would
remain on the Web site while the controversial “Lead Not a Concern”
announcement would be removed and destroyed without retaining a copy. Policy
should require appropriate, consistent standards for such matters as
content/page dating, removal and destruction. These considerations are
essential to proper Web-site recordkeeping, as are the appraisal and
designation of Web-site disposition management schedules, preferably through a
hands-off archival authority.
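To make the dating problem concrete, here is a minimal sketch of one possible convention: a page’s “Updated” date changes only when its content actually changes, never merely because the page was opened or refreshed. The class name PageDating, its methods, and the label wording are all hypothetical, invented for this illustration; they do not describe how the Arlington County site or any particular content management system works.

```python
import hashlib
from datetime import date
from typing import Optional


class PageDating:
    """Hypothetical helper: tracks when a page's content last truly changed."""

    def __init__(self, content: str, published: date) -> None:
        self._digest = self._hash(content)
        self.published = published
        self.updated: Optional[date] = None  # stays None until a real revision

    @staticmethod
    def _hash(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def publish(self, content: str, today: date) -> bool:
        """Record a (re)publication; return True only if the content changed."""
        digest = self._hash(content)
        if digest == self._digest:
            return False  # a refresh or re-render is not a revision
        self._digest = digest
        self.updated = today
        return True

    def label(self) -> str:
        """Date text to display on the page."""
        if self.updated is None:
            return f"Published {self.published.isoformat()}"
        return (f"Published {self.published.isoformat()}; "
                f"updated {self.updated.isoformat()}")
```

Under this convention, an unchanged year-old announcement keeps showing its original publication date, while a page revised several times during an incident shows the date of its latest real revision. A policy would still need to decide whether each superseded version is itself captured as a record before it is replaced.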
Final Analysis: Web Content Is a Record
The rapid uptake of e-business and e-gov
applications using Web-publishing systems has outpaced the ability of many
organizations to properly manage the records produced in these systems. Often
this is coupled with a lack of appreciation in the executive corridors that Web
sites even produce records. So long as this technology is used for
customer-facing and public-facing business/representational purposes, the
content and transactions on such sites constitute organizational records and
therefore must be captured into, and preserved and managed in, paper-based or
electronic records systems. Most such applications are adopted to reduce
paperwork, and some include multimedia content not amenable to recording on
paper.
For most organizations, this means that
integration of Web content and electronic records management is essential.
Failure to integrate them puts the organization at considerable legal, regulatory, and
ethical risk and opens it to alienation of its client and public bases.
Moreover, it robs the organization of one of its most precious assets –
hard-earned and well paid-for institutional memory.
Rick Barry is a management consultant and Principal of Barry Associates, a
consulting firm that specializes in information management and technology, and
electronic archives and records management. Barry is content manager for
www.mybestdocs.com and a co-founder of OpenReader™, a cooperative
project to create an open, next-generation software built on XML and related
open standards for reading multimedia and digital publications and for
long-term preservation of, and continued easy access to, multimedia digital
documents. He may be contacted at RICKBARRY@aol.com.
References
“A Very Brief Look at Blogging for the Uninitiated
Executive.” Global PR Blog Week 1.0. Available at www.globalprblogweek.com/archives/a_very_brief_look_at.php
(accessed 9 September 2004).
Gowen, Annie. “Arlington Issues Warning on
Lead in Water.” Washington
Post, 3 March 2004.
___. “As Fears Grow, Arlington Tests Water
for Lead, D.C. Treatment Plant Supplies County Homes.” Washington Post,
2 March 2004.
Gupta, Amarnath. “Preserving Presidential
Library Websites, A Case Study with the Franklin D. Roosevelt Library, Museum
and Digital Archives.” San Diego Supercomputer Center, SDSC TR-2001-3, 18
January 2001. Available at www.sdsc.edu/TR/TR-2001-03.pdf
(accessed 10 September 2004).
Kahn, Randolph and Barclay T. Blair. Information Nation: Seven Keys to
Information Management Compliance. Silver Spring, Maryland: AIIM
International, 2004.
McClure, Charles R. and J. Timothy Sprehe.
“Guidelines for Electronic Records Management of State and Federal Website.”
Washington, D.C.: National Historical Publications and Records Commission,
National Archives and Records Administration, January 1998.
____. “Analysis and Development of Model
Quality Guidelines for Electronic Records Management on State and Federal
Websites.” Washington, D.C.: National Historical Publications and Records
Commission, National Archives and Records Administration, January 1998.
“How Americans Get in Touch with
Government.” Washington, D.C.: Pew Internet & American Life Project, May
2004. Available at www.pewinternet.org/pdfs/PIP_E-Gov_Report_0504.pdf
(accessed 10 September 2004).
International Organization for Standardization. Information and documentation – Records
management – Part 1: General. Available at www.iso.org/iso/en/CombinedQueryResult.CombinedQueryResult?queryString=iso+15489
(accessed 9 September 2004).
___. Information and documentation – Records management – Part 2: Guidelines.
Jones, Virginia. “Protecting Records: What
the Standards Tell Us.” The Information Management Journal 37
(March/April 2003).
Milbank, Dana. “White House Web
Scrubbing.” Washington Post, 18
December 2003.
www.mybestdocs.com. The above-cited
NHPRC reports and other papers, including papers on Web records implementation at the
Smithsonian Institution, the San Diego Supercomputer Center project, the University of
Melbourne Web Archiving Strategy Project (WASP), and the MIT DSpace Project, are
accessible at www.mybestdocs.com in the Hot Topics/Content Management and
Preservation section.
[1] This paper was originally published in The Information Management Journal,
the professional journal of ARMA, Vol. 38, No. 6, Nov/Dec 2004,
and is published here with the kind permission of the IMJ.