PAT Project Lessons Learned, Part 2

I first heard about the Persistent Archives Testbed (PAT) Project at the SAA Annual Meeting in August 2005. The project merges the efforts of several large institutions -- NHPRC, NARA, SDSC, etc. -- to test data grid technology as a means of federated archival storage. In two of the more recent issues of Archival Outlook, published by SAA, a question has been posed to two different groups. The question is roughly: what skills are needed to work with electronic records? The two groups asked were archivists and IT professionals. In light of my recent musings, and the upcoming colloquium in Washington D.C., I took great interest in the most recent article.

Part two of the article series, IT Professionals' Perspectives (Archival Outlook, Mar/Apr 2006, pp. 8 & 27, not yet available online), asks: "what skills / knowledge should IT professionals have to work with archival records and archivists?" Three people were asked (or at least responded to) this question: Adil Hasan of the e-Science Center at the Rutherford Appleton Laboratory in the UK, and Richard Marciano and Reagan Moore of the SDSC. Eureka, I thought – this is exactly the issue I have had running around in my mind lately, and it comes from the perspective that has the most to offer with regard to my personal interest.

Hasan starts off strong, proffering that IT types working alongside archivists must have explicit domain knowledge of archival workflows and concepts. This is basic, however, as any programmer trying to develop applications for any domain must have an understanding of that domain -- be it supply chain management, inventory control, marketing, data mining, or even archives. The important takeaway from Hasan is the notion of developing a "toolkit" for archivists to "[combat] the deluge of electronic information." This is exactly the conclusion I came to working on the Joyce collection last year, and from what I understand, quite a lot of attention is being brought to the issue of archival toolkits.

Marciano also touches on a salient point: that archivists and IT professionals use overlapping vocabularies with different meanings. Archivists do much eye-rolling in response to the ways tech companies use the term "archival," particularly when describing storage media. Less frequent, but no less important, are differences in the meanings of creation date (especially as implemented by a certain dominant computer operating system), "archive" as a type of compressed file, and other notions of metadata and naming conventions that are imposed by well-meaning IT professionals. Marciano correctly asserts that this terminology gap must be bridged by IT personnel who work with archivists and describes the ongoing efforts by Richard Pearce-Moses to that effect.

But although the respondents answered the question asked of them, this is where my enthusiasm faded slightly. Both Marciano and Moore, from what I was able to read of their comments, address the immediate concerns of IT personnel working on archival projects and with archivists, but appear to avoid the notion that there must be some kind of backflow of concepts into the IT field itself in order to truly address the question of long-term preservation of electronic records. Marciano gets close when he says: "If navigated properly, the unsuspected world of archives that unfolds has the potential to draw IT folks in and transform them into champions of the cause." Precisely. And this is the crux of the challenge: how can archivists instill some of the basic, time-worn traditions of records management and preservation for maintaining reliable and authentic records back into the short-term, light-speed horizon of IT? IT professionals are great at manipulating electronic information in any way imaginable, so it is not a stretch to believe that they can effectively and definitively extend its longevity as well, beyond mere "backwards compatibility" and better search methods.

Somehow, we as archivists (or the archivally-aware) must figure out how to imbue programmers with a consciousness of how the artifacts produced by their applications are used, now and into the far future, and how their decisions affect not only the ability of future users to access the information, but the ability of the custodians of that information to work with it as well. Such an awareness will present the IT field with a challenge that it cannot ignore and will conquer (with guidance, of course). In summary, I maintain that we must take a two-pronged approach in order to overcome the problem of electronic information: combat the deluge of information with appropriate tools (as Hasan put it), and expand the utility of artifacts produced by the tech sector as informed by archival practice.