Digital Archives

Blog posts about Digital Archives

An Emulation Experiment

Through my technical work and experience with preservation projects, I feel that I have a good grasp of migration as a digital preservation strategy. Unfortunately, I have much less functional experience with emulation as a digital preservation strategy. The concept of running a virtual machine within the physical resources of another is intuitive enough, but I have as yet had no real experience with a full emulation environment.

Audio Encoding Project Milestone

This week I achieved a major milestone in my personal audio encoding project -- after a longer period of time than I planned. Other than a few stragglers, I have managed to completely encode all of my compact discs, including creating use copies and fully describing the objects using embedded metadata. These are all stored on a 0.75 TB NAS appliance configured in a RAID 5 array. Additionally, I have ingested (using the term rather loosely) the use copies into an access system, Ampache.

Metadata Quality Control

As I near the end of the first phase of my audio encoding project I feel the need to share some of the metadata quality control observations that I have collected.

Although ripping my CDs to digital media has been time consuming, it has not been nearly as laborious as checking and correcting the metadata that was automatically gathered during the process. FreeDB as an automatic metadata gathering service has been very helpful, but as I reviewed the corpus of encoded audio, I found many disturbing errors: misspellings, typos, missing articles, missing fields, omission or misrepresentation of international characters, and, of course, the usual discrepancies in case handling, title formatting, and normalized forms.
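
As a rough illustration of how a few of these error classes could be flagged automatically, here is a minimal Python sketch. It is not the process used in this project: the function name check_tags, the list of required fields, and the specific heuristics are all assumptions made for the example, and it expects each track's tags to have already been read into a plain dictionary of strings.

```python
# Hypothetical required fields; the post does not say which tags were checked.
REQUIRED_FIELDS = ("artist", "album", "title", "tracknumber")


def check_tags(tags):
    """Return a list of warnings for one track's tag dictionary."""
    warnings = []
    for field in REQUIRED_FIELDS:
        value = tags.get(field, "")
        # Missing or empty field: nothing further to check.
        if not value.strip():
            warnings.append(f"missing field: {field}")
            continue
        # Leading/trailing whitespace or doubled spaces.
        if value != value.strip() or "  " in value:
            warnings.append(f"stray whitespace in {field}")
        # U+FFFD usually marks a character lost in a bad encoding conversion.
        if "\ufffd" in value:
            warnings.append(f"possible mangled character in {field}")
        # All-caps or all-lowercase values often signal case-handling problems.
        letters = [c for c in value if c.isalpha()]
        if len(letters) > 3 and (value.isupper() or value.islower()):
            warnings.append(f"suspicious case in {field}")
    return warnings


if __name__ == "__main__":
    sample = {
        "artist": "the beatles",
        "album": "Abbey Road",
        "title": "Come Together ",
        "tracknumber": "",
    }
    for warning in check_tags(sample):
        print(warning)
```

Checks like these only surface candidates for review. Misspellings, missing articles, and normalized forms still need a human eye or an authority file to compare against.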

Slashdot: Archiving Digital Data an Unsolved Problem

The headline on a front page post on Slashdot today reads:

"Archiving Digital Data an Unsolved Problem"

which links to this article in Popular Mechanics. For archivists, this headline states the obvious, but the words betray how the technology sector, at least stereotypically, views archives and backups as equivalent. Wading through the comments (and discarding the obligatory comical entries), we find a rather robust discussion on digital preservation, sans academic terminology. All the familiar preservation topics -- migration, emulation, media and file formats, genres, the influence of intellectual property law -- are touched upon, if rather superficially. One commenter brought up the issue of compression in digital archives, but it seems that none have touched the DRM issue (I'll have to remedy that).

That said, however, it is encouraging to see this article highlighted on one of the premier tech blogs as well as in Popular Mechanics. It's going to take quite a bit more exposure to digital preservation problems in the tech community to get the point across -- to impart the long view, as it were -- but this is a good start.

Neil Beagrie on Personal Digital Libraries and Collections

I finally got around to reading Neil Beagrie's D-Lib article, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections" (June 2005), and I regret not having done so sooner (alas, I have a great deal left in my "to read" folder). This article touches on several major themes in my academic pursuits of the last few years, which I will briefly describe here.

Audio Encoding Project Resumes (or, a funny thing happened on the way to 300 GB)

It's been a while (almost 8 months, to be exact) since I have updated this forum on the status of my audio encoding project. I could cite the usual life delays and an unusually busy summer as excuses, but there is more to it.

So, a funny thing happened on my way to 300 GB...

Reflections on the SAA 2006 Annual Conference - Part II

This entry is a continuation of my observations on this year's SAA annual conference. For more, see Part I.

Reflections on the SAA 2006 Annual Conference - Part I

Last week I breezed through Washington, DC to attend the SAA/NAGARA/CoSA Joint Conference. Last year at this time, I attended the SAA conference as a new, student member and, as it was my first ever professional conference, I spent most of the time trying to acclimate myself to the conference ebb and flow. This year I've committed to taking better notes, talking a bit more, and, of course, sharing my observations here.

What's in a Creation Date?

There is a certain perception that often accompanies digital objects and, more broadly, computer systems as a whole. This sort of perception manifests itself when, for example, we hear about how massively compressed digital MP3 files are considered to be "perfect" quality audio or in similar myths concerning the infallibility of all things digital. These perceptions are based on incomplete or inaccurate assumptions about how software, operating systems, or file systems function. My favorite way of stating this is that computers are only as smart as those who designed them – if to err is human, then the same goes for our electronic creations.

When making the transition from paper to digital records, these assumptions are likely to appear in unexpected places. While working on the Joyce collection, we ran headlong into one of these assumptions, made a note of it, then moved on. But I promised that I would look closer at the issue at a later time... so here I go.

PAT Project Lessons Learned, Part 2

I first heard about the Persistent Archives Testbed (PAT) Project at the SAA Annual Meeting in August 2005. The project merges the efforts of several large institutions -- NHPRC, NARA, SDSC, etc. -- in an effort to test data grid technology as a means of federated archival storage. In two of the more recent issues of Archival Outlook published by SAA, a question has been posed to two different groups. The question is roughly: what skills are needed to work with electronic records? The two groups asked were archivists and IT professionals. In light of my recent musings, and the upcoming colloquium in Washington D.C., I took great interest in the most recent article.