Through my technical work and experience with preservation projects, I feel that I have a good grasp of migration as a digital preservation strategy. Unfortunately, I have much less functional experience with emulation as a digital preservation strategy. The concept of running a virtual machine within the physical resources of another is intuitive enough, but I have as yet had no real experience with a full emulation environment.
It's time again for another installment in my ongoing audio encoding project saga. For some time now I have been on the verge of the next phase of the project, which involves encoding the remaining analog sound objects in my collection, specifically cassette tapes and vinyl records. Procrastination, combined with a serious dose of being busy with other things, has delayed my progress on this phase of the project, but one technical aspect has also proved crucial.
I've volunteered at the local National Archives branch for over a year now. Over this time I have picked up on an acute sense of lament over the dwindling number of researchers and members of the public who make the trip out to the archives to do research. Indeed, it is not uncommon for me to enter a virtually empty research room during my Thursday afternoon visits.
What is seldom spoken of, but is vitally central to the issue, is the instant information gratification that the general populace receives from their increasingly ubiquitous internet connections. The reams of information available through Google, or the convenience of accessing Ancestry.com from home (instead of for free at the archives), keep them at home and only add insult to the injury of fewer patrons.
This week I achieved a major milestone in my personal audio encoding project -- after a longer period of time than I planned. Other than a few stragglers, I have managed to completely encode all of my compact discs, including creating use copies and fully describing the objects using embedded metadata. These are all stored on a 0.75 TB NAS appliance configured in a RAID 5 array. Additionally, I have ingested (using the term rather loosely) the use copies into an access system, Ampache.
It has been well over a year since my last digital storage update, and though there has not been any earthshaking new technology announced within that time, there has nevertheless been some advancement in several areas that I would like to address.
As I near the end of the first phase of my audio encoding project I feel the need to share some of the metadata quality control observations that I have collected.
Although ripping my CDs to digital media has been time consuming, it has not been nearly as laborious as checking and correcting the metadata that was automatically gathered during the process. FreeDB as an automatic metadata gathering service has been very helpful, but as I reviewed the corpus of encoded audio, I found many disturbing errors: misspellings, typos, missing articles, missing fields, omission or misrepresentation of international characters, and, of course, the usual discrepancies in case handling, title formatting, and normalized forms.
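To give a feel for the kind of review this involves, here is a minimal sketch of an automated first pass over one track's tags. It is not the process I actually used. The field names, the `check_track` helper, and the specific rules are all assumptions made for illustration, and a script like this only flags candidates for a human to look at.

```python
import re

# Hypothetical set of fields every track is expected to carry.
REQUIRED_FIELDS = ("artist", "album", "title", "year")


def check_track(tags):
    """Return a list of suspected problems in one track's tag dictionary."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = tags.get(field, "")
        if not value.strip():
            problems.append(f"missing field: {field}")
            continue
        # Leading/trailing spaces or doubled spaces usually indicate a typo.
        if value != value.strip() or "  " in value:
            problems.append(f"stray whitespace in {field}")
        # U+FFFD is what a decoder leaves behind when it could not
        # interpret a character, e.g. a mangled accented letter.
        if "\ufffd" in value:
            problems.append(f"garbled character in {field}")
        # Entirely upper- or lower-case text suggests sloppy case handling.
        if value.isupper() or value.islower():
            problems.append(f"suspicious case in {field}")
    year = tags.get("year", "").strip()
    if year and not re.fullmatch(r"\d{4}", year):
        problems.append(f"malformed year: {year}")
    return problems


if __name__ == "__main__":
    sample = {
        "artist": "the beatles",
        "album": "Abbey Road",
        "title": "Come Together ",
        "year": "1969",
    }
    for problem in check_track(sample):
        print(problem)
```

Misspellings and missing articles are beyond simple rules like these and still call for reading the tags by eye.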
The headline on a front page post on Slashdot today reads:
"Archiving Digital Data an Unsolved Problem"
which links to this article in Popular Mechanics. For archivists, this headline states the obvious, but the words betray how the technology sector, at least stereotypically, views archives and backups as equivalent. Wading through the comments (and discarding the obligatory comical entries), we find a rather robust discussion on digital preservation, sans academic terminology. All the familiar preservation topics -- migration, emulation, media and file formats, genres, the influence of intellectual property law -- are touched upon, if rather superficially. One commenter brought up the issue of compression in digital archives, but it seems that none have touched the DRM issue (I'll have to remedy that).
That said, however, it is encouraging to see this article highlighted on one of the premier tech blogs as well as in Popular Mechanics. It's going to take quite a bit more exposure to digital preservation problems in the tech community to get the point across -- to impart the long view, as it were -- but this is a good start.
I finally got around to reading Neil Beagrie's D-Lib article, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections" (June 2005), and I regret not having done so sooner (alas, I have a great deal left in my "to read" folder). This article touches on several major themes in my academic pursuits of the last few years, which I will briefly describe here.
On (roughly) the 50th anniversary of the invention of the hard drive, Tom's Hardware interviews Seagate's Senior Field Applications Engineer Henrique Atzkern (Quo Vadis, Hard Drive? The 50th Anniversary of the HDD). In it, we catch a glimpse of some of the ideas being explored for increasing hard drive density, speed, and reliability, among other things. Parsing through the acronym alphabet soup and surface technicality, one thing remains clear: hard drive manufacturers are not running out of ideas for increasing storage capacity, so we can expect to continue seeing dramatic leaps in storage capacities.
It's been a while (almost 8 months, to be exact) since I have updated this forum on the status of my audio encoding project. I could cite the usual life delays and an unusually busy summer as excuses, but there is more to it.
So, a funny thing happened on my way to 300 GB...