<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://thomas.kiehnefamily.us"  xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>infoSpace - Digital Preservation</title>
 <link>http://thomas.kiehnefamily.us/taxonomy/term/13/0</link>
 <description></description>
 <language>en</language>
<item>
 <title>An Emulation Experiment</title>
 <link>http://thomas.kiehnefamily.us/an_emulation_experiment</link>
 <description>&lt;p&gt;Through my technical work and experience with &lt;a href=&quot;from_floppies_to_repository_a_transition_of_bits&quot;&gt;preservation projects&lt;/a&gt;, I feel that I have a good grasp of migration as a digital preservation strategy.  Unfortunately, I have much less functional experience with emulation.  The concept of running a virtual machine within the physical resources of another is intuitive enough, but I have as yet had no real experience with a full emulation environment.  Recently I thought about recovering some classic-era Macintosh files that I have had in storage and figured that the job could make for an excellent hands-on emulation exercise.&lt;/p&gt;
&lt;p&gt;I used Macintosh computers from their introduction in 1984 until 1998.  My last Mac was a 16 MHz 68030 machine with 8 MB of RAM and an 80 MB hard drive running Mac OS 7.5.  Before disposing of the machine in 2005 (donating it to Goodwill), I copied the entire contents of the hard drive – system software, applications, and all – to a 100 MB ZIP disk. Simply using one of the several Mac-to-Windows utilities would suffice for recovering many of the documents that are cross-platform (images, raw text, etc.) or that could be migrated (Microsoft Word, Excel, etc.), but some documents might be better served by using the original application environment to make the conversion to a cross-platform format.&lt;/p&gt;
&lt;p&gt;A number of programs turned up during my initial search for Mac emulators, including &lt;a href=&quot;http://www.emulators.com/softmac.htm&quot;&gt;Softmac&lt;/a&gt;, &lt;a href=&quot;http://pearpc.sourceforge.net/&quot;&gt;PearPC&lt;/a&gt;, &lt;a href=&quot;http://www.vmac.org/&quot;&gt;vMac&lt;/a&gt;, &lt;a href=&quot;http://shapeshifter.cebix.net/&quot;&gt;Shapeshifter&lt;/a&gt;, &lt;a href=&quot;http://www.ardi.com/executor.php&quot;&gt;Executor&lt;/a&gt;, and &lt;a href=&quot;http://basilisk.cebix.net/&quot;&gt;Basilisk II&lt;/a&gt;.  These programs run the gamut from proprietary to open source, and they differ in the hardware environments they emulate and the platforms they run on.  Additionally, many of them have not been maintained for some years.  For my experiment, I chose the &lt;a href=&quot;http://gwenole.beauchesne.info/projects/basilisk2/&quot;&gt;Windows port of Basilisk II&lt;/a&gt; since it met my basic criteria of a free, open source program that is still reasonably current.&lt;/p&gt;
&lt;p&gt;The basic concept behind emulators is that they provide hardware encapsulation and interfaces to allow an operating system (OS) to run in a non-native environment.  The system requirements for these emulators are quite modest for current computing hardware; however, there are some extra software requirements that are somewhat peculiar.  Many of the programs mentioned above require a copy of the Mac ROM BIOS to run the emulator at all, and in every case a complete copy of the emulated OS is required in order to run software within the emulator.&lt;/p&gt;
&lt;p&gt;The ROM BIOS is a physical chip in the Macintosh hardware that contains the basic machine instructions used by the OS, and Apple considers its contents proprietary code.  The emulators avoid copyright infringement by requiring the user to provide a copy of the ROM rather than embedding one in the software, which would violate Apple&#039;s intellectual property rights. The ROM can be obtained legally in one of two ways: extract the ROM BIOS from a functional Mac of the correct vintage for the OS version to be emulated; or purchase a ROM card with an actual Macintosh ROM chip from a commercial vendor.  The preservation-minded among us can already see issues for future emulation efforts. Incidentally, the Copyright Office has allowed &lt;a href=&quot;http://www.copyright.gov/1201/&quot;&gt;exceptions to copyright&lt;/a&gt; and the anti-circumvention provisions of the Digital Millennium Copyright Act (DMCA) in certain circumstances for archives and libraries, which would seem to cover this very situation.  (The exception will be in force until October 2009, at which time it will have to be renewed... would it not be nice to have a permanent exception?)&lt;/p&gt;
&lt;p&gt;As for the OS itself, there are also two ways to get a copy: 1) have an existing system disk or a copy of a system folder; or 2) get a disk image from another source.  Apple still offers some older system software disk images, installers, upgraders and the like for download, and it is fairly easy to find pre-made Mac system disk images for emulators out on the net.  Fortunately, I already have a system copy on my ZIP disk, which will run so long as the emulation environment is of a vintage that supports the OS.&lt;/p&gt;
&lt;p&gt;The only issue is how to copy the contents of the &lt;a href=&quot;http://en.wikipedia.org/wiki/Hierarchical_File_System&quot;&gt;HFS-formatted&lt;/a&gt; ZIP disk, using Windows, to a place where the emulator can access it.  There are two steps to this: 1) creating an HFS volume within the Windows filesystem, and 2) accurately copying the contents of the original HFS media to the new volume, preserving all OS-specific aspects of the data.  For Macs, this means preserving both the resource and data forks of the files.  (See the &lt;a href=&quot;from_floppies_to_repository_a_transition_of_bits&quot;&gt;Joyce project report&lt;/a&gt; for more on Mac files and HFS.)&lt;/p&gt;
&lt;p&gt;Basilisk II uses disk volume images, called hardfiles (file extension: HFV), to simulate an HFS volume within the emulator. The Windows GUI (and presumably the command line tools for the non-Windows versions of Basilisk) can create a raw hardfile, but cannot copy anything into it. Fortunately, there is a free program for Windows called &lt;a href=&quot;http://fenestrated.net/~macman/stuff/HFVExplorer/&quot;&gt;HFVExplorer&lt;/a&gt; that can create HFV files and view or manage their contents by copying to or from Windows volumes or other HFS volumes that are accessible to Windows (including CD-ROMs, floppies, SCSI, removables, etc.).&lt;/p&gt;
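&lt;p&gt;Under the hood, a hardfile is simply a raw, fixed-size block image; the HFS structures are written into it afterwards, either by HFVExplorer or by the emulated Mac OS when it initializes the volume.  As a minimal sketch (in Python, with a hypothetical file name, and assuming that an empty file of the right size is an acceptable starting point for these tools), allocating such a blank image looks like this:&lt;/p&gt;
&lt;pre&gt;
# Allocate a blank hardfile (raw disk image) for Basilisk II.
# Assumption: HFVExplorer or the emulated Mac OS formats it as HFS later;
# this fragment only reserves the space.
HARDFILE = &quot;testbed.hfv&quot;   # hypothetical file name
SIZE_MB = 120               # a bit larger than the 100 MB ZIP source

with open(HARDFILE, &quot;wb&quot;) as image:
    image.seek(SIZE_MB * 1024 * 1024 - 1)   # jump to the last byte...
    image.write(b&quot;\x00&quot;)                    # ...and write it, zero-filling the rest

print(&quot;created&quot;, HARDFILE, SIZE_MB, &quot;MB&quot;)
&lt;/pre&gt;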
&lt;p&gt;Unfortunately, HFVExplorer would not mount my ZIP drive – using either the parallel port or the USB model.  My guess is that, because the program has not been updated since 1999 – before Windows 2000/XP – it is unable to access the removable media correctly.  It is possible that HFVExplorer running on Windows 98 or NT would not encounter this problem, but I am not about to revert to either of those operating systems.  Besides, having to use an older operating system in order to get an emulator to work runs counter to common sense.&lt;/p&gt;
&lt;p&gt;Unable to rectify the issue, I tracked down a copy of &lt;a href=&quot;http://www.mars.org/home/rob/proj/hfs/&quot;&gt;HFSUtils&lt;/a&gt; (&lt;a href=&quot;http://www.student.nada.kth.se/~f96-bet/hfsutils/&quot;&gt;Windows port&lt;/a&gt;), the utility package that HFVExplorer was originally based upon.  HFSUtils is a set of command line tools for mounting HFS volumes and performing basic file management on them.  I mounted the ZIP volume and moved around using the command line with ease.  Copying was laborious, however, because the hcopy program could not recursively copy nested directories. Ideally I would have modified the source to do this, or written a script around HFSUtils, but at that point I might as well figure out how to fix HFVExplorer.&lt;/p&gt;
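&lt;p&gt;For what it is worth, the kind of wrapper script I had in mind would not be complicated.  The sketch below (Python, driving the HFSUtils commands through subprocess) walks an HFS folder tree and calls hcopy on each file in MacBinary mode so that both forks survive the trip through the Windows filesystem.  The volume and path names are placeholders, the volume is assumed to have been mounted beforehand with hmount, and the -1 and -F options to hls are assumed to give one name per line and to mark folders with a trailing colon; treat it as an outline rather than a tested tool.&lt;/p&gt;
&lt;pre&gt;
import os
import subprocess

def hls(mac_path):
    # List one HFS folder; -1 is assumed to give one name per line,
    # -F to append a colon to folder names
    result = subprocess.run([&quot;hls&quot;, &quot;-1&quot;, &quot;-F&quot;, mac_path],
                            capture_output=True, text=True, check=True)
    return [line for line in result.stdout.splitlines() if line]

def copy_tree(mac_path, win_path):
    # Recursively copy an HFS folder to a Windows folder using hcopy
    os.makedirs(win_path, exist_ok=True)
    for entry in hls(mac_path):
        if entry.endswith(&quot;:&quot;):      # folder: recurse into it
            copy_tree(mac_path + entry, os.path.join(win_path, entry[:-1]))
        else:                        # file: -m keeps both forks (MacBinary)
            subprocess.run([&quot;hcopy&quot;, &quot;-m&quot;, mac_path + entry, win_path],
                           check=True)

# Placeholder volume name; mount the ZIP or image first with hmount
copy_tree(&quot;Macintosh HD:&quot;, r&quot;C:\mac_copy&quot;)
&lt;/pre&gt;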
&lt;p&gt;In order to move the experiment along, I proceeded by copying the directories and files individually using hcopy at the command line – the most banal of tasks, to be sure, but quite effective.  I copied files from the HFS ZIP to a location on my Windows hard drive, then used HFVExplorer to copy the files into an HFV volume file.&lt;/p&gt;
&lt;p&gt;Aside from the lack of a recursive directory copy, there were some other annoying problems.  First, any source filenames containing characters outside the standard ASCII range (above code 127) had translation issues.  The problems came from trademark symbols, em dashes, and other special characters used in directory and file names in the Mac OS. When the command line utility rendered these characters in a file list, they appeared as question marks by default.  Even when using some of the program options for hls (the file listing utility), the characters still did not display correctly in the Windows character set.  Attempts to copy files with special characters failed because the DOS command line passed the translated character to the command.  As an aside, it is possible to access HFS nodes (directories and files) by their node ID (or &lt;a href=&quot;http://www.mactech.com/articles/mactech/Vol.02/02.01/HFS/index.html&quot;&gt;node specification pair&lt;/a&gt;), but unfortunately hcopy exposes no means of exploiting this feature.&lt;/p&gt;
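&lt;p&gt;The underlying mismatch is between the Mac OS Roman character set used on the HFS side and the Windows code page assumed by the console.  Purely as an illustration (the file name and byte values below are invented, though the code points are standard Mac OS Roman assignments), the same bytes decode very differently under the two encodings:&lt;/p&gt;
&lt;pre&gt;
# Bytes above 127 mean different things in Mac OS Roman and Windows-1252.
# In Mac OS Roman, 0xAA is the trademark sign and 0xD1 is an em dash.
name_bytes = bytes([0x52, 0x65, 0x70, 0x6F, 0x72, 0x74, 0xAA, 0xD1, 0x31])

print(name_bytes.decode(&quot;mac_roman&quot;))   # Report™—1  (the name as the Mac wrote it)
print(name_bytes.decode(&quot;cp1252&quot;))      # ReportªÑ1  (the same bytes read as a Windows code page)
&lt;/pre&gt;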
&lt;p&gt;I noticed a second issue relating to file metadata, specifically the creation and modification dates of the files copied from the HFS source to the Windows volume.  The copies on the Windows volume showed creation and modification dates equal to the date of the copy operation rather than the original dates.  Fortunately, the files retained their resource forks during the transfer to the emulation environment, meaning that they arrived with their file metadata intact, original dates included.  I&#039;m already well aware of &lt;a href=&quot;whats_in_a_creation_date&quot;&gt;inconsistencies in file dates&lt;/a&gt; from platform to platform, but it would be ideal from a preservation perspective if the hcopy routine were to read the file metadata in the HFS source and set the correct creation date on the Windows copies.&lt;/p&gt;
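&lt;p&gt;If hcopy will not carry the dates across, they could at least be reapplied after the fact.  HFS stores its timestamps as seconds since 1 January 1904, so converting to the Unix epoch is a fixed offset.  The fragment below is only a sketch: the timestamp and path are placeholders standing in for values read from the HFS catalog (or from a MacBinary header), os.utime restores the modification date, and the creation date that Windows Explorer shows would additionally require the Win32 SetFileTime call (available through the pywin32 package).&lt;/p&gt;
&lt;pre&gt;
import os

MAC_EPOCH_OFFSET = 2082844800   # seconds between 1904-01-01 and 1970-01-01

def mac_to_unix(mac_seconds):
    # HFS dates count seconds from 1 Jan 1904; Unix time counts from 1 Jan 1970
    return mac_seconds - MAC_EPOCH_OFFSET

hfs_modify_date = 2975000000        # placeholder: a 1998 date from the HFS catalog
target = r&quot;C:\mac_copy\Report&quot;      # placeholder: the Windows copy of the file

unix_time = mac_to_unix(hfs_modify_date)
os.utime(target, (unix_time, unix_time))   # restore access and modification times
# The creation date needs the Win32 SetFileTime API; os.utime cannot change it.
&lt;/pre&gt;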
&lt;p&gt;Incidentally, there are other programs available for copying HFS volume data to Windows: &lt;a href=&quot;http://www.mediafour.com/macdrive&quot;&gt;MacDrive&lt;/a&gt; and &lt;a href=&quot;http://www.asy.com/scrtm.htm&quot;&gt;TransMac&lt;/a&gt;. Both are commercial software, but free demos are sometimes available.  On a whim I tried a demo version of TransMac, which copied the source files just fine, but I found out upon transferring the files to the HFV volume that the binary (program) files were converted in such a way that they would not function in the emulation environment.  Unless I missed something in my attempt, this issue would effectively prevent emulation using files copied with these programs.&lt;/p&gt;
&lt;p&gt;With all of the emulator software in place and a testbed of data from the original ZIP disk copied into an HFV volume file, I could test the emulator.  The Basilisk GUI program in Windows allows you to define a set of disk or volume images that will be loaded at startup.  This is the equivalent of having hard drives installed or disks inserted at boot time.  For an initial test, I downloaded and used a bare System 7.0 startup disk image.  It worked flawlessly on the first try, providing me with a visceral demonstration of the difference between 16 MHz and 1 GHz processor speeds – my 1997 self would have been dazzled.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;inline center&quot; style=&quot;width: 300px;&quot;&gt;&lt;a href=&quot;/basilisk_ii_gui&quot;&gt;&lt;img src=&quot;http://thomas.kiehnefamily.us/thomas_files/images/mac_emulation_gui.jpg&quot; alt=&quot;Basilisk II GUI&quot; title=&quot;Basilisk II GUI&quot; class=&quot;image thumbnail&quot; height=&quot;264&quot; width=&quot;300&quot; /&gt;&lt;/a&gt;&lt;span class=&quot;caption&quot; style=&quot;width: 300px;&quot;&gt;&lt;strong&gt;Basilisk II GUI: &lt;/strong&gt;Selecting volumes at startup&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;After fixing a directory hierarchy issue with my copied volume (the system folder was not at the root level of the hard drive image), I relaunched the emulator, which resulted in an almost perfect reproduction of the Mac desktop I last viewed in 2005.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;inline center&quot; style=&quot;width: 400px;&quot;&gt;&lt;a href=&quot;/basilisk_ii_running_os_7_5_in_windows_2000&quot;&gt;&lt;img src=&quot;http://thomas.kiehnefamily.us/thomas_files/images/mac_emulation_desktop.jpg&quot; alt=&quot;Basilisk II Running OS 7.5 in Windows 2000&quot; title=&quot;Basilisk II Running OS 7.5 in Windows 2000&quot; class=&quot;image thumbnail&quot; height=&quot;300&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;span class=&quot;caption&quot; style=&quot;width: 400px;&quot;&gt;&lt;strong&gt;Basilisk II: &lt;/strong&gt;Running Mac OS 7.5 in Windows 2000&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Next step: resolve the transfer issues and try to get a full, unmodified copy of the HFS source ZIP disk into the HFV volume file.  Ideally, an unfettered copy should work flawlessly for all programs, assuming that the emulator is complete. If that is the case, then I will finally be able to access the last remaining documents that I need to convert within the original application environment.&lt;/p&gt;
</description>
 <comments>http://thomas.kiehnefamily.us/an_emulation_experiment#comments</comments>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_archives">Digital Archives</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_preservation">Digital Preservation</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/emulation">Emulation</category>
 <pubDate>Tue, 30 Oct 2007 06:01:53 +0000</pubDate>
 <dc:creator>tkiehne</dc:creator>
 <guid isPermaLink="false">48 at http://thomas.kiehnefamily.us</guid>
</item>
<item>
 <title>Software Activation, DRM, and Implications for Digital Preservation</title>
 <link>http://thomas.kiehnefamily.us/software_activation_drm_and_implications_for_digital_preservation</link>
 <description>&lt;p&gt;It&#039;s time again for another installment in my ongoing &lt;a href=&quot;/blog_topics/audio_encoding_project&quot; rel=&quot;nofollow&quot;&gt;audio encoding project&lt;/a&gt; saga.  For some time now I have been on the verge of the next phase of the project, which involves encoding the remaining analog sound objects in my collection, specifically cassette tapes and vinyl records.  Procrastination, combined with a serious dose of being busy with other things, has delayed my progress on this phase of the project, but one technical hurdle has also played a crucial part.&lt;/p&gt;
&lt;p&gt;In order to digitize the analog sound objects I require a software platform that can encode the analog input into digital objects and also clean up analog artifacts such as tape hiss, pops, clicks, scratches, etc.  There are many sound recording and processing packages available on the market and, fortunately, I already &quot;own&quot; one of them: Sonic Foundry&#039;s Sound Forge.&lt;/p&gt;
&lt;p&gt;So, what&#039;s the technical problem, you ask?  Well, I purchased version 5 of the software in 2001 as part of a special introductory promotion at a very reasonable price.  Unfortunately, Sonic Foundry transferred ownership of the entire Sound Forge product line, as well as a few other key products, to Sony in 2003.  Normally this wouldn&#039;t mean a thing, except for the fact that professional-level software like Sound Forge is protected by an online registration/activation scheme.  In a nutshell, the software will install and run just fine for a 30-day trial period.  During that period, you are expected to perform one of a set of procedures to register the product with the vendor which, when completed, eliminates the 30-day countdown and gives you full, unlimited access to the program.  As you can guess, the transfer to Sony complicated the process in that the online registration routine built into the original program could no longer find the registration server &amp;ndash; these functions had been transferred to Sony while the software remained unchanged.&lt;/p&gt;
&lt;p&gt;Not being satisfied with only 30 days of the program at a time, and unwilling to shell out the bucks to upgrade, I embarked on a search to figure out the new registration procedures.  I&#039;ll spare you the details, except to say that it took some Googling, several failed customer service contact attempts, numerous user forum searches, and a call to a number that I finally managed to track down, where I was implored to visit a chat application on their Web site in order to get the information I needed to reactivate &quot;my&quot; software.&lt;/p&gt;
&lt;p&gt;In the end, no big deal, right?  But my experience exposes some very important digital preservation issues.  Sound Forge is not in itself a particularly important piece of digital information.  It is a toolkit used to create the artifacts in which we are interested; in this case, sound artifacts.  The same could be said about Photoshop, or any of an increasing number of professional media toolkits.  Perhaps the most that a person in the future might need current or past versions of these software tools for would be to regenerate projects created with them, or to analyze detailed technical aspects of the software.  But, again, it is the products of these programs that will most likely interest future users, archivists, and the like.&lt;/p&gt;
&lt;p&gt;But consider this: the registration and activation process used in software like Sound Forge is conceptually identical to the license management process in Digital Rights Management (DRM) schemes used to protect digital information, particularly music, movies, and other copyrighted works.  Having reviewed my account above, one could imagine that, instead of activating software that I purchased, I might have been trying to access a DRM-encoded sound or video file that I had purchased in the past.  The same issues with license servers, transfer of ownership/responsibility, changes in the license registration schemes, and so on are just as pertinent in that situation.&lt;/p&gt;
&lt;p&gt;Everything managed to turn out alright for me in this case, but imagine if Sonic Foundry had simply disappeared instead of selling off its product line.  Or what if I had tried to install this software 10 or 15 years later, after the market had decided that the software no longer held enough value to justify supporting it?  All discussion about ownership of digital information aside (a discussion that would explain my liberal use of scare quotes), it seems apparent from this example that if left to the market (as governed by long copyright terms and far-reaching copyright legislation), we stand to lose not only the right to preserve digital information, but the technical ability to do so.  Conveniently enough, I&#039;ve touched on &lt;a href=&quot;/technologies_of_access_and_the_cultural_record&quot; rel=&quot;nofollow&quot;&gt;this situation&lt;/a&gt; before.&lt;/p&gt;
&lt;p&gt;Stay tuned as I embark on the more complicated phases of my encoding project.&lt;/p&gt;
</description>
 <comments>http://thomas.kiehnefamily.us/software_activation_drm_and_implications_for_digital_preservation#comments</comments>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/access">Access</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/audio_encoding_project">Audio Encoding Project</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_preservation">Digital Preservation</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/drm">DRM</category>
 <pubDate>Fri, 03 Aug 2007 23:00:39 +0000</pubDate>
 <dc:creator>tkiehne</dc:creator>
 <guid isPermaLink="false">45 at http://thomas.kiehnefamily.us</guid>
</item>
<item>
 <title>Slashdot: Archiving Digital Data an Unsolved Problem</title>
 <link>http://thomas.kiehnefamily.us/slashdot_archiving_digital_data_an_unsolved_problem</link>
 <description>&lt;p&gt;The headline on a front page &lt;a href=&quot;http://hardware.slashdot.org/article.pl?sid=06/11/20/2036247&quot;&gt;post on Slashdot&lt;/a&gt; today reads:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;b&gt;&quot;Archiving Digital Data an Unsolved Problem&quot;&lt;/b&gt;&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;which links to this &lt;a href=&quot;http://www.popularmechanics.com/technology/industry/4201645.html&quot;&gt;article in &lt;i&gt;Popular Mechanics&lt;/i&gt;&lt;/a&gt;.  For archivists, this headline states the obvious, but the words betray how the technology sector, at least stereotypically, views archives and backups as equivalent.  Wading through the comments (and discarding the obligatory comical entries), we find a rather robust discussion of digital preservation, sans academic terminology.  All the familiar preservation topics -- migration, emulation, media and file formats, genres, the influence of intellectual property law -- are touched upon, if rather superficially.  One commenter brought up the issue of compression in digital archives, but it seems that no one has touched on the &lt;a href=&quot;/technologies_of_access_and_the_cultural_record&quot;&gt;DRM issue&lt;/a&gt; (I&#039;ll have to remedy that).&lt;/p&gt;
&lt;p&gt;That said, it is encouraging to see this article highlighted on one of the premier tech blogs as well as in &lt;i&gt;Popular Mechanics&lt;/i&gt;.  It&#039;s going to take quite a bit more exposure to digital preservation problems in the tech community to get the point across -- to impart the long view, as it were -- but this is a good start.&lt;/p&gt;
</description>
 <comments>http://thomas.kiehnefamily.us/slashdot_archiving_digital_data_an_unsolved_problem#comments</comments>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_archives">Digital Archives</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_preservation">Digital Preservation</category>
 <pubDate>Tue, 21 Nov 2006 06:24:33 +0000</pubDate>
 <dc:creator>tkiehne</dc:creator>
 <guid isPermaLink="false">37 at http://thomas.kiehnefamily.us</guid>
</item>
<item>
 <title>Reflections on the SAA 2006 Annual Conference - Part I</title>
 <link>http://thomas.kiehnefamily.us/reflections_on_the_saa_2006_annual_conference_part_i</link>
 <description>&lt;p&gt;Last week I breezed through Washington, DC to attend the &lt;a href=&quot;http://www.archivists.org/conference/dc2006/&quot;&gt;SAA/NAGARA/CoSA Joint Conference&lt;/a&gt;.  Last year at this time, I attended the SAA conference as a new student member and, as it was my first-ever professional conference, I spent most of the time trying to acclimate myself to the conference ebb and flow.  This year I&#039;ve committed to taking better notes, talking a bit more, and, of course, sharing my observations here.&lt;/p&gt;
&lt;p&gt;First off, these notes are my attempt to forge meaning from the shards of information that reached me.  They are not meant to be comprehensive in their coverage of the sessions I attended, but merely to document my thoughts and observations which, predictably, are skewed towards my own research interests.  These observations are very raw and are meant to suggest areas for further research or verification.  As clearly as possible I will try to indicate what was directly expressed versus what I interpreted or generated.&lt;/p&gt;
&lt;p&gt;Second, I consciously entered each of these sessions with some overarching personal question or intent, not only to help me decide which sessions to attend but to ensure that my mind remained focused on the topics and issues that are of interest to me.  I will state these in each session&#039;s notes, which should help the reader understand my mindset and the subsequent observations.&lt;/p&gt;
&lt;p&gt;In this episode, the first day:  Thursday, 3 August, 2006.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;a href=&quot;http://www.archivists.org/conference/dc2006/dc2006prog-Session.asp?event=1708&quot;&gt;Session #103: &amp;ldquo;&#039;X&#039; Marks the Spot: Archiving GIS Databases&amp;rdquo;&lt;/a&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;I attended this session because I hoped to gain some insight into preservation efforts focused on what I will call &amp;ldquo;non-linear&amp;rdquo; records &amp;ndash; things like data sets, Web applications, and other &amp;ldquo;New Media&amp;rdquo; information.  It has long puzzled me how to apply the best practices of digital document preservation to digital forms that span application domains, physical locations, networks, and so on.  My concern arose during the processing of the &lt;a href=&quot;/from_floppies_to_repository_a_transition_of_bits&quot;&gt;Joyce papers&lt;/a&gt;, where hypertext was salient to many of the underlying works, but it also haunts me regularly in my capacity as a Web applications developer.  My working theory here is that geospatial data sets and the applications used to access them present generally the same preservation challenges as software, multimedia &amp;amp; games, relational databases, and so on.&lt;/p&gt;
&lt;p&gt;Three presentations were given, each with distinctive backgrounds and approaches.  Helen Wong Smith of the Kamehameha Schools of Hawaii presented a geospatial cultural / historical database project used to document and maintain land holdings in Hawaii.  Next, Richard Marciano of the San Diego Supercomputer Center presented briefs about several ongoing projects with GIS and geospatial aspects.  Among these were the &lt;a href=&quot;http://www.interpares.org/ip2/ip2_case_studies.cfm?study=23&quot;&gt;InterPARES VanMap project&lt;/a&gt;, the &lt;a href=&quot;http://www.sdsc.edu/PAT/&quot;&gt;Persistent Archival Testbed (PAT) project&lt;/a&gt;, &lt;a href=&quot;http://www.sdsc.edu/ICAP&quot;&gt;ICAP&lt;/a&gt;, and a new project called eLegacy.  Finally, James Henderson of the Maine State Archives presented some of his perspectives and challenges in preserving geospatial data as state government records.&lt;/p&gt;
&lt;p&gt;Geospatial data refers to data sets that link some sort of information (text, image, etc.) to a fixed location or area at a specified time period.  In the case of the Kamehameha Schools, diverse media such as songs, images, and historical accounts are linked to specific locations within the School&#039;s land holdings.  Localities in the state of Maine maintain road and property data in GIS systems to support applications such as E911.  The most salient aspect of these data sets is that they change over time &amp;ndash; notable historical events happen periodically, roads are re-routed or built, and so on &amp;ndash; much as any other database changes when updated, which suggests that preservation efforts for one can be applied to the other and to other similarly structured applications.&lt;/p&gt;
&lt;p&gt;The three presentations did not flow seamlessly, but did manage to expose some overarching themes.  Perhaps the most significant theme that I observed is the relationship between data sets that change over time and versioning in unitary documents.  The key difference between these two concepts is that examining versions of a document reveals the thought process involved in achieving a final or published work, while examining geospatial data shows how things were at various points in time.  Additionally, the time between discrete versions of documents is usually much shorter than for geospatial data, usually days versus years, and documents often have a terminal form after which changes cease, whereas geospatial data is usually open-ended or otherwise arbitrarily bounded.  Aside from these differences, the approach to preserving and accessing versions and geospatial data seems very similar.  Data sets that change over time lend themselves to access via temporal queries, where a date or date range becomes part of the query criteria.  For a suitably large number of versions, an access mechanism based on date queries would work just as well as it would for geospatial data.  Further, for any body of records that span a period of time, temporal queries can be an immensely useful tool for narrowing query results to relevant time periods.&lt;/p&gt;
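&lt;p&gt;To make the idea of a temporal query concrete, here is a small illustrative sketch (Python, with invented data, not drawn from any of the projects presented): each snapshot of a record carries the date on which it took effect, and the query simply asks which version was in force on a given date.&lt;/p&gt;
&lt;pre&gt;
from bisect import bisect_right
from datetime import date

# Invented snapshot history for one feature: (date the version took effect, payload)
road_history = [
    (date(1988, 1, 1), &quot;two-lane gravel road&quot;),
    (date(1996, 7, 1), &quot;paved, two lanes&quot;),
    (date(2004, 4, 1), &quot;widened to four lanes&quot;),
]

def as_of(history, when):
    # Temporal query: the last version that took effect on or before the given date
    index = bisect_right([effective for effective, _ in history], when)
    if index == 0:
        return None   # the query date predates the first snapshot
    return history[index - 1][1]

print(as_of(road_history, date(2000, 5, 17)))   # paved, two lanes
&lt;/pre&gt;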
&lt;p&gt;When I thought about these ideas in terms of Web applications (such as CRM, sales support, inventory management, etc. &amp;ndash; putting aside the question of why one would save them), some of the analogies with GIS data break down.  For one, GIS data works in &amp;ldquo;layers,&amp;rdquo; where types of data can be segregated like unitary documents.  Unfortunately, relational databases have no such abstraction &amp;ndash; they are built to store data efficiently, not in ways that can be easily separated.&lt;/p&gt;
&lt;p&gt;Another problem is that even though Web application data can be captured by taking snapshots, in much the same way as GIS data, the rate of change within the data set can often be much faster &amp;ndash; on the order of seconds &amp;ndash; than the comparatively slow changes in things such as historical events and roads.  Further, as the snapshot horizon nears the immediate, the storage and processing requirements become untenable &amp;ndash; it is impossible to take snapshots of a database at an interval shorter than the time required to make each snapshot.  As an aside, I wonder what solutions might be suggested by data warehousing techniques.&lt;/p&gt;
&lt;p&gt;Beyond capturing the state of the data, Web applications require that not only the data but the application code itself be maintained.  Seldom does an application remain unchanged over its service life &amp;ndash; bugs are repaired, features are added and removed, and so on.  These changes can affect the way that the underlying data is represented to the user.  Additionally, such changes are often accompanied by changes to the database structure itself.  As a result, snapshots should be acquired after such changes are applied.  Although not enough detail was given for each of these projects, I wonder if some of the same issues manifested themselves in work with GIS data sets.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;a href=&quot;http://www.archivists.org/conference/dc2006/dc2006prog-Session.asp?event=1724&quot;&gt;Session #208: &amp;ldquo;Big Bird&#039;s Digital Future: Appraisal and Selection of Public Television Programming&amp;rdquo;&lt;/a&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;I attended this session in order to revisit my work on the &lt;a href=&quot;/digital_preservation_plan_for_the_texas_legacy_project&quot;&gt;CHAT digital video preservation plan&lt;/a&gt; in the context of similar video preservation projects.  I hoped to validate the decisions that were made in formulating the plan and see what new work, if any, had been done in digital video preservation and access since early last year.  As the title of the session suggests, the subject area focused on TV broadcasts, but I anticipated that the overarching preservation concerns would be indistinguishable from those of any other video preservation effort.&lt;/p&gt;
&lt;p&gt;The three presentations fit together well, despite differences in scope.  Thomas Connors of the National Public Broadcasting Archives and the University of Maryland gave the first presentation.  Connors led us through a brief presentation that started with mention of a &lt;a href=&quot;http://www.itconversations.com/shows/detail400.html&quot;&gt;podcast&lt;/a&gt; by Brewster Kahle of &lt;a href=&quot;http://www.archive.org&quot;&gt;Internet Archive&lt;/a&gt; fame, which invokes the contentious &amp;ldquo;save everything&amp;rdquo; debate.  Connors raised the scarcity argument, which allowed him to move into a discussion of the lack of literature treating video appraisal criteria.  The remainder of his presentation described Danielle Dumerer&#039;s ranking system, which I interpreted as a risk assessment matrix, for appraising video collections and prioritizing preservation efforts.  This system operationalizes criteria such as the current condition of the assets, cost of retention, intellectual rights, use potential, and perceived production value &amp;ndash; a more formalized version of essentially the same process I used for the CHAT plan.  He then showed how this system mirrors &lt;a href=&quot;http://www.rlg.org/legacy/preserv/joint/gertz.html&quot;&gt;guidelines&lt;/a&gt; described by the RLG and NPO.&lt;/p&gt;
&lt;p&gt;Next in the session was Lisa Carter of the University of Kentucky.  Carter shared her observations from working with television archives, mostly those based on magnetic analog media.  Among these observations were the importance of proper storage of media, the frailty of tape-based media, and the importance of keeping the original media even upon conversion to more stable media or digital versions &amp;ndash; all of which were expressed in the CHAT plan.  Much of her talk focused on the importance of metadata for both access and preservation, most notably the need to work metadata collection into formal workflows.  I found the concept of &amp;ldquo;shutdown procedures&amp;rdquo; most interesting: the creators of a video execute a series of steps to describe, document, and otherwise properly close out a production, as a means of combating the ad hoc practices that producers often adopt for the sake of brevity and that leave archivists in the dark.&lt;/p&gt;
&lt;p&gt;Leah Weisse of the WGBH (Boston) Media Archives and Preservation Center presented some of her observations from working with the significant back catalog of WGBH broadcasts, reaching all the way back to the 1950s.  One important issue that she presented is the challenge that new direct-to-drive and flash memory systems pose to preservation.  In these cases, there is no original media to work with in the future, since users of these devices are driven to move the digital file off of the memory device and reuse it for subsequent productions.  This is identical to the behavior of digital camera users, but I had never thought of it in terms of full video capture.  Perhaps the greatest challenge presented in this situation is the need for more rigorous descriptive procedures to ensure that the digital files can be identified, and thus managed, after they have been moved from the capture device.  One observation I made during her presentation concerns the issue of versioning that came up during the GIS session.  In this case, the versioning is not only in terms of initial or draft productions (think director&#039;s cut versus theatrical release in film), but also reformatted versions (letterbox, etc.) and display formats (HD, streaming, etc.).  Weisse has had to deal with many of these versions for many of the works, which implies that versioning is really a concern that crosses genres and forms.  I need to see what has been said about versioning in the archival literature and how it translates to other forms.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;a href=&quot;http://www.archivists.org/conference/dc2006/dc2006prog-Session.asp?event=1737&quot;&gt;Session #310: &amp;ldquo;The Current State of Electronic Records Preservation&amp;rdquo;&lt;/a&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Despite its comprehensive title, I knew that this session would likely offer only a high-level review of some of the major projects.  With this understanding, I approached this session as a brief update to material I had received while in classes a year or so prior.&lt;/p&gt;
&lt;p&gt;David Lake of NARA and Lee Stout of Penn State University addressed ongoing work on the &lt;a href=&quot;http://www.archives.gov/era&quot;&gt;Electronic Records Archives (ERA)&lt;/a&gt; for the National Archives.  The ERA seems to be the flagship project in North America, at least judging by the amount of information about it that I have encountered of late.  At this point, the ERA has a developer &amp;ndash; Lockheed-Martin &amp;ndash; and is slated for an initial, though not comprehensive, release in the fall of 2007.  Many of the questions about the ERA focused on the potential for using the resulting products in venues outside of the National Archives and whether it would be available as an open-source or similar product.  The response emphasized that this project is not only a set of software, but an instantiation of NARA&#039;s workflow processes.  The message seemed to be that while some products that do specific tasks may be portable to other environments, the core of ERA is specific to NARA and its practices.&lt;/p&gt;
&lt;p&gt;Next, Hans Hofman from the National Archives of the Netherlands presented a general overview of three current European projects: &lt;a href=&quot;http://www.digitalpreservationeurope.eu/&quot;&gt;Digital Preservation Europe (DPE)&lt;/a&gt;, &lt;a href=&quot;http://www.dl-forum.de/englisch/projekte/projekte_eng_2711_ENG_HTML.htm&quot;&gt;PLANETS&lt;/a&gt; &amp;ndash; a research project, and &lt;a href=&quot;http://www.casparpreserves.eu/&quot;&gt;CASPAR&lt;/a&gt;.  Much of what Hofman presented was very high-level conceptually, but he did take care to place these projects into the context of previous research and efforts upon which they build.&lt;/p&gt;
&lt;p&gt;Finally, Kenneth Thibodeau of NARA wrapped up the session, providing some thoughts that transcended the specifics of the previous presenters.  One thought that I took away from his remarks is, paraphrased, that the ERA has shown that preservation has to be attacked as an organizational problem, not a process in isolation &amp;ndash; something that mirrors what I have said before in terms of archival thought infiltrating the process of creation and the tools used by the creators.  One other take-away was his emphasis on the need for digital format repositories of the type that &lt;a href=&quot;http://hul.harvard.edu/gdfr/&quot;&gt;Harvard&lt;/a&gt; is developing.  I interpreted this to mean not merely reference databases, but living applications that can provide a supporting framework for preservation software platforms and applications &amp;ndash; think Web services for digital format preservation information.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;General Observations&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;I had one meta-observation concerning the conference as a whole.  Each session was recorded by the conference staff using each room&#039;s audio setup.  The inputs usually consisted of three microphones, one at the podium and two on the panel table.  In virtually every session I attended, the panel participants had to consciously remind themselves to repeat questions from the audience into the microphone so that they would be recorded in addition to the responses given.  This process strikes me as a visceral metaphor for the function of archivists and the frustrations they feel when working with their various constituents.  I often hear the refrain that archival thought needs to happen early in the creation of records, if not before, and given that the recording of these sessions is an inherently future-focused activity &amp;ndash; an attempt to create a complete record of the proceedings &amp;ndash; the panel&#039;s self-reminding process seems apropos.  I have said it before in this venue in different ways, but if we are to capture a more complete cultural record for the future, archival thought in the form of deliberately future-minded actions must be insinuated into our information management &amp;ndash; not only by archivists, but by everyone who creates information and, especially in the digital realm, into the tools that we use.  I envision this as a sort of repurposing of the &lt;a href=&quot;http://en.wikipedia.org/wiki/Seventh_Generation&quot;&gt;seventh generation&lt;/a&gt; concept for our cultural memory as it is represented in our information objects.&lt;/p&gt;
</description>
 <comments>http://thomas.kiehnefamily.us/reflections_on_the_saa_2006_annual_conference_part_i#comments</comments>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/conferences">Conferences</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_archives">Digital Archives</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_preservation">Digital Preservation</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/saa">SAA</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/video_preservation">Video Preservation</category>
 <pubDate>Tue, 15 Aug 2006 01:10:50 +0000</pubDate>
 <dc:creator>tkiehne</dc:creator>
 <guid isPermaLink="false">32 at http://thomas.kiehnefamily.us</guid>
</item>
<item>
 <title>PAT Project Lessons Learned, Part 2</title>
 <link>http://thomas.kiehnefamily.us/pat_project_lessons_learned_part_2</link>
 <description>&lt;p&gt;I first heard about the Persistent Archives Testbed (PAT) Project at the &lt;a href=&quot;http://www.archivists.org/conference/neworleans2005/&quot; rel=&quot;nofollow&quot;&gt;SAA Annual Meeting&lt;/a&gt; in August 2005.  The project brings together several large institutions -- NHPRC, NARA, SDSC, etc. -- in an effort to test data grid technology as a means of federated archival storage.  In two of the more recent issues of Archival Outlook published by SAA, a question has been posed to two different groups.  The question is, roughly: what skills are needed to work with electronic records?  The two groups asked were archivists and IT professionals.  In light of my recent musings, and the upcoming &lt;a href=&quot;http://rpm.lib.az.us/NewSkills/&quot; rel=&quot;nofollow&quot;&gt;colloquium&lt;/a&gt; in Washington D.C., I took great interest in the most recent article.&lt;/p&gt;
&lt;p&gt;Part two of the article series, IT Professionals&#039; Perspectives (Archival Outlook, Mar/Apr 2006, pp. 8 &amp;amp; 27, not yet available online), asks: &quot;what skills / knowledge should IT professionals have to work with archival records and archivists?&quot;  Three people were asked this question (or at least responded to it):  Adil Hasan of the e-Science Center at the Rutherford Appleton Laboratory in the UK, and Richard Marciano and Reagan Moore of the SDSC. Eureka, I thought – this is exactly the &lt;a href=&quot;http://thomas.kiehnefamily.us/musings_on_a_systems_view_of_digital_archives&quot; rel=&quot;nofollow&quot;&gt;issue&lt;/a&gt; I have had running around in my mind lately, and from the perspective that has the most to offer with regard to my personal interests.&lt;/p&gt;
&lt;p&gt;Hasan starts off strong, proffering that IT types working alongside archivists must have explicit domain knowledge of archival workflows and concepts.  This is basic, however, as any programmer trying to develop applications for any domain must have an understanding of that domain -- be it supply chain management, inventory control, marketing, data mining, or even archives.  The important takeaway from Hasan is the notion of developing a &quot;toolkit&quot; for archivists to &quot;[combat] the deluge of electronic information.&quot;  This is exactly the conclusion I came to working on the &lt;a href=&quot;http://thomas.kiehnefamily.us/from_floppies_to_repository_a_transition_of_bits&quot; rel=&quot;nofollow&quot;&gt;Joyce collection&lt;/a&gt; last year, and from what I understand, quite a lot of attention is being brought to the issue of archival toolkits.&lt;/p&gt;
&lt;p&gt;Marciano also touches on a salient point: that archivists and IT have different ways of speaking that often overlap.  Archivists do much eye-rolling in response to the ways tech companies use the term &quot;archival,&quot; particularly when describing storage media.  Less frequent, but no less important, are differences in the meanings of creation date (especially as implemented by a certain dominant computer operating system), &quot;archive&quot; as a type of compressed file, and other notions of metadata and naming conventions that are imposed by well-meaning IT professionals.  Marciano correctly asserts that this terminology gap must be bridged by IT personnel who work with archivists and describes the ongoing efforts by Richard Pearce-Moses to that effect.&lt;/p&gt;
&lt;p&gt;But although the respondents answered the question asked of them, this is where my enthusiasm faded slightly.  Both Marciano and Moore, from what I was able to read of their comments, address the immediate concerns of IT personnel working on archival projects and with archivists, but appear to avoid the notion that there must be some kind of backflow of concepts into the IT field itself in order to truly address the question of long term preservation of electronic records.  Marciano gets close when he says: &quot;If navigated properly, the unsuspected world of archives that unfolds has the potential to draw IT folks in and transform them into champions of the cause.&quot;  Precisely.  And this is the crux of the challenge: how can archivists instill some of the basic, time-worn traditions of records management and preservation for maintaining reliable and authentic records back into the short-term, light-speed horizon of IT?  IT is great at manipulating electronic information in any way imaginable, so it is not a stretch to believe that the field can effectively and definitively extend the longevity of that information as well, beyond mere &quot;backwards compatibility&quot; and better search methods.&lt;/p&gt;
&lt;p&gt;Somehow, we as archivists (or the archivally aware) must figure out how to imbue programmers with a consciousness of how the artifacts produced by their applications are used, now and into the far future, and how their decisions affect not only the ability of future users to access the information, but the ability of the custodians of that information to work with it as well.  Such an awareness will present the IT field with a challenge that it cannot pass up and will conquer (with guidance, of course).  In summary, I maintain that we must take a two-pronged approach in order to overcome the problem of electronic information: combat the deluge of information with appropriate tools (as Hasan put it), and expand the utility of artifacts produced by the tech sector as informed by archival practice.&lt;/p&gt;
</description>
 <comments>http://thomas.kiehnefamily.us/pat_project_lessons_learned_part_2#comments</comments>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_archives">Digital Archives</category>
 <category domain="http://thomas.kiehnefamily.us/blog_topics/digital_preservation">Digital Preservation</category>
 <pubDate>Tue, 25 Apr 2006 06:02:23 +0000</pubDate>
 <dc:creator>tkiehne</dc:creator>
 <guid isPermaLink="false">29 at http://thomas.kiehnefamily.us</guid>
</item>
</channel>
</rss>
