Making of America at the U. of Michigan

Roy Tennant rtennant at library.berkeley.edu
Fri Mar 21 10:54:11 EST 1997


Posted by request of John Price-Wilkin <jpwilkin at umich.edu>.
Roy

-------------------------------------------------------------------------------

			Making of America
		at the University of Michigan

The University of Michigan Digital Library is pleased to announce the
availability of an extraordinary new electronic collection of American
writing.  A part of the Making of America project, these materials are a
powerful demonstration of several pieces of digital library technology
developed by the University of Michigan.  Currently included in the UM
online collection are some 200,000 pages of American publications from
1850 to 1900; by mid-year, the collection will grow to approximately
650,000 pages, including several journals.  The University
of Michigan MOA collection is available at:
	http://www.umdl.umich.edu/moa/

The Making of America project is a collaborative effort between Cornell
University and the University of Michigan.  Funded primarily by the Andrew
W. Mellon Foundation, the focus of the project is American social history
from the antebellum period through Reconstruction.  Cornell and Michigan
are working to develop a distributed architecture to provide access to the
two collections through a single interface at each institution.  Materials
currently available from Cornell may be found at
	http://moa.cit.cornell.edu/
Work is underway to facilitate cross-collection searching for the two efforts.

Digital Library Resources for the Humanities
The implementation at Michigan demonstrates a number of unique approaches
to building systems for access to scholarly resources.  Capitalizing on
Cornell University's extensive experience in preservation-quality imaging,
pages were scanned as 600dpi TIFF images through a conversion bureau,
using specifications jointly written by Cornell and Michigan.  In a
subsequent process designed by Digital Library Production staff at the
University of Michigan, a subset of the scanned pages was processed with
locally developed routines for automatic OCR.  A relatively low level of
SGML encoding, following the TEI Guidelines, was applied to the OCR.  This encoding is
used to hold bibliographic information, text, article-level information in
journals, and page references.  It also serves as an extensible framework
as titles are identified for more thorough proofing and richer encoding. 
Images are stored as high-resolution, preservation-quality 600dpi TIFF
images, and are rendered to various levels of GIF in real time. 
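
As a rough illustration of what a low level of encoding over raw OCR can
look like, the Python sketch below wraps page text in a minimal TEI-style
skeleton with a page-break element per page.  It is not the routine used
at Michigan: the function name, the "ref" attribute pointing at the page
image, and the sample file name are assumptions made for illustration, and
the output omits header elements a valid TEI document would require.

```python
from xml.sax.saxutils import escape, quoteattr


def encode_pages(title, pages):
    """Wrap raw OCR page text in a minimal TEI-style skeleton.

    `pages` is a list of (page_number, image_file, ocr_text) tuples.
    Hypothetical helper; not the Michigan production routine.
    """
    parts = [
        "<TEI.2>",
        "<teiHeader><fileDesc><titleStmt>",
        f"<title>{escape(title)}</title>",
        "</titleStmt></fileDesc></teiHeader>",
        "<text><body>",
    ]
    for number, image_file, ocr_text in pages:
        # <pb> marks a page break; "n" carries the page number.
        # The "ref" attribute naming the page image is an assumption.
        parts.append(
            f"<pb n={quoteattr(str(number))} ref={quoteattr(image_file)}>"
        )
        # Uncorrected OCR goes in as a single paragraph per page.
        parts.append(f"<p>{escape(ocr_text)}</p>")
    parts.append("</body></text>")
    parts.append("</TEI.2>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(encode_pages("Sample title", [(1, "00000001.tif", "Raw OCR text")]))
```

Because every page keeps its own page-break marker, a title encoded this
way can later be proofed and given richer markup without losing the link
between text and page image.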

SGML-based Access Systems
We hope that users of the system will appreciate some of the functionality
developed through UM's nearly eight years of experience with deploying
SGML-based access and delivery systems.  Attractive, easily navigated
displays of results showing the number of occurrences per page are
combined with displays of the page image, circumventing many of the
problems encountered when relying on OCR alone.  As we have opportunities
to "clean up" and more richly encode OCR'd texts, the system will begin to
show dynamically rendered HTML with links to the page images.  The
mechanisms used for the MOA system will be provided to participants in the
UM's SGML Server Program (see http://www.hti.umich.edu/misc/ssp/). 
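
For readers curious how a results display with occurrences per page might
be computed, here is a minimal Python sketch.  It assumes the OCR text is
already available as (page number, text) pairs and uses a simple
case-insensitive whole-word match; the function name and data layout are
hypothetical, and the actual MOA system relies on its own SGML-aware
search mechanisms rather than anything shown here.

```python
import re


def hits_per_page(pages, term):
    """Return [(page_number, count)] for pages where `term` occurs.

    `pages` is an iterable of (page_number, ocr_text) pairs.
    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = []
    for number, text in pages:
        n = len(pattern.findall(text))
        if n:
            # Only pages with at least one hit appear in the results.
            counts.append((number, n))
    return counts


if __name__ == "__main__":
    sample = [
        (1, "The railroad and the Railroad."),
        (2, "No match here."),
        (3, "railroads railroad"),
    ]
    for page, count in hits_per_page(sample, "railroad"):
        print(f"page {page}: {count} occurrence(s)")
```

Each result can then be paired with the corresponding page image, so a
reader sees the original page even where the OCR itself is imperfect.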

Next Steps 
Development and design of the system continues.  The current
implementation will be exhaustively vetted with focus groups of local
users, especially experts in the fields covered.  We would also encourage
others to send comments and suggestions to moa-info at umich.edu.  Also, as
time and resources permit, texts will be extracted from the system,
carefully proofed and corrected, and encoded at a much higher level of
SGML.  These enriched resources will allow us to continue to improve
functionality in a number of different directions.  For more information
about the Making of America project in general, and the Michigan
implementation in particular, please see: 
	http://www.umdl.umich.edu/moa/about.html




