Crowdsourcing transcriptions: professionals & tools?

Ben Brumfield benwbrum at GMAIL.COM
Mon Oct 1 13:05:37 EDT 2012


Michael (and list),

I'm the developer behind FromThePage.com and the manuscript transcription
blog you linked to on your post.  I'm currently working with FreeUKGen, the
charity behind the genealogy database FreeBMD, to build a general-purpose,
open-source tool for transcribing structured data into a searchable database.

We're basing our system on the Scribe tool developed for the Citizen Science
Alliance for What's the Score at the Bodleian (http://whats-the-score.org),
which originated out of their experience building OldWeather.org and other
citizen science sites.  Our plan is to build the following systems:

A) A new tool for loading image sets into the Scribe system and attaching
them to data-entry templates.
B) Modifications to the Scribe system to handle our volunteer organization's
workflow, plus some usability enhancements.
C) A publicly-accessible search-and-display website to mine the database
created through data entry.  
D) A reporting, monitoring, and coordinating system for our volunteer
supervisors.

We also plan to add support for geocoding during transcription and GIS
support within the search-and-display system.  Initial development of A is
mostly finished, and we are now moving on to B and C.

Although this tool is focused on support for parish registers and census
forms, we are intent on creating a general-purpose system for any
tabular/structured data.  We're particularly interested in being of use to
archives and libraries, and are looking for collaborators and advisors from
those communities.

We are also looking for supporting collaborators, whether they contribute
code, funding, or advice.  

It might be easier for the Berlin directory digitization project to pool
resources with us now, or to customize our tool once the core functionality
meets their needs.  I suspect that a Scribe-based solution will work better
for their tabular data than Scripto, the Bentham Transcription Desk, or my
own FromThePage -- all of which are wonderful for letters, diaries, and
other free-form prose, but were not designed for work with structured material.

Regardless, I'll be interested in following the project and in any other
feedback you get, and I wish you the best of luck.

Ben Brumfield
http://manuscripttranscription.blogspot.com/

============================

Web4Lib Web Site: http://web4lib.org/
