Unique mobile and/or ereader uses?

Eric Lease Morgan emorgan at ND.EDU
Tue Jan 24 11:39:01 EST 2012


On Jan 24, 2012, at 9:51 AM, Jeff Wisniewski wrote:

> Doing something truly unique and noteworthy in the mobile or ereader space?


I am working on something that may be of interest. We call it the Catholic Youth Literature Project.

  * preliminary blog postings - http://bit.ly/pFBZwC, http://bit.ly/z4Bg8N
  * ugly home page - http://bit.ly/nguLYh

About sixty students will be using this interface in the coming weeks to learn about what it meant to be Catholic in the 19th century.

In short, working with a faculty member, we have had 100 pieces of literature from the 19th century digitized. Each of these items is Catholic in its subject matter, and each was written with youth in mind. A rudimentary catalog has been created, and using a mobile interface designed for iPad-like devices, the reader should be able to do two things: 1) read the text, and 2) do "distant reading". The first action -- read the text -- is straightforward, except the PDF documents do not load on iPad-like devices. They are too big. Other online interfaces are supported instead.

The second action -- distant reading -- is of more interest to me. Specifically, I have used named-entity recognition software to extract names, places, and organizations from the texts. I then created rudimentary word clouds from the results, along with the ability to look up the items in the text (via a concordance), look them up in Wikipedia, and plot the places on a world map. I have also used a parts-of-speech tool to extract all of the pronouns, nouns, verbs, adjectives, and adverbs from the text. These extractions can tell one a lot about the texts -- they go way beyond traditional library cataloging techniques. I have extracted the most frequently used words and phrases while linking them to the concordance. Finally, the concordance tool returns snippets of text and makes it easy to see how words are used in context. The concordance also supports a "map" (in the form of a histogram) illustrating where the words are used in the text(s) as well as a "networked diagram" illustrating what words were used "in the same breath" when a given word was used. Here are some examples, which will probably melt down as they are used. Your mileage may vary:

  * home page for a book - http://bit.ly/x6VCSN
  * close reading - http://bit.ly/xt04QG
  * distant reading choices - http://bit.ly/wmydxZ
  * automatically generated list of names & organizations - http://bit.ly/wK4Abh
  * automatically generated list of places - http://bit.ly/w1cXOe
  * disambiguation page for Jerusalem with links to maps and Wikipedia - http://bit.ly/AzF24c
  * all parts of speech used in a book - http://bit.ly/xn1l8a
  * most common words and phrases used in a book - http://bit.ly/yDfMLB
  * the word "god" in the concordance - http://bit.ly/z7HSqU
  * "map" of where "god" is used in the text - http://bit.ly/wCdL5q
  * network diagram illustrating what words are near the word "god" - http://bit.ly/xj5old
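
For the curious, here is a minimal sketch of how a concordance and its histogram "map" might be computed. It is not the code behind the project, only an illustration in plain Python. The function names (tokenize, concordance, dispersion) and the width and bins parameters are hypothetical and chosen for this example:

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, word, width=4):
    """Return (left, word, right) for each hit, with `width` tokens of context per side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == word:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

def dispersion(tokens, word, bins=10):
    """Count hits of `word` in each of `bins` equal slices of the text (the "map")."""
    counts = [0] * bins
    for i, token in enumerate(tokens):
        if token == word:
            # Scale the token position to a bin index; clamp to the last bin.
            counts[min(i * bins // len(tokens), bins - 1)] += 1
    return counts

if __name__ == "__main__":
    sample = "In the beginning God made the world and God saw that it was good"
    words = tokenize(sample)
    for left, hit, right in concordance(words, "god", width=2):
        print(f"{left:>20}  {hit}  {right}")
    print(dispersion(words, "god", bins=2))
```

Each concordance row shows a hit with a few words on either side, and the dispersion list could be drawn as bars to show where in the text the word clusters.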

Remember, software is never done, and this software is no different!

Much of the work is based on my (unaccepted) Digital Public Library of America Beta-sprint proposal called Use & Understand:

  Use & understand is an evolutionary step in the processes and
  functions of a library. These processes and functions enable the
  reader to ask and answer questions of large and small sets of
  documents relatively easily. Through the use of various text
  mining techniques, the reader can grasp quickly the content of
  documents, extract some of their meaning, and evaluate them more
  thoroughly when compared to the traditional application of
  metadata. Some of these processes and functions include:
  word/phrase frequency lists, concordances, histograms
  illustrating the location of words/phrases in a text, network
  diagrams illustrating what authors say “in the same breath” when
  they mention a given word, plotting publication dates on a
  timeline, measuring the weight of a concept in a text, evaluating
  texts based on parts-of-speech, supplementing texts with
  Wikipedia articles, and plotting place names on a world map.

  http://bit.ly/ojWmzN
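
Two of the functions listed in the proposal, word/phrase frequency lists and the counts behind an "in the same breath" network diagram, can be sketched in a few lines of Python. This is only an illustration under simple assumptions (regex tokenization, a fixed window of neighboring words), not the proposal's actual implementation. The names top_words, top_phrases, and neighbors are hypothetical:

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(tokens, n=10, stopwords=frozenset()):
    """Return the n most frequent words, skipping any in `stopwords`."""
    return Counter(t for t in tokens if t not in stopwords).most_common(n)

def top_phrases(tokens, size=2, n=10):
    """Return the n most frequent phrases of `size` consecutive words."""
    grams = zip(*(tokens[i:] for i in range(size)))
    return Counter(" ".join(g) for g in grams).most_common(n)

def neighbors(tokens, word, window=3):
    """Count the words found within `window` tokens of each hit of `word`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == word:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts

if __name__ == "__main__":
    words = tokenize("God is love and love is kind and God is good")
    print(top_words(words, n=3))
    print(top_phrases(words, size=2, n=3))
    print(neighbors(words, "god", window=1).most_common())
```

The neighbor counts could serve as edge weights in a network diagram: the given word sits at the center, and each neighboring word is linked to it with a weight equal to its count.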

In summary, I have been creating an e-reader (iPad-like) interface allowing the student to "read" texts in new and different ways.

-- 
Eric Lease Morgan
University of Notre Dame

============================

To unsubscribe: http://bit.ly/web4lib

Web4Lib Web Site: http://web4lib.org/
