[Web4lib] Books Ngram Viewer

Eric Lease Morgan emorgan at nd.edu
Sat Feb 19 17:05:48 EST 2011


On Feb 16, 2011, at 1:47 PM, Jorge Serrano Cobos wrote:

> http://ngrams.googlelabs.com/


Jorge, thank you for bringing the Ngram Viewer to the community's attention, and I think such a tool is symbolic of where we exist in this information age.

With the advent of ubiquitous networked computers, ever-available data and information have become the norm. Prior to the Internet, data and information were, relatively speaking, scarce. Libraries played a crucial role as gatekeeper and facilitator. They served as the middleman between publishers and consumers. In a world of dictionaries, encyclopedias, bibliographic indexes, catalogs, websites, biographies, manuals, maps, gazetteers, almanacs, and a host of other archaic information tools, libraries organized content and helped people find the answers to questions.

With the maturity of information retrieval, the problem of "find" is not nearly as acute as it once was. People can find plenty of data and information. Instead, the problem is one of use and understanding. "What do I do with a million books?" [1] Digital humanities, specifically text mining, is one possible answer as well as an opportunity for the profession.

Google's Ngram Viewer allows one to compare and contrast, measure, and visualize in seconds what would have taken a Ph.D. student years to accomplish. It exploits the existence of full text to do simple counting of words across huge corpora. This process -- measuring -- makes it easier to look for trends and find patterns. This is where new knowledge can be created and deeper understandings found. It is where hunches can be verified and fodder for dissertations explored.
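To make the "simple counting" concrete, here is a minimal sketch in Python. It is not how Google implements the Viewer, just an illustration of the idea: split each text into n-grams, count how often a phrase occurs, and divide by the total number of n-grams for that year so that years with more text do not dominate. The function names (ngrams, yearly_frequency) and the two-item toy corpus are my own inventions for the example, and the tokenizing is deliberately naive (lowercase, split on whitespace).

```python
from collections import Counter


def ngrams(text, n):
    """Return the n-grams in text as space-joined, lowercased strings."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of that year's n-grams equal to phrase.

    corpus is a hypothetical list of (year, full_text) pairs.
    """
    phrase = phrase.lower()
    n = len(phrase.split())
    hits, totals = Counter(), Counter()
    for year, text in corpus:
        grams = ngrams(text, n)
        totals[year] += len(grams)          # all n-grams seen for this year
        hits[year] += grams.count(phrase)   # occurrences of the phrase
    # Skip years with no n-grams to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


# Toy data, invented for illustration only.
corpus = [
    (1900, "the library catalog lists every book in the library"),
    (2000, "the search engine indexes every page and the library links to it"),
]

for year, share in yearly_frequency(corpus, "library").items():
    print(year, round(share, 3))
```

Plotting those per-year shares as a line is, in essence, the chart the Viewer draws. A real corpus would need better tokenization and some smoothing, but the measuring itself is no more complicated than this.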

Google's Ngram Viewer represents an opportunity for librarianship. The same sort of tools and functionality demonstrated by the Viewer can be incorporated into library discovery systems. A patron searches a discovery system and identifies items of interest. They click the analyze button, and the results are sets of charts and graphs illustrating the relationships of words and ideas across multiple texts. Such things are not intended to reduce the need for "close reading" any more than traditional tables of contents or back-of-the-book indexes are. Rather, they are tools designed to facilitate "distant reading" a la Moretti [2] or "syntopical reading" a la Hutchins [3]. Such things were not possible prior to the availability of so much full text. I elaborated on these ideas in a blog posting from about nine months ago. [4]

Again, Jorge, thank you for bringing this to our attention. The Viewer represents something that could be employed in libraries. By combining our knowledge of the student, teacher, and researcher with "cool tools" applied to the full text of our collections, we have the opportunity to provide unique services to our clientele.


[1] Crane, Gregory. "What Do You Do with a Million Books?" D-Lib Magazine 12:3 (March 2006)  - http://www.dlib.org/dlib/march06/crane/03crane.html

[2] Moretti, Franco. 2005. Graphs, maps, trees: abstract models for a literary history. London: Verso, page 1.

[3] Hutchins, Robert Maynard. 1952. Great books of the Western World. Chicago: Encyclopædia Britannica. Volume 2, page xi.

[4] blog posting - http://bit.ly/fTr0vQ

-- 
Eric Lease Morgan
Hesburgh Libraries, University of Notre Dame





