SPIRE retrieval engine

Matthew Dovey matthew.dovey at las.ox.ac.uk
Mon Oct 6 05:30:01 EDT 1997


The "Annual Review of OCLC Research 1996" has an article which surveys
this kind of work and mentions SPIRE amongst others.

The online version of this is at 

http://www.oclc.org/oclc/research/publications/review96/visual.htm

Matthew J. Dovey
Client-Server Development
Libraries Automation Service
Oxford University
65 St Giles
Oxford OX1 3LU
United Kingdom

Tel: +44 1865 278272
Fax: +44 1865 278175
E-Mail: Matthew.Dovey at las.ox.ac.uk



> -----Original Message-----
> From:	Ernest Perez [SMTP:perez at opac.osl.state.or.us]
> Sent:	Friday, October 03, 1997 10:15 PM
> To:	Multiple recipients of list
> Subject:	SPIRE retrieval engine
> 
> I'd not heard of this product before running across it in a small
> mention in an article about intelligence community applications. 
> Article was in InfoWorld, as I recall.
> 
> I dropped an e-mail to them, and got the following reply. Thought it
> might be of interest to others.  Anyone on the list have experience
> with this, and wish to comment?
> 
> -ernest
> Ernest Perez//Oregon State Library//perez at opac.osl.state.or.us//503-378-4243
> ----------------------------------------------------------------------
> Paradise is exactly like where you are right now, only much, much
> better.
> 
> 
> 
> ---------------------------------------------------
> 
> Ernest,
> 
> The following Word RTF doc will bring you up-to-date on ThemeMedia and
> our software applications. Please contact me with any questions.
> 
>  
> Thanks,
> 
> Steve Ardire
> Sr. Director Business Development
> steve at thememedia.com
> PH: 425-602-3559
> 
> ThemeMedia Company Backgrounder and Technology Overview
> 
> ThemeMedia is developing software tools for "content mapping" - a
> process that graphically represents thousands of unstructured documents
> on a single computer screen for quick, focused navigation, retrieval,
> and insight.
> 
> Our work is based on technology that emanated from the Battelle Pacific
> Northwest National Laboratory (PNNL) under contract with the U.S.
> Department of Energy. PNNL was asked to design software that could help
> intelligence and national security research staffs efficiently access
> thousands of publications, documents, and transcripts strewn across the
> world.
> 
> The result was SPIRE, the acronym for "Spatial Paradigm for Information
> Retrieval and Exploration," a software system for transforming
> text-based information retrieval into a visual system for navigation,
> retrieval, and analysis. Over the last three years, SPIRE has been
> actively used by the U.S. intelligence community for research and
> analysis involving matters of national security. In October 1996, the
> founders of ThemeMedia acquired the exclusive worldwide license to SPIRE
> technology and formed a company around a core group of the original
> SPIRE team.
> 
> Desperately seeking information
> 
> Today, the information search method of choice is based on Boolean
> logic, whereby a document must include one or more user-specified
> terms, or keywords, to make it eligible for consideration. Existing
> search engines, such as those offered by Yahoo, Excite, AltaVista,
> Lycos, and others, typically generate a list of hundreds or thousands
> of documents, with only limited ability to order them by relevance.
> Moreover, there is no common measure of relevance to help information
> seekers determine true value. What AltaVista considers relevant for a
> particular query, Lycos may relegate to a position of less importance
> farther down the list. Users are not only at the mercy of how each
> company defines relevance, they have no way of evaluating the
> methodology behind the retrieval process - no way of actually seeing
> the relationships among the documents listed.
> 
> The weakness of the Boolean search has to do with the user's role in
> two standard retrieval measures: precision and recall. Recall measures
> how well a search produces all the documents that fit the search
> criteria, while precision measures how successful the search is at
> eliminating irrelevant documents from that pool. If information seekers
> were capable of knowing exactly what they wanted and, then, how to ask
> for it, there wouldn't be a problem. But, understandably, it's
> extremely difficult for most of us to state our precise information
> needs to a database we can't see and have never explored. As a result,
> Boolean searches often return too much irrelevant information or not
> enough of what we really need.
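
The two measures named above have standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. A minimal sketch of the trade-off, assuming made-up document IDs and a hypothetical helper `precision_recall` (nothing here comes from SPIRE or any search engine):

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# A broad Boolean query (e.g. many OR terms) returns everything relevant
# plus four irrelevant documents.
broad = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

# A narrow query (e.g. many AND terms) returns one document, and it is relevant.
narrow = {"d1"}

broad_pr = precision_recall(broad, relevant)    # (0.5, 1.0)
narrow_pr = precision_recall(narrow, relevant)  # (1.0, 0.25)
print("broad :", broad_pr)
print("narrow:", narrow_pr)
```

The broad query finds everything but half of its results are noise; the narrow query is clean but misses three of the four relevant documents, which is the "too much or not enough" outcome the paragraph describes.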
> 
> Given the sheer size and number of databases now available, the
> sweeping diversity of information, and the lack of a common
> categorization scheme, it seems unlikely that Boolean-based search
> methods can effectively manage our ever increasing information
> retrieval needs.
> 
> Information Visualization and Relevance
> 
> As frustration with existing information retrieval methods mounts, the
> appeal of visualization technologies grows. Visually-based software
> tools, like those being developed by ThemeMedia, give users a quick way
> to actually see everything available to them from a given information
> set, with topics and documents grouped by degree of similarity and
> level of importance.
> 
> ThemeMedia's System for Information Discovery (SID) starts by capturing
> any number of documents into a database. By analyzing patterns of word
> usage and relationships between words, SID autonomously discovers
> salient themes, derives semantic distances between them to represent
> degrees of similarity, and transforms the results into vector
> representations arranged to reveal document relevance.
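
The backgrounder does not say how SID computes its vectors or distances. As a rough illustration of the general idea only, the sketch below swaps in a common textbook technique: plain word-count vectors compared by cosine distance. The sample texts and the names `term_vector` and `cosine_distance` are assumptions for illustration, not ThemeMedia's method.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Represent a document as a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means identical word usage."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - (dot / norm if norm else 0.0)


# Made-up sample documents (not from any real collection).
docs = {
    "energy_1": "nuclear energy policy and reactor safety",
    "energy_2": "reactor safety review for nuclear energy",
    "library": "library catalog search and retrieval",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

# Documents that share vocabulary end up closer together.
d_same = cosine_distance(vectors["energy_1"], vectors["energy_2"])
d_diff = cosine_distance(vectors["energy_1"], vectors["library"])
print(f"energy_1 vs energy_2: {d_same:.3f}")
print(f"energy_1 vs library : {d_diff:.3f}")
```

A visual map of the kind described would then place documents so that small distances appear as nearby points; that layout step is outside this sketch.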
> 
> In this way, ThemeMedia's technology eliminates the typical
> precision/recall dilemma faced by information seekers - whether to
> retrieve all potentially relevant documents (recall) or only those that
> are unquestionably relevant (precision). By using ThemeMedia "visual
> content maps" to display information, users can immediately see
> everything available, along with the relationships between content and
> the location of information. Our processes discriminate for the user by
> recalling information in ample detail through visualization.
> Consequently, the user is spared irrelevant information, while quickly
> and precisely navigating to all relevant documents.
> 
> ThemeMedia Software Applications
> 
> ThemeMedia is in the process of transforming SPIRE, an existing
> standalone application designed primarily for analysts that runs on an
> SGI workstation, into a "new look" three-tiered client/server
> application designed for the specialized needs of several business
> markets:
> 
> * Information Providers and Content Aggregators like Lexis/Nexis and
>   Individual Inc.
> * Publishers such as Ziff Davis and Knight-Ridder.
> * Corporate Intranets and archives.
> 
> Our new product application will consist of three modules:
> 
> * NT or UNIX server software for capturing and organizing text
>   documents.
> * An editorial tool for creating and publishing customized content
>   maps.
> * Java-based client software used for navigating content maps and
>   linking to documents.
> 
> SPIRE is available today for $5,000/seat, which will be fully credited
> toward the purchase of our new client/server application, to be
> released in Q1 '98. An early adopter/beta site program for our "new
> look content mapping software" will begin this November. ThemeMedia
> will provide additional information once a Confidentiality Agreement is
> signed.
> 
> For more information and details please contact:
> 
> Steve Ardire
> Sr. Director Business Development
> steve at thememedia.com
> PH: 425-602-3559


More information about the Web4lib mailing list