SPIRE retrieval engine

Ernest Perez perez at opac.osl.state.or.us
Fri Oct 3 15:41:52 EDT 1997


I'd not heard of this product before running across it in a small
mention in an article about intelligence community applications. 
Article was in InfoWorld, as I recall.

I dropped an e-mail to them, and got the following reply. Thought it
might be of interest to others.  Anyone on the list have experience with
this, and wish to comment?

-ernest
Ernest Perez//Oregon State
Library//perez at opac.osl.state.or.us//503-378-4243
----------------------------------------------------------------------------
Paradise is exactly like where you are right now, only much, much
better.



---------------------------------------------------

Ernest,

The following Word RTF doc will bring you up-to-date on ThemeMedia and
our software applications. Please contact me with any questions.

 
Thanks,

Steve Ardire
Sr. Director Business Development
steve at thememedia.com
PH: 425-602-3559

ThemeMedia Company Backgrounder and Technology Overview

ThemeMedia is developing software tools for "content mapping" - a
process that graphically represents thousands of unstructured documents
on a single computer screen for quick, focused navigation, retrieval,
and insight.

Our work is based on technology that emanated from the Battelle Pacific
Northwest National Laboratory (PNNL) under contract with the U.S.
Department of Energy. PNNL was asked to design software that could help
intelligence and national security research staffs efficiently access
thousands of publications, documents, and transcripts strewn across the
world.

The result was SPIRE, the acronym for "Spatial Paradigm for Information
Retrieval and Exploration," a software system for transforming
text-based information retrieval into a visual system for navigation,
retrieval, and analysis. Over the last three years, SPIRE has been
actively used by the U.S. intelligence community for research and
analysis involving matters of national security. In October 1996, the
founders of ThemeMedia acquired the exclusive worldwide license to SPIRE
technology and formed a company around a core group of the original
SPIRE team.

Desperately seeking information

Today, the information search method of choice is based on Boolean
logic, whereby a document must include one or more user-specified terms,
or keywords, to make it eligible for consideration. Existing search
engines, such as those offered by Yahoo, Excite, AltaVista, Lycos, and
others, typically generate a list of hundreds or thousands of documents,
with only limited ability to order them by relevance. Moreover, there is
no common measure of relevance to help information seekers determine
true value. What AltaVista considers relevant for a particular query,
Lycos may relegate to a position of less importance farther down the
list. Users are not only at the mercy of how each company defines
relevance; they also have no way of evaluating the methodology behind
the retrieval process - no way of actually seeing the relationships
among the documents listed.
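
As a rough illustration of the keyword matching described above, here is
a minimal Python sketch. It is not taken from any of the search engines
named; the function name boolean_search, the "any"/"all" modes, and the
sample documents are all assumptions made up for this example.

```python
def boolean_search(documents, required_terms, mode="any"):
    """Return IDs of documents containing any (OR) or all (AND) of the terms.

    Hypothetical helper for illustration only: documents is a dict of
    {doc_id: text}, and matching is on whole lowercase words.
    """
    terms = {t.lower() for t in required_terms}
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        matched = terms & words
        if (mode == "any" and matched) or (mode == "all" and matched == terms):
            hits.append(doc_id)
    return hits


# Made-up sample collection.
docs = {
    "a": "nuclear energy policy report",
    "b": "energy markets and oil prices",
    "c": "library catalog software",
}

print(boolean_search(docs, ["energy", "policy"], mode="any"))  # ['a', 'b']
print(boolean_search(docs, ["energy", "policy"], mode="all"))  # ['a']
```

Note that the result is a flat list of matches: a document either
qualifies or it does not, and nothing in the output says how the matching
documents relate to one another.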

The weakness of the Boolean search has to do with the user's role in two
standard retrieval measures: precision and recall. Recall measures how
well a search produces all the documents that fit the search criteria,
while precision measures how successful the search is at eliminating
irrelevant documents from that pool. If information seekers were capable
of knowing exactly what they wanted and then how to ask for it, there
wouldn't be a problem. But, understandably, it's extremely difficult for
most of us to state our precise information needs to a database we can't
see and have never explored. As a result, Boolean searches often return
too much irrelevant information or not enough of what we really need.
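
The two measures can be made concrete with a short Python sketch. It
uses the standard textbook definitions (precision = relevant retrieved /
all retrieved; recall = relevant retrieved / all relevant); the function
name precision_recall and the document IDs are assumptions for
illustration only.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query.

    retrieved: IDs the search returned.
    relevant:  IDs a human judge considers relevant (assumed to be known).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_hits = retrieved & relevant
    # Guard against empty sets to avoid division by zero.
    precision = len(true_hits) / len(retrieved) if retrieved else 0.0
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 documents returned, 3 actually relevant, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this example half of what came back is noise (precision 0.50), and
one of the three relevant documents was missed (recall about 0.67).
Broadening the query tends to raise recall at the cost of precision, and
narrowing it does the opposite.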

Given the sheer size and number of databases now available, the sweeping
diversity of information, and the lack of a common categorization
scheme, it seems unlikely that Boolean-based search methods can
effectively manage our ever-increasing information retrieval needs.

Information Visualization and Relevance

As frustration with existing information retrieval methods mounts, the
appeal of visualization technologies grows. Visually based software
tools, like those being developed by ThemeMedia, give users a quick way
to actually see everything available to them from a given information
set, with topics and documents grouped by degree of similarity and level
of importance.

ThemeMedia's System for Information Discovery (SID) starts by capturing
any number of documents into a database. By analyzing patterns of word
usage and relationships between words, SID autonomously discovers
salient themes, derives semantic distances between them to represent
degrees of similarity, and transforms the results into vector
representations arranged to reveal document relevance.
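
The backgrounder does not say how SID computes its vectors or distances.
As a loosely related illustration only, the Python sketch below uses a
generic bag-of-words technique (raw term counts compared by cosine
distance), which is an assumption on our part and not ThemeMedia's
algorithm. The names term_vector and cosine_distance and the sample
documents are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a document as a sparse vector of lowercase word counts."""
    return Counter(text.lower().split())


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0.0 means same direction, 1.0 no overlap."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty document as unrelated to everything
    return 1.0 - dot / (norm_u * norm_v)


# Made-up sample collection.
docs = {
    "a": "nuclear energy policy report",
    "b": "energy policy and oil markets",
    "c": "library catalog software",
}
vectors = {doc_id: term_vector(text) for doc_id, text in docs.items()}

for other in ("b", "c"):
    d = cosine_distance(vectors["a"], vectors[other])
    print(f"distance(a, {other}) = {d:.3f}")
```

Documents "a" and "b" share two words and so land closer together than
"a" and "c", which share none. A visual map of the kind described here
would need a further step, placing documents on a screen so that these
pairwise distances are roughly preserved; that step is not shown.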

In this way, ThemeMedia's technology eliminates the typical
precision/recall dilemma faced by information seekers - whether to
retrieve all potentially relevant documents (recall) or only those that
are unquestionably relevant (precision). By using ThemeMedia "visual
content maps" to display information, users can immediately see
everything available, along with the relationships between content and
the location of information. Our processes discriminate for the user by
recalling information in ample detail through visualization.
Consequently, the user is spared irrelevant information, while quickly
and precisely navigating to all relevant documents.

ThemeMedia Software Applications

ThemeMedia is in the process of transforming SPIRE, an existing
standalone application designed primarily for analysts that runs on an
SGI workstation, into a "new look" three-tiered client/server
application designed for the specialized needs of several business
markets:

* Information Providers and Content Aggregators like Lexis/Nexis and
  Individual Inc.
* Publishers such as Ziff Davis and Knight-Ridder.
* Corporate Intranets and archives.

Our new product application will consist of three modules:

* NT or UNIX server software for capturing and organizing text
  documents.
* An editorial tool for creating and publishing customized content maps.
* Java-based client software used for navigating content maps and
  linking to documents.

SPIRE is available today for $5,000/seat; that amount will be fully
credited toward the purchase of our new client/server application, which
will be released in Q1 '98. An early adopter/beta site program for our
"new look content mapping software" will begin this November. ThemeMedia
will provide additional information once a Confidentiality Agreement is
signed.

For more information and details please contact:

Steve Ardire
Sr. Director Business Development
steve at thememedia.com
PH: 425-602-3559

