[Web4lib] web site search engines

John Fereira jaf30 at cornell.edu
Thu Sep 22 10:38:34 EDT 2005


At 09:50 AM 9/22/2005, Mark Costa wrote:
>I am looking for a good web site search engine that I can place on our
>library's web site. It needs to be free, easy to implement, and not be
>Google.
>  It's not that I have a beef with Google, it's just that they refuse to fix
>their statistics reporting program.

You might want to look at Lucene.  One of the nice things about Lucene is
that it's not Google, yet it supports Google-like query parsing (almost by
default).  For example, combining boolean expressions with phrases can be
difficult to implement with some search engines, but Lucene's default query
parser already handles it.  A Google-like query parser can be a huge win
because you don't have to explain yet another query syntax to your users.
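
To give a rough idea of what "boolean expressions with phrases" means in
practice, here is a minimal sketch in Python.  It is not Lucene's query
parser and does not use any Lucene API; the function names parse_query and
matches are made up for illustration, and matching is done by plain
substring search rather than against a real index.  It only shows the style
of query (+required, -prohibited, "quoted phrase") that users already know
from Google.

```python
import re

# Toy tokenizer for a Google-like query string. This is NOT Lucene's
# query parser; it only illustrates the kind of syntax involved.
# An optional + or - sign is followed by a quoted phrase or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')


def parse_query(query):
    """Split a query into required, prohibited, and optional clauses."""
    clauses = {"required": [], "prohibited": [], "optional": []}
    for sign, phrase, word in TOKEN.findall(query):
        text = (phrase or word).lower()
        kind = {"+": "required", "-": "prohibited"}.get(sign, "optional")
        clauses[kind].append(text)
    return clauses


def matches(clauses, text):
    """Return True if the page text satisfies the parsed clauses.

    Uses naive substring matching, which a real engine would not do.
    """
    text = " ".join(text.lower().split())
    if any(c not in text for c in clauses["required"]):
        return False
    if any(c in text for c in clauses["prohibited"]):
        return False
    # With no required clause, at least one optional clause must match.
    if not clauses["required"] and clauses["optional"]:
        return any(c in text for c in clauses["optional"])
    return True


query = parse_query('+"interlibrary loan" -fees renewal')
print(matches(query, "Interlibrary loan requests and renewal"))  # True
print(matches(query, "Interlibrary loan fees"))                  # False
```

The point is that the phrase and the boolean operators live in one query
string, so nobody has to learn a separate advanced-search form.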

The other big advantage of Lucene is that it allows an index to be created
with multiple fields.  For example, you can combine full text from the
static HTML files on your site with metadata from a backend database
that might be used for dynamically generated pages.  That metadata could
include, for example, subsection information (e.g. search only pages in
the "help" section) or temporal metadata (only show pages which have
changed in the past week).
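
As a rough illustration of the multi-field idea, here is a small Python
sketch.  It is not Lucene code: it scans a list of records instead of
building an inverted index, and the field names (url, section, modified,
body), the sample pages, and the search function are all hypothetical.  It
only shows how full text and metadata can sit side by side in one record
and be combined in a single search.

```python
from datetime import date, timedelta

# Each indexed page is a record with several fields: full text taken
# from the HTML file plus metadata that might come from a backend
# database. All field names and sample pages here are hypothetical.
PAGES = [
    {"url": "/help/renew.html", "section": "help",
     "modified": date(2005, 9, 20),
     "body": "How to renew a book online"},
    {"url": "/help/hours.html", "section": "help",
     "modified": date(2005, 6, 1),
     "body": "Library hours and how to renew in person"},
    {"url": "/news/renewals.html", "section": "news",
     "modified": date(2005, 9, 21),
     "body": "New policy: renew loans by email"},
]


def search(pages, term, section=None, changed_since=None):
    """Return URLs whose body contains term, filtered by optional metadata."""
    hits = []
    for page in pages:
        # Full-text condition (naive substring match on the body field).
        if term.lower() not in page["body"].lower():
            continue
        # Subsection filter, e.g. only pages in the "help" section.
        if section is not None and page["section"] != section:
            continue
        # Temporal filter, e.g. only pages changed in the past week.
        if changed_since is not None and page["modified"] < changed_since:
            continue
        hits.append(page["url"])
    return hits


# A fixed "today" keeps the example reproducible.
today = date(2005, 9, 22)
last_week = today - timedelta(days=7)

print(search(PAGES, "renew", section="help"))
print(search(PAGES, "renew", changed_since=last_week))
```

In Lucene itself the same kind of restriction would be expressed against
fields in the index, for instance with a fielded query such as
section:help.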

Lucene was originally written as an open source Java API, but there are
implementations in other languages (the Perl port is called Plucene).


   


