[Web4lib] web site search engines
John Fereira
jaf30 at cornell.edu
Thu Sep 22 10:38:34 EDT 2005
At 09:50 AM 9/22/2005, Mark Costa wrote:
>I am looking for a good web site search engine that I can place on our
>library's web site. It needs to be free, easy to implement, and not be
>Google.
>It's not that I have a beef with Google, it's just that they refuse to fix
>their statistics reporting program.
You might want to look at Lucene. One of the nice things about Lucene is
that it's not Google, but it supports Google-like query parsing (almost by
default). For example, combining boolean expressions with phrases can be
difficult to implement with some search engines, but the default query
parser for Lucene already handles it. Using a Google-like query parser can
be a huge win, as it doesn't require explaining yet another query syntax.
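
To give a rough idea of what "Google-like" means here, below is a toy
sketch in Python. It is not Lucene's parser or API, and the function names
(parse_query, matches) are made up for illustration. It only assumes the
familiar conventions of a leading + for a required term, a leading - for an
excluded term, and double quotes around a phrase, and it uses crude
substring matching rather than a real index.

```python
import re

# One token = optional +/- sign, then either a "quoted phrase" or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into (required, optional, excluded) lowercase terms/phrases."""
    required, optional, excluded = [], [], []
    for sign, phrase, word in TOKEN.findall(query):
        text = (phrase or word).lower()
        if not text:
            continue  # skip empty quotes such as ""
        if sign == "+":
            required.append(text)
        elif sign == "-":
            excluded.append(text)
        else:
            optional.append(text)
    return required, optional, excluded


def matches(query, page_text):
    """Return True if page_text satisfies the query (substring matching only)."""
    required, optional, excluded = parse_query(query)
    text = page_text.lower()
    if any(term in text for term in excluded):
        return False
    if not all(term in text for term in required):
        return False
    # With no required terms, at least one optional term must appear.
    return bool(required) or any(term in text for term in optional)
```

With this sketch, a query such as +renew "interlibrary loan" -fines would
match a page about renewing an interlibrary loan and reject one that
mentions fines. A real engine would also rank the results, which this
example does not attempt.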
The other big advantage with Lucene is that it allows an index to be
created with multiple fields. For example, you can combine full text from
the static HTML files on your site with metadata in a backend database
that might be used for dynamically generated pages. That metadata could
include, for example, subsection information (e.g. search only pages in
the "help" section) or temporal metadata (only show pages which have
changed in the past week).
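
As a rough illustration of the multi-field idea, here is a small Python
sketch. Again, this is not Lucene's API: the field names (body, section,
modified), the sample pages, and the functions build_index and search are
all hypothetical. It just shows how full text and metadata can sit side by
side so that one query can be narrowed by section or by modification date.

```python
from datetime import date

# Hypothetical pages: full text plus metadata that might come from a database.
PAGES = [
    {"url": "/help/renew.html", "section": "help",
     "body": "How to renew a book online", "modified": date(2005, 9, 20)},
    {"url": "/news/hours.html", "section": "news",
     "body": "New hours and renew desk", "modified": date(2005, 9, 21)},
    {"url": "/help/old.html", "section": "help",
     "body": "Renew by phone", "modified": date(2005, 6, 1)},
]


def build_index(pages):
    """Build an inverted index mapping (field, term) to a set of page URLs."""
    index = {}
    for page in pages:
        for field in ("body", "section"):
            for term in page[field].lower().split():
                index.setdefault((field, term), set()).add(page["url"])
    return index


def search(index, pages, term, section=None, changed_since=None):
    """Find pages whose body has `term`, optionally filtered by metadata."""
    # Copy the posting set so the filters below never modify the index.
    hits = set(index.get(("body", term.lower()), set()))
    if section is not None:
        hits &= index.get(("section", section.lower()), set())
    if changed_since is not None:
        hits &= {p["url"] for p in pages if p["modified"] >= changed_since}
    return sorted(hits)
```

For instance, search(build_index(PAGES), PAGES, "renew", section="help",
changed_since=date(2005, 9, 15)) narrows three matching pages down to the
one help page that changed recently.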
Lucene was originally written as an open source Java API, but there are
implementations in other languages (the Perl port is called Plucene).