[WEB4LIB] Dynamic DB content and SE indexing.
Avi Rappoport
avirr at LanMinds.Com
Tue Jun 1 17:58:31 EDT 1999
At 1:15 PM -0700 6/1/1999, Matthew Theobald wrote:
>All, (any),
>
>How are portal search engines starting to deal with dynamically driven
>information?
>(i.e. information that cannot be indexed as simply as static HTML)
I'm writing a paper about search index bots right now, and it's a
fascinating topic!
Right now, most webwide search indexers just ignore anything with ?
or $ in the URL. They have too much stuff and don't want to index
every single item in every database.
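To make that rule concrete, here is a minimal sketch of the kind of filter I mean. The function name and the example URLs are my own illustrations; no particular search engine is known to implement it exactly this way.

```python
def looks_dynamic(url: str) -> bool:
    """Return True if a crawler following the rule above would skip this URL.

    Assumption: the rule is a plain character test for "?" or "$"
    anywhere in the URL, with no further analysis.
    """
    return "?" in url or "$" in url


def indexable(urls):
    """Keep only the URLs that such a crawler would be willing to fetch."""
    return [u for u in urls if not looks_dynamic(u)]
```

Under this rule a URL such as `http://example.org/cgi-bin/lookup?id=42` is dropped, while `http://example.org/history/africa/index.html` is kept.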
For specialized portals, intranets, and site search engines, you may be
able to override this. It's a definite feature of the better site
search tools.
There's also a workaround that you can use. Make a hierarchical map
into your database as several ever-more specific pages. When you get
to the level that points to your database, you can make a link to
each book.
ASCII art version of this for the visually inclined (ignore my
categories, I know they're not Dewey or LC)
History
  Africa
    Early History
    Colonial Period
    20th Century
      1900-1909
        Book 1
        Book 2
        Book 3
        Book 4
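A map like this could be generated rather than written by hand. The following is only a sketch under my own assumptions: the nested `TAXONOMY` structure, the `slug` and `build_pages` names, and the `/item/N.html` link format are all hypothetical, and a real catalog would read its categories from the database.

```python
from html import escape

# Hypothetical taxonomy: dicts are categories, lists hold (title, item id)
# pairs for the records at the bottom level.
TAXONOMY = {
    "History": {
        "Africa": {
            "Early History": [],
            "Colonial Period": [],
            "20th Century": {
                "1900-1909": [("Book 1", 1), ("Book 2", 2),
                              ("Book 3", 3), ("Book 4", 4)],
            },
        },
    },
}


def slug(name):
    """Turn a category name into a URL-friendly directory name."""
    return name.lower().replace(" ", "-")


def build_pages(node, path="", title="Catalog"):
    """Return {relative file path: HTML}, one index page per category."""
    pages = {}
    links = []
    if isinstance(node, dict):
        # Category level: link to each sub-category's own index page.
        for name, child in node.items():
            links.append((name, f"{slug(name)}/index.html"))
            pages.update(build_pages(child, f"{path}{slug(name)}/", name))
    else:
        # Bottom level: link to the individual records.
        for book_title, item_id in node:
            links.append((book_title, f"/item/{item_id}.html"))
    items = "\n".join(
        f'<li><a href="{escape(href)}">{escape(text)}</a></li>'
        for text, href in links
    )
    pages[f"{path}index.html"] = (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><h1>{escape(title)}</h1><ul>\n{items}\n</ul></body></html>"
    )
    return pages
```

Calling `build_pages(TAXONOMY)` returns one page per level of the outline, each containing only plain links that an indexer can follow down to the records.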
One way to link to the book is a live link that looks like a static
URL but is actually a request for the item from the database (this is
how Amazon does it). It's a good idea to be sure the modified date is
filled in correctly, so that data which doesn't change doesn't have to
be re-indexed. If your database can't take that kind of URL, you may
have to generate the file in response to a ? or $ query and then save
it as a static file. That's not so bad for library catalogs, which
just have to regenerate a few updated records (though it's a pain for
those with very dynamic data).
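Here is a rough sketch of the static-looking URL idea, including the modified date. Everything named here is an assumption for illustration: the `/item/N.html` pattern, the in-memory `RECORDS` table standing in for the database, and the `handle` function, which returns a status, headers, and body instead of running inside a real web server.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from html import escape

# Hypothetical records standing in for the real database.
RECORDS = {
    42: {"title": "Book 1",
         "modified": datetime(1999, 5, 1, 12, 0, 0, tzinfo=timezone.utc)},
}

# A URL with no "?" or "$" that still identifies one database record.
ITEM_URL = re.compile(r"^/item/(\d+)\.html$")


def handle(path, if_modified_since=None):
    """Return (status, headers, body) for a static-looking item URL."""
    match = ITEM_URL.match(path)
    record = RECORDS.get(int(match.group(1))) if match else None
    if record is None:
        return 404, {}, ""

    # Report the record's real modification time, not the time of the request.
    headers = {"Last-Modified": format_datetime(record["modified"], usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through and send the page
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if record["modified"] <= since:
                # Unchanged since the indexer last looked: nothing to re-index.
                return 304, headers, ""

    title = escape(record["title"])
    body = (f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1></body></html>")
    return 200, headers, body
```

An indexer that sends back the `Last-Modified` value it saw earlier gets a 304 for unchanged records, so only updated records are fetched again. The same `body` string could instead be written to disk as a static file if the database can't serve this kind of URL.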
While you're at it, you get a nice taxonomy of your database for
users to browse through.
Hope that helps,
Avi
________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr at lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>