[WEB4LIB] Indexing a Local Newspaper Index on the Web

Darryl Friesen Darryl.Friesen at usask.ca
Tue Oct 19 16:51:52 EDT 1999


> I know that this is possible using MS Access and FrontPage (both

[shiver]

> of which we already have).  I have only just begun to read up on
> *how* to actually do it however.  <smile>  Before I get too far into
> the nuts & bolts I thought I'd check and see if anyone else has
> undertaken a similar project.

Yes, we have.  For the past several summers we've had students
scanning/typing several old Saskatchewan (print) indexes.  More recent
issues of newspapers are cataloged by people from our public library.  We've
merged these into a Saskatchewan Newspaper Index
(http://library.usask.ca/sni/).  This is not full text (is that what you are
looking into?), but starting soon we will be capturing full-text obituaries
from our local paper every day and feeding them into the database.

> Is anyone maintaining a dynamically searchable database on their
> website?  What program(s) are you using to create/maintain the
> database?

Perl. More Perl. A bit more Perl. Then finally OCLC's SiteSearch.

The custom Perl scripts are used to clean up the data (especially the old
stuff typed or scanned by students) or to add additional (MARC) fields.  It's
all fed into SiteSearch, which acts as the database server (and does a very
nice job of it too).  SiteSearch 4.1 (which we haven't yet implemented in
production) includes a web-based editor (called the Record Builder) which
allows records to be added/cataloged to a live database on the fly.  That
should eliminate some of the current Perl scripts and grunt work.
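
To give a flavor of the cleanup step, here is a minimal sketch, written in
Python rather than Perl.  It is not the production code: the
semicolon-delimited input layout, the function names clean_entry and
to_marc_like, the sample line, and the choice of MARC tags 245 and 773 are
all assumptions made for illustration.

```python
import re


def clean_entry(raw):
    """Normalize one typed/scanned index line into a dict of fields.

    Assumed (hypothetical) layout: "Headline; Newspaper; YYYY-MM-DD; p. N"
    """
    # Collapse runs of whitespace left over from scanning or typing.
    text = re.sub(r"\s+", " ", raw).strip()
    parts = [p.strip() for p in text.split(";")]
    if len(parts) != 4:
        return None  # malformed line: set aside for manual review
    headline, paper, date, page = parts
    # Drop a leading "p." or "p" so only the page number remains.
    page = re.sub(r"^p\.?\s*", "", page, flags=re.IGNORECASE)
    return {"headline": headline, "paper": paper, "date": date, "page": page}


def to_marc_like(entry):
    """Map cleaned fields onto MARC-style tags (tag choice is illustrative)."""
    source = "{0}, {1}, p. {2}".format(
        entry["paper"], entry["date"], entry["page"]
    )
    return [
        ("245", entry["headline"]),  # title statement
        ("773", source),             # host item: the newspaper issue
    ]


# Made-up sample line, not real index data.
sample = "  Grain  elevator fire ;  Example Gazette; 1923-07-14 ;p. 3 "
record = to_marc_like(clean_entry(sample))
```

Lines that don't split into exactly four fields come back as None, so they
can be set aside for a person to fix rather than loaded with bad data.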

Let me know if you want more info.


- Darryl

 ----------------------------------------------------------------------
  Darryl Friesen, B.Sc.                        Darryl.Friesen at usask.ca
  Programmer/Analyst                            http://gollum.usask.ca/
  Consulting & Development, Computing Services
  University of Saskatchewan                   "The Truth Is Out There"
 ----------------------------------------------------------------------




