Automating Account & Password Logins on the Web : Summary

Scott Bauer sbauer at ccnet.com
Fri Mar 7 17:55:44 EST 1997


About three weeks ago I posted (to Web4lib and DIGLIBNS) a request for
information on enabling people to log on to databases on the web without
their having to know the account name or password. I received responses
from the following people (to whom I wish to express many, MANY thanks):

Walter Henry (whenry at lindy.stanford.edu)
Bob Norris (renorris at mbay.net)
Morton Skogly (morton.skogly at stud.jbi.hioslo.no)
Bob Waldstein (wald at library.mt.lucent.com)
Bin Zhang (bzhang at lama105.kcc.hawaii.edu)

The summary (following a brief recap of the original message) is compiled
from their responses and my own research: 

==========

 > Our library would like to offer access to a web-based newspaper database
 > (Knight-Ridder's, which includes our local paper, the Contra Costa Times.)
 > The database allows free searching, but an account name and password must
 > be entered whenever a link to the full-text of the article is selected.
 > We have enquired about site-licensing, but this option seems to be
 > unavailable at the present time.
 >
 > I am looking for a way to have the account name and password entered
 > automatically, with the password hidden from the user.

Possible solutions to the "login" problem include:

(1) Modify the .htaccess file on the server to allow access from certain
IP addresses.  This would allow "trusted" workstations to access the
databases without needing to enter a password.

I believe that this is the solution used by Information Access (among
other vendors) for web access to their databases.  It does indeed make
access very easy.  Drawbacks of the approach are that not all vendors
are willing to do this, and that it makes it difficult for remote users to
access those databases (since they are not coming from the "trusted" IP
addresses.)
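
To make the idea concrete, the fragment below is roughly what such a
server-side restriction looks like (classic Apache host-based access
control; the address range is invented for illustration):

  # Let the library's "trusted" workstations in without a password;
  # requests from any other address are refused.
  Order deny,allow
  Deny from all
  Allow from 192.168.10.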

Somewhat related to this is the use of "cookies" to store the login and
password information on a PC.  As with the above, the vendor must offer
this access option.  It has the advantage that remote users could
conceivably obtain access from anywhere (if they have the proper "cookie"
on their PC).

A major disadvantage might be the ease with which the cookie file could be
moved from one PC to another, leading to many unauthorized users having
access to the database (and charges to the library account.)  Not to
mention other qualms that people have with the overall approach of
using "cookies".... 

(2) The "macro key" approach.  Use a keyboard macro program to place the
login and password information on a key that can be pressed when the
password is requested.

This solution has several problems:

(a) It mucks up the user interface by requiring that the user know about
and press the key(s) when prompted.
(b) The password is very vulnerable.  If the key was pressed while in
a text-editor, for example, the password would be displayed.  This could
happen even if the user was in a browser's "kiosk" mode, if they are able
to access a form of some kind....
(c) On some platforms, running a macro program in the background adds to
the instability of the programs being used (making crashes more likely.)
(d) If you need to do this for multiple databases, the user needs to know
which keys are appropriate for which database.
(e) The macro program needs to be running on all PCs that can access the
databases.

The major advantage of the solution is that it is very easy to implement.
It seems that it might be viable for libraries where staff are the only
ones with access to the databases, and where few databases are used....

(3) OCLC's WebScript (or other scripting solutions.)

OCLC designed WebScript to automate access to databases on their
FirstSearch service.  I need to brush up on my CGI security
procedures before trying to implement this, but it seems that the
program can be adapted to work with other databases besides
FirstSearch.

Information about WebScript, as well as links to downloading the
program, can be found at:

http://www.oclc.org/oclc/software/fsauto.htm

Thanks to Walter Henry (whenry at lindy.stanford.edu) for pointing me toward
this as a possible solution to non-OCLC database access.  He also
mentioned that the script language "expect" could be used to achieve
similar results.  (I expect that Perl, Tcl or other scripting languages
could be used as well -- if one knows what they are doing :-)
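
As a rough illustration of the scripting idea (sketched here in Python
rather than expect; the URL and form-field names are invented
placeholders), the script -- not the patron -- supplies the account and
password when the vendor's login form is submitted:

  import urllib.parse
  import urllib.request

  # Hypothetical login form; a real script would use whatever URL and
  # field names the vendor's login page actually expects.
  LOGIN_URL = "http://vendor.example.com/login"
  form = {"account": "library-account", "password": "secret"}

  data = urllib.parse.urlencode(form).encode()
  with urllib.request.urlopen(LOGIN_URL, data=data) as resp:
      # The page returned after a successful login is what would be
      # handed back to the patron, who never sees the password itself.
      print(resp.read()[:200])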

Some of these scripted solutions would work hand-in-hand with the
following:

(4) Redirection of http: requests.

Instead of a user's request going directly to the database, it is sent to
an intermediate server.  This server records where the request came from,
forwards the request to the database, supplies the proper login and
password when the database asks for it, receives the data returned by the
database, and passes it along to the user who originally issued the
request.
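
A bare-bones sketch of the idea in Python follows.  The vendor address,
account and password are invented placeholders, the sketch assumes (for
simplicity) that the vendor uses ordinary HTTP Basic authentication, and a
real "middleman" server would also have to rewrite links in the returned
pages so that follow-up requests come back through it rather than going
straight to the vendor:

  import base64
  import urllib.request
  from http.server import BaseHTTPRequestHandler, HTTPServer

  VENDOR = "http://vendor.example.com"             # hypothetical database host
  ACCOUNT, PASSWORD = "library-account", "secret"  # kept on our server only

  class Middleman(BaseHTTPRequestHandler):
      def do_GET(self):
          # Re-issue the patron's request against the vendor site,
          # attaching the account/password (HTTP Basic) on their behalf.
          token = base64.b64encode(f"{ACCOUNT}:{PASSWORD}".encode()).decode()
          req = urllib.request.Request(VENDOR + self.path,
                                       headers={"Authorization": "Basic " + token})
          with urllib.request.urlopen(req) as resp:
              body = resp.read()
              self.send_response(resp.status)
              self.send_header("Content-Type",
                               resp.headers.get("Content-Type", "text/html"))
              self.send_header("Content-Length", str(len(body)))
              self.end_headers()
              self.wfile.write(body)

  HTTPServer(("", 8080), Middleman).serve_forever()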

This approach offers several advantages over the alternatives mentioned
above, while introducing a few new problems to consider.  Bob
Waldstein (wald at library.mt.lucent.com) highlighted a few of the issues
involved: 

>...not only is getting the account/password available securely to
>your whole population a problem - but there are also problems if the
>supplier needs to know the number of distinct users. On top of this is
>the issue of whether the library is billed and then needs to bill back based
>on the number of users.
>
>So here is an approach - with some problems enumerated - that I/my
>library is playing with:
>
>  - All the connects go through our web server, and our web server then
>    acts as a client to the remote, sending a password as needed. 
>  - So actually our server is acting as a caching/proxy server for a 
>    specific site or set of sites:
>
>Advantages: 
>   + We control passwords.
>   + We know users (our server authenticates).
>   + Can track URL changes.
>   + In addition we can "play" with the pages - e.g. change the top
>     (home), add connectors to our library to the bottom and/or top.
>   + Also we are caching pages - which can help if the internet/firewall
>     is slow.
>
>Disadvantages (and they are major):
>   + We are in the middle, can't help but slow things somewhat.
>   + Doing this is hard - that is correctly being in the middle -
>     note all the pages have to make sure they come back through the
>     middleman server, not directly to the outside. Actually, I think if
>     the pages have fancy Java or VBScript this can be made unworkable.
>   + Not sure this is specific to the above, but the user sees the
>     library (and/or the library server) as responsible for the content;
>     which in a way is good, since we are acquiring it.  But what if
>     there is Netscape or InternetExplorer specific tagging? Do we really
>     get the responsibility? But maybe this is good - gives us more
>     control over the vendor resource we are purchasing - maybe move this
>     to an advantage -).

I would also like to point you to an article that will be appearing in the
14 Mar issue of D-Lib magazine.  Written by Bob Norris (renorris at mbay.net) 
and Denise Duncan, it has the working title of "Sink or Swim: The U.S.
Navy's Virtual Library Project." The article describes the U.S. Navy
Lab/Center Coordinating Group's Distributed Virtual Library project (NVL),
which was initiated in 1995 to improve desktop access and delivery of scientific
and technical information.  The challenges confronted and the solutions
developed in this effort strike me as being at the heart of many of the
problems all libraries will be facing in the very near future.
My thanks to Bob for letting me read an advance draft of the article, which
I cannot recommend too highly.


Scott Bauer   sbauer at ccnet.com   OR   sbauer at mail.contra-costa.lib.ca.us
Contra Costa County Library


