proxy access issues

Richard L. Goerwitz III richard at goon.stg.brown.edu
Tue Aug 29 08:51:29 EDT 2000


I've received a number of queries about Brown's pass-through proxy
in response to my August 18th "on-site/off-site proxy access issues"
posting (asking for help beta testing our latest pass-through
system).  Although the issue of proxying comes up regularly here, there
never seems to be enough information out there about it.  So let me
offer a quick summary of how our system works, and why we did things
the way we did (which will answer the vast majority of questions I
have received).

Basically, Brown's pass-through proxy is a simple rewriting proxy
server geared especially for libraries.  Like a standard proxy, it
requests web pages on behalf of clients, substituting its own IP address
for those of the clients.  Unlike a standard proxy, however, a
pass-through proxy can be made to look to the clients just like an
origin webserver - so that clients don't need to be reconfigured in
any way to use it (i.e., no changes need to be made to proxy settings;
no PAC file changes, either).

In effect, what pass-through proxy servers do is present real-time
"mirrors" of the web servers their clients are connecting to - mirrors
that look just like the servers they are mirroring.  A pass-through
proxy is therefore a kind of two-way proxy; i.e., a proxy that fools
the client into thinking it is talking to an end-point or origin web
server and, at the same time, presents itself to the remote
"mirrored" web server as just another web client.
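The "mirror" idea can be sketched as a reversible mapping between an
origin URL and the URL under which the proxy presents it.  The post does
not say which naming scheme Brown's proxy uses, so everything here is an
assumption made for illustration: the hostname suffix
".proxy.example.edu", the function names, and the choice to encode the
origin host in the proxy's hostname.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical suffix; the original post does not describe the real scheme.
PROXY_SUFFIX = ".proxy.example.edu"


def to_proxy_url(origin_url: str) -> str:
    """Return the mirror URL a client would use instead of the origin URL."""
    parts = urlsplit(origin_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an absolute http(s) URL: {origin_url!r}")
    # Only the hostname changes; path, query and fragment pass through.
    # Simplification: any explicit port on the origin URL is dropped.
    mirror_host = parts.hostname + PROXY_SUFFIX
    return urlunsplit(
        (parts.scheme, mirror_host, parts.path, parts.query, parts.fragment)
    )


def to_origin_url(proxy_url: str) -> str:
    """Recover the origin URL the proxy should fetch on the client's behalf."""
    parts = urlsplit(proxy_url)
    host = parts.hostname or ""
    origin_host = host[: -len(PROXY_SUFFIX)] if host.endswith(PROXY_SUFFIX) else ""
    if not origin_host:
        raise ValueError(f"not a mirror URL: {proxy_url!r}")
    return urlunsplit(
        (parts.scheme, origin_host, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    mirror = to_proxy_url("http://journals.example.com/issue/1?id=5")
    print(mirror)
    print(to_origin_url(mirror))
```

With a host-based mapping like this one, relative links in a fetched
page resolve against the mirror hostname on their own, so only fully
qualified URLs need special handling.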

What makes Brown's proxy different from most other pass-through proxy
servers is that it rewrites fetched pages on the return trip.
Rewriting prevents fully qualified URLs from diverting clients to the
actual remote servers that it is mirroring.  The overall effect of
rewriting is to keep clients on the proxy - without requiring them to
reconfigure their browsers.  To patrons it seems as if they're using a
PAC file (which diverts them, as needed, to the right proxy).  But
they don't have to actually do anything to their browsers.
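As a rough illustration of rewriting on the return trip, the sketch
below replaces fully qualified URLs in href, src and action attributes
with mirror URLs and leaves relative links alone.  It is not Brown's
code (which is written to the Apache/Perl API).  The hostname suffix,
the function names and the regular-expression approach are all
assumptions, and a production rewriter would need a real HTML parser
and would also have to handle redirects, scripts and the like.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical suffix under which the proxy mirrors remote hosts.
PROXY_SUFFIX = ".proxy.example.edu"

# href/src/action attributes whose quoted value is an absolute http(s) URL.
ATTR_RE = re.compile(
    r'''(?P<attr>\b(?:href|src|action)\s*=\s*)(?P<q>["'])'''
    r'''(?P<url>https?://[^"'\s>]+)(?P=q)''',
    re.IGNORECASE,
)


def proxied(url: str) -> str:
    """Map an absolute URL onto the (hypothetical) mirror hostname."""
    parts = urlsplit(url)
    host = parts.hostname
    if not host or host.endswith(PROXY_SUFFIX):
        return url  # malformed, or already pointing at the proxy
    return urlunsplit(
        (parts.scheme, host + PROXY_SUFFIX, parts.path, parts.query, parts.fragment)
    )


def rewrite_html(html: str) -> str:
    """Rewrite fully qualified links so that clients stay on the proxy."""

    def replace(match: re.Match) -> str:
        quote = match.group("q")
        return f"{match.group('attr')}{quote}{proxied(match.group('url'))}{quote}"

    return ATTR_RE.sub(replace, html)


if __name__ == "__main__":
    page = (
        '<a href="http://journals.example.com/toc.html">Contents</a>'
        '<img src="/logo.gif">'
    )
    print(rewrite_html(page))
```

The relative "/logo.gif" link is left untouched because the browser
already resolves it against the proxy; only the fully qualified link
would otherwise have led the patron off the proxy.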

Pass-through proxies can be made much more secure than regular proxy
servers because they can use a wider range of authentication systems
(including cookie-based ones).  With standard proxy servers, one is
pretty much limited to some form of HTTP basic authentication.  And
most standard proxy servers require that people's usernames and
passwords be sent out "in the clear", where they can often be "sniffed"
by hostile intermediaries - and enterprising graduate students :-).
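To see why basic authentication amounts to sending credentials in the
clear, note that the scheme only base64-encodes "username:password";
nothing is encrypted.  The short sketch below builds such a header value
and then recovers the credentials from it, as anyone able to observe the
traffic could.  The function names are made up for illustration, and the
credentials are the well-known example pair from the HTTP
authentication RFC.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value a client sends for HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def sniff_credentials(header_value: str) -> tuple[str, str]:
    """Recover username and password from a captured header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic authentication header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The username cannot contain ':', so split on the first one only.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("Aladdin", "open sesame")
    print(header)
    print(sniff_credentials(header))
```

Decoding needs no key or secret, which is why the encoding offers no
protection unless the connection itself is encrypted.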

Brown's pass-through proxy runs as a series of modules written to the
Apache/Perl API.

At least one commercial product, EZProxy, offers similar functionality.

(EZProxy has recently begun to be noticed by the library community, and
has received some good reviews.  EZProxy is not based on Brown's
pass-through proxy.  Although ideas have been shared between the two
efforts, development of Brown's pass-through proxy system and of
EZProxy has proceeded along very different lines [EZProxy, in
particular, is commercially supported software, and runs as a
stand-alone system under Linux and NT; Brown's pass-through system is
essentially open source and runs under Apache].)

The aim of my August 18th posting was to gather beta testers for the
latest version of our pass-through proxy system.  It's hard to get enough
testing, so anyone who'd like to participate should continue to feel
free to write and request the modules (which come as a 'tar' file ready
for unpacking and installation on Unix/Linux-like systems).

Richard Goerwitz
Brown University


More information about the Web4lib mailing list