[WEB4LIB] IP authentication failure

Sue Dentinger dentin at macc.wisc.edu
Tue Sep 7 16:08:35 EDT 1999


Thomas,

Our campus (U. of Wisconsin, Madison) has also been experimenting with
several different web caching products this summer. Our networking folks
tell me that most large universities are also experimenting with caching
off-campus traffic.  My campus now caches all static web pages from off
campus sites.  And you are correct, there are really two different
things going on here:  1) all off-campus web traffic is being proxied, and 2)
static webpages are being cached locally so that the traffic never goes
off campus to the remote site, but just pulls up the cached copy
instead.  There's logic in place (and constantly being tweaked) to
exclude all dynamic web pages from being cached, but these websites
still see a proxied request--i.e., the remote site sees the request as
coming from the IP address of the campus webcache machine in place of
the IP address for a specific workstation.  

Campus network folks say they need to do webcaching because of
saturation of the major Internet backbones.  They think they can realize
about a 20-30% reduction of off-campus web accesses by webcaching and
proxying. They can control how long they cache a page, with one hour
being what they use currently.  It's pretty clear this is going to be a
trend at all ISPs in the future even if it is a bit more pain for me. I
guess I consider this part of being a good network neighbor. I bet it
also will affect usage statistics vendors provide. 
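To make the one-hour lifetime concrete, here is a minimal sketch of a
cache whose entries expire after a fixed time.  It is not the campus
product's code; the class name TtlCache and the injectable clock are
assumptions made purely for illustration.

```python
import time

CACHE_TTL_SECONDS = 60 * 60  # one hour, the lifetime mentioned above


class TtlCache:
    """Tiny in-memory cache whose entries expire after a fixed lifetime.

    Hypothetical illustration only; a real web cache also honors
    server-supplied expiry headers, which this sketch ignores.
    """

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock      # injectable so the expiry can be tested
        self._entries = {}      # url -> (time stored, page body)

    def put(self, url, body):
        self._entries[url] = (self.clock(), body)

    def get(self, url):
        """Return the cached body, or None if missing or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[url]  # stale: force a fetch from the remote site
            return None
        return body
```

A page stored at noon would be served locally until 1:00, after which
the next request goes back off campus.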

Anyway, to realize these kinds of network traffic gains, we shouldn't go
around excluding websites right and left.  And it's really easy to blame
the webcache anytime there is the slightest networking problem.  We end
up working closely with the campus networking dept. to turn the webcache
on and off and test access, so we can find out what the cause of the
problem really is.  It's a pain, but I guess I see it as inevitable.

The problems we've seen?  Fortunately the webcache/proxy IP address was
already a part of the campus IP address range, so in general it has not
caused vendors to deny access.  The exceptions were account problems,
and fortunately there were only a few of those.

We've had to turn off all proxying/webcaching for UMI products--i.e.
anything to umi.com is excluded.

Wilsonweb products have not been a problem.  SilverPlatter resources
caused a small problem, but it appears to be more with the URL we were
using than a problem on their end, although I've not gone back to test
this as thoroughly as I should have.

WebofScience from ISI cannot handle web caching at all and we had to
exclude it.

Searchbank was also a problem and had to be excluded.

EBSCO had some problems with their web client back in June, so we
excluded them entirely from being cached.

But many other resources have had no reported problems. Generally any
URL with a "?" in it, indicating a dynamic query, is not cached, but it
is still proxied.
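The rules described so far (excluded sites bypass the webcache, URLs
with a "?" are proxied but not cached, everything else is proxied and
cached) can be sketched roughly as below.  This is only an illustration
under stated assumptions: the function name and the three labels are
made up, and umi.com is the only excluded host listed because it is the
only hostname given in this message.

```python
from urllib.parse import urlsplit

# Only umi.com is named explicitly above; the hostnames of the other
# excluded vendors are not given, so they are left out of this sketch.
EXCLUDED_DOMAINS = {"umi.com"}


def handling_for(url, excluded=EXCLUDED_DOMAINS):
    """Classify a request as 'direct', 'proxy-only' or 'proxy-and-cache'.

    Hypothetical helper, not the campus webcache's actual logic.
    """
    host = (urlsplit(url).hostname or "").lower()
    # Excluded sites skip the webcache entirely.
    if any(host == d or host.endswith("." + d) for d in excluded):
        return "direct"
    # A "?" marks a dynamic query: still proxied, never cached.
    if "?" in url:
        return "proxy-only"
    return "proxy-and-cache"


if __name__ == "__main__":
    for u in ("http://www.umi.com/",
              "http://example.org/search?q=cats",
              "http://example.org/index.html"):
        print(u, "->", handling_for(u))
```

Note that in the "proxy-only" case the remote site still sees the
webcache's IP address, which is why IP authentication can be affected
even for pages that are never cached.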

There have also been concerns that off-campus email services such as
hotmail now see the IP address of a web-based email user as being the
campus webcaching machine.  This could be a security problem, since you
can't trace back to the originating workstation.  Our campus folks
address this as follows:

"there is an IETF draft in process for an HTTP MIME header called:
X-Forwarded-For.

right now, in every GET request that the cache sends out, it includes an
X-Forwarded-For header in the request with the original client's IP
address.  so, if any of your sites are logging the actual request, they
will have the original client IP in the log entry."
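For a site that wants to log the workstation rather than the cache, a
minimal sketch of reading that header might look like the following.
The function name is hypothetical, and the addresses in the usage lines
are documentation placeholders, not real campus addresses.  When several
proxies are chained the header usually carries a comma-separated list
with the original client first, so the sketch takes the first entry.

```python
def original_client_ip(headers, peer_ip):
    """Return the address to log for one request.

    headers: mapping of HTTP header names to values for the request.
    peer_ip: address of the machine that actually connected (the cache).
    Hypothetical helper for illustration only.
    """
    for name, value in headers.items():
        # Header names are case-insensitive in HTTP.
        if name.lower() == "x-forwarded-for":
            first = value.split(",")[0].strip()
            if first:
                return first
    # No usable header: all we know is who connected to us.
    return peer_ip


if __name__ == "__main__":
    print(original_client_ip({"X-Forwarded-For": "192.0.2.17"},
                             "198.51.100.5"))
    print(original_client_ip({}, "198.51.100.5"))
```

Keep in mind that the header is supplied by the sender and can be
forged, so it is better suited to logging than to granting access.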


Then there was Grateful Med, the freely available version of Medline. 
NLM shut off access to the whole campus because too many requests came
from the same IP address (i.e. the campus proxy address).  They changed
their code to allow us back in once we explained that all traffic from
the campus would look to them like it was coming from one IP address. 
This has happened with a few other vendors as well, but they catch on
soon enough.
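I don't know what NLM actually changed in their code.  Purely as an
illustration of why a per-address request limit trips on a proxy, and
of one way a site could avoid that for proxies it knows about, here is
a sketch.  The function name, the limit and the addresses are all
assumptions.

```python
from collections import Counter


def over_limit(requests, limit, proxy_ips=frozenset()):
    """Return the set of addresses that made more than `limit` requests.

    requests:  iterable of (peer_ip, forwarded_for) pairs, where
               forwarded_for is the X-Forwarded-For value or "".
    proxy_ips: addresses the site has been told are shared proxies; for
               these, requests are counted per forwarded client instead.
    Hypothetical sketch, not any vendor's actual check.
    """
    counts = Counter()
    for peer_ip, forwarded_for in requests:
        if peer_ip in proxy_ips and forwarded_for:
            key = forwarded_for.split(",")[0].strip()
        else:
            key = peer_ip
        counts[key] += 1
    return {ip for ip, n in counts.items() if n > limit}


if __name__ == "__main__":
    # Two workstations, three requests each, all arriving via one proxy.
    reqs = ([("198.51.100.5", "192.0.2.1")] * 3 +
            [("198.51.100.5", "192.0.2.2")] * 3)
    print(over_limit(reqs, limit=5))
    print(over_limit(reqs, limit=5, proxy_ips={"198.51.100.5"}))
```

Counted by connecting address, the proxy exceeds the limit even though
neither workstation does; once the proxy is known, neither is flagged.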

We also have to exclude any resource which is licensed for only, say,
one building on campus.  Otherwise the vendor sees the IP address of the
webcache instead of an address in that building, and the folks in that
building can't get to the resource.  Since we aren't licensed for the
whole campus, we can't run it through the webcache at all.
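The building-only case is easy to see with a small address-range check.
The networks and addresses below are documentation placeholders chosen
for the example, not our real campus ranges, and the function name is
made up.

```python
import ipaddress

# Placeholder ranges for illustration only.
CAMPUS = ipaddress.ip_network("192.0.2.0/24")
LICENSED_BUILDING = ipaddress.ip_network("192.0.2.64/26")

WEBCACHE_IP = "192.0.2.10"      # inside the campus range, outside the building
WORKSTATION_IP = "192.0.2.70"   # a workstation inside the licensed building


def vendor_allows(source_ip, licensed_net):
    """True if the address the vendor sees falls in the licensed range."""
    return ipaddress.ip_address(source_ip) in licensed_net


if __name__ == "__main__":
    # Direct request from the building: allowed.
    print(vendor_allows(WORKSTATION_IP, LICENSED_BUILDING))
    # Same request through the webcache: the vendor sees the cache's address.
    print(vendor_allows(WEBCACHE_IP, LICENSED_BUILDING))
    # A campus-wide license is unaffected, since the cache is in range.
    print(vendor_allows(WEBCACHE_IP, CAMPUS))
```

This is the same reason campus-wide licenses have mostly kept working
for us: the webcache address already falls inside the campus range.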

Hope this helps.


Thomas Edelblute wrote:
> 
> It appears that our Internet Service Provider implemented a new server
> this week that allows for Web caching.  The unfortunate result is that
> it acts as a proxy server, placing its IP address in place of the IP
> address our library machines are sending out.  This results in
> authentication failures to sites that use IP authentication.  Newsbank
> is working on a fix for us where we can use a user name and password to
> download a cookie to our computers.  I will try this cookie
> authentication out tomorrow morning.
> 
> Has anybody else run across this with their ISPs?  Our ISP says that
> using a Web cache is something more and more ISPs are doing to increase
> the speed of Web access.  If they are right, what other problems might
> we run into with Web caching?
> 
> --
> Thomas Edelblute
> Anaheim Public Library

-- 
-----------------------------------------------------------------------------
Sue Dentinger			 dentin at macc.wisc.edu
UW Madison Libraries		 Library Technology Group
(608) 263-3250			 312F Memorial Library

