[WEB4LIB] Firewalls and Web access

Mon Aug 12 11:22:52 EDT 2002

On Sun, Aug 11, 2002 at 03:23:43PM -0700, Bob Duncan wrote:
> Our campus network folks just installed a new firewall, and now we have 
> lost access to all our subscription resources which rely on authentication 
> via IP-address recognition.  Apparently this is because the firewall is 
> stopping most incoming traffic and its IP address is not within the range 
> of addresses we supply vendors.

Quite probably, yes.   Here's a VERY brief overview of firewall terminology:
in the interest of brevity I'm going to play somewhat fast and loose with
these definitions.

	NAT: Network Address Translation.  NAT allows you to use a private
	IP space, such as 192.168.0.*, on the inside network while using
	your public address space on the outside.  (You can't use a private
	IP space on the outside: that's against the RFCs and unless somebody
	has badly misconfigured their routers, you can't get traffic to
	or from it.  You need to use your public IP space on the outside,
	because Internet-wide DNS thinks that's where you are.)

	Not all firewalls NAT: not all NAT devices are firewalls.
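
A quick way to tell which side an address belongs on, as a rough Python
sketch.  The three ranges are the RFC 1918 private blocks; the function
name and the sample addresses are just made up for illustration.

```python
import ipaddress

# The RFC 1918 private address blocks: usable inside, never routed outside.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(addr: str) -> bool:
    """Return True if addr falls inside one of the private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in RFC1918_BLOCKS)

for addr in ("192.168.0.10", "10.1.2.3", "123.45.6.7"):
    print(addr, "inside-only" if is_rfc1918(addr) else "public")
```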

	Proxy: A software agent which (usually) lives on a firewall
	and looks just like The Real Thing to the machines inside your
	network...but isn't.  It's merely a forwarder -- with, perhaps,
	some smarts built in to decide what to forward and what not to.
	For example, squid is a very good open-source HTTP proxy.  To
	use it on your firewall, you'd install it and then configure the
	browsers of machines on the inside to send all HTTP requests to
	the proxy, *not* to the real IP addresses of the real servers.
	Squid accepts HTTP requests from (inside) browsers and emits HTTP
	requests to the real servers out there on the Internet, listens
	for the results, and then sends them back to the requesting browsers.

	Proxies are useful because you can build knowledge into them that's
	specific to the service that they proxy.  (HTTP proxies are how
	a lot of censorware products work.)  Proxies are sometimes a pain
	either because they don't work transparently or because they're
	of sufficient complexity that they pose a security risk.

	A proxy that caches what it gets (say, in response to an HTTP request)
	is a caching proxy.

	A proxy that works the other way, i.e. is designed to receive
	requests from the outside and service them based on resources on
	the inside network, is called a reverse proxy.

	Some firewalls include proxies; some don't.  You can use a proxy
	without a firewall.

	Connection-based firewall: In its simplest form, a firewall
	that implements a "switchboard" that uses a ruleset to determine
	which IP addresses can connect to which others and on what ports.

	Best illustrated by example, perhaps.  Let's suppose you decide
	to use NAT and you have two machines, 192.168.0.10 and 192.168.0.20
	on your internal class C network.  Let's suppose that both machines
	need to surf the web, and that you want to allow outside sites to
	FTP to the first one and mail to be delivered to the second.  You
	might write a ruleset like this:

	Source 		Port	Destination
	------		----	-----------
	192.168.0.10	80	*		# outbound HTTP
	192.168.0.20	80	*		# outbound HTTP
	*		21	192.168.0.10	# inbound FTP
	*		25	192.168.0.20	# inbound SMTP

	which, BTW, will fail miserably the first time you try to access
	a web site whose server isn't on port 80, but this is an example,
	not real-world code.
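
To make the "switchboard" idea concrete, here is a toy Python version of
that ruleset, with the two inside machines as the inbound destinations.
It is a sketch, not how any real firewall stores its rules: the tuple
layout, the function names and the outside address 203.0.113.5 are all
assumptions made for the example.

```python
# Each rule is (source, port, destination); "*" matches any address.
RULES = [
    ("192.168.0.10", 80, "*"),    # outbound HTTP
    ("192.168.0.20", 80, "*"),    # outbound HTTP
    ("*", 21, "192.168.0.10"),    # inbound FTP
    ("*", 25, "192.168.0.20"),    # inbound SMTP
]

def field_matches(pattern: str, value: str) -> bool:
    """A rule field matches if it is the wildcard or an exact match."""
    return pattern == "*" or pattern == value

def connection_allowed(src: str, port: int, dst: str) -> bool:
    """Decide once, at connection setup, whether any rule permits it."""
    return any(
        field_matches(rule_src, src)
        and rule_port == port
        and field_matches(rule_dst, dst)
        for rule_src, rule_port, rule_dst in RULES
    )

# A web server on port 80 is reachable; one on port 8080 is not.
print(connection_allowed("192.168.0.10", 80, "203.0.113.5"))
print(connection_allowed("192.168.0.10", 8080, "203.0.113.5"))
```

The second call is the "fail miserably" case: nothing in the ruleset
mentions port 8080, so the connection is refused.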

	Connection-based firewalls have been around for a long time.
	They're relatively simple to set up, and take little in the way
	of computational resources (since the firewall only has to do
	major decision making once, when the connection's established).
	The downside is that they are vulnerable to all sorts of attacks
	based on manipulating already-established connections or falsifying
	packet data.

	Packet Inspection: To check each and every inbound and outbound
	packet to make sure it conforms with a ruleset.  For example,
	one easy check is that you should never receive a packet from the
	outside with your own IP address as the source: you should only
	emit such packets.  Requires sufficient computing resources to
	perform the inspection as well as clueful configuration.  This
	requires quite a bit more horsepower than just connection-based
	checking, since it's not enough just to shuffle packets back and
	forth on established connections: the firewall has to take them
	all apart and look at them before deciding what to do with them.
	However, this provides
	strong resistance to many attacks based on careful forgery
	of individual (or sequences of) packets.
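
The "easy check" on source addresses might look something like this in
Python.  This is only a sketch of the one test, and it assumes your
public block is 123.45.6.0/24, which is a made-up value; the function
name is invented too.

```python
import ipaddress

# Assumption: the site's public address block (hypothetical value).
OUR_PUBLIC_NET = ipaddress.ip_network("123.45.6.0/24")

def forged_inbound_source(src: str) -> bool:
    """True if a packet arriving from the OUTSIDE claims one of our own
    addresses as its source, which should never legitimately happen."""
    return ipaddress.ip_address(src) in OUR_PUBLIC_NET

print(forged_inbound_source("123.45.6.7"))   # True: drop it
print(forged_inbound_source("9.8.7.6"))      # False: keep checking
```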

	Stateful vs. Stateless: Some firewalls keep no "state" information,
	i.e. "state" in the information theory sense.  Stateful ones, uh,
	do.  An advantage of stateless firewalls is that you can reset
	them without breaking any active connections.  A disadvantage is
	that since they have no memory of what's gone before they can't
	use that information to make decisions about what to permit/deny.
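
A toy illustration of the tradeoff, in Python.  The class and method
names are invented for this sketch and the "state" here is nothing more
than a set of connections the inside has opened; real firewalls track a
good deal more.

```python
class StatefulFilter:
    """Remembers outbound connections and admits only matching replies."""

    def __init__(self):
        # The firewall's memory: (inside host, outside host, port) triples.
        self.established = set()

    def outbound(self, inside: str, outside: str, port: int) -> None:
        """Record that an inside host opened a connection."""
        self.established.add((inside, outside, port))

    def inbound_allowed(self, outside: str, inside: str, port: int) -> bool:
        """Allow inbound traffic only if it answers a recorded connection."""
        return (inside, outside, port) in self.established

    def reset(self) -> None:
        """Resetting forgets everything, so active connections break."""
        self.established.clear()


fw = StatefulFilter()
fw.outbound("192.168.0.10", "203.0.113.5", 80)
print(fw.inbound_allowed("203.0.113.5", "192.168.0.10", 80))  # reply passes
fw.reset()
print(fw.inbound_allowed("203.0.113.5", "192.168.0.10", 80))  # now refused
```

A stateless filter has no equivalent of the established set, so a reset
costs it nothing, but it also can't tell a reply from an unsolicited
packet.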

> Our network folks are new at firewalls, and I am only familiar with the 
> general concepts.  What are the options for restoring access to all of our 
> Web-based resources?  Supplying the firewall address to vendors seems like 
> less work than allowing inbound access for each vendor machine (which is 
> also a bit of a moving target), but neither seems terribly palatable.  Is 
> there a way that IP-address recognition can work without compromising 
> campus security?  (And is this a typical configuration for a college campus?)

My guess is that you have a NAT firewall with stateful packet inspection --
because installing one would most likely break your apps in pretty much the
way that you describe.

A way out of this is to use NAT to map the inside addresses of the machines
that you run the apps on to outside addresses that the vendor already has,
or that you can give them.  So for instance:

	Inside		Outside		Application package
	------		-------		-------------------
	192.168.0.5	123.45.6.7	foobar on port 1234
	192.168.0.15	123.45.6.8	blah on port 5678

and make sure that the vendor for foobar knows that authentication requests
will be coming from 123.45.6.7 and directed to port 1234 on their machine,
and that the vendor for blah knows that they'll be contacted on port 5678
by 123.45.6.8.

You can now shuffle machines around inside without telling the vendor
PROVIDED that you update the NAT table to match.
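
In Python terms the NAT table is just a lookup, and moving a machine is
a matter of re-keying one entry.  This is a sketch using the addresses
from the table above; the function names and the new inside address
192.168.0.50 are made up, and a real NAT device is configured in its own
syntax, not like this.

```python
# Inside address -> outside address the vendor has on file.
NAT_TABLE = {
    "192.168.0.5": "123.45.6.7",    # foobar on port 1234
    "192.168.0.15": "123.45.6.8",   # blah on port 5678
}

def to_outside(inside: str) -> str:
    """Address the vendor sees for traffic from this inside machine."""
    return NAT_TABLE[inside]

def to_inside(outside: str) -> str:
    """Inside machine that traffic to this outside address is sent to."""
    for inside, mapped in NAT_TABLE.items():
        if mapped == outside:
            return inside
    raise KeyError(outside)

def move_host(old_inside: str, new_inside: str) -> None:
    """Re-home a machine internally; the outside address is unchanged."""
    NAT_TABLE[new_inside] = NAT_TABLE.pop(old_inside)

print(to_outside("192.168.0.5"))
```

After something like move_host("192.168.0.5", "192.168.0.50"), the
vendor still sees 123.45.6.7 and never needs to know.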

You can also turn this around, if authentication works the other way.

The idea is to create two world views: one for consumption by the outside,
and one for your internal-only use.  You then use NAT to insulate the
effect of changes to the internal world from the outside world....and
you punch "pinholes" through the firewall sufficient to allow the
authentication mechanisms to work.

The goal is to minimize the number of such pinholes while making everything
work.  One potential "gotcha" is that if you, say, punch an inbound hole
on port 5656 for traffic originating from a vendor's machine at 9.8.7.6,
and you allow that hole to contact only your inside machine at 192.168.0.25,
then one day when the *vendor* moves networks, they are no longer coming
from 9.8.7.6, but 5.4.3.2, and voila, your pinhole is still there...but
no longer in the right place.  Some vendors are good about telling their
customers in advance about things like this; some aren't.
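
That gotcha, as a few lines of Python.  The addresses and port are the
ones from the paragraph above; the names are invented for the sketch.

```python
# The pinhole: (vendor source, port, inside destination).
PINHOLE = ("9.8.7.6", 5656, "192.168.0.25")

def passes_pinhole(src: str, port: int, dst: str) -> bool:
    """Inbound traffic gets through only if it matches the pinhole exactly."""
    return (src, port, dst) == PINHOLE

print(passes_pinhole("9.8.7.6", 5656, "192.168.0.25"))  # vendor's old address
print(passes_pinhole("5.4.3.2", 5656, "192.168.0.25"))  # after the vendor moves
```

Nothing on your side changed, but the second check fails until somebody
updates the rule.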

Similarly, some applications are better about this than others.  A couple
of years ago, I had to punch a 6000-port wide "pinhole" (more like a
gaping, massive chasm ;-) ) through a firewall because a very poorly-designed
application wouldn't work without it.

Welcome to the hazards of the modern Internet. ;-)

---Rsk