TISC Insight, Volume 3, Issue 15

Welcome to Volume 3, Issue 15 of The Internet Security Conference Newsletter, Insight. Insight provides commentaries and educational columns, authored by some of the best minds in the security community.

TISC is about sharing clue. So is the newsletter. We promise to provide something useful each issue. If we don't, flame me.

Enjoy, and be safe,

Dave


Editor's Corner

If you aren't completely tired of the Code-Red worm, or if you are interested in reading a thorough account and analysis, visit WWW.CAIDA.ORG and read David Moore's The Spread of the Code-Red Worm (CRv2). The article includes animations of the infection, along with graphs of infection and deactivation rates. It's a solid read. Enjoy!

Today, Dave Buchanan discusses network processors, and the very favorable impact this technology is likely to have on the performance of security systems.


The Network Processor: Enabler of Wirespeed Gigabit Security

Dave Buchanan, ServGate Technologies

We will witness a watershed change in security appliances over the next year: the appearance of true wirespeed performance in firewalls and other security hardware. We have already witnessed a steady trend over the past few years to improve performance so that security enforcement does not become a "choke point" in the network. Soon, with security devices exhibiting wirespeed performance, security enforcement points can be placed where they make the most sense (for greatest security or ease of management, for instance), instead of where they will do the least harm to network throughput.

If the firewall must be placed in a throughput-sensitive point in the network, a "firewall sandwich" is often employed. This consists of multiple firewalls fed from a load balancing device on both input and output sides. Because the load balancer can direct traffic for a given session to the same firewall for every packet of that session, state is maintained within the relevant firewall, and there should not be issues introduced by this configuration. There is, however, considerable cost and complexity introduced, and this configuration has not been shown to deliver wirespeed performance under all conditions in a gigabit environment.
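The session affinity described above can be sketched as a hash over the session's endpoints. This is a minimal illustration, not any vendor's load-balancing algorithm: the function name `pick_firewall`, the use of SHA-256, and the example addresses are all assumptions. The one property it is meant to show is that both directions of a session must map to the same firewall, which is why the endpoints are sorted before hashing.

```python
import hashlib

def pick_firewall(src_ip, src_port, dst_ip, dst_port, proto, n_firewalls):
    """Map a session to a firewall index (hypothetical helper).

    Both directions of the same session yield the same index, so the
    session state lives in exactly one firewall of the sandwich.
    """
    # Sort the two endpoints so (A -> B) and (B -> A) build the same key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{proto}|{a[0]}:{a[1]}|{b[0]}:{b[1]}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_firewalls

# The load balancers on either side of the sandwich see opposite directions.
outbound = pick_firewall("10.0.0.5", 40000, "192.0.2.10", 80, "tcp", 4)
inbound = pick_firewall("192.0.2.10", 80, "10.0.0.5", 40000, "tcp", 4)
print(outbound == inbound)
```

Note that a plain modulo mapping like this reshuffles sessions whenever a firewall is added or removed, which is one source of the complexity mentioned above.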

Why is this important? Because access speeds to the public Internet have skyrocketed over the last several years. 100 Mbit (and even gigabit!) Ethernet connections are now offered to businesses at reasonable prices and over a wide range of locations by vendors such as Yipes, Cogent, and Telseon. Once the access bottleneck is cleared, security must keep pace with Internet access bandwidth, or risk being marginalized. Much as routing has been taken over by Layer 3 switches (essentially wirespeed routers) in the LAN, firewall and encryption functionality will evolve to wirespeed devices.

With Internet connections at T1 levels of bandwidth, software-based firewalls and VPN security gateways did not pose bandwidth bottlenecks. The firewall was placed at the most logical place to assure the best security - behind or in the access router. Higher-bandwidth connections have caused network architects to rethink this topology. In the case of high-bandwidth links that are shared by many servers (the typical colocation or web hosting facility), security has typically been pushed "into the customer cage", where the individual customer or their managed services provider takes responsibility for security - but only for that customer. The switching infrastructure that lies between the customer systems and the Internet access router remains unprotected. Traffic for some customer servers goes through firewall protection, but other servers in the facility might be left unprotected and serve as launching pads for attacks within the data center.

What is needed is a next-generation wirespeed gigabit firewall that will protect the entire data center - while offering individualized firewall control for each customer. What we will see over the next two years is a migration from the limited-performance devices historically available to streamlined appliances offering scalable and sustainable wirespeed stateful inspection and IPsec encryption - appliances that are as simple to install, as physically compact, and as non-blocking as today's Layer 3 gigabit switches. If the architecture is kept open and extensible, intrusion detection, attack protection, and other features can be added over time.

What particular metrics will characterize such a system? Today's firewall benchmarks consist of tests so rudimentary as to be almost useless in predicting actual user throughput. Most "throughput" numbers come from setting up the firewall under test with a packet generator and a single session simulated through a single firewall policy: "accept all." Then the packet size is dialed up large enough to yield the desired throughput number. Unfortunately, firewalls in real-world environments are subject to abuses such as bursts of small packets (especially as VoIP traffic and web caches become more commonplace), customers who insist on installing lots of policies, and more than one session at a time through the firewall - in fact, thousands of simultaneous sessions are the norm in data centers. Benchmark testing must become more sophisticated in order to reveal performance deficiencies that will not crop up in the historically tailored tests.
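The effect of packet size on a benchmark can be made concrete with some line-rate arithmetic. The sketch below assumes standard gigabit Ethernet framing, where every frame also occupies 20 bytes on the wire for the preamble and inter-frame gap; the function names are illustrative only. It shows how many packets per second a device must handle to sustain wirespeed, and how much time it has for each packet.

```python
LINE_RATE_BPS = 1_000_000_000  # gigabit Ethernet
WIRE_OVERHEAD_BYTES = 20       # 8-byte preamble/SFD + 12-byte inter-frame gap

def packets_per_second(frame_bytes, line_rate_bps=LINE_RATE_BPS):
    """Packets per second at line rate for a given frame size (FCS included)."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire

def per_packet_budget_ns(frame_bytes):
    """Time available to process one packet at line rate, in nanoseconds."""
    return 1e9 / packets_per_second(frame_bytes)

for size in (64, 512, 1518):
    print(size, round(packets_per_second(size)), round(per_packet_budget_ns(size)))
```

At 64-byte frames the device must process roughly 1.49 million packets per second, against roughly 81 thousand at 1518-byte frames. A benchmark run only with large packets therefore exercises the per-packet path about 18 times less often than a burst of minimum-size packets does.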

An increasingly important performance metric will be the average time needed to look up the applicable firewall policy for each new session. This search of policies presents unique difficulties that preclude the use of classic techniques such as hashing or binary search trees: the policies are in a certain order, and the first applicable policy encountered in the list is the policy that should be applied to the packet. The first packet in a session triggers this search of policies. If the session is accepted and passed through the firewall, an entry is typically created in a session table ("session creation"), and fast hashing techniques result in high throughput for that session's packets thereafter. However, if a new session is declined according to the firewall policy set, no session is created. As a result, the firewall policy set is searched for every new session - and for every declined packet. Every packet arriving at the firewall that is destined to be dropped could therefore result in the same amount of policy lookup computation as the first packet of a new session. The policy lookup function, or session creation time, can therefore become a bottleneck in cases of attacks or a large number of illegal session attempts. Future firewall benchmarks will need to include ways to measure this.
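The asymmetry between accepted and declined traffic can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of any real firewall: the `Policy` and `Firewall` classes, the field names, and the example addresses are hypothetical, and policies match only on source network, destination network, and destination port. The counter records how often the ordered policy list has to be walked.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass(frozen=True)
class Policy:
    src: str               # source network in CIDR form
    dst: str               # destination network in CIDR form
    dport: Optional[int]   # destination port, None means any port
    action: str            # "accept" or "drop"

class Firewall:
    def __init__(self, policies):
        self.policies = policies   # ordered list: first match wins
        self.sessions = {}         # session table, a hash lookup
        self.policy_searches = 0   # how many times the list was walked

    def _first_match(self, src, dst, dport):
        self.policy_searches += 1
        for p in self.policies:    # linear walk in policy order
            if (ip_address(src) in ip_network(p.src)
                    and ip_address(dst) in ip_network(p.dst)
                    and p.dport in (None, dport)):
                return p.action
        return "drop"              # assumed default when nothing matches

    def handle(self, src, sport, dst, dport):
        key = (src, sport, dst, dport)
        if key in self.sessions:   # fast path for established sessions
            return "accept"
        action = self._first_match(src, dst, dport)
        if action == "accept":
            self.sessions[key] = True   # session creation
        return action

fw = Firewall([
    Policy("0.0.0.0/0", "192.0.2.10/32", 80, "accept"),
    Policy("0.0.0.0/0", "0.0.0.0/0", None, "drop"),
])
for _ in range(3):   # accepted session: one search, then the fast path
    fw.handle("198.51.100.7", 40000, "192.0.2.10", 80)
for _ in range(3):   # declined packets: a full search every time
    fw.handle("198.51.100.7", 40001, "192.0.2.10", 23)
print(fw.policy_searches, len(fw.sessions))
```

Three packets of the accepted session cost one policy search, while three declined packets cost three. A benchmark that sends only accepted traffic never exercises this slow path.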

With a gigabit wirespeed firewall, we can construct an Internet Data Center (IDC) with the firewall directly behind the access router, so that it protects not only the servers, but also all of the switches and routers in the infrastructure behind the access router. Hackers have learned that it can be easier to flood a Layer 3 switch with ICMP packets than to attack the hosts - placing the firewall near the ingress/egress point to the public Internet prevents that. Besides offering improved protection for all parts of the IDC, this approach offers a single point of management: a single device and a single system to update with new policies, lowering the total cost of network ownership. Ideally, it would also contain the ability to create multiple "virtual firewalls" within the box, so that each host system could have its own firewall and policy set, and traffic would be automatically detected and the proper firewall policies applied.
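One way to picture the "virtual firewall" idea is a dispatch step that selects a per-customer policy set from the packet's destination address before any policy is evaluated. The sketch below is purely illustrative: the customer names, address ranges, and the reduction of a policy set to a list of allowed ports are all assumptions made to keep the example short.

```python
from ipaddress import ip_address, ip_network

# Hypothetical customers, each owning an address range and a policy set.
VIRTUAL_FIREWALLS = {
    "customer-a": {"network": ip_network("192.0.2.0/25"), "allowed_ports": {80, 443}},
    "customer-b": {"network": ip_network("192.0.2.128/25"), "allowed_ports": {25}},
}

def select_virtual_firewall(dst_ip):
    """Return the name of the virtual firewall that owns dst_ip, or None."""
    addr = ip_address(dst_ip)
    for name, vfw in VIRTUAL_FIREWALLS.items():
        if addr in vfw["network"]:
            return name
    return None

def decide(dst_ip, dst_port):
    """Apply the owning customer's policy set to an inbound packet."""
    name = select_virtual_firewall(dst_ip)
    if name is None:
        return "drop"  # no virtual firewall owns this address
    allowed = VIRTUAL_FIREWALLS[name]["allowed_ports"]
    return "accept" if dst_port in allowed else "drop"

print(decide("192.0.2.10", 80), decide("192.0.2.200", 80))
```

The same port can be accepted for one customer and dropped for another, while the facility still operates a single physical device.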

While there are a number of firewalls on the market with physical gigabit ports, none has so far demonstrated anything like gigabit wirespeed in production environments, largely because a technology breakthrough was needed. Vendors have been at work with ASICs for several years and have achieved only a fraction of the necessary speed -- and 10-gigabit Ethernet is looming on the horizon. 2001 has brought a new tool for implementing security products -- the network processor (NP).

Instead of casting dedicated state machines into silicon, network processors employ a series of programmable "engines" that can be cascaded via software to perform pipeline-style processing in a highly efficient manner. While network processors are available from more than a dozen different vendors (none, of course, compatible with the others), I will describe some particulars of the Intel IXP-1200, which contains six micro-engines and a StrongARM processor. With careful design, each engine can take over one task in the series of processing steps that each packet must pass through within a firewall. Large off-chip RAM can store session state information and firewall policies, while on-chip storage is used for instructions. Additional features, bug fixes, or programs to handle new real-time attacks can all be added via updates to the instruction store. Future versions of the IXP-1200 will be compatible but offer much more: double the clock speed, twice the number of micro-engines, and more storage for instructions as well.
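The idea of cascading programmable stages in software can be sketched with chained generators, where each stage stands in for one engine and hands its output to the next. This is only an analogy for the decomposition into stages: it does not model the IXP-1200 programming model, its microcode, or true parallel execution, and the stage names and packet format are invented for the example.

```python
def parse(raw_packets):
    """Stage 1: turn raw text records into packet dictionaries."""
    for raw in raw_packets:
        src, dst, dport = raw.split(",")
        yield {"src": src, "dst": dst, "dport": int(dport)}

def classify(packets, allowed_ports):
    """Stage 2: attach a verdict to each packet."""
    for pkt in packets:
        pkt["verdict"] = "accept" if pkt["dport"] in allowed_ports else "drop"
        yield pkt

def forward(packets):
    """Stage 3: pass along only the accepted packets."""
    for pkt in packets:
        if pkt["verdict"] == "accept":
            yield pkt

def build_pipeline(source, allowed_ports):
    # The cascade is defined in software, so a stage can be replaced or
    # a new one inserted without touching the others.
    return forward(classify(parse(source), allowed_ports))

raw_packets = ["198.51.100.7,192.0.2.10,80", "198.51.100.7,192.0.2.10,23"]
forwarded = list(build_pipeline(raw_packets, {80, 443}))
print(len(forwarded))
```

Adding a feature such as attack detection would mean inserting another stage into `build_pipeline`, which loosely mirrors how new functions reach a network processor through an update to the instruction store.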

Clock speeds and the number of micro-engines per chip both increase with each silicon generation, so a network processor-based design can take advantage of the silicon vendor's rapid improvements. Because hundreds of designs will employ the same silicon, differing only in surrounding circuitry and software, volumes of that standard silicon will skyrocket, lowering costs for everyone. Over time, standard software libraries for the network processor will emerge for nominal fees, lowering the design costs for new entrants and creating a snowball effect that pushes most new networking device designs to a network processor-based platform.

With this new and already widely available technology, it is now possible to construct a fully programmable architecture that is much faster than state-of-the-art ASIC designs. How could this be possible? Because today's security standards are still unsettled enough that ASIC designs must be left programmable, or "soft", to a degree that keeps the full speed of ASICs from being unleashed. In tasks like Layer 3 switching, by contrast, the standards (IP, Ethernet) have been solid enough for long enough to commit the entire pipeline to silicon. Using network processors, not only can a wirespeed gigabit firewall be realized, but the same level of wirespeed performance can be achieved at ten-gigabit speeds within a year. Since the network processor does not just "make the box go faster", but enables the firewall to be placed at any arbitrary location in a gigabit network without affecting network performance, it changes network design. Security is no longer a "choke point" to be designed around, but just another integrated service of the network.

This is yet another step in the last half century's march toward using increasingly sophisticated standard silicon across wider populations of users. "The Age of the Network Processor" is just a subset of "The Age of the Microprocessor."


© 2001 - 2006 Core Competence & Mactivity, Inc.