Understanding Modern Denial of Service
Copyright (c) 2001, Richard A. Steenbergen <ras@e-gerbil.net>
Last updated 07/19/2001
Original post at: http://www.e-gerbil.net/ras/projects/dos/dos.txt
Table of Contents
1. Modern Denial of Service
2. The Modern Attacks
3. High Rate Floods
4. Attacking the Infrastructure
5. LAN Attacks
6. Distributed Denial of Service
7. Dropping packets on the floor
8. How to filter
9. Spoofed packet tracing 101
10. Protecting router interfaces
11. Protecting everyone else
12. Provider cooperation
13. Practical techniques
14. Assume = Ass + U + Me
15. Further reading
Section 1 - Modern Denial of Service
While the motivations behind Denial of Service continue to be debated, most people acknowledge that it has evolved to its current form because of Internet Relay Chat. Modern Denial of Service didn't just happen; it required desire, effort, and innovation.
A serious modern Denial of Service attack looks nothing like traditional attacks, and most people have no idea how to handle one effectively.
Unfortunately, attacks that were once a game of high tech petty vandalism have spread to the general internet community.
Since so many attacks have been linked to the use of IRC and the human interactions which motivate attacks, a common attitude has been to simply blame IRC and dismiss the threat. Providers assumed that if you didn't have anyone on IRC you would never see an attack, and many went so far as
to blame the victim for inviting the attacks. If they asked their provider for assistance, the answer would come in the form of being unplugged or null-routed. But the providers had a very good reason not to offer better service. Who wants to advertise that they are "attack proof", and invite
someone to try and prove otherwise? Who wants to host customers that will bring attacks?
In February 2000, a Canadian kid who went by the nick "Mafia-boy" changed the way people think about Denial of Service. Instead of targeting an IRC user, he turned a relatively medium sized DDoS network against some big
names in "e-Commerce". It required only minimal technical expertise, yet the biggest providers could do nothing to stop it. He wasn't caught because the attack was traceable, but because he bragged about it trying to impress people. The most surprising thing about this attack is that no
major attacks have followed.
Section 2 - The Modern Attacks
The purpose of this document is to examine forms of DoS which are generally not well understood even by people who deal with it on a routine basis. If I've seen a document on it before, it probably won't be mentioned here. Classic attacks like ping floods and smurfs have been covered in depth a thousand times over, so if you would like more
information about them you should probably look elsewhere.
The modern attacks which destroy services generally fall into one or more of the following categories:
High rate floods - Any flood of packets which is not designed to waste bandwidth, but instead is designed to waste CPU and processing abilities, can be quite devastating. The evolution of the SYN flood has brought about the separate evolution of the high rate flood, which now has a life of its own.
Infrastructure Attacks - For well defended "victims", it may be easier for the attacker to go after the network rather than sending packets directly to the true target. In the process, this can take out a lot more than the intended victim, and obscure the true target. Things can be far worse if the network itself is the intended victim.
Distributed Denial of Service - This latest evolution in DoS has received much publicity, but some of the most important aspects have not yet been explored. DDoS isn't simply about multiplication of attack sources; it brings about issues of path diversity, obscurity, invisibility, and
demoralization of the victim.
Section 3 - High rate floods
In September 1996, the SYN flood was introduced to the world with the very public attack against PANIX, an ISP in New York City. SYN floods and other newer variants capture the most devastating attack properties in the most commonly available tools. These properties include high rate floods,
difficulty in distinguishing attack traffic from legitimate traffic, and random source packets and random destination replies which stress routing functionality.
The original goal of the SYN flood was to overwhelm a small queue of outstanding half-open connections with a very small amount of bandwidth, thus preventing new connections and denying service. This was really an attack against the logic of the TCP implementations of the day; it could
be used from a dialup to hold down a server of any bandwidth. Developers quickly set to work fixing their TCP stacks, using hash tables and automatically expiring dud connections from the listen queue as it fills up. In addition, the concept of the "SYN Cookie" was introduced,
essentially encoding all state information necessary for the connection to be opened in the return SYN|ACK, so that no state needs to be maintained on the victim machine. Today every modern OS of any significance is extremely resistant to the classic PANIX-style SYN flood.
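To illustrate the idea of carrying connection state in the SYN|ACK itself, here is a minimal Python sketch. It is NOT the bit layout used by any real TCP stack: the secret, the MSS table, the 30/2 bit split, and all function names are assumptions made up for this example. The point is only that the server can recompute the value when the final ACK arrives, so it never has to store anything for a half-open connection.

```python
import hashlib
import hmac
import struct

SECRET = b"example-secret"            # hypothetical; a real stack would use a random, rotating key
MSS_TABLE = [536, 1024, 1460, 8960]   # illustrative values only


def _mac(src, sport, dst, dport, client_isn, counter):
    """32-bit keyed hash over the connection 4-tuple, client ISN, and a time counter."""
    msg = f"{src}:{sport}>{dst}:{dport}".encode() + struct.pack("!IB", client_isn, counter)
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def make_cookie(src, sport, dst, dport, client_isn, counter, mss_index):
    """Server ISN for the SYN|ACK: top 30 bits of the MAC, low 2 bits of MSS index."""
    return (_mac(src, sport, dst, dport, client_isn, counter) & 0xFFFFFFFC) | mss_index


def check_cookie(src, sport, dst, dport, client_isn, counter, ack_number):
    """Return the encoded MSS if the final ACK carries a valid cookie, else None."""
    cookie = (ack_number - 1) & 0xFFFFFFFF      # the ACK acknowledges server ISN + 1
    expected = _mac(src, sport, dst, dport, client_isn, counter) & 0xFFFFFFFC
    if (cookie & 0xFFFFFFFC) != expected:
        return None
    return MSS_TABLE[cookie & 0x3]
```

A spoofed SYN costs the server one hash and one reply, and nothing is queued; only a client which actually receives the SYN|ACK can send back an ACK that passes check_cookie().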
Some time around mid 1998, SYN floods made a comeback. This time the goal wasn't simply to overflow a queue and prevent new connections, it was to generate packets so fast that the victim spent all their time processing them. Against PC hardware of the time, the victim machine could spend all its time processing interrupts generated by the NIC and not have a chance.
Even beefy servers proved to have less than optimal TCP/IP stacks, and it was easier to generate the packet than to try and process it. Victims also faced extreme difficulty filtering these packets, since any service they wished to run had to accept SYNs.
The high rate flood highlights the problems of existing TCP/IP implementations. The amount of overhead which goes into handling each frame, inspecting each header, and processing each packet is large. When the service tries to stay up with no attempted limit, the replies generated are just like being flooded twice. The next problem is tracing
the attacks, in circumstances where we are barely able to route the attack.
When doing packet/sec calculations, remember that link layer overhead starts to play a major factor. For example, ethernet overhead works out something like this:
Preamble and SFD (8 bytes)
+ Ethernet Header (14 bytes)
+ Payload (40 bytes in a SYN flood)
+ Frame Padding (6 bytes)
+ Frame Checksum (4 bytes)
+ Inter Frame Gap (12 bytes)
Which comes out to 44 bytes of overhead for every 40 bytes of data.
This has the benefit of slowing down the attacker, but it slows down the victim as well. The actual frames/sec rate which can be carried will be a bit less than half of what you would expect based on bandwidth alone.
For example, a 10Mbps ethernet pipe will max out at less than 5Mbps of IP when handling smallest-size packets. The rates for 64-byte frames (the minimum size guaranteed by padding), taking into account the other 20 bytes of per-frame overhead (preamble, SFD, and inter frame gap), are 14,880 frames/sec for 10Mbps, increasing by factors
of 10 and 100 for Fast Ethernet and Gigabit Ethernet respectively.
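The arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and function names are my own, and the link speeds are the nominal line rates.

```python
# Per-frame cost on the wire for a 40-byte TCP SYN over Ethernet,
# using the byte counts listed in the breakdown above.
PREAMBLE_SFD = 8
ETH_HEADER = 14
PAYLOAD = 40          # IP header (20) + TCP header (20), no options
PADDING = 6           # pads the frame up to the 64-byte minimum
FCS = 4
INTER_FRAME_GAP = 12

WIRE_BYTES = PREAMBLE_SFD + ETH_HEADER + PAYLOAD + PADDING + FCS + INTER_FRAME_GAP


def max_frames_per_sec(link_bps: int) -> int:
    """Upper bound on minimum-size frames per second for a given link speed."""
    return link_bps // (WIRE_BYTES * 8)


def ip_throughput_bps(link_bps: int) -> int:
    """IP-layer bits per second carried when the link is full of such frames."""
    return max_frames_per_sec(link_bps) * PAYLOAD * 8


if __name__ == "__main__":
    for name, bps in [("10Mbps", 10_000_000),
                      ("100Mbps", 100_000_000),
                      ("1Gbps", 1_000_000_000)]:
        print(name, max_frames_per_sec(bps), "frames/sec,",
              ip_throughput_bps(bps), "bps of IP")
```

Each SYN occupies 84 bytes of wire time, so a saturated 10Mbps link carries 14,880 of them per second and less than 5Mbps of actual IP.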
Section 4 - Attacking the infrastructure
The design of most routers involves a central processor which handles routing protocols and administrative functions. The traditional design of routers placed certain exceptional packets on the "slow path", which requires attention from that processor. Packets such as those with IP
options, or those destined to a local interface on the router, usually take the slow path. Some modern high-end routers, like the Cisco 12000 series, use distributed processors on each line card to handle the majority of routing without touching the main route processor used for
routing protocols.
Unfortunately, when the processor which handles administrative and routing functions handles any packets at all, and particularly when it lacks good scheduling functions, it becomes vulnerable to Denial of Service.
Individual line cards with distributed processors may be able to continue forwarding packets, but eventually the routing protocols being run from the central processor will time out, and the forwarding will no longer matter. Given a sufficiently strong attack, the router may not even be
responsive at the local console, as the CPU spends its time processing interrupts and packets.
Attacks which overwhelm the route processors can be particularly bad when BGP is disrupted. If a BGP-speaking router is held down long enough for its peers to time out the keep-alive and tear down the session, the routes
get withdrawn. If this removes the route used to carry the attack, the victim becomes unreachable and the attack is discarded further upstream (this is also potentially dangerous to the upstream routers trying to discard these now unroutable packets), and the route processor returns to
service. As soon as the peer is re-established, the attack begins again.
To paraphrase Vijay Gill, "Packeting leads to flapping, flapping leads to dampening... Dampening, leads to suffering".
Juniper routers fare much better against this kind of attack because of their clean separation between packet processing and the routing engine.
Even exceptional packets which cannot be handled by an ASIC have a dedicated processor which limits the destructive potential of this kind of attack.
Some networking products, particularly certain unnamed brands of "layer 3 switches", use a cache based lookup architecture in which the first packet in a flow must always hit a CPU, and future packets are handled by an ASIC.
Depending on the vendor and the quality of the processor, random source or destination floods can wreak havoc on these systems. Some are so bad that they can only process a few thousand new flows per second, which under a
random src/dst flood turns your hundred-gigabit L3 switch into something barely able to route a T1's worth of traffic.
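To see why a random source flood is so hard on a cache based design, here is a toy Python simulation. It does not model any vendor's hardware: the LRU policy, the cache size, and the traffic mix are all assumptions picked for illustration. A "miss" stands in for a packet that has to be punted to the CPU.

```python
import random
from collections import OrderedDict


class FlowCache:
    """Toy LRU flow cache keyed on (source, destination)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def forward(self, src: int, dst: int) -> None:
        key = (src, dst)
        if key in self.entries:
            self.entries.move_to_end(key)      # refresh the entry, fast path
            self.hits += 1
            return
        self.misses += 1                        # first packet of a flow: slow path
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict the least recently used flow


def miss_rate(packets, capacity=4096):
    """Fraction of packets which had to take the slow path."""
    cache = FlowCache(capacity)
    for src, dst in packets:
        cache.forward(src, dst)
    return cache.misses / (cache.hits + cache.misses)


rng = random.Random(1)
# 100,000 packets spread over 500 long-lived flows to one destination
normal = [(rng.randrange(500), 1) for _ in range(100_000)]
# 100,000 packets to the same destination, each with a random 32-bit source
flood = [(rng.getrandbits(32), 1) for _ in range(100_000)]

print("normal traffic miss rate:", miss_rate(normal))
print("random source flood miss rate:", miss_rate(flood))
```

With ordinary traffic only the first packet of each flow misses, so the CPU sees a tiny fraction of the load. Under the flood nearly every packet is a "new flow", and the box is only as fast as its slow path.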
The most common way to attack a router is to send packets destined to one of its local interfaces. These are considered control communications, and must be processed by the "central unit" which is maintaining the stateful
routing protocols. A Cisco GRP can be crippled by as few as 20,000 packets per second. Ironically, while some thought has been given to SYN floods directed to open ports (such as telnet), floods to random closed ports can be more damaging.
Another way to generate exceptional packets is the use of IP options. Most times this is more of a local network attack since the routers which deliver the attack must struggle with the attack as well.
Route caches can be heavily stressed by both random source AND random destination floods. One of the most damaging parts of an attack like a SYN flood is the RST or ACK replies generated in response. In addition to the damage caused on the victim generating the replies, and the damage caused by false allegations resulting from the replies to random source addresses, routing the replies can cause heavy cache-thrashing on some routers. Methods like Cisco's CEF, which creates a pre-generated Forwarding Information Base, can provide significant improvement when under this kind of attack.
High rates of packets/sec can also be generated in a twist in the usual use of a smurf attack, by using small packets directed to broadcasts in an attempt to generate router-harming effects instead of large packets designed to use large amounts of bandwidth.
The attack of choice is still the SYN flood though, since nearly every tool used for this combines almost all of the above router-crippling properties into a single attack. Even when the attackers are not thinking about the detailed effects of their programs, they are highly motivated by
what works and what doesn't.
Section 5 - LAN attacks
A variant of the "smurf" broadcast flood not commonly considered is the link layer broadcast flood. In an attack like smurf, packets are directed at an IP broadcast address, and the router converts the packet into a link
layer broadcast. Disabling directed broadcasts prevents this attack from being used remotely, but if the attacker is in a shared broadcast domain with other devices they can generate a "LAN Smurf" by doing a link layer broadcast directly. It is further possible for the attacker to generate a raw frame and forge the source MAC address to make the attack more difficult to trace. Making things worse, most switches which have the ability to limit broadcast storms cannot distinguish between broadcast and
multicast traffic, preventing the use of broadcast limits in a multicast environment.
One of the behaviors of attacks across layer 2 switches which is not commonly considered is the expiring ARP cache. If an attack completely disables the target (which is not uncommon), and it can no longer reply to ARP, then when the ARP cache expires the switch will broadcast the attack
traffic across all ports. A quick fix for this is a static MAC address for common DoS targets, but a better fix is to use VLANs and 802.1q trunking to limit the broadcast domain and extend layer 3 routing properties onto the switch.
Still another potential area for LAN DoS is a spoofed ICMP Redirect or ARP which tricks traffic into taking a "detour". This can not only be used to create a Denial of Service, it can be used to redirect traffic for sniffing
and hijacking.
more to come, HSRP etc
Section 6 - Distributed Denial of Service
You suddenly find you're being attacked by an unspoofed UDP flood. Easy to filter, and have the source shut down, right? What if you're suddenly attacked by 70 machines simultaneously? What do you do in that situation?
What do you go after? Many victims are so overwhelmed by this simple fact that they do nothing.
An unfortunate disadvantage to having a good network is that you will actually receive the full brunt of an attack. Many attacks, no matter how strong they are at the source, end up being self-regulated by a congested link or overloaded router somewhere along the way. This is one of the key reasons that DDoS is so deadly: it adds route diversity in addition to sheer numbers.
The original motivation behind DDoS, however, was not to use the entire capacity of multiple machines. DDoS was initially designed as a method to make the existing attacks undetectable, by having many thousands of hosts generating a relatively small amount of traffic. The source network
suffers almost as much as the victim network, and because the attacking machine is almost always compromised, the source network has motivation to quickly shut down the compromised machines. If thousands of machines generate a very small attack, it can go unnoticed and unfixed.
One of the few saving graces of DDoS is that packet kids do not have very good coding skills. The "crypto" used in DDoS networks is badly ripped blowfish code from Eggdrop (an IRC bot), and many implementations can be reverse engineered to locate the "control node". If the control node is shut down, the drone nodes can be lost, destroying the entire DDoS network. Enough other people have covered these design flaws in depth, so we'll refrain from discussing it further.
Section 7 - Dropping packets on the floor
Cisco routers have a feature known as TCP Intercept which is part of the firewall feature-set. It provides assistance against SYN floods to hosts by maintaining a table of half-open connections on the router, and sending RST's to release resources on the host when it detects an excessive number of SYNs. This works great if you're protecting a circa 1996 host from being flooded by a low-bandwidth PANIX style attack, but not much else. In high rate attacks, it makes the CPU situation on the router worse. Most modern hosts have significantly faster (and cheaper) processors anyways.
When faced with attacks designed to overwhelm a route processor, there may be scheduler options available. For example with Cisco IOS, there is a command "scheduler interval" or "scheduler allocate" which can guarantee a
certain amount of packet-interrupt free execution time for routing protocols and administrative functions.
For 7200, 7500, and 12000 series routers, Cisco introduced a compiled access-lists feature called Turbo ACL in the 12.0(6)S train. This can improve CPU performance when using access-lists, but Cisco remains the bastard child of packet filtering. Some of their high performance cards like GigE, 3-port GigE, and the engine 2 OC48 card, flat out make no
attempt to support many key packet filtering features. Juniper is still the winner hands down here.
Section 8 - How to filter
In order to filter an attack, you must be able to separate the good packets from the bad packets. The job of the good attacker is to make certain you can't do this, but fortunately for us all, most attackers aren't that good.
Filtering smurfs is easy: you don't really NEED ICMP Echo Reply functionality to run a network, and a rate limit so that they continue to work under non-attack conditions makes things that much better. Filtering SYN floods and ACK floods on the other hand is a bit more difficult, since
you need to allow those types of packets in order to service new connections or allow your old ones to stay up. At first glance, it may seem like the ACK flood can't be filtered at all without disrupting legitimate service. But if you look closely at the packet tools being used, you'll notice patterns which can be used to separate the DoS
traffic.
Let's look at a quick example, SYN floods. Many SYN flooders have been written over the years, many have been published and used, but if you take a look at every packet kiddie SYN flood out there it is missing one thing which essentially every legitimate TCP/IP stack includes, the TCP MSS option. If you can rate limit SYNs with this criterion, you've effectively dropped all SYN floods without affecting legitimate traffic.
Another example, Stream, was released as an ACK flooder. If you look at the code closely however, you'll notice it was originally written as a SYN flooder. The TCP Sequence number is randomized with every packet, while the TCP Ack number remains fixed at 0. Obviously this was supposed to be a
SYN packet and was never converted properly. The odds of a TCP ACK packet with an ack field of 0 occurring are less than 1 in 2^32, so if you rate limit based on this criterion you've just stopped all out-of-the-box Stream attacks without affecting legitimate traffic.
Effective filtering REQUIRES you have a pattern to filter good packets from the bad. Unfortunately most routers can't match against these types of fields, so most of these techniques end up being used to protect high-DoS systems.
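The two signatures just described (a SYN with no MSS option, and an ACK with an ack field of 0) are simple enough to express in code. The following Python sketch works on a raw TCP header as bytes; the function names and the build_tcp_header() helper are made up for this example, and it only classifies packets. Deciding whether to drop or rate limit the matches is left to whatever is doing the filtering.

```python
import struct

TCP_SYN, TCP_RST, TCP_ACK = 0x02, 0x04, 0x10
OPT_END, OPT_NOP, OPT_MSS = 0, 1, 2


def tcp_options(header: bytes):
    """Yield the option kinds present in a raw TCP header."""
    data_offset = (header[12] >> 4) * 4        # TCP header length in bytes
    opts = header[20:data_offset]
    i = 0
    while i < len(opts):
        kind = opts[i]
        if kind == OPT_END:
            break
        if kind == OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(opts) or opts[i + 1] < 2:
            break                               # malformed option list, stop parsing
        yield kind
        i += opts[i + 1]


def looks_like_flood(header: bytes) -> bool:
    """True if the header matches one of the two tool signatures."""
    if len(header) < 20:
        return False
    _sport, _dport, _seq, ack = struct.unpack("!HHII", header[:12])
    flags = header[13]
    if (flags & TCP_SYN) and not (flags & TCP_ACK):
        return OPT_MSS not in tcp_options(header)    # SYN without an MSS option
    if (flags & TCP_ACK) and not (flags & (TCP_SYN | TCP_RST)):
        return ack == 0                              # pure ACK with a zero ack field
    return False


def build_tcp_header(seq: int, ack: int, flags: int, options: bytes = b"") -> bytes:
    """Hypothetical helper which builds a test header (ports and window are arbitrary)."""
    options += b"\x00" * (-len(options) % 4)         # pad options to a 32-bit boundary
    offset = (20 + len(options)) // 4
    return struct.pack("!HHIIBBHHH", 1025, 80, seq, ack,
                       offset << 4, flags, 65535, 0, 0) + options
```

For example, looks_like_flood(build_tcp_header(1, 0, TCP_SYN)) matches, while the same SYN carrying the option bytes 02 04 05 b4 (MSS 1460) does not.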
Rate limits can be effective tools for stopping a flood while still allowing for the normal functionality at reasonable levels. They can also be used to filter things at an extremely low level which otherwise could not be done with a simple "yes or no" filter. But applied incorrectly,
rate limits can not only be ineffective, they can actually make matters worse. An example of this is putting a 1Mbps SYN flood rate limit on a 100Mbps ethernet port, to "protect your servers". What this actually does
however, is make you vulnerable to attack at 1Mbps. If you ever see someone suggest a distributed filter against SYN floods on their network borders, realize that it's just a matter of time before someone throws a 1Mbps SYN flood at them and blocks all TCP SYN traffic for their network.
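A little arithmetic shows why the blind rate limit backfires. The sketch below assumes the limiter cannot tell legitimate SYNs from attack SYNs and simply drops the excess, so each side loses packets in proportion to its share of the offered load. The function name and the traffic figures are invented for the example; 3,125 packets/sec is 1Mbps of 40-byte SYNs measured at the IP layer.

```python
def legit_pass_fraction(limit_pps: float, legit_pps: float, attack_pps: float) -> float:
    """Fraction of legitimate SYNs which survive a shared, signature-blind rate limit."""
    offered = legit_pps + attack_pps
    if offered <= limit_pps:
        return 1.0                  # under the limit, nothing is dropped
    return limit_pps / offered      # over the limit, everyone is dropped proportionally


LIMIT_PPS = 1_000_000 / (40 * 8)    # a 1Mbps limit expressed in 40-byte SYNs per second

if __name__ == "__main__":
    print("no attack:", legit_pass_fraction(LIMIT_PPS, 200, 0))
    print("30,000 pps flood:", legit_pass_fraction(LIMIT_PPS, 200, 30_000))
```

With these numbers a flood the port could easily carry leaves only about one legitimate SYN in ten getting through, which is why the limit has to be tied to a pattern that matches the attack and not the legitimate traffic.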
Section 9 - Spoofed packet tracing 101
Assuming you have determined there is a spoofed attack in progress, your next step is to figure out where it's really coming from. As long as routing remains constant, the attack should stay on a fairly constant path, and you can trace it back hop by hop.
Spoofed packet tracing is accomplished by picking a starting point (usually close to the victim), locating the true source interface of the attack, and then moving to the router on the other end. This continues until you find the source of the spoofed packets, or you reach a device
which is not under your administrative control. The point where this usually breaks down is when you try to get in contact with someone qualified to trace the packet stream.
With a Cisco GSR and a modern IOS, the easiest method for locating the source interface of a certain type of packet is to apply an access-list permitting packets with the "log-input" directive to the victim interface.
If you do not have a GSR, and the size of the attack is even slightly substantial, you'll probably make the router fall over. A "show log" will give you the output from the acl, in this kind of format:
SLOT 2:2d01h: %SEC-6-IPACCESSLOGP: list 169 permitted tcp 186.15.46.2(0) (POS1 *HDLC*) -> 216.200.92.162(0), 1 packet
SLOT 2:2d01h: %SEC-6-IPACCESSLOGP: list 169 permitted tcp 115.230.162.206(0) (POS1 *HDLC*) -> 216.200.92.162(0), 1 packet
SLOT 2:2d01h: %SEC-6-IPACCESSLOGP: list 169 permitted tcp 63.67.171.136(0) (POS1 *HDLC*) -> 216.200.92.162(0), 1 packet
This means the source interface is Slot 2 port 1, or POS2/1. One of the dead giveaways that this is an attack is the random sources from IP space which is reserved. Proceed to the other end of POS2/1 and continue the trace.
The other major case is tracing across a non point-to-point medium, such as an ethernet link going to a switch. In the case of ethernet, an acl log-input will show the source MAC address, which can be resolved to an IP with "show ip arp".
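If you have more than a screenful of log output, a short script can tally which input interface the matches arrived on. This Python sketch is written against the three sample lines shown above and nothing more; the regular expression, the function name, and the assumption that other media log in the same "(interface link-layer-info)" shape are mine, so check it against your own router's output before trusting it.

```python
import re
from collections import Counter

# Matches lines shaped like the "show log" sample above.
LOG_RE = re.compile(
    r"^(?:SLOT (?P<slot>\d+):)?.*%SEC-6-IPACCESSLOGP: list (?P<acl>\S+) "
    r"permitted (?P<proto>\S+) (?P<src>[\d.]+)\(\d+\) "
    r"\((?P<intf>\S+) (?P<l2>[^)]*)\) -> (?P<dst>[\d.]+)\(\d+\)"
)

SAMPLE = [
    "SLOT 2:2d01h: %SEC-6-IPACCESSLOGP: list 169 permitted tcp 186.15.46.2(0) (POS1 *HDLC*) -> 216.200.92.162(0), 1 packet",
    "SLOT 2:2d01h: %SEC-6-IPACCESSLOGP: list 169 permitted tcp 115.230.162.206(0) (POS1 *HDLC*) -> 216.200.92.162(0), 1 packet",
    "SLOT 2:2d01h: %SEC-6-IPACCESSLOGP: list 169 permitted tcp 63.67.171.136(0) (POS1 *HDLC*) -> 216.200.92.162(0), 1 packet",
]


def tally_inputs(lines):
    """Count log entries per (slot, input interface, link-layer info)."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match:
            counts[(match.group("slot"), match.group("intf"), match.group("l2"))] += 1
    return counts


if __name__ == "__main__":
    for (slot, intf, l2), hits in tally_inputs(SAMPLE).most_common():
        print(f"slot {slot} {intf} ({l2}): {hits} logged packets")
```

The interface with the overwhelming majority of hits is the one to follow to the next hop.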
Section 10 - Protecting router interfaces
One of the most obvious targets for Denial of Service against a router is the IP which shows up in a traceroute. In order to provide protection from DoS, and to provide additional security against unauthorized access, many providers have chosen to number their router interconnects out of RFC1918 reserved IP space.
Unfortunately this presents problems of its own. In addition to the property of not being globally routed, RFC1918 IPs are not globally unique. This means there can be a potential conflict between the addresses of two separate networks, when ICMP messages are generated by the routers with RFC1918 source addresses. While this is not a serious problem, many networks obsessed with "packet perfection" choose to filter RFC1918 sourced packets out of the mistaken belief that it is, or that they gain some security by doing so. In addition, using RFC1918 addresses reduces the ability to diagnose problems and requires extra work to get informational DNS names.
A potentially better way to get the same benefits is simply to use a common block of real IPs which are either not announced, null routed by your upstreams, or filtered at your network borders. This gives you the global uniqueness and functional DNS, without the hassles of RFC1918 space. There will still be a few providers who filter traffic with non-announced sources, but significantly fewer than those who filter RFC1918 space. Unfortunately this can require extensive planning or renumbering, which prevents many networks from doing it until they are already under
attack.
Another approach to hiding potentially vulnerable IPs is to make the "real" IP a secondary address, and use a false IP as the primary and thus the source of ICMP messages generated. This can be used to provide obscurity without being forced to change all references from an existing
numbering scheme. You should be aware of how this could potentially affect your IGP before blindly trying this.
Section 11 - Protecting everyone else
In a perfect world, you would never see any spoofed packets allowed on the internet. Unfortunately the world isn't perfect, but every little bit helps. When you stop spoofed packets from leaving your network, you not only help protect others from being abused, you help prevent your network from being abused as well. This is even more important for university networks and colocation providers than it is for dialup providers, yet a surprising number of networks simply do not care.
Be aware that spoofed packet filtering doesn't stop everything. In response to the rising numbers of networks which filtered bad source addresses leaving their network, packet kids changed techniques to include "one-off" spoofing. Particularly in DDoS nodes, the machine sends a test spoofed packet to another host and looks to see if it got through. If it did, it can spoof normally, if it did not then it falls back to "last octet only spoofing". Since most people implement their spoof filters at aggregation choke-points, these packets typically get through. The other
attacking nodes create "cover traffic" which obscures the
"semi-unspoofed" nodes, making it impossible to quickly identify the true source of the traffic. Even if the true source network is determined, the last octet being spoofed draws the blame to a nearby but innocent machine.
Trying to get a university network administrator to look for unusual traffic by MAC address is next to impossible, and once the attack stops there is usually no record of who was responsible.
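The reason "last octet only spoofing" slips through is easy to show. The Python sketch below models a spoof filter applied at an aggregation point as a single prefix check; the function names are hypothetical and the addresses are documentation prefixes, not real networks.

```python
import ipaddress


def passes_egress_filter(src: str, allowed_prefix: str) -> bool:
    """Aggregate anti-spoof filter: permit any source inside the customer prefix."""
    return ipaddress.ip_address(src) in ipaddress.ip_network(allowed_prefix)


def spoof_last_octet(real_src: str, last_octet: int) -> str:
    """What the "last octet only" fallback does to the source address."""
    octets = real_src.split(".")
    octets[3] = str(last_octet)
    return ".".join(octets)


if __name__ == "__main__":
    prefix = "192.0.2.0/24"
    real = "192.0.2.10"
    print("fully spoofed passes:", passes_egress_filter("203.0.113.7", prefix))
    print("last octet spoofed passes:",
          passes_egress_filter(spoof_last_octet(real, 77), prefix))
```

The fully random source is dropped, but the forged neighbor address is inside the permitted prefix, so only a per-port or per-host check closer to the machine would catch it.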
Stopping spoofing is by no means a guarantee of stopping DoS. It is still entirely possible for the attacker to conduct a campaign using only "throw-away" boxes with unspoofed source addresses. Another common technique is to mix floods which don't even require root access (like UDP
floods) from semi-legitimate hosts with other spoofed traffic to cover the trails.
The RFC on the subject is RFC2827 by Paul Ferguson, which obsoletes the previous version, RFC2267.
Section 12 - Provider cooperation
There is often nothing that smaller victims of an attack can do to protect themselves. Their only recourse may be to have their upstream provider filter the target being attacked.
Unfortunately one of the biggest problems from the customer perspective is the length of time it takes to get a filter in place. Calls to the NOC, escalation to someone knowledgeable, and taking action, all take time. If you want to give your customers a method to take action on their own, you can offer them the option of setting a locally well-known community tag which drops the attack traffic at your network borders. The simplest way to do this is with a route-map on your border peers, which changes the next-hop to a reserved IP when the community tag is matched. This reserved IP should then be routed to null0, or some other destination if so desired.
Section 13 - Practical techniques
This technique comes from Christopher Morrow <chris@uu.net> and UUNet security. The original posting is at http://www.secsup.org/Tracking/.
If you are looking for a quicker method than the usual step-by-step approach, and the attack uses random source IPs, there may be another technique. If you use the methods above to null route a victim IP at your network borders, and the attack packet which is dropped has a random
source IP, an ICMP Unreachable will be returned by the router to that random source. It is highly likely that some of these packets will be going out to unallocated IPs.
If you add a route for these unallocated IPs within your network (no-export), you can catch these ICMP Unreachables and instantly locate the ingress point into your network. A sizable block which is unallocated at the time of this writing is 96.0.0.0/11, but you should constantly be
checking to make sure the route you add stays reserved by IANA. You can decode the ICMP Unreachables with a unix machine or a router set to log packets, by hand or with the aid of a program.
Section 14 - Assume = Ass + U + Me
One of the more overlooked areas in Denial of Service is the social engineering DoS. People looking to stop attacks will be quick to jump to assumptions about who is the guilty party, and create quite a bit of trouble for innocent people.
For example, if you are attacked by a SYN flood, the "technically" correct behavior is to reply to every packet with an RST or a SYN|ACK. But if you are attacked by a random source spoofed SYN flood (as almost all of them
are), you're now sending packets out to random sources. If one of those random IPs happens to be a "security zealot", or government agency (in my experience NASA knows about every scan that crosses their network), you can quickly find yourself being blamed for "scanning" other networks.
Under these conditions, it is more than possible to be intentionally framed by someone sending SYN floods from a pool of known "bad" addresses.
Many providers will just as soon shut you down and ask questions later, if ever.
Section 15 - Further reading
NOTE: This section is a work in progress
Ferguson, P., Senie, D., "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", RFC 2827, May 2000.
Craig Huegen's smurf paper, http://www.quadrunner.com/~chuegen/smurf.txt
David Moore, Geoffrey M. Voelker, Stefan Savage, "Inferring Internet Denial of Service Activity".
http://www.caida.org/outreach/papers/backscatter/index.xml