In many organizations, networks are at the core of the business, enabling not only internal functions such as human resources, supply chain and finance, but also the services and transactions the business depends on for revenue. That makes network availability critical. Any interruption of access from the outside world turns off the revenue spigot, impacting profit and creating a poor user experience that can damage customer satisfaction and result in customer loss. The worse the outage, the worse the damage. That’s why speed is so important in detecting, diagnosing and responding to denial of service (DoS) and distributed denial of service (DDoS) attacks.
One of the chief challenges in responding to an attack is to distinguish friend from foe. Without a way to drill down into traffic details and examine host-level traffic behavior, it can be difficult to tell the difference. Traditional network analysis technologies based on pre-cloud architectures have been too limited in their compute and storage capacity to do more than predefined alerting and summary reporting. That’s just not enough information to get to the heart of what’s happening in a complex networking scenario.
Fortunately, new big data techniques allow us to dig deep into huge volumes of network traffic details so that it’s possible to understand what is really going on. With a properly implemented big data platform, you can pivot your views of data to gain insight rapidly, in operational time frames, so you can act to mitigate an attack or remediate a more innocent but still painful network issue. We’ll examine data that is readily available through common network traffic flow telemetry exports, such as those provided by routers and switches enabled for NetFlow, sFlow or IPFIX.
Starting at the Top
Let’s say we’re seeing symptoms of an attack in our infrastructure. We’ll use traffic flow summary data to quickly scan total traffic in bits per second just to see if anything stands out.
There’s no obvious traffic spike from this view, but then again the network we’re looking at is running at an average of 60 Gbps, so that doesn’t mean there aren’t worrying things going on at deeper levels of the network.
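To make that first look concrete, here is a minimal sketch of rolling flow records up into average bits per second per time bucket. The record layout (a timestamp plus a byte count) and the sample values are invented for illustration; real NetFlow, sFlow or IPFIX exports carry many more fields, and a production platform would do this at far larger scale.

```python
from collections import defaultdict

# Hypothetical, simplified flow records: (unix_time_seconds, bytes).
# These values are made up purely to exercise the function.
flows = [
    (0, 1_500_000),
    (20, 3_000_000),
    (65, 750_000),
    (130, 6_000_000),
]


def bits_per_second(records, bucket_seconds=60):
    """Sum bytes per time bucket and convert to average bits per second."""
    totals = defaultdict(int)
    for ts, nbytes in records:
        bucket_start = ts - (ts % bucket_seconds)
        totals[bucket_start] += nbytes
    return {
        start: total * 8 / bucket_seconds
        for start, total in sorted(totals.items())
    }


series = bits_per_second(flows)
for start, bps in series.items():
    print(f"t={start:>4}s  {bps:,.0f} bps")
```

A view like this is cheap to compute, which is also why it can hide a modest attack inside a large baseline.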
Analyzing Source Geography
One of the things that big data is good at is fusing many data sources together. By combining NetFlow data with GeoIP, we can look at traffic by source geography. In this case, the network doesn’t get a lot of traffic from China, so what happens when we filter total traffic by China as source?
Aha! That analytical pivot produces the graph above, showing two obvious spikes that are well above average. Below, we zoom in on the time of the spikes.
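The GeoIP pivot can be sketched roughly as follows. The prefix-to-country table here is a stand-in for a real GeoIP database: the prefixes are reserved documentation ranges, the country codes assigned to them are assumptions, and the function names are hypothetical.

```python
import ipaddress
from collections import defaultdict

# Hypothetical prefix-to-country table standing in for a real GeoIP database.
# The prefixes are documentation ranges; the country codes are made up.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "CN"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
]


def country_of(ip):
    """Return the country code for an IP, or '??' if no prefix matches."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEO_TABLE:
        if addr in network:
            return country
    return "??"


# Hypothetical flow records: (unix_time_seconds, src_ip, bytes).
flows = [
    (0, "192.0.2.10", 1000),
    (10, "198.51.100.7", 5000),
    (70, "192.0.2.11", 2000),
    (75, "192.0.2.12", 3000),
]


def bytes_by_bucket_for_country(records, country, bucket_seconds=60):
    """Total bytes per time bucket, keeping only flows sourced from `country`."""
    totals = defaultdict(int)
    for ts, src_ip, nbytes in records:
        if country_of(src_ip) == country:
            totals[ts - ts % bucket_seconds] += nbytes
    return dict(sorted(totals.items()))


cn_series = bytes_by_bucket_for_country(flows, "CN")
print(cn_series)
```

A linear scan over prefixes is fine for a toy table; a real lookup would use a structure built for longest-prefix matching.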
Analyzing Unique Source IPs
The spikes themselves are suspicious, but is this just a large data file transfer? We can find out by looking at how many different source IPs are sending traffic. To look at host-level details, note that we’ve gone beyond the point where you can use summary information. At this point, we are analyzing raw NetFlow record details.
Those raw NetFlow details sure are useful, because there is, in fact, a huge increase in the number of unique source IP addresses sending traffic to particular destination IPs. This tells us that we’re not looking at a large file transfer from a single machine, but a highly distributed set of senders. Botnet much?
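Counting distinct senders is exactly the kind of question that summary data can't answer, because it needs every raw record. A minimal sketch, assuming a simplified record of timestamp, source IP and destination IP (the addresses are documentation ranges, not real hosts):

```python
from collections import defaultdict

# Hypothetical flow records: (unix_time_seconds, src_ip, dst_ip).
flows = [
    (0, "192.0.2.1", "203.0.113.5"),
    (5, "192.0.2.1", "203.0.113.5"),
    (61, "192.0.2.1", "203.0.113.5"),
    (62, "192.0.2.2", "203.0.113.5"),
    (63, "192.0.2.3", "203.0.113.5"),
    (64, "198.51.100.9", "203.0.113.8"),
]


def unique_sources(records, bucket_seconds=60):
    """Count distinct source IPs per (time bucket, destination IP)."""
    seen = defaultdict(set)
    for ts, src, dst in records:
        seen[(ts - ts % bucket_seconds, dst)].add(src)
    return {key: len(srcs) for key, srcs in sorted(seen.items())}


counts = unique_sources(flows)
for (bucket, dst), n in counts.items():
    print(f"t={bucket:>3}s  dst={dst}  unique sources={n}")
```

A single large file transfer would show one source per destination no matter how many bytes move; a jump in the distinct count is what points to a distributed sender population.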
Who’s Getting Hammered?
The next step is to determine which IP or IPs are getting all this (probably unwanted) traffic from 14,000 or so individual host IPs.
The ability to dig into high volumes of host-level NetFlow details again proves its utility. We can see that the main target is a solitary destination IP address. There’s really only one likely explanation for traffic from thousands of hosts in a country which you have no business dealings with to a single IP that suddenly spikes from nearly nothing to more than 1 Gbps: This is a DDoS attack. Note that this isn’t a mega attack, but it still can cause real problems for whatever is running on that individual host and anything else that depends on it. If it’s your DNS server, the attack might make it impossible for lots of other servers and applications to function.
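Finding the target amounts to ranking destination IPs by traffic volume. Here is a small sketch using made-up records of source IP, destination IP and bytes; the function name is hypothetical.

```python
from collections import Counter

# Hypothetical flow records: (src_ip, dst_ip, bytes).
flows = [
    ("192.0.2.1", "203.0.113.5", 900_000),
    ("192.0.2.2", "203.0.113.5", 800_000),
    ("192.0.2.3", "203.0.113.5", 700_000),
    ("198.51.100.9", "203.0.113.8", 40_000),
    ("198.51.100.9", "203.0.113.20", 10_000),
]


def top_destinations(records, n=3):
    """Return the n destination IPs receiving the most bytes."""
    totals = Counter()
    for _src, dst, nbytes in records:
        totals[dst] += nbytes
    return totals.most_common(n)


top = top_destinations(flows, n=2)
for dst, nbytes in top:
    print(f"{dst}: {nbytes:,} bytes")
```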
Going Deeper
Now that we know it’s a DDoS attack, we shouldn’t stop there, because we don’t yet know whether the attack is also coming from countries other than China. We pivot our analysis again to widen our lens.
Lo and behold, there is indeed DDoS traffic coming from multiple countries, including the United States, Japan, Russia, Sweden, China, Taiwan, Brazil and Estonia. Good to know.
Characterizing the Attack
Next, we want to know specifically what type of traffic we’re seeing and what that tells us. We’ll group the traffic coming from all those source host IPs by protocol.
Clearly, this is a UDP-based attack. Now we’ll look at the destination port(s).
We can see that the UDP traffic is being sent to multiple ports. The pattern points to a DNS reflection/amplification attack on port 53, with a lot of port 0 UDP packet fragments generated as collateral traffic. (Non-initial IP fragments carry no UDP header, so flow records report them with port 0.)
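Both pivots, by protocol and then by destination port, are simple group-by operations over the flow records. The sketch below assumes a record of IP protocol number, destination port and packet count; the sample numbers are invented and are not the values from this attack.

```python
from collections import Counter

# Standard IP protocol numbers for the protocols named here.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}

# Hypothetical flow records: (ip_protocol_number, dst_port, packets).
flows = [
    (17, 53, 400),
    (17, 0, 250),
    (17, 4444, 120),
    (6, 443, 30),
    (17, 53, 100),
]


def packets_by_protocol(records):
    """Total packets per protocol name (falls back to the raw number)."""
    totals = Counter()
    for proto, _port, packets in records:
        totals[PROTOCOL_NAMES.get(proto, str(proto))] += packets
    return totals


def packets_by_port(records, protocol):
    """Total packets per destination port for one protocol number."""
    totals = Counter()
    for proto, port, packets in records:
        if proto == protocol:
            totals[port] += packets
    return totals


by_proto = packets_by_protocol(flows)
udp_ports = packets_by_port(flows, 17)
print(by_proto.most_common())
print(udp_ports.most_common())
```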
Is Something Underneath This Attack?
So far we’ve gotten a lot of insight into the details of the DDoS attack from full NetFlow details. But is this volumetric DDoS the main event, or are we being distracted from looking for other, less obvious threats? We can see a lot of packets being sent to port 4444 (green line in graph).
Port 4444 is registered for krb524, the Kerberos 5-to-4 ticket translation service, and is, at least for Windows machines, a well-known target for buffer overflow attacks, often used to insert trojans such as Hlinic and Crackdown.
So, potentially there are two types of attacks going on in parallel: a DDoS attack and a buffer overflow trojan insertion. Many security blogs and publications note that DDoS attacks often are used to obfuscate other exploits. This very well may be an example of that technique.
Getting a Handle on Attack Mitigation
Characterizing the attacks leads us to mitigation. One way is to take the attack traffic and group it by /24 source network addresses:
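Rolling attack sources up to /24 networks can be sketched with Python's standard `ipaddress` module. The records (source IP and bytes) are illustrative, using documentation address ranges rather than real attackers.

```python
import ipaddress
from collections import Counter

# Hypothetical attack flow records: (src_ip, bytes).
flows = [
    ("192.0.2.10", 1000),
    ("192.0.2.200", 4000),
    ("198.51.100.3", 500),
    ("198.51.100.77", 250),
    ("203.0.113.9", 100),
]


def bytes_by_slash24(records):
    """Total bytes per /24 source network, largest first."""
    totals = Counter()
    for src_ip, nbytes in records:
        # strict=False masks off the host bits, e.g. 192.0.2.200 -> 192.0.2.0/24
        network = ipaddress.ip_network(f"{src_ip}/24", strict=False)
        totals[str(network)] += nbytes
    return totals.most_common()


networks = bytes_by_slash24(flows)
for network, nbytes in networks:
    print(f"{network}: {nbytes:,} bytes")
```

Grouping by prefix keeps a filter list far shorter than one entry per host, which matters when the sender count runs into the thousands.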
We now can take two mitigation steps:
- Ask our upstream ISP to drop this traffic or send it to a scrubbing service or device, if we have one.
- As an added precaution, drop all traffic from these countries going to port 4444 on our own routers.
Conclusion
One of the benefits of being able to dig into full-resolution NetFlow data is that you can get operationally useful insights without needing in-line devices. You also get more freedom to employ a portfolio of mitigation methods. Big data is the way forward if you want to have full details at your disposal to deal with DDoS attacks.
About the Author / Alex Henthorn-Iwane
Alex Henthorn-Iwane is the Vice President of Marketing at Kentik. He has more than 20 years of experience bringing new technologies in networking, security and software to global markets. Henthorn-Iwane leads the global marketing strategy for Kentik and helps the company bring its innovative story and solutions to network-dependent organizations around the world. Connect with him on LinkedIn, Twitter or SlideShare.