Most IT departments have some kind of disaster recovery plan that includes reacting to natural events such as earthquakes, tornadoes, fire, or floods. But one disaster that’s far more likely to occur is manmade: a flood of bogus DDoS packets that can cripple your network infrastructure. Given its likelihood, IT administrators should seriously consider DDoS attacks in their disaster recovery plan.

The frequency and ferocity of DDoS attacks have been rising for a decade. The dark corners of the web have made it much easier to launch DDoS attacks, and tools like booters can be rented by anyone with bitcoin or a credit card. Little to no sophisticated computer knowledge is required.

As an IT administrator, you should hope for the best but prepare for the worst with a detailed disaster recovery plan. You should carefully consider how you would react in the event of a DDoS attack, or you’ll be left vulnerable to downtime, financial losses to your business, emergency mitigation costs, or even extortion plots.

In broad terms, a DDoS disaster recovery plan should include detection, mitigation, ownership, and testing.

Detection

There are several ways to monitor both physical and virtual cloud environments for potential DDoS attacks.

NetFlow monitoring. For clients who control their own routers, NetFlow monitoring is an effective method for identifying traffic anomalies that might be a DDoS attack. Any publicly facing router should be monitored, whether it’s a peer or transit connection to an ISP. A large organization that operates its own 24x7 network monitoring team can monitor NetFlow from border routers and detect when a volumetric flood occurs. There are also companies (such as Neustar) that can remotely monitor NetFlow by exporting sampled NetFlow to a Security Operations Center (SOC). Client traffic is measured during non-attack times, thresholds for alerting are built over time, and alerts are analyzed by the SOC. High-severity alerts trigger customer interaction via phone and email. In some cases, the alerts can trigger a local or cloud-based mitigation.

NetFlow monitoring is not foolproof, however. Some low-volume attacks (such as Slowloris) can slip by NetFlow monitoring because they do not cause a spike in bandwidth utilization or packet rate. Still, the vast majority of DDoS attacks, such as UDP floods and SYN floods, should be detected by properly tuned NetFlow monitoring.
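To make the idea of thresholds built from non-attack traffic concrete, here is a minimal sketch in Python. It assumes packet-rate samples have already been extracted from NetFlow records. The function names, the sample values, and the three-standard-deviation rule are all illustrative assumptions, not a description of how any particular SOC tunes its alerts.

```python
from statistics import mean, stdev


def build_threshold(baseline_pps, sigmas=3.0):
    """Alert threshold = baseline mean plus `sigmas` standard deviations."""
    return mean(baseline_pps) + sigmas * stdev(baseline_pps)


def find_alerts(samples_pps, threshold):
    """Return (index, value) for every sample above the threshold."""
    return [(i, v) for i, v in enumerate(samples_pps) if v > threshold]


# Hypothetical packets-per-second readings taken during non-attack times.
baseline = [1000, 1100, 950, 1050, 1000, 980, 1020]
threshold = build_threshold(baseline)

# Hypothetical new readings; the fourth one is a volumetric spike.
current = [1010, 990, 1080, 25000, 1005]
alerts = find_alerts(current, threshold)
print(f"threshold: {threshold:.0f} pps, alerts: {alerts}")
```

A rule like this only catches attacks that move the packet rate. A slow, low-volume attack would stay under the threshold, which is why this kind of check is usually paired with application-level monitoring.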

Monitoring tools. For environments where NetFlow monitoring is not an option, such as cloud computing environments, the client can use a monitoring service such as Web Performance Management (WPM) from Neustar or CloudWatch from AWS. These tools typically watch performance, CPU utilization, and latency for signs of degradation. Monitoring should include any load balancer at the front end of the cloud instances. DDoS attacks can cause virtual load balancers to become overwhelmed with bogus SYN requests.

Mitigation

There are several ways to mitigate a DDoS attack. As with any other product or service, there is a price/performance consideration.

CDN with automated tools. There are several low-cost CDN-style services that offer DDoS protection. These typically rely on automated tools, which may not respond to a carefully crafted attack designed to bypass them. Sometimes the user experience is affected by CAPTCHA or other mitigation screens that delay load times. These services can be very inexpensive, but they sometimes ultimately fail to stop an attack, which can cost an organization revenue or downtime.

Appliance only. There are several manufacturers that produce DDoS mitigation appliances, including Arbor Networks. These appliances can be effective against certain types of attacks. Large-scale floods, however, can overwhelm circuit capacity and render the appliance ineffective. For this reason, any appliance, no matter how good or expensive, should be considered ‘partial’ DDoS protection.

On Demand Cloud is a service where network traffic is redirected to a mitigation cloud when an attack hits. Typically, the service is managed by a Security Operations Center (SOC) and used only in the case of a DDoS attack. The solution depends on attack detection and on swift action or automation to fail over to the cloud and avoid excessive downtime. If DNS records are used to redirect traffic to the cloud, TTL values should also be considered, because resolvers can keep serving the old address until their cached record expires.
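The effect of TTL on a DNS-based failover can be estimated with simple arithmetic. The sketch below assumes a simplified model: a resolver that cached the record just before the change keeps the old address for up to one full TTL, so the worst-case exposure is detection time plus the time to make the change plus the TTL. The function name and the timing figures are hypothetical.

```python
def worst_case_redirect_seconds(detect_s, change_s, ttl_s):
    """Upper bound on how long some clients may still reach the origin.

    Simplified model: time to detect the attack, plus time to make the
    DNS change, plus one full TTL for the last cached copy to expire.
    """
    return detect_s + change_s + ttl_s


# Hypothetical figures: 5 minutes to detect, 2 minutes to make the change.
for ttl in (3600, 300, 60):
    total = worst_case_redirect_seconds(detect_s=300, change_s=120, ttl_s=ttl)
    print(f"TTL {ttl:>4}s: up to {total / 60:.0f} minutes before all traffic is redirected")
```

Under these assumptions, the TTL dominates the total when it is set to an hour, which is why shorter TTLs are often chosen for records that may need to be redirected. Resolvers that ignore TTLs are outside this model.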

Always Routed Cloud is similar to On Demand Cloud, except that the redirection of traffic to the mitigation network is done on a constant basis. The advantage here is that the client is not required to redirect traffic for every new attack. However, the constant redirection can affect network latency, even during non-attack conditions. This solution is also more expensive than On Demand Cloud redirection.

Managed Appliance + Cloud (Hybrid) includes an on-network appliance that is remotely managed by a third party, such as Neustar. The appliance will stop any DDoS attack within the circuit capacity feeding the network. Large attacks are handled off site by the mitigation cloud, via DNS or BGP redirect. This is the ultimate protection, but is also the most expensive.

Ownership

Ownership is an important consideration for any disaster recovery plan. It entails determining who the primary and secondary responsible parties are for each task that may need to be performed during a DDoS attack.

Detection alerts. Who will receive detection alerts and what are they to do with those alerts? If you’ve outsourced detection and alerting, that entity may need to consult with you or send you alert emails. Appliances will also send out various alerts that require action or attention.

DNS redirection. In the case where a DNS record must be changed to redirect to a mitigation service, who will handle that responsibility at all hours? What change needs to take place for each resource? The mitigation provider may use dynamic IP address allocation or they may dedicate IP addresses to the client. Whatever the case, the responsible party needs to know the correct IP address to enter when making the change to start or unwind mitigation. For a large number of records that share a resource, such as a shared web server, scripting API calls to make the multiple DNS changes can make this more manageable. Neustar offers a REST API for our UltraDNS service.
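As a rough illustration of scripting many DNS changes at once, the Python sketch below builds a list of record changes to start mitigation and the reverse list to unwind it. It does not call any real DNS provider API. The `update_record` callback is a hypothetical stand-in for whatever API client the responsible party uses, and the hostnames and IP addresses are documentation-only placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordChange:
    name: str
    record_type: str
    old_value: str
    new_value: str


def plan_redirect(records, origin_ip, mitigation_ip):
    """Build one change for every A record that points at origin_ip."""
    return [
        RecordChange(name, "A", origin_ip, mitigation_ip)
        for name, ip in sorted(records.items())
        if ip == origin_ip
    ]


def plan_unwind(changes):
    """Reverse a set of changes to return traffic to the origin."""
    return [RecordChange(c.name, c.record_type, c.new_value, c.old_value) for c in changes]


def apply_changes(changes, update_record):
    """Apply each change through a caller-supplied (hypothetical) API wrapper."""
    for change in changes:
        update_record(change)
    return len(changes)


# Hypothetical zone contents: two names share one web server.
records = {
    "www.example.com": "192.0.2.10",
    "shop.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
}
start = plan_redirect(records, origin_ip="192.0.2.10", mitigation_ip="198.51.100.20")
unwind = plan_unwind(start)

# print stands in for a real API call so the plan can be reviewed first.
applied = apply_changes(start, update_record=print)
```

Keeping the plan separate from the code that applies it lets the owner review the exact addresses ahead of time, and the same plan can be reversed when the attack is over.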

BGP redirection. In the case where a mitigation provider takes over an entire /24 CIDR block, it may be necessary to alter BGP announcements. GRE tunneling may also be required. These tasks need to be spelled out, with router configurations or necessary changes defined and responsibilities assigned.  

Appliance tuning. If you’ve chosen to purchase an on-network appliance, who will manage that appliance continuously around the clock? The appliance will most likely be alerting on events, and some events will require fine-tuning or a custom mitigation. Countermeasures ideally would be tested to ensure that they don’t conflict with applications. Neustar can manage Arbor Networks products remotely from its 24x7 SOC.

Testing

Regardless of the protection method being deployed, it’s good practice to test it periodically.

Neustar On-Demand Cloud clients can perform periodic redirection tests to ensure that the cloud failover is configured properly. Just like a fire drill, periodic testing can not only illuminate gaps or issues in responding to a DDoS attack, but can also prepare the responsible owners to perform their required actions when an actual event occurs.
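Part of a redirection drill can be automated by checking that each protected hostname resolves to a mitigation address after the failover step. The Python sketch below is one possible way to do that, not a description of any vendor's test procedure. The function names, hostnames, and addresses are hypothetical, and the example run uses a fake resolver so that it works without network access.

```python
import socket


def resolve_ipv4(hostname):
    """Resolve a hostname to its set of IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def check_redirection(hostnames, mitigation_ips, resolve=resolve_ipv4):
    """Return {hostname: resolved addresses} for hosts NOT fully redirected."""
    allowed = set(mitigation_ips)
    failures = {}
    for host in hostnames:
        resolved = set(resolve(host))
        # A host fails if it resolves to nothing or to any non-mitigation address.
        if not resolved or not resolved <= allowed:
            failures[host] = sorted(resolved)
    return failures


# Fake resolver data for the example: one host was missed during the drill.
fake_dns = {
    "www.example.com": {"198.51.100.20"},
    "shop.example.com": {"192.0.2.10"},
}
failures = check_redirection(fake_dns, {"198.51.100.20"}, resolve=lambda h: fake_dns[h])
print(failures)
```

In a real drill, the default `resolve_ipv4` would be used instead of the fake resolver, and any entry in the result would point to a record that the change procedure missed or that is still cached.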

Prepare for the Storm

Failing to plan is planning to fail. To weather the storm when it does occur, it’s critical for you to define your disaster recovery strategy preemptively. That way, during an actual DDoS attack, you can enact your plan and minimize downtime, negative press or social media commentary, and financial loss. Ultimately, a well-defended network will discourage subsequent attacks and drive hackers towards softer targets.