Incoming DDoS Attack - Helios (11/1/2014)

November 1, 2014 at 5:14 PM

Our admin Geeks have been dealing with a number of DDoS attacks targeting the Helios system that began last night. The DDoS attacks have caused significant downtime for users on the primary shared IP of Helios. We do have a solution, and most customers on Helios have received e-mails pertaining to this. Let's look at how we handle a DDoS attack.

DDoS Mitigation

These are the steps we take to mitigate incoming and frequent DDoS attacks:

Step 1: Nullroute. 

Our first line of defense against DDoS is to immediately halt incoming traffic to the attacked IP address(es). This is called a "nullroute." Our upstream provider nullroutes the attacked IP as soon as they see an attack and see that it meets specific criteria (namely, if it will impact other users significantly).

A nullroute does two things: it prevents other IPs in our own network from going down due to the incoming attack, and it stops traffic from being routed to the attacked IP, so the attacker can't reach it. This means they can't attack anything until our IP is back online. We leave the IP offline for a little while, and bring it back up once the attacker has lost interest. Oftentimes this is the end of the attacks, but sometimes they pick back up later.
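To make the effect of a nullroute concrete, here is a toy model in Python. It is only a sketch of longest-prefix routing, not our upstream provider's actual configuration; the `route` function, the table layout, and the 203.0.113.0/24 documentation addresses are all assumptions made for illustration.

```python
import ipaddress

def route(dest, table):
    """Return the action for dest using longest-prefix match.

    table is a list of (prefix, action) pairs. This layout is hypothetical
    and stands in for a real router's routing table.
    """
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), action)
        for prefix, action in table
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return "unreachable"
    # The most specific (longest) prefix wins, so a /32 overrides a /24.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

table = [
    ("203.0.113.0/24", "forward"),  # the shared network (documentation range)
    ("203.0.113.10/32", "drop"),    # nullroute for the attacked IP
]

print(route("203.0.113.10", table))  # drop
print(route("203.0.113.11", table))  # forward
```

Traffic to the attacked address is discarded, while neighboring addresses on the same network keep working.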

Step 2: Identify the attacked website.

Incoming DDoS attacks always have a target. The attacker has a grudge against a particular domain on the server, or maybe they just don't like the server. The underlying reason is rarely known; most attackers never reach out and let us know why.

If we can identify the attacked website, we can segregate the associated traffic to a different IP address. This way, we can ultimately just nullroute the IP hosting the targeted site instead of the IP hosting a few hundred websites. This limits service interruption to one account, and keeps other customers online during DDoS attacks. If the DDoS attacks never subside, then the account holder can choose to employ DDoS mitigation services to keep their site online through an attack. Mitigation services can be very costly.

Step 3: Account IP Dispersion.

If we cannot identify the attacked website, we resort to something we call Account IP Dispersion. You may be familiar with your account IP address. The same IP address that hosts your account may also host anywhere from one to a few hundred websites. We try to bring that number down as far as possible by dispersing accounts among a dozen or more new IPs. Doing this relies on a bit of custom code we built here at GeekStorage. In short, the code scans each account on the attacked IP to determine whether the hosted domains on that account are using our nameservers. If all of the account's domains point to our nameservers, then we can change the hosted IP reliably and without downtime. We do this by using both the old and new IPs to host the website while the IP transitions (which can take up to 24 hours).
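The dispersion logic can be sketched roughly as follows. This is not the GeekStorage code itself, only a minimal illustration of the idea described above. The `Account` class, the `is_movable` and `disperse` functions, the nameserver hostnames, and the round-robin assignment are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical nameserver hostnames; the real ones are not given in this post.
OUR_NAMESERVERS = {"ns1.example-host.test", "ns2.example-host.test"}

@dataclass
class Account:
    name: str
    # Maps each hosted domain to the set of nameservers it is delegated to.
    domain_nameservers: Dict[str, Set[str]] = field(default_factory=dict)

def is_movable(account: Account) -> bool:
    """True only if every domain on the account uses our nameservers."""
    return all(
        bool(ns) and ns <= OUR_NAMESERVERS
        for ns in account.domain_nameservers.values()
    )

def disperse(
    accounts: List[Account], new_ips: List[str]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Spread movable accounts across new_ips (round-robin, an assumption).

    Returns (plan, stay): plan maps each new IP to the account names moved
    there; stay lists accounts left on the original IP because at least one
    of their domains uses third-party DNS.
    """
    if not new_ips:
        raise ValueError("at least one new IP is required")
    plan: Dict[str, List[str]] = {ip: [] for ip in new_ips}
    stay: List[str] = []
    moved = 0
    for account in accounts:
        if is_movable(account):
            plan[new_ips[moved % len(new_ips)]].append(account.name)
            moved += 1
        else:
            stay.append(account.name)
    return plan, stay
```

Accounts with any domain on third-party DNS stay on the original IP, because we cannot update their DNS records for them.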

The first Helios account IP dispersion is complete; no IP is host to more than 30 accounts (excluding the IP holding domains that do not point to our nameservers). Within 24 hours, if attacks continue, we will narrow down which account is being attacked to a subset of 30, and disperse those accounts once again until we determine the attacked website. Fewer accounts will be impacted by the DDoS attacks with each iteration of the account IP dispersion until ultimately only one account sees downtime.
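As a rough illustration of how quickly repeated dispersion narrows things down, the sketch below computes the size of the suspect group after each round. It assumes that every round splits the suspect group evenly across the same number of IPs and that the next attack reveals which IP holds the target. The function name and the example figures (360 accounts, 12 IPs) are hypothetical, not actual Helios numbers.

```python
import math

def suspect_group_sizes(accounts_on_ip, ips_per_round):
    """Return the suspect-group size before each round, ending at 1.

    Assumes an even split across ips_per_round IPs in every round.
    """
    if accounts_on_ip < 1 or ips_per_round < 2:
        raise ValueError("need at least 1 account and at least 2 IPs per round")
    sizes = [accounts_on_ip]
    while sizes[-1] > 1:
        sizes.append(math.ceil(sizes[-1] / ips_per_round))
    return sizes

print(suspect_group_sizes(360, 12))  # [360, 30, 3, 1]
```

Under these assumptions, three rounds are enough to isolate a single account out of 360.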

Helios Status

The Helios server is currently 100% online. It has seen four downtime-causing DDoS attacks over the last 24 hours, but our Geeks are hard at work keeping an eye on incoming traffic and preparing for the next step should the attacks continue. We hope to post a quick resolution to this issue within the next 24 hours.

If you are concerned about additional downtime, there are a couple of things to keep in mind:

  • Migration to a new server is almost as effective as migration to a new IP. Most customers are already on a new IP. The likelihood of additional downtime for any particular account has been reduced by about 85% by the account IP dispersion. There is also a small chance that your site is the target, in which case a migration will not help.
  • Enabling CloudFlare via cPanel can improve your site speed and reliability. This holds true whether your account is being affected by DDoS attacks or otherwise.

We hope this sheds some light on the actions we are taking to improve availability on the Helios server despite recent DDoS attacks. If you have any questions for our support team, please don't hesitate to ask. We are available 24/7 via our support desk or at [email protected].

Emergency Maintenance - Atlas Server (8/27/2014)

August 27, 2014 at 8:19 PM

We have identified an issue on the Atlas server that requires immediate attention and will initiate downtime shortly. Downtime should be less than 30 minutes, and there should be no other adverse effects of this maintenance.

This maintenance has been completed and service accessibility and optimum speeds should be restored on the Atlas server at this time.

Goliath Issue (Resolved)

July 31, 2014 at 8:25 AM

The Goliath server crashed this morning. We are looking into the cause and working towards a resolution at this time. We will bring Goliath back online as soon as possible; however, we do not yet have an ETA for resolution. Further updates will be posted here as they become available.

Update 9:30AM CDT: Goliath is running a filesystem check (FSCK) and should be back online within 15-30 minutes.
Update 9:36AM CDT: Goliath is back online at this time and services should again be accessible.
Final Update 9:52AM CDT: We have not identified a specific cause of the crash; however, the most likely cause is an I/O error. As such, we are tweaking configurations to help ensure this does not occur again, and enabling additional logging to monitor the server more closely for some time. All RAID arrays are intact and optimal, and an FSCK was completed successfully. We apologize for the downtime and would like to thank everyone for their patience this morning.

IPv4 Renumbering - Websites showing default cPanel page

July 23, 2014 at 2:39 PM

ATTENTION SHARED HOSTING USERS:

If you see a default cPanel page when browsing to your website hosted with us and you are using third-party DNS, you will need to update your hosted IP at your DNS provider per the IPv4 Renumbering notices that we e-mailed you.

You can find your hosted IP by logging in to cPanel, clicking 'Expand Stats' in the left-hand column, and noting the hosted IP displayed there. Enter that IP into your third-party DNS control panel to update your website.
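After updating your DNS provider, you can check whether your domain now points at the new hosted IP. The Python sketch below is one way to do that; the `resolves_to` function and its `resolver` parameter are hypothetical names, and the domain and IP in the usage example are placeholders. Because DNS changes can take time to propagate, a negative result shortly after the update does not necessarily mean something is wrong.

```python
import socket

def resolves_to(domain, hosted_ip, resolver=socket.gethostbyname_ex):
    """Return True if hosted_ip is among the IPv4 addresses for domain.

    resolver defaults to the standard library lookup; it is a parameter
    only so the check can be exercised without network access.
    """
    try:
        _hostname, _aliases, addresses = resolver(domain)
    except OSError:
        # Lookup failures (e.g. socket.gaierror) are treated as "no match".
        return False
    return hosted_ip in addresses

if __name__ == "__main__":
    # Placeholder values; substitute your own domain and hosted IP.
    print(resolves_to("example.com", "203.0.113.10"))
```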

If you need assistance with this process, please open a support ticket and we'll be happy to help.

Emergency Maintenance Notification - 07/22/2014 - LAS1 (Completed)

July 22, 2014 at 6:59 PM

Emergency Maintenance Notification

**NOTE: This only applies to dedicated server customers who are located in LAS1. NO SHARED, RESELLER, OR VPS CUSTOMERS ARE LOCATED IN LAS1.**

Start Time: Tuesday, July 22 2014 4:00PM Pacific Time

Estimated Completion Time: Unknown, current estimate 12-18 hours

Expected Service Impact: None

Possible Service Impact: High

Maintenance Information:
LAS1 is currently operating on emergency backup generator power. The electric utility company which provides power to our LAS1 datacenter (NV Energy) has determined that our main service transformer has suffered a critical failure, resulting in a complete loss of utility power to the datacenter. We are actively working with their engineers to complete a repair or replacement of the unit ASAP.

Our facility is currently operating normally on generator backup power. All redundant power systems have operated flawlessly thus far, and customer power has not been impacted in any way. Our fuel suppliers have been notified and are on standby should we require additional fuel at any time.

Please stand by for updates, as this situation is rapidly evolving.

Update #1: 07/22/2014 @ 10:37 PM Pacific Time

The data center has concluded emergency power maintenance without any impact to power delivered to customer equipment. NV Energy has completed replacement of our main utility transformer with a larger unit, which we hope will preclude any future incidents. We have transitioned off of our backup generators and are now running normally on utility power.

Please let us know if you have any questions regarding this matter.