Domain Registry Service and Pricing Changes

November 4, 2021 at 8:53 AM

We are announcing new changes to our domain registry services and pricing which will impact all domain customers for their next renewals and new domain registrations and transfers starting on 11/11/2021.

Our original registry provider, which has allowed us to maintain excellent domain pricing for a select handful of TLDs for our customers, is unfortunately closing down and discontinuing support for their service. This does not mean any downtime for your sites! All domains will be moved to a new registry provider behind the scenes for continuing service. Your domain will still be with us and you will manage it here in our account portal, but our registry provider will change. This transition comes with both good and bad news: we will now be able to offer many new TLDs for registration; however, pricing will increase for all domain registrations, renewals, and transfers effective immediately, and pricing will vary between TLDs going forward.

We will have access to nearly all available TLDs, and will progressively add more TLDs to our domain registration and transfer options very soon.

For a long time we've been able to offer extremely competitive domain pricing (for .com, .net, .org, .biz, .us, and .info TLDs) at $12/year to all customers because of our no-frills registry provider that only supported a short list of TLDs. These prices will unfortunately be increasing as this registry provider is retired. The new pricing for these TLDs for new registrations, renewals, and transfers is as follows*:

.com - $15/year
.net - $16/year
.org - $16/year
.info - $21/year
.biz - $21/year
.us - $15/year

* - Note new orders cannot be placed at the old pricing at this time and new orders cannot be accepted until the transition to our new registry provider is completed on 11/11/2021. Resellers will still receive a $2/year discount on all domain registrations, transfers, and renewals. We are likely to see more frequent domain pricing changes and TLD additions in the future as a result of the registry provider change.
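As a quick illustration of how the reseller discount applies to the list above, here is a minimal Python sketch. The prices and the $2/year discount are taken from this announcement; the function and variable names are purely illustrative and do not correspond to any billing API.

```python
# Annual prices in USD, copied from the list above.
NEW_PRICES = {
    ".com": 15,
    ".net": 16,
    ".org": 16,
    ".info": 21,
    ".biz": 21,
    ".us": 15,
}

RESELLER_DISCOUNT = 2  # USD per year, as stated in the note above


def annual_price(tld, reseller=False):
    """Return the yearly price for a TLD, minus the reseller discount if it applies."""
    price = NEW_PRICES[tld]
    return price - RESELLER_DISCOUNT if reseller else price
```

For example, `annual_price(".com", reseller=True)` returns 13, and `annual_price(".info")` returns 21.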

We apologize that this announcement comes so close to the time of the impending changes. Please note all domain operations (registration, renewal, transfer, and modification) are presently suspended until 11/11/2021. If you have urgent changes to be made to your domain, please contact us at the help desk and we can put in a ticket with the registry for urgent change requests.

4/25/21 Extended Maintenance Report

April 26, 2021 at 7:49 PM

We want to thank everyone for their patience during the extended maintenance last night and this morning which impacted our US-based hosting services. We would also like to apologize for going far beyond the planned maintenance window. We did schedule this maintenance well in advance and believed strongly that it would be a 2-3 hour operation, so we added another hour for any possible complications to be cleared up. Unfortunately, some aspects of the planned maintenance were changed without this being communicated to us beforehand, and these changes effectively turned a 2-3 hour job into a 6-8 hour job.

All of our US server hardware was to be moved to a location with better infrastructure, and this was communicated to us by our datacenter provider over the past several months. Our biggest concern when moving equipment between locations is always making sure downtime is kept to a minimum - we've managed our fair share of hardware migrations and there are always complications or things we did not expect that end up slowing down the process substantially. Most commonly this can relate to trying to maintain similar cable configurations during migration, hardware being jostled during transport which can increase the risk of hardware failure upon returning the hardware to service, and things one wouldn't necessarily expect to take very long like removing the servers from the cabinet and re-racking them in a new cabinet. In February we received more details regarding the expected process for the relocation, and we were informed the datacenter would be taking care of everything and the process was to be a full cabinet migration without removing and re-racking servers between locations, and the work would be performed by a team that does this type of work professionally. Not only is this the fastest and most reliable method of relocating live servers, but it avoids most of the concerns we typically have in regards to the entire process.

We performed a planned upgrade to the OnApp software hosting our VPS services shortly before the migration was to begin, and unfortunately the update had to be rolled back which caused a delay of almost two hours before the maintenance began. Around 11:45PM CDT the migration was underway and we were still anticipating 2-4 hours for the entire process from that time. After 4 hours it was clear something was not going quite to plan, and when communicating with the technicians on site we were informed they were almost done racking our equipment at the new location and to expect systems to be booting up within an hour. At this point we knew the work was going to go on significantly longer than expected and one hour was likely very optimistic, so we tried our best to communicate this via Twitter while our main site and services were still down.

After 90 minutes from the time we received the ETA from the datacenter, the re-racking had been completed, and then the re-wiring began which took another 30 minutes or so. One of the older PDUs needed to be replaced which took roughly 15 minutes. Services were finally being brought back online around 8 hours after the downtime began, with almost all customers being back online after 8-9 hours total downtime.

In short, instead of migrating cabinets of hardware in their entirety from the old facility to the new facility, as we were told to expect, each piece of equipment was individually moved by hand. Instead of moving large cabinets, dozens of individual pieces of equipment were disconnected, removed, re-racked, and re-connected. There are far more possible complications with this strategy, and of course substantially more time is required to perform the work. Had we known beforehand that the plan had changed away from a full cabinet relocation, we would have been on site ourselves and split the migration into two separate parts instead of trying to get all the work done in one night. We are frustrated this happened the way it did, but we are also glad the work is complete and no more hardware relocations are expected in the near future.

We want to apologize again for the maintenance being pushed back and going over twice as long as originally planned. We have confirmed all shared, reseller, dedicated server, and managed VPS services are fully restored and operational. If you still see any issues of any kind please let us know in the help desk or send us an email at [email protected] and we will investigate immediately.

Scheduled Maintenance - 4/25/21

April 7, 2021 at 1:22 PM

On April 25, 2021, maintenance will be performed to migrate all physical hardware at our US location beginning at 10PM CDT (GMT-5), and we expect the migration window to last 2-4 hours. Our US-based hosting services will be down for the duration of the planned maintenance. We do expect slightly improved network performance and reliability following the relocation of hardware to our datacenter provider's premier location. Some considerations are included below for each service type currently active at our US location.

Shared Hosting (Unlimited & Performance) and Reseller Hosting: Our US-based shared and reseller hosting services will be down for 2-4 hours for the maintenance window. We will shut down servers manually about 10 minutes prior to the beginning of the maintenance window, and service will automatically return once the relocation concludes.

VPS Hosting: All VPS hosting services will be down for 2-4 hours for the maintenance window, and we will begin shutting down VPS services about 30 minutes prior to the maintenance window, at which time the OnApp control panel may also be inaccessible. All VPS services will be brought back online automatically following the maintenance window.

Dedicated Servers: We would advise dedicated server customers to shut down their servers or put their sites into maintenance mode 10-30 minutes prior to the maintenance window to help ensure clean shutdowns can be performed. Servers will be automatically brought back online following maintenance.

We understand this is a significant downtime, so we have scheduled the maintenance window for overnight hours to help ensure as few customers as possible are negatively impacted by this relocation. If you have any questions or concerns, please contact us directly by opening a support ticket under your account or by emailing us at [email protected].
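For customers in other time zones, the maintenance window above can be converted to UTC with a short Python sketch. It assumes a fixed GMT-5 offset for the whole window, as given in the announcement, and uses the upper end of the 2-4 hour estimate; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Fixed GMT-5 offset, as stated in the announcement (assumed constant
# for the whole window).
LOCAL = timezone(timedelta(hours=-5))

window_start = datetime(2021, 4, 25, 22, 0, tzinfo=LOCAL)
# Upper end of the 2-4 hour estimate.
window_latest_end = window_start + timedelta(hours=4)

start_utc = window_start.astimezone(timezone.utc)
end_utc = window_latest_end.astimezone(timezone.utc)

print(start_utc.isoformat())  # 2021-04-26T03:00:00+00:00
print(end_utc.isoformat())    # 2021-04-26T07:00:00+00:00
```

Replacing `timezone.utc` with your own offset gives the window in your local time.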

Update (10:06PM 4/25): Please note we will be kicking off the downtime a bit later than anticipated tonight. We have encountered an issue during a planned update we were to perform just prior to the maintenance, and we are getting that rolled back before the maintenance gets fully underway.

Update (10:42AM 4/26): The maintenance and subsequent service verification checks have been completed at this time. Most users should have been back online around 7:45AM, with VPS and dedicated customers following about 30-60 minutes thereafter. If you are still seeing any issues and you are on our VPS services, please reboot your VPS, and if this does not help, contact us at the help desk. If you see issues on shared, reseller, or dedicated server hosting, please contact us via the help desk or [email protected] for assistance. All services should be fully online and operational at this time, and we would like to apologize for the maintenance taking much longer than expected. An additional blog post explaining why the scheduled maintenance window turned out to be far too optimistic will come later this evening once we have a chance to triple-check all servers and services, and handle pending support requests.

Outage at Chicago Datacenter - 9/18/20

September 18, 2020 at 11:37 PM

Shared hosting customers may be noticing DNS problems this evening as we have experienced an attack on our DNS servers. We have taken steps to get service restored but it may take some time before all domains begin resolving again (we recommend clearing your DNS cache or restarting your device to help speed this process along).

Reseller, VPS, and Dedicated Server customers may also have seen an outage lasting roughly 30 minutes during the time when we were working with the datacenter to resolve the DNS connectivity issues. These services should now be back to normal operation.
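To check whether a domain is resolving again from your own device after the DNS issue described above, a minimal Python sketch such as the following can help. It relies on your operating system's resolver, so it reflects what your device currently sees, including any cached results. The function name and the `resolver` parameter are our own illustrative choices.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the hostname currently resolves to at least one address."""
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # Raised by the system resolver when the name cannot be resolved.
        return False
```

Call `resolves("example.com")` with your own domain in place of `example.com`. If it still returns `False` after clearing your DNS cache, the change may simply not have reached your resolver yet.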

Saturn Server Reason For Outage 7/17-7/19/20

July 20, 2020 at 2:29 PM

This report will summarize and go into detail about the outage and recovery efforts on the Saturn server between 7/17/20 and 7/19/20. Note timeframe references are in our local time (CDT, GMT-5).

Summary:

On the night of Thursday 7/16/20 we had scheduled a server reboot to help improve server stability following an issue encountered the day before. The uptime of this server was over 1,000 days and the issue the night before led us to believe a reboot would be helpful to clear out a few old processes that may have been impacting server stability. During the scheduled reboot, the RAID arrays hosting both primary and backup drive data were no longer present aside from two disks on the main RAID 10 array. The data was not recoverable in this state, and so we began recovery efforts on the morning of 7/17 via re-image and restore from the latest backups taken the day of the reboot. Exceptionally slow speeds were noted going into the second half of the data restoration process, which led us to investigate possible underlying complications, ultimately finding that the server recovery had been executed on the wrong array. Options were evaluated for how to proceed, and it was deemed most suitable to restart recovery on the correct disks, which began on the morning of 7/18. Recovery from backup was then completed in the early afternoon of 7/19.

In depth:

Prior to a reboot on a server with a significant uptime, it is prudent to be sure proper backups are in order and all systems are operating as expected. We verified JetBackup (the backup service on the Saturn server) had taken a full set of successful backups on the same day as the reboot was to be performed, as expected and without complications on any accounts. We also verified via our hardware monitoring services that no issues were being reported. In this case, it is important to note what we were seeing reported by the RAID array at this time which was the following: OK (CTR, LD, PD, CV) -- Broken down, this effectively meant no issues were present on any of the key points related to the RAID configuration or hardware. Virtual drives (the RAID arrays), physical disks, the CacheVault (I/O caching and backup unit), and the controller itself all were scanned within minutes of the reboot and did not report any issues. This monitoring occurs every 5 minutes and all points of concern are checked each time. We were very careful in this case to double-check these monitoring points before the reboot because the server did have several disks replaced last year.
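To make the status line above a little more concrete, here is a minimal Python sketch that splits a summary such as `OK (CTR, LD, PD, CV)` into its state and components. The mapping of abbreviations to components is our reading of the description above, and the format of any status line other than this one example is an assumption; the function names are illustrative.

```python
import re

# Our reading of the abbreviations, based on the components listed above.
COMPONENTS = {
    "CTR": "controller",
    "LD": "virtual drives (RAID arrays)",
    "PD": "physical disks",
    "CV": "CacheVault",
}

# Assumed shape: a state word followed by a parenthesized, comma-separated list.
STATUS_RE = re.compile(r"^(?P<state>\w+)\s*\((?P<parts>[^)]*)\)$")


def parse_status(line):
    """Split a summary like 'OK (CTR, LD, PD, CV)' into (state, component names)."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized status line: {line!r}")
    codes = [c.strip() for c in match.group("parts").split(",") if c.strip()]
    return match.group("state"), [COMPONENTS.get(c, c) for c in codes]


def all_clear(line):
    """True only when the state is OK and every known component was reported on."""
    state, parts = parse_status(line)
    return state == "OK" and set(parts) >= set(COMPONENTS.values())
```

As this incident shows, a clean summary only reflects what the controller reports at that moment; it was not a guarantee that the arrays would survive the reboot.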

Following the reboot, all disks which were hot swapped in the year before had fallen out of their respective RAID arrays. The only two drives remaining with a visible RAID configuration belonged to an unrecoverable portion of the primary RAID array. At this point we began our best efforts to try to recover from the failed RAID arrays, but several hours later after making no progress and evaluating other options with experts in the field, we decided to start a full rebuild of the operating system (OS) & recovery from backups. We also updated the RAID card firmware during this process as we suspected it may have been to blame for the RAID failure, and we tested the integrity of future reboots on the installed drives by doing some stress testing prior to beginning the recovery effort (though that is not to say we will be rebooting the server again anytime soon - for sanity's sake!).

During the initial re-configuration of the server, a mistake was made which put the backup HDDs in RAID 0 instead of RAID 1, which caused the wrong drives to be interpreted as the primary array during the OS re-install process. This mistake was not realized until late in the evening that same day when it became apparent there was something wrong with the recovery speed. We then evaluated amongst ourselves, and with other experts whom we consult, whether trying to clone the drives onto the main array or simply restarting the recovery process entirely was the best course of action. The clone process ultimately could have failed, would likely have taken several hours to complete, and would still have left more backups to be restored afterward (about 45% of data remained to be restored). Restarting the recovery process entirely would be a more certain course of action, and afterward the restore from backup would be much faster. Another option was to continue with the restore process and then try the clone or transfer later from the restored data once everything was back online, but this came with other risks (the primary concern being that the data was on a RAID 0 array under extreme load) and would have meant *extremely* slow speeds for the duration. If anything went wrong with that process, we would have been forced to restart recovery completely once again.
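For readers less familiar with RAID levels, the difference between RAID 0 and RAID 1 explains why leaving restored data on a RAID 0 array was a concern. The sketch below encodes the standard properties of the two levels for identical disks; the disk count and size in the example are hypothetical and are not the actual Saturn hardware.

```python
def raid_properties(level, disks, disk_size_tb):
    """Return (usable capacity in TB, disk failures survivable) for identical disks."""
    if disks < 2:
        raise ValueError("need at least two disks")
    if level == 0:
        # Striping: all capacity is usable, but losing any one disk loses the array.
        return disks * disk_size_tb, 0
    if level == 1:
        # Mirroring: capacity of a single disk, survives losing all but one disk.
        return disk_size_tb, disks - 1
    raise ValueError("only RAID 0 and RAID 1 are covered in this sketch")


# Hypothetical example: two 4 TB disks.
print(raid_properties(0, 2, 4))  # (8, 0)
print(raid_properties(1, 2, 4))  # (4, 1)
```

In other words, RAID 0 trades all redundancy for capacity and speed, which is why it is a poor home for data that cannot easily be restored a second time.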

It was finally decided to restart the recovery process entirely - on the correct drives, and retaining data from the prior restore operation. We considered this to be the best course of action to not only get websites back online as quickly as possible but also to limit further downtime during and following the restore process. The recovery process was finally completed on 7/19/20 in the early afternoon around 1PM. Server responsiveness and stability then immediately returned to normal.

Going forward:

We do everything we can to prevent issues like RAID failure from becoming an actuality. RAID failure is the most feared situation in any hosting environment. We checked the array status, we checked the hardware backing the I/O cache, we checked the physical disks - all of this just minutes prior to the scheduled reboot. In fact, RAID arrays on all of our systems are checked every 5-10 minutes and drive replacement occurs within 24-72 hours of failure, caching is disabled immediately if battery backups or CacheVault systems fail, etc.

Recovery from backups is rarely a quick or easy operation in a shared hosting environment, and we are exceptionally relieved with the outcome in this regard especially as it relates to data integrity. We regret the oversight made during the first recovery attempt, and this is where we will be making changes to our own internal processes. Generally we reference other active systems and an overall recovery guideline - we keep configurations similar between our shared hosting environments where possible - but this was half of the equation which led to the selection of the wrong drive during the recovery process in this case. We will now be retaining separate recovery plans for each individual system rather than a blanket set of guidelines. We do not want to make worst case scenarios worse, and this will ensure such complications do not occur again.
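One way to picture a per-system recovery plan is as a small record of what each array should look like, checked before any re-install begins. The following Python sketch is only an illustration of that idea, not our actual tooling: the array names, the `ArrayPlan` structure, and the `check_target` function are all hypothetical, while the RAID 10 primary and RAID 1 backup levels come from the report above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayPlan:
    role: str        # "primary" or "backup"
    raid_level: int  # RAID level the plan expects for this array


def check_target(plan, detected, install_role="primary"):
    """Return a list of problems; an empty list means the arrays match the plan.

    plan maps array name -> ArrayPlan; detected maps array name -> RAID level found.
    """
    problems = []
    for name, expected in plan.items():
        found = detected.get(name)
        if found is None:
            problems.append(f"{name}: array not detected")
        elif found != expected.raid_level:
            problems.append(
                f"{name}: expected RAID {expected.raid_level}, found RAID {found}"
            )
    targets = [name for name, p in plan.items() if p.role == install_role]
    if len(targets) != 1:
        problems.append(f"plan must name exactly one {install_role} array")
    return problems


# Hypothetical plan: array names are placeholders.
saturn_plan = {
    "array0": ArrayPlan(role="primary", raid_level=10),
    "array1": ArrayPlan(role="backup", raid_level=1),
}
```

With this plan, `check_target(saturn_plan, {"array0": 10, "array1": 0})` returns `["array1: expected RAID 1, found RAID 0"]`, the kind of mismatch that would halt a re-install before any data is written.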

As with any major downtime, lessons are learned and will now always influence our future preparations and actions. This was easily the worst disaster we've had to manage in our history in web hosting as GeekStorage, and the second worst in my personal experience of nearly 20 years. Our disaster recovery plans were tested heavily throughout the past few days, and although a mistake was made preventing a more timely return to service, minimal loss of data and configurations occurred and for that we are very thankful to have planned sufficiently.

We're also now working on sourcing more coffee.

Thank you:

We know stress is constant during an outage, and concrete answers are often unavailable, which leaves everyone angry, frustrated, or both. We want to thank everyone on the Saturn server for being extremely understanding and patient throughout the outage and recovery effort.

We understand outages such as this, compounded also by the network outage earlier this month, can shake your faith in us as your hosting provider, and the same for your business and customers. We hope the above helps to provide an understanding of what happened and how such problems will be prevented to the utmost degree possible in the future, and that this information can be passed on to your clients to hopefully alleviate their concerns going forward, as well.