Disaster recovery: How to take a proactive approach to protecting data
When disaster strikes, unchecked vulnerabilities have a way of coming to the fore. Whether that's a broken levee during a hurricane or a hole in your cloud security that is exploited by a hacker, catastrophic events tend to reveal what's been overlooked all along. For brands, that often comes in the form of neglected defense mechanisms meant to protect their data, people and assets.
As more organizations undergo digital transformation, and more business services move to the cloud, the collective vulnerability grows. While there is no doubt significant value in online services and cloud technology, any disruption to those services can adversely impact your operations and ultimately your revenue stream. This is precisely why developing a strategy for disaster recovery (DR) is an essential practice for every business. That said, it's all too common for business leaders to underestimate just how much preparation is required to protect against unexpected dangers.
As Davin Overland, senior product manager at TELUS International and expert in Google Cloud Platform services, emphasizes, "Supply side failures stop your business from creating your products, demand side failures destroy revenue, and business support system failures stop your company from maximizing the value of customer interactions. It doesn't take a complete failure of all these systems to qualify as a disaster."
Below, we outline what to consider with regard to cloud security and disaster planning, how to structure and organize a disaster recovery strategy, and best practices for implementing a comprehensive, cost-justified disaster recovery program.
The variety of disasters
It would be impossible to name all the ways to disrupt a business. From natural disasters like floods, earthquakes and fires, to cyberattacks and infrastructure failures, disruption can come from the most unexpected sources. Sometimes the smallest human error can have a devastating impact on your business.
Inside the data center, losing a rack of servers can cause some systems to go down if you haven't designed for high availability — the degree to which technology can minimize downtime and maintain continuous operations. Infrastructure failures may be outside your data center, too. If a massive storm knocks out your internet service provider, your whole digital platform could go dark. A misconfigured network device or domain name record can leave servers inaccessible to your customers, partners and employees.
While it's nearly impossible to prepare for every single potential problem, a solid DR plan can help you prioritize and organize how you would react to certain kinds of disasters. Consider each kind or category of disaster with respect to your balance sheet: how would it disrupt your ability to provide for customers, and to uphold your end of your relationship with suppliers? If a data center shutdown disrupts your customers' ability to make payments, or interferes with your ability to meet service level agreements, how catastrophic would that be for your business?
Preparing for every type of disaster is usually too resource-intensive to be realistic. Triaging or modeling disaster scenarios lets you prioritize the events you need to prepare for and decide exactly how you will respond to each.
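One simple way to triage is to rank scenarios by expected annual loss: the likelihood of the event multiplied by the downtime it would cause and the cost of each hour of downtime. The sketch below is illustrative only. The `Scenario` class, the scenario names and every number in it are assumptions made up for the example, not figures from any real business.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """A hypothetical disaster scenario with rough, assumed estimates."""
    name: str
    annual_probability: float  # assumed chance of the event in a given year
    outage_hours: float        # assumed downtime if the event occurs
    loss_per_hour: float       # assumed revenue and penalty cost per hour down

    @property
    def expected_annual_loss(self) -> float:
        # Likelihood x duration x hourly cost
        return self.annual_probability * self.outage_hours * self.loss_per_hour


def triage(scenarios):
    """Return scenarios ordered from highest to lowest expected annual loss."""
    return sorted(scenarios, key=lambda s: s.expected_annual_loss, reverse=True)


# All values below are invented for illustration.
scenarios = [
    Scenario("Regional ISP outage", 0.20, 8, 5_000),
    Scenario("Data center flood", 0.01, 72, 5_000),
    Scenario("Misconfigured DNS record", 0.50, 2, 5_000),
]

ranked = triage(scenarios)
for s in ranked:
    print(f"{s.name}: {s.expected_annual_loss:,.0f} expected loss per year")
```

In this made-up example, the frequent but short ISP outage ranks above the rare but severe flood. A ranking like this is a starting point for discussion, not a substitute for judgment about events that would threaten the survival of the business.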
Framing a disaster recovery and business continuity plan
When you are planning for disaster recovery and business continuity, a key step will be defining your recovery point objective (RPO) and recovery time objective (RTO).
An RPO defines the state of your business you want to be able to start from in your disaster recovery mode. That could be the state of operations at the close of business the day before, or it could be the state of business one minute prior to the loss of services. An RTO, on the other hand, defines the time from the start of a disaster to the time you have recovered to your recovery point objective.
If disaster recovery is highly automated and leverages the cloud, you could be operational in a short period of time. However, if the disaster recovery process is manual, the time to recovery could take days or weeks.
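To make the two objectives concrete, the following sketch checks a recovery against them after the fact. It assumes the RPO is expressed as the maximum acceptable age of the state you recover to, and the RTO as the maximum acceptable time from the start of the disaster to restored service. The function name `evaluate_recovery` and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def evaluate_recovery(disaster_start, last_recoverable_state,
                      service_restored, rpo, rto):
    """Compare an actual (or rehearsed) recovery against RPO and RTO targets."""
    # How far behind the moment of disaster the recovered state is
    data_loss_window = disaster_start - last_recoverable_state
    # How long services were unavailable
    downtime = service_restored - disaster_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


# Hypothetical incident: the last good backup was taken 15 minutes before
# the outage, and service returned 3.5 hours after it began.
result = evaluate_recovery(
    disaster_start=datetime(2024, 3, 5, 14, 0),
    last_recoverable_state=datetime(2024, 3, 5, 13, 45),
    service_restored=datetime(2024, 3, 5, 17, 30),
    rpo=timedelta(minutes=30),
    rto=timedelta(hours=4),
)
print(result)
```

Running the same check after each disaster recovery rehearsal is one way to see whether the stated objectives are actually achievable.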
The most important thing to keep in mind about RPOs and RTOs is that they should be tied to business drivers. As Nigel Devaraj, senior product manager at TELUS International and AWS expert, advises, "There's no reason to define RPOs and RTOs that cost more than the value they provide in return, so it is as crucial to understand what not to focus on in disaster recovery as it is to understand what needs recovery protection."
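The idea that objectives should not cost more than the value they return can be sketched as a comparison of total yearly cost across recovery options. The example below is a rough model under stated assumptions: every tier name, price, incident rate and hourly cost is invented, and each incident is assumed to cost the full RTO in downtime and the full RPO in lost data, which is a worst case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrTier:
    """A hypothetical disaster recovery option and its objectives."""
    name: str
    annual_cost: float  # assumed yearly cost of running this option
    rpo_hours: float    # worst-case data loss window
    rto_hours: float    # worst-case time to restore service


def total_annual_cost(tier, incidents_per_year,
                      downtime_cost_per_hour, data_loss_cost_per_hour):
    """Yearly cost of the tier plus the expected loss it still leaves."""
    loss_per_incident = (tier.rto_hours * downtime_cost_per_hour
                         + tier.rpo_hours * data_loss_cost_per_hour)
    return tier.annual_cost + incidents_per_year * loss_per_incident


# All figures are illustrative assumptions.
tiers = [
    DrTier("Nightly backups, manual restore", 10_000, 24, 48),
    DrTier("Warm standby in a second region", 60_000, 1, 4),
    DrTier("Active-active across regions", 250_000, 0, 0.25),
]

costs = {
    t.name: total_annual_cost(t, incidents_per_year=0.5,
                              downtime_cost_per_hour=4_000,
                              data_loss_cost_per_hour=1_000)
    for t in tiers
}
best = min(tiers, key=lambda t: costs[t.name])
print(costs)
print("Lowest total cost:", best.name)
```

With these invented numbers the middle option wins: the cheapest tier leaves too much expected loss, and the tightest tier costs more than the loss it prevents. Different inputs would produce a different answer, which is the point of tying the objectives to business drivers.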
As part of the process, a brand should consider the expectations of its customers — both formal ones defined in service level agreements, and informal ones where the perception of the recovery can be detrimental to its reputation — when determining RPOs and RTOs. For those looking to map objectives to customer expectations, a partner with deep expertise can help.
Best practices for disaster recovery planning
The first step in the disaster recovery planning process is to identify critical IT assets. This includes both applications and infrastructure. Modern applications are built on combinations of services, so be sure to identify dependencies on other services, including those provided by third parties. Infrastructure includes your own on-premises infrastructure as well as any infrastructure you access in public clouds.
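Identifying dependencies can start with a simple map of which service relies on which, followed by a walk through that map to find everything a critical application needs, directly or indirectly. The sketch below assumes a hand-maintained dependency map. The service names and the `third_party` set are hypothetical.

```python
def all_dependencies(service, depends_on):
    """Return every service that `service` relies on, directly or indirectly."""
    seen = set()
    stack = [service]
    while stack:
        current = stack.pop()
        for dep in depends_on.get(current, ()):
            if dep not in seen:  # also guards against circular dependencies
                seen.add(dep)
                stack.append(dep)
    return seen


# Hypothetical inventory: each key depends on the services in its list.
depends_on = {
    "checkout": ["payments-api", "inventory-db"],
    "payments-api": ["third-party-gateway", "auth"],
    "auth": ["user-db"],
}
third_party = {"third-party-gateway"}  # services run by someone else

deps = all_dependencies("checkout", depends_on)
external = deps & third_party
print(sorted(deps))
print("Third-party dependencies:", sorted(external))
```

Here the checkout application turns out to depend on a user database and an external payment gateway that do not appear in its direct dependency list, which is the kind of hidden reliance a recovery plan needs to account for.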
These four elements can form the prongs of your approach:
- Define recovery objectives with an awareness that RPOs and RTOs have associated costs. You can achieve tight RPOs and RTOs, but they will cost more than objectives that tolerate some loss of data or an extended recovery time.
- Prepare an emergency response plan. Specifying what you will do under a particular set of circumstances will enable your team to focus on the mechanics of recovery without having to spend time making decisions about which services have top priority or who is responsible for verifying that the recovered system is functioning as expected.
- Define operations to ensure rapid recovery, including backups, failovers in alternate cloud regions and regular practice of disaster recovery procedures.
- Form a partnership with a company with experience in disaster recovery and the latest in cloud security.
A significant number of successful businesses have limited experience with disasters, says Overland — and that limited experience can have major repercussions. "In recent times, companies have been faced with the reality that a disaster isn't just IT back-office systems failing to function," Overland explains. "What happens when your employees can't come in to work? How does your business continue? What computers do your employees use? How do they communicate? How do they support their customers?"
A DR plan that fails to support recovery from these kinds of issues fails to acknowledge what a disaster really can be for a company, he says.
It's hard to know where to get started when developing a disaster recovery plan, especially as there is a great deal of choice and nuance in the world of cloud security. A trusted partner can help you get organized and create effective contingency and recovery plans that minimize downtime and get you back to running your business.