Given the wealth of different DRaaS capabilities on the market, and the fact that - just as no two enterprises are exactly alike - no two solutions are quite the same, choosing the exact solutions that are the right fit for your disaster recovery plan can be a daunting process.
According to “How Disaster Recovery as-a-Service helps financial services firms stay on the front foot”, a new report by Creative ITC, making the right decision when implementing a new DRaaS solution has never been more critical. The report aims to help firms, particularly in the financial sector, understand and leverage the benefits of DRaaS solutions.
“The nature of disaster planning is changing. As cloud and virtualisation take hold, so too does the risk of downtime due to software problems, cybersecurity vulnerabilities and increasing infrastructure complexity,” notes the report.
“Add to that natural disasters, power outages, hardware failures and human error and it becomes clear financial services firms need solid DR resources in place – and be absolutely certain they will work.”
In order to capitalise on the potential of a converged, comprehensive DRaaS strategy, Creative ITC’s report identifies a number of key criteria to explore when comparing different DRaaS solutions.
Understanding your RPO
Your recovery point objective (RPO) is a vital piece of the disaster recovery plan puzzle. Essentially, your RPO describes the maximum amount of data, measured as a span of time, that your business can afford to lose when an outage, attack or other cause of downtime occurs. An RPO of one hour, for example, means recovery must restore data to a point no more than one hour before the incident. Different DRaaS solutions deliver different RPOs, so how critical each part of your business is to continuing operations helps determine which solution should be applied to which workload.
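To make the idea concrete, here is a minimal Python sketch that treats RPO as a tolerated data-loss window. The function names, timestamps and targets are hypothetical illustrations, not taken from the Creative ITC report.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, failure_time: datetime) -> timedelta:
    """Return how much recent data is lost when recovering to the last replicated copy."""
    if failure_time < last_recovery_point:
        raise ValueError("failure_time precedes the last recovery point")
    return failure_time - last_recovery_point


def meets_rpo(last_recovery_point: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """True if the data lost in this incident stays within the RPO."""
    return data_loss_window(last_recovery_point, failure_time) <= rpo


# Hypothetical incident: the last copy reached the recovery site at 09:00
# and the outage began at 09:45.
last_copy = datetime(2024, 1, 15, 9, 0)
outage = datetime(2024, 1, 15, 9, 45)

print(data_loss_window(last_copy, outage))                  # 0:45:00
print(meets_rpo(last_copy, outage, timedelta(hours=1)))     # True
print(meets_rpo(last_copy, outage, timedelta(minutes=15)))  # False
```

The same 45 minutes of lost data is acceptable for a workload with a one-hour RPO and unacceptable for one with a 15-minute RPO, which is why different parts of a business may call for different replication techniques.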
Below, Creative ITC breaks down some of the key data replication techniques that can underpin a disaster recovery strategy.
Synchronous array-based replication (RPO = 0)
Ensures all data is written to the source and target storage simultaneously, waiting for acknowledgment from both arrays before completing the operation. With the rise of all-flash arrays, network latency between arrays can become a bottleneck. Synchronous replication also runs the risk of quickly propagating malware and thus dramatically extending recovery times.
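The latency point can be illustrated with a rough sketch. It assumes a simplified model in which both writes are issued together and the operation completes when the slower acknowledgment arrives; the function names and the microsecond figures are hypothetical, chosen only to show the effect.

```python
def sync_write_latency_us(local_write_us: int, remote_write_us: int, round_trip_us: int) -> int:
    """Latency of one synchronous write: wait for both arrays to acknowledge."""
    # The remote acknowledgment also pays for the network round trip.
    return max(local_write_us, round_trip_us + remote_write_us)


def async_write_latency_us(local_write_us: int) -> int:
    """Latency of one asynchronous write: only the source array must acknowledge."""
    return local_write_us


# Assumed figures: flash arrays that commit in 200 microseconds,
# linked by a network with a 2,000 microsecond round trip.
print(sync_write_latency_us(200, 200, 2000))  # 2200
print(async_write_latency_us(200))            # 200
```

Under these assumed numbers the network, not the storage, accounts for most of each synchronous write, which is the bottleneck the report refers to.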
Asynchronous array-based replication (RPO > 1 hour)
Uses storage snapshots to take a point-in-time copy of data that has changed and sends it to the recovery site. Schedules can be set up to run hourly, depending on the number and frequency of snapshots that the storage and application can withstand.
VM snapshot-based (RPO > 1 hour)
Created in the hypervisor, VM-level snapshots also take a point-in-time copy of data that has changed and send it to the recovery site. Only asynchronous replication is supported. VM-level snapshots can affect performance, and it is not recommended to create, remove or leave them running on production VMs during working hours.
Guest-based (RPO > 1 hour)
Also known as agent- or OS-based replication, this solution requires software components to be installed on each physical and virtual server. Although more portable than array-based solutions, the approach limits scalability and typically only supports asynchronous replication. It can also impact the performance of server operating systems.
Hypervisor-based (RPO = seconds)
Always-on, hypervisor-based replication delivers continuous data protection (CDP), constantly replicating data changes to the recovery site within fractions of a second. It does not need to be scheduled, does not use snapshots, and writes to the source storage without waiting for acknowledgment from the recovery site. All data writes are captured, cloned and sent to the recovery site at the hypervisor layer, which the report describes as more efficient, accurate and responsive.
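The breakdown above can be turned into a simple shortlist helper. This is a sketch only: the RPO figures are the indicative ones quoted in the headings, the 10-second value for "seconds" is an assumed placeholder, and the names `Technique`, `TECHNIQUES` and `could_meet` are hypothetical.

```python
from datetime import timedelta
from typing import List, NamedTuple


class Technique(NamedTuple):
    name: str
    rpo_floor: timedelta  # lowest RPO quoted for the technique
    strictly_above: bool  # True where the heading says "RPO > floor"


TECHNIQUES = [
    Technique("synchronous array-based", timedelta(0), False),
    Technique("asynchronous array-based", timedelta(hours=1), True),
    Technique("VM snapshot-based", timedelta(hours=1), True),
    Technique("guest-based", timedelta(hours=1), True),
    # The heading says "seconds"; 10 seconds is an assumed placeholder.
    Technique("hypervisor-based", timedelta(seconds=10), False),
]


def could_meet(target_rpo: timedelta) -> List[str]:
    """List techniques whose quoted RPO range can fall within the target."""
    shortlist = []
    for t in TECHNIQUES:
        if t.strictly_above:
            ok = t.rpo_floor < target_rpo
        else:
            ok = t.rpo_floor <= target_rpo
        if ok:
            shortlist.append(t.name)
    return shortlist


print(could_meet(timedelta(minutes=5)))
# ['synchronous array-based', 'hypervisor-based']
print(could_meet(timedelta(0)))
# ['synchronous array-based']
```

A shortlist like this covers RPO only. The trade-offs noted for each technique, such as latency, malware propagation, performance impact and scalability, still need to be weighed separately.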
The trends affecting disaster recovery today are pushing the industry firmly in the direction of hypervisor-based DRaaS solutions. Paired with a fully managed DRaaS service, this approach increasingly allows financial firms to reduce downtime, costs and overall risk to their operations.