Today I’d like to take the time to outline the differences between continuous replication (often referred to as “synchronous” or “asynchronous”) and periodic replication, or point-in-time (PiT) copies.
When referring to “continuous replication”, I include any type of replication that keeps a target copy synchronized with its source on an ongoing basis. This category includes:
- Synchronous replication – The target is identical to the source at all times: a write is not acknowledged to the host until it has been committed on the target as well. Examples of synchronous replication are EMC SRDF/S, HDS TrueCopy and HP Continuous Access Synchronous.
- Asynchronous replication – As with synchronous replication, the target is continuously synchronized with the source; however, it may lag behind the source, typically by several minutes or less. Examples of asynchronous replication are EMC SRDF/A, HUR (Hitachi Universal Replicator) and IBM Global Mirror.
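The difference between the two modes comes down to when a write is acknowledged. Here is a minimal sketch of that idea in Python. It is a toy model, not how any of the products above are implemented, and the names `Volume`, `SyncReplicator` and `AsyncReplicator` are made up for illustration.

```python
from collections import deque


class Volume:
    """A toy storage volume: maps a block number to its data."""

    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data


class SyncReplicator:
    """Acknowledges a write only once source AND target both hold it."""

    def __init__(self, source, target):
        self.source, self.target = source, target

    def write(self, block, data):
        self.source.write(block, data)
        self.target.write(block, data)  # the host waits for the remote commit
        return "ack"

    def lag(self):
        return 0  # the target never falls behind


class AsyncReplicator:
    """Acknowledges after the local write and ships it to the target later."""

    def __init__(self, source, target):
        self.source, self.target = source, target
        self.pending = deque()  # writes not yet applied on the target

    def write(self, block, data):
        self.source.write(block, data)
        self.pending.append((block, data))
        return "ack"  # the host does not wait for the target

    def drain(self):
        """Apply all pending writes to the target, in order."""
        while self.pending:
            self.target.write(*self.pending.popleft())

    def lag(self):
        return len(self.pending)  # writes that would be lost in a disaster
```

In this model, whatever is still sitting in `pending` when the source site is lost is exactly the data you lose with asynchronous replication, which is why its lag matters for your RPO.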
Point-in-time (PiT) copies, on the other hand, are not continuously synchronized. These copies are updated from their source at specific times, and once they reach full synchronization, the process is stopped. Note that this definition covers both copies kept within the same storage frame (such as EMC TimeFinder, HDS ShadowImage, NetApp snapshots, IBM FlashCopy) and copies kept on a different frame (such as NetApp SnapMirror, timed synch-split SRDF, HUR copies, etc.).
Why do we need a/synchronous replication? Why do we need PiT copies? Do we need both or can we choose only one of the two types? What are the benefits and weak spots of each method?
When it comes to disaster recovery, the different scenarios can be divided into two groups:
- Physical risks. This group includes hardware failure, outage, natural disaster and so on.
- Logical risks. This group includes accidental data deletion or corruption by users or applications, software error that harms data integrity, viruses, etc.
The short answer is:
- You’d need both continuous replication and PiT copies to ensure successful recovery from both the physical and logical risk scenarios.
- A/Synchronous replication is mainly aimed at dealing with the “physical risks” group.
- PiT copies solve the threats associated with the “logical risks” group.
In more detail: when an outage or any other kind of physical failure occurs, continuous replication allows you to recover with the least amount of data loss (or with no data loss at all). However, when (for example) a file is deleted accidentally, it gets removed from your synchronized copy as well! Thus, recovery from the continuously replicated copy is doomed to fail. In order to recover, you’ll need a saved copy of the file from a time before its deletion. In other words, a point-in-time copy. The file can be copied from the PiT copy to the fully synchronized target, resulting in an up-to-date and valid copy of the source, and operations can continue. Moreover, when an unknown set of files has been compromised, the organization may choose to recover directly from the PiT copy.
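To see why the replica alone cannot save you from a logical error, here is a small toy model in Python. The class `ProtectedVolume` and its methods are hypothetical, and for simplicity the sketch restores the file into the source and lets replication carry it over to the target; a real recovery procedure depends on your storage platform.

```python
import copy


class ProtectedVolume:
    """Toy model: a source, a continuously synchronized replica, and PiT copies."""

    def __init__(self, files):
        self.source = dict(files)
        self.replica = dict(files)  # continuous replication target
        self.pit_copies = {}        # label -> frozen copy of the source

    def _replicate(self):
        # Continuous replication mirrors every change, good or bad.
        self.replica = dict(self.source)

    def write(self, name, data):
        self.source[name] = data
        self._replicate()

    def delete(self, name):
        del self.source[name]
        self._replicate()

    def take_pit_copy(self, label):
        # A PiT copy is frozen at creation time and not updated afterwards.
        self.pit_copies[label] = copy.deepcopy(self.source)

    def restore_file(self, name, label):
        # Bring back a single file from the chosen PiT copy.
        self.write(name, self.pit_copies[label][name])


vol = ProtectedVolume({"report.doc": "v1", "budget.xls": "v7"})
vol.take_pit_copy("08:00")

vol.delete("report.doc")              # accidental deletion
print("report.doc" in vol.replica)    # False: the replica lost it too

vol.restore_file("report.doc", "08:00")
print(vol.source == vol.replica)      # True: both copies are valid again
```

Note that the restored file is only as fresh as the PiT copy it came from, so any changes made to it after 08:00 are gone. That is one reason to take PiT copies frequently.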
Now I know what you’re thinking – “Hey! I can do the same thing with my tape backups. I don’t need PiT copies”. That is true, but only to some extent. There are many advantages to having PiT copies in addition to backups, such as:
- PiT copies are significantly more available: they can be used immediately. Moreover, you can have PiT copies at every site.
- You can create PiT copies frequently, as often as needed (for example, every couple of hours).
- Retrieving files from backup is painfully slow (hours to days), which is hardly enough to meet enterprise-level recovery time objectives (RTO).
- In addition, there are obvious benefits to taking PiT copies and backing them up instead of backing up production directly... but that’s a whole different topic.
Any organization with strict RPO and RTO policies would be wise to maintain both continuous replication and PiT copies. Any other architecture has weak spots that may result in data loss or prolonged downtime in case of failover.