A Common Risk Identified in Remote Mirroring Configuration
Remember our LVM mirroring article from a few weeks ago? This time I’d like to take a closer look at one of the potential risks that were described.
The risk signature:
Incorrect Mirror Configuration for DR
In any event that requires recovering data from the DR site:
- Recovery will not be possible
- Data will be lost
- The RPO SLA will be breached
- Downtime will be extended and the RTO SLA will be violated
In this scenario, the customer uses mirroring at the LVM (Logical Volume Management) level to create a synchronous copy of the database at the DR site.
The source data is stored on SAN volumes at the production site, while the mirror is supposed to be stored on SAN volumes at the DR site. However, the configuration is erroneous: part of the mirrored data is stored on volumes from the production SAN array (see Image 1: Incorrect Mirror Configuration). In the event of a disaster, no complete copy of the database will be available at the DR site, so recovery from the mirror will not be possible. The database will have to be restored from a recent backup, a process that involves data loss, an RPO violation and, because restoring from tape is slow, prolonged downtime and an RTO SLA violation.
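To make the misconfiguration concrete, here is a minimal Python sketch of the check involved. It is not how RecoverGuard or any LVM tool works internally. The device names, the `PV_SITE` inventory and the `misplaced_pvs` helper are all hypothetical, and in practice the leg-to-volume mapping would have to be gathered from the volume manager and the SAN arrays.

```python
from typing import Dict, List

# Hypothetical inventory: the site of the SAN array behind each physical volume.
PV_SITE: Dict[str, str] = {
    "/dev/mapper/prod_lun1": "production",
    "/dev/mapper/prod_lun2": "production",
    "/dev/mapper/dr_lun1": "dr",
}


def misplaced_pvs(
    mirror_leg_pvs: List[str],
    pv_site: Dict[str, str],
    expected_site: str = "dr",
) -> List[str]:
    """Return the PVs of a mirror leg that are not located at the expected site.

    PVs missing from the inventory are also reported, since their location
    cannot be confirmed.
    """
    return [pv for pv in mirror_leg_pvs if pv_site.get(pv) != expected_site]


# The DR mirror leg in the faulty scenario: one volume really is at the DR
# site, the other was taken from the production array by mistake.
dr_leg = ["/dev/mapper/dr_lun1", "/dev/mapper/prod_lun2"]

problems = misplaced_pvs(dr_leg, PV_SITE)
if problems:
    print("DR mirror is incomplete; volumes at the wrong site:", problems)
```

An empty result means every volume behind the DR leg is at the DR site. Any entry in the list means that a disaster at the production site would take part of the "remote" copy down with it.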
Can it happen to me?
Yes, for various reasons.
First, configuration errors are inevitable in an enterprise datacenter environment, which involves thousands of configuration entities such as arrays, disks, physical volumes, logical volumes and so on.
Moreover, such a vulnerability would go unnoticed until recovery is needed, since the mirrored copy is not put to use on a regular basis. Last, configuration drift accumulates over time. Even if the environment was set up correctly in the past, any change applied to it may endanger the validity of the DR solution. For instance, expanding the database to new file systems and/or SAN volumes may break the DR mirror if the implementer does not take into account the intentions of the original design and its complexity.
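The drift scenario can be illustrated the same way. The sketch below assumes a simplified, hypothetical model in which each logical volume maps to a list of mirror legs, and each leg is described by the set of sites its physical volumes live on. The volume names and the `unprotected_volumes` helper are made up for illustration.

```python
from typing import Dict, List, Set

# Hypothetical layout model: logical volume name -> list of mirror legs,
# where each leg is the set of sites hosting that leg's physical volumes.
Layout = Dict[str, List[Set[str]]]


def unprotected_volumes(layout: Layout, dr_site: str = "dr") -> List[str]:
    """Return logical volumes that have no mirror leg stored entirely at the DR site."""
    return sorted(
        lv
        for lv, legs in layout.items()
        if not any(leg == {dr_site} for leg in legs)
    )


# Original design: the database volume has one leg per site.
before: Layout = {
    "lv_data01": [{"production"}, {"dr"}],
}

# After an expansion: a new volume was added without a DR leg.
after: Layout = {
    "lv_data01": [{"production"}, {"dr"}],
    "lv_data02": [{"production"}],
}

print(unprotected_volumes(before))  # []
print(unprotected_volumes(after))   # ['lv_data02']
```

Running a check like this after every storage change, rather than only at design time, is what catches a mirror that was valid once and has since drifted.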
Think your data centers may have hidden recoverability and downtime risks such as this one? Find out with the risk-free 48-hour RecoverGuard pilot scan.