Another Hidden Downtime Risk That Can Come Back to Bite You

IT Resilience & Downtime Prevention Blog

by Yaniv Valik on January 17, 2013

Today’s Topic: Cluster Shared SAN Configuration Drift

The most common way to share data between cluster nodes is through multi-homed SAN storage. Access to the SAN volumes is inconsistent when one or more shared volumes are not mapped to every node in the cluster.

Sharing is intended to guarantee immediate data availability in case of a failover, but inconsistent mapping might put failover in jeopardy. 

Why Does It Happen?

The initial configuration of a cluster is typically correct. However, routine configuration changes, such as adding a new storage volume or extending the cluster to additional nodes, can gradually cause configuration drift that leaves one or more shared volumes unmapped on some of the nodes.

What Is the Impact?

In the event of a cluster failover to the passive node, data stored on an unmapped volume will not be available, leading to downtime of any application that requires access to a database or files stored on that volume.
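
To make the impact concrete, here is a minimal sketch of how you might estimate which applications would go down after a failover. It assumes you already know which volumes each application depends on and which volumes are mapped to the failover target. The function name `apps_at_risk` and the application and volume names are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def apps_at_risk(
    app_volumes: Dict[str, Set[str]],
    target_node_volumes: Set[str],
) -> Dict[str, List[str]]:
    """Map each application to the volumes it needs but the target node lacks.

    app_volumes: application name -> volumes it depends on (assumed input).
    target_node_volumes: volumes mapped to the node that would take over.
    """
    at_risk: Dict[str, List[str]] = {}
    for app, needed in app_volumes.items():
        # Any required volume not mapped to the target node is unreachable
        # after failover.
        unavailable = sorted(needed - target_node_volumes)
        if unavailable:
            at_risk[app] = unavailable
    return at_risk


if __name__ == "__main__":
    # Hypothetical dependencies and passive-node mappings.
    dependencies = {
        "billing-db": {"vol01", "vol03"},
        "file-share": {"vol02"},
    }
    passive_node = {"vol01", "vol02"}
    for app, volumes in apps_at_risk(dependencies, passive_node).items():
        print(f"{app} would lose access to: {', '.join(volumes)}")
```

In this example only `billing-db` is reported, because `vol03` is not mapped to the passive node.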

How Can It Be Avoided?

There are multiple ways to minimize the risk of such configuration drift:

  1. Documentation: Put in place clear and well-documented procedures for any changes introduced to the cluster configuration.
  2. Training: Conduct periodic training for all involved personnel to review possible availability risks introduced by production environment modifications.
  3. Automation: Implement automated auditing of your high availability environment to ensure passive node configuration is always consistent with active node configuration.
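
As an illustration of the automation option, the sketch below compares the volume mappings of all cluster nodes and reports any node that is missing a volume another node can see. It is a simplified example, not a description of any particular product. It assumes the per-node volume inventory has already been collected (for instance from your storage array or from each host), and the function name `find_mapping_drift` and the node and volume names are hypothetical.

```python
from typing import Dict, List, Set


def find_mapping_drift(node_volumes: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per node, the shared volumes that other nodes see but it does not.

    node_volumes: node name -> set of volume identifiers mapped to that node
    (assumed to be gathered beforehand by some inventory step).
    """
    # Assumption: every volume mapped to at least one node is meant to be
    # shared by all nodes in the cluster.
    all_volumes: Set[str] = set().union(*node_volumes.values())

    drift: Dict[str, List[str]] = {}
    for node, volumes in node_volumes.items():
        missing = sorted(all_volumes - volumes)
        if missing:
            drift[node] = missing
    return drift


if __name__ == "__main__":
    # Hypothetical inventory: vol03 was added to node-a but never mapped to node-b.
    inventory = {
        "node-a": {"vol01", "vol02", "vol03"},
        "node-b": {"vol01", "vol02"},
    }
    for node, missing in find_mapping_drift(inventory).items():
        print(f"{node} is missing: {', '.join(missing)}")
```

Run on a schedule, a check like this can flag drift shortly after a change introduces it, rather than at failover time. The assumption that every volume should be visible to every node may not hold for clusters that deliberately keep some volumes local, so an exclusion list may be needed in practice.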


Yaniv Valik
VP Product Management & Customer Success at Continuity Software
