The signature
Configuration drifts between HA cluster nodes
The impact
This will vary depending upon the specific drift, but can include a failure to switch-over/fail-over/switch-over to other node (causing downtime), or reduced performance after fail-over/switch-over which will, at best, create an operations slowdown and at worst leave the node unable to carry the load
Technical details
In this example, the passive node does not have redundancy in the HBA level nor in the DNS configuration. The currently active node is configured with redundancy for these elements. A single HBA/DNS server configuration is a single point of failure. Upon fail-over/switch-over to the currently passive node, the applications running on this cluster will suffer from reduced availability/MTBF and more downtime. In addition, the passive node is configured with significantly less maximum allowed open files, which may lead to application failures. Moreover, the passive node has only 1GB of swap while the active node was configured with additional 4GB. Upon fail-over, the applications may not have sufficient memory to run properly. Lastly, differences in installed products may have various impacts, depending on the product type.
Can it happen to me?
This situation occurs frequently in HA environments. The configuration of a host involves so many details that is it very difficult to ensure an HA server is fully synchronized to its production host at all times.
Configuration Drifts between Production and HA
Result: Downtime; manual intervention needed to recover