AvailabilityGuard for Microsoft Cluster Server


You’ve invested a great deal of time and effort in building highly available Microsoft Cluster Servers (MSCS clusters). But will those clusters work when you need them most? Are they guaranteed to fail-over flawlessly no matter what happens?

Let’s face it: it is difficult to keep your cluster configuration perfectly aligned with vendor best practices, and in sync with changes in the other IT layers that Microsoft Cluster Server (MSCS) interfaces with (such as the OS, storage, networking and more). Unfortunately, even a small misconfiguration or discrepancy between cluster nodes can lead to unsuccessful fail-overs and painful outages at the worst possible time.

Microsoft Cluster Server (MSCS) Configuration Alignment with Storage and Replication

On a daily basis, AvailabilityGuard verifies that your underlying storage devices are accessible and configured to provide equal levels of availability and service. With AvailabilityGuard, you can be confident that clusters will fail-over successfully, mount storage volumes and volume groups, and start applications – whether running on physical servers or virtual machines.

Here are a few sample issues that AvailabilityGuard detects:

  • Shared storage best practices
  • LUNs inaccessible to cluster nodes (local or remote nodes), or accessible to unauthorized hosts
  • SCSI-3 reservation best practice violations
  • Data misplaced on incorrect storage tier, or on un-shared volumes
  • Fabric single points of failure or masking/zoning misconfigurations that will cause fail-over to fail
  • And more.
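
As an illustration of the LUN accessibility check above, here is a minimal sketch in Python. It is not AvailabilityGuard's implementation; the function name, node names and LUN identifiers are hypothetical, and it assumes you already have an inventory of which LUNs each cluster node can see.

```python
from typing import Dict, List, Set


def find_lun_visibility_gaps(node_luns: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each node, list the LUNs that some other node sees but this node does not."""
    # Every LUN visible to at least one node in the cluster.
    all_luns: Set[str] = set().union(*node_luns.values())
    gaps: Dict[str, List[str]] = {}
    for node, luns in node_luns.items():
        missing = sorted(all_luns - luns)
        if missing:
            gaps[node] = missing
    return gaps


# Hypothetical inventory: node-b cannot see lun-02, so a fail-over to it could not mount that volume.
inventory = {
    "node-a": {"lun-01", "lun-02", "lun-03"},
    "node-b": {"lun-01", "lun-03"},
}
for node, missing in find_lun_visibility_gaps(inventory).items():
    print(f"{node} is missing: {', '.join(missing)}")
```

A node that appears in the result would be unable to bring the affected storage online after a fail-over.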

Microsoft Cluster Server (MSCS) Configuration Alignment with Server and Application level settings

AvailabilityGuard analyzes the configuration of the different components within the domain of the Microsoft Cluster Server (MSCS), including virtual machines, operating systems, volume groups, file systems, Microsoft SQL Server database files and more. AvailabilityGuard verifies that the cluster configuration and the settings of each of these components are aligned and well-orchestrated. Any mismatch may lead to failed switch-overs.

These are some of the areas that AvailabilityGuard automatically covers:

  • Cluster service group validation (e.g., make sure it does not contain unnecessary services, that quorum devices are not included, that MSDTC is configured correctly, etc.)
  • Specific analysis of MSCS resource and cluster states
  • Validate quorum disk best practices
  • Check for discrepancies in the ownership (“possible owners”) configuration of cluster nodes
  • Check that actual disk signatures match those defined in the cluster
  • Microsoft Cluster Server (MSCS) configuration (cluster components in bad state, DB parameters configuration,…)
  • Cluster and resource best practices (e.g., resources with hard dependencies that are missing pullup and/or hard start dependencies, start dependencies with type modifier syntax errors, inconsistent action scripts)
  • Host-level configuration (OS version, SP, patch, OS parameters, network configuration, …)
  • SQL Server Database Configuration best practices
  • Mismatch between OS mount configuration and cluster mount resource config
  • LVM mirroring
  • Existence of key directories/files as defined in resources
  • Resource-specific best practices (volume group, logical volume, file system, application, Service IP labels, Tape resources)
  • Server network configuration – NIC bonding, private and public network connections, etc.
  • VSCSI and NPIV guidelines for availability and data protection
  • And more.

Microsoft Cluster Server (MSCS) Node Alignment

Using an intelligent comparison engine, AvailabilityGuard helps the cluster administrator identify major differences between cluster nodes. Such inconsistencies often lead to unexpected behavior during and after a cluster fail-over.

AvailabilityGuard detects the following gaps in MSCS node alignment:

  • Differences in OS version, technology level, installed products, patches, user and group config, OS parameters, services, network options, configuration files, etc.
  • Differences in FC adapter settings, network adapters, time and NTP settings, etc.
  • Differences in multipath configuration – number of paths per hdisk, algorithm, queue depth, reserve policy and more.
  • Differences in WebSphere/WebLogic/Tomcat deployments (binaries, domains, Java, etc.)
  • And more.
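
The idea of a node comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's comparison engine: the function name, setting keys and values below are hypothetical, and each node's configuration is assumed to have been collected into a flat dictionary.

```python
from typing import Any, Dict


def diff_node_settings(nodes: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Return every setting whose value is not identical on all nodes.

    A setting that is absent on a node is reported with the value None.
    """
    keys = set()
    for settings in nodes.values():
        keys.update(settings)

    diffs: Dict[str, Dict[str, Any]] = {}
    for key in sorted(keys):
        values = {node: settings.get(key) for node, settings in nodes.items()}
        # Compare by repr so that unhashable values (lists, dicts) are handled too.
        if len({repr(v) for v in values.values()}) > 1:
            diffs[key] = values
    return diffs


# Hypothetical per-node settings gathered by some collection step.
cluster = {
    "node-a": {"os_version": "2012 R2", "ntp_server": "10.0.0.1", "queue_depth": 32},
    "node-b": {"os_version": "2012 R2", "ntp_server": "10.0.0.9", "queue_depth": 32},
}
for setting, per_node in diff_node_settings(cluster).items():
    print(setting, per_node)
```

Only settings that differ are reported, which keeps the output short enough for an administrator to review after each change.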

Microsoft Cluster Server (MSCS) Configuration Vulnerabilities

AvailabilityGuard analyzes the configuration of Microsoft Cluster Server (MSCS) itself, and verifies that it complies with Microsoft’s guidelines and with community-driven best practices. The analysis includes a comprehensive investigation of resource groups, resources, network interfaces, heartbeat management, and additional components.

Here are a few of the MSCS configuration checks that AvailabilityGuard performs:

  • Valid resource and resource dependency configuration
  • Network configuration best practices
  • Valid states for resource, group and systems
  • And more.
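
To make the resource dependency check more concrete, the following Python sketch flags two common problems: a dependency on a resource that is not defined, and a circular dependency. It is a simplified illustration, not AvailabilityGuard's actual analysis; the function name and the resource names are hypothetical, and the dependency map is assumed to have been exported from the cluster beforehand.

```python
from typing import Dict, List


def validate_dependencies(deps: Dict[str, List[str]]) -> List[str]:
    """Report undefined dependencies and dependency cycles in a resource map."""
    problems: List[str] = []

    # 1. Every dependency must refer to a defined resource.
    for resource, required in deps.items():
        for name in required:
            if name not in deps:
                problems.append(f"{resource} depends on undefined resource {name}")

    # 2. Depth-first search; reaching a resource that is still "in progress" means a cycle.
    NEW, IN_PROGRESS, DONE = 0, 1, 2
    state = {resource: NEW for resource in deps}

    def visit(resource: str, path: List[str]) -> None:
        state[resource] = IN_PROGRESS
        for name in deps[resource]:
            if name not in deps:
                continue  # already reported above
            if state[name] == IN_PROGRESS:
                cycle = path[path.index(name):] + [name]
                problems.append("dependency cycle: " + " -> ".join(cycle))
            elif state[name] == NEW:
                visit(name, path + [name])
        state[resource] = DONE

    for resource in deps:
        if state[resource] == NEW:
            visit(resource, [resource])
    return problems


# Hypothetical resource group: "Disk S:" is referenced but never defined.
group = {
    "SQL Server": ["SQL Network Name", "Disk S:"],
    "SQL Network Name": ["SQL IP Address"],
    "SQL IP Address": [],
}
for problem in validate_dependencies(group):
    print(problem)
```

An empty result means that every dependency resolves to a defined resource and that the resources can be started in a consistent order.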