Networking two-minute drill

Hello Folks,

I have always felt that in Windows Failover Clusters the least understood component is the networking model, so I thought I would share some basics that you may find helpful. A common question is why it is recommended to use two networks in a cluster. The point to notice is that I said networks, not interfaces 🙂

Let’s go ahead & dig into that.

  • As we know, a cluster uses a health-check mechanism to ensure that the nodes are up & reachable. It’s somewhat equivalent to the keep-alive packets that routers use to maintain paths or routes.
  • The heartbeat network is required so that nodes can exchange heartbeat packets. Nodes should never lose connection with their peer nodes under any circumstances.
  • When we talk about clusters, we always emphasize two different networks (subnets): Public & Internal (or Private).
  • Failover of resources is a different concept, & yes, I agree that a failure of the Public network will trigger a failover of resources.
  1. In a cluster, network priority is based on the metric value & the role assigned to each network. The Internal network always has the lowest metric value, which makes it the most preferred network for heartbeat communication among the nodes.
  2. In 2008 clusters, the Public network (the network with a gateway) is automatically used for “Cluster internal communication” as well as for “client access”. For fault tolerance, the recommendation is to dedicate the Internal network to the cluster health-check mechanism (heartbeat).
  3. In some environments the network used for cluster internal communication sits on a separate subnet but still needs a gateway so that the IPs can communicate. Because the cluster sees the gateway IP, it assumes this subnet is also for public access, which is why we have to set it manually as an “Internal network”. Otherwise the cluster networking APIs are smart enough to recognize a network as Public (with a gateway) or Private (without a gateway IP).

Role 1 = Internal
Role 3 = Public
Role 0 = Not in Use
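To make the gateway rule and the manual override a bit more concrete, here is a minimal Python sketch. It is only a model of the behaviour described above, not the cluster's actual API: the `ClusterNetwork` class, the `manual_role` field, the network names & the IP addresses are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Role values as listed above.
ROLE_NOT_IN_USE = 0
ROLE_INTERNAL = 1
ROLE_PUBLIC = 3


@dataclass
class ClusterNetwork:
    """Hypothetical model of one cluster network (not a real cluster API)."""
    name: str
    gateway: Optional[str] = None      # default gateway IP, or None if absent
    manual_role: Optional[int] = None  # role set by hand by the administrator


def effective_role(net: ClusterNetwork) -> int:
    """Apply the rule from the text: a manual setting wins, otherwise a
    gateway means Public and no gateway means Internal."""
    if net.manual_role is not None:
        return net.manual_role
    return ROLE_PUBLIC if net.gateway else ROLE_INTERNAL


if __name__ == "__main__":
    networks = [
        ClusterNetwork("Public", gateway="10.0.0.1"),
        ClusterNetwork("Heartbeat"),
        # Routed heartbeat subnet: has a gateway, so it is set to Internal by hand.
        ClusterNetwork("RoutedHeartbeat", gateway="192.168.50.1",
                       manual_role=ROLE_INTERNAL),
    ]
    for net in networks:
        print(net.name, "-> role", effective_role(net))
```

The third network is the case from point 3: without the manual setting it would be treated as Public because of its gateway.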

 
So for the heartbeat the preferred path is our Internal network, & if this network fails, the cluster nodes can still exchange keep-alive packets through the Public network. The Internal network provides fault tolerance, so in case of network failures our nodes should not lose connectivity with each other.
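The path selection described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how netft.sys is actually implemented: the `ClusterNetwork` class & the `heartbeat_path` function are hypothetical, and the metric values are just illustrative numbers where lower means more preferred.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterNetwork:
    """Hypothetical model of one cluster network."""
    name: str
    role: int        # 0 = not in use, 1 = internal, 3 = public
    metric: int      # lower value = more preferred (illustrative numbers)
    up: bool = True  # whether the network is currently reachable


def heartbeat_path(networks: List[ClusterNetwork]) -> Optional[ClusterNetwork]:
    """Pick the lowest-metric network that is up and enabled for cluster
    communication (role 1 or 3). Returns None if no path is left."""
    candidates = [n for n in networks if n.up and n.role in (1, 3)]
    return min(candidates, key=lambda n: n.metric, default=None)


if __name__ == "__main__":
    internal = ClusterNetwork("Internal", role=1, metric=1000)
    public = ClusterNetwork("Public", role=3, metric=10000)

    print(heartbeat_path([internal, public]).name)  # Internal is preferred

    internal.up = False  # simulate a failure of the Internal network
    print(heartbeat_path([internal, public]).name)  # heartbeat falls back to Public
```

With both networks up the Internal network wins on metric. Once it is marked down, the Public network is the only remaining candidate, which is the fault tolerance we get from having two networks.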

Let’s take two scenarios:

1) We have Internal & Public networks configured

If the Internal network fails, the cluster will still survive because the heartbeat packets can use the alternative route (the Public network). The cluster will not be affected, because nodes & clients can use the Public network.

Now if the Public network goes down or becomes unreachable on a node, the cluster will fail over the resources from the problematic node. However, the Internal network will maintain the path between the nodes so that heartbeat packets can be exchanged & the nodes will stay up.

2) We have a single network for heartbeat & public access:

In this scenario, if the Public network goes down on any node, the cluster first stops exchanging heartbeat packets & goes into a partitioned state. As a result even failover will not occur, because the nodes can no longer communicate with each other.
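The two scenarios can be put side by side with a small Python sketch. It only encodes the reasoning from this post (heartbeat needs any network with role 1 or 3, client access needs a role 3 network). The `node_outcome` function, the outcome labels & the network names are hypothetical, and a real cluster takes more into account (quorum, for example).

```python
from typing import Dict, Set


def node_outcome(roles_by_network: Dict[str, int], failed: Set[str]) -> str:
    """Return what happens on a node when the networks in `failed` go down.

    roles_by_network maps a network name to its role
    (0 = not in use, 1 = internal, 3 = public).
    """
    up = {name: role for name, role in roles_by_network.items()
          if name not in failed}
    heartbeat_ok = any(role in (1, 3) for role in up.values())
    client_ok = any(role == 3 for role in up.values())

    if not heartbeat_ok:
        # Nodes cannot talk to each other, so failover cannot be coordinated.
        return "partitioned"
    if not client_ok:
        # Heartbeat still flows, so resources can move to a healthy node.
        return "failover"
    return "healthy"


if __name__ == "__main__":
    two_networks = {"Internal": 1, "Public": 3}
    single_network = {"Public": 3}

    print(node_outcome(two_networks, {"Internal"}))  # scenario 1, Internal fails
    print(node_outcome(two_networks, {"Public"}))    # scenario 1, Public fails
    print(node_outcome(single_network, {"Public"}))  # scenario 2, only network fails
```

In this model the three calls give "healthy", "failover" & "partitioned" respectively, which matches the walkthrough of the two scenarios.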

  1. On cluster nodes, when we run ipconfig /all we see a “Microsoft Failover Cluster Virtual Adapter”. It’s a pseudo adapter that the cluster network driver (netft.sys) uses to manage the physical networks & routes based on their priority.
  2. This adapter never participates in actual packet routing. It is always assigned an APIPA address.
    This adapter works in tandem with the cluster service, so if it goes down, the networks will go into a failed state in the cluster.
  3. So if you have only one network & it goes down, the cluster node will definitely go down.

 

Please refer to these links to learn more about cluster networks & the networking model.
What is a Microsoft Failover Cluster Virtual Adapter anyway?

Configuring Windows Failover Cluster Networks

Cluster Networks fundamentals

Windows Server 2008 Failover Clusters: Networking (Part 1)

Windows Server 2008 Failover Clusters: Networking (Part 2)

Windows Server 2008 Failover Clusters: Networking (Part 3)

Windows Server 2008 Failover Clusters: Networking (Part 4)

 

Configuring Windows Failover Cluster Networks

http://blogs.technet.com/b/askcore/archive/2014/02/20/configuring-windows-failover-cluster-networks.aspx

 

~ Cheers