Welcome to the EMC Support Community Ask the Expert conversation. On this occasion we will be covering EMC Cluster Enabler multi-site clustering with Windows Server Failover Cluster. Among the many areas we will be discussing, our experts will answer your questions regarding best practices, supported configurations and issues with multi-site clusters, as well as any other Windows Server Failover Cluster topics with EMC storage.
Edwin VanMierlo is a 20-year IT veteran, a Consultant Engineer working in EMC Customer Service, specializing in Windows Failover Clustering connected to EMC storage. Edwin has been working with Windows Clustering since early 1997. Since 2004 he has been focusing on Geographically Dispersed Clusters for large enterprises, and has been working on some of the most complex Cluster installations around the world. Edwin is a proven Subject Matter Expert in Windows Server Failover Clustering, EMC Cluster Enabler for SRDF, EMC Cluster Enabler for MirrorView, EMC Cluster Enabler for Recoverpoint, Multi Site Clustering in general, and many other EMC infrastructure hardware and software products. As these products are multi-layered inter-developed products, it is important that in critical customer environments they are viewed as the sum of their parts rather than individual components. This is Edwin’s speciality.
Edwin is also a public speaker, often on invitation by Microsoft themselves. He has received the Microsoft MVP for Failover Clustering 7 years in a row, and is Moderator on the Storage and Cluster forums from Microsoft.
This discussion begins on April 9 and concludes on May 2. Get ready by following this page to receive updates in your activity stream or through email.
Share this event on Twitter:
>> Join the next Ask the Expert: EMC Cluster Enabler multi-site clustering. http://bit.ly/1jLn8wq 4/9 - 5/2 #EMCATE <<
This Ask the Expert discussion is now open for questions. We look forward to a lively and interactive fun discussion on EMC Cluster Enabler.
I can answer questions regarding the "Cluster Enabler for Recoverpoint" software. This is the software that enables Windows Failover Cluster to run across multiple sites using Recoverpoint as the replication method.
If your questions are about the "Recoverpoint feature set" then I would advise you to post in the Replication forum, which is found here: https://community.emc.com/community/support/replication
Question is about "Cluster Enabler for Recoverpoint" - could you explain the main differences/benefits of this over VPLEX?
An excellent question, with many possible answers. Neither solution is better or worse than the other, hence the answer is not as straightforward as one might think.
Both RP/CE and VPLEX are solutions which will allow you to build a multi-site cluster with Windows Failover Clustering, and provide you with High Availability as well as Disaster recovery.
In regards to multi-site clustering there is not much of a difference, and one should look at the added feature set each of the products will bring to your infrastructure and capabilities. Once you have decided which of the two, Recoverpoint or VPLEX, will suit your business and technical needs, it is simple to create a multi-site cluster on top of that solution.
There aren't many technical differences in regards to multi-site clustering between the two solutions, except the method by which the solution makes the disk devices ready on the particular site.
With RP/CE, a custom resource is added to the cluster group in Windows Failover Clustering, and the disk resources are made dependent on this custom resource. This has the effect that the custom resource must come online first, before cluster brings the disk resources online. This custom resource has an API to the Recoverpoint appliances/cluster to give image access on the site where cluster wants to come online. All failover logic and decisions are still made by cluster, not by RP/CE.
With VPLEX, no additional resource is added, and it relies purely on cluster failover logic. However, the VPLEX witness (or predetermined rules) will make the disks available or not available.
So if cluster wants to come online on a site where there is a failure, with RP/CE the custom resource will fail (and cluster will move to another node), with VPLEX the actual disk resource will fail (and cluster will move to another node). The control of this in RP/CE is in the CE software on the host, the control of this in VPLEX is in VPLEX infrastructure.
As you can see, the technical difference in operating a multi-site cluster with either VPLEX or RP/CE is minor.
You really need to look at which solution and feature set is applicable to your needs. This determines VPLEX or Recoverpoint; the step to multi-site clustering is simple after that.
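To illustrate the point about where a failed online attempt surfaces, here is a minimal Python sketch. It is a toy model only, not the Cluster Enabler, VPLEX or Windows Failover Clustering API. The names `Resource`, `bring_group_online`, `CE-ResA`, `DiskA` and `AppA` are hypothetical, and the dependency of the application on the disk is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    depends_on: list = field(default_factory=list)
    # False models "cannot come online on this site", e.g. image access
    # not granted (RP/CE) or disk not made available (VPLEX).
    healthy: bool = True


def bring_group_online(resources):
    """Bring resources online in dependency order.

    Returns (online_names, failed_name); failed_name is None on success.
    """
    online = []
    pending = list(resources)
    while pending:
        # A resource is ready once everything it depends on is online.
        ready = [r for r in pending if all(d in online for d in r.depends_on)]
        if not ready:
            raise ValueError("circular or unsatisfiable dependency")
        for r in ready:
            if not r.healthy:
                return online, r.name  # cluster would now try another node
            online.append(r.name)
            pending.remove(r)
    return online, None


# RP/CE: the disk depends on the custom resource, so the custom resource fails first.
rpce_group = [
    Resource("AppA", depends_on=["DiskA"]),
    Resource("DiskA", depends_on=["CE-ResA"]),
    Resource("CE-ResA", healthy=False),
]

# VPLEX: no custom resource, so the disk resource itself fails.
vplex_group = [
    Resource("AppA", depends_on=["DiskA"]),
    Resource("DiskA", healthy=False),
]

rpce_result = bring_group_online(rpce_group)
vplex_result = bring_group_online(vplex_group)
```

In both cases nothing comes online and cluster moves the group elsewhere; the only difference in this model is the name of the resource that reports the failure.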
I can see then in general terms (no one ring to rule them all - sorry, no one solution to fit all) that RecoverPoint is for an Active/Passive site configuration, whereas VPLEX can do Active/Active (in the case of a VMware cluster). Not sure if this would work with Windows?
The problem is the LUN/Volume locking done by Windows. VMware only locks the files in use for the VMs.
Is this a correct assumption? I don't want this to be MS vs VMware - I am only interested in the technical aspects of how this works.
You bring up an excellent topic:
Active/Active versus Active/Passive
Let's come clean here: we should no longer use these terms!
There are no Active/Active clusters, and there are no Active/Passive clusters. The last one I saw was in the previous century!
These terms were really introduced with SQL Server version 7, when running on NT4.
You had to choose between an Active/Active or Active/Passive setup when installing. Unfortunately the terms became popular and are still used today. We should really steer clear of them when describing modern clusters.
I would prefer to call it something like "a 4 node cluster, running 3 application groups"
or a "2-by-2 node multi-site cluster, running 6 roles"
Now; to answer your question:
Both RP/CE and VPLEX can run applications on both sides of the cluster, therefore both are "Active", "Serving" and "running applications". Where the confusion usually comes in is where the disks are online. In Microsoft Clustering the disk is online on one node and one node only. For RP/CE installs it might be quite clear: the disk (image access) is given by Recoverpoint at the site of the node which has the disk online. For VPLEX it might be different, because VPLEX is a virtualization layer on top of your storage. Where the disk is online is not that important; it is, however, "presented" to the site of the node which has the disk online.
Both solutions will allow you to run applications on both sites/sides of the cluster simultaneously.
Hopefully this answers your question.
I agree with the definition of Active/Passive - I was not talking from the cluster point of view but more from a site/application/service view.
Thanks for the explanation. This leads to my next question...
simple example: Site A with Node A. Site B with Node B
Two Apps - AppA and AppB each with a disk DiskA and DiskB
So how would RecoverPoint be set up? I'm assuming that there will be two Consistency Groups, with an EMC Cluster resource for each disk? So EMC Cluster Enabler ResA, DiskA depends on this, and then AppA - same for AppB?
Apologies for the second reply, I just noticed I forgot to comment on:
"The problem is the LUN/Volume locking done by Windows"
You are probably referring to the Persistent Reservations which are used in Windows Failover Clustering. Let me tell you, this is not a problem; however, they are dealt with differently in a Cluster Enabler cluster and a VPLEX cluster.
The difference is "replication" versus "presentation".
In a Cluster Enabler setup, such as Cluster Enabler for Recoverpoint, the Persistent reservations are NOT replicated between sites. This however is not a problem in regards to the data disks for your applications running in the cluster. The Recoverpoint appliances are managing where the disks are accessible, hence a disk cannot come online if the image access is not granted. This is done by the custom resource as described in a previous post.
Where this does make a difference is with the Witness Disk in Cluster (the old term would be Quorum disk). As we are not replicating the Persistent Reservation, the Cluster Enabler for Recoverpoint software cannot support a Quorum model with Witness Disk. We advise using a Quorum model without Witness, or with a Fileshare Witness. This is in line with the Microsoft recommendation for multi-site clustering.
In a VPLEX setup, the disk is "presented" to the local cluster node, so the Persistent Reservation is on the actual disk. This makes it possible to choose a Quorum model with Witness Disk in a multi-site cluster on VPLEX. However, we still maintain the recommendation, same as Microsoft's, to use a Fileshare Witness in multi-site clustering.
For completeness, I must add that the Cluster Enabler for SRDF (SRDF/CE) also supports a witness disk, whereas Cluster Enabler for MirrorView (MV/CE) is like RP/CE and does not support a witness disk.
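The witness options described above can be summarized in a small lookup, sketched here in Python. This only restates what is said in this thread; the names `WITNESS_DISK_SUPPORTED`, `RECOMMENDED` and `quorum_options` are hypothetical and do not correspond to any EMC or Microsoft tool.

```python
# Whether a Quorum model with Witness Disk is possible, per the posts above.
WITNESS_DISK_SUPPORTED = {
    "SRDF/CE": True,
    "MV/CE": False,
    "RP/CE": False,
    "VPLEX": True,
}

# Recommended for multi-site clustering regardless of the solution.
RECOMMENDED = "fileshare witness"


def quorum_options(solution):
    """Return the witness choices available for a given solution."""
    options = ["no witness", "fileshare witness"]
    if WITNESS_DISK_SUPPORTED[solution]:
        options.append("witness disk")
    return options


rpce_options = quorum_options("RP/CE")
vplex_options = quorum_options("VPLEX")
```

Note that the recommended choice is available for every solution in the table, which is why the witness disk limitation of RP/CE and MV/CE rarely matters in practice.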
So much for the response to your previous question.
Now the response to your latest post:
You are absolutely right; Cluster Enabler for Recoverpoint will manage this in RP-Consistency groups, and you do have a custom resource in each of your Application Groups in cluster.
However, it will not have a custom resource for each disk; there is one custom resource for the group. So if your application uses more than one disk, then you will have multiple disks in the cluster group, all dependent on one custom resource. This custom resource will have all the info of the Recoverpoint consistency group (which in turn has multiple disks).
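As a rough illustration of the layout for the two-application example, here is a small Python sketch. It models the cluster groups as plain dictionaries; it is not how Cluster Enabler stores its configuration. The function `build_group` and the names `CG_A`, `CG_B` and `DiskA2` are hypothetical, and the application depending on its disks is taken from the question rather than from product documentation.

```python
def build_group(app, consistency_group, disks):
    """Model one application group: one custom resource, many disks."""
    ce_resource = f"CE resource ({consistency_group})"
    dependencies = {disk: [ce_resource] for disk in disks}
    dependencies[app] = list(disks)  # assumption: the app depends on its disks
    return {
        "custom_resource": ce_resource,
        "consistency_group": consistency_group,
        "dependencies": dependencies,
    }


# One consistency group and one custom resource per application group.
group_a = build_group("AppA", "CG_A", ["DiskA"])
group_b = build_group("AppB", "CG_B", ["DiskB"])

# An application with two disks still gets a single custom resource.
group_a_two_disks = build_group("AppA", "CG_A", ["DiskA", "DiskA2"])
```

In `group_a_two_disks` both disks list the same custom resource as their only dependency, which mirrors the "one custom resource for the group" rule.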