Ask the Expert: EMC Cluster Enabler multi-site clustering
Welcome to the EMC Support Community Ask the Expert conversation. On this occasion we will be covering EMC Cluster Enabler multi-site clustering with Windows Server Failover Cluster. Among the many areas we will be discussing, our experts will answer your questions regarding best practices, supported configurations, and issues with multi-site clusters, as well as any other Windows Server Failover Cluster topics with EMC storage.
Your host:
Edwin VanMierlo is a 20-year IT veteran, a Consultant Engineer working in EMC Customer Service, specializing in Windows Failover Clustering connected to EMC storage. Edwin has been working with Windows Clustering since early 1997. Since 2004 he has been focusing on Geographically Dispersed Clusters for large enterprises, and has been working on some of the most complex cluster installations around the world. Edwin is a proven Subject Matter Expert in Windows Server Failover Clustering, EMC Cluster Enabler for SRDF, EMC Cluster Enabler for MirrorView, EMC Cluster Enabler for RecoverPoint, multi-site clustering in general, and many other EMC infrastructure hardware and software products. As these products are multi-layered, inter-developed products, it is important that in critical customer environments they are viewed as the sum of their parts rather than individual components. This is Edwin’s speciality.
Edwin is also a public speaker, often at the invitation of Microsoft themselves. He has received the Microsoft MVP award for Failover Clustering 7 years in a row, and is a moderator on the Microsoft Storage and Cluster forums.
This discussion begins on April 9 and concludes on May 2. Get ready by following this page to receive updates in your activity stream or through email.
Share this event on Twitter:
>> Join the next Ask the Expert: EMC Cluster Enabler multi-site clustering. http://bit.ly/1jLn8wq 4/9 - 5/2 #EMCATE <<
Mabro1
666 Posts
0
April 9th, 2014 01:00
This Ask the Expert discussion is now open for questions. We look forward to a lively and interactive fun discussion on EMC Cluster Enabler.
pmorris3
24 Posts
0
April 9th, 2014 01:00
Hi
To make sure I'm looking at the right tech - this is about the RecoverPoint Product set?
EdwinVanMierlo
39 Posts
0
April 9th, 2014 02:00
Hi pmorris,
I can answer questions regarding the "Cluster Enabler for RecoverPoint" software. This is the software that enables Windows Failover Cluster to run across multiple sites using RecoverPoint as the replication method.
If your questions are about the "RecoverPoint feature set" then I would advise you to post in the Replication forum, which is found here: https://community.emc.com/community/support/replication
HTH,
Edwin.
pmorris3
24 Posts
0
April 9th, 2014 03:00
Hi
I can see then, in general terms (no one ring to rule them all, sorry, no one solution to fit all), that RecoverPoint is for an Active/Passive site configuration, whereas VPLEX can do Active/Active (in the case of a VMware cluster). I'm not sure if this would work with Windows?
The problem is the LUN/Volume locking done by Windows. VMware only locks the files in use for the VMs.
Is this a correct assumption? I don't want this to be MS vs VMware - I'm only interested in the technical aspects of how this works.
Phil
EdwinVanMierlo
39 Posts
0
April 9th, 2014 03:00
Hi Phil,
An excellent question, with many possible answers. Neither solution is better or worse than the other, hence the answer is not as straightforward as one might think.
Both RP/CE and VPLEX are solutions which will allow you to build a multi-site cluster with Windows Failover Clustering, and provide you with High Availability as well as Disaster recovery.
In regards to multi-site clustering there is not much of a difference, and one should look at the added feature set each of the products will bring to your infrastructure and capabilities. Once you have decided which of the two, RecoverPoint or VPLEX, will suit your business and technical needs, it is simple to create a multi-site cluster on top of that solution.
There aren't many technical differences between the two solutions in regards to multi-site clustering, except the method by which each solution makes the disk devices ready on a particular site.
With RP/CE, a custom resource is added to the cluster group in Windows Failover Clustering, and the disk resources are made dependent on this custom resource. This has the effect that the custom resource must come online first, before cluster brings the disk resources online. This custom resource has an API to the RecoverPoint appliances/cluster to give image access on the site where cluster wants to come online. All failover logic and decisions are still made by cluster, not by RP/CE.
With VPLEX; no additional resource is added, and it purely relies on cluster failover logic. However the VPLEX witness (or predetermined rules) will make the disks available or not available.
So if cluster wants to come online on a site where there is a failure, with RP/CE the custom resource will fail (and cluster will move to another node), with VPLEX the actual disk resource will fail (and cluster will move to another node). The control of this in RP/CE is in the CE software on the host, the control of this in VPLEX is in VPLEX infrastructure.
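To make the difference in "which resource fails" concrete, here is a minimal Python sketch of the behaviour described above. It is only a toy model under stated assumptions, not Cluster Enabler, RecoverPoint, or VPLEX code: the function `bring_group_online`, the `site_ok` flag, and the node and site names are all invented for illustration.

```python
# Toy model only: not Cluster Enabler, RecoverPoint, or VPLEX code.
# All names (nodes, sites, functions) are hypothetical.

def bring_group_online(solution, nodes, site_ok):
    """Try each node in order, as the cluster would, and record what failed.

    solution: "RP/CE" or "VPLEX"
    nodes:    list of (node_name, site_name) in the order the cluster tries them
    site_ok:  dict site_name -> True if storage can be made available there
    Returns (owner_node or None, list of (node, failed_resource)).
    """
    failures = []
    for node, site in nodes:
        if solution == "RP/CE":
            # The custom resource comes online first and asks for image access.
            if not site_ok[site]:
                failures.append((node, "custom resource"))
                continue  # the cluster moves the group to another node
            # The disk resources depend on the custom resource and follow it.
        elif solution == "VPLEX":
            # No extra resource: the disk resource itself fails when the
            # witness (or predetermined rules) keeps the disk unavailable.
            if not site_ok[site]:
                failures.append((node, "disk resource"))
                continue
        else:
            raise ValueError(f"unknown solution: {solution}")
        return node, failures
    return None, failures


nodes = [("NodeA", "SiteA"), ("NodeB", "SiteB")]
site_ok = {"SiteA": False, "SiteB": True}  # assume a storage failure at Site A

for solution in ("RP/CE", "VPLEX"):
    owner, failures = bring_group_online(solution, nodes, site_ok)
    print(solution, "-> owner:", owner, "| failed first:", failures)
```

In both cases the group ends up on the node at the healthy site; the only difference in this model is which resource reports the failure, which mirrors the point that the decision to move is still made by cluster.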
As you can see, the technical differences in operating a multi-site cluster with either VPLEX or RP/CE are minor.
You really need to look at which solution and feature set is applicable to your needs. This determines the choice between VPLEX and RecoverPoint; the step to multi-site clustering is simple after that.
Rgds,
Edwin.
pmorris3
24 Posts
0
April 9th, 2014 03:00
Hi Edwin
Question is about "Cluster Enabler for Recoverpoint" - could you explain the main differences/benefits of this over VPLEX?
Phil
EdwinVanMierlo
39 Posts
0
April 9th, 2014 04:00
Hi Phil,
You bring up an excellent topic:
Active/Active versus Active/Passive
Let's come clean here: we should no longer use these terms!
There are no Active/Active clusters, and there are no Active/Passive clusters. The last one I saw was in the previous century!
These terms were really introduced with SQL Server version 7, when running on NT4.
You had to choose between an Active/Active or Active/Passive setup when installing. Unfortunately the terms became popular and are still used to this day. We should really steer clear of them when describing modern clusters.
I would prefer to call it something like "a 4 node cluster, running 3 application groups"
or a "2-by-2 node multi-site cluster, running 6 roles"
Now; to answer your question:
Both RP/CE and VPLEX can run applications on both sides of the cluster, therefore both are "Active", "Serving" and "running applications". The confusion usually comes in over where the disks are online. In Microsoft clustering a disk is online on one node and one node only. For RP/CE installs this is quite clear: the disk (image access) is given by RecoverPoint at the site of the node that has the disk online. For VPLEX it is different, because VPLEX is a virtualization layer over your storage. Where the disk is online is not that important; it is, however, "presented" to the site of the node that has the disk online.
Both solutions will allow you to run applications on both sites/sides of the cluster simultaneously.
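As a rough illustration of "each disk is online on one node only, yet both sites are serving", here is a small Python sketch. The node, group, and disk names are made up, and the helper functions are hypothetical; nothing here talks to a real cluster.

```python
# Illustrative model only; node, group, and disk names are made up.

NODE_SITE = {"Node1": "SiteA", "Node2": "SiteA", "Node3": "SiteB", "Node4": "SiteB"}

# Each application group (role) and its disks are owned by exactly one node.
GROUP_OWNER = {"App1": "Node1", "App2": "Node3", "App3": "Node4"}
GROUP_DISKS = {"App1": ["Disk1"], "App2": ["Disk2", "Disk3"], "App3": ["Disk4"]}


def disk_locations(group_owner, group_disks):
    """Map each disk to the single node that has it online."""
    online = {}
    for group, owner in group_owner.items():
        for disk in group_disks[group]:
            if disk in online:
                raise ValueError(f"{disk} would be online on two nodes")
            online[disk] = owner
    return online


def groups_per_site(group_owner, node_site):
    """Count how many application groups each site is currently running."""
    counts = {}
    for owner in group_owner.values():
        site = node_site[owner]
        counts[site] = counts.get(site, 0) + 1
    return counts


print(disk_locations(GROUP_OWNER, GROUP_DISKS))
print(groups_per_site(GROUP_OWNER, NODE_SITE))
```

In this "2-by-2 node multi-site cluster, running 3 roles", both sites run applications at the same time, while every individual disk still has exactly one owner.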
Hopefully this answers your question.
Rgds,
Edwin.
pmorris3
24 Posts
0
April 9th, 2014 05:00
Hi
I agree with the definition of Active/Passive - I was not talking from the cluster point of view but more from a site/application/service view.
Thanks for the explanation. This leads to my next question...
simple example: Site A with Node A. Site B with Node B
Two Apps - AppA and AppB each with a disk DiskA and DiskB
So how would RecoverPoint be set up? I'm assuming that there will be two Consistency Groups, with an EMC Cluster resource for each disk? So EMC Cluster Enabler ResA, DiskA depends on this and then AppA - same for AppB?
Phil
EdwinVanMierlo
39 Posts
0
April 9th, 2014 05:00
Apologies for the second reply, I just noticed I forgot to comment on:
"The problem is the LUN/Volume locking done by Windows"
You are probably referring to the Persistent Reservations which are used in Windows Failover Clustering. Let me assure you this is not a problem; however, they are dealt with differently in a Cluster Enabler cluster and in a VPLEX cluster.
The difference is "replication" versus "presentation".
In a Cluster Enabler setup, such as Cluster Enabler for Recoverpoint, the Persistent reservations are NOT replicated between sites. This however is not a problem in regards to the data disks for your applications running in the cluster. The Recoverpoint appliances are managing where the disks are accessible, hence a disk cannot come online if the image access is not granted. This is done by the custom resource as described in a previous post.
Where this does make a difference is with the Witness Disk in cluster (the old term would be Quorum disk). As we are not replicating the Persistent Reservation, the Cluster Enabler for RecoverPoint software cannot support a Quorum model with a Witness Disk. We advise using a Quorum model without a Witness, or with a Fileshare Witness. This is in line with the Microsoft recommendation for multi-site clustering.
In a VPLEX setup, the disk is "presented" to the local cluster node, so the Persistent Reservation is on the actual disk. This makes it possible to choose a Quorum model with a Witness Disk in a multi-site cluster on VPLEX. However, we still maintain the recommendation, same as Microsoft's, to use a Fileshare Witness in multi-site clustering.
For completeness, I must add that the Cluster Enabler for SRDF (SRDF/CE) also supports a witness disk, whereas Cluster Enabler for MirrorView (MV/CE) is like RP/CE and does not support a witness disk.
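The witness statements above can be summarised as a small lookup table. This is only a sketch in Python that restates what is said in this post; the function name `quorum_options` and the option labels are invented for illustration and are not product settings.

```python
# Sketch only: restates the witness-disk statements from the post above.
# Function name and option labels are hypothetical, not product settings.

WITNESS_DISK_SUPPORTED = {
    "RP/CE": False,    # Persistent Reservations are not replicated
    "MV/CE": False,    # behaves like RP/CE
    "SRDF/CE": True,   # witness disk is supported
    "VPLEX": True,     # disk is presented locally, reservation is on the disk
}


def quorum_options(solution):
    """Return the witness choices for a solution.

    The file share witness is listed first because it is the stated
    recommendation for multi-site clusters regardless of solution.
    """
    if solution not in WITNESS_DISK_SUPPORTED:
        raise KeyError(f"unknown solution: {solution}")
    options = ["file share witness", "no witness"]
    if WITNESS_DISK_SUPPORTED[solution]:
        options.append("witness disk")
    return options


for name in WITNESS_DISK_SUPPORTED:
    print(f"{name:8s} {quorum_options(name)}")
```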
That covers the response to your previous question.
Now, in response to your latest post:
You are absolutely right; Cluster Enabler for Recoverpoint will manage this in RP-Consistency groups, and you do have a custom resource in each of your Application Groups in cluster.
However, it will not have a custom resource for each disk; there is one custom resource for the group. So if your application uses more than one disk, then you will have multiple disks in the cluster group, all dependent on one custom resource. This custom resource will have all the info of the RecoverPoint consistency group (which in turn has multiple disks).
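Applied to Phil's example, the layout could be sketched like this in Python. The resource names (`CE-AppA`), consistency group names (`CG_AppA`), and both helper functions are assumptions made up for the illustration; they are not what Cluster Enabler actually names or exposes.

```python
# Sketch of Phil's example; all resource and consistency-group names are invented.

def build_group(app, disks, consistency_group):
    """Return one cluster group as a dependency map {resource: [depends on]}."""
    ce_resource = f"CE-{app}"          # one custom resource per cluster group
    deps = {ce_resource: []}
    for disk in disks:
        deps[disk] = [ce_resource]     # every disk depends on that one resource
    deps[app] = list(disks)            # the application depends on its disks
    return {"consistency_group": consistency_group, "dependencies": deps}


def online_order(deps):
    """Order resources so each comes online after everything it depends on."""
    ordered, seen = [], set()

    def visit(resource):
        if resource in seen:
            return
        seen.add(resource)
        for dependency in deps[resource]:
            visit(dependency)
        ordered.append(resource)

    for resource in deps:
        visit(resource)
    return ordered


group_a = build_group("AppA", ["DiskA1", "DiskA2"], "CG_AppA")
group_b = build_group("AppB", ["DiskB"], "CG_AppB")

for group in (group_a, group_b):
    print(group["consistency_group"], online_order(group["dependencies"]))
```

Each group maps to one consistency group and carries a single custom resource, however many disks the application uses.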
HTH,
Edwin.
EdwinVanMierlo
39 Posts
0
April 11th, 2014 05:00
All,
Please submit all your questions, about:
- Cluster Enabler multi-site clusters (SRDF, MirrorView, RecoverPoint)
- and any other Windows Server Failover Cluster on EMC storage questions
Thanks,
Edwin.
EdwinVanMierlo
39 Posts
1
April 17th, 2014 02:00
All,
let me detail one of the many questions which come in from time to time.
Many times I have been asked to provide an explanation because:
"Cluster Enabler failed over the cluster groups"
Let's start with a statement: Cluster Enabler will not make any decisions (one exception to this), and will not cause the cluster groups to fail over (one exception to this), nor will it actively fail over any cluster groups.
Besides the exceptions noted, which I will come back to further down, Cluster Enabler is merely a resource in the cluster, and will follow the groups and cluster where they want to "go".
Microsoft Windows Server Failover Cluster is in control, all the normal rules for cluster failover are honoured, and cluster will determine the failover based on the set error detecting properties of cluster itself.
Cluster still behaves as if it were a normal cluster. It will perform error detection, and the following three methods are used:
Based on the error detection routines within cluster, it is cluster that determines:
As cluster is a highly customisable engine itself, there are many factors which influence the failover decision of cluster. Here are the most common ones:
All of this (the error detection, the decision to fail over or not, and where to move to) is done by cluster.
Cluster Enabler does not come into play in this logic; it is merely a resource in the cluster.
Hope this clears up the situation of "Cluster Enabler failed over the group", the answer for that is: Cluster detected an error, and Cluster failed over the group. That will give you a good start when looking at those situations.
I mentioned 2 exceptions:
And there you have it, failover of cluster in a multi-site cluster using Cluster Enabler.
It is basically not much more than any other cluster!
All the normal cluster logic applies.
Please let me know if you have any questions.
Rgds,
Edwin.
RobertoAraujo1
2 Intern
718 Posts
0
April 17th, 2014 11:00
Great discussion, everyone! Help us promote this debate. Please, share the following tweet:
Join the discussion NOW! Ask the Expert: EMC Cluster Enabler multi-site clustering. http://bit.ly/1jLn8wq 4/9 - 5/2 #EMCATE
Thanks!
kach1947_custom
24 Posts
0
April 22nd, 2014 07:00
Hi Edwin,
Great session so far.
My question is on SRDF/CE with an old(er) DMX and new VMAX arrays. With a disk expansion for a disk that is already SRDF-ed, I think the question of active/passive comes in here, especially with the DMX: if I remember correctly, the cluster needs to be brought down on one node, then the disk is expanded (plus BCV etc. if it's a striped meta) and re-presented, during which the cluster is failed over, resources brought up, etc.
Is the setup still considered active/ passive since the node needs to be brought down?
Also curious to know if this case still persists with the VMAX (has been some time that I worked on SRDF/CE on the VMAX) and any improvements on the same.
Thanks
KC
EdwinVanMierlo
39 Posts
1
April 25th, 2014 01:00
Hi KC,
A very good question.
Extending a volume in an SRDF/CE cluster is quite a simple process, and the process is no different whether you are running on VMAX or the older DMX.
The process is described in https://support.emc.com/kb/15140
After the above processes, the cluster should pick up the extended device, after which you need to extend the partition and NTFS volume in the host.
If the Windows OS is not picking up the larger space on the devices, the nodes of the cluster might require a reboot.
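For the host-side step only (extending the partition and NTFS volume once the larger device is visible), one possible approach is a diskpart script. The Python sketch below just generates such a script; it assumes the standard diskpart commands `rescan`, `list volume`, `select volume`, and `extend`, and the volume number is a placeholder you would need to confirm from the `list volume` output. It does not cover the array-side steps from the KB article.

```python
# Sketch: build a diskpart script that rescans and extends one volume.
# The volume number is a placeholder; confirm it from "list volume" first.

def build_extend_script(volume_number):
    """Return diskpart script text that extends the given volume."""
    if isinstance(volume_number, bool) or not isinstance(volume_number, int):
        raise ValueError("volume_number must be an integer")
    if volume_number < 0:
        raise ValueError("volume_number must not be negative")
    lines = [
        "rescan",                           # pick up the new device size
        "list volume",                      # shown for the operator's reference
        f"select volume {volume_number}",
        "extend",                           # grow into the new free space
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical volume number used purely as an example.
    print(build_extend_script(3), end="")
```

The generated text could be saved to a file and run with `diskpart /s <file>` from an elevated prompt, presumably on the node that currently has the disk online.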
HTH,
Edwin.
poodle1
1 Message
0
October 17th, 2014 02:00
Hi Experts,
Need some urgent help with a SAN migration for Windows 2008 clusters that use the EMC Cluster Enabler plug-in. We're migrating SAN disks for W2k8 clusters (the clusters use a file share witness) from EMC CX4 to VNX using SAN Copy. The EMC components we're running on the cluster nodes are listed below.
EMC cluster enabler mirrorview plug-in 4.1.0.22
EMC powerpath 5.5
EMC solutions enabler 7.3.0
EMC cluster enabler base component 4.1.0
There are 2 sites that use a stretched VLAN - both sites are in the same location. It is a 2 node W2k8 cluster with 1 node in each site, and each site has a CX4 and a VNX array using MirrorView/S replication.
Would appreciate if you could help with the SAN disk migration procedure - more from the EMC cluster enabler perspective. Thank you.
Here are the high-level steps we plan to follow. However, for the EMC Cluster Enabler, I think we only need to add the new LUN to the existing resource group for the service (the cluster resource group for DHCP, for example) in the Cluster Enabler Manager console, delete the old LUN from Cluster Enabler Manager, and make sure the EMC Cluster Enabler resource in the cluster management console is added as a dependency for the new disk.