Start a Conversation

Unsolved

This post is more than 5 years old

1 Rookie

 • 

5.7K Posts

2413

November 10th, 2014 04:00

How to build an HA SQL cluster

I was wondering what advice to give to my MS DBA folks when they want to build an SQL cluster which resides on two sites. Do I need to use Mirrorview/CE or is the Microsoft Always on a serious option these days? As a storage guy I'm thinking MV/CE, but I've seen configs where the storage would failover when this wasn't meant to, so here I am, asking you!

5 Practitioner

 • 

274.2K Posts

November 10th, 2014 12:00

Hello,

To help add some insight, here are some comments to your questions.

SQL 2012/2014 Always on Availability Groups (AAG) is a serious option to consider for the correct SQL Enterprise licensed environment.  I would recommend a POC with the customers application(s) first.  Test the failover with AAG along with testing the application(s) resiliency at time of failover.  Test with the application(s) connecting to the AAG Listener, which is the TCP endpoint.  I have experienced certain applications, Epicor ERP and MS Dynamics SL have issues with AAG failover, to name just a couple I have tested with AAG enabled environments.

You have the option of synchronous or asynchronous commit modes.  Up to 4 nodes. (SQL 2012) Synchronous and automatic failover should be used in low latency environments only.  The log does need to harden on all synchronous secondaries before committing the transaction.  Hence the reason for low latency.  Active secondaries are a wonderful option for read-only reporting type activity.  Additionally, database backup and DBCC activities can be executed on an active secondary.  Greatly reducing processing cycles on the Primary.  This is a huge improvement over Mirroring, log shipping, and even some, if not all, SQL clustered environments.  The complexity regarding AAG as compared to SQL Clustering has also been greatly reduced.  Letting most seasoned DBA’s administer their failover situation, without enlisting the help of Windows Sys Admins.

With AAG, you do have separate copies of the data.  The data is not shared as in SQL clustering.  In my opinion, this is a better option than SQL Clustering as shared disk usually means shared failure.  This additional disk allocation should be planned for during the assessment phase of the project.

A nice complement to AAG, when building a Disaster Recovery location, is RecoveryPoint.  Utilizing an asynchronous SQL AAG node for DR is a good option.  There are also hybrid options utilizing SQL Clustering on the primary, and asynchronous AAG to the DR location.   This may be introducing additional, but truly not needed, complexity.  Older, non-cutting-edge DBA’s tend to gravitate to SQL Clustering first, as they have yet to experience the many benefits of AAG.

I welcome the opportunity to discuss in further detail.  Just let me know.

1 Rookie

 • 

5.7K Posts

November 11th, 2014 04:00

Where storage mirroring / replication is done over a SAN / Fibre Channel network, AAG does the same using a LAN. So I can imagine that the low latencies we have in SAN environments are now traded in for a little more latency when the "replication" is done over NICs, right? Now how's that better? Will it improve the performance of a single node if it has to wait for an ACK from the other node? When doing storage replication the ACK was very fast to begin with. Or is there a trick in the AAG way of doing things?

1 Rookie

 • 

5.7K Posts

November 11th, 2014 04:00

Wow! That's quite a nice answer you provided me. I'm sure a lot more people than just me will find this helpful .

I'm a storage guy, so my first thought in cluster challenges is using storage replication, but more and more customers are experimenting with AAG and I became interested.

Thanks again!

5 Practitioner

 • 

274.2K Posts

November 11th, 2014 05:00

Hello,

To address potential latency issues, AAG can be configured to be asynchronous.  This means the log will write to the secondary, potential DR node, in asynchronous-commit mode.  The transaction will commit as soon as it can, but the Primary does not have to wait for acknowledgement.   However, this can mean data loss to a certain extent, depending on the transaction rate on the Primary.  The data will never be fully synchronized with this configuration, as the secondary databases will remain in a Restoring state.  Which also means there can be no work done against this secondary database (DBCC, read-only, backup, etc).  Hence the reason asynchronous AAG is usually left for a DR type of solution.

Good conversation, I would love to know what you know about storage, and you would like to understand more about the database technologies.  Good collaboration!

I hope this helps!

1 Rookie

 • 

5.7K Posts

November 11th, 2014 23:00

I like the discussion too. We should do an Ask the Expert one time.... Oh hey! I did three of them already and they're about storage!!! (Re: Ask The Expert: Discussing the challenges of Long Distance Links and Ask the Expert: Performance Calculations on Clariion/VNX).

But to get back to the replication methods of an AAG I can conclude that with synchronous replication it works about the same as the storage version of it. Talking about latency that is. An important pro for storage is the lower latency and the faster "network", but a pro for AAG would be in my opinion that the application itself is aware of the replication and could possibly failover faster. In my opinion it comes down to latency of the networks used. If you're on a 10Gbps network, I'm sure you can beat the latency of an 8Gbps SAN, but 16Gbps is around the corner and what are we talking about in terms of latency? Sub-millisecond? What exactly is that? 0.9, 0.2; When a windows (or Linux) CLI mentions <1ms it doesn't tell me much how many IOps I can perform over the LAN. Has anyone tested this?

Now that would be a good POC: testing an AAG with synchronous replication over a 1Gbps / 10Gbps LAN compared to a setup that uses an MSCS configuration and Mirrorview/s/ce or even RecoverPoint.

7 Posts

February 19th, 2015 06:00

I like to add the other option would be to use the classic approach using Microsoft Failovercluster an to give this an extra provide the storage using an VPLEX to mirror to the other side. That will do the trick where AVG is not the option or would complicate the things.

regards

5 Practitioner

 • 

274.2K Posts

February 19th, 2015 07:00

Hi

I have a couple of questions that I hope will guide the conversation.

Considering the application that will be serviced by the database is Failover Clustering the only option or can AAG be considered? In terms of granularity failover clustering is at the instance level and  AAG is at the level of the group of databases.

You mention building a failover cluster between 2 sites. In a HA / DR event what are the business requirements for the solution in terms of RPO (Recovery Point Objective) RTO (recovery time objective).

Is the SQL Server database workload mixed, OLTP or OLAP / DW. While latency between sites is important for performance the workload type is also a factor when determining the overall impact. Of course the only way to know for sure will be to test.

Thanks

Tony

7 Posts

February 19th, 2015 07:00

Behind the curtain Exchange DAG is exactly the same thing as AAG cause it was developed by the same team, only exchange adopted it earlier. Behind DAG and AAG are the basics from Microsoft Failover Cluster used so you configure a DAG or AAG an geht an MFC in background, which is configured from the applications not from the classic side approach . But i liked that you got an idea how to handle your DBAs .

regards

5 Practitioner

 • 

274.2K Posts

February 19th, 2015 07:00

Hello,

Yes, classic SQL Clustering is an option.  But, in my option, it is complicated.  Our clients like the ability to use AlwaysOn especially for a readable secondary, then the added benefit of asyc to a DR location.  It is the best of both worlds. Somewhat load balanced SQL Server, at least for reads.  As more and more environment finally adopt Virtualizing SQL server, SQL AAG is a much better option.  Yes, VMware 6 now supports WFCS, but this is also highly complex.  In my opinion, simple is better and easier to manage... and keeps the client happy!

Yes, SQL AAG does compete with technologies such as EMC RecoverPoint.  But remember, RecoverPoint was developed before SQL AAG.  SQL AAG is very similar in nature to Exchange DAG, of which companies have been utilizing for years.

Just some thoughts.

1 Rookie

 • 

5.7K Posts

February 23rd, 2015 02:00

Those are indeed good questions, Tony. My case is entirely theoretical.

With AAG, Async and sync are possible and I like it that the DBA can configure this. If for some reason they want a temporary lower latency, they could switch to async and remain at sync afterwards, right? If this needs to be done in Unisphere / CLI towards a VNX, I know this is quite disruptive for the DR copy of the data.

No Events found!

Top