April 21st, 2015 07:00

Optimal SAN settings for SQL Server


Running SQL Server 2008 R2 on a physical server attached to a new VNX 5600. For an OLTP workload, are jumbo frames the optimal setting for SQL Server?

7 Posts

April 22nd, 2015 11:00

My Idera is showing some massive queue lengths 4 or 5 times during the day under heavy load. The biggest issue is that my index rebuilds for the database are taking over 24 hours, with almost no one on at that time. Last week it only rebuilt 600 indexes (4 indexes on a 133-million-row table took 5 hours). I believe jumbo frames would help in this case. Even if I see only a minimal bump on the OLTP side, I'll take any gain I can get on maintenance or backups.

11 Posts

April 22nd, 2015 11:00

tcstl ,

 

Are you referring to storage connectivity or client connectivity?  In OLTP, you're not going to see much benefit from client/server connections at jumbo MTUs.  You could, however, see large throughput increases for backups, mirroring, and bulk loading with front-end jumbo MTUs.  Jumbo iSCSI is a mixed bag; it depends on the target, as well as the switches and initiators involved.
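To illustrate why jumbo frames mainly help bulk throughput rather than OLTP latency, here's a rough payload-efficiency calculation (a sketch; it assumes standard 20-byte IP and 20-byte TCP headers, and 38 bytes of Ethernet overhead per frame):

```python
# Rough TCP payload efficiency for standard vs. jumbo MTU.
# Ethernet overhead per frame: preamble 8 + header 14 + FCS 4 + inter-frame gap 12.
ETH_OVERHEAD = 38
IP_TCP_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header

def payload_efficiency(mtu: int) -> float:
    """Fraction of on-wire bytes that carry TCP payload."""
    payload = mtu - IP_TCP_HEADERS
    wire_bytes = mtu + ETH_OVERHEAD
    return payload / wire_bytes

print(f"1500 MTU: {payload_efficiency(1500):.1%}")  # ~94.9%
print(f"9000 MTU: {payload_efficiency(9000):.1%}")  # ~99.1%
```

The gain per byte is only a few percent, which is why it matters for sustained sequential transfers (backups, bulk loads) far more than for small random OLTP I/O.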

 

~Phil

11 Posts

April 22nd, 2015 12:00

So is this iSCSI or Fibre Channel attached?

Regardless of the transport, it sounds like you have a larger performance issue if you're getting high queue depths at the host.  If you haven't installed it already, I would highly recommend getting Monitoring & Reporting set up and running for your array.  Jumbo iSCSI doesn't change the I/O characteristics enough to make a large dent unless you are bandwidth constrained.

Regarding backups, if you are iSCSI and are backing up to a remote device, you will need jumbo MTU configured end-to-end on your back-end storage LAN segments and interfaces as well as your backup LAN segment and interfaces.

7 Posts

April 22nd, 2015 12:00

I will try. I am the DBA, and we have a separate group that takes care of the SAN. I was hoping that a 1.4 TB database would not be encountering these kinds of issues on a new SAN.

7 Posts

April 28th, 2015 09:00

Does the guidance that SQL data and log drives need a 1024 KB starting offset and a 64 KB block size apply to the 5600?

17 Posts

April 28th, 2015 15:00

A 1024 KB starting offset is typical for all partitions on Windows 2008 and above. As long as the offset is a multiple of 64 KB it is good, and this is especially important if the machine is a VM. It is also best practice to format the NTFS partition with a 64 KB file allocation unit rather than the default 4 KB.
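The alignment rule above reduces to a simple modulo check. A minimal sketch (on Windows the offset in bytes can be read with `wmic partition get StartingOffset`; the sample values below are just illustrations):

```python
# Sketch: verify a partition's starting offset is 64 KB aligned.
ALIGNMENT = 64 * 1024  # 64 KB

def is_aligned(starting_offset_bytes: int) -> bool:
    """True when the partition start is a multiple of 64 KB."""
    return starting_offset_bytes % ALIGNMENT == 0

print(is_aligned(1024 * 1024))  # 1024 KB Windows 2008+ default -> True
print(is_aligned(32256))        # old 31.5 KB legacy default -> False
```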

The block size on a VNX disk is 512 bytes. The key value more often considered is stripe size. The default stripe element size is 128 blocks, which is 64 KB. The stripe element size is not configurable through Unisphere. This value has been optimized for the VNX OS.

The stripe size is the amount of user data in a RAID group stripe. This does not include drives used for parity or mirroring. The stripe is measured in KB. It is calculated by multiplying the number of stripe disks by the stripe element size.

For example, an eight-disk RAID 1/0 has a stripe width of four; with a stripe element size of 64 KB, that gives a stripe size of 256 KB (4 * 64 KB). A five-disk RAID 5 (4+1) with a 64 KB stripe element size also has a stripe size of 256 KB.
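The stripe-size arithmetic described above can be sketched as:

```python
# stripe size = (data disks in the stripe) * (stripe element size),
# excluding disks used for parity or mirroring.
ELEMENT_KB = 64  # default VNX stripe element: 128 blocks * 512 B = 64 KB

def stripe_size_kb(data_disks: int, element_kb: int = ELEMENT_KB) -> int:
    """User-data stripe size in KB (parity/mirror disks excluded)."""
    return data_disks * element_kb

# Eight-disk RAID 1/0 -> 4 data disks; RAID 5 (4+1) -> 4 data disks.
print(stripe_size_kb(4))  # 256
```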

I would recommend two whitepapers.

http://www.emc.com/collateral/white-papers/h12341-sqlserver-bp-wp.pdf

https://www.emc.com/collateral/software/white-papers/300013146-unified-storage-fundamentals.pdf

Regards,

Dave

7 Posts

April 29th, 2015 05:00


Thanks very much, this is what I was looking for. I am the DBA, so the SAN is not something I control. We just went to a 5600 from a 5300, and honestly the gain has been less than desired. The index rebuild on the weekend is actually taking longer.

7 Posts

April 29th, 2015 07:00

It is online, but I have a full day for maintenance. I use Ola's reindex scheme, which is what I used on the last SAN. I was shocked to see little to no improvement. Our SAN guys are verifying the setup against the whitepapers. We had a consultant help with the setup, and he was told which portions were dedicated to SQL Server; I'm not sure if he followed this path.

257 Posts

April 29th, 2015 07:00

Hi there,

Online re-index or offline?

You should only see improvement going from VNX1 to VNX2 and a higher model (of course).

I would recommend you ask your SAN admin how your LUNs are set up and whether FAST Cache, FAST VP, or Flash is being used.

Thanks, James.

257 Posts

April 29th, 2015 08:00

I can look at perfmon data if you wish, to see what you are seeing at the host level.

Any spikes with TempDB latencies?

7 Posts

April 29th, 2015 08:00

My Idera DM is giving me what I need: lots of disk queue latency. None of the drives were formatted with 128, and I have seen some MS articles pointing to this as necessary. Tempdb is not bad, but for 48 cores and 128 GB of RAM I was expecting better results. I have one 133-million-row table with 4 indexes that takes 4 hours. I have been a SQL DBA for 20 years and can't ever remember seeing something that awful since the 6.5 days.

April 29th, 2015 08:00

First of all, welcome to the EMC Everything Microsoft Community, and thank you for being an EMC Customer!

In general, I focus more on items like TCP ACK and MPIO optimization rather than going the jumbo frame route. There are many network dependencies (every hop has to have jumbo enabled, there may be overhead to be factored into the MTU calculation on the switch ports, the WAN provider has to support it if that's in play, etc.). If a single hop is missed, fragmentation occurs and troubleshooting becomes... interesting.
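A quick way to check that jumbo frames survive every hop is a don't-fragment ping sized to the MTU; the ICMP payload must leave room for the 28-byte IPv4 + ICMP headers. A sketch of that arithmetic (target host names are up to you):

```python
# Sketch: largest IPv4 ping payload that fits a given MTU without fragmenting.
# An MTU-sized ping carries MTU - 20 (IP header) - 8 (ICMP header) payload bytes.
def max_ping_payload(mtu: int) -> int:
    return mtu - 28

# Windows: ping -f -l <payload> <target>
# Linux:   ping -M do -s <payload> <target>
print(max_ping_payload(9000))  # 8972
print(max_ping_payload(1500))  # 1472
```

If a don't-fragment ping at the jumbo payload size fails while the standard size succeeds, some hop in the path is not jumbo-enabled.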

The Windows Host Connectivity Guide has guidance for host settings, such as disabling delayed TCP ACK (the link may require a login on support.emc.com):

https://support.emc.com/docu5134_Host-Connectivity-Guide-for-Windows.pdf?language=en_US

Things like network connectivity come into scope as well: is this a 1 Gbit or 10 Gbit NIC, how many ports, etc.

Sustained queuing can be caused by a few things, and generally I like to follow an evidence-driven approach leveraging EMC's PERFCOLLECT:

Perfcollect

Other things to look at might be: Which SP owns the volume? Am I leveraging the Microsoft MPIO DSM? Am I mapped to ports on both SPs with round robin? If so, I may be taking a non-optimal path to the underlying volume (trespassing based on "blind" MPIO round robin; the array leverages ALUA, and round robin with subset is generally used in that scenario).

Depending on the answers to some of the questions above, the volume layout, and the flexibility of the schema (leveraging things like SQL's proportional-fill algorithm with read/write filegroups to take advantage of multiple queues), we can take multiple paths to ensure that you get the performance you need for SQL.

We can also look at pool design and policies for the underlying volume on the array, and "pin" individual volumes that benefit from flash placement to the appropriate spindle types.

Have your SAN administrator reach out to your local account team. Depending on your region, we can have a local MSpecialist assist with data collection and make targeted recommendations based on what is observed.
