894

November 22nd, 2011 06:00

For Data Warehouse use Thick LUNs for Performance

I'm creating a presentation for a webcast on Dec 1st about the white paper "h8177 EMC VNX Unified Storage Systems for Data Warehouse Applications - Best Practices for Adoption and Deployment" (https://community.emc.com/docs/DOC-12631). On page 29 of the doc I'm reading about the advantages of thin LUNs versus thick LUNs in virtual pools:

  1. "Thin LUNs provide for the most optimal use of space." Disk space is assigned only when the host writes data into the thin LUN, which makes thin LUNs good for over-provisioning.
  2. Thin LUNs grow in relatively small chunks, so their data ends up as many small fragments spread over many physical drives. "This hurts the rate by which large chunks of data stored in the LUN can be retrieved when the queries are driving large chunk "random" data scans."
  3. Thick LUNs (fully allocated upon creation) are optimal for performance.
  4. Because thick LUNs use a large allocation extent size, one can expect each 1 GB worth of LUN data to reside physically on, and span, some physical drives, thereby achieving more optimal read bandwidth from each drive.
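To make points 2 and 4 concrete, here is a rough sketch that counts how many allocation extents a large scan has to visit for a given extent size. The 1 GB thick extent is the figure quoted above; the 8 KB thin chunk size and the function name `extents_touched` are assumptions chosen only for illustration, so check the white paper for the real values. It is also a worst-case count, since neighbouring fragments may happen to sit next to each other on disk.

```python
KIB = 1024
GIB = 1024 ** 3

# Assumption for illustration only: thin LUN space grows in 8 KB chunks.
THIN_EXTENT = 8 * KIB
# From the discussion above: thick LUN data is laid out in 1 GB extents.
THICK_EXTENT = 1 * GIB


def extents_touched(scan_bytes: int, extent_bytes: int) -> int:
    """Number of allocation extents an extent-aligned scan must visit."""
    if scan_bytes < 0 or extent_bytes <= 0:
        raise ValueError("scan size must be >= 0 and extent size must be > 0")
    # Integer ceiling division avoids floating point rounding.
    return -(-scan_bytes // extent_bytes)


scan = 10 * GIB  # a hypothetical 10 GB table scan
print(extents_touched(scan, THIN_EXTENT))   # 1310720
print(extents_touched(scan, THICK_EXTENT))  # 10
```

The gap between the two counts is the intuition behind the recommendation: fewer, larger extents mean fewer places the array has to go to satisfy a big scan.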

For data warehouses, then, the recommendation is to use thick LUNs for performance reasons. This is an important consideration because the DBA benefits in two ways: from Fully Automated Storage Tiering for Virtual Pools (FAST VP) and from the optimal performance of thick LUNs. Does the use of thick LUNs become more important as we tier down to slower drives?

BTW, there is a good discussion of using ASM with thin LUNs here: https://community.emc.com/thread/128692

225 Posts

November 22nd, 2011 22:00

Sam, I went through the WP you referred to, and I have some different thoughts about “Cost Effective”. VNX does provide more bandwidth than the previous generation, up to twice as much on some models. But for an MPP DW application, every single computing node (current x86 architecture) might require about 600 MB/s of bandwidth during data loading. Therefore, from a cost perspective, I do not see the advantage of using VNX over 10 DAS on the computing nodes.

Correct me if I am wrong.

Eddy
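The bandwidth arithmetic behind Eddy's point can be sketched roughly as below. The 600 MB/s per-node figure is from his post; the node count and the per-array bandwidth are hypothetical placeholders, not measured VNX numbers, and the function names are made up for this example.

```python
import math

PER_NODE_MB_S = 600.0  # per-node load bandwidth quoted in the reply above


def aggregate_load_mb_s(nodes: int, per_node_mb_s: float = PER_NODE_MB_S) -> float:
    """Total bandwidth needed if every compute node loads data at once."""
    return nodes * per_node_mb_s


def arrays_needed(nodes: int, array_mb_s: float,
                  per_node_mb_s: float = PER_NODE_MB_S) -> int:
    """Shared arrays required to feed all nodes at full load rate."""
    if array_mb_s <= 0:
        raise ValueError("array bandwidth must be positive")
    return math.ceil(aggregate_load_mb_s(nodes, per_node_mb_s) / array_mb_s)


nodes = 10                # hypothetical cluster size
array_mb_s = 3000.0       # hypothetical placeholder, NOT a real VNX figure
print(aggregate_load_mb_s(nodes))        # 6000.0
print(arrays_needed(nodes, array_mb_s))  # 2
```

Plugging in real array bandwidth and pricing for both options is what would actually settle the cost question.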
