
February 5th, 2012 14:00

PS6500x spreading across second array

Hi All,

We recently purchased a second PS6500X array and added it to the same pool as the original PS6500X. My intention is to double both capacity and performance by spreading data across both arrays. We have a number of PS6000E and PS6000XV arrays set up the same way, and they have distributed data evenly across all members in their pools.

The issue is that I added the array to the pool on Friday, and a couple of hours later I noticed data beginning to spread onto the second array. However, as far as I can tell, no data has moved to the second array in at least 8 hours (as of my last check). There *is* currently a heavy backup job running against the primary 6500X, and I am wondering if the data distribution has slowed down to avoid a bottleneck with that other IO. The only way I know to monitor the progress is to watch the space graph on each member's status screen and see the used space shift; since both arrays are identical in capacity and RAID configuration, they should end up roughly 50/50 used/free. I can also view the data volumes on the Sumo arrays and slowly watch space being consumed on a couple of the volumes at a time.
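
For what it's worth, here is roughly how I've been estimating the progress from the used-space numbers in Group Manager. The figures in this little Python sketch are made-up round numbers for illustration, not our actual capacities:

```python
# Rough estimate of rebalance progress from each member's used-space figure.
# Assumes two identical members that should end up with an even split of data.

def rebalance_progress(used_gb_original, used_gb_new):
    """Percent of the expected data that has already moved to the new member."""
    total_used = used_gb_original + used_gb_new
    target_each = total_used / 2          # even split across two identical members
    if target_each == 0:
        return 100.0
    return 100.0 * used_gb_new / target_each

# Illustrative numbers: original member holds 9,500 GB, new member has taken 500 GB so far.
print(f"{rebalance_progress(9500, 500):.1f}% of the expected data has moved")  # ~10%
```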

My question is: Is it most likely that data distribution has stopped while the snapshot based backups are running against the primary array and the copying will resume when this process has completed?

Second question: is there another way to monitor or view the progress of the data distribution process, aside from the member's disk capacity scale?

Thanks

5 Practitioner • 274.2K Posts

February 5th, 2012 18:00

The balancing algorithm has a lower priority than new IO requests from hosts. A common problem when adding another member is strain on the switch infrastructure; for example, there may not be enough inter-switch bandwidth.

This results in retransmits and reduced performance.

You didn't mention what firmware you are running, but for best results you should be using 5.2.0.

What kind of switches are you using?

I would suggest opening a support case. They'll need diags, a SAN HQ archive, and switch configuration info. With that they will be able to check for any network or array issues.

102 Posts

February 6th, 2012 05:00

Thanks for the reply Don

We are using firmware 5.0.7. The switches are four 3750s connected via stack cables. We are not even hitting 20% backplane utilization on these switches, so my guess is they are not the bottleneck. There is a lot of heavy IO hitting this array, though.
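
As a sanity check on whether the stack itself could even be the bottleneck, here is the back-of-the-envelope math, assuming the nominal 32 Gbps StackWise ring on the 3750s and four active 1 GbE iSCSI ports per PS6500X (both figures are assumptions from memory, so treat the result as a rough estimate):

```python
# Worst-case check: even if every bit the two arrays can push had to cross the
# 3750 stack ring, how much of the ring would that consume? Figures are assumed.

STACKWISE_GBPS = 32       # nominal 3750 StackWise ring bandwidth (assumption)
PORTS_PER_ARRAY = 4       # active 1 GbE iSCSI ports per PS6500X (assumption)
PORT_GBPS = 1.0

worst_case_gbps = 2 * PORTS_PER_ARRAY * PORT_GBPS   # both arrays saturating every port
utilization_pct = 100.0 * worst_case_gbps / STACKWISE_GBPS
print(f"Worst-case stack utilization from array traffic: {utilization_pct:.0f}%")  # ~25%
```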

I want to add that since my post last night, and once the aforementioned backup job completed, more data has moved onto the second array, but less than about 500 GB.

How much of a change in load-balancing performance and behavior should we expect from the 5.2.0 firmware?

5 Practitioner • 274.2K Posts

February 6th, 2012 05:00

With 5.1.x and 5.2.x there's a large improvement in how IO loads are balanced across multiple members in a pool. Should one member become more heavily loaded than the others, active pages (known as "hot") will be swapped with inactive pages (called "cold"). This way IO is balanced on a page-by-page basis. Before 5.1, if you had different RAID levels, the array could only move a whole volume at a time; in 5.2.x it's much more granular and effective. There are also a lot of fixes since 5.0.7, especially in disk error handling.
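
To make the idea concrete, here is a toy sketch of the hot/cold page swap concept in Python. It is purely illustrative and not the actual firmware algorithm; the page IO numbers are invented:

```python
# Conceptual illustration of hot/cold page balancing: swap the busiest pages on
# the heavily loaded member for the quietest pages on the lighter member, as
# long as each swap keeps narrowing the gap between their total loads.
# This is a toy model only, not EqualLogic's actual implementation.

def rebalance_pages(busy, quiet):
    """busy/quiet are lists of per-page IO rates; swap pairs while it helps."""
    busy.sort(reverse=True)   # hottest pages first
    quiet.sort()              # coldest pages first
    i = 0
    while i < min(len(busy), len(quiet)):
        gap_before = abs(sum(busy) - sum(quiet))
        busy[i], quiet[i] = quiet[i], busy[i]
        if abs(sum(busy) - sum(quiet)) >= gap_before:
            busy[i], quiet[i] = quiet[i], busy[i]   # undo; further swaps won't help
            break
        i += 1

member_a = [900, 850, 800, 40, 30]   # per-page IO rates on the loaded member (invented)
member_b = [20, 15, 10, 5, 5]        # per-page IO rates on the idle member (invented)
rebalance_pages(member_a, member_b)
print(sum(member_a), sum(member_b))  # 1725 950: far closer than the original 2620 vs 55
```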

102 Posts

February 6th, 2012 13:00

Thanks Don. I am going to perform the upgrade to 5.2. I know we are behind, but we have strict change control in place. I am going to tag this as part of our SAN setup requirements.

5 Practitioner • 274.2K Posts

February 6th, 2012 13:00

That's great!  
