3 Posts
0
December 2nd, 2010 04:00
EqualLogic single point of failure?
Hello,
If one SAN fails, I do not want any interruption of service. In other words, how can I make the SAN behave like DRBD master-master?
Thanks
Joe S586
9 Technologist
•
729 Posts
0
December 2nd, 2010 13:00
You can get a better description if you download the Group Administration Guide from the www.equallogic.com/support web site (support account required). Look for the guide in the section for the firmware currently running on your array.
Regards,
Joe
Zeppi
3 Posts
0
December 3rd, 2010 00:00
Yes, I understand, and that's the problem. It means I cannot centralize everything on this SAN, because if something goes wrong it will take me a day or two to restart each machine and verify that all of them are healthy.
Imagine you set up a cluster across multiple machines for high availability, and then the SAN goes down and the whole cluster falls with it. That is not very useful.
Regards
Giuseppe
Joe S586
9 Technologist
•
729 Posts
0
December 3rd, 2010 09:00
I’m sure you are aware that the EqualLogic array offers enterprise-class protection: redundant power supplies, controllers, and disks; RAID protection of the data; hot-swappable components; enhanced RAID rebuild; fault isolation; and so on. The redundancy of the Peer Storage array design eliminates single points of failure within the array, helping to provide greater than 99.999 percent availability. This of course holds only if the data center site itself is operational (switches, power, cooling, physical access, etc.).
So in the case of DRBD, if the availability of a single array group is less than you require, you might consider two array groups: one PS group holds the “Service A” volumes, and you then build the DRBD mirror of those volumes on the other PS group (“Service B”).
If you have a specific instance (product link, etc.), I can look at it and may be able to offer other solutions or recommendations.
Regards,
Joe
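To put the 99.999 percent availability figure quoted above into perspective, here is a minimal Python sketch that converts an availability percentage into a yearly downtime budget and compares it with a single restart of the length discussed in this thread. The function name and the 20-second restart value are illustrative assumptions, not vendor figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum yearly downtime allowed by a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


budget_min = downtime_minutes_per_year(99.999)
print(f"Downtime budget at 99.999%: {budget_min:.2f} minutes per year")

# Assumed restart length in seconds, used only for illustration.
restart_s = 20
share = restart_s / (budget_min * 60)
print(f"One {restart_s}-second restart uses {share:.1%} of that budget")
```

Five nines works out to a little over five minutes of downtime per year, so a brief controller restart fits inside the budget only if the hosts ride through it without failing I/O.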
Zeppi
3 Posts
0
December 3rd, 2010 12:00
You're right, EqualLogic is certainly reliable. As it happens, I built an Oracle cluster, and I realized that I would need two EqualLogic arrays when I wanted to upgrade the EqualLogic firmware: I had to reboot the EqualLogic as well as the cluster.
So I wondered how I could ensure continuity of service for my Oracle cluster.
Now that I know the EqualLogic must restart after an update, I think that with a second EqualLogic I could replicate the data from SAN A to SAN B, update A, and then do the same in turn for B.
The problem then becomes that I cannot use more than half of the total SAN capacity.
Regards
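The capacity cost of mirroring every volume across two arrays can be sketched with a few lines of Python. This is a rough illustration only: the function names and the 10 TB sizes are hypothetical, and it ignores RAID overhead, snapshots, and replication reserve.

```python
def mirrored_usable_tb(array_a_tb: float, array_b_tb: float) -> float:
    """Usable space when every volume is kept in full on both arrays.

    Each array must hold a complete copy, so the smaller array is the limit.
    """
    return min(array_a_tb, array_b_tb)


def usable_fraction(array_a_tb: float, array_b_tb: float) -> float:
    """Share of the total purchased capacity that remains usable."""
    return mirrored_usable_tb(array_a_tb, array_b_tb) / (array_a_tb + array_b_tb)


# Hypothetical example: two 10 TB arrays.
print(mirrored_usable_tb(10, 10), "TB usable out of 20 TB")
print(f"{usable_fraction(10, 10):.0%} of the total capacity")
```

With two equally sized arrays the usable share is exactly half; with unequal arrays it is less than half.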
Joe S586
9 Technologist
•
729 Posts
0
December 3rd, 2010 13:00
I understand your concern; however, the restart during the firmware update should only take about 15-20 seconds. If the host server’s iSCSI initiator disk timeouts are set properly, there should be no disruption of service from the server’s point of view. The latest release notes for the array’s firmware list the recommended iSCSI timeout values for each host type.
Regards,
Joe
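The reasoning about timeouts can be written as a simple check: a restart stays invisible to the host only if the host's disk timeout is longer than the outage. The Python sketch below assumes a hypothetical helper name and made-up timeout values; the recommended values for a given host type come from the firmware release notes, not from this example.

```python
def outage_is_masked(disk_timeout_s: float, outage_s: float, margin_s: float = 0.0) -> bool:
    """Return True if I/O can wait out the outage without the disk timing out.

    margin_s is an optional safety margin added on top of the expected outage.
    """
    return disk_timeout_s >= outage_s + margin_s


# Upper end of the 15-20 second restart window mentioned in the thread.
restart_s = 20

# Hypothetical timeout settings, for illustration only.
for timeout_s in (10, 30, 60):
    ok = outage_is_masked(timeout_s, restart_s, margin_s=5)
    print(f"timeout {timeout_s:>2} s -> {'rides through' if ok else 'I/O errors likely'}")
```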
seboulba
1 Message
0
December 6th, 2010 00:00
I found your discussion while searching Google for HA solutions for our Citrix Xen-based cloud of servers.
The first solution that came up was DRBD master-master on iSCSI targets hosted on separate SANs (FreeBSD + ZFS + iSCSI).
That way, if a SAN goes down or must be shut down for a hardware upgrade, the service is still available and the virtual machines (around 500) don't need to be restarted, checked, etc.
Then we heard a lot of good things about the EqualLogic storage arrays and their integration with Citrix Xen, so we thought about acquiring two of them.
After some research we also found out that it is not possible to have real-time replication between the arrays, bring one unit down, and keep the service uninterrupted. That is quite a big issue for us too.
Isn't it possible to do this with EqualLogic arrays? I thought high availability meant that you can bring a whole array down without service interruption.
Any ideas?
Thanks for your help.
Joe S586
9 Technologist
•
729 Posts
0
December 6th, 2010 11:00
From my understanding, with DRBD set up properly and connected to two separate array groups, you should be able to bring one half of the DRBD mirror down for maintenance without disruption of service. DRBD is configured on the servers, not on the array.
Regards,
Joe
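The idea of a host-side mirror that keeps serving I/O while one side is down can be illustrated with a toy Python model. This is not DRBD and does not use any DRBD API; the class names, group names, and block contents are all hypothetical. It only shows the principle: writes go to every online replica, writes missed by an offline replica are remembered, and the replica is resynced when it returns.

```python
class Replica:
    """One side of the mirror, for example a volume on one array group."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}  # block number -> data


class MirroredVolume:
    """Toy two-way mirror maintained on the server side."""

    def __init__(self, first, second):
        self.replicas = [first, second]
        # Blocks each replica missed while it was offline.
        self.missed = {first.name: set(), second.name: set()}

    def write(self, block, data):
        if not any(r.online for r in self.replicas):
            raise RuntimeError("no replica online: service interrupted")
        for r in self.replicas:
            if r.online:
                r.blocks[block] = data
            else:
                self.missed[r.name].add(block)

    def take_offline(self, replica):
        replica.online = False

    def bring_online(self, replica):
        source = next((r for r in self.replicas if r.online), None)
        if source is None:
            raise RuntimeError("no online replica to resync from")
        # Copy only the blocks that changed while the replica was away.
        for block in self.missed[replica.name]:
            replica.blocks[block] = source.blocks[block]
        self.missed[replica.name].clear()
        replica.online = True


# Rolling maintenance: update group A, then group B, with writes continuing.
a, b = Replica("group-a"), Replica("group-b")
vol = MirroredVolume(a, b)
vol.write(0, "initial data")
vol.take_offline(a)
vol.write(1, "written while A is down")
vol.bring_online(a)
vol.take_offline(b)
vol.write(2, "written while B is down")
vol.bring_online(b)
print("replicas identical:", a.blocks == b.blocks)
```

In this model the service is interrupted only if both sides are offline at the same time, which is why maintenance is done one side at a time.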
SirStan
2 Posts
0
February 8th, 2011 12:00
Some vendors implement "RAID" across the filers themselves (HP LeftHand's Network RAID). Other vendors offer active/active controllers, or NetApp MetroCluster functionality. Dell EqualLogic does not.
We operate three EqualLogic arrays in production use, and have never suffered a controller failure. When we perform firmware updates the unit reboots twice, taking it offline for 15 seconds. We do this during 'quiet' activity hours on our VMware, SQL Server, and Exchange clusters. They seem to handle the 15-second downtime without issue.
We have not seen the active/passive controller layout of the Dell EqualLogic as a negative. The failure of an entire EqualLogic filer (both controllers and both power supplies) is extremely rare. There are no shared components between the controllers; they are functionally separate filers. The unit is right-sized for our organization and provides enterprise functionality at a fraction of the cost of a similar product that would allow active/active enterprise-level controllers or MetroCluster functionality.
In summary:
> Dell EqualLogic does not allow active/active controllers, or full 'RAID' between discrete units.
> Dell does not offer 'MetroCluster' or 'Network RAID' functionality like DRBD.
> Reboots of the entire SAN take 15 seconds (yes, really; I say this as a customer, not as Dell marketing) and do not cause any issue for us.
bennice2002
27 Posts
0
February 11th, 2011 16:00
I would like it if Dell/EqualLogic would introduce some form of synchronous replication between arrays, as it would definitely help us in many situations. But with dual controllers, power supplies, etc., the units have adequate protection against failure for most situations.
As SirStan posted, the 15-second blip between controller restarts is about the maximum downtime you can expect, and having your iSCSI timeout values set correctly is very important, especially for MSSQL or Exchange.
Mattrst
4 Posts
0
March 12th, 2011 11:00
Obviously, synchronous-write software will cost twice the capacity, but I would recommend it to any sysadmin.
JOHNADCO
2 Intern
•
847 Posts
0
March 12th, 2011 12:00