11 Posts

5483

November 12th, 2014 09:00

Quorum in Isilon

Hi,

I was reading about how quorum prevents data conflicts. I have the following questions -

1. If I have 6 nodes in my cluster, how do I decide which 4 nodes should be quorum?

2. What is the case if I have only 2 nodes? Is it necessary that both should be quorum?

3. When I execute the command 'sysctl' (as given in the admin guide), I do not see the status of types of quorum, I only see details about the command 'sysctl'.

Thanks and regards,

Gaurav

254 Posts

November 12th, 2014 10:00

All nodes participate in the quorum.  There are no dedicated "quorum nodes".  Quorum as Isilon defines it is a majority of the nodes in the cluster.  So if you have a 6-node cluster, at least 4 are required for quorum.  If fewer than 4 nodes are connected in a cluster that is known to have 6 nodes, the cluster will be read-only until quorum is re-established.
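The majority rule above can be written as a little arithmetic. This is only a sketch of the rule as described in this reply; the function names are made up for illustration and are not OneFS commands or APIs.

```python
def quorum_size(total_nodes: int) -> int:
    """Smallest node count that is a strict majority of the cluster."""
    return total_nodes // 2 + 1


def has_quorum(connected_nodes: int, total_nodes: int) -> bool:
    """True if the connected nodes form a majority of the known cluster."""
    return connected_nodes >= quorum_size(total_nodes)


# Print the minimum number of connected nodes for a few cluster sizes.
for total in (3, 6, 7):
    print(f"{total}-node cluster needs {quorum_size(total)} nodes for quorum")
```

Note that an even-sized cluster needs one more than half: 3 of 6 nodes is exactly 50%, which is not a majority.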

There is no valid Isilon config with only 2 nodes.  The smallest Isilon config is 3 nodes.  If you lost a node, the 2 remaining nodes would still meet quorum because 2 of the 3 nodes are alive and connected.

You don't need any commands to set or report on quorum.  It is handled internally by OneFS.  There are no settings that I'm aware of that the user needs to view or change to handle quorum.

1 Rookie

 • 

106 Posts

November 12th, 2014 13:00

One thing to add is that quorum is also a component within node pools.  If you have multiple pools, you need at least 3 nodes to make the pool, and then need to maintain the quorum within the pool to keep it operational. 

So if you have 10 nodes - a 7-node pool and a 3-node pool - and you shut down 2 of the nodes in the 3-node pool, that pool will no longer have quorum, so the remaining node will go into read-only mode, while the 7-node pool remains fully operational.
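To make the per-pool behaviour concrete, here is a rough sketch that applies the same majority rule to each pool separately. The pool names and the function are hypothetical and only mirror the 7-node / 3-node example above; this is not how OneFS reports pool state.

```python
def pool_status(pool_sizes, online_counts):
    """Apply the majority rule to each node pool independently.

    pool_sizes:    pool name -> number of nodes the pool is known to have
    online_counts: pool name -> number of those nodes currently online
    """
    status = {}
    for name, total in pool_sizes.items():
        online = online_counts.get(name, 0)
        # A pool keeps quorum only with a strict majority of its own nodes.
        status[name] = "read-write" if online > total / 2 else "read-only"
    return status


# Hypothetical pool names; 2 of the 3 nodes in the small pool are shut down.
sizes = {"pool-7": 7, "pool-3": 3}
online = {"pool-7": 7, "pool-3": 1}
print(pool_status(sizes, online))
```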

6 Operator

 • 

2.8K Posts

November 12th, 2014 17:00

Quorum is defined as more than 50% of the nodes being online and available to read and write data. If you have a 6-node cluster, quorum requires at least 4 nodes to be online. If you have a 7-node cluster, quorum also requires at least 4 nodes to be online. The other nodes may be in read-only mode and are considered to be split from the cluster. This means that data can only be written to the part of the cluster that has quorum; writing different data to different parts of a split cluster would result in cluster inconsistency.

When you encounter a cluster split, here are some ways to resolve it:

1. Most of the time, rebooting the cluster resolves a node split.

2. Reconnect the cables of the node that has stopped communicating with the cluster.

3. If nodes have recently been added or removed, confirm that they are connected correctly.

4. If there is a hardware error, you may be able to replace the components that failed.

11 Posts

November 13th, 2014 06:00

Thank you all ☺

At Jeffrey,

If there are fewer than 4 nodes online in a cluster that is known to have 6 nodes, will the cluster be in ‘read-only’ mode or ‘write-only’ mode? In the previous response, Adam says the nodes will be in ‘read-only’ mode, whereas your response suggests ‘write-only’. Please clarify.

Br,

Gaurav

1 Rookie

 • 

106 Posts

November 13th, 2014 07:00

Perhaps the wording was a bit confusing.  In the event of loss of quorum, a cluster or node pool will go immediately to read-only mode.  To prevent conflicts in the stored data, no write activity will resume until quorum is restored.

Think again of a 7-node cluster.  Split it with a bad IB cable, so you have 3 nodes working together on one side and 4 nodes working together on the other.  Ostensibly these are the same cluster with the same data sets, and the nodes can still take incoming connections on the external network interfaces.  In order to protect data, only the 4 nodes working together will still be able to write data, while the split-off 3 nodes will go to a read-only state.

After fixing the IB connection to restore the 7 node communication - the 3 nodes that had split will be updated on the state of data and will resume read/write activity. 
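The split described above can be sketched as a small classification: given the groups of nodes that can still talk to each other, only a group holding a strict majority of the whole cluster stays writable. The function name and the node numbering are assumptions for illustration, not anything OneFS exposes.

```python
def classify_split(groups):
    """Label each side of a split cluster as read-write or read-only.

    groups: list of sets of node IDs; each set is one side of the split.
    """
    total = sum(len(group) for group in groups)
    result = []
    for group in groups:
        # Only a strict majority of the full cluster may keep writing.
        mode = "read-write" if len(group) > total / 2 else "read-only"
        result.append((sorted(group), mode))
    return result


# 7-node cluster split 4 / 3: only the side with 4 nodes keeps writing.
print(classify_split([{1, 2, 3, 4}, {5, 6, 7}]))

# Even split of a 6-node cluster: neither side has a majority.
print(classify_split([{1, 2, 3}, {4, 5, 6}]))
```

Because at most one side can ever hold a strict majority, two sides can never both accept writes, which is what prevents conflicting data.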

11 Posts

November 13th, 2014 08:00

Thank you Chris. My concepts are now clear ☺

Br,

Gaurav Parekh

6 Operator

 • 

2.8K Posts

November 23rd, 2014 20:00

Hi gaurav,

Back to your question: if you have a 6-node cluster and it runs into a network issue that leaves 3 nodes working together on one side and 3 nodes on the other, neither side has a majority, so both sides will be in "read-only" mode.

6 Operator

 • 

1.2K Posts

November 23rd, 2014 21:00

I always wondered how to break an IB switch into two pieces, both still functioning...

OK, with two redundant switches, and removing some cables, one might get into such a situation.

But actually, it could get worse, e.g. for a 6 node cluster:

switch 1: nodes 1,2,3,4 online, quorum ok

switch 2: nodes 3,4,5,6 online, quorum ok

How does the cluster behave now?

-- Peter

6 Operator

 • 

2.8K Posts

November 24th, 2014 01:00

Hi Peter,

We gave these examples to explain what quorum is in Isilon. There are several things that can cause a cluster split:

  • cabling issues
  • software bugs
  • configuration changes
  • hardware errors

In general, when more than one node splits from the cluster, each one works as an individual "read-only" node. So these examples are just explaining the concept of quorum.



Back to your question: we can configure an internal switch as a failover network to provide redundancy, but it doesn't increase performance.  In your example, if switch 1 is primary and nodes 1, 2, 3 and 4 are working well on the internal-a network, and nodes 5 and 6 suddenly lose their connection to internal-a, then nodes 1, 2, 3 and 4 will use the internal-b network to communicate with nodes 5 and 6. Meanwhile, nodes 1, 2, 3 and 4 keep using internal-a to communicate with each other. Nodes 1, 2, 3, 4, 5 and 6 have quorum.

6 Operator

 • 

1.2K Posts

November 24th, 2014 02:00

Will the IB failover only occur if the primary is completely down, or when the secondary

has "better" connectivity than the primary?

Like nodes 1,2,3,4 (quorum) connected on primary,

but 2,3,4,5,6 (quorum, but at higher count) on secondary?

6 Operator

 • 

2.8K Posts

November 25th, 2014 01:00

Hi Peter,

The failover function is automatically enabled when the following conditions are met:

  • IP address ranges are configured on separate subnets for the int-a, int-b and failover networks.
  • The int-b interface is enabled.


I should correct my last answer. Let me get back to your example, where we have a 6-node cluster:

switch 1: nodes 1,2,3,4 online, quorum ok

switch 2: nodes 3,4,5,6 online, quorum ok

How does the cluster behave now? It depends on which nodes went offline from the cluster last. The specific cases are as follows:

  • switch 1 as primary

        nodes 1,2,3,4,5,6 online ------> nodes 5 and 6 go offline from switch 1 ------> nodes 5 and 6 communicate with nodes 1,2,3,4 through switch 2, while nodes 1,2,3,4 communicate with each other through switch 1 ------> nodes 1 and 2 go offline from switch 2 ------> nodes 3,4,5,6 have quorum; nodes 1 and 2 will be in "read-only" status.

  • switch 2 as primary

        nodes 1,2,3,4,5,6 online ------> nodes 5 and 6 go offline from switch 1 ------> nodes 1,2,3,4,5,6 all communicate well through switch 2 ------> nodes 1 and 2 go offline from switch 2 ------> nodes 3,4,5,6 have quorum; nodes 1 and 2 will be in "read-only" status.
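For readers who want to see why the end state is not obvious from node counts alone, the sketch below models only the final connectivity from the example (switch 1 reaching nodes 1-4, switch 2 reaching nodes 3-6). It is an illustration, not OneFS's group-membership logic: it assumes two nodes can talk directly only if they share a switch, and all names are made up.

```python
from itertools import combinations

# Which nodes are still attached to which switch, per the example above.
SWITCHES = {
    "switch 1": {1, 2, 3, 4},
    "switch 2": {3, 4, 5, 6},
}
ALL_NODES = set().union(*SWITCHES.values())


def can_talk(a, b):
    """Assumption: two nodes talk directly if some switch connects both."""
    return any(a in members and b in members for members in SWITCHES.values())


# Node pairs with no switch in common.
unreachable = [
    pair for pair in combinations(sorted(ALL_NODES), 2) if not can_talk(*pair)
]

# Majority of the 6 known nodes, and the switch groups large enough to meet it.
quorum = len(ALL_NODES) // 2 + 1
candidates = [name for name, members in SWITCHES.items() if len(members) >= quorum]

print("pairs that cannot talk:", unreachable)
print("quorum size:", quorum, "- groups large enough:", candidates)
```

Both groups of four are large enough on paper, but nodes 1 and 2 cannot reach nodes 5 and 6, so all six cannot form one group. According to the sequences described above, the cluster ends up with nodes 3,4,5,6 holding quorum and nodes 1 and 2 read-only.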


6 Operator

 • 

1.2K Posts

November 25th, 2014 20:00

Thanks a lot, Jeffrey. This is really mind-boggling...


"node 5 and 6 communicate with node 1,2,3,4 by switch 2, but communication between node 1,2,3,4 using switch 1"


So IB switch failover is not all-or-nothing, but both switches can be active in certain situations? That's cool.


And do I get it correctly, in the two scenarios we end with the same result, nodes 1 and 2 are read-only?


Cheers


-- Peter



76 Posts

November 26th, 2014 00:00

Peter_Sero wrote:

So IB switch failover is not all-or-nothing, but both switches can be active in certain situations? That's cool.

I think the important thing to remember about IB on Isilon is that it's a fabric.  In dual-IB setups, both switches are active, but it's up to the node to determine which one will be the primary for any given connection to another node.  There's a command that will show this on the cluster, isi_eth_mixer_d showlayout:

cluster# isi_eth_mixer_d showlayout

  ->  1    2    3    4    5    6    7    8  12  13  14  15  16  17  18  19  20  21  22

  1 XXXX int-a int-b int-a int-b int-a int-b int-a int-b int-a int-b int-a int-b int-b int-a int-a int-b int-b int-a          9  9

  2 int-a XXXX int-a int-b int-b int-a int-b int-a int-a int-b int-b int-b int-a int-a int-b int-a int-b int-b int-a          9  9

  3 int-b int-a XXXX int-a int-a int-b int-a int-b int-a int-b int-b int-a int-b int-a int-b int-b int-a int-a int-b          9  9

>> 4 int-a int-b int-a XXXX int-a int-b int-b int-b int-a int-a int-b int-a int-a int-b int-b int-a int-b int-b int-a          9  9

  5 int-b int-b int-a int-a XXXX int-a int-b int-a int-b int-a int-a int-a int-a int-a int-b int-b int-b int-b int-b          9  9

  6 int-a int-a int-b int-b int-a XXXX int-a int-b int-b int-b int-a int-b int-a int-b int-a int-b int-b int-a int-a          9  9

  7 int-b int-b int-a int-b int-b int-a XXXX int-a int-a int-b int-b int-a int-b int-a int-a int-a int-b int-b int-a          9  9

  8 int-a int-a int-b int-b int-a int-b int-a XXXX int-a int-b int-a int-b int-b int-b int-a int-b int-a int-b int-a          9  9

  12 int-b int-a int-a int-a int-b int-b int-a int-a XXXX int-a int-b int-b int-b int-b int-a int-a int-b int-a int-b          9  9

  13 int-a int-b int-b int-a int-a int-b int-b int-b int-a XXXX int-a int-b int-a int-b int-a int-a int-a int-b int-b          9  9

  14 int-b int-b int-b int-b int-a int-a int-b int-a int-b int-a XXXX int-a int-b int-a int-b int-b int-a int-a int-a          9  9

  15 int-a int-b int-a int-a int-a int-b int-a int-b int-b int-b int-a XXXX int-a int-a int-b int-b int-a int-a int-b        10  8 <--

  16 int-b int-a int-b int-a int-a int-a int-b int-b int-b int-a int-b int-a XXXX int-a int-b int-b int-a int-a int-b          9  9

  17 int-b int-a int-a int-b int-a int-b int-a int-b int-b int-b int-a int-a int-a XXXX int-a int-b int-b int-b int-a          9  9

  18 int-a int-b int-b int-b int-b int-a int-a int-a int-a int-a int-b int-b int-b int-a XXXX int-a int-a int-b int-b          9  9

  19 int-a int-a int-b int-a int-b int-b int-a int-b int-a int-a int-b int-b int-b int-b int-a XXXX int-a int-a int-b          9  9

  20 int-b int-b int-a int-b int-b int-b int-b int-a int-b int-a int-a int-a int-a int-b int-a int-a XXXX int-a int-b          9  9

  21 int-b int-b int-a int-b int-b int-a int-b int-b int-a int-b int-a int-a int-a int-b int-b int-a int-a XXXX int-a          9  9

  22 int-a int-a int-b int-a int-b int-a int-a int-a int-b int-b int-a int-b int-b int-a int-b int-b int-b int-a XXXX          9  9

                                                              totals 172 170 <--


Looking at this output, for this cluster (19 nodes, with node IDs running up to 22), you can see exactly how each node selected an interface for its connection to each other node.  Let's say that node 19's int-a were to go down.  It would fail over to int-b and use int-b to connect to the nodes in the cluster that it had previously reached over int-a.
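As a rough illustration of that failover, the sketch below takes the header and node 19's row copied from the output above and moves every int-a link to int-b. The parsing and the `fail_over` helper are assumptions made for this example; they are not part of isi_eth_mixer_d or any OneFS tool.

```python
from collections import Counter

# Header and node 19's row, copied from the showlayout output above.
HEADER = "1 2 3 4 5 6 7 8 12 13 14 15 16 17 18 19 20 21 22"
ROW_19 = (
    "int-a int-a int-b int-a int-b int-b int-a int-b int-a int-a "
    "int-b int-b int-b int-b int-a XXXX int-a int-a int-b"
)


def parse_row(header, row):
    """Map each peer node ID to the interface used to reach it (self is skipped)."""
    peers = [int(p) for p in header.split()]
    links = row.split()
    return {peer: link for peer, link in zip(peers, links) if link != "XXXX"}


def fail_over(layout, failed, fallback):
    """Return a new layout where every link on `failed` moves to `fallback`."""
    return {
        peer: (fallback if link == failed else link)
        for peer, link in layout.items()
    }


before = parse_row(HEADER, ROW_19)
after = fail_over(before, failed="int-a", fallback="int-b")

print("before:", dict(Counter(before.values())))
print("after: ", dict(Counter(after.values())))
```

Before the failure the row is balanced at 9 links per interface, matching the "9  9" at the end of node 19's line; afterwards all 18 links run over int-b.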

6 Operator

 • 

1.2K Posts

November 26th, 2014 01:00

Ah! That shows a very elegant and powerful design, and

builds even more trust in the extreme level of resilience achieved this way.

Thanks for sharing the "larger picture"!

-- Peter

6 Operator

 • 

2.8K Posts

December 1st, 2014 19:00

Hi Bernie,

Thanks for your input, great command to share!
