Unsolved


February 23rd, 2016 14:00

Issues with OneFS 8 multi-node Simulator

We have tried to deploy a OneFS 8 multi-node simulator but are running into issues. We have deployed with the minimum of 3 nodes but see behaviour like:

> Can see the UI using the external IP (2222). Can log in to nodes, but nodes 2 and 3 do not have IPs assigned to them; seen with the command isi status -q (although the cluster has a large enough IP range).

> Sometimes we can't log in to all nodes as root; other times we can log in to all nodes as root but can't log in to the web UI; or we can log in to the web UI but then cannot see all the nodes and their IPs.

Looking for assistance in resolving this issue.

Thanks Dorothy


4 Operator • 1.2K Posts

February 24th, 2016 01:00

First, I'd make sure the cluster is stable as far as the internal network is concerned.

What is the actual output of:

isi status -q

isi network interfaces ls

?
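If the IP assignment itself is in question, the configured ranges can be checked as well (a side suggestion based on the OneFS 8 network CLI as I recall it; the exact syntax may differ):

isi network subnets list

isi network pools list

These should show the configured subnets and IP ranges, i.e. whether the external pool actually has enough addresses for three nodes.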

-- Peter

February 24th, 2016 06:00

Hi Peter - here is the output that you requested.  Thanks Dorothy

eight-B-1# isi status -q

Cluster Name: eight-B

Cluster Health:     [  OK ]

Cluster Storage:  HDD                 SSD Storage

Size:             18.1G (18.1G Raw)   0 (0 Raw)

VHS Size:         0

Used:             326.1M (2%)         0 (n/a)

Avail:            17.8G (98%)         0 (n/a)

                   Health  Throughput (bps)  HDD Storage      SSD Storage

ID |IP Address     |DASR |  In   Out  Total| Used / Size     |Used / Size

---+---------------+-----+-----+-----+-----+-----------------+-----------------

  1|172.16.87.206  | OK  | 169k|12.2k| 182k| 326M/18.1G(  2%)|(No Storage SSDs)

---+---------------+-----+-----+-----+-----+-----------------+-----------------

Cluster Totals:          | 169k|12.2k| 182k| 326M/18.1G(  2%)|(No Storage SSDs)

     Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

eight-B-1# isi network interfaces ls

LNN  Name  Status        Owners                           IP Addresses

-----------------------------------------------------------------------

1    ext-1 Up            groupnet0.subnet0.pool0          172.16.87.206

1    ext-2 Not Available -                                -

1    ext-3 Not Available -                                -

1    ext-4 Not Available -                                -

1    ext-5 Not Available -                                -

1    ext-6 Not Available -                                -

1    int-a Up            internal.int-a-subnet.int-a-pool 172.16.88.157

-----------------------------------------------------------------------

Total: 7

The internal IP range of the cluster is 172.16.88.157 - 172.16.88.159.

The external IP range of the cluster is 172.16.87.206 - 172.16.87.208.

We are able to log in to the web UI at 172.16.87.206.

Two nodes were joined to this cluster, but we cannot log in to them from vCenter as root; we keep getting the message "Login incorrect".

4 Operator • 1.2K Posts

February 24th, 2016 07:00

The isi status -q output should show all three nodes then, even if the other nodes are temporarily down or offline.

Here we see only one node, which means the remaining nodes have not joined yet (or have joined to each other forming a distinct cluster).

Check the virtual network used for the internal connections (172.16.88.*); the nodes must connect to the same private network segment.

With separate ESX hosts (required for IsilonSD Edge, but not for the "traditional" simulator), that segment must span all hosts, obviously with proper cabling.
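If in doubt, one quick way to check this (just a sketch, assuming SSH access to each ESXi host's shell; the exact output columns may differ by version) is to list, on every host carrying a node, which networks the powered-on VMs are attached to:

esxcli network vm list

Each simulator node VM should show the same internal port group under Networks; if one host reports a different (or missing) internal segment for its node, the nodes cannot reach each other over int-a.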

hth

-- Peter

February 24th, 2016 10:00

Hi Peter,

Please see inline

> The isi status -q output should show all three nodes then, even if the other nodes are temporarily down or offline.

So far we have never managed to get all of the joined nodes to show up, either from the CLI or in the web UI.

> Here we see only one node, which means the remaining nodes have not joined yet (or have joined to each other forming a distinct cluster).

What we have noticed: with older versions of Isilon, the names of newly joined nodes would end in -1, -2, and so on; now with Isilon 8.0.0 every node has the same name, ending in -1.

> Check the virtual network used for the internal connections (172.16.88.*); the nodes must connect to the same private network segment.

All the nodes are connected to the same internal network, i.e. Network Adapter 1 on every node is connected to the same network.

> With separate ESX hosts (required for IsilonSD Edge, but not for the "traditional" simulator), that segment must span all hosts, obviously with proper cabling.

The nodes are deployed on different ESX hosts, but those hosts are all in the same cluster and all connected to the same distributed switch.




4 Operator • 1.2K Posts

February 25th, 2016 01:00

That means each node is in fact its own single-node cluster, and none has been "joined" anywhere.

Can you get the internal addresses of the other two nodes, as before from their consoles, with isi network interfaces ls?

int-a Up            internal.int-a-subnet.int-a-pool 172.16.88.157

The next step would be: can the internal addresses be pinged in either direction?
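For example, from the console of node 1 (just a sketch; the .158 and .159 addresses are an assumption, namely that the other nodes picked up the remaining addresses from the int-a pool 172.16.88.157-159):

ping -c 3 172.16.88.158

ping -c 3 172.16.88.159

And the other way around, ping 172.16.88.157 from the consoles of nodes 2 and 3.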

February 25th, 2016 14:00

Hi Peter,

I cannot ping from the node reported in the previous comment to the other 2 nodes. I currently also cannot get a console on the other 2 nodes.

I will try and deploy again and update the post once I have done this.

If you have any advice on the deploy please let me know.

Thank you for your help.

Dorothy

4 Operator • 1.2K Posts

February 29th, 2016 08:00

Dorothy

Not losing contact with the consoles is essential, of course.

The console of a node is needed for joining it to a cluster; this is a simple yet manual confirmation step on the console.

Good luck

-- Peter

March 1st, 2016 13:00

Hi Peter,

I have redeployed 2 OneFS 8 nodes (node-2 and node-3) and tried to join them to the first node that I deployed, with no success. After deploying the simulator, the wizard opens and I select option "2" to join. I then select the node that I want to join to (node-1) and enter the number associated with that node. After that I am not prompted for anything else, and eventually the login prompt appears. When I try to log in with the same credentials that I use for the node that I joined to, the login fails with "Login incorrect".

I notice for node-2, which I joined to node-1, that there appears to be an error just before the login prompt appears:

x VerisignClass3PublicPrimaryCertificationAuthority.pem

tar: Error exit delayed from previous errors.

pw: the group file is busy

Cleared upgrade config

request: STATUS, task:STATUS_SYNC, hash: (null)

Executing script isi_firmware_versions

Save firmware status

completed

I did not see the same for node-3, which I also joined to node-1:

x VerisignClass3PublicPrimaryCertificationAuthority.pem

mkdir: /ifs/.ifsvar/etc/ifs: File exists

mkdir: /ifs/.ifsvar/etc/mcp: File exists

Upgrading Source Records...

Upgrading Target Records...

Upgrading policies...

Upgrading siq global config...

Cleared upgrade config

request: STATUS, task:STATUS_SYNC, hash: (null)

Executing script isi_firmware_versions

Save firmware status

completed

In either case I get the "Login incorrect" message when I try to log in to node-2 or node-3 with the credentials from node-1.

SSH to node-1 is fine, and I executed the commands you requested previously; it looks like node-2 and node-3 are not joined:

zds-isilon-8-1-1# isi status -q

Cluster Name: zds-isilon-8-1

Cluster Health:     [  OK ]

Cluster Storage:  HDD                 SSD Storage

Size:             18.1G (18.1G Raw)   0 (0 Raw)

VHS Size:         0

Used:             213.9M (1%)         0 (n/a)

Avail:            17.9G (99%)         0 (n/a)

                   Health  Throughput (bps)  HDD Storage      SSD Storage

ID |IP Address     |DASR |  In   Out  Total| Used / Size     |Used / Size

---+---------------+-----+-----+-----+-----+-----------------+-----------------

  1|172.16.88.214  | OK  |    0|304.0|304.0| 214M/18.1G(  1%)|(No Storage SSDs)

---+---------------+-----+-----+-----+-----+-----------------+-----------------

Cluster Totals:          |    0|304.0|304.0| 214M/18.1G(  1%)|(No Storage SSDs)

     Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

zds-isilon-8-1-1#

zds-isilon-8-1-1#

zds-isilon-8-1-1# isi network interfaces ls

LNN  Name  Status        Owners                           IP Addresses

-----------------------------------------------------------------------

1    ext-1 Up            groupnet0.subnet0.pool0          172.16.88.214

1    ext-2 Not Available -                                -

1    ext-3 Not Available -                                -

1    ext-4 Not Available -                                -

1    ext-5 Not Available -                                -

1    ext-6 Not Available -                                -

1    int-a Up            internal.int-a-subnet.int-a-pool 172.16.82.83

-----------------------------------------------------------------------

Total: 7

Have I missed a step?

Note that on node-1 I did not define the failover subnet. Could that be related to this problem?

4 Operator • 1.2K Posts

March 2nd, 2016 01:00

Does joining work on the SAME host where the first node is running?

Are you using the "regular" Isilon OneFS 8 simulator (vmdk -> ova converted)

or the IsilonSD Edge (ova based)?
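If in doubt which build it is, the console of any node should tell you (a side note, assuming the simulator reports its version the same way a physical cluster does):

isi version

That should print the full release string, something of the form Isilon OneFS v8.0.0.0 ...(RELEASE).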

-- Peter

March 2nd, 2016 10:00

Hi Peter,

On our vCenter we have multiple ESX hosts, but those hosts are all in the same cluster and are all connected to the same distributed switch. Node-1 is running on a different host than node-2 and node-3, but we thought that should be OK in this configuration.

I have not deployed the SD version - on my cluster I see this version: Isilon OneFS v8.0.0.0 B_8_0_0_037(RELEASE)

4 Operator • 1.2K Posts

March 3rd, 2016 00:00

Dorothy, it might be the case that deploying a OneFS simulator cluster across a cluster of ESX hosts isn't supposed to work unless IsilonSD Edge is used -- at least all references for the regular simulator that I checked talk about using a single host, be it ESX, Workstation or Player.

Have you checked whether a OneFS 8.0 simulator cluster works in your environment when all OneFS nodes are deployed on the same ESX host?

And, approaching from the other side, can you deploy an IsilonSD Edge cluster to your ESX cluster, one Isilon node per host?

Cheers

-- Peter
