Weird issue with 7.1.1.4 Isilon Simulator

Question

Two nodes (2 and 5) dropped out of the cluster. When I log into either node 2 or node 5, it shows nodes 2 and 5 as up but nodes 1, 3, 4, and 6 as down. The larger group of four nodes has /ifs mounted; the group of two does not. I tried individual node reboots and a full cluster restart to see whether they would straighten themselves out, but no luck.

FROM NODE 1:

vmlxisilon-1# isi status
Cluster Name: vmlxisilon
Cluster Health:     [ ATTN]
Cluster Storage: HDD                 SSD Storage
Size:             23G (45G Raw)       0 (0 Raw)
VHS Size:         23G
Used:             17G (77%)           0 (n/a)
Avail:            5.3G (23%)          0 (n/a)

                   Health Throughput (bps) HDD Storage      SSD Storage
ID |IP Address     |DASR | In   Out Total| Used / Size     |Used / Size
-------------------+-----+-----+-----+-----+-----------------+-----------------
  1|10.168.50.120  | OK  |    0|   24|   24| 4.3G/ 5.7G( 76%)|(No Storage SSDs)
  2|10.168.50.121  |D--- |  n/a|  n/a|  n/a|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)
  3|10.168.50.122  | OK  | 153K|   24| 153K| 4.3G/ 5.7G( 77%)|(No Storage SSDs)
  4|10.168.50.123  | OK  |    0|   32|   32| 4.3G/ 5.7G( 77%)|(No Storage SSDs)
  5|10.168.50.124  |D--- |  n/a|  n/a|  n/a|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)
  6|10.168.50.125  | OK  |    0|   24|   24| 4.3G/ 5.7G( 76%)|(No Storage SSDs)
-------------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals:          | 153K|  104| 153K|  17G/  23G( 77%)|(No Storage SSDs)

Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

FROM NODE 2:

vmlxisilon-2# isi status
Cluster Name: vmlxisilon
Cluster Health:     [ ATTN]
Cluster Storage: HDD                 SSD Storage
Size:             n/a (n/a Raw)       n/a (n/a Raw)
VHS Size:         0
Used:             n/a (n/a)           n/a (n/a)
Avail:            n/a (n/a)           0 (n/a)

                   Health Throughput (bps) HDD Storage      SSD Storage
ID |IP Address     |DASR | In   Out Total| Used / Size     |Used / Size
-------------------+-----+-----+-----+-----+-----------------+-----------------
  1|10.168.50.120  |D--- |  n/a|  n/a|  n/a|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)
  2|10.168.50.121  | OK  |    0|    0|    0| 4.2G/  n/a( n/a)|    0/  n/a( n/a)
  3|10.168.50.122  |D--- |  n/a|  n/a|  n/a|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)
  4|10.168.50.123  |D--- |  n/a|  n/a|  n/a|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)
  5|10.168.50.124  | OK  |    0|    0|    0| 4.2G/  n/a( n/a)|    0/  n/a( n/a)
  6|10.168.50.125  |D--- |  n/a|  n/a|  n/a|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)
-------------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals:          |    0|    0|    0|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)

Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

Any ideas?

Peter_Sero · Answer

It looks like the internal network has been split, e.g. nodes 2 and 5 were moved to another host or LAN segment.

Can you check this at the VM level?

At the Isilon level, what is the output of:

# isi_eth_mixer_d showlayout

-- Peter
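
As a rough illustration of why the two status views in the question point to a split, here is a minimal Python sketch. It reads the node rows of two `isi status` outputs, collects the nodes each side reports as up, and checks whether the two views are disjoint and which side holds a majority. The helper names (`up_nodes`, `has_majority`) and the abridged sample rows are made up for this example, and the majority rule (more than half of all nodes) is an assumption about how quorum is decided, not something confirmed in this thread.

```python
import re

# Matches the start of a node row: "<id>|<ip address> |<health field> |"
ROW = re.compile(r"^\s*(\d+)\s*\|\s*(\d+\.\d+\.\d+\.\d+)\s*\|\s*([^|]*?)\s*\|")


def up_nodes(status_text):
    """Return the IDs of nodes whose health field does not start with D (Down)."""
    up = set()
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, or totals line
        node_id, health = int(m.group(1)), m.group(3)
        if not health.startswith("D"):
            up.add(node_id)
    return up


def has_majority(group, total_nodes):
    """Assumed quorum rule: strictly more than half of all nodes."""
    return len(group) > total_nodes // 2


# Abridged node rows taken from the two outputs in the question.
NODE1_VIEW = """
1|10.168.50.120 | OK | 0| 24| 24|
2|10.168.50.121 |D--- | n/a| n/a| n/a|
3|10.168.50.122 | OK | 153K| 24| 153K|
4|10.168.50.123 | OK | 0| 32| 32|
5|10.168.50.124 |D--- | n/a| n/a| n/a|
6|10.168.50.125 | OK | 0| 24| 24|
"""

NODE2_VIEW = """
1|10.168.50.120 |D--- | n/a| n/a| n/a|
2|10.168.50.121 | OK | 0| 0| 0|
3|10.168.50.122 |D--- | n/a| n/a| n/a|
4|10.168.50.123 |D--- | n/a| n/a| n/a|
5|10.168.50.124 | OK | 0| 0| 0|
6|10.168.50.125 |D--- | n/a| n/a| n/a|
"""

side_a = up_nodes(NODE1_VIEW)
side_b = up_nodes(NODE2_VIEW)

print("view from node 1:", sorted(side_a))
print("view from node 2:", sorted(side_b))
print("views are disjoint:", side_a.isdisjoint(side_b))
print("node 1 side has majority:", has_majority(side_a, 6))
print("node 2 side has majority:", has_majority(side_b, 6))
```

With the rows above, the two sides see no nodes in common. Only the four-node side passes the majority check, which matches the observation that it is the group with /ifs mounted.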

Kona2000 · Answer

Peter-

You were absolutely correct: nodes 2 and 5 had been moved by accident to another VM host. I moved them back, manually purged the CELog, and I'm back in business. Thank you for the tip.

Cluster Name: vmlxisilon
Cluster Health:     [ OK ]
Cluster Storage: HDD                 SSD Storage
Size:             39G (68G Raw)       0 (0 Raw)
VHS Size:         29G
Used:             27G (69%)           0 (n/a)
Avail:            12G (31%)           0 (n/a)

                   Health Throughput (bps) HDD Storage      SSD Storage
ID |IP Address     |DASR | In   Out Total| Used / Size     |Used / Size
-------------------+-----+-----+-----+-----+-----------------+-----------------
1|10.168.50.120 | OK | 428| 164K| 165K| 1.5G/ 6.4G( 23%)|(No Storage SSDs)
2|10.168.50.121 | OK |    0|   33|   33| 4.4G/ 6.4G( 69%)|(No Storage SSDs)
3|10.168.50.122 | OK | 706K| 509| 706K| 4.4G/ 6.4G( 69%)|(No Storage SSDs)
4|10.168.50.123 | OK | 285|    0| 285| 4.4G/ 6.4G( 69%)|(No Storage SSDs)
5|10.168.50.124 | OK | 214|   33| 247| 7.4G/ 6.4G(> 99%)|(No Storage SSDs)
6|10.168.50.125 | OK | 214| 496| 710| 4.4G/ 6.4G( 69%)|(No Storage SSDs)
-------------------+-----+-----+-----+-----+-----------------+-----------------
Cluster Totals:          | 707K| 165K| 872K| 27G/ 39G( 69%)|(No Storage SSDs)

Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

Critical Events:

Cluster Job Status:

No running jobs.

No paused or waiting jobs.

No failed jobs.

Recent job results:
Time Job Event
--------------- -------------------------- ------------------------------
