June 22nd, 2016 13:00

Cluster stand up issues

We received a 3-node NL400. We powered on the first node, let it come up, and were able to completely configure it and create a cluster. We powered on a second node, let it come up, and configured it to join the newly created cluster. During this process, we discovered a bad IB cable on the second node, and we powered it off with the power button on the back of the node. When the second node rebooted, it didn't grab an IP address, and its devices (disk drives) show a status of [NEW]. The second node appears to be stuck in this state, and I can't add the third node to the cluster because there is no quorum. I will add screenshots/output. I am looking for solutions that will not require a re-image of the nodes, as this is a remote site.


June 22nd, 2016 13:00

Here is the recent output:

login as: root

Using keyboard-interactive authentication.

Password:

*** Warning: Auth Service is Unavailable ***

Last login: Mon May  2 06:41:33 2016 from xx.xx.xx.xxx

Copyright (c) 2001-2014 EMC Corporation. All Rights Reserved.

Copyright (c) 1992-2011 The FreeBSD Project.

Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994

        The Regents of the University of California. All rights reserved.

Isilon OneFS v7.2.1.2

ats-isilon2-1# isi stat

Warning: Cluster does not have quorum

Cluster Name: ats-isilon2

Cluster Health:    [ ATTN]

Cluster Storage:  HDD                SSD Storage

Size:            n/a (n/a Raw)      n/a (n/a Raw)

VHS Size:        0

Used:            n/a (n/a)          n/a (n/a)

Avail:            n/a (n/a)          0 (n/a)

                  Health  Throughput (bps)  HDD Storage      SSD Storage

ID |IP Address    |DASR |  In  Out  Total| Used / Size    |Used / Size

-------------------+-----+-----+-----+-----+-----------------+-----------------

  1|xx.xx.xx.xxx  |-A-- |    0|    0|    0|  65M/  n/a( n/a)|    0/  n/a( n/a)

  2|None          |D--- |    0|    0|    0|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)

-------------------+-----+-----+-----+-----+-----------------+-----------------

Cluster Totals:          |    0|    0|    0|  n/a/  n/a( n/a)|  n/a/  n/a( n/a)

    Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

Critical Events:

Event DB temporarily unavailable. Retry in 30 seconds.

Cluster Job Status:

Job status temporarily unavailable.

ats-isilon2-1#

ats-isilon2-1#

ats-isilon2-1#

ats-isilon2-1# isi_for_array isi device

ats-isilon2-2: Node 2, [DOWN]

ats-isilon2-2:  Bay 1        Lnum N/A    [NEW]          SN:PN2331PAK0NNJT      /dev/da1

ats-isilon2-2:  Bay 2        Lnum N/A    [NEW]          SN:PN2331PAK02MYT      /dev/da2

ats-isilon2-2:  Bay 3        Lnum N/A    [NEW]          SN:PN2331PAJZ9AYT      /dev/da19

. . .

ats-isilon2-2:  Bay 34      Lnum N/A    [NEW]          SN:PN2331PAK0JHMT      /dev/da16

ats-isilon2-2:  Bay 35      Lnum N/A    [NEW]          SN:PN2331PAK0JH5T      /dev/da17

ats-isilon2-2:  Bay 36      Lnum N/A    [NEW]          SN:PN2331PAJZZLZT      /dev/da18

ats-isilon2-1: Node 1, [ATTN]

ats-isilon2-1:  Bay 1        Lnum 35      [HEALTHY]      SN:PN2334PBH1NKMR      /dev/da1

ats-isilon2-1:  Bay 2        Lnum 34      [HEALTHY]      SN:PN2334PBH7ZYVR      /dev/da2

ats-isilon2-1:  Bay 3        Lnum 17      [HEALTHY]      SN:PN2334PBGDY5ET      /dev/da19

. . .

ats-isilon2-1:  Bay 34      Lnum 20      [HEALTHY]      SN:PN2334PBH8PNWT      /dev/da16

ats-isilon2-1:  Bay 35      Lnum 19      [HEALTHY]      SN:PN2334PBH7YBER      /dev/da17

ats-isilon2-1:  Bay 36      Lnum 18      [HEALTHY]      SN:PN2334PBH8S2ET      /dev/da18

ats-isilon2-1#


I am looking for a procedure to get Node 2 and its drives healthy.

June 22nd, 2016 21:00

The fastest fix is probably to reformat node 2, especially since there's no data on it.

If you can connect to the serial port, sign on and do:

# isi_reformat_node

Hopefully the node just smartens up and comes up healthy. I've had nodes come up with a single drive in the [NEW] state, and you can just add it back with: isi devices -d 2:1 -a add


June 23rd, 2016 06:00

Thanks, Ed. I hope it smartens up, too. I will check with management to see if we can try manually adding the drives, but I think the drive state is just a symptom of a bigger issue: as you can see, node 2 has no IP address, and therefore I cannot log into that node individually. Perplexing.


September 15th, 2016 08:00

We ended up re-imaging the nodes to OneFS v8.0.  All is now well.
