The_DI

11 Posts

4043

June 22nd, 2016 13:00

Cluster stand up issues

We received a three-node NL400. We powered on the first node, let it come up, and were able to fully configure it and create a cluster. We then powered on the second node, let it come up, and configured it to join the newly created cluster. During this process we discovered a bad IB cable on the second node, so we powered the node off with the power button on the back. When the second node came back up, it did not pick up an IP address, and its devices (disk drives) show a status of [NEW]. The second node appears to be stuck in this state, and I can't add the third node to the cluster because there is no quorum. I will add screenshots/output. I am looking for solutions that do not require re-imaging the nodes, as this is a remote site.
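For anyone reading this later, the quorum problem comes down to majority arithmetic. As I understand it, OneFS needs more than half of the cluster's member nodes to be up, so a two-member cluster with one node down has no quorum. A minimal sketch of that arithmetic, assuming a strict-majority rule (the function names are made up for illustration and are not an Isilon API):

```python
def quorum_size(member_nodes):
    """Smallest node count that is a strict majority of the members."""
    return member_nodes // 2 + 1


def has_quorum(member_nodes, nodes_up):
    """True if enough member nodes are up to form a strict majority."""
    return nodes_up >= quorum_size(member_nodes)


# One-node cluster, node 1 up: a majority of one.
print(has_quorum(1, 1))
# Node 2 has joined but is down: 1 of 2 is not more than half.
print(has_quorum(2, 1))
```

Under that assumption, the cluster lost quorum the moment node 2 became a member and then went down, which matches the warning in the isi stat output.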

Responses(4)

The_DI

11 Posts

0

June 22nd, 2016 13:00

Here is the recent output:

login as: root

Using keyboard-interactive authentication.

Password:

*** Warning: Auth Service is Unavailable ***

Last login: Mon May 2 06:41:33 2016 from xx.xx.xx.xxx

Isilon OneFS v7.2.1.2

ats-isilon2-1# isi stat

Warning: Cluster does not have quorum

Cluster Name: ats-isilon2

Cluster Health: [ ATTN]

Cluster Storage: HDD SSD Storage

Size: n/a (n/a Raw) n/a (n/a Raw)

VHS Size: 0

Used: n/a (n/a) n/a (n/a)

Avail: n/a (n/a) 0 (n/a)

Health Throughput (bps) HDD Storage SSD Storage

-------------------+-----+-----+-----+-----+-----------------+-----------------

1|xx.xx.xx.xxx |-A-- | 0| 0| 0| 65M/ n/a( n/a)| 0/ n/a( n/a)

2|None |D--- | 0| 0| 0| n/a/ n/a( n/a)| n/a/ n/a( n/a)

-------------------+-----+-----+-----+-----+-----------------+-----------------

Cluster Totals: | 0| 0| 0| n/a/ n/a( n/a)| n/a/ n/a( n/a)

Health Fields: D = Down, A = Attention, S = Smartfailed, R = Read-Only

Critical Events:

Event DB temporarily unavailable. Retry in 30 seconds.

Cluster Job Status:

Job status temporarily unavailable.

ats-isilon2-1#

ats-isilon2-1# isi_for_array isi device

ats-isilon2-2: Node 2, [DOWN]

ats-isilon2-2: Bay 1 Lnum N/A [NEW] SN:PN2331PAK0NNJT /dev/da1

ats-isilon2-2: Bay 2 Lnum N/A [NEW] SN:PN2331PAK02MYT /dev/da2

ats-isilon2-2: Bay 3 Lnum N/A [NEW] SN:PN2331PAJZ9AYT /dev/da19

. . .

ats-isilon2-2: Bay 34 Lnum N/A [NEW] SN:PN2331PAK0JHMT /dev/da16

ats-isilon2-2: Bay 35 Lnum N/A [NEW] SN:PN2331PAK0JH5T /dev/da17

ats-isilon2-2: Bay 36 Lnum N/A [NEW] SN:PN2331PAJZZLZT /dev/da18

ats-isilon2-1: Node 1, [ATTN]

ats-isilon2-1: Bay 1 Lnum 35 [HEALTHY] SN:PN2334PBH1NKMR /dev/da1

ats-isilon2-1: Bay 2 Lnum 34 [HEALTHY] SN:PN2334PBH7ZYVR /dev/da2

ats-isilon2-1: Bay 3 Lnum 17 [HEALTHY] SN:PN2334PBGDY5ET /dev/da19

. . .

ats-isilon2-1: Bay 34 Lnum 20 [HEALTHY] SN:PN2334PBH8PNWT /dev/da16

ats-isilon2-1: Bay 35 Lnum 19 [HEALTHY] SN:PN2334PBH7YBER /dev/da17

ats-isilon2-1: Bay 36 Lnum 18 [HEALTHY] SN:PN2334PBH8S2ET /dev/da18

ats-isilon2-1#

I am looking for a procedure to get Node 2 and its drives healthy.
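In case it helps with comparing the two nodes, here is a rough sketch that counts drive states per node from saved isi_for_array isi device output. It assumes the line layout shown above (node name, Bay, Lnum, bracketed state, SN, device path); the regular expression and the summarize function are my own and have only been checked against the sample lines pasted here.

```python
import re
from collections import Counter, defaultdict

# Matches lines such as:
# ats-isilon2-2: Bay 1 Lnum N/A [NEW] SN:PN2331PAK0NNJT /dev/da1
BAY_LINE = re.compile(
    r"^(?P<node>\S+): Bay\s+(?P<bay>\d+)\s+Lnum\s+(?P<lnum>\S+)\s+"
    r"\[(?P<state>[A-Z]+)\]\s+SN:(?P<serial>\S+)\s+(?P<dev>\S+)$"
)


def summarize(text):
    """Return {node: {state: count}} for every bay line in the output."""
    counts = defaultdict(Counter)
    for line in text.splitlines():
        match = BAY_LINE.match(line.strip())
        if match:  # node header lines and blank lines are skipped
            counts[match["node"]][match["state"]] += 1
    return {node: dict(states) for node, states in counts.items()}


SAMPLE = """\
ats-isilon2-2: Node 2, [DOWN]
ats-isilon2-2: Bay 1 Lnum N/A [NEW] SN:PN2331PAK0NNJT /dev/da1
ats-isilon2-2: Bay 2 Lnum N/A [NEW] SN:PN2331PAK02MYT /dev/da2
ats-isilon2-1: Node 1, [ATTN]
ats-isilon2-1: Bay 1 Lnum 35 [HEALTHY] SN:PN2334PBH1NKMR /dev/da1
"""

print(summarize(SAMPLE))
```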


Anonymous User

170 Posts

1

June 22nd, 2016 21:00

The fastest fix is probably to reformat node 2, especially since there's no data on it.

If you can connect to the serial port, log in and run:

# isi_reformat_node

Hopefully the node just smartens up and comes up healthy. I've had nodes come up with a single drive in the NEW state, and in that case you can just add the drive with isi devices -d 2:1 -a add
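If it came to adding every bay by hand, something like this would print the commands for review instead of typing them 36 times. This is only a dry-run sketch: it reuses the exact command form above, assumes node 2 with bays 1 through 36 as in the posted output, and runs nothing. I have only used that command for a single NEW drive, so whether it is appropriate when every drive on a node is NEW is an open question. The add_commands helper is hypothetical.

```python
def add_commands(node, bays):
    """Build one 'isi devices ... -a add' command string per bay (dry run)."""
    return [f"isi devices -d {node}:{bay} -a add" for bay in bays]


# Node 2, bays 1..36, taken from the device listing in this thread.
for command in add_commands(2, range(1, 37)):
    print(command)  # review by hand; nothing is executed here
```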

The_DI

11 Posts

0

June 23rd, 2016 06:00

Thanks, Ed. I hope it smartens up, too. I will check with management to see if we can try to manually add the drives, but I think the NEW drive state is just a symptom of a bigger issue. As you can see, node 2 has no IP address, so I cannot log in to that node individually. Perplexing.

The_DI

11 Posts

0

September 15th, 2016 08:00

We ended up re-imaging the nodes to OneFS v8.0. All is now well.
