andan02
1 Nickel

MDM failure (slave) - showing disconnected after reinstalled

I was simulating the failure of an MDM: I removed all the RPMs, reinstalled them, and tried to add the MDM back to the cluster. Because the IP address is the same as it was before the failure, I cannot remove the MDM and re-add it to the cluster. The cluster status shows the slave in the "disconnected" state.

How do I get this back online?

Accepted Solutions
andan02
1 Nickel

Re: MDM failure (slave) - showing disconnected after reinstalled

That did it - ty

scli --login --mdm_ip 10.202.46.30 --username admin --password abcdefg

scli --switch_cluster_mode --cluster_mode "1_node" --remove_slave_mdm_ip "10.202.46.31" --remove_tb_ip "10.202.46.29"

scli --remove_standby_mdm --remove_mdm_ip 10.202.46.31

scli --add_standby_mdm --new_mdm_ip 10.202.46.31 --mdm_role manager --new_mdm_management_ip 10.202.46.31 --approve_certificate --force_clean

scli --switch_cluster_mode --cluster_mode "3_node" --add_slave_mdm_ip 10.202.46.31 --add_tb_ip 10.202.46.29
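For anyone who wants to script this, here is a minimal sketch that only prints the four post-login commands above, in order, so they can be reviewed before being run by hand. The variable names and the `rejoin_commands` function are illustrative and not part of scli; every flag is copied from the commands above, and it assumes you have already logged in to the master MDM.

```shell
#!/usr/bin/env bash
# Sketch only: prints the scli sequence used above to re-add a reinstalled slave MDM.
# Nothing is executed against the cluster; review the output, then run it yourself.
set -eu

SLAVE_IP="10.202.46.31"   # reinstalled slave MDM (from the thread)
TB_IP="10.202.46.29"      # tie-breaker (from the thread)

# rejoin_commands is a hypothetical helper name, not an scli feature.
rejoin_commands() {
  # 1. Drop to single-node mode, releasing the slave and the tie-breaker.
  echo "scli --switch_cluster_mode --cluster_mode 1_node --remove_slave_mdm_ip $SLAVE_IP --remove_tb_ip $TB_IP"
  # 2. Remove the stale standby entry that still holds the slave's IP.
  echo "scli --remove_standby_mdm --remove_mdm_ip $SLAVE_IP"
  # 3. Re-add the reinstalled MDM as a standby manager, clearing its old configuration.
  echo "scli --add_standby_mdm --new_mdm_ip $SLAVE_IP --mdm_role manager --new_mdm_management_ip $SLAVE_IP --approve_certificate --force_clean"
  # 4. Return to three-node mode with the slave and the tie-breaker.
  echo "scli --switch_cluster_mode --cluster_mode 3_node --add_slave_mdm_ip $SLAVE_IP --add_tb_ip $TB_IP"
}

rejoin_commands
```

Printing rather than executing is deliberate: step 3 asks for interactive confirmation before clearing the MDM's previous configuration, so it is safer to run each line manually.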

All set.  I have to get my SDS working next.  Thank you.

I don't know how to resolve this if you build a new MDM and run "create_mdm_cluster" on it.  I don't know of any "destroy_cluster" equivalent.

Here is the scenario,

1. destroy primary

2. secondary takes over

3. install primary again - accidentally issue "create_mdm_cluster"

4. attempt to do the above commands to add primary to secondary cluster

FAIL.

I had to

1. destroy primary

2. secondary takes over

3. install primary again - DO NOT issue "create_mdm_cluster"

4. rejoin cluster on secondary and switch ownership back.

I wish there was a way to undo "create_mdm_cluster"

Thank you

Andy

7 Replies

Re: MDM failure (slave) - showing disconnected after reinstalled

When adding the MDM back, use --force_clean and check the result.

andan02
1 Nickel

Re: MDM failure (slave) - showing disconnected after reinstalled

I get this

[root@css-c-sdlc-021 ~]# scli --add_standby_mdm --new_mdm_ip 10.202.46.31 --mdm_role manager --new_mdm_management_ip 10.202.46.31 --approve_certificate --force_clean

You are about to clear previous configuration on this MDM. Press 'y' and then Enter to confirm.y

Error: MDM failed command.  Status: The IP already exists.

Then I try to remove the slave

[root@css-c-sdlc-021 ~]# scli --switch_cluster_mode --cluster_mode "3_node" --remove_slave_mdm_ip "10.202.46.31" --remove_tb_ip "10.202.46.29"

Error: MDM failed command.  Status: The MDM is already in this cluster mode.

I can't add it because the IP already exists, and I can't remove it because it's in the cluster.  What should I do?

Re: MDM failure (slave) - showing disconnected after reinstalled

can you provide the output of "scli --query_cluster"?

andan02
1 Nickel

Re: MDM failure (slave) - showing disconnected after reinstalled

Cluster:

    Name: atg1, Mode: 3_node, State: Degraded, Active: 2/3, Replicas: 1/2

Master MDM:

    Name: css-c-sdlc-021.cisco.com, ID: 0x5f3cc21f29c44290

        IPs: 10.202.46.30, Management IPs: 10.202.46.30, Port: 9011

        Version: 2.0.5014

Slave MDMs:

    ID: 0x75f26fcf30d27281

        IPs: 10.202.46.31, Management IPs: 10.202.46.31, Port: 9011

        Status: Disconnected, Version: 2.0.5014

Tie-Breakers:

    Name: css-c-sdlc-020.cisco.com, ID: 0x725dd087006a0432

        IPs: 10.202.46.29, Port: 9011

        Status: Normal, Version: 2.0.5014
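As a quick way to spot the problem member in output like the above, here is a minimal sketch that reads `scli --query_cluster` text on standard input and prints the IP of any member whose status is Disconnected. The `disconnected_mdm_ips` function name is illustrative, and the parsing assumes the layout shown in this post (an `IPs:` line followed by a `Status:` line for each slave or tie-breaker); other versions may format the output differently.

```shell
#!/usr/bin/env bash
# Sketch only: list the IPs of cluster members reported as Disconnected.
# Assumes the `scli --query_cluster` layout shown in this thread.

# disconnected_mdm_ips is a hypothetical helper name, not an scli feature.
disconnected_mdm_ips() {
  awk '
    /^[ \t]*IPs:/ {
      # Remember the first IP listed for the member currently being read.
      ip = $0
      sub(/^[ \t]*IPs: */, "", ip)
      sub(/,.*$/, "", ip)
    }
    /Status: Disconnected/ {
      # The Status line follows the IPs line of the same member.
      print ip
    }
  '
}

# Usage (run on the master MDM after logging in):
#   scli --query_cluster | disconnected_mdm_ips
```

The IP it prints is the one to pass to the remove and re-add commands.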

Re: MDM failure (slave) - showing disconnected after reinstalled

Switch to single-node mode (1_node), then remove the standby MDM.

After that, add the standby MDM again, using --force_clean.

Finally, switch back to three-node cluster mode (3_node).

marcelo.roxa
1 Copper

Re: MDM failure (slave) - showing disconnected after reinsta

Before switching the cluster mode back, you need to add the tie-breaker as a standby: scli --add_standby_mdm --new_mdm_ip 10.202.46.29 --mdm_role tb --new_mdm_management_ip 10.202.46.29 --approve_certificate --force_clean