12 Posts
0
6597
MDM failure (slave) - showing disconnected after reinstall
I was simulating the failure of an MDM. I removed all the RPMs, reinstalled, and tried to add the MDM back to the cluster. Since the IP address is the same as it was before the failure, I cannot remove it and re-add it to the cluster. The cluster status shows the slave in a "disconnected" state.
How do I get this back online?
andan02
12 Posts
0
April 19th, 2016 13:00
That did it - ty
scli --login --mdm_ip 10.202.46.30 --username admin --password abcdefg
scli --switch_cluster_mode --cluster_mode "1_node" --remove_slave_mdm_ip "10.202.46.31" --remove_tb_ip "10.202.46.29"
scli --remove_standby_mdm --remove_mdm_ip 10.202.46.31
scli --add_standby_mdm --new_mdm_ip 10.202.46.31 --mdm_role manager --new_mdm_management_ip 10.202.46.31 --approve_certificate --force_clean
scli --switch_cluster_mode --cluster_mode "3_node" --add_slave_mdm_ip 10.202.46.31 --add_tb_ip 10.202.46.29
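For anyone following along: once the switch back to 3_node goes through, a quick check should show the slave as Normal again and the cluster no longer Degraded:
scli --query_cluster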
All set. I have to get my SDS working next. Thank you.
I don't know how to resolve this if you build a new MDM and run "create_cluster" on it, because I don't know how to "destroy_cluster".
Here is the scenario:
1. destroy primary
2. secondary takes over
3. install primary again - accidentally issue "create_mdm_cluster"
4. attempt to do the above commands to add primary to secondary cluster
FAIL.
I had to:
1. destroy primary
2. secondary takes over
3. install primary again - DO NOT issue "create_mdm_cluster"
4. rejoin the cluster from the secondary and switch ownership back (rough commands below)
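For reference, step 4 on my setup looked roughly like the commands above with the roles flipped, run from the secondary (10.202.46.31) while it was still master. I'm writing the ownership switch from memory, so verify the exact option names with scli --help:
scli --login --mdm_ip 10.202.46.31 --username admin --password <password>
scli --switch_cluster_mode --cluster_mode "1_node" --remove_slave_mdm_ip "10.202.46.30" --remove_tb_ip "10.202.46.29"
scli --remove_standby_mdm --remove_mdm_ip 10.202.46.30
scli --add_standby_mdm --new_mdm_ip 10.202.46.30 --mdm_role manager --new_mdm_management_ip 10.202.46.30 --approve_certificate --force_clean
scli --switch_cluster_mode --cluster_mode "3_node" --add_slave_mdm_ip 10.202.46.30 --add_tb_ip 10.202.46.29
scli --switch_mdm_ownership --new_master_mdm_ip 10.202.46.30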
I wish there was a way to undo "create_mdm_cluster"
Thank you
Andy
SanjeevMalhotra
138 Posts
0
April 19th, 2016 11:00
When adding the MDM back, use --force_clean and then check the cluster status.
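Something along these lines (placeholder IP - substitute your slave MDM's address):
scli --add_standby_mdm --new_mdm_ip <slave_mdm_ip> --mdm_role manager --new_mdm_management_ip <slave_mdm_ip> --approve_certificate --force_clean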
SanjeevMalhotra
138 Posts
0
April 19th, 2016 12:00
Can you provide the output of "scli --query_cluster"?
SanjeevMalhotra
138 Posts
1
April 19th, 2016 12:00
Switch to single-node mode (1_node) and then try to remove the standby MDM.
After that, add the standby MDM back using --force_clean.
Then switch back to three-node mode (3_node).
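The slave and tie-breaker can only be dropped as part of an actual mode change, which is why trying to remove them while staying in 3_node fails with "The MDM is already in this cluster mode". Roughly (placeholder IPs, adapt to your cluster):
scli --switch_cluster_mode --cluster_mode "1_node" --remove_slave_mdm_ip <slave_mdm_ip> --remove_tb_ip <tb_ip>
scli --remove_standby_mdm --remove_mdm_ip <slave_mdm_ip>
scli --add_standby_mdm --new_mdm_ip <slave_mdm_ip> --mdm_role manager --new_mdm_management_ip <slave_mdm_ip> --approve_certificate --force_clean
scli --switch_cluster_mode --cluster_mode "3_node" --add_slave_mdm_ip <slave_mdm_ip> --add_tb_ip <tb_ip>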
andan02
12 Posts
0
April 19th, 2016 12:00
Cluster:
Name: atg1, Mode: 3_node, State: Degraded, Active: 2/3, Replicas: 1/2
Master MDM:
Name: css-c-sdlc-021.cisco.com, ID: 0x5f3cc21f29c44290
IPs: 10.202.46.30, Management IPs: 10.202.46.30, Port: 9011
Version: 2.0.5014
Slave MDMs:
ID: 0x75f26fcf30d27281
IPs: 10.202.46.31, Management IPs: 10.202.46.31, Port: 9011
Status: Disconnected, Version: 2.0.5014
Tie-Breakers:
Name: css-c-sdlc-020.cisco.com, ID: 0x725dd087006a0432
IPs: 10.202.46.29, Port: 9011
Status: Normal, Version: 2.0.5014
andan02
12 Posts
0
April 19th, 2016 12:00
I get this
[root@css-c-sdlc-021 ~]# scli --add_standby_mdm --new_mdm_ip 10.202.46.31 --mdm_role manager --new_mdm_management_ip 10.202.46.31 --approve_certificate --force_clean
You are about to clear previous configuration on this MDM. Press 'y' and then Enter to confirm.y
Error: MDM failed command. Status: The IP already exists.
Then I try to remove the slave
[root@css-c-sdlc-021 ~]# scli --switch_cluster_mode --cluster_mode "3_node" --remove_slave_mdm_ip "10.202.46.31" --remove_tb_ip "10.202.46.29"
Error: MDM failed command. Status: The MDM is already in this cluster mode.
I can't add it because the IP already exists, and I can't remove it because it's in the cluster. What should I do?
marcelo.roxa
1 Message
0
March 6th, 2019 09:00