73 Posts

March 7th, 2017 10:00

Matas,

Take a look at this KB:

https://support.emc.com/kb/487432

It describes what you are seeing. Essentially, something is logging in and out too many times, too quickly, and that is filling up the login table. If you can track down whatever is filling the login table and stop it, that should fix it. Look at the events on the Master MDM for a clue (/opt/emc/scaleio/mdm/bin/showevents.py -p | grep "Command login ").

Hope that helps,

Rick

306 Posts

March 7th, 2017 00:00

Hi Matas,

If you need to upgrade, the recommended way is through the Installation Manager, which takes care of that for you. I don't think you need to run "start_upgrade" manually, like, ever.

As to your problem - please try "scli --allow_commands_during_upgrade" and see if that helps in your case, but I still believe you should just use the IM.

Cheers,

Pawel

March 7th, 2017 01:00

Error: MDM failed command.  Status: Invalid session. Please login and try again.


This error means you need to log in to scli again as the admin user.


If you are unable to log in, SSH to one of the slave MDMs and try to switch the MDM ownership to it (use the --mdm_ip switch in the commands you run from the slave MDM until you are able to switch the MDM ownership to that slave MDM).


As Pawel suggested, the recommended way to upgrade is through the IM. If you do want to upgrade manually, the process is described in the deployment guide.


March 7th, 2017 03:00

As suggested by Pawel, did you run the command "scli --allow_commands_during_upgrade"?

22 Posts

March 7th, 2017 03:00

Hi Pawel, the ScaleIO User Guide defines the manual upgrade process, and the "start_upgrade" command is required.

SanjeevMalhotra, I cannot change MDM ownership. I think something is not working.

For example:

[root@dkscmd002prvjay matv]# scli --query_cluster

Cluster:

    Mode: 5_node, State: Normal, Active: 5/5, Replicas: 3/3

Master MDM:

    ID: 0x5cbaf3001f4f1801

        IPs: 10.22.26.12, Management IPs: 10.22.26.12, Port: 9011

        Version: 2.0.7120

Slave MDMs:

    ID: 0x7e272f9f63f5e210

        IPs: 10.22.26.11, Management IPs: 10.22.26.11, Port: 9011

        Status: Normal, Version: 2.0.7120

    ID: 0x0fa334e15156abe2

        IPs: 10.22.26.13, Management IPs: 10.22.26.13, Port: 9011

        Status: Normal, Version: 2.0.7120

Tie-Breakers:

    ID: 0x6beff5195f712f54

        IPs: 10.22.26.32, Port: 9011

        Status: Normal, Version: 2.0.7120

    ID: 0x197422174da8d613

        IPs: 10.22.26.31, Port: 9011

        Status: Normal, Version: 2.0.7120

[root@dkscmd002prvjay matv]# scli --query_all

Error: MDM failed command.  Status: Invalid session. Please login and try again.

Any ideas?

22 Posts

March 7th, 2017 04:00

Update:

MDM log: /opt/emc/scaleio/mdm/logs/trc.0

07/03 13:24:24.216991 e0de1eb8:nativeAuthMgr_GetUserInfoFromToken:00596: Failed to find session for authentication. rc=INVALID_SESSION

07/03 13:24:24.216995 e0de1eb8:cliMsg_AuthenticateSession:01328: Failed getting user role

07/03 13:24:24.216997 e0de1eb8:mdmCliMsg_RecvRequestCB:01562: Failed authenticate session for command 4367

07/03 13:24:25.794455 e0bd7eb8:actor_Loop:11662: #### Log sync send - actorId: 013d39903a48fa31, ticks: 1407290160

07/03 13:24:25.794549 e0bceeb8:voter_HandleMeMaster:02627: #### Log sync receive - Sender: actorId: 013d39903a48fa31, ticks: 1407290160, State: voterId: 053e9b5c5006bf81, actorId: 013d39903a48fa31, actorGen 2, degradedGen 45, oosIDs [], IsFrozen 0, bHasLease 1

These lines appear after I try an operation as the admin user, for example:

[root@dkscmd002prvjay bin]# scli --query_all

Error: MDM failed command.  Status: Invalid session. Please login and try again.

22 Posts

March 7th, 2017 07:00

Hello, it seems our monitoring system was "bombing" the ScaleIO API and creating too many sessions, so some commands failed.

After stopping the monitoring, we noticed logs like these:
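For anyone hitting the same thing with their own monitoring, one way to avoid flooding the login table is to log in once and reuse the session until it ages out, rather than logging in on every poll. Below is a minimal sketch of that idea only. `SessionCache`, the `login` callable, and `max_age_s` are hypothetical names for illustration, not part of ScaleIO or its API; the actual login call and the real session timeout have to come from your own setup.

```python
import time
from typing import Callable, Optional


class SessionCache:
    """Reuse one login token across polls instead of logging in every time."""

    def __init__(self, login: Callable[[], str], max_age_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._login = login          # hypothetical: performs ONE real login, returns a token
        self._max_age_s = max_age_s  # assumed lifetime; check the real session timeout
        self._clock = clock
        self._token: Optional[str] = None
        self._issued_at = 0.0

    def token(self) -> str:
        """Return the cached token, logging in again only when it has aged out."""
        now = self._clock()
        if self._token is None or now - self._issued_at >= self._max_age_s:
            self._token = self._login()
            self._issued_at = now
        return self._token

    def invalidate(self) -> None:
        """Drop the token, e.g. after the server answers 'Invalid session'."""
        self._token = None
```

The poller would call `cache.token()` before each request and `cache.invalidate()` only when a request is rejected, so the number of logins stays roughly one per token lifetime instead of one per poll.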

***

07/03 16:13:07.155552 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 372 of type 15

07/03 16:13:07.157933 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 1 of type 15

07/03 16:13:07.159606 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 159 of type 15

07/03 16:13:07.161376 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 231 of type 15

07/03 16:13:07.163032 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 83 of type 15

07/03 16:13:07.164624 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 429 of type 15
***

After SIO cleaned up the old sessions, all commands work fine.

Thanks!
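To check whether the same pattern shows up in another trc log, a small script can tally the two kinds of lines quoted in this thread: the `rc=INVALID_SESSION` failures and the `authSession_CleanupSession` destroy messages. This is a rough sketch that assumes the line layout seen in the excerpts above (date, time, thread ID, function, source line, message); `summarize` and the counter keys are made-up names, and other ScaleIO versions may log differently.

```python
import re
from collections import Counter
from typing import Iterable

# Layout inferred from the excerpts in this thread (an assumption):
#   NN/NN HH:MM:SS.micros <thread>:<function>:<line>: <message>
LINE_RE = re.compile(
    r"^\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+ "
    r"\w+:(?P<func>\w+):\d+: (?P<msg>.*)$"
)
CLEANUP_RE = re.compile(
    r"Called from authSession_CleanupSession\. Destroying object \d+ of type \d+"
)


def summarize(lines: Iterable[str]) -> Counter:
    """Count session failures and session cleanups in trc-style log lines."""
    counts: Counter = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # blank lines or anything that is not in the assumed layout
        msg = m.group("msg")
        if "rc=INVALID_SESSION" in msg:
            counts["invalid_session"] += 1
        elif CLEANUP_RE.search(msg):
            counts["session_cleanup"] += 1
    return counts
```

It could be run against the MDM log with something like `summarize(open("/opt/emc/scaleio/mdm/logs/trc.0"))`. Many `invalid_session` hits followed by a burst of `session_cleanup` lines would match what we saw here.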
