ECS Community Edition 3.0.0 : ECS services on one node fail to start
Hello,
I have installed ECS CE on 6 nodes and put the 6 VMs in a vApp. I stopped and started the vApp successfully, but after 3 restarts one of the nodes is reported as not available. When I connect over SSH, the docker service is running and the container is running, but netstat -an does not show the ECS TCP ports listening (443, 4443, 3218, ...).
Since this node used to work, I hope that whatever is wrong on it can be fixed. Which logs should I look at?
I figured out that I can run "docker exec --interactive=true --privileged=true ecsmultinode /bin/bash" to get a shell inside the container.
At installation time I created a systemd entry so that the container starts automatically.
Here is the output of journalctl -f -u ecsmultinode on the failing node while starting the container (systemctl start docker.ecsmultinode):
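For anyone who wants to repeat the port check from another machine rather than reading netstat on the node, here is a minimal sketch in Python. The helper name open_ports and the example address are only illustrations, not part of ECS; the port list is the one mentioned above.

```python
import socket


def open_ports(host, ports, timeout=1.0):
    """Return the ports from `ports` that accept a TCP connection on `host`."""
    reachable = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                reachable.append(port)
    return reachable


if __name__ == "__main__":
    # Hypothetical usage: address of the failing node, ports named in the post.
    print(open_ports("10.64.226.87", [443, 4443, 3218]))
```

An empty list for the failing node and a non-empty one for a working node would confirm from the outside what netstat shows on the node itself.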
févr. 28 16:00:52 ecs-nod1 systemd[1]: Starting ECS Multinode Container...
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: ############################### boot operation begin ############################
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: checking 'which' command
févr. 28 16:00:52 ecs-nod1 docker[20704]: which is /usr/bin/which
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: exporting environment variable VIPR_DEPLOYMENT_TYPE to fabric
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: Parsing /host/data/network.json
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: VERSION: 3.0.0.0.86239.1c9e5ec private iface: ens160 public iface: ens160 START VIPR:
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: expecting data partitions to be mounted under /dae
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: StorageServer - setting partitionroot=/dae in /opt/storageos/conf/storageserver.conf
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: StorageServer JSON configuration file is /data/partitions.json
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: partitions.json already exists - will not generate it
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: validating config.iso and fabric version file
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: config.iso is missing in the /data location
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: fetching network information
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: Data IP address of the machine: 10.64.226.87
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: Management IP address of the machine: 10.64.226.87
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: Geo IP address of the machine: 10.64.226.87
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: NETMASK found: 255.255.255.0
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: Created /opt/storageos/conf/network
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: IPADDR='10.64.226.87'
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: All validations are complete. Going ahead with the configuration steps.
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: populating vipr shared library path
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: /opt/storageos/lib is already populated
févr. 28 16:00:52 ecs-nod1 docker[20704]: 02/28/17 15:00:52: creating scsi device nodes
févr. 28 16:01:02 ecs-nod1 docker[20704]: 02/28/17 15:01:02: creating volumes skeleton containers
févr. 28 16:01:02 ecs-nod1 docker[20704]: 02/28/17 15:01:02: populating vipr network info in fabric identifier
févr. 28 16:01:02 ecs-nod1 docker[20704]: 02/28/17 15:01:02: fixing svcuser account permissions
févr. 28 16:01:02 ecs-nod1 docker[20704]: chown: invalid user: 'svcuser:users'
févr. 28 16:01:02 ecs-nod1 docker[20704]: 02/28/17 15:01:02: updating storageos property to datanode
févr. 28 16:01:02 ecs-nod1 docker[20704]: 02/28/17 15:01:02: fixing file permissions
févr. 28 16:01:02 ecs-nod1 docker[20704]: 02/28/17 15:01:02: fixing storageos and data ownership
févr. 28 16:01:03 ecs-nod1 docker[20704]: 02/28/17 15:01:03: creating place holders file
févr. 28 16:01:03 ecs-nod1 docker[20704]: 02/28/17 15:01:03: preparing vipr configuration information
févr. 28 16:01:03 ecs-nod1 docker[20704]: cp: cannot stat '/data/config.iso': No such file or directory
févr. 28 16:01:03 ecs-nod1 docker[20704]: 02/28/17 15:01:03: Starting the rpcbind service
févr. 28 16:01:03 ecs-nod1 docker[20704]: 02/28/17 15:01:03: Initializing configuration properties
févr. 28 16:01:03 ecs-nod1 docker[20704]: 02/28/17 15:01:03: Generating configuration files
févr. 28 16:01:03 ecs-nod1 docker[20704]: /etc/systool: Warning: Fabric deployment. Skipping _bootfs_mount_ro()
févr. 28 16:01:08 ecs-nod1 docker[20704]: 02/28/17 15:01:08: Generation failed
févr. 28 16:01:08 ecs-nod1 docker[20704]: Starting nginx service
févr. 28 16:01:08 ecs-nod1 docker[20704]: ..done
févr. 28 16:01:08 ecs-nod1 docker[20704]: mkfifo: cannot create fifo '/var/run/.boot_sh_pipe': File exists
Here is the output from a working node:
févr. 28 16:05:23 ecs-nod2 systemd[1]: Starting ECS Multinode Container...
févr. 28 16:05:24 ecs-nod2 docker[3746]: 02/28/17 15:05:24: ############################### boot operation begin ############################
févr. 28 16:05:24 ecs-nod2 docker[3746]: 02/28/17 15:05:24: checking 'which' command
févr. 28 16:05:24 ecs-nod2 docker[3746]: which is /usr/bin/which
févr. 28 16:05:24 ecs-nod2 docker[3746]: 02/28/17 15:05:24: exporting environment variable VIPR_DEPLOYMENT_TYPE to fabric
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: Parsing /host/data/network.json
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: VERSION: 3.0.0.0.86239.1c9e5ec private iface: ens160 public iface: ens160 START VIPR:
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: expecting data partitions to be mounted under /dae
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: StorageServer - setting partitionroot=/dae in /opt/storageos/conf/storageserver.conf
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: StorageServer JSON configuration file is /data/partitions.json
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: partitions.json already exists - will not generate it
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: validating config.iso and fabric version file
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: config.iso is missing in the /data location
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: fetching network information
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: Data IP address of the machine: 10.64.226.54
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: Management IP address of the machine: 10.64.226.54
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: Geo IP address of the machine: 10.64.226.54
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: NETMASK found: 255.255.255.0
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: Created /opt/storageos/conf/network
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: IPADDR='10.64.226.54'
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: All validations are complete. Going ahead with the configuration steps.
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: populating vipr shared library path
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: /opt/storageos/lib is already populated
févr. 28 16:05:25 ecs-nod2 docker[3746]: 02/28/17 15:05:25: creating scsi device nodes
févr. 28 16:05:36 ecs-nod2 docker[3746]: 02/28/17 15:05:36: creating volumes skeleton containers
févr. 28 16:05:36 ecs-nod2 docker[3746]: 02/28/17 15:05:36: populating vipr network info in fabric identifier
févr. 28 16:05:37 ecs-nod2 docker[3746]: 02/28/17 15:05:37: fixing svcuser account permissions
févr. 28 16:05:37 ecs-nod2 docker[3746]: chown: invalid user: 'svcuser:users'
févr. 28 16:05:37 ecs-nod2 docker[3746]: 02/28/17 15:05:37: updating storageos property to datanode
févr. 28 16:05:37 ecs-nod2 docker[3746]: 02/28/17 15:05:37: fixing file permissions
févr. 28 16:05:37 ecs-nod2 docker[3746]: 02/28/17 15:05:37: fixing storageos and data ownership
févr. 28 16:05:38 ecs-nod2 docker[3746]: 02/28/17 15:05:38: creating place holders file
févr. 28 16:05:38 ecs-nod2 docker[3746]: 02/28/17 15:05:38: preparing vipr configuration information
févr. 28 16:05:38 ecs-nod2 docker[3746]: cp: cannot stat '/data/config.iso': No such file or directory
févr. 28 16:05:38 ecs-nod2 docker[3746]: 02/28/17 15:05:38: Starting the rpcbind service
févr. 28 16:05:39 ecs-nod2 docker[3746]: 02/28/17 15:05:39: Initializing configuration properties
févr. 28 16:05:39 ecs-nod2 docker[3746]: 02/28/17 15:05:39: Generating configuration files
févr. 28 16:05:39 ecs-nod2 docker[3746]: /etc/systool: Warning: Fabric deployment. Skipping _bootfs_mount_ro()
févr. 28 16:05:45 ecs-nod2 docker[3746]: 02/28/17 15:05:45: Generation failed
févr. 28 16:05:46 ecs-nod2 docker[3746]: /etc/init.d/storageos-dataservice: line 100: /proc/sys/kernel/core_pattern: Read-only file system
févr. 28 16:05:46 ecs-nod2 docker[3746]: Warning: Found stale pidfile(s) (unclean shutdown?)
févr. 28 16:05:49 ecs-nod2 docker[3746]: Setting up SSL certificates ...done
févr. 28 16:05:49 ecs-nod2 docker[3746]: Starting ViPR services..done
févr. 28 16:05:51 ecs-nod2 docker[3746]: service: no such service cron
févr. 28 16:05:51 ecs-nod2 docker[3746]: Starting nginx service
févr. 28 16:05:51 ecs-nod2 docker[3746]: ..done
févr. 28 16:05:52 ecs-nod2 docker[3746]: mkfifo: cannot create fifo '/var/run/.boot_sh_pipe': File exists
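The difference is that the lines logged on the working node between "Generation failed" and "Starting nginx service" (the storageos-dataservice, stale pidfile, SSL certificate, ViPR services, and cron messages) do not appear on the failing node.
Any ideas?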
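The comparison can also be done mechanically. Below is a minimal sketch in Python that strips the parts of each journal line that always differ (journal timestamp, host name, docker PID, inner timestamp, IP addresses) and then lists the lines that appear only in the working node's log. The function names and the two file paths taken from the command line are assumptions for illustration; the regular expressions are written against the log format shown above and may need adjusting for other formats.

```python
import difflib
import re
import sys

# "<month> <day> <hh:mm:ss> <host> <unit>[<pid>]: " prefix added by journald
PREFIX = re.compile(r"^\S+ +\d+ \d\d:\d\d:\d\d \S+ \S+\[\d+\]: ")
# "MM/DD/YY hh:mm:ss: " stamp written by the boot script itself
STAMP = re.compile(r"^\d\d/\d\d/\d\d \d\d:\d\d:\d\d: ")
# Dotted quads: node addresses, and anything else shaped like one
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def normalize(line):
    """Remove per-node and per-run details so equal messages compare equal."""
    line = PREFIX.sub("", line.rstrip("\n"))
    line = STAMP.sub("", line)
    return IPV4.sub("<ip>", line)


def only_in_good(bad_lines, good_lines):
    """Return normalized messages present in the good log but not the bad one."""
    bad = [normalize(line) for line in bad_lines]
    good = [normalize(line) for line in good_lines]
    return [d[2:] for d in difflib.ndiff(bad, good) if d.startswith("+ ")]


if __name__ == "__main__":
    # Hypothetical usage: python logdiff.py failing-node.log working-node.log
    with open(sys.argv[1], encoding="utf-8") as bad_file, \
            open(sys.argv[2], encoding="utf-8") as good_file:
        for message in only_in_good(bad_file.readlines(), good_file.readlines()):
            print(message)
```

Saving each node's journalctl output to a file and running the script on the pair prints only the messages the failing node never reached.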
Thanks for your help
Regards
Laurent
travis_wichert
July 21st, 2017 12:00
The node likely became corrupted when it did not shut down properly. The previous installer did not have good support for starting and stopping the ECS Docker container. Please try the new installer, which has better support for multinode installations, systemd, and firewalld.
GitHub - EMCECS/ECS-CommunityEdition: ECS Community Edition "Free & Frictionless"