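ECS: xDoctor RAP014: Fabric Lifecycle Service not Healthy | Lifecycle Jetty server is not up and running on port 9241
Summary: xDoctor RAP014 reports that the Fabric Lifecycle Service is not healthy, or the service console reports that the Lifecycle Jetty server is not up and running on port 9241.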
Symptoms
Issue #1:
After an ECS upgrade from version 3.0.x or earlier to version 3.1 or later, the service console displays the following output:
20180309 01:49:28.456: | | | PASS (21 min 29 sec)
20180309 01:49:28.462: | | PASS (21 min 29 sec)
20180309 01:49:28.463: | Run Keyword If
20180309 01:49:28.464: | | Node Service Upgrade
Initializing...
Executing Program: NODE_SERVICE_UPGRADE
|-Disable CallHome
| +-[0.0.0.0] SetCallHomeEnabled PASS (1/7, 1 sec)
|-Push Service Image To Registries
| |-Push Service Image to Head Registry
| | |-[169.254.1.1] LoadImage PASS (2/7, 1 sec)
| | +-[169.254.1.1] PushImage PASS (3/7)
| +-Push Service Image to Remote Registries
|-Upgrade Object On Specified Nodes
| +-Initiate Object Upgrade if Required
| +-[0.0.0.0] UpdateApplicationOnNodes PASS (4/7, 1 sec)
|-Update Services Ownership To Lifecycle Manager on Specified Nodes
| +-Update Ownership For Object
| +-[169.254.1.1] UpdateOwnership PASS (5/7)
|-Post-check Services Health
| +-Validate Object Service on Specified Nodes
| +-[169.254.1.1] ServiceHealth PASS (6/7, 21 sec)
+-Enable CallHome
+-[0.0.0.0] SetCallHomeEnabled PASS (7/7, 3 sec)
Elapsed time is 30 sec.
NODE_SERVICE_UPGRADE completed successfully
Collecting data from cluster
Information has been written to the
Information has been written to the
Executing /configure.sh --start action in object-main container which may take up to 600 seconds.
20180309 01:52:51.711: | | | PASS (3 min 23 sec)
20180309 01:52:51.720: | | PASS (3 min 23 sec)
20180309 01:52:51.722: | Run Keyword If
20180309 01:52:51.724: | | Update manifest file
[ERROR] On node 169.254.1.1, Lifecycle Jetty server is not up and running on port 9241!
20180309 01:58:45.068: | | | FAIL (5 min 53 sec)
20180309 01:58:45.071: | | FAIL (5 min 53 sec)
20180309 01:58:45.072: | FAIL (45 min 43 sec)
20180309 01:58:45.075: Service Console Teardown
20180309 01:58:46.973: | PASS (1 sec)
================================================================================
Status: FAIL
Time Elapsed: 45 min 56 sec
Debug log: /
HTML log: /
================================================================================
Messages:
fabric-lifecycle service should be up and running
================================================================================
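The error in the log above states that nothing is answering on port 9241 of node 169.254.1.1. As a rough illustration only (this is not part of the service console or xDoctor tooling), a plain TCP connect can show whether that port accepts connections. The function name `port_is_open` and the timeout are assumptions made for this sketch; only the port number and the node address come from the log.

```python
import socket

# Port named in the service console error message.
LIFECYCLE_PORT = 9241


def port_is_open(host: str, port: int = LIFECYCLE_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration. A successful connect only shows
    that something is listening, not that the lifecycle service is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("169.254.1.1")` run from a host that can reach the node would be expected to return False while the failure shown above persists.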
Issue #2:
xDoctor may report the following:
Timestamp = 2015-09-25_092907
Category = health
Source = fcli
Severity = WARNING
Message = Fabric Lifecycle Service not Healthy
Extra =
Monitoring the Fabric Lifecycle Service with "sudo docker ps -a" shows that the service keeps restarting. In the first run below, the fabric-lifecycle container has been up for only 3 seconds; in the second run it has exited with status 1, while the other containers have been up for 21 hours.
venus2:~ # docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
7995f18ba27f   ip.ip.ip.ip:5000/emcvipr/object:2.0.1.0-62267.db4d4a8   "/opt/vipr/boot/boot   4 weeks ago   Up 21 hours   object-main
73f00ed0b6df   ip.ip.ip.ip:5000/caspian/fabric:1.1.1.0-1998.1391e7e   "./boot.sh lifecycle   4 weeks ago   Up 3 seconds   fabric-lifecycle
ba19a3c95151   ip.ip.ip.ip:5000/caspian/fabric-zookeeper:1.1.0.0-54.54a204e   "./boot.sh 2 1=169.2   4 weeks ago   Up 21 hours   fabric-zookeeper
venus2:~ # docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
7995f18ba27f   ip.ip.ip.ip:5000/emcvipr/object:2.0.1.0-62267.db4d4a8   "/opt/vipr/boot/boot   4 weeks ago   Up 21 hours   object-main
73f00ed0b6df   ip.ip.ip.ip:5000/caspian/fabric:1.1.1.0-1998.1391e7e   "./boot.sh lifecycle   4 weeks ago   Exited (1) 2 seconds ago   fabric-lifecycle
ba19a3c95151   ip.ip.ip.ip:5000/caspian/fabric-zookeeper:1.1.0.0-54.54a204e   "./boot.sh 2 1=169.2   4 weeks ago   Up 21 hours   fabric-zookeeper
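Repeating "docker ps -a" by hand can miss the short window in which the container is down. The sketch below is one possible way to read the STATUS column programmatically. It assumes the standard `docker ps` options `--filter name=...` and `--format '{{.Status}}'`; the helper names and the "seconds of uptime" heuristic are assumptions made for illustration, not an ECS or xDoctor check.

```python
import subprocess
from typing import List


def looks_unstable(status: str) -> bool:
    """Heuristic: True if a `docker ps` STATUS string suggests a restart loop.

    Flags containers that have exited, are restarting, or have been up for
    only seconds (as in "Up 3 seconds" in the output above).
    """
    s = status.strip().lower()
    if s.startswith(("exited", "restarting")):
        return True
    return s.startswith("up") and "second" in s


def container_statuses(name: str = "fabric-lifecycle") -> List[str]:
    """Return the STATUS strings of containers whose name matches `name`."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--filter", f"name={name}",
         "--format", "{{.Status}}"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Calling `container_statuses()` a few times, some seconds apart, and passing each result to `looks_unstable` would flag the two fabric-lifecycle states shown above while leaving "Up 21 hours" unflagged.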
Cause
Cause of Issue #1:
The ZooKeeper container could not start correctly because of the size of its snapshot.
Cause of Issue #2:
The ECS IP address resolves to an incorrect hostname.
Resolution
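Resolution for Issue #1:
Resolved in ECS version 3.0.
ECS 3.0 includes improved compression and enables saving of ZK (ZooKeeper) messages.
Note: This fix only works if this build is already installed on the host. It does not help when the upgrade to this build is performed on a system that is already degraded.
If you encounter this issue, contact ECS Support.
To confirm this issue:
Run the following command: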
# viprexec 'cat /opt/emc/caspian/fabric/agent/services/fabric/zookeeper/log/zookeeper.log | grep "GC overhead limit exceeded"'
Example output:
admin@:~> viprexec 'cat /opt/emc/caspian/fabric/agent/services/fabric/zookeeper/log/zookeeper.log | grep "GC overhead limit exceeded"'
Output from host : 192.168.219.4
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
Output from host : 192.168.219.5
java.lang.OutOfMemoryError: GC overhead limit exceeded
Output from host : 192.168.219.3
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
Output from host : 192.168.219.7
cat: /opt/emc/caspian/fabric/agent/services/fabric/zookeeper/log/zookeeper.log: No such file or directory
Output from host : 192.168.219.2
Output from host : 192.168.219.8
cat: /opt/emc/caspian/fabric/agent/services/fabric/zookeeper/log/zookeeper.log: No such file or directory
Output from host : 192.168.219.6
cat: /opt/emc/caspian/fabric/agent/services/fabric/zookeeper/log/zookeeper.log: No such file or directory
Output from host : 192.168.219.1
admin@:~>
The following message appears in the ZooKeeper log file:
OutOfMemoryError
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
    at java.io.DataOutputStream.writeLong(DataOutputStream.java:224)
    at org.apache.jute.BinaryOutputArchive.writeLong(BinaryOutputArchive.java:59)
    at org.apache.zookeeper.data.Stat.serialize(Stat.java:129)
    at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
    at org.apache.zookeeper.proto.GetDataResponse.serialize(GetDataResponse.java:49)
    at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1067)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:404)
    at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
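On a larger cluster the per-host grep output for Issue #1 can be tedious to read. The following sketch tallies how many "GC overhead limit exceeded" lines each host reported. It assumes the viprexec output has been captured to text with one "Output from host : <ip>" header per node, as in the example; the function name `count_oom_per_host` is hypothetical.

```python
import re
from typing import Dict, Optional

# Header line that viprexec prints before each node's output (per the example).
HOST_RE = re.compile(r"Output from host\s*:\s*(\S+)")
OOM_MARKER = "GC overhead limit exceeded"


def count_oom_per_host(viprexec_output: str) -> Dict[str, int]:
    """Count OOM marker lines under each 'Output from host' header.

    Lines seen before the first header (such as the echoed command itself)
    are ignored. Hosts with no matches are kept with a count of 0.
    """
    counts: Dict[str, int] = {}
    host: Optional[str] = None
    for line in viprexec_output.splitlines():
        match = HOST_RE.search(line)
        if match:
            host = match.group(1)
            counts.setdefault(host, 0)
        elif host is not None and OOM_MARKER in line:
            counts[host] += 1
    return counts
```

Any host with a non-zero count matches the symptom described for Issue #1. A count of 0 can also mean the log file was missing on that node, so the raw output is still worth checking.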
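Resolution for Issue #2:
Use nslookup to check DNS for the IP address of the ECS node on which the lifecycle service is restarting, and confirm that it resolves to the correct hostname.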
# nslookup <ip of ecs node>
If DNS is correct and the lifecycle issue persists, contact ECS Support.
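The reverse lookup for Issue #2 can also be scripted when several nodes need checking. This is a minimal sketch, assuming the expected hostname of each node is already known. Note that `socket.gethostbyaddr` uses the operating system resolver, which may consult local files such as /etc/hosts as well as DNS, so its answer can differ from nslookup. The helper names and the short-name comparison are assumptions made for illustration.

```python
import socket
from typing import Callable, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the primary hostname the system resolver gives for `ip`, or None."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def resolves_to(ip: str, expected_hostname: str,
                lookup: Callable[[str], Optional[str]] = reverse_lookup) -> bool:
    """True if `ip` reverse-resolves to `expected_hostname`.

    Only the short name (the part before the first dot) is compared,
    case-insensitively; this is a simplification chosen for the sketch.
    """
    actual = lookup(ip)
    if actual is None:
        return False
    return actual.split(".")[0].lower() == expected_hostname.split(".")[0].lower()
```

For example, `resolves_to("<ip of ecs node>", "venus2")` would return False if the address resolves to a different hostname, which is the condition described as the cause of Issue #2.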
Affected Products
ECS Appliance
Products
ECS Appliance, ECS Appliance Hardware Gen1 U-Series, ECS Appliance Software with Encryption, ECS Appliance Software without Encryption
Article Properties
Article Number: 000064892
Article Type: Solution
Last Modified: 21 Nov 2025
Version: 5