VPLEX: Management server experiences high RAM usage and internal disk space usage
Summary: A management server running without an external AMQP event consumer may experience high RAM usage and internal disk space usage.
Symptoms
- A management server running without an external AMQP event consumer may experience high RAM memory usage and internal disk space usage.
- A system experiencing high memory consumption will experience higher than normal latency when executing CLI or GUI commands.
- A system that runs out of internal disk space on the root partition will not be able to write to that partition. VPLEX will continue to write logs to other partitions, but several Linux services use the root partition and will not be able to log further events.
Symptom 1:
A large amount of RAM is being used by rabbitmq.
service@ManagementServer:~> top
top - 13:17:26 up 103 days, 13 min, 20 users, load average: 0.28, 0.34, 0.36
Tasks: 201 total, 1 running, 200 sleeping, 0 stopped, 0 zombie
Cpu(s): 12.3%us, 0.9%sy, 0.0%ni, 85.0%id, 1.5%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3920396k total, 3448376k used, 472020k free, 14752k buffers
Swap: 8388604k total, 413608k used, 7974996k free, 1781800k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22674 rabbitmq 20 0 3.4g 3.7g 2040 S 2 87.9 225:09.39 beam.smp
16302 service 20 0 2975m 1.1g 9232 S 2 2.4 561:18.54 java
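The check above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name mem_hogs and the default threshold of 50 percent are illustrative assumptions, not VPLEX limits. It filters ps-style output for processes at or above a given %MEM value.

```shell
# Print processes whose %MEM is at or above a threshold.
# Expects "ps -eo pid,user,pmem,rss,comm" style input on stdin,
# including the header line.
# Assumption: the 50 percent default is illustrative only.
mem_hogs() {
    threshold="${1:-50}"
    awk -v t="$threshold" \
        'NR > 1 && $3 + 0 >= t + 0 { printf "%s %s %s%%\n", $1, $5, $3 }'
}
```

A possible invocation on the management server is `ps -eo pid,user,pmem,rss,comm | mem_hogs 50`. With the values shown in the top output above, only the beam.smp process owned by rabbitmq would be listed.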
Symptom 2:
Call homes warn of high or critical disk space usage.
When the root partition on the management server reaches 90% of available space, you will see the following call home.
<ID>0x8a4a31fb</ID>
<name>SMS_PARTITION_HIGH_CAPACITY</name>
<severity> ERROR </severity>
<customerRCA>A partition on your Management Server has reached a high capacity.</customerRCA>
When the root partition becomes full you will see the following call home.
<ID>0x8a4a61fa</ID>
<name>SMS_PARTITION_CRITICAL_CAPACITY</name>
<severity> ERROR </severity>
<customerRCA>A partition on your Management Server has exceeded a critical capacity threshold.</customerRCA>
Symptom 3:
A large amount of disk space is being used by rabbitmq.
service@ManagementServer:/var/lib/rabbitmq/mnesia/rabbit@localhost # du -shx *
4.0K cluster_nodes.config
4.0K DECISION_TAB.LOG
4.0K LATEST.LOG
32K msg_store_persistent
14G msg_store_transient <<<<
4.0K nodes_running_at_shutdown
408M queues
4.0K rabbit_durable_exchange.DCD
4.0K rabbit_durable_queue.DCD
4.0K rabbit_durable_queue.DCL
4.0K rabbit_durable_route.DCD
4.0K rabbit_runtime_parameters.DCD
8.0K rabbit_runtime_parameters.DCL
4.0K rabbit_serial
4.0K rabbit_user.DCD
4.0K rabbit_user_permission.DCD
4.0K rabbit_vhost.DCD
service@ManagementServer:/var/lib/rabbitmq/mnesia/rabbit@localhost # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 20G 19G 692K 100% / <<<<
udev 1.9G 196K 1.9G 1% /dev
tmpfs 1.9G 0 1.9G 0% /dev/shm
/dev/sda1 504M 60M 420M 13% /boot
/dev/sda7 16G 4.1G 11G 27% /var/log
/dev/sda8 44G 5.0G 37G 13% /diag
/dev/sda9 9.9G 151M 9.2G 2% /data
service@ManagementServer:/var/lib/rabbitmq/mnesia/rabbit@localhost # ls -lah msg_store_transient/ | head
total 14G
drwxr-x--- 1 rabbitmq rabbitmq 12K Nov 13 11:14 .
drwxr-x--- 1 rabbitmq rabbitmq 734 Nov 13 15:03 ..
-rw-r----- 1 rabbitmq rabbitmq 15M Nov 6 05:51 0.rdq
-rw-r----- 1 rabbitmq rabbitmq 17M Nov 13 05:19 1000.rdq
-rw-r----- 1 rabbitmq rabbitmq 17M Nov 13 05:21 1001.rdq
-rw-r----- 1 rabbitmq rabbitmq 17M Nov 13 05:22 1002.rdq
-rw-r----- 1 rabbitmq rabbitmq 17M Nov 13 05:23 1003.rdq
-rw-r----- 1 rabbitmq rabbitmq 17M Nov 13 05:25 1004.rdq
-rw-r----- 1 rabbitmq rabbitmq 17M Nov 13 05:30 1005.rdq
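As a rough sketch of how the root partition could be checked against the 90% level mentioned above, the helper below parses POSIX-format df output. The function name check_partition and the HIGH/OK labels are hypothetical and are not part of any VPLEX tooling.

```shell
# Read "df -P" output on stdin and report whether a mount point has
# reached the 90% level at which the article says the high-capacity
# call home is raised.
# Assumption: the HIGH/OK labels are illustrative only.
check_partition() {
    mount="${1:-/}"
    awk -v m="$mount" 'NR > 1 && $6 == m {
        use = $5
        sub(/%/, "", use)            # strip the trailing percent sign
        if (use + 0 >= 90) print m " at " use "%: HIGH"
        else               print m " at " use "%: OK"
    }'
}
```

It could be run as `df -P | check_partition /`. For the df output shown above, the root partition at 100% would be reported as HIGH.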
Symptom 4:
A large number of messages on the external message queue.
service@sms-bali-2:~> sudo rabbitmqctl list_queues
Listing queues ...
aliveness-test 0
queue.vplex.external 1749909 <<<<<
queue.vplex.ndu.events 0
sms_internal 0
...done.
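To pick out backed-up queues from this listing without reading it by eye, something like the following sketch could be used. The function name deep_queues and the default limit of 100000 messages are assumptions chosen for illustration; the article does not define a threshold.

```shell
# Read "rabbitmqctl list_queues" output on stdin and print the queues
# whose message count exceeds a limit.
# Assumption: the default limit of 100000 is hypothetical.
deep_queues() {
    limit="${1:-100000}"
    # Only lines of the form "<queue name> <count>" are considered, so the
    # "Listing queues ..." and "...done." lines are skipped.
    awk -v l="$limit" \
        'NF == 2 && $2 ~ /^[0-9]+$/ && $2 + 0 > l + 0 { print $1, $2 }'
}
```

A possible invocation is `sudo rabbitmqctl list_queues | deep_queues`. With the listing above, only queue.vplex.external would be printed.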
Cause
- RabbitMQ will attempt to retain all events until the events are consumed or the memory consumption threshold is hit.
- VPLEX currently has no default consumer for the queue.vplex.external queue.
- If left unchecked, the queue size can grow very large.
- Once memory consumption hits 20%, RabbitMQ will write the queue out to disk which in turn uses root disk space.
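To put the 20% figure in concrete terms, the arithmetic can be sketched as below. This assumes the percentage is measured against total system RAM, which the article does not state explicitly; the function name paging_threshold_kib is hypothetical.

```shell
# Compute 20% of a memory total given in KiB, using integer arithmetic.
# Assumption: the 20% figure is relative to total system RAM.
paging_threshold_kib() {
    echo $(( $1 * 20 / 100 ))
}
```

For the 3920396k total shown in the top output under Symptom 1, `paging_threshold_kib 3920396` prints 784079, or about 766 MiB. On a live system the total could be read with `awk '/^MemTotal:/ {print $2}' /proc/meminfo`.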
Resolution
Workaround 1:
From the management server, restart the rabbitmq server using the following command:
sudo service rabbitmq-server restart
Sample output:
service@ManagementServer:~> sudo service rabbitmq-server restart
Restarting rabbitmq-server: SUCCESS
rabbitmq-server.
service@ManagementServer:~>
Workaround 2:
From the management server, reboot the management server using the following command:
sudo shutdown -r now
Sample output:
service@ManagementServer:~> sudo shutdown -r now
Broadcast message from root (pts/0) (Mon Mar 5 19:33:18 2018):
The system is going down for reboot NOW!
Note:
When the management server reboots, a PuTTY Fatal Error message will pop up stating "Server unexpectedly closed network connection".
Permanent Fix:
This issue was addressed in GeoSynchrony 5.5 and later.
Affected Products
VPLEX Series
Products
VPLEX for All Flash, VPLEX GeoSynchrony, VPLEX Series, VPLEX VS1, VPLEX VS2
Article Properties
Article Number: 000170841
Article Type: Solution
Last Modified: 20 Nov 2020
Version: 2