PowerFlex: Gateway High Availability setup causes 401 errors on REST clients

Summary: A REST API client receives a "401: Unauthorized" error when both Gateway/Apache services are running.

Symptoms

A REST API client receives a "401: Unauthorized" error when both Gateway/Apache services are running.

One example of a REST API client is OpenStack Cinder. This issue may cause certain ScaleIO volume operations (map, unmap, and so forth) in OpenStack to fail.

For every 10 successful REST API requests, approximately 1 fails. For example, mod_jk.log on the primary Apache service shows:

tail -f /var/log/apache2/mod_jk.log | grep ") status"
[6496:139877463439104] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine2) status = 200
[6497:139877295294208] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine2) status = 200
[6497:139877270116096] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine2) status = 200
[6496:139877429868288] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine1) status = 401 <---
[6497:139877219759872] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine2) status = 200
[6496:139877303686912] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine2) status = 200
[6497:139877228152576] [debug] ajp_unmarshal_response::jk_ajp_common.c (739): (machine2) status = 200



/var/log/nova/nova-compute.log shows:

2017-04-05 11:20:36.090 38186 ERROR nova.compute.manager [instance: 20e1036d-daf0-49b9-a228-07a1c48b882d] File "/usr/lib/python2.7/site-packages/os_brick/initiator/connector.py", line 1980, in connect_volume
2017-04-05 11:20:36.090 38186 ERROR nova.compute.manager [instance: 20e1036d-daf0-49b9-a228-07a1c48b882d] self.volume_id = self._get_volume_id()
2017-04-05 11:20:36.090 38186 ERROR nova.compute.manager [instance: 20e1036d-daf0-49b9-a228-07a1c48b882d] File "/usr/lib/python2.7/site-packages/os_brick/initiator/connector.py", line 1879, in _get_volume_id
2017-04-05 11:20:36.090 38186 ERROR nova.compute.manager [instance: 20e1036d-daf0-49b9-a228-07a1c48b882d] raise exception.BrickException(message=msg)
2017-04-05 11:20:36.090 38186 ERROR nova.compute.manager [instance: 20e1036d-daf0-49b9-a228-07a1c48b882d] BrickException: Error getting volume id from name oGXMByctQWesXL8PPKiyBQ==: Unauthorized
2017-04-05 11:20:36.090 38186 ERROR nova.compute.manager [instance: 20e1036d-daf0-49b9-a228-07a1c48b882d]

 

Cause

This issue is caused by an error in the document. The workers.properties configuration in the document sets up load balancing between two Gateway (Tomcat) instances, with lbfactor set to 10 for one and 1 for the other. As a result, the Apache service directs incoming requests to the two Gateways at a 10:1 ratio. Because the REST API client acquires a token through one Gateway, and tokens are not shared between Gateways, a request that is sent to the other Gateway with this token fails with 401.

Note: If a client acquires a token from the Gateway with lbfactor 1, the failure rate is about 91%.
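The two failure rates follow directly from the 10:1 split. The sketch below works through the arithmetic. It assumes that mod_jk distributes requests strictly in proportion to lbfactor, that the client keeps reusing a single token, and that machine2 is the worker with lbfactor 10 (as the mod_jk.log excerpt suggests). The function name failure_rate is illustrative only.

```python
from fractions import Fraction


def failure_rate(lbfactors, token_owner):
    """Fraction of requests routed to a Gateway other than the token owner."""
    total = sum(lbfactors.values())
    return Fraction(total - lbfactors[token_owner], total)


# Assumed mapping: machine2 carries lbfactor 10, machine1 carries lbfactor 1.
lbfactors = {"machine1": 1, "machine2": 10}

for owner in lbfactors:
    rate = failure_rate(lbfactors, owner)
    print(f"token from {owner}: {rate} of requests fail ({float(rate):.0%})")
```

Under these assumptions, a token issued by the lbfactor 10 Gateway fails on 1 of every 11 requests (about 9%), and a token issued by the lbfactor 1 Gateway fails on 10 of every 11 (about 91%).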

Resolution

Workaround
Use the following workers.properties file instead of the file in the document. This sets up the two Gateways in active-standby mode:

*** /etc/apache2/workers.properties ***

worker.list=balance1
worker.machine1.type=ajp13
worker.machine1.host=<ip of GW 1>
worker.machine1.port=8009
worker.machine1.lbfactor=1
worker.machine1.activation=disabled

worker.machine2.type=ajp13
worker.machine2.host=<ip of GW 2>
worker.machine2.port=8009
worker.machine2.lbfactor=1
worker.machine2.redirect=machine1

worker.balance1.type=lb
worker.balance1.balance_workers=machine1,machine2



This configuration sets up machine2 as the primary and machine1 as the standby. The key differences between this configuration and the document are:

  • worker.machine1.activation=disabled
This puts worker machine1 in standby, and no requests are sent to machine1 by default.
  • worker.machine2.redirect=machine1
By default machine2 is activated, and receives requests. If machine2 fails, requests are redirected to machine1.
  • worker.machine#.lbfactor=1
As this is an active-standby setup, different lbfactor values are not required for the two workers.
 
 

With this configuration:

  • When both Gateways are up, all requests are directed to machine2, and there should be no 401 errors.
  • When machine2 goes down, requests are directed to machine1. The REST client receives a 401, and can log in again to the REST API service and continue.
  • When machine2 comes back and the mod_jk module detects it, requests are directed to machine2 again. The REST client receives another 401, but can log in again to the REST API service and continue.
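The "receive a 401, log in again, continue" behavior in the list above can be sketched on the client side as follows. This is a minimal illustration, not the PowerFlex or OpenStack implementation: Unauthorized, call_with_relogin, send_request, and login are hypothetical names, and the two callables stand in for whatever the real client uses to log in and to issue a request.

```python
class Unauthorized(Exception):
    """Raised by the hypothetical send_request callable on HTTP 401."""


def call_with_relogin(send_request, login, token=None, max_logins=2):
    """Call send_request(token); on a 401, discard the token, log in, retry.

    login() returns a fresh token. send_request(token) returns the response
    or raises Unauthorized when the Gateway rejects the token.
    Returns (response, token) so the caller can keep reusing the new token.
    """
    logins = 0
    while True:
        if token is None:
            if logins >= max_logins:
                raise Unauthorized("still unauthorized after re-login")
            token = login()
            logins += 1
        try:
            return send_request(token), token
        except Unauthorized:
            token = None  # token was issued by the other Gateway; drop it


# Simulated failover: the client holds a token from machine2,
# but requests are now being served by machine1.
active = {"gateway": "machine1"}


def login():
    return f"token-from-{active['gateway']}"


def send_request(token):
    if token != f"token-from-{active['gateway']}":
        raise Unauthorized(token)
    return "volume-id"


result, token = call_with_relogin(send_request, login, token="token-from-machine2")
print(result, token)
```

Capping the number of logins keeps the client from looping forever if the credentials themselves are wrong rather than the token merely belonging to the other Gateway.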

Note: Both Apache services must have the same configuration in their workers.properties files. The Apache services are themselves set up as an active-standby cluster by keepalived, and the mod_jk module in the Apache service directs REST API requests to the Gateway services based on the above configuration.

This is a documentation error. Use this KB article until the document is corrected.

Additional Information

The keepalived configuration can also be improved as it may not monitor apache/httpd services correctly.

The keepalived.conf uses "killall -0 apache2" as the "script". This returns 0 (success) if there is any process with "apache2" in the name, such as "tail -f /var/log/apache2/mod_jk.log".

To correctly monitor the apache2 service, use "systemctl --no-pager status apache2" (Ubuntu), or "systemctl status httpd" (CentOS/Red Hat).

The command used as "script" must return 0 if the apache2/httpd service is running, and nonzero if it has stopped.
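The exit-code contract described above can be expressed as a small wrapper, sketched here under the assumption that the host runs systemd. The function name service_exit_code and the runner parameter are illustrative; runner exists only so that the logic can be exercised without a real systemctl binary.

```python
import subprocess


def service_exit_code(unit, runner=subprocess.run):
    """Return 0 if systemctl reports the unit as running, nonzero otherwise.

    Uses the same command as suggested above for Ubuntu; pass "httpd" as
    the unit on CentOS/Red Hat.
    """
    result = runner(
        ["systemctl", "--no-pager", "status", unit],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# To use this as a check script, exit with the returned code, for example:
#   import sys; sys.exit(service_exit_code("apache2"))
```

Because the check queries the service state rather than matching process names, an unrelated process such as a log tail does not affect the result.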

Affected Products

PowerFlex Software
Article Properties
Article Number: 000052840
Article Type: Solution
Last Modified: 29 Oct 2025
Version:  4