Isilon: Version 8.0+ WebUI fails with status 500 error due to FD_SETSIZE error in Apache2

Summary: If a client makes too many API calls to establish sessions with a node, the number of stream file descriptors held by the httpd process exceeds 1024, which causes issues with isi_papi_d ...


Symptoms



The WebUI is not accessible and returns a "server-side failure with status 500" error.

The following errors appear in /var/log/apache2/webui_httpd_error.log:
 
2019-05-16T09:12:26Z <18.3> kanagawa-1 httpd[3004]: [:error] [pid 3004:tid 34678361088] (20)Not a directory: [client 1X.8X.1XX.1X1:60831] FastCGI: failed to connect to server "/usr/sbin/isi_papi_d": socket file descriptor (1185) is larger than FD_SETSIZE (1024), you probably need to rebuild Apache with a larger FD_SETSIZE, referer: https://1X.1XX.1XX.1X8:8080/
2019-05-16T09:12:26Z <18.3> kanagawa-1 httpd[3004]: [:error] [pid 3004:tid 34678361088] [client 1X.8X.1XX.1X1:60831] FastCGI: incomplete headers (0 bytes) received from server "/usr/sbin/isi_papi_d", referer: https://1X.1XX.1XX.1X8:8080/
2019-05-16T09:12:26Z <18.3> kanagawa-1 httpd[3004]: [:error] [pid 3004:tid 34678361088] [client 1X.8X.1XX.1X1:60831] FastCGI: do_work() failed with ret 500 for request GET /platform/5/cluster/identity?_dc=1558011363341 HTTP/1.1, referer: https://1X.1XX.1XX.1X8:8080/
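
To gauge how often the error occurs, the log can be summarized with a small filter. The following is a minimal sketch, assuming the message format shown in the excerpt above; the function name fd_setsize_summary is hypothetical and not part of OneFS.

```shell
# Hypothetical helper (not part of OneFS): read webui_httpd_error.log lines on
# stdin, then print how many FD_SETSIZE failures were logged and the highest
# socket descriptor number reported in them.
fd_setsize_summary() {
    sed -n 's/.*socket file descriptor (\([0-9][0-9]*\)) is larger than FD_SETSIZE.*/\1/p' |
        awk 'BEGIN { max = 0 }
             { n++; if ($1 + 0 > max) max = $1 + 0 }
             END { printf "errors=%d max_fd=%d\n", n, max }'
}

# Usage on a node:
#   fd_setsize_summary < /var/log/apache2/webui_httpd_error.log
```

A non-zero error count confirms that the 500 responses match the symptom described in this article.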

Cause

This issue occurs when the file descriptor count of an Apache2 (httpd) process exceeds 1024, which is the FD_SETSIZE limit that Apache2 was built with.

This is verified by running the command below:    
 
kanagawa-1# for i in $(ps -auwx | grep -i httpd |grep -v grep | awk '{print $2}'); do echo $i ; procstat -f $i ; done

Notice that one of the httpd processes has a high FD count that exceeds 1024 (look at column 3 of the last line of the process's output):    
 
kanagawa-1# for i in $(ps -auwx | grep -i httpd |grep -v grep | awk '{print $2}'); do echo $i ; procstat -f $i ; done
25136
  PID COMM               FD T V FLAGS     REF  OFFSET PRO NAME
25136 httpd             text v r r-------  -       - -   /usr/local/apache24/bin/httpd
25136 httpd              cwd v d r-------  -       - -   /
25136 httpd             root v d r-------  -       - -   /
25136 httpd                0 v c r-------  4       0 -   /dev/null
25136 httpd                1 v c -w------  4       0 -   /dev/null
25136 httpd                2 v c -w------  5       0 -   /dev/null
25136 httpd                3 s - rw---n--  4       0 TCP ::.8080 ::.0
25136 httpd                4 s - rw---n--  4       0 TCP 0.0.0.0:8080 0.0.0.0:0
25136 httpd                5 s - rw---n--  4       0 TCP ::.8081 ::.0
25136 httpd                6 s - rw---n--  4       0 TCP 0.0.0.0:8081 0.0.0.0:0
25136 httpd                7 s - rw---n--  4       0 TCP ::.8082 ::.0
25136 httpd                8 s - rw---n--  4       0 TCP 0.0.0.0:8082 0.0.0.0:0
25136 httpd                9 s - rw---n--  4       0 TCP ::.8083 ::.0
25136 httpd               10 s - rw---n--  4       0 TCP 0.0.0.0:8083 0.0.0.0:0
25136 httpd               11 p - rw------  5       0 -   -
25136 httpd               12 p - rw------  4       0 -   -
25136 httpd               13 v r -w------  4       0 -   /var/apache2/run/mpm-accept-0.25132
25136 httpd               14 s - rw------  4       0 UDD /var/run/log
25136 httpd               15 v r -w------  1       0 -   /var/apache2/run/proxy.25132
25136 httpd               16 p - rw------  4       0 -   -
25136 httpd               17 v r -w------  2       0 -   /var/apache2/run/proxy.25132
25136 httpd               18 p - rw------  2       0 -   -
25136 httpd               19 v r -w------  2       0 -   /var/apache2/run/rewrite-map.25132
25136 httpd               20 v r -w-----l  1       0 -   /var/apache2/run/mpm-accept-0.25132
25136 httpd               21 v r -w------  1       0 -   /var/apache2/run/rewrite-map.25132
25136 httpd               22 k - rw------  2       0 -   -
25136 httpd               23 s - rw---n--  1       0 TCP sendq:2146 127.0.0.1:8080 127.0.0.1:18720
25136 httpd               24 v c r-------  1   31968 -   /dev/random
25136 httpd               25 ? - r-------  1       0 -   -
25136 httpd               27 ? - r-------  1       0 -   -
.
.
.
.
25136 httpd             1121 ? - r-------  1       0 -   -
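
Reading the full procstat listing by eye is tedious on a busy node. As a minimal sketch, assuming the column layout shown in the output above, the highest descriptor per process can be extracted with awk; the function name highest_fd is hypothetical.

```shell
# Hypothetical helper (not part of OneFS): read `procstat -f` output on stdin
# and print "<pid> <highest numeric FD>" for each PID seen. Rows whose FD
# column is not a number (the header, text, cwd, root) are ignored.
highest_fd() {
    awk '$3 ~ /^[0-9]+$/ {
             if (!($1 in max) || $3 + 0 > max[$1]) max[$1] = $3 + 0
         }
         END { for (pid in max) print pid, max[pid] }'
}

# Usage on a node, reusing the loop from this article:
#   for i in $(ps -auwx | grep -i httpd | grep -v grep | awk '{print $2}'); do
#       procstat -f $i
#   done | highest_fd
```

Any PID reported with a value above 1024 is a process affected by this issue.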

In UNIX and related computer operating systems, a file descriptor (FD, less frequently fildes) is an abstract indicator used to access a file or other input/output resource, such as a pipe or network socket. File descriptors form part of the POSIX application programming interface.

More details about the different types of FDs seen in the procstat output can be found in the FreeBSD procstat(1) manual page.

Resolution

Check the /var/log/apache2/webui_httpd_access.log for clients that are establishing sessions with the node:    
 
2019-05-16T10:12:51Z <19.6> kanagawa-1 httpd: 127.0.0.1 - - [16/May/2019:10:12:51 +0000] "POST /session/1/session HTTP/1.1" 201 92 "-" "curl/7.57.0"

If these requests occur very frequently (multiple sessions per second), they can easily overload the Apache2 server and prevent the Apache2 process from communicating with the isi_papi_d service, because no socket file descriptors below the 1024 limit are available.

Identify the offending client in webui_httpd_access.log and correct its behavior on the client side.
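
To find clients that create several sessions per second, the access log can be grouped by client address and timestamp. This is a minimal sketch that assumes the field positions of the sample line above (syslog timestamp in field 1, client address in field 5, request method and path in fields 10 and 11); the function name session_rate is hypothetical.

```shell
# Hypothetical helper (not part of OneFS): read webui_httpd_access.log lines
# on stdin and print "<count> <client> <timestamp>" for every second in which
# a client issued POST /session/1/session, busiest first.
session_rate() {
    awk '$10 == "\"POST" && $11 == "/session/1/session" { count[$5 " " $1]++ }
         END { for (k in count) print count[k], k }' | sort -rn
}

# Usage on a node:
#   session_rate < /var/log/apache2/webui_httpd_access.log | head
```

Clients that appear at the top with counts greater than 1 are the ones to investigate first.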

As a temporary workaround, restart the Apache2 and isi_webui services to clear the FDs. 

Workaround:
  1. Disable the services:
# isi services -a apache2 disable
# isi services -a isi_webui disable
  2. Verify that no httpd processes are still running, and kill any that remain:
# isi_for_array " ps -auwx | grep -i httpd | grep -v grep "
  3. Enable the services again:
# isi services -a apache2 enable
# isi services -a isi_webui enable
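
The verification step lists leftover processes but does not extract their PIDs. As a minimal sketch, assuming plain `ps -auwx` output on a single node (not the node-prefixed output of isi_for_array, which shifts the columns), the PIDs can be pulled out as follows; the function name leftover_httpd_pids is hypothetical.

```shell
# Hypothetical helper (not part of OneFS): read `ps -auwx` output on stdin and
# print the PID (column 2) of every line that mentions httpd, skipping the
# grep process itself. This mirrors the grep/awk pipeline used in this article.
leftover_httpd_pids() {
    awk 'tolower($0) ~ /httpd/ && $0 !~ /grep/ { print $2 }'
}

# Usage on a single node, after the services have been disabled:
#   ps -auwx | leftover_httpd_pids
# Review the list before killing anything, for example:
#   kill $(ps -auwx | leftover_httpd_pids)
```

An empty result means it is safe to re-enable the services.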

Additional Information

A known trigger for this issue is related to the error below:   
 

2019-05-16T14:43:20-04:00 <18.3> Kanagawa-1 httpd[80346]: [error] [client 10.118.160.121] No Origin or Referer header for CSRF protection


When a client does not include the CSRF headers or tokens when establishing a session, it can establish the session but cannot log in to the cluster. A client script may then keep retrying to establish a session with the node, which can lead to a high FD count.
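
To see which clients are being rejected by the CSRF check, the error log can be grouped by client address. This is a minimal sketch that assumes the message format shown above; the function name csrf_rejections is hypothetical.

```shell
# Hypothetical helper (not part of OneFS): read the Apache error log on stdin
# and print "<count> <client>" for each client that was rejected with
# "No Origin or Referer header for CSRF protection", busiest first.
csrf_rejections() {
    sed -n 's/.*\[client \([^]]*\)\] No Origin or Referer header for CSRF protection.*/\1/p' |
        awk '{ count[$1]++ } END { for (c in count) print count[c], c }' | sort -rn
}

# Usage on a node:
#   csrf_rejections < /var/log/apache2/webui_httpd_error.log
```

A client that shows up here and also creates sessions at a high rate in the access log is a likely source of the FD growth.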

The resolution is to change the client so that it is compatible with the CSRF checks. For further information, refer to the section on how to implement authentication with CSRF protection in KB 517421: OneFS: How to protect your cluster from Cross-Site Request Forgery (CSRF). (Only registered Dell customers can access the linked article, via Dell.com/support.)

Affected Products

Isilon

Products

Isilon, PowerScale OneFS
Article Properties
Article Number: 000061440
Article Type: Solution
Last Modified: 08 Dec 2025
Version:  4