October 15th, 2013 21:00

FSAnalyze failures

We frequently have FSAnalyze failures, and I have no idea where to even start to dig here.  As a result, our InsightIQ is very well utilized.

Where should we start to look?

October 16th, 2013 00:00

Faults have been reported by the cluster: the FSAnalyze job fails. (Attachment: Capture.JPG.jpg)

We use the EMC provided InsightIQ virtual appliance, and we have no problems with it.

We are running OneFS 7.0.1.1

October 16th, 2013 00:00

Could you clarify if you are seeing faults as reported by the cluster (FSAnalyze job) and/or as reported by InsightIQ?  Also, what version of OneFS and InsightIQ?

I simply wanted to note a common issue that is frequently encountered with File System Analytics.  When FSA is enabled within InsightIQ (per monitored cluster), the InsightIQ virtual appliance/server *must* be able to mount the following directory on the cluster(s) via NFS in order to generate FILE SYSTEM REPORTING output:

/ifs/.ifsvar/modules/fsa

Sometimes the InsightIQ server is unable to mount it. Some common reasons are:

1) The default /ifs export was deleted or locked down, but a refined export was not created for fsa

By default there is an /ifs export that is not limited to any specific host (the Clients field is blank) and has "Mount access to sub-directories" ENABLED. These (insecure) defaults are what ultimately allow InsightIQ to mount the fsa folder.  Security-conscious users will (rightfully so) either delete the export outright or lock it down to a specific (mgmt) host.

What I personally see frequently, especially in an all-SMB environment, is users deleting the export, or even disabling the NFS service altogether, unaware that it is required for InsightIQ File System Analytics.  Keep in mind that when FSA is enabled, statistics are still gathered, since FSAnalyze is a daily job on the cluster, but being unable to mount the fsa folder via NFS prevents you from generating any such reports.

Deleting or restricting the default /ifs export is not wrong, and it is definitely recommended *not* to keep it as-is. For InsightIQ FSA, however, you will want to create a dedicated export as follows:

NOTE:

1) Under "Clients" in the screenshots below, 1.1.1.1 represents the InsightIQ mgmt IP

2) You will need to type in the path.  If you try to "Browse...", .ifsvar doesn't appear in the GUI.

v7.0.1.x

v701.png

v7.0.2.x

v702.png

On the InsightIQ server, if you log in and issue the following command:

$ mount

You should see an entry (as a result of autofs) similar to the following:

2.2.2.2:/ifs/.ifsvar/modules/fsa on /net/2.2.2.2/ifs/.ifsvar/modules/fsa type nfs (rw,nosuid,nodev,intr,sloppy,addr=2.2.2.2)

Here, 2.2.2.2 represents an IP on the cluster (based on what you specified when you configured InsightIQ to monitor the cluster).
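If you would rather script this check than eyeball the output, something like the following might work. It is only a sketch: `has_fsa_mount` is a made-up helper name, and it assumes the mount line looks like the sample above.

```shell
# Hypothetical helper: succeeds if the mount table read from stdin contains
# an NFS mount of a cluster's fsa directory.
has_fsa_mount() {
    grep -q ':/ifs/\.ifsvar/modules/fsa on .* type nfs'
}

# Usage on the InsightIQ server:
#   mount | has_fsa_mount && echo "fsa is mounted" || echo "fsa is NOT mounted"
```

Because the mount is created by autofs, it may only show up after InsightIQ (or you) has actually touched the /net/2.2.2.2/... path.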

2) Firewall

If you have the export on the cluster but you still don't see the InsightIQ instance mount it, here are a few things to try:

a) sudo service autofs status

b) showmount -e

c) rpcinfo -p

As noted in the InsightIQ Guide, InsightIQ requires the following ports:

[...]

For monitored clusters running OneFS 7.0 and later, you must enable HTTPS port 8080. For monitored clusters running an earlier version of OneFS, you must enable HTTPS port 9443. If you use the File System Analytics feature, you must also enable the NFS service, HTTPS port 111, and HTTPS port 2049 on all monitored clusters.

[...]
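Ports 111 and 2049 are the standard portmapper and NFS ports, so one way to check them from the InsightIQ server is to look at what `rpcinfo -p` reports for the cluster. A rough sketch follows. The helper name `rpc_ports_ok` is made up, and it assumes the usual five-column `rpcinfo -p` layout (program, vers, proto, port, service); 100000 and 100003 are the standard RPC program numbers for portmapper and nfs.

```shell
# Hypothetical helper: reads `rpcinfo -p <cluster-ip>` output on stdin and
# succeeds only if portmapper (program 100000, port 111) and
# nfs (program 100003, port 2049) are both listed.
rpc_ports_ok() {
    awk '$1 == 100000 && $4 == 111  { pm = 1 }
         $1 == 100003 && $4 == 2049 { nfs = 1 }
         END { exit !(pm && nfs) }'
}

# Usage on the InsightIQ server (2.2.2.2 is the cluster IP, as above):
#   rpcinfo -p 2.2.2.2 | rpc_ports_ok && echo "rpc/nfs reachable" || echo "check firewall"
#   showmount -e 2.2.2.2    # the fsa export should appear in the list
```

If `rpcinfo -p` itself hangs or times out, that usually points at port 111 being blocked before NFS is even tried.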

You may also consider rebooting the InsightIQ instance or stopping and restarting the autofs service:

sudo service autofs stop

sudo service autofs start

October 16th, 2013 17:00

Thanks for the clarification.  In that case, please ignore everything I mentioned above: whether or not the InsightIQ server can mount the fsa directory on the monitored cluster has no impact on the FSAnalyze job, which runs regardless.

For deep analysis you'd want to review the /var/log/isi_job_d.log file (or any of the archived logs, *.gz).  For a good sample of log entries related to the phases of FSAnalyze (begin, merge, etc.), the following KB article is worth reviewing and can provide some good keywords to search for:

Log file: isi_job_d.log

https://support.emc.com/kb/88712
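As a starting point before you have the KB keywords, you could simply pull every line that mentions FSAnalyze out of the current and archived logs. This is a sketch only: `fsa_log_lines` is a made-up helper, and the archive file pattern in the usage line is an assumption, so adjust it to whatever *.gz names you actually see in /var/log.

```shell
# Hypothetical helper: print every line mentioning FSAnalyze from the given
# log files; files ending in .gz are decompressed on the fly.
fsa_log_lines() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc "$f" ;;
            *)    cat "$f" ;;
        esac
    done | grep -i 'fsanalyze'
}

# Usage on a cluster node (archive pattern is an assumption):
#   fsa_log_lines /var/log/isi_job_d.log /var/log/isi_job_d.log*.gz
```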

However, if you haven't already, I'd recommend opening up a ticket with support first.

July 23rd, 2014 22:00

As stated, the error is for a job failure; it is a generic term.

400100004

Job has failed.

The message is informational only

We need to dig into why the job failed. FSAnalyze runs at low priority, so any other job that kicks in at the same time might cause the FSAnalyze job to fail.

FSAnalyze          Yes      LOW

FlexProtect        Yes      MEDIUM

FlexProtectLin     Yes      MEDIUM

IntegrityScan      Yes      MEDIUM

You can provide the output of the following commands to see why the job failed:

# isi job status

# isi job reports view --id=1348

Example:

isi job reports view --id=1345

FSAnalyze[1345] phase 2 (2014-07-22T22:40:53)

---------------------------------------------

FSA JOB MERGE PHASE

Elapsed time:                       20 seconds

Errors:                              0

CPU usage:                         max 53% (dev 6), min 0% (dev 7), avg 11%

Virtual memory size:               max 211308K (dev 6), min 135532K (dev 7), avg 155329K

Resident memory size:              max 91680K (dev 6), min 17636K (dev 7), avg 35705K

Read:                              327 ops, 3379712 bytes (3.2M)

Write:                             26523 ops, 216485376 bytes (206.5M)

FSAnalyze[1345] phase 1 (2014-07-22T22:40:33)

---------------------------------------------

FSA JOB QUERY PHASE

Elapsed time:                     2421 seconds

LINS traversed:                6716727

Errors:                              0

CPU usage:                         max 61% (dev 6), min 2% (dev 6), avg 31%

Virtual memory size:               max 202988K (dev 6), min 133996K (dev 6), avg 144263K

Resident memory size:              max 83588K (dev 6), min 15964K (dev 8), avg 23186K

Read:                              74181 ops, 539489280 bytes (514.5M)

Write:                             43626 ops, 353920512 bytes (337.5M)
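When a report has many phases, it can help to boil it down to the error count per phase. A minimal sketch, assuming the report layout shown in this example (`report_errors` is a made-up helper, not an isi command):

```shell
# Hypothetical helper: reads `isi job reports view` output on stdin and
# prints one "<job> phase <n> errors=<count>" line per phase section.
report_errors() {
    awk '/ phase [0-9]+ \(/ { phase = $1 " phase " $3 }
         $1 == "Errors:"    { print phase, "errors=" $2 }'
}

# Usage on a cluster node:
#   isi job reports view --id=1345 | report_errors
```

Any phase with a non-zero count is the place to start reading the full report.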
