3 Argentium

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

Hi Peter,

Is there a way to find out which share is connected to which file server(s)?

Also, the document that you mentioned, Active Directory Discovery and Failover for OneFS, is no longer available. I can't find it on the support site either. Could you please share it with us if you have a copy?


Thanks

Damal

Reply

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

This might be a pretty basic question, but I haven't yet found a good explanation for it.  When setting up an SMB share, we are given two choices: "Apply Windows Default ACLs" and "Do not change existing permissions".  When I attended training, we were advised to choose "Do not change existing permissions".  However, when I ran into issues at a client site and called support, I was told to use the other option.  What is the difference between the two, and what is the general use case for each?  Thanks.

Reply
2 Bronze

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

Hello prashant_shah,

This option is often misunderstood, so I am glad you asked.

When a cluster is set up, /ifs is configured with the following default permissions:

ISI7021-1# ls -led /ifs

drwxrwxrwx    9 root  wheel  158 Jul 17 07:46 /ifs

OWNER: user:root

GROUP: group:wheel

SYNTHETIC ACL

0: user:root allow dir_gen_read,dir_gen_write,dir_gen_execute,std_write_dac,delete_child

1: group:wheel allow dir_gen_read,dir_gen_write,dir_gen_execute,delete_child

2: everyone allow dir_gen_read,dir_gen_write,dir_gen_execute,delete_child

If you create a directory through the WebUI or CLI, the directory will get the following permissions:

ISI7021-1# ls -led /ifs/tmp

drwxr-xr-x    2 root  wheel  0 Jul 17 07:46 /ifs/tmp

OWNER: user:root

GROUP: group:wheel

SYNTHETIC ACL

0: user:root allow dir_gen_read,dir_gen_write,dir_gen_execute,std_write_dac,delete_child

1: group:wheel allow dir_gen_read,dir_gen_execute

2: everyone allow dir_gen_read,dir_gen_execute

If you create a new share pointing to the /ifs/tmp directory and select "Do not change existing permissions", it will leave the permissions as:

ISI7021-1# ls -led /ifs/tmp

drwxr-xr-x    2 root  wheel  0 Jul 17 07:46 /ifs/tmp

OWNER: user:root

GROUP: group:wheel

SYNTHETIC ACL

0: user:root allow dir_gen_read,dir_gen_write,dir_gen_execute,std_write_dac,delete_child

1: group:wheel allow dir_gen_read,dir_gen_execute

2: everyone allow dir_gen_read,dir_gen_execute

If you create a new share pointing to the /ifs/tmp directory and select "Apply Windows Default ACLs" the equivalent will be run against the directory:

chmod -D /ifs/tmp

chmod -c dacl_auto_inherited,dacl_protected /ifs/tmp

chmod +a# 0 group Administrators allow dir_gen_all,object_inherit,container_inherit /ifs/tmp

chmod +a# 1 group creator_owner allow dir_gen_all,object_inherit,container_inherit,inherit_only /ifs/tmp

chmod +a# 2 group everyone allow dir_gen_read,dir_gen_execute /ifs/tmp

chmod +a# 3 group Users allow dir_gen_read,dir_gen_execute,object_inherit,container_inherit /ifs/tmp

chmod +a# 4 group Users allow std_synchronize,add_file,add_subdir,container_inherit /ifs/tmp

That ends up converting the ACL to:

ISI7021-1# ls -led /ifs/tmp

drwxrwxr-x +  2 root  wheel  0 Jul 17 07:46 /ifs/tmp

OWNER: user:root

GROUP: group:wheel

CONTROL:dacl_auto_inherited,dacl_protected

0: group:Administrators allow dir_gen_all,object_inherit,container_inherit

1: creator_owner allow dir_gen_all,object_inherit,container_inherit,inherit_only

2: everyone allow dir_gen_read,dir_gen_execute

3: group:Users allow dir_gen_read,dir_gen_execute,object_inherit,container_inherit

4: group:Users allow std_synchronize,add_file,add_subdir,container_inherit

This may or may not be a good thing for the permissions on your directories.  Let's say that /ifs/tmp was an NFS export and you explicitly wanted those mode bit rights set due to Unix client application requirements.  By selecting the "Apply Windows Default ACLs" option, you have now overwritten the original ACL, which may break the application.  Thus, there is risk associated with using "Apply Windows Default ACLs" on an existing directory.
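If you want a safety net before experimenting on an existing path, it is cheap to capture the current ACL somewhere outside the directory first so it can be reapplied by hand if an application breaks. This is just a sketch and the backup path is only an example:

ls -led /ifs/tmp > /ifs/data/acl_backup_tmp.txt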

On the flip side, let's say that /ifs/tmp was a brand new directory created from the CLI that you want Windows users to be able to create and delete files in.  When creating the share, if you select "Do not change existing permissions" and then have users attempt to save files there, they will get access denied because "Everyone" only gets read access.  In fact, even as Administrator, you would not be able to modify the Security tab of the directory to add Windows users, because the mode bits limit access to root only.

In summary, a pretty good rule of thumb is as follows:

-- If you have an existing directory structure that you want to add a share to, you most likely do not want to change the ACL so you should select the "Do not change existing permissions" option.

-- If you are creating a new share for a new directory, you will likely be changing the ACL to grant Windows users rights to perform operations.  Thus, you should select the "Apply Windows Default ACLs" option and then, once the share is created, go into the Security tab from Windows and assign permissions to users as needed.
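As an example of that last step done from the CLI instead of the Windows Security tab, OneFS chmod can append an ACE using the same syntax shown earlier in this post. This is only a sketch: the group name is a placeholder, the index 5 assumes the five default ACEs (0-4) from "Apply Windows Default ACLs" are already in place, and the rights should be adjusted to your requirements.

chmod +a# 5 group 'DOMAIN\FileUsers' allow dir_gen_read,dir_gen_write,dir_gen_execute,delete_child,object_inherit,container_inherit /ifs/tmp

ls -led /ifs/tmp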

Reply

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

Great explanation!  Thank you.

Reply
2 Bronze

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

We had a lot of great discussions last week, so I figured I would kick this week off with one of my least favorite topics: SMB performance.

When it comes to performance, 99% of the time there is no silver bullet to fix the issue.  Since performance will likely be a future ATE, I am going to focus this conversation on what to look for from an SMB perspective.

Most of the time, when someone comes to me and says SMB is slow, I ask the following questions:

1.) How are you measuring slow?  Wall clock, or something that can accurately measure time?

2.) Define slow.  Did a job that always ran in 10 seconds now take 20 seconds?

3.) What has changed? Are there new jobs running on the system that never ran before? *Interesting note: in all the years I have been in support, nothing has ever changed.

From an Isilon perspective, we have four ways to measure performance:

1.) InsightIQ - This tool is a VM that sits on your network, collects data from your cluster, and stores the data in a local database.  You can then give the database to support, who can pull it into their own InsightIQ system to extrapolate the data.

2.) Historical counters - Yes!  We collect historical counters, so you could call support and tell us, "Hey, I saw a slowdown last Monday," and we can pull the historical stats to start our analysis.  There are some caveats with historical counters:

  -- Not all counters are historical

  -- They are less accurate as time elapses

3.) FreeBSD stats - Standard performance tools built into FreeBSD to troubleshoot issues

4.) isi statistics - Isilon-specific counters that measure OneFS internals and can give us a better understanding of where latency is seen

To limit the size of this post, I am going to talk about the FreeBSD and isi statistics commands that I look at when troubleshooting SMB performance issues.

One of the first places people look during an issue is ps or top.  These are good places to start to help you understand how much CPU lwio and the other Likewise services are using, but they are also the greatest cause of confusion.  The lwio process within OneFS is multithreaded, but when you look at it with the default output of ps and top, it looks single threaded.  Engineers tend to become concerned when they see it approaching 100% and become confused when it is over 100%.  Do not be afraid of lwio running at 100% when using the default output of ps.

For example:

Regular ps looks scary at 110%:

b5-2-1# ps -fwulp `pgrep lwio`

USER   PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND            UID  PPID CPU PRI NI MWCHAN

root  3311  110.0  0.1 130836 15688  ??  I    24May13 387:36.82 lw-container lwi     0  3171   0  96  0 ucond

But ps with the flag to show threads (-H) does not look so bad:

b5-2-1# ps -fwulHp `pgrep lwio`

USER   PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND            UID  PPID CPU PRI NI MWCHAN

root  3311  0.0   0.1 130836 15688  ??  I    24May13   0:00.03 lw-container lwi     0  3171   0  20  0 sigwait

root  3311  20.0  0.1 130836 15688  ??  I    24May13   1:39.00 lw-container lwi     0  3171   0   4  0 kqread

root  3311  20.0  0.1 130836 15688  ??  I    24May13   0:00.09 lw-container lwi     0  3171   0   4  0 kqread

root  3311  20.0  0.1 130836 15688  ??  I    24May13   0:00.33 lw-container lwi     0  3171   0   4  0 kqread

root  3311  20.0  0.1 130836 15688  ??  I    24May13 378:10.17 lw-container lwi     0  3171   0   4  0 kqread

root  3311  20.0  0.1 130836 15688  ??  I    24May13   0:00.32 lw-container lwi     0  3171   0   4  0 kqread

root  3311  1.0   0.1 130836 15688  ??  I    24May13   0:00.26 lw-container lwi     0  3171   0   4  0 kqread

root  3311  1.0   0.1 130836 15688  ??  I    24May13   0:00.48 lw-container lwi     0  3171   0   4  0 kqread

root  3311  0.0   0.1 130836 15688  ??  I    24May13   7:43.44 lw-container lwi     0  3171   0   4  0 kqread

root  3311  0.0   0.1 130836 15688  ??  I    24May13   0:00.03 lw-container lwi     0  3171   0  96  0 ucond

root  3311  0.0   0.1 130836 15688  ??  I    24May13   0:00.08 lw-container lwi     0  3171   0  96  0 ucond

root  3311  0.0   0.1 130836 15688  ??  I    24May13   0:02.60 lw-container lwi     0  3171   0  96  0 ucond

I am going to steal a quote from the great Tim Wright:

"It is entirely normal and expected to see multiple threads consuming 15, 20, 25% cpu at times. *If* you see one or more threads that are consistently and constantly consuming 100% cpu, *then* you probably have a problem. If you just see the sum of all the lwio threads consuming  >100% cpu, that is not likely to be a problem. Certain operations including auth can be somewhat cpu-intensive. Imagine hundreds of users connecting to the cluster in the morning."

After we have eased our concerns over CPU, the next place to look is the isi statistics commands so we can understand what kind of work the clients are doing.  When running isi statistics, there are a couple of things to be aware of:

-- You only need to run the command from one node to capture stats for all nodes (--nodes=all)

-- Use the --degraded switch so that if one of the nodes does not respond to the counter fast enough, it does not stall the continuous output

To start, it is always good to know how many clients are connecting to the nodes:

isi statistics query --nodes=all --stats node.clientstats.connected.smb,node.clientstats.active.cifs,node.clientstats.active.smb2 --interval 5 --repeat 12 --degraded

isi-ess-east-1# isi statistics query --nodes=all --stats node.clientstats.connected.smb,node.clientstats.active.cifs,node.clientstats.active.smb2 --interval 5 --repeat 12 --degraded

  NodeID node.clientstats.connected.smb node.clientstats.active.cifs node.clientstats.active.smb2

       1                            560                            1                           18

       3                            554                            0                           17

       4                            558                            0                           3

average                       551                        0                            25

The above output shows there are 560 SMB sessions to node 1, with 1 active SMB1 session and 18 active SMB2 sessions.  The 560 SMB sessions represent clients that are connected to the node but did not send any requests during the time the counter was run; thus, they are considered idle connections.  The 19 active sessions represent clients that sent an SMB1/2 request during the time this counter was collected that the node had not yet responded to.  This counter can be indicative of an issue but will not tell you directly where the problem is.  As the active counter increases (specifically when it becomes a higher percentage of all connections), it usually means there is something latent.
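If you need to know which clients are actually generating the active work rather than just the per-node totals, there is also a per-client view of the same counters. Treat this as a sketch; the switches may vary slightly between OneFS versions:

isi statistics client --nodes=all --protocols=smb1,smb2 --orderby=Ops --interval 5 --repeat 12 --degraded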

When I start looking to see if SMB is latent, I prefer the following stat over the connection count:

isi statistics protocol --nodes=all --protocols=smb1,smb2 --total --interval 5 --repeat 12 --degraded

isi-ess-east-1# isi statistics protocol --nodes=all --protocols=smb1,smb2 --total --interval 5 --repeat 12 --degraded

Ops  In Out TimeAvg TimeStdDev Node Proto Class Op

N/s B/s B/s      us         us                   

  10.1  1.2K   10K  2081.6     4037.0    1  smb1     *  *

706.4  140K  129K  180817.1   2589.5    1  smb2     *  *

   0.4  30.3  33.5  5085.5     5895.1    3  smb1     *  *

812.7   18K  8.2K  151469.2   6842.4    3  smb2     *  *

   0.4  30.4  33.6  1542.0     1074.8    4  smb1     *  *

  71.6   23K   13K  25407.8      714.0   4  smb2     *  *

The above stat tells you whether the clients are using SMB1 or SMB2 and what the overall latency looks like.  The only problem I have with this stat is that it includes change_notify in the calculation of latency, so it will throw off the time average.  A good rule of thumb is to look for a good number of ops (like 1K) and make sure the standard deviation is not abnormal.  The stat above does suggest that operations to nodes 1 and 3 are showing signs of latency: they have 700-800 ops with a TimeAvg of 150 ms - 180 ms.

This leads us to our next logical counter, which breaks the protocol down by operation type.  For that I use this command:

isi statistics protocol --nodes=all --protocols=smb1,smb2 --orderby=Class --interval 5 --repeat 12 --degraded

isi-ess-east-1# isi statistics protocol --nodes=all --protocols=smb1,smb2 --orderby=Class --interval 5 --repeat 12 --degraded

Ops  In Out TimeAvg TimeStdDev Node Proto Class Op

N/s B/s B/s      us         us                   

   0.2  63.8  39.0   1098.0        0.0    3  smb2         create          create

   2.5 604.9 340.0   2353.2     5749.7    4  smb2         create          create

   0.2  15.1  21.1   3009.0        0.0    1  smb2     file_state           close

   0.9  82.7 115.1    132.5       30.4    4  smb2     file_state           close

   1.0 112.1 100.7    253.8       79.8    3  smb2 namespace_read      query_info

   0.9  97.1  72.4     83.0       41.6    4  smb2 namespace_read      query_info

695.2   81K   46M 190364.5      223.9    3  smb2           read            read

   3.1 368.1   91K   1556.9     4474.0    4  smb2           read            read

   0.2  26.5  18.9    201.0        0.0    4  smb2  session_state    tree_connect

   0.2  63.8  39.0   1098.0        0.0    3  smb2         create          create

   0.2  16.6  23.1    400.0        0.0    1  smb2     file_state           close

   1.0 112.1 100.7    253.8       79.8    3  smb2 namespace_read      query_info

695.2   81K   46M 190364.5       223.9    3  smb2           read            read

If we compare this data to the previous output, we can see that for node 3, out of the 812 ops that were SMB2, roughly 695 ops were reads and the average latency was 190 ms.  We are on the path to finding our culprit.

Since we have established that SMB2 is latent and it appears to be impacting reads, the next place to look would be disk:

isi statistics drive --nodes=all --interval 5 --repeat 12 --degraded

isi-ess-east-1# isi statistics drive --nodes=all --interval 5 --repeat 12 --degraded

Drive    Type OpsIn BytesIn SizeIn OpsOut BytesOut SizeOut TimeAvg Slow TimeInQ Queued  Busy  Used Inodes

LNN:bay           N/s     B/s      B    N/s      B/s       B      ms  N/s      ms            %     %      

    1:1     SATA  72.2    2.3M    32K  129.8     2.2M     17K     0.6  0.0    58.8    8.7  93.5 100.0   3.4M

    1:2     SATA  56.8    1.9M    34K  157.4     2.9M     18K     0.4  0.0   208.8   31.0  65.1 100.0   3.0M

    1:3     SATA  86.0    2.4M    28K   88.6     1.6M     18K     0.4  0.0   133.1   22.0  84.3 100.0   3.0M

    1:4     SATA  54.0    2.1M    38K  118.6     2.3M     20K     0.4  0.0    52.3   11.1  72.7 100.0   2.5M

    1:5     SATA  74.0    2.5M    34K  106.6     2.1M     20K     0.4  0.0    52.3    9.3  57.3 100.0   3.3M

    1:6     SATA  66.2    2.5M    38K  100.6     2.0M     20K     0.4  0.0    53.3    8.4  86.1 100.0   3.2M

    1:7     SATA  47.4    1.6M    34K   94.2     1.8M     20K     0.4  0.0    46.4    7.8  49.7 100.0   3.3M

    1:8     SATA  65.4    2.3M    35K  145.8     2.5M     17K     0.4  0.0    37.8    7.5  75.1 100.0   3.4M

    1:9     SATA  51.2    2.1M    40K  119.2     2.1M     18K     0.4  0.0    35.8    6.7  56.3 100.0   2.5M

    1:10    SATA  62.0    2.0M    32K  101.2     2.2M     22K     0.4  0.0    33.8    6.0  56.5 100.0   3.4M

    1:11    SATA 126.6    3.2M    25K   76.2     1.4M     18K     0.3  0.0   201.1   33.5 100.0 100.0   3.0M

    1:12    SATA  66.2    2.0M    31K  117.8     1.9M     16K     0.3  0.0   106.9   21.3  85.1 100.0   3.0M

    3:1     SATA  40.0    1.4M    36K  107.4     1.8M     17K     0.3  0.0    89.2   17.1  37.5 100.0   2.9M

    3:2     SATA  54.2    1.8M    33K  113.4     1.9M     17K     0.3  0.0    68.4   14.7  60.7 100.0   3.0M

    3:3     SATA  56.0    2.1M    38K  112.2     2.0M     17K     0.3  0.0    65.6   14.4  40.7 100.0   3.3M

    3:4     SATA  73.8    2.3M    32K  113.6     2.0M     17K     0.3  0.0   114.3   13.9  54.5 100.0   2.3M

    3:5     SATA  66.8    2.1M    32K  106.8     1.9M     18K     0.3  0.0    74.0   11.2  50.5 100.0   3.5M

    3:6     SATA  78.4    2.7M    34K  138.2     2.2M     16K     0.3  0.0    75.8   11.1  82.1 100.0   3.4M

    3:7     SATA  58.4    2.2M    38K  127.8     2.1M     16K     0.3  0.0    77.1   11.0  54.7 100.0   3.4M

    3:8     SATA  54.6    2.0M    37K   90.4     1.4M     16K     0.3  0.0    75.1   10.7  39.9 100.0   3.0M

    3:9     SATA  56.2    2.0M    36K  139.4     2.5M     18K     0.3  0.0    59.9   10.4  61.5 100.0   3.3M

    3:10    SATA  59.0    1.9M    33K  110.2     1.8M     16K     0.3  0.0    55.2   10.2  49.3 100.0   3.3M

    3:11    SATA  55.0    2.0M    37K  122.2     1.9M     16K     0.3  0.0    59.4    9.3  46.1 100.0   2.5M

    3:12    SATA  51.4    1.8M    35K  102.0     2.1M     20K     0.3  0.0    50.3    9.1  47.7 100.0   3.3M

    4:1     SATA  52.2    1.8M    34K  117.2     2.1M     18K     0.3  0.0    53.5    8.8  51.7 100.0   2.8M

    4:2     SATA  58.8    2.1M    35K  107.2     2.0M     18K     0.3  0.0    47.8    8.7  48.9 100.0   3.3M

    4:3     SATA  64.8    2.3M    35K  120.6     2.2M     18K     0.3  0.0    44.2    8.6  57.1 100.0   3.4M

    4:4     SATA  50.8    1.8M    35K   77.8     1.7M     22K     0.3  0.0    53.8    8.6  38.1 100.0   2.7M

    4:5     SATA  58.4    2.2M    38K  135.6     2.4M     18K     0.3  0.0    51.8    8.4  48.9 100.0   3.4M

    4:6     SATA  65.0    2.4M    37K  108.8     2.1M     19K     0.3  0.0    55.9    8.3  55.3 100.0   3.3M

    4:7     SATA  57.0    2.1M    37K  106.8     2.2M     21K     0.3  0.0    46.9    8.2  49.5 100.0   3.3M

    4:8     SATA  58.8    2.0M    34K  149.0     2.7M     18K     0.3  0.0    46.2    8.2  58.7 100.0   3.3M

    4:9     SATA  53.2    1.8M    33K  124.0     2.3M     19K     0.3  0.0    45.0    8.2  56.3 100.0   3.4M

    4:10    SATA  76.0    2.3M    30K  103.8     1.9M     18K     0.3  0.0    44.8    8.0  60.1 100.0   3.3M

    4:11    SATA  60.2    2.1M    35K  116.0     1.9M     17K     0.3  0.0    42.6    7.9  65.5 100.0   3.1M

    4:12    SATA  59.0    2.1M    35K  100.2     1.8M     18K     0.3  0.0    48.2    7.8  41.3 100.0   2.4M

The above output shows the source of our problem: these poor SATA disks are doing an average of 170 ops (in and out) and are struggling to keep up.  The latent SMB2 reads are a victim of the disks, which in this case are spindle-bound due to other contention.
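When the drive view gets this busy, it can also help to let the command sort the most saturated spindles to the top for you. Assuming the drive subcommand accepts --orderby the same way the protocol subcommand does (worth verifying on your OneFS version), something like this is a reasonable starting point:

isi statistics drive --nodes=all --orderby=Busy --interval 5 --repeat 12 --degraded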

Outside of tracking down an SMB performance issue caused by disk, a couple of other useful counters to look at are:

Is Directory enumeration bad:

isi statistics protocol  --nodes=all --protocols=smb1,smb2 --orderby=Out --classes=namespace_read --interval 5 --repeat 12 --degraded

isi-ess-east-1#   isi statistics protocol  --nodes=all --protocols=smb1,smb2 --orderby=Out --classes=namespace_read --interval 5 --repeat 12 --degraded

   Ops    In   Out TimeAvg TimeStdDev Node Proto          Class               Op

   N/s   B/s   B/s      us         us                                          

  13.0  1.5K   44K   510.1     1331.5    1  smb2 namespace_read  query_directory

227.9   25K   35K   226.6      791.0    3  smb2 namespace_read  query_directory

  60.1  6.9K   31K   400.1     3127.8    4  smb2 namespace_read  query_directory

   5.2 720.3  5.6K   822.5      293.8    1  smb1 namespace_read trans2:findfirst

   2.2 305.8  4.7K  6452.5    19478.9    3  smb1 namespace_read trans2:findfirst

  20.5  2.3K  4.3K  1158.9     6969.6    1  smb2 namespace_read  query_directory

   0.2  29.4  3.0K  1293.0        0.0    3  smb1 namespace_read  trans2:findnext

When looking at this stat, focus on the findfirst/findnext operations. You are looking for a small number of ops that cause a large amount of Out B/s.  The TimeAvg might look normal, but what we are really looking for is a client doing a * enumeration of a very large directory.  The request itself will be 1 op but will amount to a tremendous number of bytes that need to be returned.
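If you think you know which directory is being enumerated, a quick sanity check from the cluster shell is simply to count its entries. The path below is hypothetical, and the -f flag skips sorting so the listing stays cheap even on huge directories:

ls -f /ifs/data/bigshare | wc -l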

Is one of the Authentication Providers causing a delay:

isi statistics protocol --nodes=all --protocols=lsass_in,lsass_out --total --interval 5 --repeat 12 --degraded

isi-ess-east-1# isi statistics protocol --nodes=all --protocols=lsass_in,lsass_out --total --interval 5 --repeat 12 --degraded

Ops  In Out TimeAvg TimeStdDev Node    Proto         Class                         Op

N/s B/s B/s      us         us                                                      

0.4 0.0 0.0  8977.0     5208.5    4 lsass_in session_state lsa:id:ioctl:pac_to_ntoken

0.4 0.0 0.0   383.0        2.8    4 lsass_in session_state      ntlm:accept_sec_ctxt1

0.4 0.0 0.0 10256.5       27.6    4 lsass_in session_state      ntlm:accept_sec_ctxt2

0.7 0.0 0.0   576.2      390.3    4 lsass_in session_state         ntlm:acquire_creds

0.4 0.0 0.0   136.0      144.2    4 lsass_in session_state       ntlm:delete_sec_ctxt

0.7 0.0 0.0    37.0        6.8    4 lsass_in session_state            ntlm:free_creds

1.8 0.0 0.0    48.2       24.6    4 lsass_in session_state            ntlm:query_ctxt

lsa:id:ioctl:pac_to_ntoken - Represents how long it takes a DC to complete a Sid2Name lookup

ntlm:accept_sec_ctxt2 - Represents how long it took a DC to complete NTLM authentication

If either of the above shows ops where the TimeAvg is high, it's time to start looking at the DC as the cause of the delay.
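Before pointing the finger at the DC, it is also worth a quick check of which providers the cluster is using and whether they are reported as online. Treat the following as a sketch; the exact auth commands differ a bit between OneFS releases:

isi auth status

isi auth ads list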

Good luck with performance troubleshooting, and remember: it's very rare that the protocol itself (i.e., SMB) is the one causing the latency.

Reply
3 Cadmium

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

This is great information!  Is there a way to monitor the statistics so we can trend and alert on them if they get to an unacceptable limit?

isilon# isi statistics protocol --nodes=5 --protocols=smb1,smb2 --orderby=Class --interval 5 --repeat 12 --degraded

.....

2.0 196.6 134.5 32535068.0 74775888.0    5  smb2      file_state    change_notify

That looks awfully high to me, so in theory I'd like to know about that via some alerting system... if it's a problem, at least...

Reply
2 Bronze

Re: Re: Ask the Expert: SMB Protocol on an Isilon Cluster

This is great information!  Is there a way to monitor the statistics so we can trend and alert on them if they get to an unacceptable limit?

You can use InsightIQ for this:

https://support.emc.com/docu47071_InsightIQ-Installation-and-Setup-Guide-Version-2.5.1.pdf?language=...

2.0 196.6 134.5 32535068.0 74775888.0    5  smb2      file_state    change_notify

Ah yes, change notify.  When a client opens an Explorer window against a share, the client sets a change notification request so that it can refresh when something has changed.  It may take 1 second for some other client to trigger a change, or it may take 2 days if nothing changes in the directory while Explorer is open.  Thus, from a latency perspective this counter is useless and will always look abnormal, and because it is included in our overall latency counter, it can skew results.
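One way to keep change_notify from skewing the picture is to restrict the protocol stat to the classes you actually care about instead of looking at the grand total. A rough example follows; the class names are taken from the outputs earlier in the thread, so double-check them against --help on your version, and note that leaving out file_state also drops close:

isi statistics protocol --nodes=all --protocols=smb1,smb2 --classes=read,write,create,namespace_read,namespace_write --orderby=TimeAvg --interval 5 --repeat 12 --degraded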

Reply
3 Cadmium

Re: Re: Ask the Expert: SMB Protocol on an Isilon Cluster

Peter Abromitis wrote:

This is great information!  Is there a way to monitor the statistics so we can trend and alert on them if they get to an unacceptable limit?

You can use InsightIQ for this:

https://support.emc.com/docu47071_InsightIQ-Installation-and-Setup-Guide-Version-2.5.1.pdf?language=...

I didn't think InsightIQ could do alerting? I know it's pretty good at reporting on the performance of the cluster, but I'd like to know about a problem before the client(s) call me saying "Isilon is slow!"

Reply
2 Bronze

Re: Re: Ask the Expert: SMB Protocol on an Isilon Cluster

Right, InsightIQ can be used for monitoring but not alerting; sorry about the confusion.
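If you need alerting today, one workaround is a small cron script on an admin host that samples the same counters over SSH and emails you when a threshold is crossed. Everything below (host name, threshold, recipient, mail command) is a placeholder sketch, not a supported tool:

#!/bin/sh
# Hypothetical example: alert if any node's SMB2 TimeAvg exceeds a threshold (in microseconds).
THRESHOLD=100000
STATS=$(ssh root@isilon-cluster "isi statistics protocol --nodes=all --protocols=smb2 --total --degraded")
# Columns: Ops In Out TimeAvg TimeStdDev Node Proto Class Op (skip the two header lines).
ALERTS=$(echo "$STATS" | awk -v t="$THRESHOLD" 'NR > 2 && $4+0 > t {print "Node " $6 " SMB2 TimeAvg " $4 " us"}')
[ -n "$ALERTS" ] && echo "$ALERTS" | mail -s "Isilon SMB2 latency alert" admin@example.com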

Reply
1 Copper

Re: Ask the Expert: SMB Protocol on an Isilon Cluster

We have LDAP as the authentication source on the back end and a PDC running on a Linux box (Samba). Can we configure that PDC as an authentication provider on Isilon so that SMB shares can be accessed from Mavericks?

Reply