bhalilov1
2 Iron

Caching in clusters without SmartPools license

Dear community,

We have a cluster made up entirely of identical 32000x-ssd nodes, and we do not have a SmartPools license.

Can we use the new L3 cache in 7.1.1 without buying SmartPools, and would that make sense?

We do not have a need for FilePools at this time.


Accepted Solutions
Jim_Cahill
1 Copper

Re: Caching in clusters without SmartPools license

You can use L3 cache without a SmartPools license. It probably makes sense, since it extends your RAM-based L2 cache. When L2 fills up, rather than discarding data, OneFS intelligently moves appropriate overflow data into L3, thus improving access speeds for frequently accessed files. L3 is better for workflows where random access is the norm, although there are benefits to be gained even in streaming workflows.

12 Replies
bhalilov1
2 Iron

Re: Caching in clusters without SmartPools license

Thank you Jim

Is L3 cache turned on or off globally for the cluster, or can we set it per directory?

Currently we have the default "metadata read acceleration" set.

If we turn L3 cache on and do not buy the SmartPools license, I understand we will lose that acceleration for the whole cluster. Is there a stat that we should look at to see whether metadata operations are suffering as a result?

Peter_Sero
4 Beryllium

Re: Caching in clusters without SmartPools license

The L3 cache is per node pool, not per file pool (aka SmartPools file policy).

Several stats are available, but it's a bit tricky to get the full picture, or to predict the result.

You can check the metadata read latencies for your NAS protocols, by client OR by specific operation:

isi statistics client --class namespace_read -t --orderby TimeAvg

isi statistics proto --class namespace_read -t --orderby TimeAvg

(Ops from SyncIQ, NDMP, AutoBalance and other jobs are not included here.)

Observed latencies can already be very low, due to effective L1 and L2 caching, and you might NOT even clearly see which ops are served by the current SSD acceleration.

But after migration to L3 cache, metadata/namespace ops that miss the L1-L3 caches and thus come from HDD should clearly stand out.

In either case, you can look at the actual SSD IOPS and their characteristics:

isi stati drive -nall --type ssd --long

Some very simple insight into the caching itself is provided:

isi stati query -nall -snode.ifs.cache.oldest_page_age

(age in seconds)

With a larger cache (added L3) the caching intervals will increase. It depends on your workload whether it will benefit from an increase that goes from, say, 2 minutes (L1/L2 RAM) up to an estimated 20 minutes (SSD L3 assumed e.g. 10 times larger than RAM). When streaming many huge files, none recurring for hours, 20 minutes might be as useless as 2 minutes....
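To put numbers on the reasoning above, here is a minimal sketch of the scaling estimate. It assumes that the time a page survives in cache grows linearly with cache size, which real workloads will only approximate. The function names and the sample values (120 seconds, a 10x size ratio, a 2-hour revisit interval) are illustrative assumptions, not output from a cluster.

```shell
#!/bin/sh
# Rough scaling estimate only: assumes retention time grows linearly with
# cache size. All names and numbers here are hypothetical illustrations.

# estimate_l3_retention RAM_RETENTION_SECONDS SIZE_RATIO
# Prints the estimated retention (seconds) for a cache SIZE_RATIO times larger.
estimate_l3_retention() {
    ram_retention_s=$1
    size_ratio=$2
    echo $(( ram_retention_s * size_ratio ))
}

# would_still_be_cached RETENTION_SECONDS REVISIT_INTERVAL_SECONDS
# Succeeds (exit 0) if data is revisited before it would age out of cache.
would_still_be_cached() {
    [ "$2" -le "$1" ]
}

# 2 minutes of RAM retention, SSD L3 assumed 10 times larger than RAM.
retention=$(estimate_l3_retention 120 10)
echo "estimated retention: ${retention}s"

# Streaming workload whose files recur only every 2 hours (7200 s).
if would_still_be_cached "$retention" 7200; then
    echo "files revisited every 2 hours: likely still cached"
else
    echo "files revisited every 2 hours: likely evicted before reuse"
fi
```

Feeding in the oldest_page_age value observed on your own nodes instead of the sample 120 seconds gives a first guess at whether your revisit pattern fits inside the larger cache.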

The total perspective vortex of caching stats is

isi_cache_stats [-v] [interval]

(or same stats, but through customized queries with:

isi statistics query -snode.ifs.cache.l1.meta.read.hit,... etc)

Run isi_cache_stats in interval mode (a few seconds), because the first output shows the totals since boot. The -v verbose option helps to understand the structure of the output before going to the default single-line mode. Note that the output is for the local node, and that cache ops here include more than the NAS ops (i.e. covering SyncIQ, NDMP, AutoBalance etc.)

The metadata read hits are often very high (80-100% rates) already, but this comes from the fact that the highest-level directories such as /ifs and /ifs/data are implicitly accessed so often.
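The difference between since-boot totals and interval figures can be illustrated with a small sketch. It does not parse isi_cache_stats output (whose exact layout is not shown in this thread); it only takes hit and miss counters as plain numbers, and all counter values below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper functions; the counter values are invented examples.

# hit_rate_pct HITS MISSES
# Prints the integer hit percentage, or "n/a" if there was no activity.
hit_rate_pct() {
    total=$(( $1 + $2 ))
    if [ "$total" -eq 0 ]; then
        echo "n/a"
    else
        echo $(( 100 * $1 / total ))
    fi
}

# interval_hit_rate_pct HITS_T0 MISSES_T0 HITS_T1 MISSES_T1
# Hit percentage for the interval between two cumulative counter samples.
interval_hit_rate_pct() {
    hit_rate_pct $(( $3 - $1 )) $(( $4 - $2 ))
}

# Cumulative counters since boot look excellent...
echo "since boot:    $(hit_rate_pct 9500000 500000)%"

# ...while the delta between two samples reflects the current workload.
echo "last interval: $(interval_hit_rate_pct 9500000 500000 9500600 500400)%"
```

With these sample numbers the since-boot rate is 95% while the interval rate is only 60%, which is why the interval lines are the ones worth watching.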

Finally, a general consideration in addition to what Jim said:

If you have a classic many-small-files scenario and are traversing huge hierarchies most of the time, without revisiting the same spots frequently, then you would need to rely on fast metadata being present everywhere and always. In other words, the "old" style SSD acceleration is really the best fit for such cases.

Peter_Sero
4 Beryllium

Re: Caching in clusters without SmartPools license

Just got pointed to this brand new White Paper

File System Caching Infrastructure

in-depth explanation of L1/L2/L3 caching;

usage advice for SSD L3 cache compared to traditional SSD metadata acceleration, including coexistence of both in one cluster;

and a very neat prediction procedure on p19!

bhalilov1
2 Iron

Re: Caching in clusters without SmartPools license

Peter,

Great guidelines, thanks !

And that WP is good, though a little light on metadata caching details.

bhalilov1
2 Iron

Re: Caching in clusters without SmartPools license

Follow-up question:

How can I change the SSD strategy for a directory with the CLI, not the web GUI?



Peter_Sero
4 Beryllium

Re: Caching in clusters without SmartPools license

isilon-1# isi set --help

usage: isi set [-fFLnvrR] [-p <policy>] [-w <width>] [-c on|off] [-g <restripe_goal>] [-e <encoding>] [-d <@r drives>] [-a default|streaming|random|custom{1..5}] [-l concurrency|streaming|random] [--diskpool <id|name>] [-A on|off] [-P on|off] [--strategy|-s avoid|metadata|metadata-write|data] file_or_lin ...

for example:

isi set -v -s metadata myfile

isi set -R -v -s metadata mydir  (-R for recursive subtree; will also apply to new files created here AFTERwards)

check with:

isi get -D myfile

isi get -d -D mydir   (single dir)

isi get -R -D mydir   (full subtree)
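Since isi set changes take effect immediately, it can help to print the command first and review it before running anything. The following is only a sketch of such a dry-run helper: build_isi_set_cmd is a hypothetical function name, and the flags (-R, -v, -s) and strategy values are taken from the usage text above. It never calls isi itself and does not quote paths containing spaces.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints an isi set command, runs nothing.

# build_isi_set_cmd STRATEGY PATH [recursive]
build_isi_set_cmd() {
    strategy=$1
    target=$2
    recursive=${3:-}

    # Accept only the strategy values listed in the isi set usage text.
    case "$strategy" in
        avoid|metadata|metadata-write|data) ;;
        *) echo "unknown SSD strategy: $strategy" >&2; return 1 ;;
    esac

    if [ "$recursive" = "recursive" ]; then
        echo "isi set -R -v -s $strategy $target"
    else
        echo "isi set -v -s $strategy $target"
    fi
}

# Print the command for a whole subtree so it can be reviewed first.
build_isi_set_cmd metadata mydir recursive
```

Once the printed command looks right, it can be pasted into the cluster shell and then verified with the isi get commands shown above.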

bhalilov1
2 Iron

Re: Caching in clusters without SmartPools license

Thanks again Peter.

peglarr
2 Iron

Re: Caching in clusters without SmartPools license

Folks,

Be very, very (very) careful when using isi set.  You are, in effect, taking the automatic management away and doing it by hand.  Down the road, this can and does lead to (how shall I put this) difficult/complicated situations, especially if SmartPools is invoked later on but the manually managed directories are excluded because an isi set was previously done.

Again, be careful.  For all, work with your teams to obtain an eval SP license if you need it.
