Re: Re: Ask the Expert: Isilon Performance Analysis

> MEMORY                            – isi_cache_stats -v

Another great tool, and this one is actually new to me ;-)

I had been fiddling with isi statistics query before,

to get some insight into OneFS caching,

but found it hard to get a clear picture.

It seems that isi_cache_stats -v prints totals since startup,

and it is even more useful when monitoring live deltas at

regular intervals like 5s: isi_cache_stats -v 5

The one-line form appears more compact, but I don't get

the meaning of the actual numbers (after the first line with the totals):

Totals            l1_data: r 3.1T  6% p  34T  73%, l1_meta: r 113T  98% p  70G  51%, l2_data: r  23T  13% p 117T  74%, l2_meta: r  13T  68% p 4.6T  99%

13/08/13 18:19:39  l1_data: r 4.7M  8% p  41M  78%, l1_meta: r 365M  99% p  48K  40%, l2_data: r  86M  54% p  70M  96%, l2_meta: r 3.1M  29% p  24K 100%

13/08/13 18:19:44  l1_data: r  5M  8% p  41M  79%, l1_meta: r 328M  99% p  96K 100%, l2_data: r  80M  56% p  60M  94%, l2_meta: r 2.2M  24% p  16K 100%

So l1_meta: r 365M in the second row would mean level 1 reads, but I don't think we have 365M of those...

(isi statistics pstat says:  12521.38/s NFS3-Ops, and 15930.80/s disk IOPS at this time.)

Can you explain how to read these numbers? (All numbers in isi_cache_stats -v 5 appear reasonable.)
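
To compare these rows against other counters, a minimal Python sketch that splits one line of the compact output into its four sections might look like the following. The function name parse_cache_line is hypothetical, the K/M/G/T suffixes are assumed to be 1024-based, and the meaning of the r and p values is deliberately left uninterpreted, since that is the open question here.

```python
import re

# Assumption: K/M/G/T are binary (1024-based) multipliers.
_SUFFIX = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# One section looks like: "l1_meta: r 365M  99% p  48K  40%"
_SECTION = re.compile(
    r"(l[12]_(?:data|meta)):\s*"
    r"r\s*([\d.]+)([KMGT]?)\s+(\d+)%\s*"
    r"p\s*([\d.]+)([KMGT]?)\s+(\d+)%"
)


def parse_cache_line(line):
    """Return {section: {"r": (value, pct), "p": (value, pct)}} for one line.

    Works for both the "Totals" row and the timestamped interval rows.
    The values are scaled by their suffix but not otherwise interpreted.
    """
    result = {}
    for name, r_val, r_suf, r_pct, p_val, p_suf, p_pct in _SECTION.findall(line):
        result[name] = {
            "r": (float(r_val) * _SUFFIX[r_suf], int(r_pct)),
            "p": (float(p_val) * _SUFFIX[p_suf], int(p_pct)),
        }
    return result
```

Dividing a parsed value by the sampling interval (5 s in the rows above) gives a per-second figure that can be set next to the NFS3 ops/s and disk IOPS reported by isi statistics pstat.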

But the real questions are of course about the OneFS caching in general.

How can one see the cache usage for certain traffic (by user, client, operation/event, path,...)?

How is cache memory allocated or prioritized to Level 1/2 and data/metadata (four combinations)?

Could one check the cache ages separately for these four cache sections?

(similar to isi statistics query -snode.ifs.cache.oldest_page_age)

I ask because we often find that large data transfers mainly

fill the cache without much benefit (only a few % data hits later).

The node.ifs.cache.oldest_page_age goes down to 1 minute in such situations;

and it seems that this number also applies to the metadata cache.

I would prefer to assign more memory to the metadata cache

(in the absence of SSD for metadata) to allow the metadata content

to last for 30 minutes or more, while the data cache is short-lived anyway.

Does this make sense to you?

-- Peter
