
August 14th, 2014 06:00

Free space discrepancy between source and target cluster

We've been getting nowhere with support on an 8-10TB discrepancy in free space between two identical clusters; we have approx. 20TB of data.  We replicate everything except for one folder, which is 838GB including overhead.  On both sides we keep 30 days of snapshots, but we have had to delete pretty much all snapshots on the source to regain some free space.  Support tells us that you will see a difference in capacity, which I understand, but not this much... Has anyone seen similar issues?
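We're reading the free-space numbers off the usual checks from a node's shell, e.g. (exact output varies by OneFS version):

isi status     # per-node and cluster-wide used/free capacity
df -h /ifs     # filesystem-level view of /ifs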

20 Posts

August 18th, 2014 04:00

Found the issue: we have protocol auditing enabled, and the cluster does not allow for log rotation.  So in 45 days of use, 10TB of logs were created...

I'm trying to see if support has any mechanism to automatically rotate the logs, since the information is already sent to a CEE provider.

Thanks again for your help!!!

1.2K Posts

August 14th, 2014 07:00

Is that 20TB or 200TB for the data? With only 20TB of data, an 8-10TB difference is hefty...

Has the SnapshotDelete job run successfully after expiring the snaps?

Have you checked the total amount of data in the snapshots:

isi(_classic) snaps usage

(last line has the total)

More things to consider:

different protection levels on clusters?

different virtual hot spare settings on clusters?

are SmartQuotas available to break down data usage (per quota domain)?

were the most recent jobs of each type successful? (SmartPools; either MultiScan or AutoBalance and Collect)

is there good balance across disks/nodes?

isi statistics drive --nodes all --long

(check the distribution of Used%)
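If it helps, a quick sweep of these checks from one node on each cluster could look like this (on OneFS 7.x the older 6.x-style commands live on as isi_classic):

isi status                                  # per-node used/free and overall balance
isi snapshot usage | tail -1                # total snapshot usage (last line)
isi statistics drive --nodes all --long     # Used% and Inode counts per drive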

20 Posts

August 14th, 2014 08:00

Twenty TB (20TB) of data, so the difference is very hefty. Unfortunately it is very difficult to get Isilon Engineering to listen and look at the issue...

We had to delete snaps from the source; due to this issue we now actually have less snapshot data there (304GB) than on the target (2TB), and the snapshots deleted without issue.  There is good balance across the nodes, and both clusters use the same protection levels and virtual hot spare settings.

We used isi_vol_copy_vnx to migrate from a Celerra, so I'm wondering if there is hidden metadata that is not cleaning itself up properly.

We ran SmartQuotas against both clusters, and they come up the same in terms of file counts and sizes.

1.2K Posts

August 14th, 2014 09:00

So which side is larger by those +50%, the source or the target?

Any folders outside the quota domains, like the various other system folders under /ifs (.isilon, .ifsvar, home, ...)? Checked with a careful "du -sh"?
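For example (the .ifsvar tree can be slow to walk on a busy cluster):

du -sh /ifs/.isilon /ifs/.ifsvar /ifs/home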

The 50% difference: does it really show up node by node in isi status, and disk by disk in the command I sent earlier? In that command, how do the Inode counts compare between the two clusters?

There might be orphaned blocks on the disks. Have nodes been down while substantial amounts of data were deleted on the online nodes of the same cluster, without a MultiScan or Collect job since then? (Beware ETA 190094)
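If in doubt, a Collect job can be started by hand to reclaim leaked blocks; just a sketch, since the exact syntax differs between OneFS releases:

isi job start Collect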

No idea about isi_vol_copy_vnx, but to track down any metadata effects: what is the average file size? With tiny files (a few KB), surplus metadata might have noticeable effects, but normally on both clusters.

The classic dark matter in UNIX filesystems is open files that have been deleted by filename. Any recent large deletions of runaway log files or the like, where the producing programs are still running?
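As an illustration of that check: on a generic Linux box you would look for open-but-unlinked files; whether equivalent tooling is present on a OneFS node itself is another question:

lsof +L1     # open files with link count 0, i.e. deleted but still held open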

For tracking the history: have you configured and kept monthly e-mail reports?

Or:

isi statistics hist -F -s"ifs.bytes.total,ifs.bytes.used"

2 Intern • 20.4K Posts

August 18th, 2014 05:00

Let us know what you hear from support. I know that by default audit logs are 1G, new ones are created once they reach that size, and Isilon will not remove/move them from the file system (understandable, given different regulatory/legal rules). But if the customer has already captured that information in an external application (Varonis, StealthBits), then there needs to be a way to archive these logs somewhere else or remove them completely.
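If you want to see how much space they are holding, the protocol audit logs typically sit under /ifs/.ifsvar/audit (the path may vary by release):

du -sh /ifs/.ifsvar/audit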

20 Posts

August 19th, 2014 04:00

Did not get too far; we're being told to get in touch with PS to set up a way to delete the files.  I think I'm just going to add a cron job to delete the files weekly.
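Roughly what I have in mind, assuming the logs sit under /ifs/.ifsvar/audit and that anything older than 30 days has long since been consumed by CEE (run the find without -delete first to preview what it would remove):

# weekly crontab entry (Sundays 02:00): prune audit log files older than 30 days
0 2 * * 0 root find /ifs/.ifsvar/audit -type f -mtime +30 -delete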

1.2K Posts

August 21st, 2014 02:00

How did the CEE provider digest those 10TB of logs...? 

20 Posts

August 21st, 2014 02:00

It's a push, so as the logs are being created/written to, the cluster pushes the info to CEE.  So over time the space used by the logs grew to 10TB, but CEE had already captured and pushed the information.

1.2K Posts

August 21st, 2014 04:00

Sure, but I mean, how is the CEE equipped to stow 10TB in 45 days, presumably ongoing?

20 Posts

August 22nd, 2014 06:00

CEE does not store any data; as far as I know, it's just a middleman that translates and passes the data from Isilon to whatever 3rd-party software is consuming it.

1.2K Posts

August 22nd, 2014 07:00

Supposedly so; I was just curious where some instance finally archives all the stuff (and might choke on it), or just summarizes it for some reports. At least you haven't gotten any backfire from that side, apparently...
