4 Operator • 1.2K Posts

October 8th, 2013 07:00

Hello Gary,

It seems the writes are smaller than 4K already, and I wouldn't restrict the wsize to a value lower than what the client likes to send in one op. (Divide the In rate by the Ops rate in your example, or use the --long option to see InAvg/InMin/InMax, i.e. the distribution of actual write sizes in B or KB.)
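As a quick sanity check, the average write size falls straight out of those two rates. A minimal sketch with made-up numbers (the In/Ops figures below are placeholders, not from your output):

```shell
# Hypothetical rates read off "isi statistics client" output:
in_rate=2048000   # In: bytes/s arriving from the client
ops_rate=1000     # Ops: write operations/s

# Average bytes per write op = In / Ops
awk -v inb="$in_rate" -v ops="$ops_rate" \
    'BEGIN { printf "avg write size: %d B (%.1f KB)\n", inb/ops, inb/ops/1024 }'
# -> avg write size: 2048 B (2.0 KB)
```

If that number is well under your wsize, lowering wsize can't help — the client already sends small ops.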

Unless you put the real DB data files on SSD, the flash will (only) help with navigating to the data blocks in the file. As you do many updates to existing blocks, the accessed file layout information might be held largely in the RAM cache anyway, so you probably wouldn't see much improvement. But it could help in principle, and would never hurt in any case.

Snapshots can hurt with random updates, as copy-on-write might be chosen here. If you have snapshots, delete them, or run the test on fresh copies of the DB files which are not covered by snapshots.
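For instance, something along these lines (subcommand names vary by OneFS release and the paths are hypothetical — adjust to your cluster):

```shell
# Check whether any snapshots cover the DB path (OneFS 7.x syntax):
isi snapshot snapshots list

# Or sidestep them: copy the DB files into a fresh directory that no
# snapshot covers, and run the write test against the copies:
mkdir /ifs/data/mysql_nosnap
cp /ifs/data/mysql/*.ibd /ifs/data/mysql_nosnap/
```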

I forgot to mention that the coalescer in 7.0 was improved for latency in 2012. But whether it can do wonders where there is nothing to coalesce in the end, due to heavy scattering? It will most likely behave differently, and probably not worse than the 6.5 coalescer, for your case. In 7.1 more changes have been made, as Jim just explained in the context of many small files within the ongoing Ask The Expert discussion on 7.1.

Cheers

-- Peter


October 8th, 2013 01:00

The locks shown by "isi statistics heat" are OneFS-internal locks taken when a node updates a file, rather than application/protocol lock operations. You'll see plenty of them with random IO and small write blocks.

It's difficult to get the full picture from the statistics excerpts, but I would guess the NFS client is simply filling the node's NVRAM, while the OneFS write coalescer hopes it can do good work in the end (i.e. coalesce many small writes into fewer, larger physical writes). However, when the NVRAM is full and it is time to write the data to the disks, the write chunks are too fragmented overall, and a large number of small disk transfers need to be made at once. At that point the NVRAM cannot sustain a high rate of new writes. So the intermittent phases of slow writes would correlate with filling up and flushing the NVRAM.

In mixed loads, with most of the clients doing streaming writes, this effect doesn't become as prominent. The random-IO clients would just see so-so performance all the time.

A simple test would be to add some streaming write load to the node from another client...

You can also try observing, in 2-second intervals and all simultaneously:

- isi statistics client (for exactly the MySQL connection)

- isi statistics drive --long (check out sorting by OpsIn or TimeInQ or Queued)

- sysctl efs.bam.coalescer_stats (many things going on here; you will see patterns in time for sure)
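One way to capture all three at the same cadence is a simple sampling loop — a sketch only; run it as root on the node, and adjust flags to your OneFS version (the 2-second rhythm is the point, and the log path is hypothetical):

```shell
# Sample all three views every 2 s with timestamps, so slow-write
# phases can be lined up against drive-queue and coalescer activity:
while true; do
    date '+%H:%M:%S'
    isi statistics client
    isi statistics drive --long
    sysctl efs.bam.coalescer_stats
    sleep 2
done | tee /ifs/data/stats.log
```

Afterwards, grep the log for the timestamps of the slow phases and compare the coalescer counters just before and after.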

A couple of further thoughts:

- Try switching off the coalescer: either disable SmartCache in SmartPools, or check whether DIRECTIO can be used by MySQL

- The access pattern for the MySQL files should be set to RANDOM

- The different database/table engines available for MySQL can show different write patterns and, hence, different coalescing behavior in OneFS

- The same will be true for the acclaimed drop-in MySQL replacement MariaDB, with its further options for tables and (application-side) caching.
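To make the first two bullets concrete — a sketch, with hypothetical file paths; `isi set -a` and InnoDB's `O_DIRECT` flush method do exist, but verify the exact syntax on your OneFS and MySQL versions:

```shell
# Set the per-file access-pattern hint to random for a DB file:
isi set -a random /ifs/data/mysql/ibdata1

# For DIRECTIO on the MySQL side, have InnoDB bypass buffered IO by
# adding this to my.cnf and restarting mysqld:
#   [mysqld]
#   innodb_flush_method = O_DIRECT
```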

Let us know what you find, good luck!

-- Peter

October 8th, 2013 03:00

Thanks, Peter, for the fast and informative reply.


Would a small (i.e. 4K) wsize in the NFSv3 client mount options make the coalescer work harder? I'm assuming adding SSDs to the nodes will allow the NVRAM cache to be flushed faster and thus buffer random NFS writes better?

This is an NFS environment I have inherited, and I'm in the process of reverse-engineering how it was (mis)constructed. I've determined there are some Citrix virtual disks mounted off the Isilon cluster that are generating as many, if not more, LIN locks as the MySQL DB is.

Regards,

Gary
