[gpfsug-discuss] More on AFM cache chaining

Jake Carroll jake.carroll at uq.edu.au
Mon Aug 15 22:08:58 BST 2016


Hi there.

In the spirit of a conversation between Radhika Parameswaran and Luke Raimbach that a friend showed me a couple of weeks ago, we're doing (or at least attempting) something similar to Luke with regard to cache chaining.

We’ve got a large research storage platform in Brisbane, Queensland, Australia and we’re trying to leverage a few different modes of operation.

Currently:

Cache A (IW) connects to what is effectively a Home (B), which in turn is an NFS mount of (C), a DMF-based NFS export. To a point, this works. It lets us use "home" as the ultimate sink, and data migration in and out of DMF seems to work nicely when GPFS pulls things from (B) that are no longer present in (A), either because of policy or because a high water mark (HWM) was hit (thus emptying the cache). We've tested it as far out as the data being offline on tape media only inside (C), and it still works, coming back cleanly to (A) within a very reasonable time-frame.
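For anyone interested in the shape of this, the cache side is roughly the following. This is only a minimal sketch; the filesystem name, fileset name, and the host and path of the NFS export served by (B) are made up for illustration:

    # Cache A: an independent-writer AFM fileset whose home is the NFS
    # export served by (B), which in turn sits in front of DMF (C).
    mmcrfileset gpfs0 cacheA --inode-space new \
        -p afmMode=iw,afmTarget=nfs://homeB/exports/projectX

    # Link it into the namespace so instruments and users can write into it.
    mmlinkfileset gpfs0 cacheA -J /gpfs0/cacheA

Eviction at (A) is then driven by the usual fileset quota / high-water-mark machinery, which is where the HWM behaviour above comes from.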


·         We hit "problem 1", which is in and around NFSv4 ACLs that aren't surfacing or mapping correctly (as we'd expect) at the cache. I guess this might be the caveat of backing the cache with a home that sits inside DMF (over an NFS export) for surfacing the data to clients; a quick way of comparing what each end sees is sketched below.
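For the comparison itself, the check is just what each end reports for the same file. The paths here are hypothetical; mmgetacl -k nfs4 asks GPFS for the ACL in NFSv4 format, and nfs4_getfacl (from nfs4-acl-tools) shows what the NFSv4 server presents on the mounted DMF export:

    # On the cache cluster (A): the ACL as AFM has surfaced it
    mmgetacl -k nfs4 /gpfs0/cacheA/projectX/testfile

    # On (B), where the DMF export from (C) is NFS-mounted: the ACL as
    # the NFSv4 server presents it
    nfs4_getfacl /mnt/dmf/projectX/testfile

In our setup the two don't currently line up the way we'd expect, which is the mismatch described above.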

Where we’d like to head:

We haven't seen it working yet, but, as Luke and Radhika were discussing last month, we really like the idea of an IW cache (A, where instruments dump huge data) which via AFM ends up at (B) (technically also a "home", but IW), which in turn hangs off (C), which might itself be another cache sitting next to an HPC platform for reading and writing data quickly and in parallel. A rough sketch of how the middle site might be wired up follows below.
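One way to read that chain is that (B) is itself an IW cache of (C) while also being exported as the home for (A). Under that reading, and with purely illustrative names and paths (and without us having validated that a cache fileset can cleanly be re-exported as another cache's home), (B) would look something like:

    # At (B): an IW cache fileset whose home is the HPC-side site (C)
    mmcrfileset gpfs0 projectX --inode-space new \
        -p afmMode=iw,afmTarget=nfs://siteC/exports/projectX
    mmlinkfileset gpfs0 projectX -J /gpfs0/projectX

    # Then export (B)'s cache fileset over NFS so that (A) can point its
    # own IW cache at it, i.e. (B) acts as "home" for (A) while itself
    # being a cache of (C). With CES/Ganesha that would be something like:
    mmafmconfig enable /gpfs0/projectX
    mmnfs export add /gpfs0/projectX --client "siteA-gw(Access_Type=RW)"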

We like the idea of chained caches because it gives us extreme flexibility within the premise of our "data anywhere" fabric. We appreciate that this has some challenges: we know that with multiple IW caches the last write will always win, but that is something we can control with workload guidelines. Still, we'd like to add our voices to the idea of caches chained all the way back, so that data can be pulled the whole way from C --> B --> A, with I/O being written and read at point C, point B and point A along the way, and everyone seeing the same, consistent data in the end.

We're also working on surfacing data via object and file simultaneously, for different needs. This is coming along relatively well, but we're still learning where this does and does not make sense. A moving target, from how it all appears on the surface.

Some might say that is effectively asking for a globally, eventually (always) consistent filesystem within Scale.

Anyway – just some thoughts.

Regards,

-jc




