[gpfsug-discuss] Querying size of snapshots

Jan-Frode Myklebust janfrode at tanso.net
Tue Jan 29 19:19:12 GMT 2019


You could put snapshot data in a separate storage pool. Then the space it
occupies would be visible, though it's a bit hard to see how this would be
usable/manageable in practice..
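
A sketch of how you could then watch that pool's occupancy (file system
name "gpfs0" and pool name "snappool" are placeholders; mmdf's -P flag
restricts the report to a single pool):

  mmdf gpfs0 -P snappool

mmdf can itself be I/O-intensive on a large file system, so it's best run
when load is light.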


-jf
On Tue, Jan 29, 2019 at 20:08, Christopher Black <cblack at nygenome.org> wrote:

> Thanks for the quick and detailed reply! I had read the manual and was
> aware of the warnings about -d (mentioned in my PS).
>
> On systems with high churn (lots of temporary files, lots of big and small
> deletes along with many new files), I’ve previously used estimates of
> snapshot size as a useful signal for whether we can expect to see an
> increase in available space over the next few days as snapshots expire.
> I’ve used this technique on a few other, more mainstream storage systems,
> but never on gpfs.
>
> I’d find it useful to have a similar way to monitor “space to be freed
> pending snapshot deletes” on gpfs. It sounds like there is no existing
> solution for this, so it would be a request for enhancement.
>
> I’m not sure how much overhead there would be in keeping a running counter
> of blocks changed since snapshot creation, or whether that would completely
> fall apart on large systems or systems with many snapshots. If that is a
> concern, even having an estimate for only the oldest snapshot would be
> useful, though I realize that can depend on all the later snapshots as
> well. Perhaps an overall “size of all snapshots” would be easier to manage
> and would still be useful to us.
>
> I don’t need this number to be 100% accurate; even a lower-bound (floor)
> estimate would be very useful.
>
>
>
> Is anyone else interested in this? Do other people have other ways to
> estimate how much space they will get back as snapshots expire? Is there a
> more efficient way of making such an estimate available to admins other
> than running an mmlssnapshot -d every night and recording the output?
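>
> (For concreteness, the nightly recording I have in mind is just a small
> cron-driven script along these lines -- a sketch only; the file system
> name "gpfs0" and the log path are placeholders:
>
>   #!/bin/sh
>   # Append a timestamped mmlssnapshot -d sample for later trending.
>   LOG=/var/log/gpfs-snapshot-sizes.log
>   {
>     date '+%F %T'
>     /usr/lpp/mmfs/bin/mmlssnapshot gpfs0 -d
>   } >> "$LOG" 2>&1
>
> scheduled in a quiet window, given the performance caveats on -d.)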
>
>
>
> Thanks all!
>
> Chris
>
>
>
> From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Marc A
> Kaplan <makaplan at us.ibm.com>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: Tuesday, January 29, 2019 at 1:24 PM
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] Querying size of snapshots
>
>
>
> 1. First off, let's RTFM ...
>
> *-d* Displays the amount of storage that is used by the snapshot.
> This operation requires an amount of time that is proportional to the
> size of the file system; therefore, it can take several minutes or even
> hours on a large and heavily loaded file system. This optional parameter
> can impact overall system performance. Avoid running the *mmlssnapshot*
> command with this parameter frequently or during periods of high file
> system activity.
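>
> For concreteness, a minimal invocation looks like this (file system name
> is a placeholder; per the caveat above, run it in a quiet window):
>
>   mmlssnapshot gpfs0 -d
>
> With -d, each snapshot's line gains storage-usage figures alongside the
> name, status, and creation time.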
>
> SOOOO.. there's that.
>
> 2. Next you may ask, HOW is that?
>
> Snapshots are maintained with a "COW" (copy-on-write) strategy -- they
> are created quickly, essentially by just making a record that the
> snapshot was created at such-and-such a time, at which point the snapshot
> is identical to the "live" filesystem...
>
> Then, over time, the first change to each block of data in the live
> system requires that a copy of the old data block be made and associated
> with the most recently created snapshot....   SO, as more and more blocks
> are changed over time, the snapshot becomes bigger and bigger.   How big?
> Well, it seems the current implementation does not keep a "simple
> counter" of the number of blocks -- but rather a list of the blocks that
> were COW'ed.... So when you come and ask "How big"... GPFS has to go
> traverse the file system metadata and count those COW'ed blocks....
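>
> As a toy illustration of that bookkeeping (not GPFS internals), a block
> is copied only on its *first* overwrite after snapshot creation, so a
> snapshot's size is the count of distinct COW'ed blocks:
>
>   #!/bin/bash
>   # Toy COW ledger: record each block the first time it is overwritten;
>   # later writes to the same block cost nothing extra.
>   declare -A cowed
>   write_block() {
>     local blk=$1
>     [[ -z ${cowed[$blk]} ]] && cowed[$blk]=1   # copy-on-first-write
>   }
>   write_block 7; write_block 7; write_block 42
>   echo "snapshot occupies ${#cowed[@]} blocks"   # prints 2
>
> Keeping that count up to date incrementally is the "simple counter" idea
> below; deriving it after the fact is the traversal mmlssnapshot -d does.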
>
> 3. So why not keep a counter?  Well, it's likely not so simple. For
> starters, GPFS typically runs concurrently on several or many nodes...
> And it probably was not deemed worth the effort..... IF a convincing case
> could be made, I'd bet there is a way to at least keep approximate
> numbers -- log records, periodic exact updates, etc., etc. -- similar to
> the way space allocation and accounting is done for the live file
> system...
>
>