[gpfsug-discuss] Compression details

IBM Spectrum Scale scale at us.ibm.com
Thu Jul 26 14:24:14 BST 2018


> 1) How is file deletion handled?

This depends on whether there is a snapshot and whether copy-on-write (COW)
is needed. If COW is not needed, or there is no snapshot at all, the file
deletion is handled the same way as for a non-compressed file: the data
blocks are simply discarded without being decompressed, and then the inode
is deleted.

However, even if COW is needed, decompression before COW is only required
when one of the following conditions is true:
1) The block to be moved is not the first block of a compression group (a
compression group is 10 blocks, counted from block 0).
2) The compression group ends beyond the last block of the destination file
(the file in the latest snapshot).
3) The compression group is not full and the destination file is larger.
4) The compression group ends at the last block of the destination file, but
the sizes of the source and destination files differ.
5) The destination file already has some allocated (COWed) blocks within the
compression group.
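
To make the five conditions easier to follow, here is a minimal Python
sketch that checks them one by one. It is only an illustration, not GPFS
code: the CowMove record, its field names and decompress_reasons() are
hypothetical, and reading "the destination file is larger" as "larger than
the source file" is an assumption.

```python
from dataclasses import dataclass
from typing import List

GROUP_BLOCKS = 10  # blocks per compression group, as stated in the post


@dataclass
class CowMove:
    """Hypothetical description of one block about to be COWed."""
    block_index: int               # index of the block to be moved
    group_end_block: int           # last block index of its compression group
    group_full: bool               # True if the group holds all its blocks
    dst_last_block: int            # last block index of the destination file
    src_size: int                  # source file size in bytes
    dst_size: int                  # destination (snapshot) file size in bytes
    dst_has_blocks_in_group: bool  # destination already has COWed blocks here


def decompress_reasons(m: CowMove) -> List[int]:
    """Return the numbers of the conditions (1-5) that apply to this move."""
    reasons = []
    if m.block_index % GROUP_BLOCKS != 0:
        reasons.append(1)  # not the first block of a compression group
    if m.group_end_block > m.dst_last_block:
        reasons.append(2)  # group ends beyond the destination's last block
    if not m.group_full and m.dst_size > m.src_size:
        reasons.append(3)  # group not full, destination larger (assumed: than source)
    if m.group_end_block == m.dst_last_block and m.src_size != m.dst_size:
        reasons.append(4)  # group ends at the last block, sizes differ
    if m.dst_has_blocks_in_group:
        reasons.append(5)  # destination already has blocks in the group
    return reasons
```

An empty list means none of the conditions applies, so under this reading
the block could be moved without decompressing it first.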

> 2) Are there any guidelines

The LZ4 compression algorithm already strikes a good trade-off between
performance and compression ratio, so it really depends on your data
characteristics and access patterns. For example, if the data is written
once but read many times, there shouldn't be much overhead, because it is
only compressed once (I assume decompression with lz4 does not consume as
many resources as compression). If your data is essentially random, then
compressing it with lz4 saves little storage space, yet the data still has
to be compressed, and decompressed again when it is needed. Note, however,
that compressed data can also reduce the load on storage and network,
because the I/O for a compressed file is smaller. So from the application's
overall point of view, there may be no added overhead at all.
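
One rough way to get a feel for "data characteristics" is to compress a
sample of your own data and look at the ratio. The sketch below is only an
illustration: it uses zlib from the Python standard library as a stand-in
(it is not LZ4 and not what GPFS runs, so absolute numbers will differ),
and compression_ratio() and the two sample buffers are made up for the
example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size / original size (lower means more savings)."""
    if not data:
        return 1.0
    # Level 1 favours speed over ratio; zlib is a stand-in here, not LZ4.
    return len(zlib.compress(data, 1)) / len(data)


# Highly repetitive sample versus random bytes of the same length.
repetitive = b"GPFS compression sample line\n" * 4096
random_like = os.urandom(len(repetitive))

print(f"repetitive data ratio: {compression_ratio(repetitive):.3f}")
print(f"random data ratio:     {compression_ratio(random_like):.3f}")
```

The repetitive buffer shrinks to a small fraction of its size, while the
random buffer stays at roughly its original size, which is the case where
compression costs CPU without saving space.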

Regards, The Spectrum Scale (GPFS) team
