[gpfsug-discuss] Quick delete of huge tree

Jan-Frode Myklebust janfrode at tanso.net
Tue Apr 20 12:52:50 BST 2021


A couple of ideas.

The IBM Knowledge Center (KC) recommends adding WEIGHT(DIRECTORY_HASH) to
group deletions within a directory. Then maybe also do it as a two-step
process in the same policy run: delete all non-directories first, and then
delete the directories in depth-first order using WEIGHT(Length(PATH_NAME)):


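/* Step 1: delete everything that is not a directory, grouped per directory */
RULE 'delnondir' DELETE
     DIRECTORIES_PLUS
     WEIGHT(DIRECTORY_HASH)
     WHERE PATH_NAME LIKE '/mypath/%' AND NOT MISC_ATTRIBUTES LIKE '%D%'

/* Step 2: delete directories, longest path (deepest) first */
RULE 'deldir' DELETE
     DIRECTORIES_PLUS
     WEIGHT(Length(PATH_NAME))
     WHERE PATH_NAME LIKE '/mypath/%' AND MISC_ATTRIBUTES LIKE '%D%'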
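
To see why weighting by path length should give a children-before-parents
order, here is a small sketch in plain Python (not policy language). It
assumes that candidates with a higher WEIGHT are processed first, and it
models only the ordering, not the parallel execution of a real policy run.
The helper names deletion_order and children_before_parents are made up for
this illustration.

```python
def deletion_order(dir_paths):
    """Sort directory paths longest-first, mimicking WEIGHT(Length(PATH_NAME))."""
    return sorted(dir_paths, key=len, reverse=True)


def children_before_parents(ordered):
    """Return True if no directory is listed before one of its descendants."""
    for i, parent in enumerate(ordered):
        prefix = parent.rstrip("/") + "/"
        # A descendant's path always starts with "<parent>/" and is longer.
        if any(later.startswith(prefix) for later in ordered[i + 1:]):
            return False
    return True


if __name__ == "__main__":
    # Hypothetical directory names under the '/mypath/' tree from the rules.
    dirs = ["/mypath/a", "/mypath/a/b/c", "/mypath/d", "/mypath/a/b"]
    order = deletion_order(dirs)
    print(order)
    print(children_before_parents(order))
```

A descendant's path is always strictly longer than its ancestor's, so a
longest-first sort can never place a directory ahead of something inside it.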

HTH


On Tue, Apr 20, 2021 at 1:18 PM Ulrich Sibiller <
u.sibiller at science-computing.de> wrote:

>
> Hello *,
>
> I have to delete a subtree of about 50 million files in thousands of
> subdirs, ~14TB of data.
> Running a recursive rm is very slow, so I set up a simple policy file:
>
> RULE 'delstuff' DELETE
>      DIRECTORIES_PLUS
>      WHERE PATH_NAME LIKE '/mypath/%'
>
> This kind of works but is not really fast, either. It even requires a
> second run because files and directories within the tree are processed in
> arbitrary order, so quite frequently the policy tries to delete a directory
> before its content has been removed completely. For those dirs I see an
> error message and have to delete them afterwards.
>
> I am wondering if there's a quicker way. Given that this is a whole tree,
> I think there should be a quick way to unlink the complete inode hierarchy.
>
> Unfortunately we are not using a fileset for that tree...
>
> So are there any ideas how to solve that more efficiently?
>
> Uli
> --
> Science + Computing AG
> Vorstandsvorsitzender/Chairman of the board of management:
> Dr. Martin Matzke
> Vorstand/Board of Management:
> Matthias Schempp, Sabine Hohenstein
> Vorsitzender des Aufsichtsrats/
> Chairman of the Supervisory Board:
> Philippe Miltin
> Aufsichtsrat/Supervisory Board:
> Martin Wibbe, Ursula Morgenstern
> Sitz/Registered Office: Tuebingen
> Registergericht/Registration Court: Stuttgart
> Registernummer/Commercial Register No.: HRB 382196