[gpfsug-discuss] mmapplypolicy run time weirdness..

Marc A Kaplan makaplan at us.ibm.com
Thu Sep 14 19:55:39 BST 2017


Read the doc again.  Specify both -g and -N options on the command line to 
get fully parallel directory and inode/policy scanning.
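
For illustration only, here is a minimal Python sketch of assembling such an invocation. The helper name, device name, policy file path, node names, and work directory are all hypothetical placeholders; only the -P, -I, -N and -g options are real mmapplypolicy options. As far as I recall, the -g directory needs to live in a file system shared by all the nodes given to -N, so check the command reference for your release.

```python
import shlex


def build_mmapplypolicy_cmd(device, policy_file, nodes, global_work_dir,
                            action="defer"):
    """Return an argv list for mmapplypolicy with both -N and -g set.

    All argument values are supplied by the caller; nothing here is
    taken from a real cluster.
    """
    if not nodes or not global_work_dir:
        # Per the advice above, both options are needed for the
        # fully parallel scan, so refuse to build a partial command.
        raise ValueError("both nodes (-N) and global_work_dir (-g) are required")
    return [
        "mmapplypolicy", device,
        "-P", policy_file,           # policy rules file
        "-I", action,                # e.g. 'defer', as in the original post
        "-N", ",".join(nodes),       # helper nodes for the parallel scan
        "-g", global_work_dir,       # global work directory
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = build_mmapplypolicy_cmd(
        "gpfs0", "/tmp/vbi.policy", ["node1", "node2"], "/gpfs0/.policytmp"
    )
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run() as-is; building it as a list rather than a string avoids shell quoting problems.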

I'm curious as to what you're trying to do with THRESHOLD(0,100,0).
Perhaps premigrate everything (that matches the other conditions)?

You are correct about these messages:

[I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned.
(...)
[I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned.

If you don't see messages like that, you did not specify both -N and -g.
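
A quick way to check a saved run is to search its output for those two messages. This is a rough Python sketch, not an official tool: the function name is made up, and the marker strings are simply copied from the log lines quoted in this thread, so they may differ in other releases.

```python
# Marker strings copied from the log lines quoted in this thread.
PARALLEL_MARKERS = (
    "Parallel-piped sort and policy evaluation",
    "Piped sorting and candidate file choosing",
)


def used_parallel_path(log_text):
    """Return True if the mmapplypolicy output contains either of the
    'piped' messages that show up when both -N and -g were given."""
    return any(marker in log_text for marker in PARALLEL_MARKERS)


if __name__ == "__main__":
    migrate_log = (
        "[I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy "
        "evaluation. 0 files scanned.\n"
    )
    list_log = "[I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records.\n"
    print(used_parallel_path(migrate_log))
    print(used_parallel_path(list_log))
```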



From:   valdis.kletnieks at vt.edu
To:     gpfsug-discuss at spectrumscale.org
Date:   09/13/2017 08:19 PM
Subject:        [gpfsug-discuss] mmapplypolicy run time weirdness..
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



So we have a number of very similar policy files that get applied for file
migration etc. And they vary drastically in the runtime to process,
apparently due to different selections on whether to do the work in
parallel.

Running a set of rules with 'mmapplypolicy -I defer' that look like this:

RULE 'VBI_FILES_RULE' MIGRATE FROM POOL 'system'
THRESHOLD(0,100,0)
WEIGHT(FILE_SIZE)
TO POOL 'VBI_FILES'
FOR FILESET('vbi')
WHERE (mb_allocated >= 8)

for 10 filesets can scan 325M directory entries in 6 minutes, and sort and
evaluate the policy in 3 more minutes.

However, this takes a bit over 30 minutes for the scan and another 20 for
sorting and policy evaluation over the same set of filesets:

RULE 'VBI_FILES_RULE' LIST 'pruned_files'
THRESHOLD(90,80)
WEIGHT(FILE_SIZE)
FOR FILESET('vbi')
WHERE (mb_allocated >= 8)

even though the output is essentially identical.  Why is LIST so much more
expensive than 'MIGRATE' with '-I defer'?  I could understand if I had an
expensive SHOW clause, but there isn't one here (and a different policy
that I run that *does* have a big SHOW clause takes almost the same amount
of time as the minimal LIST)....

I'm thinking that it has *something* to do with the MIGRATE job outputting:

[I] 2017-09-12 at 21:20:44.155 Parallel-piped sort and policy evaluation. 0 files scanned.
(...)
[I] 2017-09-12 at 21:24:14.672 Piped sorting and candidate file choosing. 0 records scanned.

while the LIST job says:

[I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records.
(...)
[I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned.
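
To compare phases between runs, the timestamps on these [I] lines can be diffed. Below is a small Python sketch that assumes every informational line has the form "[I] YYYY-MM-DD at HH:MM:SS.mmm message", as in the excerpts above; the function names are invented for this example.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of an informational log line.
LINE_RE = re.compile(r"^\[I\] (\d{4}-\d{2}-\d{2} at \d{2}:\d{2}:\d{2}\.\d+) ")


def phase_timestamps(log_text):
    """Return the timestamps of all '[I]' lines, in order of appearance."""
    stamps = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d at %H:%M:%S.%f")
            )
    return stamps


def elapsed_seconds(log_text):
    """Seconds between the first and last '[I]' line in the excerpt."""
    stamps = phase_timestamps(log_text)
    if len(stamps) < 2:
        raise ValueError("need at least two timestamped [I] lines")
    return (stamps[-1] - stamps[0]).total_seconds()


if __name__ == "__main__":
    list_log = (
        "[I] 2017-09-12 at 13:58:06.926 Sorting 327627521 file list records.\n"
        "(...)\n"
        "[I] 2017-09-12 at 14:02:04.223 Policy evaluation. 0 files scanned.\n"
    )
    print(elapsed_seconds(list_log))
```

Note that this only measures the gap between the lines it is given; the excerpts in this message are elided with "(...)", so they do not cover a whole run.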

(Both output the same message during the 'Directory entries scanned: 0.'
phase, but I suspect MIGRATE is multi-threading that part as well, as it
completes much faster).

What's the controlling factor in mmapplypolicy's decision whether or
not to parallelize the policy?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





