[gpfsug-discuss] Migration to separate metadata and data disks

Aaron Knister aaron.s.knister at nasa.gov
Thu Sep 1 15:02:50 BST 2016


Oh! I think you've already provided the info I was looking for :) I 
thought that failGroup=3 meant there were 3 failure groups within the 
SSDs. I suspect that's not what you meant, and that 3 is simply the 
failure group all of those disks belong to. I think that explains 
what's going on: there's only one failure group's worth of 
metadata-capable disks available, so GPFS can't place the 2nd replica 
for existing files.

Here's what I would suggest:

- Create at least 2 failure groups within the SSDs
- Put the default metadata replication factor back to 2
- Run mmrestripefs -R to shuffle files around and restore the metadata 
replication factor of 2 to any files created while it was set to 1 
(rough commands sketched below)
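
Something like the following. I'm making up the NSD names (ssd_nsd1, 
ssd_nsd2) for illustration, so substitute the real ones from mmlsdisk, 
and double-check the stanza syntax against the mmchdisk man page on 
your 3.5 install:

# cat ssd_failgroups.stanza
%nsd: nsd=ssd_nsd1 failureGroup=3
%nsd: nsd=ssd_nsd2 failureGroup=4

# mmchdisk fs1 change -F ssd_failgroups.stanza
# mmchfs fs1 -m 2
# mmrestripefs fs1 -R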

If you're not interested in replication for metadata, then perhaps all 
you need to do is the mmrestripefs -R. I think that should un-replicate 
the files' metadata, dropping the copies on the SATA disks and leaving 
the copy on the SSDs.
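
In that case it should just be:

# mmrestripefs fs1 -R

since, as I understand it, -R rewrites the replication settings of 
existing files to match the current filesystem defaults (now -m 1).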

Hope that helps.

-Aaron

On 9/1/16 9:39 AM, Aaron Knister wrote:
> By the way, I suspect the "No space left on device" errors are because
> GPFS believes, for some reason, that it is unable to maintain the
> metadata replication factor of 2 that's likely set on all previously
> created inodes.
>
> On 9/1/16 9:36 AM, Aaron Knister wrote:
>> I must admit, I'm curious why you're dropping the replication factor
>> from 2 down to 1. We've seen some serious advantages to having
>> multiple metadata replicas as far as error recovery is concerned.
>>
>> Could you paste the output of mmlsdisk for the filesystem?
>>
>> -Aaron
>>
>> On 9/1/16 9:30 AM, Miroslav Bauer wrote:
>>> Hello,
>>>
>>> I have a GPFS 3.5 filesystem (fs1) and I'm trying to migrate the
>>> filesystem metadata from state:
>>> -m = 2 (default metadata replicas)
>>> - SATA disks (dataAndMetadata, failGroup=1)
>>> - SSDs (metadataOnly, failGroup=3)
>>> to the desired state:
>>> -m = 1
>>> - SATA disks (dataOnly, failGroup=1)
>>> - SSDs (metadataOnly, failGroup=3)
>>>
>>> I have done the following steps in the following order:
>>> 1) change SATA disks to dataOnly (stanza file modifies the 'usage'
>>> attribute only):
>>> # mmchdisk fs1 change -F dataOnly_disks.stanza
>>> Attention: Disk parameters were changed.
>>>   Use the mmrestripefs command with the -r option to relocate data and
>>> metadata.
>>> Verifying file system configuration information ...
>>> mmchdisk: Propagating the cluster configuration data to all
>>>   affected nodes.  This is an asynchronous process.
>>>
>>> 2) change default metadata replicas number 2->1
>>> # mmchfs fs1 -m 1
>>>
>>> 3) run mmrestripefs as suggested by output of 1)
>>> # mmrestripefs fs1 -r
>>> Scanning file system metadata, phase 1 ...
>>> Error processing inodes.
>>> No space left on device
>>> mmrestripefs: Command failed.  Examine previous error messages to
>>> determine cause.
>>>
>>> It is, however, still possible to create new files on the filesystem.
>>> When I return one of the SATA disks to dataAndMetadata, the
>>> mmrestripefs command stops complaining about "No space left on
>>> device". Both df and mmdf say that there is enough space for both
>>> data (SATA) and metadata (SSDs). Does anyone have an idea why it is
>>> complaining?
>>>
>>> Thanks,
>>>
>>> --
>>> Miroslav Bauer
>>>
>>
>

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


