[gpfsug-discuss] mmrestripefs "No space left on device"

John Hanks griznog at gmail.com
Thu Nov 2 15:33:11 GMT 2017


We have no snapshots (they were the first to go when we initially hit the
full metadata NSDs).

I've increased quotas so that no filesets have hit a space quota.

Verified that there are no inode quotas anywhere.

mmdf shows that the least free space on any NSD is 9%.

Still getting this error:

[root@scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3
Scanning file system metadata, phase 1 ...
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scanning file system metadata for sas0 storage pool
Scanning file system metadata for sata0 storage pool
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
Error processing user file metadata.
No space left on device
Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on
scg-gs0 for inodes with broken disk addresses or failures.
mmrestripefs: Command failed. Examine previous error messages to determine
cause.

I should also note that this fails almost immediately, far too quickly to
fill up any location it could be trying to write to.
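
For anyone who wants to tally what ends up in the interestingInodes file,
here is a minimal Python sketch. It assumes the column layout seen in the
header and sample row quoted further down this thread, and that each record
sits on a single line in the real file (the mail client wrapped it here).
The names parse_row, ROW and ERR are made up for illustration; nothing here
is part of GPFS.

```python
import errno
import re

# Layout assumed from the header and sample row quoted in this thread:
# INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID MEMO(...)
ROW = re.compile(
    r"^\s*(?P<inode>\d+)\s+(?P<dummy>\S+)\s+(?P<snap_id>\d+)\s+"
    r"(?P<is_global>\d+)\s+(?P<fset_id>\d+)\s+(?P<memo>.*)$"
)
# Optional trailing "Error: <number> <message>" inside the MEMO column.
ERR = re.compile(r"Error:\s*(?P<code>\d+)\s*(?P<text>.*)$")


def parse_row(line):
    """Return a dict for one inode row, or None for header and other lines."""
    m = ROW.match(line)
    if m is None:
        return None
    memo = m.group("memo")
    row = {
        "inode": int(m.group("inode")),
        "snapshot_id": int(m.group("snap_id")),
        "fileset_id": int(m.group("fset_id")),
        # Everything in MEMO before "Error:" is treated as flags / file type.
        "flags": memo.split("Error:")[0].split(),
        "errno": None,
        "errno_name": None,
    }
    e = ERR.search(memo)
    if e:
        code = int(e.group("code"))
        row["errno"] = code
        # Map the number to its symbolic name on this platform, if known.
        row["errno_name"] = errno.errorcode.get(code)
    return row
```

Feeding it the sample row from the original post yields inode 53504 with
error number 28, the same "No space left on device" code that mmrestripefs
prints, so the file does not seem to say more than the command output does.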

jbh

On Thu, Nov 2, 2017 at 7:57 AM, David Johnson <david_johnson at brown.edu>
wrote:

> One thing that may be relevant is if you have snapshots, depending on your
> release level,
> inodes in the snapshot may be considered immutable, and will not be
> migrated.  Once the snapshots
> have been deleted, the inodes are freed up and you won’t see the (somewhat
> misleading) message
> about no space.
>
>  — ddj
> Dave Johnson
> Brown University
>
> On Nov 2, 2017, at 10:43 AM, John Hanks <griznog at gmail.com> wrote:
>
> Thanks all for the suggestions.
>
> Having our metadata NSDs fill up was what prompted this exercise, but
> space was previously freed up on those by switching them from metadata+data
> to metadataOnly and using a policy to migrate files out of that pool. So
> these now have about 30% free space (more if you include fragmented space).
> The restripe attempt is just to make a final move of any remaining data off
> those devices. All the NSDs now have free space on them.
>
> df -i shows inode usage at about 84%, so plenty of free inodes for the
> filesystem as a whole.
>
> We did have old .quota files lying around but removing them didn't have
> any impact.
>
> mmlsfileset fs -L -i is taking a while to complete; I'll let it simmer
> while getting to work.
>
> mmrepquota does show about a half-dozen filesets that have hit their quota
> for space (we don't set quotas on inodes). Once I'm settled in this morning
> I'll try giving them a little extra space and see what happens.
>
> jbh
>
>
> On Thu, Nov 2, 2017 at 4:19 AM, Oesterlin, Robert <
> Robert.Oesterlin at nuance.com> wrote:
>
>> One thing that I’ve run into before is that on older file systems you had
>> the “*.quota” files in the file system root. If you upgraded the file
>> system to a newer version (so these files aren’t used), there was a bug at
>> one time where they didn’t get properly migrated during a restripe.
>> The solution was to just remove them.
>>
>>
>>
>>
>>
>> Bob Oesterlin
>>
>> Sr Principal Storage Engineer, Nuance
>>
>>
>>
>> *From: *<gpfsug-discuss-bounces at spectrumscale.org> on behalf of John
>> Hanks <griznog at gmail.com>
>> *Reply-To: *gpfsug main discussion list <gpfsug-discuss at spectrumscale.org
>> >
>> *Date: *Wednesday, November 1, 2017 at 5:55 PM
>> *To: *gpfsug <gpfsug-discuss at spectrumscale.org>
>> *Subject: *[EXTERNAL] [gpfsug-discuss] mmrestripefs "No space left on
>> device"
>>
>>
>>
>> Hi all,
>>
>>
>>
>> I'm trying to do a restripe after setting some nsds to metadataOnly and I
>> keep running into this error:
>>
>>
>>
>> Scanning user file metadata ...
>>
>>    0.01 % complete on Wed Nov  1 15:36:01 2017  (     40960 inodes with
>> total     531689 MB data processed)
>>
>> Error processing user file metadata.
>>
>> Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779708' on
>> scg-gs0 for inodes with broken disk addresses or failures.
>>
>> mmrestripefs: Command failed. Examine previous error messages to
>> determine cause.
>>
>>
>>
>> The file it points to says:
>>
>>
>>
>> This inode list was generated in the Parallel Inode Traverse on Wed Nov
>> 1 15:36:06 2017
>>
>> INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID
>> MEMO(INODE_FLAGS FILE_TYPE [ERROR])
>>
>>  53504        0:0        0           1                 0
>> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device
>>
>>
>>
>>
>>
>> /var on the node I am running this on has > 128 GB free, all the NSDs
>> have plenty of free space, the filesystem being restriped has plenty of
>> free space and if I watch the node while running this no filesystem on it
>> even starts to get full. Could someone tell me where mmrestripefs is
>> attempting to write and/or how to point it at a different location?
>>
>>
>>
>> Thanks,
>>
>>
>>
>> jbh
>>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>