[gpfsug-discuss] Lost disks
Uwe Falke
UWEFALKE at de.ibm.com
Thu Jul 27 15:18:02 BST 2017
"Just doing something" usually makes things worse. Whether a third-party
tool knows how to handle GPFS NSDs is doubtful (unless it is dedicated
to that purpose).
First, I'd look at what is actually on the sectors where the NSD headers
used to be, and try to find out whether data beyond that area were also
modified. If they were, restoring the NSDs does not make much sense, as
data and/or metadata (depending on disk usage) would also be corrupted.
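A read-only first look at those sectors could be done along these lines. This is only a minimal sketch: the helper name dump_sectors is made up for illustration, and nothing in it is GPFS-specific. It reads a range of 512-byte sectors with dd and hex-dumps them byte by byte with od, so it never writes to the device. The addresses od prints are relative to the start of the dumped range, and the bytes appear in on-disk order (not swapped into 16-bit words as with od -x).

```shell
#!/bin/sh
# Hypothetical helper: hex-dump COUNT 512-byte sectors of DEV, starting at
# sector START (0-based). It only reads from DEV and never writes to it.
# Usage: dump_sectors DEV START COUNT
dump_sectors() {
    dev=$1
    start=$2
    count=$3
    # dd extracts the sector range; od prints hex addresses (-A x) and
    # single bytes in hex (-t x1). Runs of identical lines collapse to "*".
    dd if="$dev" bs=512 skip="$start" count="$count" 2>/dev/null |
        od -A x -t x1
}
```

For example, `dump_sectors /dev/vdb 0 8` would show the first 4 KiB of the device, and a later range such as `dump_sectors /dev/vdb 8 64` gives a glance at what follows the header area. Whether the area beyond the header was modified still has to be judged by eye, since there is no reference copy to compare against.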
If you are sure that just the NSD header area has been affected, you might
try to trick GPFS by writing into the header area just the information
GPFS needs to recognise the devices as the NSDs they were.
The first 4 KiB of a v1 NSD from a VM on my laptop look like this:
$ cat nsdv1head | od --address-radix=x -xc
000000 0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
000200 cf70 4192 0000 0100 0000 3000 e930 a028
p 317 222 A \0 \0 \0 001 \0 \0 \0 0 0 351 ( 240
000210 a8c0 ce7a a251 1f92 a251 1a92 0000 0800
300 250 z 316 Q 242 222 037 Q 242 222 032 \0 \0 \0 \b
000220 0000 f20f 0000 0000 0000 0000 0000 0000
\0 \0 017 362 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
000230 0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
000400 93d2 7885 0000 0100 0000 0002 141e 64a8
322 223 205 x \0 \0 \0 001 \0 \0 002 \0 036 024 250 d
000410 a8c0 ce7a a251 3490 0000 fa0f 0000 0800
300 250 z 316 Q 242 220 4 \0 \0 017 372 \0 \0 \0 \b
000420 0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
000480 534e 2044 6564 6373 6972 7470 726f 6620
N S D d e s c r i p t o r f
000490 726f 2f20 6564 2f76 6476 2062 7263 6165
o r / d e v / v d b c r e a
0004a0 6574 2064 7962 4720 4650 2053 6f4d 206e
t e d b y G P F S M o n
0004b0 614d 2079 3732 3020 3a30 3434 303a 2034
M a y 2 7 0 0 : 4 4 : 0 4
0004c0 3032 3331 000a 0000 0000 0000 0000 0000
2 0 1 3 \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0004d0 0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
000e00 4c5f 4d56 0000 017d 0000 017d 0000 017d
_ L V M \0 \0 } 001 \0 \0 } 001 \0 \0 } 001
000e10 0000 017d 0000 0000 0000 0000 0000 0000
\0 \0 } 001 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
000e20 0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
000e30 0000 0000 0000 0000 0000 0000 017d 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 } 001 \0 \0
000e40 0000 0000 0000 0000 0000 0000 0000 0000
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
001000
I suppose the important area starts at 0x0200 (i.e. with the second
512-byte sector) and ends at 0x04df (which is within the third 512-byte
sector, hence the second and third sectors appear crucial). I think that
there is some more space before the payload area starts. Without knowing
what exactly has to go into the header, I'd try to create an NSD on one or
two (new) disks, save the headers, then create an FS on them, save the
headers again, and check whether anything has changed.
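The save-and-compare step could look roughly like the following. It is a sketch under assumptions: the helper names save_head and diff_heads are invented here, and the 4 KiB size is simply the amount shown in the dump above, not a documented header length. save_head copies the first eight 512-byte sectors of a device to a file; diff_heads lists every byte that differs between two saved copies as hex offset, old value, new value.

```shell
#!/bin/sh
# Hypothetical helpers; device and file names are placeholders.

# Copy the first 8 sectors (4 KiB) of DEV into the file OUT.
# Only reads from DEV.
# Usage: save_head DEV OUT
save_head() {
    dd if="$1" of="$2" bs=512 count=8 2>/dev/null
}

# Print each differing byte of two saved headers as
# "hex-offset old new" (byte values in hex). Prints nothing if the
# files are identical.
# Usage: diff_heads BEFORE AFTER
diff_heads() {
    # cmp -l prints a 1-based decimal offset and two octal byte values;
    # the leading "0" makes printf read those values as octal.
    cmp -l "$1" "$2" | while read -r off a b; do
        printf '%06x %02x %02x\n' "$((off - 1))" "0$a" "0$b"
    done
}
```

A run might be `save_head /dev/vdb vdb.nsd` after creating the NSD, `save_head /dev/vdb vdb.fs` after creating the file system, then `diff_heads vdb.nsd vdb.fs`. The offsets printed line up with the addresses in an od dump using a hex address radix, so changed fields are easy to locate.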
So, creating some new NSDs and checking what keys appear there and in
the cluster configuration could get you very close to crafting the header
information which is gone. Of course, that depends on how dear the data on
the lost FS AKA SG are and how hard it would be to rebuild them otherwise
(replay from backup, recalculate, ...).
It seems not a bad idea to set aside the NSD headers of your NSDs in a
backup :-)
And also now: before amending any blocks on your disks, save them!
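One way to set the headers aside, for several devices at once, is sketched below. The function name backup_heads, the file naming scheme and the 4 KiB size are all assumptions made for illustration. Each device is only read: its first eight sectors are copied to a timestamped file, then the device is read a second time and compared with the copy, so a bad copy is noticed immediately.

```shell
#!/bin/sh
# Hypothetical helper: save the first 4 KiB of each DEV into DIR and verify
# each copy against the device. Only reads from the devices.
# Usage: backup_heads DIR DEV...
backup_heads() {
    dir=$1
    shift
    stamp=$(date +%Y%m%d-%H%M%S)
    for dev in "$@"; do
        out="$dir/$(basename "$dev").head.$stamp"
        dd if="$dev" of="$out" bs=512 count=8 2>/dev/null || return 1
        # Read the same sectors again and compare them with the saved file.
        dd if="$dev" bs=512 count=8 2>/dev/null | cmp -s - "$out" || return 1
        # Report where the copy went.
        echo "$out"
    done
}
```

A call such as `backup_heads /root/nsd-heads /dev/vdb /dev/vdc` (paths are examples only) prints the name of each file written. The copies should be kept somewhere other than the disks they describe.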
Mit freundlichen Grüßen / Kind regards
Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services /
Data Center Services
From: Jonathan Buzzard <jonathan.buzzard at strath.ac.uk>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 07/27/2017 01:59 PM
Subject: Re: [gpfsug-discuss] Lost disks
Sent by: gpfsug-discuss-bounces at spectrumscale.org
On Thu, 2017-07-27 at 07:28 -0400, RICHARD RUPP wrote:
> If you are under IBM support, leverage IBM for help. A third party
> utility has the possibility of making it worse.
>
The chances of recovery are slim in the first place from this sort of
problem, at least with v1 NSD descriptors. Further, IBM have *ALREADY*
told him the data is lost. I quote:
    But in their PMR they were told that all that data is lost now
    and that the disk headers didn't appear as GPFS disk headers.
So in this scenario you have little to lose by trying something, because
you are now on your own. The worst-case scenario is that whatever you try
does not work, which leaves you no worse off than you are now. Well, apart
from lost time for the restore, but you might have started that already
to somewhere else.
I was once told by IBM (nine years ago now) that my GPFS file system was
kaput and to arrange a restore from tape. At which point some fiddling
by myself fixed the problem and a 100TB restore was no longer required.
However, this was not due to overwritten NSD descriptors. When that
happened, the two file systems affected had to be restored. Well,
bizarrely, one was still mounted and I was able to rsync the data off.
However, the point is that at this stage fiddling with third-party tools
is the only option left.
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG