[gpfsug-discuss] data integrity documentation

J. Eric Wonderley eric.wonderley at vt.edu
Fri Aug 4 13:58:12 BST 2017


i actually hit this assert and turned it in to support on this version:
Build branch "4.2.2.3 efix6 (987197)".

i was told to do exactly what sven mentioned.

i thought it strange that i did NOT hit the assert in a no pass but hit it
in a yes pass.

On Thu, Aug 3, 2017 at 9:06 AM, Sven Oehme <oehmes at gmail.com> wrote:

> a trace during a mmfsck with the checksum parameters turned on would
> reveal it.
> the support team should be able to give you specific triggers to cut a
> trace during checksum errors; this way the trace is cut when the issue
> happens and then from the trace on server and client side one can extract
> which card was used on each side.
>
> sven
>
> On Wed, Aug 2, 2017 at 2:53 PM Stijn De Weirdt <stijn.deweirdt at ugent.be>
> wrote:
>
>> hi steve,
>>
>> > The nsdChksum settings for non-GNR/ESS based systems are not officially
>> > supported. They perform checksums on data transfer over the network
>> > only and can be used to help debug data corruption when the network is
>> > suspect.
>> i'll take not officially supported over silent bitrot any day.
>>
>> >
>> > Did any of those "Encountered XYZ checksum errors on network I/O to NSD
>> > Client disk" warning messages result in a disk being changed to "down"
>> > state due to an IO error?
>> no.
>>
>> > If no disk IO error was reported in the GPFS log,
>> > that means data was retransmitted successfully on retry.
>> we suspected as much. as sven already asked, mmfsck now reports a clean
>> filesystem.
>> i have an ibdump of the 2 involved nsds during the reported checksum
>> errors; i'll have a closer look to see if i can spot these retries.
>>
>> >
>> > As sven said, only GNR/ESS provides the full end to end data integrity.
>> so with the silent network error, we have a high probability that the data
>> is corrupted.
>>
>> we are now looking for a test to find out what adapters are affected. we
>> hoped that nsdperf with verify=on would tell us, but it doesn't.
>>
>> >
>> > Steve Y. Xiao
>> >
>> > _______________________________________________
>> > gpfsug-discuss mailing list
>> > gpfsug-discuss at spectrumscale.org
>> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>> >
>>
>