[gpfsug-discuss] Kernel crashes with Spectrum Scale and RHEL 7.7 3.10.0-1062.18.1.el7 kernel
Lukas Hejtmanek
xhejtman at ics.muni.cz
Wed Apr 15 18:06:57 BST 2020
Should I report them, or just wait for the 18.1 problem to be fixed and see
whether the crashes on older kernels go away as well?
On Wed, Apr 15, 2020 at 04:51:02PM +0000, Felipe Knop wrote:
> Lukas,
>
> There was one particular kernel change introduced in 3.10.0-1062.18.1 that
> has triggered a given set of crashes. It's possible, though, that there is
> a lingering problem affecting older levels of 3.10.0-1062. I believe that
> crashes occurring on older kernels should be treated as separate problems.
>
> Felipe
>
> ----
> Felipe Knop knop at us.ibm.com
> GPFS Development and Security
> IBM Systems
> IBM Building 008
> 2455 South Rd, Poughkeepsie, NY 12601
> (845) 433-9314 T/L 293-9314
>
>
>
>
> ----- Original message -----
> From: Lukas Hejtmanek <xhejtman at ics.muni.cz>
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Kernel crashes with Spectrum
> Scale and RHEL 7.7 3.10.0-1062.18.1.el7 kernel
> Date: Wed, Apr 15, 2020 12:35 PM
>
> And are you sure it is present only in the -1062.18.1.el7 kernel? I think it
> is present in all -1062.* kernels.
>
> On Wed, Apr 15, 2020 at 04:25:41PM +0000, Felipe Knop wrote:
> > Laurence,
> >
> > The problem affects all the Scale releases / PTFs.
> >
> > Felipe
> >
> > ----- Original message -----
> > From: "Schuler, Laurence (GSFC-606.4)[ADNET SYSTEMS INC]"
> > <laurence.schuler at nasa.gov>
> > Sent by: gpfsug-discuss-bounces at spectrumscale.org
> > To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> > Cc:
> > Subject: Re: [gpfsug-discuss] [EXTERNAL] Kernel crashes with Spectrum
> > Scale and RHEL 7.7 3.10.0-1062.18.1.el7 kernel
> > Date: Wed, Apr 15, 2020 12:10 PM
> >
> >
> > Will this impact *any* version of Spectrum Scale?
> >
> >
> >
> > -Laurence
> >
> >
> >
> > From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of Felipe
> > Knop <knop at us.ibm.com>
> > Reply-To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> > Date: Wednesday, April 15, 2020 at 11:30 AM
> > To: "gpfsug-discuss at spectrumscale.org"
> > <gpfsug-discuss at spectrumscale.org>
> > Subject: [EXTERNAL] [gpfsug-discuss] Kernel crashes with Spectrum Scale
> > and RHEL 7.7 3.10.0-1062.18.1.el7 kernel
> >
> >
> >
> > All,
> >
> >
> >
> > A problem has been identified with Spectrum Scale when running on RHEL
> > 7.7 and kernel 3.10.0-1062.18.1.el7. While a fix is currently being
> > developed, customers should not move up to this kernel level.
> >
> >
> >
> > The new kernel was issued on March 17 via the following errata:
> > https://access.redhat.com/errata/RHSA-2020:0834
> >
> >
> >
> > When this kernel is used with Scale, system crashes have been observed.
> > The following are a couple of examples of kernel stack traces for the
> > crash:
> >
> >
> >
> >
> >
> > [ 2915.625015] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> > [ 2915.633770] IP: [<ffffffffc0e2cf90>] cxiDropSambaDCacheEntry+0x190/0x1b0 [mmfslinux]
> >
> > [ 2915.914097] [<ffffffffc0e3d28c>] gpfs_i_rmdir+0x29c/0x310 [mmfslinux]
> > [ 2915.921381] [<ffffffffb9663130>] ? take_dentry_name_snapshot+0xf0/0xf0
> > [ 2915.928760] [<ffffffffb9664f60>] ? shrink_dcache_parent+0x60/0x90
> > [ 2915.935656] [<ffffffffb96577cc>] vfs_rmdir+0xdc/0x150
> > [ 2915.941388] [<ffffffffb965cca1>] do_rmdir+0x1f1/0x220
> > [ 2915.947119] [<ffffffffb964ce66>] ? __fput+0x186/0x260
> > [ 2915.952849] [<ffffffffb964d02e>] ? ____fput+0xe/0x10
> > [ 2915.958484] [<ffffffffb94c2e60>] ? task_work_run+0xc0/0xe0
> > [ 2915.964701] [<ffffffffb965df05>] SyS_unlinkat+0x25/0x40
> >
> >
> >
> > [1224278.495993] [<ffffffff88e63918>] __dentry_kill+0x128/0x190
> > [1224278.496678] [<ffffffff88e63a36>] dput+0xb6/0x1a0
> > [1224278.497378] [<ffffffff88e64116>] d_prune_aliases+0xb6/0xf0
> > [1224278.498083] [<ffffffffc0c2c0ea>] cxiPruneDCacheEntry+0x13a/0x1c0 [mmfslinux]
> > [1224278.498798] [<ffffffffc0eba608>] _ZN10gpfsNode_t16invalidateOSNodeEPS_Pvij+0x108/0x350 [mmfs26]
> >
> >
> >
> >
> >
> > RHEL 7.8 is also impacted by the same problem, but validation of Scale
> > with 7.8 is still under way.
> >
> >
> >
> >
> >
> > Felipe
> >
> >
> >
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at spectrumscale.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
> --
> Lukáš Hejtmánek
>
> Linux Administrator only because
> Full Time Multitasking Ninja
> is not an official job title
--
Lukáš Hejtmánek
Linux Administrator only because
Full Time Multitasking Ninja
is not an official job title