[gpfsug-discuss] LROC

Aaron Knister aaron.s.knister at nasa.gov
Wed Dec 28 23:19:52 GMT 2016


Interesting. Would you be willing to post the output of "mmlsnsd -X | 
grep 0A6403AA58641546" from the troublesome node, as suggested by Sven?
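
As an aside, the build level in the "mmdiag --version" output quoted below
can be checked against the 4.1.1.9 fix level mechanically. This is only a
rough sketch: the function names are made up for illustration, and the
regex assumes the "Current GPFS build" line looks like the one in this
thread.

```python
import re


def parse_gpfs_build(mmdiag_output):
    """Return the GPFS build level as a tuple of ints, e.g. (4, 2, 1, 2).

    Assumes a line of the form: Current GPFS build: "4.2.1.2 ".
    """
    match = re.search(r'Current GPFS build:\s*"([\d.]+)\s*"', mmdiag_output)
    if match is None:
        raise ValueError("no 'Current GPFS build' line found")
    return tuple(int(part) for part in match.group(1).split("."))


def is_older_than(build, fix_level):
    """Compare two version tuples component by component."""
    return build < fix_level


# Line copied from the mmdiag output quoted in this thread.
sample = 'Current GPFS build: "4.2.1.2 ".'
build = parse_gpfs_build(sample)
print(build, is_older_than(build, (4, 1, 1, 9)))
```

This only shows that the node is not running a 4.1 level older than
4.1.1.9. Whether the 4.2.1 stream carries the same fix is a separate
question that a version comparison cannot answer.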

On 12/28/16 5:39 PM, Matt Weil wrote:
>
>>  mmdiag --version
>>
>> === mmdiag: version ===
>> Current GPFS build: "4.2.1.2 ".
>> Built on Oct 27 2016 at 10:52:12
>> Running 13 minutes 54 secs, pid 13229
>
> On 12/28/16 4:26 PM, Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE
> CORP] wrote:
>> Ouch...to quote Adam Savage "well there's yer problem". Are you
>> perhaps running a version of GPFS 4.1 older than 4.1.1.9? Looks like
>> there was an LROC related assert fixed in 4.1.1.9 but I can't find
>> details on it.
>>
>>
>>
>> *From:*Matt Weil
>> *Sent:* 12/28/16, 5:21 PM
>> *To:* gpfsug main discussion list
>> *Subject:* Re: [gpfsug-discuss] LROC
>>
>> yes
>>
>> > Wed Dec 28 16:17:07.507 2016: [X] *** Assert exp(ssd->state !=
>> > ssdActive) in line 427 of file
>> > /project/sprelbmd1/build/rbmd11027d/src/avs/fs/mmfs/ts/flea/fs_agent_gpfs.C
>> > Wed Dec 28 16:17:07.508 2016: [E] *** Traceback:
>> > Wed Dec 28 16:17:07.509 2016: [E]         2:0x7FF1604F39B5
>> > logAssertFailed + 0x2D5 at ??:0
>> > Wed Dec 28 16:17:07.510 2016: [E]         3:0x7FF160CA8947
>> > fs_config_ssds(fs_config*) + 0x867 at ??:0
>> > Wed Dec 28 16:17:07.511 2016: [E]         4:0x7FF16009A749
>> > SFSConfigLROC() + 0x189 at ??:0
>> > Wed Dec 28 16:17:07.512 2016: [E]         5:0x7FF160E565CB
>> > NsdDiskConfig::readLrocConfig(unsigned int) + 0x2BB at ??:0
>> > Wed Dec 28 16:17:07.513 2016: [E]         6:0x7FF160E5EF41
>> > NsdDiskConfig::reReadConfig() + 0x771 at ??:0
>> > Wed Dec 28 16:17:07.514 2016: [E]         7:0x7FF160024E0E
>> > runTSControl(int, int, char**) + 0x80E at ??:0
>> > Wed Dec 28 16:17:07.515 2016: [E]         8:0x7FF1604FA6A5
>> > RunClientCmd(MessageHeader*, IpAddr, unsigned short, int, int,
>> > StripeGroup*, unsigned int*, RpcContext*) + 0x21F5 at ??:0
>> > Wed Dec 28 16:17:07.516 2016: [E]         9:0x7FF1604FBA36
>> > HandleCmdMsg(void*) + 0x1216 at ??:0
>> > Wed Dec 28 16:17:07.517 2016: [E]         10:0x7FF160039172
>> > Thread::callBody(Thread*) + 0x1E2 at ??:0
>> > Wed Dec 28 16:17:07.518 2016: [E]         11:0x7FF160027302
>> > Thread::callBodyWrapper(Thread*) + 0xA2 at ??:0
>> > Wed Dec 28 16:17:07.519 2016: [E]         12:0x7FF15F73FDC5
>> > start_thread + 0xC5 at ??:0
>> > Wed Dec 28 16:17:07.520 2016: [E]         13:0x7FF15E84873D __clone +
>> > 0x6D at ??:0
>> > mmfsd:
>> > /project/sprelbmd1/build/rbmd11027d/src/avs/fs/mmfs/ts/flea/fs_agent_gpfs.C:427:
>> > void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32,
>> > UInt32, const char*, const char*): Assertion `ssd->state != ssdActive'
>> > failed.
>> > Wed Dec 28 16:17:07.521 2016: [E] Signal 6 at location 0x7FF15E7861D7
>> > in process 125345, link reg 0xFFFFFFFFFFFFFFFF.
>> > Wed Dec 28 16:17:07.522 2016: [I] rax    0x0000000000000000  rbx
>> > 0x00007FF15FD71000
>> > Wed Dec 28 16:17:07.523 2016: [I] rcx    0xFFFFFFFFFFFFFFFF  rdx
>> > 0x0000000000000006
>> > Wed Dec 28 16:17:07.524 2016: [I] rsp    0x00007FEF34FBBF78  rbp
>> > 0x00007FF15E8D03A8
>> > Wed Dec 28 16:17:07.525 2016: [I] rsi    0x000000000001F713  rdi
>> > 0x000000000001E9A1
>> > Wed Dec 28 16:17:07.526 2016: [I] r8     0x0000000000000001  r9
>> > 0xFF092D63646B6860
>> > Wed Dec 28 16:17:07.527 2016: [I] r10    0x0000000000000008  r11
>> > 0x0000000000000202
>> > Wed Dec 28 16:17:07.528 2016: [I] r12    0x00007FF1610C6847  r13
>> > 0x00007FF161032EC0
>> > Wed Dec 28 16:17:07.529 2016: [I] r14    0x0000000000000000  r15
>> > 0x0000000000000000
>> > Wed Dec 28 16:17:07.530 2016: [I] rip    0x00007FF15E7861D7  eflags
>> > 0x0000000000000202
>> > Wed Dec 28 16:17:07.531 2016: [I] csgsfs 0x0000000000000033  err
>> > 0x0000000000000000
>> > Wed Dec 28 16:17:07.532 2016: [I] trapno 0x0000000000000000  oldmsk
>> > 0x0000000010017807
>> > Wed Dec 28 16:17:07.533 2016: [I] cr2    0x0000000000000000
>> > Wed Dec 28 16:17:09.022 2016: [D] Traceback:
>> > Wed Dec 28 16:17:09.023 2016: [D] 0:00007FF15E7861D7 raise + 37 at ??:0
>> > Wed Dec 28 16:17:09.024 2016: [D] 1:00007FF15E7878C8 __GI_abort + 148
>> > at ??:0
>> > Wed Dec 28 16:17:09.025 2016: [D] 2:00007FF15E77F146
>> > __assert_fail_base + 126 at ??:0
>> > Wed Dec 28 16:17:09.026 2016: [D] 3:00007FF15E77F1F2
>> > __GI___assert_fail + 42 at ??:0
>> > Wed Dec 28 16:17:09.027 2016: [D] 4:00007FF1604F39D9 logAssertFailed +
>> > 2F9 at ??:0
>> > Wed Dec 28 16:17:09.028 2016: [D] 5:00007FF160CA8947
>> > fs_config_ssds(fs_config*) + 867 at ??:0
>> > Wed Dec 28 16:17:09.029 2016: [D] 6:00007FF16009A749 SFSConfigLROC() +
>> > 189 at ??:0
>> > Wed Dec 28 16:17:09.030 2016: [D] 7:00007FF160E565CB
>> > NsdDiskConfig::readLrocConfig(unsigned int) + 2BB at ??:0
>> > Wed Dec 28 16:17:09.031 2016: [D] 8:00007FF160E5EF41
>> > NsdDiskConfig::reReadConfig() + 771 at ??:0
>> > Wed Dec 28 16:17:09.032 2016: [D] 9:00007FF160024E0E runTSControl(int,
>> > int, char**) + 80E at ??:0
>> > Wed Dec 28 16:17:09.033 2016: [D] 10:00007FF1604FA6A5
>> > RunClientCmd(MessageHeader*, IpAddr, unsigned short, int, int,
>> > StripeGroup*, unsigned int*, RpcContext*) + 21F5 at ??:0
>> > Wed Dec 28 16:17:09.034 2016: [D] 11:00007FF1604FBA36
>> > HandleCmdMsg(void*) + 1216 at ??:0
>> > Wed Dec 28 16:17:09.035 2016: [D] 12:00007FF160039172
>> > Thread::callBody(Thread*) + 1E2 at ??:0
>> > Wed Dec 28 16:17:09.036 2016: [D] 13:00007FF160027302
>> > Thread::callBodyWrapper(Thread*) + A2 at ??:0
>> > Wed Dec 28 16:17:09.037 2016: [D] 14:00007FF15F73FDC5 start_thread +
>> > C5 at ??:0
>> > Wed Dec 28 16:17:09.038 2016: [D] 15:00007FF15E84873D __clone + 6D
>> > at ??:0
>>
>>
>> On 12/28/16 4:16 PM, Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE
>> CORP] wrote:
>> > related note I'm curious how a 3.5 client is able to join a cluster
>> > with a minreleaselevel of 4.1.1.0.
>> I was referring to the fs version, not the GPFS client version. Sorry
>> for the confusion.
>>  -V                 13.23 (3.5.0.7)          File system version
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at spectrumscale.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


