[gpfsug-discuss] AFM Crashing the MDS

Luke Raimbach Luke.Raimbach at crick.ac.uk
Tue Jul 26 15:17:35 BST 2016


Hi All,

Has anyone seen GPFS barf like this before? I'll explain the setup:

RO (read-only) AFM cache at the remote site (cache A), for reading remote datasets quickly.
LU (local-update) AFM cache at the destination site (cache B), for caching data from cache A. A local compute cluster mounts this over multi-cluster.
IW (independent-writer) AFM cache at the destination site (cache C), for presenting cache B over NAS protocols.

Reading files in cache C should pull data from the remote source through caches A -> B -> C.

Modifying files in cache C should pull data into cache B and then break the cache relationship for that file, converting it to a local copy. Those modifications should include metadata updates (e.g. chown).
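
To make the intended behaviour concrete, here is a toy Python model of the chain described above. It is only a sketch of what I expect to happen, not of how AFM works internally: the `Tier` class, the mode strings, and the `build_chain` helper are all made up for illustration, and the example path and owners are hypothetical.

```python
class Tier:
    """Toy stand-in for one tier in the chain; not a model of AFM internals."""

    def __init__(self, name, mode, upstream=None):
        self.name = name
        self.mode = mode          # "home", "ro", "lu" or "iw" (illustrative labels)
        self.upstream = upstream  # tier this one fetches from, if any
        self.files = {}           # path -> {"owner": ..., "data": ...}
        self.detached = set()     # paths that no longer track upstream

    def read(self, path):
        """Return the entry for path, fetching it from upstream on a miss."""
        if path not in self.files:
            if self.upstream is None:
                raise FileNotFoundError(path)
            # Copy the entry so each tier holds its own cached version.
            self.files[path] = dict(self.upstream.read(path))
        return self.files[path]

    def chown(self, path, owner):
        """Apply a metadata update according to this tier's mode."""
        if self.mode == "ro":
            raise PermissionError(f"{self.name} is read-only")
        entry = self.read(path)               # pull the file in first
        entry["owner"] = owner
        if self.mode == "lu":
            self.detached.add(path)           # file becomes a local copy
        elif self.mode == "iw":
            self.upstream.chown(path, owner)  # push the change upstream


def build_chain():
    """Build source -> A (ro) -> B (lu) -> C (iw) with one hypothetical file."""
    home = Tier("source", "home")
    home.files["/dataset/file1"] = {"owner": "alice", "data": b"payload"}
    a = Tier("A", "ro", upstream=home)
    b = Tier("B", "lu", upstream=a)
    c = Tier("C", "iw", upstream=b)
    return home, a, b, c
```

In this model, calling `c.read("/dataset/file1")` populates A, B and C in turn, and a following `c.chown("/dataset/file1", "bob")` changes the owner in C and B only, marks the file as detached in B, and leaves cache A and the source untouched.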

To speed things up, we prefetch files into cache B for datasets that are undergoing migration and have entered a read-only state at the source.

When issuing chown on a directory in cache C containing ~4.5 million files, the MDS for AFM cache C crashes badly:


Tue Jul 26 13:28:52.487 2016: [X] logAssertFailed: addr.isReserved() || addr.getClusterIdx() == clusterIdx
Tue Jul 26 13:28:52.488 2016: [X] return code 0, reason code 1, log record tag 0
Tue Jul 26 13:28:53.392 2016: [X] *** Assert exp(addr.isReserved() || addr.getClusterIdx() == clusterIdx) in line 1936 of file /project/sprelbmd0/build/rbmd0s003a/src/avs/fs/mmfs/ts/cfgmgr/cfgmgr.h
Tue Jul 26 13:28:53.393 2016: [E] *** Traceback:
Tue Jul 26 13:28:53.394 2016: [E]         2:0x7F6DC95444A6 logAssertFailed + 0x2D6 at ??:0
Tue Jul 26 13:28:53.395 2016: [E]         3:0x7F6DC95C7EF4 ClusterConfiguration::getGatewayNewHash(DiskUID, unsigned int, NodeAddr*) + 0x4B4 at ??:0
Tue Jul 26 13:28:53.396 2016: [E]         4:0x7F6DC95C8031 ClusterConfiguration::getGatewayNode(DiskUID, unsigned int, NodeAddr, NodeAddr*, unsigned int) + 0x91 at ??:0
Tue Jul 26 13:28:53.397 2016: [E]         5:0x7F6DC9DC7126 SFSPcache(StripeGroup*, FileUID, int, int, void*, int, voidXPtr*, int*) + 0x346 at ??:0
Tue Jul 26 13:28:53.398 2016: [E]         6:0x7F6DC9332494 HandleMBPcache(MBPcacheParms*) + 0xB4 at ??:0
Tue Jul 26 13:28:53.399 2016: [E]         7:0x7F6DC90A4A53 Mailbox::msgHandlerBody(void*) + 0x3C3 at ??:0
Tue Jul 26 13:28:53.400 2016: [E]         8:0x7F6DC908BC06 Thread::callBody(Thread*) + 0x46 at ??:0
Tue Jul 26 13:28:53.401 2016: [E]         9:0x7F6DC907A0D2 Thread::callBodyWrapper(Thread*) + 0xA2 at ??:0
Tue Jul 26 13:28:53.402 2016: [E]         10:0x7F6DC87A3AA1 start_thread + 0xD1 at ??:0
Tue Jul 26 13:28:53.403 2016: [E]         11:0x7F6DC794A93D clone + 0x6D at ??:0
mmfsd: /project/sprelbmd0/build/rbmd0s003a/src/avs/fs/mmfs/ts/cfgmgr/cfgmgr.h:1936: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion `addr.isReserved() || addr.getClusterIdx() == clusterIdx' failed.
Tue Jul 26 13:28:53.404 2016: [N] Signal 6 at location 0x7F6DC7894625 in process 6262, link reg 0xFFFFFFFFFFFFFFFF.
Tue Jul 26 13:28:53.405 2016: [I] rax    0x0000000000000000  rbx    0x00007F6DC8DCB000
Tue Jul 26 13:28:53.406 2016: [I] rcx    0xFFFFFFFFFFFFFFFF  rdx    0x0000000000000006
Tue Jul 26 13:28:53.407 2016: [I] rsp    0x00007F6DAAEA01F8  rbp    0x00007F6DCA05C8B0
Tue Jul 26 13:28:53.408 2016: [I] rsi    0x00000000000018F8  rdi    0x0000000000001876
Tue Jul 26 13:28:53.409 2016: [I] r8     0xFEFEFEFEFEFEFEFF  r9     0xFEFEFEFEFF092D63
Tue Jul 26 13:28:53.410 2016: [I] r10    0x0000000000000008  r11    0x0000000000000202
Tue Jul 26 13:28:53.411 2016: [I] r12    0x00007F6DC9FC5540  r13    0x00007F6DCA05C1C0
Tue Jul 26 13:28:53.412 2016: [I] r14    0x0000000000000000  r15    0x0000000000000000
Tue Jul 26 13:28:53.413 2016: [I] rip    0x00007F6DC7894625  eflags 0x0000000000000202
Tue Jul 26 13:28:53.414 2016: [I] csgsfs 0x0000000000000033  err    0x0000000000000000
Tue Jul 26 13:28:53.415 2016: [I] trapno 0x0000000000000000  oldmsk 0x0000000010017807
Tue Jul 26 13:28:53.416 2016: [I] cr2    0x0000000000000000
Tue Jul 26 13:28:54.225 2016: [D] Traceback:
Tue Jul 26 13:28:54.226 2016: [D] 0:00007F6DC7894625 raise + 35 at ??:0
Tue Jul 26 13:28:54.227 2016: [D] 1:00007F6DC7895E05 abort + 175 at ??:0
Tue Jul 26 13:28:54.228 2016: [D] 2:00007F6DC788D74E __assert_fail_base + 11E at ??:0
Tue Jul 26 13:28:54.229 2016: [D] 3:00007F6DC788D810 __assert_fail + 50 at ??:0
Tue Jul 26 13:28:54.230 2016: [D] 4:00007F6DC95444CA logAssertFailed + 2FA at ??:0
Tue Jul 26 13:28:54.231 2016: [D] 5:00007F6DC95C7EF4 ClusterConfiguration::getGatewayNewHash(DiskUID, unsigned int, NodeAddr*) + 4B4 at ??:0
Tue Jul 26 13:28:54.232 2016: [D] 6:00007F6DC95C8031 ClusterConfiguration::getGatewayNode(DiskUID, unsigned int, NodeAddr, NodeAddr*, unsigned int) + 91 at ??:0
Tue Jul 26 13:28:54.233 2016: [D] 7:00007F6DC9DC7126 SFSPcache(StripeGroup*, FileUID, int, int, void*, int, voidXPtr*, int*) + 346 at ??:0
Tue Jul 26 13:28:54.234 2016: [D] 8:00007F6DC9332494 HandleMBPcache(MBPcacheParms*) + B4 at ??:0
Tue Jul 26 13:28:54.235 2016: [D] 9:00007F6DC90A4A53 Mailbox::msgHandlerBody(void*) + 3C3 at ??:0
Tue Jul 26 13:28:54.236 2016: [D] 10:00007F6DC908BC06 Thread::callBody(Thread*) + 46 at ??:0
Tue Jul 26 13:28:54.237 2016: [D] 11:00007F6DC907A0D2 Thread::callBodyWrapper(Thread*) + A2 at ??:0
Tue Jul 26 13:28:54.238 2016: [D] 12:00007F6DC87A3AA1 start_thread + D1 at ??:0
Tue Jul 26 13:28:54.239 2016: [D] 13:00007F6DC794A93D clone + 6D at ??:0
Tue Jul 26 13:28:54.240 2016: [N] Restarting mmsdrserv
Tue Jul 26 13:28:55.535 2016: [N] Signal 6 at location 0x7F6DC790EA7D in process 6262, link reg 0xFFFFFFFFFFFFFFFF.
Tue Jul 26 13:28:55.536 2016: [N] mmfsd is shutting down.
Tue Jul 26 13:28:55.537 2016: [N] Reason for shutdown: Signal handler entered
Tue Jul 26 13:28:55 BST 2016: mmcommon mmfsdown invoked.  Subsystem: mmfs Status: active
Tue Jul 26 13:28:55 BST 2016: /var/mmfs/etc/mmfsdown invoked
umount2: Device or resource busy
umount: /camp: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
umount2: Device or resource busy
umount: /ingest: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
Shutting down NFS daemon: [  OK  ]
Shutting down NFS mountd: [  OK  ]
Shutting down NFS quotas: [  OK  ]
Shutting down NFS services:  [  OK  ]
Shutting down RPC idmapd: [  OK  ]
Stopping NFS statd: [  OK  ]
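
For anyone comparing against their own logs, a small Python sketch like the following can pull the failed assertion and the `[E]` traceback frames out of a log in the format shown above. The regular expressions are written against the lines in this post only; the function name `parse_assert` and the shape of the returned dictionary are my own choices, and other GPFS releases may format these lines differently.

```python
import re

# Matches the "*** Assert exp(...) in line N of file PATH" line.
ASSERT_RE = re.compile(
    r"\*\*\* Assert exp\((?P<expr>.*)\) in line (?P<line>\d+) of file (?P<file>\S+)"
)

# Matches "[E]   3:0x7F6D... Symbol(args) + 0x4B4 at ??:0" traceback frames.
FRAME_RE = re.compile(
    r"\[E\]\s+(?P<idx>\d+):0x[0-9A-Fa-f]+ (?P<sym>.+?) \+ 0x[0-9A-Fa-f]+ at "
)


def parse_assert(log_text):
    """Extract the assert expression, its location, and the [E] frames."""
    result = {"expr": None, "line": None, "file": None, "frames": []}
    for line in log_text.splitlines():
        m = ASSERT_RE.search(line)
        if m:
            result["expr"] = m.group("expr")
            result["line"] = int(m.group("line"))
            result["file"] = m.group("file")
            continue
        m = FRAME_RE.search(line)
        if m:
            # Keep the frame index and the symbol, drop the raw addresses.
            result["frames"].append((int(m.group("idx")), m.group("sym")))
    return result
```

Reading the log file into a string and passing it to `parse_assert` gives the frames in the order they were logged, which makes it easier to see that the assert is reached via `SFSPcache` and `ClusterConfiguration::getGatewayNode`.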



Ugly, right?

Cheers,
Luke.


Luke Raimbach
Senior HPC Data and Storage Systems Engineer,
The Francis Crick Institute,
Gibbs Building,
215 Euston Road,
London NW1 2BE.

E: luke.raimbach at crick.ac.uk
W: www.crick.ac.uk


The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.

