[gpfsug-discuss] GPFS 3.5 to 4.1 Upgrade Question

Sander Kuusemets sander.kuusemets at ut.ee
Tue Dec 6 07:25:13 GMT 2016


Hello Aaron,

I thought I'd share my two cents, as I just went through the process. My
plan was the same as yours: start upgrading wherever I could and wait
for the remaining machines to become available. It took me around five
weeks to complete the process, but the last two were only because I was
being extra careful.

At first nothing happened, but about a week into the upgrade cycle,
while I was experimenting with filesets (creating, deleting, testing), I
suddenly got the strangest error message while trying to delete a
fileset for the third time from a client node. Sadly I cannot remember
exactly what it said, but I can describe what happened.

After the error message, the current manager of our cluster fell into
the arbitrating state, its metadata disks were put into the down state,
manager status was handed over to our other server node, and that node's
log was spammed with error messages like this:

> mmfsd: 
> /project/sprelbmd0/build/rbmd0s004a/src/avs/fs/mmfs/ts/cfgmgr/pitrpc.h:1411: 
> void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, 
> UInt32, const char*, const char*): Assertion `msgLen >= (sizeof(Pad32) 
> + 0)' failed.
> Wed Nov  2 19:24:01.967 2016: [N] Signal 6 at location 0x7F9426EFF625 
> in process 15113, link reg 0xFFFFFFFFFFFFFFFF.
> Wed Nov  2 19:24:05.058 2016: [X] *** Assert exp(msgLen >= 
> (sizeof(Pad32) + 0)) in line 1411 of file 
> /project/sprelbmd0/build/rbmd0s004a/src/avs/fs/mmfs/ts/cfgmgr/pitrpc.h
> Wed Nov  2 19:24:05.059 2016: [E] *** Traceback:
> Wed Nov  2 19:24:05.060 2016: [E]         2:0x7F9428BAFBB6 
> logAssertFailed + 0x2D6 at ??:0
> Wed Nov  2 19:24:05.061 2016: [E]         3:0x7F9428CBEF62 
> PIT_GetWorkMH(RpcContext*, char*) + 0x6E2 at ??:0
> Wed Nov  2 19:24:05.062 2016: [E]         4:0x7F9428BCBF62 
> tscHandleMsg(RpcContext*, MsgDataBuf*) + 0x512 at ??:0
> Wed Nov  2 19:24:05.063 2016: [E]         5:0x7F9428BE62A7 
> RcvWorker::RcvMain() + 0x107 at ??:0
> Wed Nov  2 19:24:05.064 2016: [E]         6:0x7F9428BE644B 
> RcvWorker::thread(void*) + 0x5B at ??:0
> Wed Nov  2 19:24:05.065 2016: [E]         7:0x7F94286F6F36 
> Thread::callBody(Thread*) + 0x46 at ??:0
> Wed Nov  2 19:24:05.066 2016: [E]         8:0x7F94286E5402 
> Thread::callBodyWrapper(Thread*) + 0xA2 at ??:0
> Wed Nov  2 19:24:05.067 2016: [E]         9:0x7F9427E0E9D1 
> start_thread + 0xD1 at ??:0
> Wed Nov  2 19:24:05.068 2016: [E]         10:0x7F9426FB58FD clone + 
> 0x6D at ??:0
After this I tried to bring the disks back up, which failed half-way
through and did the same thing to the other server node (by then the
cluster manager). At that point the cluster had effectively failed:
all the metadata disks were down and there was no path to the data
disks. When I then tried to start all the metadata disks with a single
start command, it succeeded on the third attempt and the cluster came
back into a working state. The downtime was about an hour.
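
For reference, the recovery amounted to the usual down-disk procedure;
a minimal sketch, assuming a file system named gpfs0 (a placeholder, not
our actual device name):

    mmlsdisk gpfs0 -e          # list only the disks that are not up/ready
    mmchdisk gpfs0 start -a    # try to bring every down disk back up in one pass
    mmlsdisk gpfs0 -e          # confirm nothing is left in the down state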

I opened a PMR with this information. IBM confirmed it is a bug, but a
tricky one that will take a while to fix, and in the meantime they
recommended not using any of the commands from this list:

> Our apologies for the delayed response. Based on the debug data we
> have and looking at the source code, we believe the assert is due to
> an incompatibility arising from the feature level version of the
> RPCs. In this case the culprit is the PIT "interesting inode" code.
>
> Several user commands employ PIT (Parallel Inode Traversal) code to 
> traverse each data block of every file:
>
>>
>>     mmdelfileset
>>     mmdelsnapshot
>>     mmdefragfs
>>     mmfileid
>>     mmrestripefs
>>     mmdeldisk
>>     mmrpldisk
>>     mmchdisk
>>     mmadddisk
> The problematic one is the 'PitInodeListPacket' subrpc, which is part
> of an "interesting inode" code change. Looking at the dumps, it is
> evident that node 'node3', which sent the RPC, is not capable of
> supporting interesting inodes (its max feature level is 1340), while
> node server11, which is receiving it, tries to interpret the RPC
> beyond the valid region (as its feature level, 1502, supports PIT
> interesting inodes).

Apparently the fileset commands shouldn't be used either, as those are
what failed for me.
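
Before running any of the commands above in a mixed cluster, it may be
worth confirming which nodes are still on the old level. A minimal
sketch, with hypothetical node names and plain ssh instead of mmdsh:

    # cluster-wide level at which new features are enabled
    mmlsconfig minReleaseLevel

    # GPFS daemon build actually running on each node
    for n in node1 node2 node3; do
        ssh $n /usr/lpp/mmfs/bin/mmdiag --version
    done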

Since I finished the upgrade, everything has been working wonderfully,
but during the upgrade window itself I'd recommend treading really
carefully.
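
Once every node is running 4.1, the documented completion steps enable
the new features cluster-wide; a minimal sketch, again with gpfs0 as a
placeholder device name:

    mmchconfig release=LATEST   # raise the cluster's minimum release level to 4.1
    mmchfs gpfs0 -V full        # enable the new file system format features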

Best regards,

-- 
Sander Kuusemets
University of Tartu, High Performance Computing, IT Specialist

On 12/05/2016 11:31 PM, Aaron Knister wrote:
> Hi Everyone,
>
> In the GPFS documentation 
> (http://www.ibm.com/support/knowledgecenter/SSFKCN_4.1.0/com.ibm.cluster.gpfs.v4r1.gpfs300.doc/bl1ins_migratl.htm) 
> it has this to say about the duration of an upgrade from 3.5 to 4.1:
>
>> Rolling upgrades allow you to install new GPFS code one node at a
>> time without shutting down GPFS on other nodes. However, you must
>> upgrade all nodes within a short time. The time dependency exists
>> because some GPFS 4.1 features become available on each node as soon
>> as the node is upgraded, while other features will not become
>> available until you upgrade all participating nodes.
>
> Does anyone have a feel for what "a short time" means? I'm looking to 
> upgrade from 3.5.0.31 to 4.1.1.10 in a rolling fashion but given the 
> size of our system it might take several weeks to complete. Seeing 
> this language concerns me that after some period of time something bad 
> is going to happen, but I don't know what that period of time is.
>
> Also, if anyone has done a rolling 3.5 to 4.1 upgrade and has any 
> anecdotes they'd like to share, I would like to hear them.
>
> Thanks!
>
> -Aaron
>



