[gpfsug-discuss] GPFS 3.5 to 4.1 Upgrade Question

Eric Horst erich at uw.edu
Sun Dec 11 21:28:39 GMT 2016


On Sat, Dec 10, 2016 at 4:35 AM, Aaron Knister <aaron.knister at gmail.com>
wrote:

> Thanks Eric!
>
> I have a few follow up questions for you--
>
> Do you recall the exact versions of 3.5 and 4.1 your cluster went from/to?
> I'm curious to know what version of 4.1 you were at when you ran the
> mmchconfig.
>

I went from 3.5.0-28 to 4.1.0-8 to 4.2.1-1.


>
> Would you mind sharing any log messages related to the errors you saw when
> you ran the mmchconfig?
>
>
Unfortunately I didn't save any actual logs from the update. I did the
first cluster in early July so nothing remains. The only note I have is:
"On update, after finalizing gpfs 4.1 the quota file format apparently
changed and caused a mmrepquota hang/deadlock. Had to shutdown and restart
the whole cluster."

Sorry to not be very helpful on that front.
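
For anyone following along, the sequence that avoided the problem on the
second cluster (described in my earlier message, quoted below) can be
sketched roughly as follows. This is only an illustration under a few
assumptions, not a record of the exact commands that were run: it assumes
every node already has the new packages installed, it reads "issuing the
mmchconfig in the middle" as running it between the cluster-wide stop and
start, and the helper names run and finalize_with_restart are made up for
the example. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only. Prints the finalize sequence instead of running it unless
# DRY_RUN=0 is set. Assumes all nodes are already upgraded and that the
# GPFS admin commands are in PATH on the node where this runs.

DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper. Echoes the command in dry-run mode,
# executes it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# finalize_with_restart: hypothetical name for the sequence from the thread.
# Placing mmchconfig between the stop and the start is an interpretation
# of "issuing the mmchconfig in the middle".
finalize_with_restart() {
    run mmshutdown -a || return 1              # stop GPFS on all nodes
    run mmchconfig release=LATEST || return 1  # finalize the new release level
    run mmstartup -a || return 1               # start GPFS on all nodes
    run mmgetstate -a                          # check that nodes come back up
}

finalize_with_restart
```

Setting DRY_RUN=0 would execute the commands for real, which takes the
whole cluster down, so schedule an outage window first.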

-Eric




> I recently did a rolling upgrade from 3.5 to 4.1 to 4.2 on two different
> clusters. Two things:
>
> Upgrading from 3.5 to 4.1, I did one node at a time and then at the end
> ran mmchconfig release=LATEST. Minutes after flipping to latest the cluster
> became non-responsive, with mmfs panics on nodes, and everything had to be
> restarted. Logs indicated it was a quota problem. In 4.1 the quota files
> move from externally visible files to internal hidden files. I suspect the
> quota file transition can't be done without a cluster restart. When I did
> the second cluster I upgraded all nodes and then very quickly stopped and
> started the entire cluster, issuing the mmchconfig in the middle. No quota
> panic problems on that one.
>
> Upgrading from 4.1 to 4.2, I did one node at a time and then at the end
> ran mmchconfig release=LATEST. No cluster restart. Everything seemed to
> work okay. Later, when restarting a node, I got weird fstab errors on gpfs
> startup, and certain commands, notably mmfind, would fail with something
> like "can't find /dev/uwfs" (our filesystem). I restarted the whole
> cluster and everything began working normally. In this case 4.2 got
> rid of /dev/fsname. Just like in the quota case, it seems that this
> transition can't be seamless. Doing the second cluster, I upgraded all
> nodes and then again quickly restarted gpfs to avoid the same problem.
>
> Other than these two quirks, I heartily thank IBM for making a very
> complex product with a very easy upgrade procedure. I could imagine many
> ways that an upgrade hop of two major versions in two weeks could go very
> wrong but the quality of the product and team makes my job very easy.
>
> -Eric
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>